Whenever I visit clients or respond to emails about data collection, I try to make the point that it is critical to identify the sample based on strict criteria. Doing so minimizes the inevitable chase at the end for missing data and the processing of the inevitable edge cases. No matter how structured the disclosure requirements set out in SEC regulations or the Accounting Standards Codification are, some proportion of SEC filers will get ‘creative’ in the form of their disclosure. When they get creative, data collection becomes much more tedious because we have to identify the structure of their disclosure before we can sort out how to capture the data. If we can precisely identify the sample firms before we turn to collecting the data items, we will reduce the effort we spend chasing the odd forms of disclosure.
I am helping a client understand how to use our platform to collect a data item that is disclosed only in the 10-K. It is not a required disclosure in any other filing, and it is not likely to be disclosed in any other filing (I ran some tests and could not find this item disclosed in the millions of filings that are searchable with directEDGAR). Related to this, I encourage our users to review the regulations (either the SEC disclosure requirements as set out in the Code of Federal Regulations or the Accounting Standards Codification).
Our client is trying to collect a particular data item, and their sample was derived from some other financial data source. It may seem a normal presumption that if a company has data available from some other financial data source then there should be a 10-K with this disclosure. That presumption does not always hold.
In this particular case there are three problems with the sample from our client. The first is that some of the sample firms report publicly only because they have public debt. The disclosure requirements differ by the nature of the laws that establish a filing obligation (ABS issuers versus public debt only versus common stock), so while these companies file a 10-K, it will not include the particular disclosure our client is trying to collect. The second problem is that some of the sample firms may not have had a filing obligation at the time they showed up in the sample. The third problem is that some of the sample firms are foreign registrants whose filing obligations differ substantially: they have the option to file 20-Fs and 6-Ks rather than the expected 10-K/Qs and 8-Ks (as well as a myriad of other filing differences).
The most common way to determine if a company has publicly traded equity is to look for evidence in one of the other data sources that would normally be used to source some of the data for research. I suggest that because there is not really an easy way, using SEC filings or directEDGAR, to establish whether a filer has publicly traded common stock. For example, I played around with some searches to identify those 10-K filers that are privately held and struggled, because this is not a mandated disclosure. One search I tried looked, within the first 800 words of each 10-K filing, for the word registrant, issuer or company within 10 words of the phrase privately held.
Some of the results (LEVI STRAUSS and CINEMARK USA) were exactly what I was looking for: those registrants are (or were) privately held. However, many of the results were not what I was looking for. Therefore, if I needed to collect data from companies that had public equity, the best way to define the sample would be to use another tool to determine if they do have public equity.
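To make the logic of that search concrete, here is a minimal Python sketch of the same proximity test applied to the plain text of a single filing. This is not directEDGAR's search syntax or engine; the function name, the tokenizer and the parameter names are assumptions made purely for illustration.

```python
import re

# Words that should appear near the phrase "privately held".
TARGET_TERMS = {"registrant", "issuer", "company"}


def mentions_privately_held(text, head_words=800, window=10):
    """Return True if 'privately held' occurs in the first `head_words`
    words of `text` with a target term within `window` words of it."""
    # Crude tokenizer (an assumption): lowercase runs of letters, digits,
    # apostrophes and hyphens count as words.
    words = re.findall(r"[a-z0-9'-]+", text.lower())[:head_words]
    for i in range(len(words) - 1):
        if words[i] == "privately" and words[i + 1] == "held":
            start = max(0, i - window)
            end = min(len(words), i + 2 + window)
            # Words on either side of the phrase, excluding the phrase itself.
            nearby = words[start:i] + words[i + 2:end]
            if TARGET_TERMS & set(nearby):
                return True
    return False
```

A test like this finds the phrase, but it cannot tell whether the filer is describing itself, a customer or an acquisition target, which is one plausible reason so many hits turn out to be false positives.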
The second and third issues that needed to be addressed both come down to whether or not the company filed 10-Ks (since that is the filing that contains the data we are looking for) in the window that is needed for this study. We can use directEDGAR’s 10-K Filing History archive to establish whether or not a company has filed 10-Ks and for what period. Our client had a list of approximately 13,000 CIK-YEAR observations which represented 3,862 unique CIKs. I used their list of unique CIKs to create a request file to determine the 10-K availability for their sample. This file helped me in two ways. First, for some of their sample CIK-YEAR pairs the date they were trying to collect data for was after the last (or before the first) date of the 10-K filings. For example, they needed something from a 10-K filed by CIK 737644 after 1/1/2001. The problem is that this CIK filed its last 10-K in 1997 (I determined this by using the 10-K history file results).
They can use the result file to determine if there is a 10-K filing within the time span for which they need to collect the data. Second, and even better, the process also creates a file called missing.csv (clever name) that lists the CIKs from the request file that have never filed a 10-K. There were 477 CIKs from their original list of 3,862 CIKs that had never filed a 10-K.
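The screening described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the layout of the history file (one row per 10-K with CIK and FILING_YEAR columns) is hypothetical and is not the actual directEDGAR output format, and the placeholder CIK is made up. The only real data point is the one from the example above, CIK 737644 with a last 10-K in 1997.

```python
import csv
import io

# Hypothetical stand-in for a filing-history result file: one row per 10-K.
# The column names are assumptions. The single row reflects the example in
# the text (CIK 737644 filed its last 10-K in 1997).
HISTORY_CSV = "CIK,FILING_YEAR\n737644,1997\n"

PLACEHOLDER_CIK = 0  # made-up CIK standing in for a filer with no 10-K at all


def load_history(handle):
    """Map each CIK to the sorted years in which it filed a 10-K."""
    years_by_cik = {}
    for row in csv.DictReader(handle):
        years_by_cik.setdefault(int(row["CIK"]), set()).add(int(row["FILING_YEAR"]))
    return {cik: sorted(years) for cik, years in years_by_cik.items()}


def classify(cik, year, history):
    """Label a CIK-YEAR pair by how it relates to the filer's 10-K history."""
    years = history.get(cik)
    if not years:
        return "never_filed_10k"
    if year < years[0]:
        return "before_first_10k"
    if year > years[-1]:
        return "after_last_10k"
    # Inside the first-to-last span; gaps within the span are still possible.
    return "in_window"


history = load_history(io.StringIO(HISTORY_CSV))
sample = [(737644, 2001), (PLACEHOLDER_CIK, 2005)]
statuses = {pair: classify(pair[0], pair[1], history) for pair in sample}

# Same idea as missing.csv: sample CIKs with no 10-K in the history at all.
never_filed = sorted({cik for cik, _ in sample if cik not in history})
```

Only the pairs labeled in_window are worth sending on to data collection, and even those deserve a check for gap years inside the window.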
So while we could not use directEDGAR to establish whether any of the firms in their sample lacked publicly traded equity, we could use it to establish whether they filed any 10-Ks and for what period. The advantage of doing this work at the beginning is that we can more precisely define the data we should expect to collect.