I was doing a demo with a prospective client last week and had a typical experience – I submitted a request file, supplied by the client, for director compensation data for 2008. They had identified a sample of registrants and wanted to see how to access data related to their sample. I used the file while they were watching and pulled the data – unfortunately the missing-cik-year report listed 67 CIKs for which no data was available for 2008. I am glad we provide this summary so our users don’t have to muck around to identify which members of their sample are missing. Of course the next question is: why are they missing?
Usually I will take them on a tour of EDGAR for a few of the listed CIKs and show that there are no filings for the time period. The first CIK in the list of missing values was 3133 (AMSOUTH BANCORPORATION). The reason why director compensation for AMSOUTH for 2008 is not available is readily apparent from this image:
Our potential client observed that they understood but would like a more concrete way to establish whether or not data should be available. I absolutely understood that and this issue has been bothering me for a while.
One alternative we considered was to try to find all of the delisting notices (15-12B) and create some summary of data from those filings. Unfortunately – too many registrants do not actually file a 15-12B. Further – there is another problem – sometimes data is missing because the registrant has not yet registered or, even if they have registered, they may not yet be obligated to file the reports that contain some of the data objects our clients are trying to collect.
The solution that we have settled on for the time being is to create a summary file that lists, for every CIK that has ever filed any form of the 10-K, the date of their first 10-K filing and the date of their most recent 10-K filing. We have done so and uploaded this data to our distribution server. We are calling this data type 10-K_HISTORY. It should be visible in our ExtractionPreprocessed data window before 12/1.
Because this is a snapshot at a point in time we are setting this data up with an RDATE of 20180101. This means that your request file will have to have the value of 2018 in the YEAR column for every CIK you want to check.
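To make the YEAR requirement concrete, here is a minimal Python sketch of turning a list of missing CIKs into a request file for 10-K_HISTORY. It is only an illustration: the header names `CIK` and `YEAR`, the comma-separated layout, and the function name `build_history_request` are assumptions for this example, so match them to the request file format you already use.

```python
import csv
import io

# The snapshot has an RDATE of 20180101, so every row uses the year 2018.
SNAPSHOT_YEAR = 2018


def build_history_request(missing_ciks, year=SNAPSHOT_YEAR):
    """Return CSV text with one row per unique CIK, each with the same year.

    The column names CIK and YEAR are assumptions for this sketch.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["CIK", "YEAR"])
    seen = set()
    for cik in missing_ciks:
        cik = int(cik)  # normalize, e.g. "0000003133" becomes 3133
        if cik in seen:
            continue  # each CIK only needs to be checked once
        seen.add(cik)
        writer.writerow([cik, year])
    return buf.getvalue()


# CIKs mentioned in this post; the repeated 3133 is dropped.
request_text = build_history_request([3133, 1650107, 1467623, "0000003133"])
print(request_text)
```

The year is the same on every row no matter which year was missing from the original request, because the history data is a single snapshot rather than a year-by-year series.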
When you submit the request file we will return a results file that includes the following headings:
Note – the references to FIRST_FILING and RECENT_FILING are specifically references to any form of a 10-K filing (10KSB, 10K405, etc.). So they are not necessarily the first or most recent filing the registrant made on EDGAR. The balance sheet date values are the balance sheet dates that those filings cover.
We hope this makes it easier to understand why you might be missing data. Rather than having to inspect EDGAR for relevant dates, you can use your missing report to construct a request file and then check the 10-K history for each missing CIK. Here is a screenshot after testing the process with the results I alluded to at the beginning of this post:
We hope this makes data validation more efficient and less painful. Since each CIK has only one row of data this should be quick data to access and act on.
Of course there are always catches. I had 67 observations that were missing data for 2008. I submitted all 67 CIKs and the results included another missing report for 5 CIKs. Well it turns out that these 5 CIKs have never filed any form of 10-K. For example, one of the CIKs belonged to COCA-COLA EUROPEAN PARTNERS PLC (1650107). They file 20-F and 6-K forms. Another belonged to DROPBOX (1467623) – they just went public in 2018 and have yet to file a 10-K.
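One way to act on the results is to sort each missing CIK into a rough bucket. The sketch below is hedged accordingly: it assumes the results have been loaded into a dictionary keyed by CIK, that FIRST_FILING and RECENT_FILING are `YYYYMMDD` strings, and that comparing calendar years is good enough for a first pass. The function name `classify_missing` and the two sample history rows are made up for illustration and are not real filing dates.

```python
from datetime import datetime


def classify_missing(cik, target_year, history):
    """Suggest why data for target_year might be missing for a CIK.

    history maps CIK -> {"FIRST_FILING": "YYYYMMDD", "RECENT_FILING": "YYYYMMDD"}.
    The date format and the dictionary layout are assumptions for this sketch.
    """
    row = history.get(cik)
    if row is None:
        # Not in the 10-K history at all, e.g. a 20-F filer or a recent IPO.
        return "no 10-K on record"
    first_year = datetime.strptime(row["FIRST_FILING"], "%Y%m%d").year
    recent_year = datetime.strptime(row["RECENT_FILING"], "%Y%m%d").year
    if target_year < first_year:
        return "before first 10-K"
    if target_year > recent_year:
        return "after most recent 10-K"
    return "inside 10-K window, needs a closer look"


# Hypothetical rows with invented dates (CIKs 1 and 2 are placeholders).
history = {
    1: {"FIRST_FILING": "19960328", "RECENT_FILING": "20060310"},
    2: {"FIRST_FILING": "20110225", "RECENT_FILING": "20180220"},
}

for cik in (1, 2, 1467623):
    print(cik, classify_missing(cik, 2008, history))
```

Because compensation for a given year is typically reported in filings made the following year, a result near either edge of the window deserves a manual look rather than an automatic conclusion.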