New Tagging Examples

While I am slightly behind the schedule I shared in my last post – we are making progress. I have been hesitant to share exactly what this new tagging scheme would look like until now. Below are two examples of the new metadata we will be injecting into the 10-K and EX-13s. At the present time I do not plan to alter the metadata we inject into the the other exhibits except to add the EDGARLINK value. My initial thought is that you can access the metadata associated with the 10-K if you need some value for data collected from an exhibit.

Below is the metadata we will add to Slack’s 10-K that was filed on 3/12/2020. (We use Slack internally and I love it).

<meta name="SIC" content="7385">
<meta name="FYE" content="0131">
<meta name="CONAME" content="SLACK TECHNOLOGIES, INC.">
<meta name="ACCEPTANCETIME" content="20200312163209">
<meta name="ZIPCODE" content="94105">
<meta name="ENTITYSMALLBUSINESS" content="FALSE">
<meta name="ENTITYSHELLCOMPANY" content="FALSE">
<meta name="ENTITYPUBLICFLOAT" content="7200000000">
<meta name="PUBLICFLOATDATE" content="20190731">
<meta name="ENTITYFILERCATEGORY" content="NAF">
<meta name="ENTITYPUBLICSHARESDATE" content="20200229">
<meta name="ENTITYPUBLICSHARESLABEL_1" content="CommonClassA">
<meta name="ENTITYPUBLICSHARESCOUNT_1" content="362046257">
<meta name="ENTITYPUBLICSHARESLABEL_2" content="CommonClassB">
<meta name="ENTITYPUBLICSHARESCOUNT_2" content="194761524">
<meta name="AUDITOR" content="KPMG">
<meta name="AUDITREPORTDATE" content="20200312">
<meta name="AUDITORSINCE" content="2015">
<meta name="AUDITORCITY" content="SAN FRANCISCO">
<meta name="AUDITORSTATE" content="CALIFORNIA">
<meta name="EDGARLINK" content="">

Below is the metadata we will add to Peloton’s 10-K filed on 9/10/2020. Note the acceptance time indicates the RDATE for this filing would be 20200911 since it was accepted after 5:30 pm on 9/10. (No I don’t have a Peloton bike!)

<meta name="SIC" content="3600">
<meta name="FYE" content="0630">
<meta name="CONAME" content="PELOTON INTERACTIVE, INC.">
<meta name="ACCEPTANCETIME" content="20200910180637">
<meta name="ZIPCODE" content="10001">
<meta name="ENTITYSMALLBUSINESS" content="FALSE">
<meta name="ENTITYSHELLCOMPANY" content="FALSE">
<meta name="ENTITYPUBLICFLOAT" content="6281462442">
<meta name="PUBLICFLOATDATE" content="20191231">
<meta name="ENTITYFILERCATEGORY" content="NAF">
<meta name="ENTITYPUBLICSHARESDATE" content="20200831">
<meta name="ENTITYPUBLICSHARESLABEL_1" content="CommonClassA">
<meta name="ENTITYPUBLICSHARESCOUNT_1" content="239427396">
<meta name="ENTITYPUBLICSHARESLABEL_2" content="CommonClassB">
<meta name="ENTITYPUBLICSHARESCOUNT_2" content="49261234">
<meta name="AUDITOR" content="ERNST & YOUNG">
<meta name="AUDITREPORTDATE" content="20200910">
<meta name="AUDITORSINCE" content="2017">
<meta name="AUDITORCITY" content="NEW YORK">
<meta name="AUDITORSTATE" content="NEW YORK">
<meta name="EDGARLINK" content="">

There are two immediate implications of these changes. First, if you do a Summary or Context extraction – these values will be included in the results. The name value will be the column heading and the content value will be the row value. The second implication is that you can filter search results by the values of the content. Clearly you are not going to want to filter by the EDGARLINK – but the ability to filter by ENTITYFILERCATEGORY will help you more efficiently identify those subject to particular disclosure requirements.

To identify all of those that have multiple classes of stock we would just add the following to our search (ENTITYPUBLICSHARESLABEL_2 contains(*)). The Fields menu will list all of these fields so you don’t have to memorize the labels we have used.

We were asked to provide the EDGARLINK to allow you to map/match data you collect from our platform with data collected from other platforms that provide the accession number or a direct link to the filing. The EDGARLINK value can be parsed easily in Excel to give you the accession number.

Right now the constraint is AUDITOR – we have auditor data back to 2011 at the present time. We have been improving our collection strategy for this field and hope to accelerate the collection process in the coming months. The special challenge in collecting this value are those cases where the signature is an image file and then we want the location and audit report date as well. So even though many of you might be able to pull this out from AA but others can’t and we think this is a valuable field when controlling for disclosure.

We will not be able to initially add the AUDITORSINCE value for many filings with auditor changes prior to 2018 because that is going to require some separate effort to identify that data value. Procter & Gamble has been audited by Deloitte since 1890 – so we can trivially add that field to all of their 10-K filings. But we only have 533 CIK’s that have an AUDITORSINCE value prior to 1994. We have 2,457 that have had the same auditor since 2010.

My neighbors and some colleagues share an inside joke, I have stated many times that doing X is like making bread – it is a process rather than something that can happen perfectly the first time. As we move back into older filing windows there are many complications and challenges associated with identifying the metadata values (like making bread). Thus – while I expect the addition of the values to be relatively easy to manage moving forward I do anticipate some unexpected challenges as we attempt to add this data to historical filings. (Fortunately we have directEDGAR to support or work!)

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s