ESG 10-K Extraction (SEC EDGAR)

This project finds each company's most recent 10-K filing on SEC EDGAR and extracts its ESG-relevant sections using a reproducible method.

Latest Extraction Method (Accurate)

  1. Query the SEC submissions endpoint by company CIK (map tickers to CIKs first if needed).
  2. Keep only filings whose form is 10-K, then sort by filingDate, newest first.
  3. Use the latest accession number to locate the primary document in the filing index.
  4. Parse the filing HTML, then extract ESG sections using headings and keyword anchors (e.g., “Sustainability,” “Climate,” “Human Capital,” “Governance”).
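
The steps above can be sketched in Python as follows. This is a minimal illustration, not the project's actual implementation: the function names (fetch_submissions, latest_10k, primary_document_url, find_esg_anchors), the ESG_KEYWORDS tuple, and the 80-character heading cutoff are all assumptions introduced here. It relies on the submissions JSON exposing parallel arrays under filings.recent and on SEC's requirement that automated requests send a descriptive User-Agent. It builds the document URL directly from the primaryDocument field rather than reading the filing index page, and it treats any short text run containing a keyword as a candidate heading, which is only a rough heuristic.

```python
import json
import urllib.request
from html.parser import HTMLParser

# Keyword anchors taken from step 4; extend as needed.
ESG_KEYWORDS = ("sustainability", "climate", "human capital", "governance")


def fetch_submissions(cik, user_agent):
    """Step 1: download the submissions JSON for one CIK.

    user_agent should identify you (e.g. a name and contact email).
    """
    url = f"https://data.sec.gov/submissions/CIK{int(cik):010d}.json"
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def latest_10k(submissions):
    """Step 2: return the newest 10-K in filings.recent, or None.

    Exact match on "10-K" deliberately skips amendments such as 10-K/A.
    Only the recent-filings block is searched (an assumption of this sketch).
    """
    recent = submissions["filings"]["recent"]
    rows = zip(recent["form"], recent["filingDate"],
               recent["accessionNumber"], recent["primaryDocument"])
    ten_ks = [row for row in rows if row[0] == "10-K"]
    if not ten_ks:
        return None
    # ISO dates (YYYY-MM-DD) sort correctly as strings.
    _, date, accession, document = max(ten_ks, key=lambda row: row[1])
    return {"filingDate": date, "accessionNumber": accession,
            "primaryDocument": document}


def primary_document_url(cik, filing):
    """Step 3: build the archive URL of the filing's primary document."""
    accession = filing["accessionNumber"].replace("-", "")
    return ("https://www.sec.gov/Archives/edgar/data/"
            f"{int(cik)}/{accession}/{filing['primaryDocument']}")


class _TextRuns(HTMLParser):
    """Collects whitespace-normalized text runs between tags."""

    def __init__(self):
        super().__init__()
        self.runs = []

    def handle_data(self, data):
        text = " ".join(data.split())
        if text:
            self.runs.append(text)


def find_esg_anchors(html, max_heading_len=80):
    """Step 4: return (index, keyword, text) for short runs with a keyword."""
    parser = _TextRuns()
    parser.feed(html)
    parser.close()
    hits = []
    for index, text in enumerate(parser.runs):
        if len(text) > max_heading_len:
            continue  # long runs are treated as body text, not headings
        lowered = text.lower()
        for keyword in ESG_KEYWORDS:
            if keyword in lowered:
                hits.append((index, keyword, text))
                break
    return hits
```

A caller would chain these as fetch_submissions, then latest_10k, then primary_document_url, download that URL with the same User-Agent, and pass the HTML to find_esg_anchors. The returned indices mark where each candidate section starts, so the text between one hit and the next can be taken as that section's body.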

Outcome