What's your process for pulling qualitative data from 10-Ks across multiple peers?

Running into a workflow bottleneck and wanted to see how other analysts are handling this.

When you're building a deck and need to compare how a group of 5-10 peer companies are talking about a specific theme (e.g., "supply chain risks," "customer concentration," "AI strategy") across their last several 10-Ks and Qs, what's your process?

Right now, it feels like a brute-force exercise: download all the PDFs, Ctrl+F for keywords, and manually copy/paste the relevant paragraphs into Excel/PowerPoint. It's incredibly time-consuming and feels like there should be a better way.

Are you guys using a specific tool for this (that isn't a full Bloomberg/FactSet terminal), or is this just one of those parts of the job you have to grind through?

I'm a former MSFT software engineer now doing my Master's and exploring AI solutions in this space. I'm trying to figure out if this is a real, universal pain point or if I'm just missing something obvious.

Any insights would be appreciated.

5 Comments
 

When it comes to pulling qualitative data from 10-Ks across multiple peers, here's what the most helpful WSO content suggests:

  1. Keyword Search and Thematic Analysis:

    • Your current method of using Ctrl+F for keywords is a common starting point, but it can be optimized. Instead of manually copying and pasting, consider using tools like Tableau or Excel with VBA scripts to automate the extraction and organization of data. This can help streamline the process of identifying and categorizing relevant sections.
  2. Alternative Data Tools:

    • While Bloomberg and FactSet are the gold standards, there are alternatives like CapIQ, PrivCo, or PitchBook that can assist in gathering and analyzing qualitative data. These platforms often allow for more targeted searches and can save time compared to manual methods.
  3. AI and NLP Solutions:

    • Given your background in software engineering and interest in AI, exploring Natural Language Processing (NLP) tools could be a game-changer. Tools like Python's spaCy, NLTK, or even GPT-based models can be trained to extract specific themes (e.g., "supply chain risks") from 10-Ks. You could build a script to:
      • Parse the text from 10-K PDFs.
      • Identify and extract paragraphs containing your keywords.
      • Organize the data into a structured format for analysis.
  4. Efficient Workflow Tips:

    • Start with the Business section (Item 1) and Risk Factors (Item 1A) of the 10-Ks. These sections are typically where themes like "supply chain risks" or "customer concentration" are discussed.
    • Use investor presentations as a quick primer before diving into the 10-Ks. They often highlight key themes and can guide your search.
  5. Collaborative Tools:

    • If you're working in a team, consider using shared databases or cloud-based tools like Google Sheets or Airtable to centralize and track the extracted data. This can help reduce redundancy and improve collaboration.
  6. Is This a Universal Pain Point?:

    • Yes, this is a common challenge for analysts. The process of extracting qualitative data from 10-Ks is often seen as a grind, but it's also an area ripe for innovation. Your background in AI could position you well to develop a solution that addresses this pain point.
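The script steps listed under item 3 could be sketched roughly as follows. This is a minimal illustration, not a finished tool: it assumes the filing text has already been extracted to plain text (the PDF or HTML parsing step is out of scope here), and the `THEMES` keyword map, the function names, and the column layout are all hypothetical choices made for the example. It uses plain substring matching in place of a trained NLP model.

```python
import csv
import io
import re

# Hypothetical theme-to-keyword map; replace with the themes you are comparing.
THEMES = {
    "supply chain risks": ["supply chain", "supplier", "shortage"],
    "customer concentration": ["customer concentration", "largest customer"],
}

FIELDS = ["company", "period", "theme", "keywords", "paragraph"]


def split_paragraphs(text):
    """Split plain text on blank lines and collapse internal whitespace."""
    parts = re.split(r"\n\s*\n", text)
    return [" ".join(p.split()) for p in parts if p.strip()]


def extract_theme_paragraphs(company, period, text, themes=THEMES):
    """Return one row per (paragraph, theme) pair where any keyword matches.

    Matching is a case-insensitive substring test, the same thing Ctrl+F
    does, so it will miss paraphrases and may pick up false positives.
    """
    rows = []
    for para in split_paragraphs(text):
        lowered = para.lower()
        for theme, keywords in themes.items():
            hits = [kw for kw in keywords if kw in lowered]
            if hits:
                rows.append({
                    "company": company,
                    "period": period,
                    "theme": theme,
                    "keywords": "; ".join(hits),
                    "paragraph": para,
                })
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text that opens directly in Excel or Sheets."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Calling `extract_theme_paragraphs("ExampleCo", "FY2024", filing_text)` once per filing and concatenating the rows gives a single table that can be filtered by company, period, or theme, which replaces the manual copy/paste step. Swapping the substring test for an embedding or LLM-based classifier would be the natural next step if keyword matching proves too noisy.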

If you're exploring AI solutions, focusing on automating the parsing and thematic analysis of 10-Ks could be a valuable contribution to the field. Good luck!

Sources: Knowledge Sharing: Corporate Development / M&A, Anatomy of the 10-K, https://www.wallstreetoasis.com/forum/hedge-fund/machine-learning-taking-over-hf-research-analyst-roles-in-near-future?customgpt=1, AI in fundamental investing

I'm an AI bot trained on the most helpful WSO content across 17+ years.
 

AlphaSense has a generative AI grid search for this but it sucks ass. I would assume most of its competitors do too.

And can it ever be?
 

Thanks! Could you share an example question and answer: what you would like to get, and possibly what you currently get?

 
