A new tool for running analytical workloads over 10K+ pages of PDF documents in minutes

Hi IB friends, I'm working on a tool for fast, accurate insight extraction from long, complex documents. It is based on a new paradigm: using SQL to drive document analysis. It would be great to get a quick thumbs up / down on how helpful this would be.

How it works if you are non-technical

It's like using ChatGPT to interrogate each file individually, with the results laid out in an Excel sheet. You can then use ChatGPT to summarize that sheet further.

How it works if you are technical

  1. There is a user interface that lets you do all of the steps below 👇
  2. You define the insights to extract from documents in a data schema.
  3. A specialized SQL query then applies the data schema to many documents. It looks like this:

SELECT agent(<schema_id>, document_column) FROM document_table;

where document_column is a column that holds the actual PDF files.

  4. LLMs then work behind the SQL engine to extract insights from the documents and collect them into tabular SQL results, based on the data schema.
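To make the steps above concrete, here is a minimal sketch of the idea using Python's built-in sqlite3 module. It is not our engine: the SCHEMAS registry, the fake_llm_extract stub (which just matches "key: value" lines instead of calling an LLM), and the plain-text stand-ins for PDF files are all assumptions made for illustration. Only the shape of the query mirrors the one above.

```python
import json
import sqlite3

# Hypothetical schema registry: schema_id -> fields to extract (assumption).
SCHEMAS = {1: ["company", "revenue_musd"]}

def fake_llm_extract(fields, document_text):
    """Stand-in for an LLM call: picks up 'key: value' lines named in the schema."""
    found = {}
    for line in document_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in fields:
            found[key.strip()] = value.strip()
    # Fields the document does not mention come back as None.
    return {field: found.get(field) for field in fields}

def agent(schema_id, document):
    """SQL-callable function: apply one schema to one document, return JSON text."""
    return json.dumps(fake_llm_extract(SCHEMAS[schema_id], document))

conn = sqlite3.connect(":memory:")
conn.create_function("agent", 2, agent)  # expose agent() to SQL
conn.execute("CREATE TABLE document_table (name TEXT, document_column TEXT)")
conn.executemany(
    "INSERT INTO document_table VALUES (?, ?)",
    [
        # Plain text stands in for real PDF content here (assumption).
        ("a.pdf", "company: Acme\nrevenue_musd: 120\nnotes: n/a"),
        ("b.pdf", "company: Globex\nrevenue_musd: 85"),
    ],
)

rows = conn.execute(
    "SELECT name, agent(1, document_column) FROM document_table ORDER BY name"
).fetchall()
results = [(name, json.loads(payload)) for name, payload in rows]

for name, extracted in results:
    print(name, extracted)
```

Each document becomes one row of extracted fields, which is what makes the output easy to drop into a spreadsheet afterwards.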

How we differ from vector-search-based solutions

We do not use vector search in this process. In our experience, vector-search-based solutions are hit or miss, both because retrieval is similarity-based and because information is lost during the data transformation steps.

Instead, we prefer to work with the raw documents in .pdf format. (Of course, you can control which documents are processed with a SQL filter.)

Advantages over chatbots

Compared to a chatbot-based solution, this paradigm is:

  • More interpretable: the insights are derived through SQL steps, so every intermediate SQL result can be inspected and explained. That makes it easier to trust the end output.
  • More flexible: you control which documents are passed in via SQL, and you can consolidate insights further via aggregation + LLM.
  • More scalable: you can get insights from 10K+ pages within minutes.
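As a rough illustration of the flexibility point, the sketch below filters documents with a WHERE clause and consolidates the extracted values with a plain SQL aggregate. Again this uses Python's sqlite3 as a stand-in, and the agent stub, the sector column, and the sample data are hypothetical. The follow-up LLM summarization step is not shown.

```python
import sqlite3

calls = []  # records which documents the stub agent actually processed

def agent(schema_id, document):
    """Stub extractor (assumption): returns the number after 'revenue_musd:'."""
    calls.append(document)
    for line in document.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "revenue_musd":
            return float(value)
    return None

conn = sqlite3.connect(":memory:")
conn.create_function("agent", 2, agent)
conn.execute(
    "CREATE TABLE document_table (name TEXT, sector TEXT, document_column TEXT)"
)
conn.executemany(
    "INSERT INTO document_table VALUES (?, ?, ?)",
    [
        ("a.pdf", "tech", "revenue_musd: 120"),
        ("b.pdf", "tech", "revenue_musd: 85"),
        ("c.pdf", "retail", "revenue_musd: 40"),
    ],
)

# The WHERE clause decides which documents are passed to agent();
# SUM and COUNT consolidate the per-document results into one row.
rows = conn.execute(
    "SELECT sector, COUNT(*), SUM(agent(1, document_column)) "
    "FROM document_table WHERE sector = 'tech' GROUP BY sector"
).fetchall()

print(rows)
print(len(calls), "documents processed")
```

Because the filter runs before extraction, the retail document is never sent to the extractor, which is also where the cost savings come from on large document sets.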

I would love any high-level input. If you think this might be useful in your daily job and want to give it a spin, let me know, and we'll set it up at no charge for experimentation workloads.
