How is data science used in deal sourcing?

Dumb question here, but I've heard more and more about firms using data science to drive their sourcing efforts (i.e., identifying targets to approach outside of a process). Can anyone explain this in a little more detail? What kind of metrics would the algorithms be looking at, and from what sources?

 

Makes sense, thanks! I think a few other service providers (Grata?) use AI/ML in their scraping too. Any thoughts on how an in-house team would use it? I'm guessing in a similar way: the in-house data team grabs similar metrics from public sources (LinkedIn, conference/trade attendee lists, websites, etc.) so the sourcing team can prioritize direct outreach?

Anyone here have experience with this?

 

Coming from data science in a different area I can maybe help you take a step in the right direction. When people think of Data Science they might think fancy algos but in reality it's more like "How can we use data to make smarter decisions?", then work backwards from there. So here's how I'd work backwards for sourcing to solve the question "Who should I reach out to for sourcing?":
A) Historically, in X vertical, companies begin entering a process at Y revenue number/employee count with Z probability. Capture this data in an Excel sheet for all I care.
B) In X vertical, revenues are forecasted to increase by Y percent. Relate this back to the relationship in A.
C) Capture the names of those companies, and then scrape LinkedIn using Python for "Company Name, Title = VP"
D) Populate the name, title, and chance this person could enter a process.

Ta da. You now have a rough probability of a company entering a process based on historical data and have started building a pipeline. This took me all of 5 minutes to type and I have zero expertise in this industry so imagine the possibilities. Make sense?
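To make steps A, B and D a bit more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the historical records, the revenue bands, the 20% growth forecast and the company names are all made up, and the "probability" is just the historical share of companies in a band that entered a process. The contact lookup in step C is left out on purpose.

```python
from collections import defaultdict

# Hypothetical historical records: (vertical, revenue in $M, entered a process?)
HISTORY = [
    ("software", 8, False),
    ("software", 12, False),
    ("software", 22, True),
    ("software", 27, True),
    ("software", 31, False),
    ("software", 55, True),
]

# Assumed revenue bands in $M; pick whatever cut-offs fit your vertical.
BANDS = [(0, 20), (20, 50), (50, float("inf"))]


def band_for(revenue):
    """Return the (low, high) revenue band that contains `revenue`."""
    for low, high in BANDS:
        if low <= revenue < high:
            return (low, high)
    raise ValueError(f"revenue {revenue} is outside every band")


def process_rates(history):
    """Step A: share of companies per (vertical, band) that entered a process."""
    counts = defaultdict(lambda: [0, 0])  # key -> [entered, total]
    for vertical, revenue, entered in history:
        key = (vertical, band_for(revenue))
        counts[key][0] += int(entered)
        counts[key][1] += 1
    return {key: entered / total for key, (entered, total) in counts.items()}


def score_targets(targets, rates, growth):
    """Steps B and D: project revenue forward, then look up the historical rate."""
    rows = []
    for name, vertical, revenue in targets:
        projected = revenue * (1 + growth[vertical])
        rate = rates.get((vertical, band_for(projected)))  # None if no history
        rows.append({"company": name, "projected_revenue": projected, "rate": rate})
    # Highest rate first; companies with no historical rate go last.
    return sorted(rows, key=lambda r: (r["rate"] is None, -(r["rate"] or 0)))


rates = process_rates(HISTORY)
pipeline = score_targets(
    [("Acme Analytics", "software", 18), ("Beta Cloud", "software", 9)],
    rates,
    growth={"software": 0.20},
)
for row in pipeline:
    print(row)
```

With a handful of rows per band the rates are obviously noisy, so treat the output as a way to rank outreach rather than a real probability.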

 

First off, most PE/GE firms aren't doing any real data science, if by that you mean analyzing very large datasets in programs like SQL and drawing conclusions from that data.

As another poster mentioned, PE/GE firms might use tools like Sourcescrub to find leads. And Sourcescrub itself just scrapes the web and probably does some manual data gathering / cleansing so that they can sell their platform to PE end users. 

I work in growth equity and the process PWM Hopeful outlined is spot on. You are looking for targets in X vertical, with revenue/EBITDA between $Y and $Z, and growth rates above W%. You start by going on Sourcescrub and looking at other industry market maps to find all the relevant companies in X vertical. You pull in LinkedIn headcount info, which tends to be a good way to back into revenue / growth rates, maybe pull in prior funding history, then use LinkedIn to find the person you want to reach out to at each target company (almost always the CEO). I doubt most PE firms are using Python to scrape LinkedIn. Usually you just find the CEO and get his/her email to reach out (Sourcescrub has CEO email addresses).
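For anyone who wants to see what that screen looks like once the data is in a list, here is a rough sketch. The companies, the headcount numbers and especially the revenue-per-employee ratio are placeholders I made up, not benchmarks; in practice you would swap in whatever ratio makes sense for the vertical.

```python
from dataclasses import dataclass


@dataclass
class Company:
    name: str
    vertical: str
    headcount_now: int
    headcount_year_ago: int


# Assumed revenue per employee in $ thousands; a placeholder, not a benchmark.
REVENUE_PER_EMPLOYEE_K = 150


def estimate_revenue_m(company):
    """Back into revenue ($M) from current headcount using the assumed ratio."""
    return company.headcount_now * REVENUE_PER_EMPLOYEE_K / 1000


def headcount_growth(company):
    """Year-over-year headcount growth, used as a stand-in for revenue growth."""
    return company.headcount_now / company.headcount_year_ago - 1


def screen(companies, vertical, min_rev_m, max_rev_m, min_growth):
    """Keep companies in the vertical whose estimated revenue and growth fit."""
    hits = []
    for c in companies:
        if c.vertical != vertical or c.headcount_year_ago <= 0:
            continue
        revenue = estimate_revenue_m(c)
        growth = headcount_growth(c)
        if min_rev_m <= revenue <= max_rev_m and growth > min_growth:
            hits.append((c.name, revenue, growth))
    # Fastest-growing targets first.
    return sorted(hits, key=lambda h: h[2], reverse=True)


companies = [
    Company("Acme Analytics", "software", 120, 90),
    Company("Beta Cloud", "software", 40, 38),
    Company("Gamma Freight", "logistics", 300, 200),
    Company("Delta Data", "software", 200, 125),
]
targets = screen(companies, "software", min_rev_m=10, max_rev_m=50, min_growth=0.20)
for name, revenue, growth in targets:
    print(f"{name}: ~${revenue:.0f}M revenue, {growth:.0%} headcount growth")
```

The output of something like this is just the list you would load into the CRM before starting outreach.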

Once you have all this info, you populate your firm's CRM (Salesforce, Dealcloud, etc.) and start reaching out, usually with a template email or something that is slightly adjusted to reflect the interest your firm has in X vertical. Then you keep reaching out (often several times) until you get a response, or if you don't get a response, you try to find another avenue to get in touch with the company. That could be through a banker, through the PE firm that currently owns the business, 2nd degree LinkedIn connections, etc.

Hopefully this helps. As you can probably tell, this is not a very technically-sophisticated process. You could probably train a high-schooler to do it...

 

You're stating it's not technically sophisticated because you don't know anything about the technicals and are posing like you do. It can be infinitely technical if you had any respect or knowledge of the skillset instead of just regurgitating what the first poster + myself said. Class is now in session.

First off, SQL isn't a program; it's a query language. It queries the data, hence the name Structured Query Language. This is the equivalent of saying SUMIF is a program instead of an Excel function, or C# is a program instead of a language. Categorically wrong.

Second, nobody is doing analysis of datasets using SQL. Since - you know - it queries the data, not analyzes it. That's what Python, R, Stata, SPSS, etc. are for.

I see why you believe it isn't technical, you're ignorant of technicalities. 

Don't make me continue, but I will if you pose or disrespect the skillset again. 

 

When I spoke about it being "not very technically-sophisticated" I was referring to the sourcing process I described in paragraphs 3 and 4, not what you described above, which I agree is technical. Sorry I didn't make that clear.

Most PE firms aren't using any sort of sophisticated data analysis (which again, I agree is technical, and beyond my understanding) to source deals. Maybe a few, but not most. 

 

There is nothing in what you said that entails a skill set that a high schooler cannot acquire. The fact that you have zero expertise is evident from how you described the process to identify prospects. Which is totally fine, except that you seem to think you have the expertise to lecture some anonymous person trying to help. Not cool.

The tool for analyses depends on the task at hand, type of data, skills available, etc. SQL can be used to analyze data in some use cases, but you seem to think "select * from linkedin" is its peak capability.

It's cool to show off your expertise. But two caveats. First, acquire some expertise. Second, please be humble. We all make mistakes, and we learn.
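To illustrate the point about SQL being usable for analysis in some cases, here is a small sketch that runs an aggregate query through Python's built-in sqlite3 module. The table, column names and rows are invented for the example; the query groups companies by vertical and computes average revenue and the share that entered a process, which is a simple analysis done entirely in SQL.

```python
import sqlite3

# In-memory database with a hypothetical table of companies.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE companies ("
    "name TEXT, vertical TEXT, revenue_m REAL, entered_process INTEGER)"
)
conn.executemany(
    "INSERT INTO companies VALUES (?, ?, ?, ?)",
    [
        ("Acme Analytics", "software", 18, 1),
        ("Beta Cloud", "software", 6, 0),
        ("Delta Data", "software", 30, 1),
        ("Gamma Freight", "logistics", 45, 0),
        ("Epsilon Haulage", "logistics", 15, 1),
    ],
)

# Aggregate per vertical: count, average revenue, share that entered a process.
query = """
SELECT vertical,
       COUNT(*)             AS n,
       AVG(revenue_m)       AS avg_revenue_m,
       AVG(entered_process) AS process_rate
FROM companies
GROUP BY vertical
ORDER BY process_rate DESC
"""
summary = conn.execute(query).fetchall()
for vertical, n, avg_revenue_m, process_rate in summary:
    print(f"{vertical}: n={n}, avg revenue ${avg_revenue_m:.1f}M, rate {process_rate:.0%}")
conn.close()
```

Anything beyond grouping and aggregating (modelling, forecasting) is where Python or R usually takes over.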

 

We aren't using data science but we use some light tech.

Plugging into PPP data, import/export records, SEMRush API, BuiltWith.com's API, and a handful of others to build a list of targets. Then our overseas employee scrapes their email when it's not picked up by our scraper. Then auto email!

Works strangely well.

Example of one of those indicators in use would be import/export records. If containers being imported are over X, you know the target is in your ~range. Can use growth of imports as a proxy for growth too. 
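A rough sketch of the container indicator, assuming you already have the import records as rows of (company, year, containers). The companies, counts and the threshold of 100 are made up for illustration.

```python
# Hypothetical import records: (company, year, containers imported)
RECORDS = [
    ("Acme Outdoor", 2022, 140),
    ("Acme Outdoor", 2023, 210),
    ("Beta Home", 2022, 30),
    ("Beta Home", 2023, 33),
    ("Gamma Toys", 2023, 500),
]


def containers_by_year(records):
    """Sum containers per company per year."""
    totals = {}
    for company, year, containers in records:
        by_year = totals.setdefault(company, {})
        by_year[year] = by_year.get(year, 0) + containers
    return totals


def flag_targets(records, year, min_containers):
    """Flag companies over the container threshold and attach import growth."""
    flagged = []
    for company, by_year in containers_by_year(records).items():
        current = by_year.get(year, 0)
        if current <= min_containers:
            continue
        prior = by_year.get(year - 1)
        growth = current / prior - 1 if prior else None  # None: no prior year
        flagged.append((company, current, growth))
    return flagged


flagged = flag_targets(RECORDS, 2023, min_containers=100)
for company, containers, growth in flagged:
    label = "n/a" if growth is None else f"{growth:.0%}"
    print(f"{company}: {containers} containers, import growth {label}")
```

Companies with no prior-year records come back with no growth figure, so they still get flagged on size but need another signal for growth.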

None are perfect, but it saves a lot of time.

 
