Programming/Technical Skills for Finance: SQL and Python

I find that there are a lot of skeptics on WSO as to whether programming abilities are useful in finance. Believe it or not, programming is for more than just developers and quants. These skills are in high demand and will continue to be for many years to come.

Finance and Programming

In this post, I will give a few basic reasons for learning and using two programming languages (if you want to call them that) in the finance industry: SQL and Python.

Best Programming Language for Finance: SQL

The ability to store vast amounts of data on physical drives and (more recently) in the cloud has changed the way many of today's industries function. You can keep records of what you've done, analyze what others have done, and move data from one location to another; the possibilities are endless. Much of this data is stored in databases and, unfortunately, SQL is the main tool used to retrieve it.

I say "unfortunately" because SQL on its own isn't a particularly powerful programming language (see Python below). Still, in finance, where so many decisions are driven by analytics, it shocks me how many people don't know or don't want to learn SQL.

How did learning SQL help me? You can retrieve exactly the data you need to make an argument. This is especially important if you're junior and your boss (like many senior people in finance) isn't very technical. No intelligent manager would argue with data, and if he does, maybe it's time to find a new job. All the trades of a particular security between certain prices on a given date? Easy. Every company with a multiple higher than 20 that trades on the NYSE? Done. Of course, this all assumes that your workplace has the databases you need to do your work.
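To make those two questions concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `trades` and `companies` tables, their columns, and every row in them are made up for illustration; your firm's database will look different.

```python
import sqlite3

# Hypothetical schema and sample rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trades (ticker TEXT, trade_date TEXT, price REAL, quantity INTEGER);
CREATE TABLE companies (ticker TEXT, exchange TEXT, multiple REAL);
INSERT INTO trades VALUES
    ('ABC', '2024-03-01',  99.50, 100),
    ('ABC', '2024-03-01', 101.25, 250),
    ('ABC', '2024-03-01', 105.00,  75),
    ('ABC', '2024-03-04', 100.75, 300),
    ('XYZ', '2024-03-01', 100.10, 500);
INSERT INTO companies VALUES
    ('ABC', 'NYSE',   24.5),
    ('XYZ', 'NASDAQ', 31.0),
    ('DEF', 'NYSE',   12.3),
    ('GHI', 'NYSE',   20.0);
""")


def trades_in_price_range(ticker, trade_date, low, high):
    """All trades of one security between two prices on a given date."""
    rows = conn.execute(
        "SELECT price, quantity FROM trades "
        "WHERE ticker = ? AND trade_date = ? AND price BETWEEN ? AND ? "
        "ORDER BY price",
        (ticker, trade_date, low, high),
    )
    return rows.fetchall()


def high_multiple_companies(exchange, min_multiple):
    """Tickers on one exchange whose multiple is strictly above a threshold."""
    rows = conn.execute(
        "SELECT ticker FROM companies "
        "WHERE exchange = ? AND multiple > ? ORDER BY ticker",
        (exchange, min_multiple),
    )
    return [row[0] for row in rows]


print(trades_in_price_range("ABC", "2024-03-01", 100, 102))
print(high_multiple_companies("NYSE", 20))
```

The `?` placeholders let the database driver handle quoting, which is a safer habit than pasting values into the query string.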

And if they don't, maybe you can start a good habit by writing and editing data. In any job, if you want to keep track of something and perform analysis on it some time down the road, SQL may help you get there. One can easily write a script to add to a table daily, and you only have to press "F5" (or "Ctrl+Enter", if you prefer) every morning to get the job done. As Bill Gates put it, "I choose a lazy person to do a hard job. Because a lazy person will find an easy way to do it."
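A rough sketch of what such a daily script might look like, again with sqlite3 standing in for whatever database you actually use. The `daily_log` table, the `desk_pnl` metric name, and the numbers are all hypothetical.

```python
import sqlite3
from datetime import date


def record_daily_value(conn, as_of, metric, value):
    """Add one observation to a running log table.

    The (as_of, metric) primary key means re-running the script on the
    same day overwrites that day's row instead of duplicating it.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_log ("
        "as_of TEXT, metric TEXT, value REAL, PRIMARY KEY (as_of, metric))"
    )
    conn.execute(
        "INSERT OR REPLACE INTO daily_log (as_of, metric, value) VALUES (?, ?, ?)",
        (as_of.isoformat(), metric, value),
    )
    conn.commit()


# In-memory database for the demo; pass a file path instead to keep the history.
conn = sqlite3.connect(":memory:")
record_daily_value(conn, date(2024, 3, 1), "desk_pnl", 1250.0)
record_daily_value(conn, date(2024, 3, 4), "desk_pnl", -310.5)
record_daily_value(conn, date(2024, 3, 4), "desk_pnl", -300.0)  # same-day rerun

history = conn.execute(
    "SELECT as_of, value FROM daily_log ORDER BY as_of"
).fetchall()
print(history)
```

In practice you would call `record_daily_value` with `date.today()` and whatever number you want to track, then run it each morning.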

Finance with Python

In my opinion, Python is the ultimate flexible programming language today. While it may not be the fastest, it's one of the easiest to learn in that you don't have to memorize weird syntax rules that don't make sense. You don't have to worry about building or compiling code; you can literally type print("Hello World") and click run and it'll work (older Python 2 code writes this without the parentheses). The open source libraries for Python have tremendous breadth and depth.

How is Python Used in Finance

But how is it useful in finance? Here's a partial list:

  • Financial modeling: If Excel/VBA can do it, Python certainly can (and can probably do more) since it's a full-fledged programming language. Except you're not limited by how much data you can see on your screen and you can run more scenarios efficiently with a few lines of code.
  • Backtesting trades: You can easily code an algorithm in Python and run it on data to see how well your strategy performs. Honestly, it's much easier than coding it in a language that HFT algorithms actually use (like C++).
  • Analyzing data: Remember our friend SQL? Well it's not exactly the most adept at analyzing data; it only retrieves and organizes it. Good thing that Python, with a few packages, allows you to directly import SQL queries and work with the data that it returns in more complex ways.
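As a rough illustration of the last two bullets, the sketch below pulls closing prices out of a SQL table, computes daily returns, and backtests a toy moving-average rule. Everything here is an assumption for the example: the `closes` table, the prices, and the trading rule itself (which is not a recommendation). It sticks to the standard library so it runs anywhere; with pandas installed, `pandas.read_sql_query` can load a query result straight into a DataFrame.

```python
import sqlite3
from statistics import mean, stdev

# Hypothetical price history, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE closes (trade_date TEXT, close REAL)")
conn.executemany("INSERT INTO closes VALUES (?, ?)", [
    ("2024-03-01", 100.0), ("2024-03-04", 102.0), ("2024-03-05", 101.0),
    ("2024-03-06", 105.0), ("2024-03-07", 107.0), ("2024-03-08", 104.0),
])

# SQL retrieves and orders the data ...
prices = [row[0] for row in
          conn.execute("SELECT close FROM closes ORDER BY trade_date")]

# ... and Python does the analysis.
daily_returns = [b / a - 1.0 for a, b in zip(prices, prices[1:])]
avg_return = mean(daily_returns)
volatility = stdev(daily_returns)


def backtest_sma(prices, window):
    """Cumulative return of a toy rule: hold the asset for the next day
    whenever today's close is above its trailing `window`-day average,
    otherwise stay in cash. Ignores costs, slippage and everything else
    a real backtest would need."""
    growth = 1.0
    for t in range(window - 1, len(prices) - 1):
        sma = sum(prices[t - window + 1 : t + 1]) / window
        if prices[t] > sma:
            growth *= prices[t + 1] / prices[t]
    return growth - 1.0


print(f"average daily return: {avg_return:.4%}, volatility: {volatility:.4%}")
print(f"toy strategy return:  {backtest_sma(prices, window=2):.4%}")
```

Swapping in a different rule only means changing the `if` condition, which is the sort of quick scenario-running the list above describes.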

Again, I could go on, but I'm not even that great at Python. I only picked it up in the last few months to do all of the above. If you have any other programming experience, Python will be a breeze to learn.

tl;dr Give Python and SQL a chance to improve your efficiency at work. Doing the hard things now will make your life easier down the road.

Additional Thoughts on Programming

User @"EURCHF parity" shared a detailed post about SQL, Python, and Haskell:

EURCHF:
SQL is actually an awesome language. Fully declarative, and with enormous work having gone into the backend (such that your queries can run very fast). You can do a lot more than just select with a few filters. I would really recommend reading up on advanced SQL, query tuning, etc. to get the most out of it.

Python is very easy to get things drafted in, but you will find as you get more experienced that it is messy, buggy and comparatively less well built than other languages like Common Lisp or Haskell. The Python Pandas library was born out of the financial world to add serious performance to R style array languages. Python Pandas is equivalent to R and Octave/Matlab, but R, whilst slower, has enormously more libraries, a really nice easy to use environment in R studio for the beginner, can be programmed like a lisp as you get more advanced, and is completely free. If you would like to invest in one language for data manipulation purposes R is a safe bet. Of course Python is much superior to Java, where "it takes so long to get so little done". Java is banned in my team.

Personally I love Haskell for its power, proximity to mathematics, extraordinary compiler (now faster than C and as fast as well tuned Assembler) and top notch community. Learning Haskell directly (e.g. from LYAH coupled, perhaps, with SICP) will be easier than shifting from an imperative language like Python to the functional paradigm; syntactically it is the cleanest and easiest to pick up vs Scala or Clojure (although Clojure as a lisp is closer to Scheme which you would encounter in SICP; if you are going the lisp way, CL is still a safer bet than Clojure).

As for resources, there is ample available for free on the net. Andrew Ng's Machine Learning course on Coursera is pretty good for a beginner, although in Octave (you can easily rewrite it in R, less easily but more interestingly in Haskell). Other than that, the best is to start 3-4 of what appear to be the top tutorials in your tech stack of choice, and keep going with the best one.

Of course, all this should be done from a linux environment. Dual boot or VM yourself Ubuntu (yes, yes, no geek cred, but it's the most user friendly with the most stuff already written and debugged; BSD is a bit hardcore). Learn to use the command line (takes half an hour).

As for the value in the financial world, I am not 100% sure. I think as you mature as a banker or trader, your instinct and ability to understand humans becomes much more important than your ability to look at data, even in a complex manner. Working on CS knowledge will improve your ability to think quantitatively, certainly, and might also teach you the value of knowing that data is often messed up (and cannot be relied upon).


 

Codecademy.com and Khan Academy are both excellent free resources.

Codecademy has courses on Python/Ruby/JavaScript/PHP/etc. Khan Academy has a whole video series on Python and a bunch of stuff about Comp Sci/programming in general. I'm sure the Khan Academy community is also going to be very helpful.

Fwiw, I'm an aspiring monkey who currently teaches English abroad, and watching the first Khan Academy video on Python enabled me to write a program that saves me tons of time on grading.

 

This is awesome to hear. I do think a lot of people, in all kinds of industries, could benefit from basic scripting skills. An enormous amount of work, particularly at junior level, is basic and repetitive and easily automated.

I personally got into programming because I had to deal with data globally instead of at country level, and had no idea how to fuse together the streams from X countries. My first Python script cleaned the data (e.g. European dates -> ISO timestamp), added a country column and did a UNION ALL (which I later transferred to SQL).
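For anyone curious what that kind of script looks like, here is a minimal sketch. It assumes the European dates arrive as DD/MM/YYYY strings and that each country's feed is a list of (date, amount) rows; the country codes and figures are made up.

```python
from datetime import datetime


def to_iso(european_date):
    """Convert a 'DD/MM/YYYY' string to ISO 'YYYY-MM-DD' (format is an assumption)."""
    return datetime.strptime(european_date, "%d/%m/%Y").date().isoformat()


def combine(streams):
    """Stack per-country rows into one list, tagging each row with its country.

    `streams` maps a country code to a list of (date, amount) tuples. The
    result is the Python equivalent of a UNION ALL with an extra column.
    """
    combined = []
    for country, rows in streams.items():
        for raw_date, amount in rows:
            combined.append((to_iso(raw_date), country, amount))
    return combined


# Hypothetical sample feeds.
streams = {
    "DE": [("01/02/2024", 10.0), ("31/12/2023", 4.0)],
    "FR": [("15/03/2024", 7.5)],
}
for row in combine(streams):
    print(row)
```

ISO dates sort correctly as plain text, which is a big part of why converting early saves trouble later.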

 

I recommend a basic knowledge of VBA for Excel for IBD. Definitely not needed, but during my SA stint this past summer I had a few mundane tasks that would have taken me numerous days to do manually. Instead, I wrote a script in a couple of hours (if I was good, it could have taken 10 minutes), and saved myself an insane amount of time.

“Success means having the courage, the determination, and the will to become the person you believe you were meant to be”
 
Best Response


As for the value in the financial world, I am not 100% sure. I think as you mature as a banker or trader, your instinct and ability to understand humans becomes much more important than your ability to look at data, even in a complex manner. Working on CS knowledge will improve your ability to think quantitatively, certainly, and might also teach you the value of knowing that data is often messed up (and cannot be relied upon); however, if you are already working 20h days, better suck up to your boss and make sure you are promoted instead of spending your free time learning to code.

Lastly, there are exceptional coders out there and they have the cool jobs which are very underpaid vs value added, due to the ample supply of smart people and short supply of good jobs. So don't hope to lateral into programming unless you are really good.

 

I would say Ruby over Python (Ruby on Rails is legit), and that SQL is a great language to know. Makes DB2 and Oracle easy to pick up as well. Database certs don't hurt for resumes.

 

This may be obvious, but one thing I wish I had learned earlier is that just because you can program effectively in something like Haskell does not mean you can shirk learning VBA. While the latter is significantly simpler than functional programming, it takes precedence over all other programming languages for most areas of finance.

 

One more thing, once you are comfortable in an array language like R or Matlab, and SQL, try and find your bank's kdb development group. APL is an amazing family of languages and you will definitely have a good conversation. But don't try and learn APL languages yourself, they are "write and forget" even if spectacularly elegant, and annoying to work with in terms of communication with the outside world.

 

A couple months ago I wanted to learn Ruby (I'm familiar with several languages, but I'm not a developer by trade) and a developer buddy of mine suggested Codecademy. I did their Ruby course and found it to be pretty good so I went on to Python and found that it was equally valuable. Lot of little exercises you may find trite, but all in all it goes quick and both are worthwhile courses imo.

"My caddie's chauffeur informs me that a bank is a place where people put money that isn't properly invested."
 
cthorm:

http://en.wikibooks.org/wiki/Python_Programming

There are also some PDFs out there; I recommend 'Learn Python the Hard Way'.

mikesswimn:

A couple months ago I wanted to learn Ruby (I'm familiar with several languages, but I'm not a developer by trade) and a developer buddy of mine suggested Codecademy. I did their Ruby course and found it to be pretty good so I went on to Python and found that it was equally valuable. Lot of little exercises you may find trite, but all in all it goes quick and both are worthwhile courses imo.

Thanks guys.

 

LPTHW is ok, it's good that it is well integrated with unix and explains everything. However it does get tedious typing out all the code all the time.

I would suggest the following progression:

  • Install Ubuntu (e.g. VirtualBox it on your laptop; if low on RAM, go for Xubuntu, which is the lightweight version).
  • "Learn Bash the Hard Way".
  • Codecademy's Python course, which is great for people with ADD (the Java course is a hint better for the subjects covered, which are the absolute basics).
  • Coursera's SQL course.
  • At this point you can either just google issues for StackOverflow answers, or go the actual hard way: http://mitpress.mit.edu/sicp/full-text/book/book.html
  • Having been amazed at what is possible in Scheme, time to pick up a useful, more powerful language used in the real world: http://learnyouahaskell.com/chapters

And if you're wondering whether it's worth it: http://www.paulgraham.com/avg.html

 

This is a great post-exposé. However, when referring to "SQL," do you mean MySQL or SQL through Oracle/Microsoft? And can you run Python on a Mac? Or do you need to have Parallels to run it? On a separate note, it's a shame that most universities require some sort of overview to programming course; having gone to Columbia and completed the core curriculum, I must say I loved most code classes, but none are getting me top job offers!

 

Python is fairly straightforward, and you will be able to write up some quick scripts to help you solve problems in your other classes. It can't hurt to learn, that's for sure.

This to all my hatin' folks seeing me getting guac right now..
 

If by IB you mean an investment bank, Python is useful (along with other programs) in trading, MO, and AM work. For investment banking itself, you'll use Excel, and knowing VBA/Python/C/etc. is helpful but not required.

Get busy living
 

