June 19, 2018

Building a World Class Data Science Team

I have built data science and data research teams throughout the world since 2007. I didn't always use that phrase, because the term "data science" simply didn't circulate much during those years. Now, as I approach a year of building a data science firm in Pittsburgh, I am fortunate to find myself surrounded by a pool of top talent from elite universities and impressive industry backgrounds. I've found that my experience doing this in war zones and refugee camps has given us an advantage regarding the role of human factors in data science, both internally and externally. My team is incredible and that is not an accident.

The Demand for Data Scientists
Data scientist is the number one profession in demand right now. It has been recognized as America's coolest job and one of its best paying, and demand is expected to increase for many years ahead. While this is great news for someone like myself who has worked in data science for over a decade, it nonetheless creates challenges for recruitment. Most firms hire engineers, mathematicians, and physicists with good communication skills. This makes sense: these are people with advanced quantitative skills who can do the data analysis - yet analysis skills alone are insufficient to maximize the value of such a team.

Minds over Skills
My approach to hiring a data scientist is to go far beyond finding someone who can do the analysis. I specifically seek out people with interdisciplinary backgrounds and non-linear life histories. I seek out people who have degrees and educations in multiple fields but have synthesized this work with core data science specializations such as network analysis or natural language processing. Hiring such people is not easy - they are hard to find - but the difference is profound. A great data scientist probably has more in common with a backpacker in foreign lands - continually probing at the social fabric, interpreting and learning about the complexities that support the data - than with a lab researcher looking at a computer screen.

The work of a data scientist is rarely as simple as doing math. Companies rarely have their data easily accessible and compartmentalized. They rarely know which "insights" are the most important. The largest clients - typically the oldest - are sprawling with chaos and legacy systems, the natural byproduct of employee churn and the constant injection of new ideas. Consequently, great data scientists have the ability to seek out information within an organization, understand how that information flows and gets utilized, and then construct meaningful concepts about how data fits into the lives of the people doing the work. They understand that data is an abstraction and not every problem should be solved as a data problem.

Just last week, I met a man who worked as an auto mechanic for 15 years before going back to school to study computation and information systems. When I heard his story, I was immediately interested and wanted to learn more about how he would approach a given problem. He is a natural systems thinker, with a detective mindset and the skills to do the work. Given my personal background in urban planning, I tend to default to geographic and spatial thinking, so I'm always curious to learn how others will approach the same problem in different ways.

Impact over Input
On my current team, I've interviewed and hired data scientists trained in political science, sociology, journalism, neuroscience, and mechanical engineering. They are all skilled in programming languages and are adept technologists. But they aren't merely "quants"; they are people who can work with information that is simultaneously quantitative and qualitative. They can make sense of the day-to-day chaos of complex organizations and social groups. They are comfortable talking about any subject and are great storytellers. They also work well with other technical members of our team to translate their work into meaningful products through design and engineering. Their ability to do the math is top-notch - but that is not what defines the best data scientists. It just happens to be the common thread among an expert team of problem solvers.