Most companies I talk to, from pre-product startups to Fortune 500s, are hiring data scientists. They see that data science and machine intelligence will play a key role in the business and in their products, so they want to build a competitive advantage with data. If you are one of them, you may still wonder what kind of background a good candidate should have, given that “data scientist” is a broad job title. You may also realize that hiring a data science expert is extremely tough in this market.
The good news is that you don’t need a team of Ph.D.s and statisticians to get started. While working in this industry, I have seen many businesses start their data science initiatives in a lean, cost-effective way by having just one person and by using free, open-source software like mine. Some of them even let a software engineer or a data analyst on their existing team take on the role.
What Should the First Hire Do?
If you have decided to hire someone, what should you look for in this first data scientist? I suggest finding someone who can establish the following framework:
1. Define a data problem based on your business need.
2. Set up a software platform to collect data from multiple sources. Conduct data sanity checks to ensure there are no mistakes in the collection pipeline.
3. Define an evaluation measurement based on your business goal.
4. Measure the current business performance as a baseline.
5. Understand the domain knowledge and data of your business.
6. Implement the first simple solution on a production system.
7. Conduct A/B tests with the evaluation measurement to compare the solution and the baseline.
Then repeat steps 6 and 7 with different solutions to improve results.
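To make steps 3, 4 and 7 concrete, here is a minimal sketch of how an A/B comparison against a baseline might be scored. It assumes the evaluation measurement is a conversion rate and uses a standard two-proportion z-test; the function name and all of the numbers are hypothetical and chosen only for illustration, not taken from any real business.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Compare the conversion rate of variant B against baseline A.

    Returns (lift, z, p_value), where lift is rate_b - rate_a and
    p_value is two-sided under a normal approximation.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both groups need at least one visitor")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (rate_b - rate_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_b - rate_a, z, p_value

# Hypothetical counts: baseline (step 4) versus the first simple solution (step 6).
baseline = {"conversions": 200, "visitors": 10_000}
candidate = {"conversions": 260, "visitors": 10_000}

lift, z, p_value = two_proportion_z_test(
    baseline["conversions"], baseline["visitors"],
    candidate["conversions"], candidate["visitors"],
)
print(f"lift={lift:.4f}  z={z:.2f}  p={p_value:.4f}")
```

A small p-value suggests the difference is unlikely to be noise, but whether the lift is large enough to matter is still a business judgment. A different evaluation measurement, such as revenue per user, would call for a different test.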
These tasks are not rocket science. At the very least, they do not require Ph.D.-level theory. Basically, your first hire helps get three things ready: your data, a clear problem to solve, and a process for evaluating the business impact of any new solution.
After this framework is established, you might hire more data scientists or bring in external consultants if you want to create more sophisticated algorithms and solutions. The benefit of this process is that the problem is now very clear and you have a way to evaluate their work.
Next, when you are doing your initial interviewing, consider the following 5 points:
1. Not every data problem is a big data problem. Do not mix up data science and big data. Having millions of records does not make it a big data problem. The data at many small to medium-sized companies is small enough to be processed on a single machine. While you want a data storage system that can scale with future growth, whether you need to analyze terabytes of data efficiently today depends on your business. Some experienced data scientists and engineers, especially those from tech giants like LinkedIn and Twitter, specialize in large-scale data processing. Their impressive experience may not be helpful to you if your data is not that large, and they may not be interested in your data problem anyway.
2. You cannot optimize what you cannot measure. Whether or not to build a data science team is an ROI question. You want to hire someone who cares about evaluation — who wants to measure the impact of his or her work on your business. Good candidates should be able to define the data problems based directly on your business needs and be able to propose evaluation methods for different data science approaches. They should care about delivering measurable results; otherwise, you may spend months doing something fancy, but be left wondering if there was any business gain.
3. Domain knowledge is key. Black-box predictive solutions often do not work well because every business is unique, and so is its data. A generic algorithm that does not take domain knowledge into consideration has its limits. Your data team should understand and make use of your unique data to develop competitive advantages. Therefore, hire someone who is eager to acquire domain knowledge about your industry and your particular business by looking into your data.
4. Avoid solutions that are looking for problems. Sometimes, an expert who is too deep in certain approaches or research areas may have a tendency to solve every problem the same way. This is not uncommon for candidates who have spent years investigating a single algorithm. But when you are just starting out, you likely do not know which methodology works best for your business. You need someone who is interested in conducting experiments and helping solve problems with the most suitable solutions. Look for open-minded candidates.
5. Do not be a perfectionist. When you have a prediction or optimization problem, do not aim for perfect accuracy. It is not realistic. Instead, benchmark the data solution against what you currently have. Quickly getting a practical solution that improves your business is much better than spending years chasing the impossible, perfect solution. For most companies, especially small businesses and startups, becoming a research lab is not a good idea.
It is always better to start lean and start early. I would be happy to hear about your own experience in the comments.
The Young Entrepreneur Council (YEC) is an invite-only organization comprised of the world’s most promising young entrepreneurs. In partnership with Citi, YEC recently launched BusinessCollective, a free virtual mentorship program that helps millions of entrepreneurs start and grow businesses.