By Tobi Elkin on October 16, 2017
Kyrylo “Kip” Pogrebenko holds one of the most in-demand jobs in the world right now and, fortunately for us, he’s applying his skills and intellectual curiosity at Impact Radius. In fact, IBM projects that the demand for data scientists will jump by 28% by 2020. We wanted to share more about Kip and what a data scientist actually does, so we sat down to speak with him.
[TE]: What exactly does being a data scientist entail?
KP: Being a data scientist entails understanding the business and the ecosystem around its data, understanding the data itself, and translating that data into actionable insights through extensive research, predictive analytics, and machine learning techniques.
It’s about finding actionable insights. There’s also a predictive component to it: you take historical data, try to predict certain metrics, and then derive actionable insights from those predictions.
[TE]: What’s a typical day and typical week like?
KP: Very often I work with software engineers, data engineers, and product managers, converting proof of concept solutions into their production counterparts. I get involved in internal discussions with client-facing teams in order to learn about issues our clients may face. We work to make continuous improvements to the algorithms.
We identify false positives, such as when our algorithm flags an incidence of fraud when, in fact, there isn’t one. We also identify false negatives, which is when the algorithm indicates that there’s no incidence of fraud, but there actually is. Rooting out false positives and negatives ultimately improves the algorithm’s ability to accurately identify fraudulent activities. This is a big part of a data scientist’s job. There are no perfect algorithms; they can’t catch everything.
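To make the terms concrete, here is a minimal Python sketch of how false positives and false negatives can be counted for a fraud flag. It is an illustration only, not Impact Radius's method: the function names `confusion_counts` and `precision_recall` and the sample data are hypothetical, and precision and recall are standard metrics brought in here to summarize the counts.

```python
from collections import Counter

def confusion_counts(actual, flagged):
    """Tally outcomes of a fraud flag against the known truth.

    actual[i]  -- True if event i really was fraud (hypothetical labels)
    flagged[i] -- True if the algorithm flagged event i as fraud
    """
    counts = Counter()
    for is_fraud, was_flagged in zip(actual, flagged):
        if was_flagged and is_fraud:
            counts["tp"] += 1  # true positive: fraud, correctly flagged
        elif was_flagged and not is_fraud:
            counts["fp"] += 1  # false positive: flagged, but not fraud
        elif not was_flagged and is_fraud:
            counts["fn"] += 1  # false negative: fraud that was missed
        else:
            counts["tn"] += 1  # true negative: clean, correctly ignored
    return counts

def precision_recall(counts):
    """Precision: share of flags that were real fraud.
    Recall: share of real fraud that was flagged."""
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Made-up sample: eight events, three of which are actually fraud.
actual = [True, False, True, False, False, True, False, False]
flagged = [True, True, False, False, False, True, False, False]

counts = confusion_counts(actual, flagged)
precision, recall = precision_recall(counts)
print(dict(counts), round(precision, 3), round(recall, 3))
```

In this sketch, reducing false positives raises precision and reducing false negatives raises recall, so the two kinds of error can be tracked separately as an algorithm is tuned.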
I’m also working on compliance issues to meet industry standards, such as the MRC [Media Rating Council] certification. The MRC is an industry body that oversees advertising measurement in ad tech and martech.
[TE]: What are some misconceptions that people may hold about data scientists?
KP: One of the biggest misconceptions is that data scientists excel because of their extensive knowledge of various algorithms and theories behind them. Although this is true, it’s not the only requirement. Data scientists spend the majority of their time on cleaning data and creating features for various algorithms.
Different algorithms can be applied to the same problem but there is no “one-size-fits-all” solution out there. What’s important is the set of features that go into an algorithm. The performance of one algorithm over another is based on the set of features used and the quality of those features.
Our process is very creative and it improves the performance of algorithms.
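As a rough illustration of what cleaning data and creating features can look like, the Python sketch below drops unusable click records and derives two per-IP features. Everything in it is an assumption made for the example: the record fields (`ip`, `ts`), the helper names `clean` and `build_features`, and the choice of features are hypothetical, not a description of the features Kip's team uses.

```python
from collections import defaultdict

# Hypothetical raw click records; some are malformed on purpose.
raw_events = [
    {"ip": "10.0.0.1", "ts": "100"},
    {"ip": "10.0.0.1", "ts": "101"},
    {"ip": "10.0.0.1", "ts": "102"},
    {"ip": " 10.0.0.2 ", "ts": "100"},  # stray whitespace
    {"ip": "10.0.0.2", "ts": "400"},
    {"ip": None, "ts": "150"},          # missing IP
    {"ip": "10.0.0.3", "ts": "n/a"},    # unparseable timestamp
]

def clean(events):
    """Keep only records with a usable IP and an integer timestamp."""
    cleaned = []
    for event in events:
        ip, ts = event.get("ip"), event.get("ts")
        if not ip or ts is None:
            continue
        try:
            cleaned.append((ip.strip(), int(ts)))
        except ValueError:
            continue  # timestamp was not a number
    return cleaned

def build_features(cleaned):
    """Derive per-IP features: click count and mean seconds between clicks."""
    by_ip = defaultdict(list)
    for ip, ts in cleaned:
        by_ip[ip].append(ts)
    features = {}
    for ip, stamps in by_ip.items():
        stamps.sort()
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        features[ip] = {
            "click_count": len(stamps),
            "mean_gap_seconds": sum(gaps) / len(gaps) if gaps else None,
        }
    return features

features = build_features(clean(raw_events))
for ip, values in sorted(features.items()):
    print(ip, values)
```

The same feature table could then be fed to different algorithms, which is the point made above: the choice and quality of features often matters more than which algorithm consumes them.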
[TE]: What’s the difference between data scientists and other professionals who work with data, such as data engineers and data analysts?
KP: Data engineers and data analysts also work with data. Although all three functions are interconnected, there is a conceptual difference between data scientists and the rest.
The goal of data science is to provide strategic and actionable insights into the data, whereas data analysts primarily deal with issues that we are already aware of and their impact. As such, a data analyst typically works with established sources of data, often consolidated into one relational system. In contrast, a data scientist examines and evaluates various sources of data, which are usually disconnected. The skillsets of the two professions overlap a lot, but they are not identical. One of the main differences is that data scientists are trained in machine learning and research.
[TE]: What’s your best advice to someone interested in working as a data scientist? Any tips for landing a job?
KP: I would say there is no set path to follow to become a data scientist. Obviously, though, a data scientist must possess natural curiosity, coupled with an ability to use it when solving issues that a business may face. A data scientist translates business problems into the language of formulas and algorithms. Formal education in a quantitative field is an essential requirement, as is solid programming experience, including excelling in at least one of the languages that are widely used in the industry, like Python.
A general understanding of modern data processing systems, such as Apache Hadoop and Apache Spark, is a must. Today’s demand for data scientists is high and will be higher in the future, but that doesn’t mean landing a job will be easy. Often, during interviews, candidates are given open-ended problems to see how they cope with coming up with solutions in a high-pressure environment. While it’s preferable for data scientists to hold a graduate-level degree (M.S., Ph.D., etc.), it’s not an absolute requirement.
[TE]: What degrees do you have?
KP: I have a computer science degree from Ukraine, and a Bachelor of Business Administration in Statistics and Quantitative Modeling from Baruch College in New York.
[TE]: What do you like about being a data scientist?
KP: It’s like a never-ending adventure — I like the sense of constantly discovering new things. Sometimes you think you know everything about a piece of data and you make a new discovery. There are lots of unexpected things that happen. Then you’re trying to explain the logical connections in the data, and the connection between the algorithm and its results.