The Big Data hype engine rolls on, but nothing ever gets any clearer. Let’s be serious: NASA has been playing with ‘lots and lots’ of data since the time of the Moon launches, but it was just data back then, and now all of a sudden it’s this big concern for the CIO? Of course, a fancy new industry vertical needs a fancy new job title to go with it: the Data Scientist.
What is it, anyway?
According to yet another survey, this time by NewVantage, 70% of organisations surveyed plan to hire Data Scientists, and 100% of them said it’s “somewhat challenging” to hire a competent one. But just what is a Data Scientist anyway?
There’s a funny Gartner blog post by analyst Svetlana Sicular (yeah, I know… Gartner doesn’t do funny) in which she shares a couple of definitions she has heard:
…a data scientist is 1) a data analyst in California or 2) a statistician under 35
But more importantly, Sicular makes a killer point. “Organisations already have people who know their own data better than mystical Data Scientists…learning Hadoop is easier than learning the company’s business.” In other words, if you have a Data Analyst employed then your search is over.
You just pay them more
It seems to me that, as well as an entire industry being set up to handle what is basically just more data (i.e. just scale what you have), there’s a whole other market in squeezing a bigger salary out of an organisation for essentially the same job you already employ someone to do.
The notion of a Data Scientist is a little mad, but then so is Big Data. Removing the buzzwords just leaves you with… Data.
And that’s all that it’s about, has been about and ever will be about.
Want a bigger laugh?
WIRED has posted an article about “Data Explorers”.
And I rest my case…
What a silly question. A data scientist is several things all wrapped into one: first, a data analyst; second, a data analyst who consults and wants to charge more; and third, a new title to create buzz. Probably similar to the VP of HR who, in order to gain respectability and more money, became the Chief People Officer, which of course made all the difference.
Theo, I think that Gartner makes a good point. Regardless of how you define a data scientist, the real value of having one is their understanding of how the data applies to your business.
I think that what most folks call a data scientist comes in basically two models.
1. The statistician - This is the person that can design studies, apply the appropriate statistical methods, and generate findings from the data. They tell you what the data says, end of story. They either have to rely on a subject matter expert from the business side of the house, or they grow their expertise in the business as they become more senior. Regardless, they are usually analyzing data from studies they’ve designed, and reporting findings.
2. The data architect - This is the person that understands data structures, definitions, systems, and data flow. They help you set the core parameters around your data based on what you want to accomplish. They deal with the dreaded “Three Vs” of data: Velocity, Volume, and Variety (yes, some add other “Vs” to the mix). You have to determine the granularity you will need from your data to answer the questions you want to answer. These folks help set and develop governance plans around all those issues to ensure the data is collected in a way that feeds the studies and analysis the statistician will execute.
That’s my take, at least.