Big data has been growing since the dawn of information technology. By one often-quoted estimate, we now create as much data every two days as we had accumulated up to 2003. This gigantic amount of data holds invaluable insights, not just for businesses but for the human race as a whole. Big data analysis has been helping the healthcare industry with research for quite some time now. What’s more, big data might even help solve the puzzle of cancer in the near future.
What if I told you that big data is really just a pile of data that doesn’t make sense if you don’t know how to use it? This is where data scientists come into the picture. To make sense of big data, we need data scientists, and good ones to be precise. And don’t let the title ‘data scientist’ fool you: there are quite a few qualities that a person should possess to deserve it. If you are looking to hire a data scientist or are planning to become one yourself, here are the qualities that you should look for or possess.
Turning data into information is the primary job of a data scientist, so a working knowledge of statistics goes without saying. Looking at things with a quantitative mindset is important for staying neutral and keeping biases away while dealing with data. A good data scientist understands that the depth and reliability of insights generally grow with the amount of data, and refrains from reaching conclusions when the data is inadequate. With huge amounts of data, trends and insights show up as numbers, so a love for numbers is necessary for being a true data scientist. A data scientist should be able to interrogate large amounts of data to derive actionable insights and then apply predictive modelling techniques to anticipate future trends. A good hold on statistics is also necessary for preparing reports and plotting the recommended courses of action based on the insights.
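The link between the amount of data and the reliability of a conclusion can be made concrete with a small sketch. For a sample mean, for instance, the margin of error shrinks roughly with the square root of the sample size. The example below is illustrative only: the simulated population, the helper name `mean_with_margin` and the 1.96 multiplier (an approximate 95% confidence level under a normal approximation) are assumptions, not something prescribed by any particular workflow.

```python
import math
import random
import statistics


def mean_with_margin(sample, z=1.96):
    """Return (mean, margin of error) using a normal approximation.

    z=1.96 corresponds to roughly 95% confidence; this is an assumption
    that works best for reasonably large samples.
    """
    mean = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, z * std_err


# Hypothetical population: 100,000 simulated measurements.
rng = random.Random(42)
population = [rng.gauss(100, 15) for _ in range(100_000)]

# Draw increasingly large samples and watch the margin of error shrink.
for n in (10, 100, 1_000, 10_000):
    sample = rng.sample(population, n)
    mean, margin = mean_with_margin(sample)
    print(f"n={n:>6}  mean={mean:7.2f}  +/- {margin:5.2f}")
```

With only ten observations the estimate comes with a wide margin; with ten thousand it is far tighter, which is why a careful data scientist holds back on conclusions drawn from small samples.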
A data scientist works with different teams to build pipelines, tools, modules, packages, websites, dashboards and much more. This doesn’t mean a data scientist should be an expert coder, but an understanding of algorithms and how code works can go a long way. When the existing system cannot surface the right trends or insights, it’s time to roll up the sleeves and write some code. This would be impossible without some programming skills and technical flexibility.
Python is widely regarded as one of the most versatile and compatible programming languages and is well suited to handling databases and MapReduce-type queries. Since it is open source and easy to learn, Python shouldn’t be much of an obstacle between you and your data science dream.
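To give a feel for what a MapReduce-type query looks like in Python, here is a minimal single-machine sketch using only the standard library. The event records and the function names (`map_record`, `merge_counts`, `count_by_key`) are hypothetical; a real MapReduce job would distribute the same two phases across many machines.

```python
from collections import Counter
from functools import reduce


def map_record(record, key):
    """Map phase: turn one record into a partial count for its key value."""
    return Counter({record[key]: 1})


def merge_counts(left, right):
    """Reduce phase: merge two partial counts into one."""
    return left + right


def count_by_key(records, key):
    """Count how often each value of `key` occurs across all records."""
    mapped = (map_record(record, key) for record in records)
    return reduce(merge_counts, mapped, Counter())


# Hypothetical click-stream events.
events = [
    {"user": "ann", "action": "view"},
    {"user": "bob", "action": "view"},
    {"user": "ann", "action": "buy"},
    {"user": "cy", "action": "view"},
]

print(count_by_key(events, "action"))
```

Because each record is mapped independently and partial counts can be merged in any order, the same logic scales out naturally once the data no longer fits on one machine.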
Many organisations also look for excellent ‘pseudo code skills’ when hiring a data scientist. ‘Pseudo code skills’ means the ability to describe in plain English how a query or algorithm should work. This problem-solving skill is essential for growing as a data scientist. Data science is an industry where gold standards change at an alarming rate, which underlines the importance of having more skills than the current scenario asks for.
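As an illustration of moving from pseudo code to working code, the sketch below first states a plan in plain English and then implements it step by step. The task (ranking customers by total spend), the function name `top_customers` and the sample orders are all hypothetical.

```python
from collections import defaultdict


def top_customers(orders, n=3):
    """Return the n customers with the highest total spend.

    Plain-English plan (the pseudo code):
      1. For every order, add its amount to that customer's running total.
      2. Sort the customers by total, largest first.
      3. Keep the first n customers.
    """
    # Step 1: accumulate a running total per customer.
    totals = defaultdict(float)
    for customer, amount in orders:
        totals[customer] += amount

    # Step 2: sort by total spend, largest first.
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

    # Step 3: keep only the top n.
    return ranked[:n]


# Hypothetical (customer, amount) pairs.
orders = [("ann", 10.0), ("bob", 5.0), ("ann", 2.5), ("cy", 20.0)]
print(top_customers(orders, n=2))
```

Writing the plan first makes it easy to review the logic with non-programmers before a single line of real code exists.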
Although data science is a fairly old field, new discoveries are made every now and then. The drive to find new ways of solving old problems is the reason behind this. A data scientist should always keep an inquisitive mind and watch out for new and better ways to acquire, merge and process data, and for tools that derive better insights. An ideal data scientist should never stop being curious, as data holds secrets that it will only confess to the curious. A real data scientist isn’t trying to see how the data proves their biases right, but is instead looking for the truths hidden deep inside it.
With data, things can become quite difficult at times, and only curiosity can drive you towards the results. This is why curiosity is one of the most essential qualities of a data scientist.
Data analytics is more about the results than the process itself. It doesn’t matter which route you take through the data as long as you arrive at sound results. Data scientists might have to try more than one route to solve certain problems. Getting stalled by small hurdles is not a good quality in a data scientist. Being result-driven helps in such cases, as the determination to convert data into results becomes the driving force. Data scientists, in general, are people who move from one problem to the next while juggling different tasks at the same time. Nothing but the result can stop them in their endeavour.
Creativity might look like the odd one out in this list. The truth is, it is one of the most important qualities for a data scientist. Creative people aren’t afraid to make mistakes; they experiment with new things and dare to explore new territories. They find opportunities in their failures and can easily change direction. All of these are essential for data science.
We often categorise people as left-brained or right-brained. Technical fields like big data are seldom associated with creativity, and that’s a big mistake. Data scientists fall somewhere between the two categories and need a streak of creativity to find newer approaches and ways to handle the data. Statistics and databases are not all that data science is about; it’s the storytelling that makes the final output of the analytics useful to the decision makers.
Creativity alone cannot make a data scientist, of course. Someone who can prepare easy-to-consume, attractive and eye-catching reports might not always be the best fit for the role. Data scientists are better described as creative problem solvers.
Irrespective of whether you are working with structured data, unstructured data or both, a good data scientist must have a fundamental idea of how databases work. A basic understanding of columnar and relational databases can go a long way in making the job easier. Many corporate data warehouses still use traditional relational databases. Data scientists will also have to be involved in the setup of these databases, although there will be technical personnel to execute the task. Knowing how to develop a database infrastructure that can handle unstructured data is the cherry on top.
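For a rough sense of the difference between the two layouts, the sketch below runs an aggregate query against a relational table using Python’s built-in `sqlite3` module and then shows the same rows rearranged column by column. The `sales` table, its columns and the helper names (`revenue_by_region`, `to_columns`) are hypothetical, and a plain dictionary of lists is only a toy stand-in for a real columnar store.

```python
import sqlite3


def revenue_by_region(rows):
    """Load (region, amount) rows into a relational table and aggregate them."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales "
            "GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


def to_columns(rows):
    """Rearrange row-wise records into a column-wise layout."""
    return {
        "region": [region for region, _ in rows],
        "amount": [amount for _, amount in rows],
    }


# Hypothetical sales records.
rows = [("north", 10.0), ("south", 5.0), ("north", 2.5)]

print(revenue_by_region(rows))

# In the column-wise layout, a total only needs to read one column.
columns = to_columns(rows)
print(sum(columns["amount"]))
```

The relational version stores and retrieves whole rows, which suits transactional lookups; the column-wise layout keeps each attribute together, which is why columnar databases tend to be favoured for analytical scans over a few columns.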
A data scientist will mostly be working with the tech, analytics and business folks at the same time, and often acts as the translator for all the parties involved. Handling tech and business jargon at the same time, and knowing which to use with whom, requires strong communication skills. The output of analysis is not usually pretty, at least to someone who isn’t a data scientist. The insights and trends are locked inside numbers and should be interpreted and communicated to the business team and the stakeholders in a way they understand. A great data scientist should be able to translate the complex output of the analysis into a simpler form that people from varying backgrounds can follow, using storytelling, metaphors and visual means of communication.
A great data scientist is always hungry for more data. The quest for data has no fixed end point, since more data usually means better analysis. A data scientist should continue to look for more sources of data, better ways to acquire it and innovative methods to process it. The drive to acquire more data is something that a data scientist must possess, as data is the fuel for analytics.