5 important characteristics of a Data Scientist
Updated: Nov 18, 2019
I share with you 5 important characteristics of a Data Scientist.
A Google search for ‘data scientist traits’ yields only about a million results, while one for ‘data science skills’ throws up around 38 million. Most of these articles focus on the skills and tools of data science, while very few dwell on the personality traits that go into the making of good, if not great, data scientists. I have already explained which skills are needed to become a data scientist.
With proper training and guidance, just about anyone can master the requisite tools and skills of data science. However, acquiring those tools and applying them properly requires a set of traits that are hard to identify and harder still to master.
What is a trait?
A set of technical or practical knowledge, applicable in a limited set of circumstances, is a skill. Statistical analysis, programming, and even baking are examples of skills. But knowing the correct proportion of liquid to dry ingredients in a cake is unlikely to be useful in building a sorting algorithm. And the characteristics of a Weibull distribution will not keep biscuits light and flaky.
A trait, on the other hand, is a mental habit with broad applications in dealing with life, the universe, and everything, for that matter. Traits, in the conventional sense, may be thought of as virtues. Taking forward the baking example, Python won’t help the dough rise, but the trait of patience will prevent us from hurling the mixing bowl across the room when it doesn’t. Contrary to common belief, traits aren’t genetic and fixed; traits that one is deficient in can be developed over time.
The five essential traits of a great data scientist are:
Curiosity
Clarity
Creativity
Skepticism
Humility
Hard Working (bonus)
Curiosity – that insatiable hunger for knowledge and understanding – is the first and foremost trait of a data scientist. There are limitations of data that most people don’t know or care about, but data scientists must be curious about what they are doing and what they want to achieve. They must constantly ask questions about the data and of people. Their field is evolving so quickly that they have to maintain their interest to maintain their edge.
To develop the kind of curiosity that helps build and maintain skills, we need to ask even the questions we ‘should’ already know the answers to.
Clarity comes next. We should understand what we’re doing and why we’re doing it, whether we are performing an analysis, writing code, or just cleaning up junk and messy data. It is, after all, “data science,” not “data randomly trying things.” Data science can be visualized as resting on the back of a great turtle, which in turn rests on another turtle. It’s turtles all the way down, and we should be able to explain each turtle as if our interlocutor holds five doctorate degrees.
Clarity can be developed by constantly asking ourselves two questions: “Why?” and “So what?” For every step in your analysis, ask yourself why you’re doing it and what it means, both for the specific project and for the larger context of your work. And, like a curious child or inquisitive teen, keep asking those questions until you reach a conclusive answer. Simply put, use “Why?” and “So what?” to see the turtles all the way down and to be able to explain each one of them.
Creativity is probably the most misunderstood trait of those listed. People, by and large, view creativity in a binary fashion: a person either has it or doesn’t. Wolfgang Mozart, for example, is popularly thought to have composed his music from a magical plane of existence, publishing full operas by age eight through sheer inherent creativity. The fact, however, is that Leopold Mozart, Wolfgang’s father, was a successful music teacher who tested and applied his pedagogical methods on the young Mozart almost from the boy’s birth.
With due respect to Mozart, who certainly had inborn musical genius, creativity can be learned and developed in much the same way athletes develop the skills for their sport.
To learn and develop creativity, one should try these creative pursuits regularly:
Unedited stream-of-consciousness writing (for example ‘Journalling’)
Reading articles well outside of anything one knows
One should also try small, spontaneous lifestyle changes, like altering the route to work or solving a familiar problem differently; all of these will help develop one’s creativity. It might interest you to know that much smarter people than me dedicate their lives to studying creativity, and there is a whole world of material to help you boost yours. Google throws up as many as 167 million results for ‘develop creativity.’
While striving continually to be ever more creative, we must learn to be practical and keep at least one foot on the ground, and a healthy skepticism helps us do exactly that. Skepticism keeps our creativity in check, keeping us grounded in the real world rather than down the rabbit hole.
But how is skepticism developed while retaining the optimistic curiosity that drives one to keep learning? Begin by remembering that asking questions like Elle Woods doesn’t mean taking every answer at face value. Keep up the curiosity and explore the data, but keep in mind that it’s only as good as the methods employed to collect it. Find out, and then examine, the assumptions and expectations of the people who provided the data. When you build your model, examine your own assumptions and check whether they match the assumptions of the model and whether they fit what the data says.
The eminently quotable statistician George E. P. Box famously said, ‘All models are wrong, but some are useful.’ By adopting this skeptical attitude, we can embed self-regulation into the optimism inherent in data science.
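One modest way to put that skepticism into practice is to ask whether a model is at least more useful than a naive guess. The sketch below is only an illustration under stated assumptions: the data is made up, and the helper names (`fit_line`, `mean_abs_error`, `compare_to_baseline`) are hypothetical, not taken from any library. It fits a straight line to data that is actually curved, then compares its error on held-out points against a baseline that always predicts the training average.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def mean_abs_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def compare_to_baseline(x_train, y_train, x_test, y_test):
    """Return (model_error, baseline_error), both measured on held-out data."""
    # Candidate model: a straight line fitted on the training data only.
    slope, intercept = fit_line(x_train, y_train)
    model_pred = [slope * x + intercept for x in x_test]

    # Naive baseline: always predict the training average.
    baseline_pred = [mean(y_train)] * len(x_test)

    return (mean_abs_error(y_test, model_pred),
            mean_abs_error(y_test, baseline_pred))


# Made-up data: y is the square of x, so a straight line is knowingly "wrong".
x_train = [1, 2, 3, 4, 5, 6]
y_train = [x * x for x in x_train]
x_test = [7, 8]
y_test = [x * x for x in x_test]

model_err, baseline_err = compare_to_baseline(x_train, y_train, x_test, y_test)
print(f"line error: {model_err:.2f}, baseline error: {baseline_err:.2f}")
```

Here the line is wrong (its error is not zero) yet still useful (it beats the naive baseline). The skeptical habit is to run this kind of check before trusting any model, rather than assuming the answer.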
Data science is not magic, and data scientists are not wizards. The curious data scientist knows and accepts that he is not aware of everything and is always looking for new things to learn. The clear-eyed data scientist swallows her pride and adapts the presentation to her audience, even if it means forgoing some ingenious technique she has built. The creative data scientist thinks ‘outside the box,’ even if it feels silly. And, of course, the skeptical data scientist must mistrust her data and her models, evaluate them all with sharp clarity of thought, and present the results with all the necessary cautions.
Thus the five traits of a data scientist can be summarised as follows:
Curiosity: Cultivate your inner Elle Woods.
Clarity: Be able to explain what one is doing as if one is in possession of five PhDs.
Creativity: Think differently, think laterally and think better.
Skepticism: All models are wrong, some are useful.
Humility: One is not a wizard.
5. Hard Working