

>>> from stuff import faisal

thoughts about data science

A friend of mine forwarded this post from Noah Lorang about what it means to be a data scientist. Many people have different opinions on the topic, but this post really struck a chord with me.

Finding solutions to business problems in an effective, well-managed way is what it is all about. I think over the last few years the buzz and hype around “big data” and tons of different methodologies have gotten out of control. I honestly don’t think most people even know what big data means, myself included. I have my own thoughts on it, but they might be very different from those of the person sitting next to me.

One thing I love about Noah’s post is that he doesn’t try to make something out of nothing. It is the truth. At my job I’m not building sophisticated machine learning algorithms that process 20 trillion rows and produce a 0.9458 AUC every day. Most of the time I’m trying to make things happen: bringing a project to life, helping our customers succeed, and helping the business make better, smarter decisions. This could be as simple as:

  • making charts easier to read or readily available
  • uncovering hidden metrics or variables to help gauge our success
  • understanding what a current data feed is providing and what we can do with it
  • building much-needed infrastructure to get the right data in (most of the unsexy work is just that)

Don’t get me wrong: the statistical, heavily algorithmic side of things is also interesting and very rewarding, but in my opinion you can’t have one without the other (read: proper infrastructure).

My main takeaway has been that the best data scientists I’ve met share the following traits. They:

  • don’t know everything and are humble and self-aware enough to admit it
  • know how to figure stuff out given the above
  • are always willing and curious enough to try (when faced with a challenge, not everyone responds as you’d think)
  • are self-sufficient (this is key)
  • have the basics of software engineering and math

I don’t have a background in statistics or machine learning, but I’m confident that, given the time, I could build something from scratch to get 80% of the way there. I’m curious, interested, and happy to jump out of my comfort zone, and I’ve done it before. This is what I mean by self-sufficient.

Just my 2 cents. :-)