WARNING: Stream of consciousness ahead

As someone who has just moved from Data Science to Software Engineering, I feel very much the same way: liberated.

I worked in DS for 5 years and had varying degrees of success in working at companies that understood the proper use and application of Data Science. What killed my passion for it was a few things:

1. Data Science is a dubious field - Data Science can certainly be applied correctly, but I and others have at times used the underlying statistical methods gung-ho. Part of this comes down to something that W.D said: that Data Scientists are generally early on in their careers. We have been captivated by the shiny new field and want to use it as quickly as possible without fully understanding it. Throughout my career I've been met with varying degrees of scepticism about my profession, because Data Science promises more than it can deliver.

2. Data Science professional development is poorly understood/completely neglected - If you look online for resources to grow your Data Science skills, you are invariably swamped by the sheer volume of crappy "Intro to Data Science" courses. As far as I can find, there are very few advanced Data Science professional development resources out there. Compounding the problem, Data Science teams are invariably managed by people who aren't native to the field. The result is that the manager lets Data Scientists self-direct their learning, which runs straight into the lack of advanced resources.

3. Support for MLOps is non-existent - I think this problem will change over the next couple of years, but Data Science has had to go through cycles of being integrated into a business. The first "wave" of Data Science was met with the realisation by companies that they couldn't get Data Scientists to magic money out of the poorly maintained data they kept. This has caused a huge increase in Data Engineers (Data Science is not the only thing that has spawned this). Now we have Data Scientists who have access to nice data (thanks Data Engineers!) and can build some interesting models, but how do they get them deployed? This second "wave" is seeing the rise of MLOps tools, engineers, etc., but Data Scientists currently don't have the know-how to get their own models into production. In my experience this inability is incredibly demoralizing.

4. Educating fellow Data Scientists is too difficult - Unfortunately, the perception given to people coming into Data Science is that you can just do model engineering and call it a day. Bootcamps, courses and tutorials are all geared towards getting people good at building models, not towards considering how those models fit into the bigger picture. There is little to no knowledge of good programming practices, source control (a lot of Data Scientists I worked with only knew git as a swear word) or deployment strategies. You could argue that a Data Scientist should only be concerned with building models. I would agree, but the reality is that companies will hire a team of Data Scientists and will likely not provide complementary teams to get models into production. Trying to upskill others on my team has been an uphill battle. Either people don't care because they just want to build models, or they have come from an adjacent field with no software engineering experience.

Apologies for the stream of consciousness, but it feels good to get it off my chest. My move to Software Engineering started in my last role, where I was a Lead for a Data Science team. Thankfully my boss (head of Data) understood the need for developing a whole data system, from good Data Engineering all the way through to MLOps for deployments. I was very fortunate to be able to move to being the Lead MLOps Engineer and develop our capability to deploy models with CI/CD mechanisms using AWS. That really gave me the taste for building systems rather than models. I really do think Data Science has a place and can provide great value, but it's still a long way off. If we can make it so that Data Science teams can deploy to production quickly and safely, we can really start to reap the rewards.

For Software Engineers looking at getting into Data Science, I would suggest looking at MLOps first. You get to combine existing experience with tackling new problems (how do we keep models live and continuously learning? how do we ensure the tracking of experiments?) and will have a tremendous impact.



