You're thinking about it the wrong way. Notebooks don't do well in software development, but they are extremely useful for exploratory data analysis and for quick iteration when searching for a suitable modeling approach. These two tasks both use code, but for completely different purposes. A DS is working on the data, understanding it and trying to identify what information it may hold. Then they try to find a model that will leverage that information to deliver whatever inference solves the business need. This is extremely interactive and iterative, and everything from the actual business problem to the ML approach may change at each iteration. Imposing software development practices at this point disrupts the train of thought, which is already heavily burdened by the level of uncertainty and all the mathematics required to understand the results. The goal is to find a viable approach, not to write production code.

Once this approach is found, a good clean-up/refactor is strongly recommended, followed by proper software development that turns the approach into a live product. I call this the switch between research mode and development mode, and it has strong parallels to the way R&D is done in many industries. I believe a lack of understanding of this dual nature of ML is what causes many of the problems in MLOps: plans that don't account for research time and risk, mixed teams where engineers don't understand the exploratory nature of early DS work, attempts to put notebooks containing research code into production, etc. Even planning for the refactor doesn't solve it all: what will happen when the next generation of a model has to be created? Will the refactored code be forced on the DS and ruin their research productivity? Will they start from scratch again, and not only lose all the refactor/dev cost but also make it a recurring cost? I have been looking for answers to this for years now, and have found none so far.

Source: I've been working with data for 27 years, as a data engineer, data architect and data scientist. When I do DE, my code is considered high quality by my peers, but when I'm doing DS research, I know I write bad code, and I won't change that. It's more productive to work this way and then do the big refactor (possibly leaving the notebook env behind along the way) than to do the alternative.