
Can you please elaborate on what kind of critical mistakes a machine would make that someone with a math background would not?

I am building a competing tool, so I am not affiliated with MS, but I do think that AutoML has value.

Machine learning differs from imperative programming in that most of the "programming" is done through experiments rather than by writing an actual "program", so there is an opportunity to replace programming with compute. I.e. an AutoML platform can create hundreds of models/pipelines and just try them all.
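To make "just try them all" concrete, here is a minimal sketch in plain Python. The candidate models, the synthetic data, and every name in it (`search`, `make_knn`, and so on) are illustrative assumptions, not how any particular AutoML product works.

```python
import random

def fit_mean(xs, ys):
    # Baseline: always predict the training mean.
    m = sum(ys) / len(ys)
    return lambda x: m

def fit_linear(xs, ys):
    # Closed-form least squares for a single feature.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def make_knn(k):
    # k-nearest-neighbour regression: average the k closest training targets.
    def fit(xs, ys):
        pairs = list(zip(xs, ys))
        def predict(x):
            nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
            return sum(y for _, y in nearest) / k
        return predict
    return fit

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def search(candidates, train, valid):
    # Fit every candidate on the training split, score it on held-out data.
    scores = {name: mse(fit(*train), *valid) for name, fit in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Synthetic data (an assumption for the demo): y is roughly 3x + 2 plus noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(200)]
ys = [3.0 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]
train = (xs[:150], ys[:150])
valid = (xs[150:], ys[150:])

candidates = {"mean": fit_mean, "linear": fit_linear}
candidates.update({f"knn_{k}": make_knn(k) for k in (1, 5, 15)})
best, scores = search(candidates, train, valid)
print(best, scores)
```

The search only compares validation error, which is the point: compute replaces hand-tuning, but nothing in the loop knows what the columns mean.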

Also, why would you trust a model that was created manually but not one that was created automatically?

When a model is created by AutoML, it passes the same validation process as a manually created model, so in both cases the quality of the model should be judged independently of how it was created.

In addition, all models (regardless of how they were created - human / not human) should be monitored for predictive performance. I.e. I will not "trust" any model without continuous verification.
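As a rough sketch of what continuous verification could look like, the snippet below tracks accuracy over a rolling window of recent predictions and flags the model once it drops below a threshold. The class name, window size, and threshold are all hypothetical choices for illustration.

```python
from collections import deque

class RollingMonitor:
    """Track accuracy over the most recent `window` labelled predictions."""

    def __init__(self, window, min_accuracy):
        self.outcomes = deque(maxlen=window)  # True where prediction matched
        self.min_accuracy = min_accuracy

    def record(self, predicted, actual):
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else None

    def needs_attention(self):
        # Only raise a flag once the window is full, to avoid noisy early alerts.
        full = len(self.outcomes) == self.outcomes.maxlen
        return full and self.accuracy() < self.min_accuracy

monitor = RollingMonitor(window=5, min_accuracy=0.8)
for predicted, actual in [(1, 1), (0, 0), (1, 1), (1, 1), (0, 0)]:
    monitor.record(predicted, actual)
healthy = monitor.needs_attention()

# Two wrong predictions push the rolling accuracy down to 3 out of 5.
for predicted, actual in [(1, 0), (0, 1)]:
    monitor.record(predicted, actual)
degraded = monitor.needs_attention()
print(healthy, degraded)
```

The same check applies whether the model came from a person or from an automated search.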




A common error is target leakage: a column that is only known after the outcome, or is derived from it, ends up in the training data. An AutoML system will likely consider it a "strong feature". This is where having someone who actually understands the business domain is critical.
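A toy illustration of the leak, with made-up data: the hypothetical column `exit_survey_sent` is only filled in after a customer churns, so a naive feature ranking by correlation with the target puts it first. The column names and probabilities are assumptions chosen for the demo.

```python
import random

def pearson(a, b):
    # Plain Pearson correlation coefficient.
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(42)
rows = []
for _ in range(1000):
    support_calls = rng.randint(0, 10)  # legitimately known before the outcome
    churned = 1 if rng.random() < 0.1 + 0.05 * support_calls else 0
    # Leaked column: recorded AFTER churn, so it mirrors the target 95% of the time.
    exit_survey_sent = churned if rng.random() < 0.95 else 1 - churned
    rows.append((support_calls, exit_survey_sent, churned))

target = [r[2] for r in rows]
features = {
    "support_calls": [r[0] for r in rows],
    "exit_survey_sent": [r[1] for r in rows],
}

# Naive ranking: strongest absolute correlation with the target first.
ranking = sorted(
    features,
    key=lambda name: abs(pearson(features[name], target)),
    reverse=True,
)
print(ranking)
```

Nothing in the numbers says the top feature is unavailable at prediction time; that knowledge comes from whoever understands how the data was collected.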

There's no question that there's value in AutoML systems, yet most ML production systems I've worked on / seen were way more complex than feature vector -> model -> prediction. You likely have multiple models, pipelines, normalizations and plain old conditionals. Hard to automate all of this.


Right. I am aiming at the group of companies that have zero data scientists and would like to avoid hiring one. I assume that their use cases are simple/common and can be automated.

Note that automation is not only building the model, but automating the full life cycle: preprocessing, hyperparameter optimization, pipeline deployment, and monitoring/retraining.


> "Can you please elaborate more on what kind of critical mistakes a machine can make, while someone with math background would not make. I am building a competing tool"

the short answer is, go study stats and fundamentals of ML instead of asking hn to build your product for you.

> "why would you trust a model which was created manually and not a model which was auto created."

one of many reasons: domain knowledge is important, and math alone can't tell you when things are muffed up. contrived example: you build a linear regression model to predict home price, and square footage gets a negative coefficient. Math conclusion: bigger house = lower price. domain knowledge: oh, we are missing a feature, and the model can't tell city homes from rural ones.
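The contrived example can be reproduced with a few made-up numbers. In this sketch the prices and square footages are invented: within each group price rises with size, but pooling small expensive city homes with large cheap rural homes flips the sign of the fitted slope.

```python
def slope(xs, ys):
    # Least-squares slope of y on a single feature x.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def split(rows):
    return [r[0] for r in rows], [r[1] for r in rows]

# Invented data: (square feet, price in $1000s)
city = [(600, 500), (800, 540), (1000, 580), (1200, 620)]
rural = [(1800, 200), (2200, 240), (2600, 280), (3000, 320)]

pooled = slope(*split(city + rural))   # location is missing from the model
within_city = slope(*split(city))
within_rural = slope(*split(rural))
print(pooled, within_city, within_rural)
```

The pooled slope is negative while both within-group slopes are positive, so the fit itself is fine; the missing city/rural feature is what makes the coefficient misleading.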

there is value to auto ml but there is a lot of room to go horribly wrong


Again, my point is that for a given data set, an AutoML system is much more efficient and radically cheaper than a human modeler.

You are pointing to an area outside the realm of AutoML (feature engineering/generation), which is domain specific. But this was not my original question.


this has nothing to do with feature engineering and generation. I never added or changed any features in the example. It is exactly in the realm of automl: you run a model and, -because- you are missing data, your model makes wrong assumptions.

You could argue (which you didn't) that this would fall under model interpretation, but a model in this example would probably fail to generalize and make bad predictions in the future, i.e. slamming home values because they have large square footage.



