
This! I work for a company with a ridiculous amount of data that goes back years. I just don't see how any new startup (even a well-funded one) is gonna compete. It's not like they can go and ask users to resubmit the data.



It's entirely possible that you don't need ridiculous amounts of data to build an AI - relying on huge datasets is just the approach being taken by the vast majority of research teams.


Modern machine learning/deep learning takes a bunch of data and uses high-dimensional, more-or-less brute-force methods to approximate that data with a curve. It often works well (it seldom works "great", because the data can't fully capture the situation).
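Here's a minimal sketch of what "approximate that data with a curve" means; the sine target, the noise level, and the degree-15 polynomial are just illustrative choices:

  import numpy as np

  rng = np.random.default_rng(0)
  x = rng.uniform(-1, 1, 200)                  # plenty of training data
  y = np.sin(3 * x) + rng.normal(0, 0.1, 200)  # noisy samples of an unknown function

  # "brute force": many parameters, zero understanding of the system
  coeffs = np.polyfit(x, y, deg=15)
  x_test = np.linspace(-1, 1, 50)
  mse = np.mean((np.polyval(coeffs, x_test) - np.sin(3 * x_test)) ** 2)
  print(f"test MSE with 200 samples: {mse:.4f}")  # small: works well in-distribution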

The appealing thing about this is that the programmer doesn't have to understand anything. But if you have little data, the approximation just isn't going to capture the situation. Either the programmer gains an understanding of the system (extremely costly and time-consuming) or we create systems that are themselves capable of this understanding. Nobody knows how to do the latter; all the "artificial intelligence" victories anyone has observed have come from throwing computing power at a problem. Maybe someone will figure out how to throw computing power at the general problem of understanding, but I'm doubtful.
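The flip side, as a sketch: the exact same fit starved of data (numpy will warn that the fit is poorly conditioned, which is rather the point):

  import numpy as np

  rng = np.random.default_rng(0)
  x_small = rng.uniform(-1, 1, 8)                # very little data
  y_small = np.sin(3 * x_small) + rng.normal(0, 0.1, 8)

  coeffs = np.polyfit(x_small, y_small, deg=15)  # same brute-force curve
  x_test = np.linspace(-1, 1, 50)
  mse = np.mean((np.polyval(coeffs, x_test) - np.sin(3 * x_test)) ** 2)
  print(f"test MSE with 8 samples: {mse:.4f}")
  # between the 8 samples the curve is barely constrained, so the
  # approximation typically misses the underlying situation badly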


That's totally fair. I'm not an AI researcher. But from what I've heard from internal folks, a new competitor might be able to compete on one or two models and carve out a niche, but not across the entire market. The market is big enough that those competitors can still be viable companies, but we're not resting on our laurels either, so it'll be interesting to see.


What makes you think all that data is going to be relevant? One of the points of fragility of deep learning systems is that the data they're trained on only represents the past, and the conditions of the present can change.
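Here's a toy sketch of that failure mode (the sine target and the polynomial fit are just illustrative choices): train on one regime, then let conditions drift.

  import numpy as np

  rng = np.random.default_rng(1)
  x_past = rng.uniform(-1, 0, 200)              # "the past": inputs in [-1, 0]
  y_past = np.sin(3 * x_past) + rng.normal(0, 0.1, 200)
  coeffs = np.polyfit(x_past, y_past, deg=9)

  x_old = np.linspace(-1, 0, 50)                # conditions the model trained on
  x_new = np.linspace(0, 1, 50)                 # "the present", after drift
  mse_old = np.mean((np.polyval(coeffs, x_old) - np.sin(3 * x_old)) ** 2)
  mse_new = np.mean((np.polyval(coeffs, x_new) - np.sin(3 * x_new)) ** 2)
  print(f"in-distribution MSE: {mse_old:.4f}")  # small
  print(f"after-drift MSE: {mse_new:.1f}")      # typically enormous: the curve
                                                # only ever saw the past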

I mean, there are a few situations that can stay pretty constant, but just think of what happens when a bunch of AI systems are established, all assuming the world won't change: a cascade of causes changes things, and the systems no longer working becomes part of that cascade. Of course, we already saw the fragility of modern systems with Covid, so let's add even more!


Honestly, I don't know. Not an AI researcher. But for competitors to be viable, they also need to get users to give them data, plus the infrastructure to handle it, and that's not an easy task. We actually lost a contract to handle massive (PB-scale) user data a few years ago, but a few years later our competitor lost that contract back to us, because they simply couldn't handle the volume and the processing demands.



