
This is a frighteningly common practice in DL research. Baselines are rarely taken with respect to alternative techniques, largely due to publication bias.

On one hand, papers about DL applications are of interest to the DL community, and useful for seeing whether there is promise in the technique. On the other hand, they may not be particularly useful to industry, or to furthering broader research goals.




A good rule of thumb is to be slightly more suspicious of "DL for X" unless X was part of the AI/ML umbrella in the 2000s. If no one was publishing about X in AAAI/NIPS/ICML before 2013 or so, then there's a pretty good chance that "DL for X" is ignoring 30+ years of work on X. This becomes less true if one of the paper's senior authors comes from the field where X is traditionally studied.

Another good rule of thumb is that physicists writing DL papers about "DL for X" where X is not physics are especially terrible about arrogantly ignoring 30+ years of deeply related research. I don't quite understand why, but there's an epidemic of physicists dabbling in CS/AI and hyping it way the hell up.


Anecdotally, having come from a physics background myself: DL is more similar to the math physicists are used to than traditional ML techniques or even standard comp-sci approaches are. Combined with the universal approximation theorems for neural networks, it's easy to get carried away and think that DL should be the default supervised ML technique.

Curiously, having also spent a lot of time on traditional data structures and algorithms gave me an appreciation for how stupendously inefficient a neural net is, and part of me cringes whenever I see a one-hot encoding as a starting point...
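
To make the one-hot point concrete, here's a minimal sketch (the numbers and variable names are purely illustrative, not from the comment) comparing integer indices with a dense one-hot matrix; in practice frameworks use embedding lookups so the dense matrix is never materialized:

    # Illustrative sketch with made-up sizes: memory cost of a dense
    # one-hot encoding versus plain integer indices for a categorical feature.
    import numpy as np

    vocab_size = 10_000   # assumed category count, e.g. a small vocabulary
    n_samples = 1_000

    # Integer indices: one integer per sample.
    indices = np.random.randint(0, vocab_size, size=n_samples)

    # Dense one-hot: a vocab_size-wide row per sample, almost entirely zeros.
    one_hot = np.zeros((n_samples, vocab_size), dtype=np.float32)
    one_hot[np.arange(n_samples), indices] = 1.0

    print(indices.nbytes)   # 8,000 bytes (int64 on most platforms)
    print(one_hot.nbytes)   # 40,000,000 bytes, almost all zeros

    # An embedding lookup indexes a weight matrix by `indices` directly,
    # so the dense one-hot never needs to exist in memory.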


Re: similar to the math they know, this makes sense.

I don't understand why over-hyping and over-selling are so common in AI/ML/DL work. (To be fair, the over-hyping is more a property of AI than of physicists in particular, but people from non-CS fields get themselves into extra trouble, perhaps because they don't realize there are old-ish subfields dedicated to problems very similar to the ones they're working on.)


I think the answer is related to startup culture: the goal is to gather funding.


That's the confusing thing. Appealing to rich Twitter users/journalists/the public doesn't strike me as a particularly good strategy for raising research funds!

Random rich people rarely fund individual researchers. It's more common for them to fund an institute (perhaps even by starting a new one). The institute then awards grants based on recommendations from a panel of experts. This was true before the Epstein scandals, and now I cannot imagine a decent university signing off on random one-off funding streams from rich people.

All government funding goes through panels of experts.

Listening to random rich people or journalists or the public just isn't how those panels of experts work. Over-hyping work by e.g. tweeting at rich/famous people or getting a bunch of news articles published is in fact a good way to turn off exactly the people who decide which scientists get money.

Maybe a particularly clueless/hapless PR person at the relevant university (or at Nature) is creating a mess for the authors?


>Random rich people rarely fund individual researchers.

Yes and no. There are private foundations where, if someone donates a reasonably large amount, say at least the size of a typical grant, they will match the donor with a particular researcher, and the researcher will have lunch with them, give a tour, and later send them a letter about the conclusions (more research is needed).

That doesn't mean the donor gets input into which proposals are accepted; that is indeed done by a panel of experts as far as I know. It's more of a thing to keep them engaged and feeling connected to where the money goes when there are emotional reasons for supporting, e.g., medical research.


Even less than that now. The ability of specific donors to direct funds to specific academic groups for specific research is WAY more constrained now than it was pre-Epstein. Institutions want that extra layer of indirection.


While I agree with you, the Nature paper that I linked above was published by the folks at Google, of all places. I think a valid hypothesis is that work done during internships (or even residencies) may not be on par with what NeurIPS/ICLR/etc. require, but it generates publicity, so the PR teams push for that kind of paper.

However, that still does not explain why this kind of sloppy work is done and published by publicly funded research labs, except perhaps as a form of advertisement.


Well, yeah, corporate research is what it is. A lot of the value add is marketing.


I agree with the sibling comment: whenever ML has not been used before in a field and DL, and especially DRL (deep reinforcement learning), is applied, it is likely that the authors are ignoring decades of good research in ML.

After a very theoretical grad course in ML, I have come to appreciate other tools that come with theoretical guarantees and even built-in regularization; they demand less "Grad Student Descent" and more understanding of the field.
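
As a hedged illustration of the kind of tool meant here (the specific choice is mine; the comment names none), this is a classical regularized baseline where the regularization strength is picked by built-in cross-validation rather than hand tuning:

    # Hedged sketch on synthetic data: ridge regression whose regularization
    # strength is chosen by built-in cross-validation instead of manual fiddling.
    import numpy as np
    from sklearn.linear_model import RidgeCV
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))                             # synthetic features
    y = X @ rng.normal(size=20) + 0.1 * rng.normal(size=500)   # linear target + noise

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # RidgeCV searches the alpha grid internally (leave-one-out CV by default).
    model = RidgeCV(alphas=np.logspace(-3, 3, 13)).fit(X_train, y_train)
    print("chosen alpha:", model.alpha_, "test R^2:", model.score(X_test, y_test))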

I think that the hype that was used to gather funding in DL is getting projected onto other fields, if only to gather more funding.



