we already have a wave of papers that no human has the capacity to verify

uniqueuid · on Aug 13, 2024

Maybe, maybe not. It's a tiered system - you get the deluge at the unfiltered bottom and a narrower selection the more prestigious and selective the outlets / conferences / journals are.

Problem is, of course, that selection criteria are in large part proxies, not measures of quality. With AI, those proxies become tainted and then you get an explosion of effort.

If anyone has a good recommendation for scalable criteria to assess the quality of papers (beyond fame haha) I'm all ears.

staunton · on Aug 13, 2024

If we had scalable criteria for quality of papers, that would be the end of science. The rest would be engineering...

mhuffman · on Aug 13, 2024

I have also, unless I hallucinated it, read accusations on this very site of peer reviewed papers that were at least partly generated by LLMs.

EGreg · on Aug 13, 2024

Literally every AI discussion on HN has the same format

  > AI is awesome,
    regulation is stupid

    > I wouldn't want to see
      AI flooding the market
      with X we can’t verify
      
      > We already have (copy
        whatever was just said)

For what it’s worth, when it comes to SCIENCE, I am actually in favor of AI, even giving it to everyone. Except possibly AI that would help engineer designer viruses.

Because in science, people literally ARE just doing pattern matching and testing the patterns. Kepler just looked at a bunch of star charts and found a low-dimensional explanation with ellipses. You can throw that data at an AI and it will come up with physics laws involving 24,729 variables that predict things way more accurately. Including potentially chaotic systems like weather or a three body problem etc.
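To make the Kepler example concrete, here is a minimal sketch of that kind of pattern matching: a log-log least-squares fit that recovers the exponent in Kepler's third law from orbital data. The planet values are approximate textbook figures that I am supplying for illustration, and `fit_power_law` is a hypothetical helper name, not something from any library.

```python
import math

# Approximate semi-major axis (AU) and orbital period (years).
# Illustrative values, rounded; not taken from the thread.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def fit_power_law(points):
    """Least-squares fit of log(T) = slope * log(a) + intercept."""
    xs = [math.log(a) for a, _ in points]
    ys = [math.log(t) for _, t in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_power_law(list(PLANETS.values()))
# Kepler's third law says T^2 is proportional to a^3, so the slope
# should come out close to 1.5.
print(f"T ~ {math.exp(intercept):.3f} * a^{slope:.3f}")
```

Two fitted numbers summarize six planets here, which is the low-dimensional kind of explanation the Kepler example refers to.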

So yeah, use AI for that, because you can actually check its predictions against data, and develop a reputation. We can’t really reason about theories and models of systems with 30,000 variables anyway.

If AI churns out 25 scientific models per day, the proper venue isn't publishing them on arXiv or in Nature. It’s testing their predictions and putting them up the same way HuggingFace does, to amass reputation and be checked against real data by multiple parties.

As a side question — is protein folding @ home dead now, since Google “solved it” with massive clusters and AI?

What about SETI@home? Why not just run AI for SETI and solve the Drake equation once and for all? :)

EvgeniyZh · on Aug 13, 2024

> If AI churns out 25 scientific models per day, the proper venue isn't publishing them on arxiv or Nature magazine. It’s testing their predictions and putting them up the same way HuggingFace does.

Unless the testing is done by AI, I doubt human scientists will bother testing tons of incomprehensible models, unless the accuracy of these models is exceptionally high.

jalman · on Aug 17, 2024

The value of AI-mated research is results. This tool will aid engineering. It will offer pinpointed research to resolve a particular issue at hand. It will close the verification loop naturally by proving that the research has indeed facilitated a useful result. Most such useful research will never be publicly available.

What you are complaining about is a legitimately broken verification loop. Let's consider a case. An engineering department has a problem. It passes the problem to the research department. The research department slacks off while producing junk. It makes tons of papers that are hard to prove, let alone apply. They drink champagne with other researchers, so they can publish the junk and defend the turf. AI-mated research will finish the racket and corruption.

EvgeniyZh · on Aug 19, 2024

What you describe is quite different from what the original post describes.

Also I'd say that what you describe is quite different from what many people believe is the problem with research, either academic or industrial.

Academic research has a problem of low-quality outputs, but it's not necessarily a "hard to apply" kind of problem. There is no "engineering department" that brings problems to people in academia, and there is no inherent expectation that the research be immediately applicable, or applicable at all. A lot of high-quality research is not immediately applicable and is not engineering-oriented, in fact. And this kind of research is still worth automating, just like it is worth doing while we can't automate it. There is of course a feedback between experiment and theory, but I would say it's the thing that's broken in academia.

As for industrial research, which I guess you referred to in the first place, I have less experience with it. But the goal of industrial research is not publishing papers, so it's not in their KPIs, and they won't be drinking champagne for long if they are publishing junk papers instead of doing what they were hired for. Note that not all research done in industry is "industrial research" by this classification, and companies do academic research occasionally (Bell Labs or Google Brain, for example). So if an industrial research lab is created to solve engineering problems and doesn't do it properly, then, well, any sane company will fire them. It doesn't have the wrong-incentives problem that academic research has, because its incentive is to make money for the parent company.

Now back to AI. Assuming theoretical research is easier for AI, the bottleneck is going to be humans doing experiments. Obviously, AI output should be better than human output for it to be tested by experimentalists. What I was saying is that it's even worse: since the volume of AI output is higher, and the usual heuristics (name of author, personal discussions, etc.) won't work for filtering it, AI output has to be better than high-quality human work and not just average human work. I'll be happy if this can be achieved, but we are very far from there.

mnky9800n · on Aug 15, 2024

This is nitpicky, but I don't believe you need 24729 variables to reproduce Kepler's laws, which very accurately predict the positions of the planets. You basically only need what you need for Newton's law of gravity, which is masses and locations. Of course the n-body problem is more complicated, but you still only need masses and locations for all the bodies, plus a computer. What AI could be useful for is finding more efficient and accurate solutions to the many-body problem. But tbf, I'm not a many-body physicist. So what do I know.
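As a rough illustration of how few inputs that takes, here is a minimal sketch of a 2D n-body integrator using Newtonian gravity and a leapfrog (velocity Verlet) step. The function names, the units (AU, years, solar masses, so G = 4π²) and the Sun-Earth setup are my own assumptions for the example. Note that the sketch also takes initial velocities, since a numerical integrator needs them as starting conditions alongside masses and positions.

```python
import math

G = 4 * math.pi ** 2  # gravitational constant in AU^3 / (solar mass * yr^2)


def accelerations(masses, positions):
    """Pairwise Newtonian gravitational acceleration on each body (2D)."""
    n = len(masses)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r3 = (dx * dx + dy * dy) ** 1.5
            acc[i][0] += G * masses[j] * dx / r3
            acc[i][1] += G * masses[j] * dy / r3
    return acc


def simulate(masses, positions, velocities, dt, steps):
    """Advance the system with a leapfrog (kick-drift-kick) scheme."""
    pos = [list(p) for p in positions]
    vel = [list(v) for v in velocities]
    acc = accelerations(masses, pos)
    for _ in range(steps):
        for i in range(len(masses)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]  # half kick
                pos[i][k] += dt * vel[i][k]        # drift
        acc = accelerations(masses, pos)
        for i in range(len(masses)):
            for k in range(2):
                vel[i][k] += 0.5 * dt * acc[i][k]  # second half kick
    return pos, vel


# Hypothetical setup: Sun at rest at the origin, Earth on a near-circular orbit.
masses = [1.0, 3.0e-6]
positions = [(0.0, 0.0), (1.0, 0.0)]
velocities = [(0.0, 0.0), (0.0, 2 * math.pi)]

final_pos, final_vel = simulate(masses, positions, velocities, dt=0.001, steps=1000)
# After one simulated year the Earth should be back near its starting
# point relative to the Sun.
print(final_pos[1][0] - final_pos[0][0], final_pos[1][1] - final_pos[0][1])
```

The pairwise loop costs O(n²) per step, which is one place where more efficient approximate solvers for the many-body problem would matter.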