
There's an older tradition of rule-based machine translation. In these methods, someone really does understand exactly what the program does, in a detailed way; it's designed like other programs, according to someone's explicit understanding. There's still active research in this field; I have a friend who's very deep into it.

The trouble is that statistical MT (the approach that later became neural-net MT) started achieving better quality metrics than rule-based MT sometime around 2008 or 2010 (if I remember correctly), and the gap between them has widened since then. Rule-based systems have gotten a little better each year, while statistical systems have gotten a lot better each year, and are also now receiving correspondingly much more investment.

The statistical systems are especially good at using context to disambiguate linguistic ambiguities. When a word has multiple meanings, human beings guess which one is relevant from overall context (merging evidence upwards and downwards from multiple layers within the language understanding process!). Statistical MT systems seem to do something somewhat similar. Much as human beings don't even perceive how we knew which meaning was relevant (but we usually guessed the right one without even thinking about it), these systems usually also guess the right one using highly contextual evidence.
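For a toy illustration of what "using context" can mean in the statistical setting (my own sketch in Python, with made-up sense inventories; no production system works this simply), you can score each sense of an ambiguous word by how much its typical context overlaps with the sentence at hand:

    # Toy contextual disambiguation (an illustration only, not a real MT system):
    # choose the sense of an ambiguous word whose typical context words
    # overlap most with the words of the surrounding sentence.
    SENSE_CONTEXTS = {
        "bank (riverside)": {"water", "shore", "fish", "mud", "flood"},
        "bank (finance)": {"loan", "deposit", "teller", "account", "interest"},
    }

    def disambiguate(sentence: str) -> str:
        words = set(sentence.lower().split())
        return max(SENSE_CONTEXTS, key=lambda sense: len(words & SENSE_CONTEXTS[sense]))

    print(disambiguate("she opened an account and asked about a loan at the bank"))
    print(disambiguate("the flood left mud along the bank of the river"))

Real statistical systems replace the hand-picked context sets with distributions (or embeddings) learned from data, and that learned part is exactly what no one writes down by hand.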

Linguistic example sentences like "time flies like an arrow" (my linguistics professor suggested "I can't wait for her to take me here") are formally susceptible of many different interpretations, each of which can be considered correct, but when we see or hear such sentences within a larger context, we somehow tend to know which interpretation is most relevant and so most plausible. We might never be able to replicate some of that with consciously-engineered rulesets!





This is the bitter lesson.[1]

I too used to think that rule-based AI would be better than statistical Markov-chain parrots, but here we are.

Though I still think/hope that some hybrid system of rule-based logic + LLMs will end up being the winner eventually.

----------------

[1] https://en.wikipedia.org/wiki/Bitter_lesson


These days it's pretty much the "sweet" lesson for everyone but Sutton and his peers, it seems.

It's bitter for me because I like looking at how things work under the hood and that's much less satisfying when it's "a bunch of stats and linear algebra that just happens to work"

So you prefer "a bunch of electrons, field effects, and clocks that just happen to work"?

If you're building on a programming language, you can say you understand the language's abstract machine, even though you don't know how we ever managed to make a physical device that instantiates it!

Yep, some domains have no hard rules at all.

Time flies like an arrow; fruit flies like a banana.


It's completely possible to write a parser that outputs every possible parse of "time flies like an arrow", then try interpreting each one and discard the ones that don't make sense according to some downstream rules (unknown noun phrase: "time fly").
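As a sketch of the enumerate-everything step (using NLTK with a toy grammar of my own; any chart parser over an ambiguous CFG would do):

    import nltk

    # A deliberately ambiguous toy grammar: "flies" and "like" each get
    # two parts of speech, so the sentence has two structural readings.
    grammar = nltk.CFG.fromstring("""
        S   -> NP VP
        NP  -> N | N N | Det N
        VP  -> V PP | V NP
        PP  -> P NP
        Det -> 'an'
        N   -> 'time' | 'flies' | 'arrow'
        V   -> 'flies' | 'like'
        P   -> 'like'
    """)

    parser = nltk.ChartParser(grammar)
    for tree in parser.parse("time flies like an arrow".split()):
        print(tree)  # one tree per reading; filter these downstream

This prints both readings, time (N) flies (V) like an arrow and time flies (N N) like (V) an arrow, and the downstream rules would then throw out the parse whose subject is the unknown noun phrase "time fly".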

I did this for a text adventure parser, but it didn't work well, because there are exponentially many ways to group the words in a sentence like "put the ball on the bucket on the chair on the table on the floor".
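To put a number on the blow-up (my own back-of-envelope, not the parent's figures): with n trailing prepositional phrases, each "on the X" can attach to the verb or to any earlier noun, and the count of attachment structures grows like the Catalan numbers, i.e. roughly 4^n:

    from math import comb

    def catalan(n: int) -> int:
        # number of attachment structures for n prepositional phrases
        return comb(2 * n, n) // (n + 1)

    for n in range(1, 9):
        print(n, "PPs ->", catalan(n), "parses")

For the ball/bucket/chair/table/floor sentence (4 PPs) that's already 14 parses, and it runs into the hundreds and thousands soon after.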


I would argue that particular sentence only exists to convey the bamboozled feeling you get when you reach the end of it, so only sentient parsers can parse it properly.

> There's an older tradition of rule-based machine translation. In these methods, someone really does understand exactly what the program does, in a detailed way

I would softly disagree with this. Technically, we also understand exactly what an LLM does; we can analyze every instruction that is executed. Nothing is hidden from us. We don't always know what the outcome will be, but we also don't always know what the outcome will be in rule-based models, if we make the chain of logic too deep to reliably predict. There is a difference, but it is on a spectrum. In other words, explicit code may help, but it does not guarantee understanding, because nothing does and nothing can.


The grammars in rule-based MT are normally fully conceptually understood by the people who wrote them. That's a good start for human understanding.

You could say they don't understand why a human language evolved some feature but they fully understand the details of that feature in human conceptual terms.

I agree that in principle the statistical parts of statistical MT are not secret, and that computer code in high-level languages isn't guaranteed to be comprehensible to a human reader. More generally, binary code isn't guaranteed to be incomprehensible, and source code isn't guaranteed to be comprehensible.

But for MT, the hand-written grammars and rules are at least comprehended by their authors at the time they're initially constructed.


Sure, I agree with that, but that's a property of hand-writing more than rule-based systems. For instance, you could probably translate a 6B LLM into an extremely big rule system, but doing so would not help you understand how the LLM worked.

Do you know what the SOTA in rule-based MT is? I used to be deep into symbolic approaches but couldn't find much in the way of contemporary rule-based NLP.

My friend is working on Grammatical Framework, which has a Resource Grammar library of pre-written natural language grammars, at least for portions of them. The GF research community continues to add new ones over time, based on implementing portions of written reference grammars, or sometimes by native speakers based on their own native speaker intuitions. I'm not sure if there are larger grammar libraries elsewhere.
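For a flavor of the approach (a minimal sketch, assuming GF's Python runtime "pgf" and the Foods example grammar from the GF tutorial, compiled to Foods.pgf with gf -make): translation goes through a shared, human-written abstract syntax rather than through statistics:

    import pgf

    gr = pgf.readPGF("Foods.pgf")   # compiled multilingual grammar
    eng = gr.languages["FoodsEng"]
    ita = gr.languages["FoodsIta"]

    # Parse English into the abstract syntax, then linearize in Italian.
    for prob, expr in eng.parse("this fish is fresh"):
        print(expr)                  # e.g. Pred (This Fish) Fresh
        print(ita.linearize(expr))   # e.g. "questo pesce è fresco"
        break

Every step here is inspectable: the abstract tree and both concrete grammars are rules someone wrote and understood, which is exactly the comprehensibility property being discussed.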

There could be companies that made much better rule-based MT but kept the details as trade secrets. For example, I think Google Translate was rule-based for "a long time" (I don't remember until what year, although it was pretty apparent to users and researchers when it switched, and indeed I think some Google researchers even spoke publicly about it). They had made a lot of investment (very far beyond something like a GF resource grammar) but I don't think they ever published any of that underlying work even when they discontinued that version of the product.

So basically there may be a gap here: the academic work is advancing slowly, and yet it now represents the majority of examples in the field, because companies are so unlikely to have ongoing rule-based projects as part of their products. The available state of the art you can actually interact with may have gone backwards in recent years as a result!

nimi sina li pona tawa mi. (Toki Pona: "I like your name.")



