Every time I see this argument made, there seems to be a level of complexity and/or operational cost above which people throw up their hands and say "well of course we can't do that".
I feel like we will see that again here as well. It really is similar to the self-driving problem.
Self-driving is a beyond-six-sigma problem: an error rate above roughly 1-2 crashes per million miles, i.e., the human rate, is unacceptable. For reference, six sigma is about 3.4 defects per million opportunities, so merely matching human drivers already puts you in that territory.
Most jobs are not like that.
A good argument can be made, however, that software engineering, especially in important domains, will be among the last to be fully automated because software errors often cascade.
There’s a countervailing effect though. It’s easy to generate and validate synthetic data for lower-level code. Junior coding jobs will likely become less available soon.
Software defects in design and architecture, by contrast, accumulate subtly until they leave the codebase in a state where it becomes utterly unworkable. That is one of the chief reasons good devs get paid what they do. Software discussions very often underrate extensibility - in other words, structural and architectural scalability. Even software correctness is trivial in comparison: you can't keep writing correct code once you've made an unworkable tire fire. This could be a massive mountain for AI to climb.
Current LLMs lack the ability to perform abstraction at the level a given problem requires. When that gets solved, we'd be quite a bit closer to AGI, which has implications far beyond job displacement.
The ARC-AGI benchmark might serve as a canary in the coal mine.
I hear you. But I have wondered whether there will even be a need to maintain certain kinds of software when you can just have it rewritten for each iteration. Something like schema evolution, yes, but with throwaway software at each iteration.
Well, in terms of processing speed, the AI could iterate on different designs until it finds an extensible one, with some kind of reinforcement learning loop: produce a design, get stuck, throw it away, try a new one. Just like humans learn to write good code, really, except at an unfathomable speed of iteration. But it still all sounds ridiculously challenging. There is something there that isn't about predicting next tokens like LLMs do; it's about inferring very complex, highly abstract metastructures in the text.
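To make that loop concrete, here's a toy sketch. Everything in it is a hypothetical stand-in of mine: generate_design for a model proposing an architecture, evaluate_extensibility for some automated judge of how workable the result is.

```python
# Toy sketch of the generate-evaluate-discard loop described above.
# generate_design() and evaluate_extensibility() are placeholders, not
# real tooling: the hard part is building an evaluator that actually
# measures extensibility.
import random

def generate_design(feedback):
    # Stand-in for a model call conditioned on the requirements plus
    # feedback from previously failed attempts.
    return {"id": random.random(), "feedback_used": list(feedback)}

def evaluate_extensibility(design):
    # Stand-in scoring function returning a value in [0, 1).
    return random.random()

def search_for_design(threshold=0.95, max_iterations=10_000):
    feedback = []
    for _ in range(max_iterations):
        design = generate_design(feedback)
        score = evaluate_extensibility(design)
        if score >= threshold:
            return design          # found an acceptably extensible design
        feedback.append(score)     # learn from the failure, then discard it
    return None                    # got stuck within the budget

if __name__ == "__main__":
    print(search_for_design())
```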
The challenge might be around the edges here. I guess you'd be able to instruct an agent to always code to a certain API spec, but no piece of software runs or does anything really useful in a vacuum.
Fundamentally, there is a human with limited brain capacity that got trained to do that. It's just a question of time until there are equally capable, and then exceedingly capable, models. There is nothing magical or special about the human brain.
The only question is how fast it is going to happen, i.e., what percentage of jobs is going to be replaced next year, and so on.
> There is nothing magical or special about the human brain.
There is a lot about the human brain that even the world's top neuroscientists don't know. There's plenty of magic about it if we define magic as undiscovered knowledge.
There's also no consensus among top AI researchers that current techniques like LLMs will get us anywhere close to AGI.
Nothing I've seen from current models (not even o1-preview) suggests to me that AIs can reason about codebases of more than 5k LOC. A top-5% engineer can probably make sense of a codebase of a couple million LOC, given time.
Which models specifically have you seen that look like they will be able to surmount, any time soon, the challenges of software design and architecture I laid out in my previous comment?
Defining AGI as “can reason about 5M LOC” is ridiculous. When do the goalposts stop moving? When a computer can solve time travel? Babies constantly exhibit behavior that is indistinguishable from what an LLM does on a normal basis (including terrible logic and hallucinations).
The majority of people on the planet can barely reason about how any given politician will affect them, even when there’s a billion resources out there telling them exactly that. No reasonable human would ever define AGI as having anything to do with coding at all, since that’s not even “general intelligence”… it’s learned facts and logic.
Babies can at least manipulate the physical world. A large language model can never be considered AGI until it can control a general-purpose robot, similar to how the human brain controls our body's motor functions.
As generally intelligent beings, we can adapt to reading and producing 5M LOC, to living in arctic climates, or to building in colonial or classical style as dictated by cost, taste, and other factors. That is generality in intelligence.
I haven't moved any goalposts - it is your definition that is way too narrow.
You’re literally moving the goalposts right now. These models _are_ adapting to what you’re talking about. When Claude makes a model for haikus, how is that different than a poet who knows literally nothing about math but is fantastic at poetry?
I’m sure as soon as Claude can handle 5M LOC you’ll say it should be 10, and that it needs to be able to serve you a Michelin-star dinner as well.
I feel pain for the people who will be employed to "prompt engineer" the behavior of these things. When they inevitably hallucinate some insane behavior, a human will have to take the blame for why it's not working... and yeah, that'll be fun to be on the receiving end of.
Humans 'hallucinate' like LLMs. The term used, however, is confabulation: we all do it, we all do it quite frequently, and the process is well studied(1).
> We are shockingly ignorant of the causes of our own behavior. The explanations that we provide are sometimes wholly fabricated, and certainly never complete. Yet, that is not how it feels. Instead it feels like we know exactly what we're doing and why. This is confabulation: Guessing at plausible explanations for our behavior, and then regarding those guesses as introspective certainties. Every year psychologists use dramatic examples to entertain their undergraduate audiences. Confabulation is funny, but there is a serious side, too. Understanding it can help us act better and think better in everyday life.
I suspect it's an inherent aspect of human and LLM intelligences, and cannot be avoided. And yet, humans do ok, which is why I don't think it's the moat between LLM agents and AGI that it's generally assumed to be. I strongly suspect it's going to be yesterday's problem in 6-12 months at most.
No, confabulation isn’t anything like how LLMs hallucinate. LLMs will just very confidently make up APIs for systems they otherwise have clearly been trained on.
This happens nearly every time I request “how-tos” for libraries that aren’t very popular. It will make up parameters that don’t exist despite the rest of the code being valid. It’s not a memory error like confabulation, where it’s convinced from memory that the response is valid, either, because it can easily be convinced that it made a mistake.
I’ve never worked with an engineer in my 25 years in the industry who has done this. People don’t confabulate to get day-to-day answers. What we call hallucination is the exact same process LLMs use to get valid answers.
You work with engineers who confabulate all the time: it's an intrinsic aspect of how the human brain functions that has been demonstrated at multiple levels of cognition.
> Humans 'hallucinate' like LLMs. The term used, however, is confabulation: we all do it, we all do it quite frequently, and the process is well studied(1).
Yeah, I agree; I'm not taking a snipe at LLMs or anything of the sort.
I'm saying I expect there to be a human fallback in the system for quite some time. But solving the fallback problems will mean working with black boxes, which is the worst kind of project in my view: I hate working on code I don't understand, where the results are not predictable.
That won't even be a real job. How exactly will there be this complex intelligence that can solve all these real-world problems, but can't handle some ambiguity in the inputs it is provided? Wouldn't the ultra-smart AI just ask clarifying questions so that literally anyone can "prompt engineer"?
As long as there is liability, there must be a human to blame, no matter how irrational. Every system has a failure mode, and ML models, especially the larger ones, often have the oddest and most unique ones.
For example, we can mostly agree CLIP does a fine job classifying images, except that if you glue a sticky note saying "iPod" onto an apple, it will classify it as such.
No matter the performance, these are categorically statistical machines reaching for the most immediately useful representations, yielding an incoherent world model. These systems will be proposed as replacements for humans; they will do their best to pretend to work; they will inevitably fail over a long enough time horizon; and a human accustomed to rubber-stamping their decisions, or perhaps fooled by the shape of a correct answer, or simply tired enough to let it slip by, will take the blame.
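For what it's worth, here is a minimal sketch of the zero-shot classification setup behind that sticky-note result, using the Hugging Face transformers CLIP wrappers. The image path and label prompts are placeholders of mine; it illustrates the mechanism, not the original experiment.

```python
# Minimal zero-shot CLIP classification sketch (illustrative only).
# The image file and label prompts are placeholders.
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("apple_with_ipod_note.jpg")  # hypothetical test image
labels = ["a photo of an apple", "a photo of an iPod"]

inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

The prediction is just a softmax over image-text similarity scores, which is why a big handwritten "iPod" in frame can dominate the actual visual evidence.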
This is because it will be economically catastrophic when the majority of high-paying jobs can be automated and owned by a few billionaires. And along with that catastrophe, all the service people whose jobs existed to support the people with high-paying jobs are fucked too. People don't want to have to face that.
We'd be losing access to food, shelter, insurance, purpose. I can't blame people for at least telling themselves some coping story.
It's going to be absolutely ruinous for many people. So what else should they do, admit they're fucked? I know we like to always be cold rational engineers on this forum, but shit looks pretty bleak in the short term if this goal of automating everyone's work comes true and there are basically zero social safety nets to deal with it.
I live abroad and my visa is tied to my job, so not only would losing my job be ruinous financially, it will likely mean deportation too as there will be no other job for me to turn to for renewal.
If most people are unemployed, modern capitalism as we know it will collapse. I'm not sure that's in the interests of the billionaires. Perhaps some kind of a social safety net will be implemented.
But I do agree, there is no reason to be enthusiastic about any progress in AI, when the goal is simply automating people's jobs away.
So what? I can go 45 mph on a human-powered road bike downhill. 28 mph isn't as scary as it seems, and you'll be glad you had that speed when you inevitably have to take that e-bike into mixed traffic with cars. I believe for motorcycles they say speed is safety for similar reasons: to be able to escape somewhat from the dangerous car situations going on all around your spongy soft body.
I enjoy riding e-bikes, for the record. Going fast on an e-bike is like having happiness on tap. I think more people should own them so that we can collectively start to realize a different way of organizing transportation and commute.
Regarding safety, when it is a bike vs a car, I agree that having more speed is better and safer for the cyclist.
But the issue is when we have a bike vs a pedestrian. I've been a pedestrian next to groups of fast-moving e-bikes. It can be pretty scary. Some cyclists can feel entitled to riding on the sidewalk, which at high speeds can really injure someone in a crash. E-bikes also weigh a lot more than non-electric bikes (up to 120 lb in some cases), so getting hit at top speed is a bigger deal than with a normal bike.
My overall point is that I think cyclists need to start behaving more like vehicles, rather than fast-moving pedestrians. Obviously we need more investment in biking infrastructure for that to happen, but with how fun and useful e-bikes are, I am optimistic that will eventually happen. As e-bikes become more common, I expect this to become a bigger part of the conversation.
>My overall point is that I think cyclists need to start behaving more like vehicles, rather than fast-moving pedestrians. Obviously we need more investment in biking infrastructure for that to happen, but with how fun and useful e-bikes are, I am optimistic that will eventually happen. As e-bikes become more common, I expect this to become a bigger part of the conversation.
I agree, but there needs to be enforcement of traffic laws for cyclists and e-bikers. The entitlement of sidewalk riding and red-light running, in pedestrian cities like NYC that already have bike lanes, needs to be counterbalanced with fear of consequences.
It would be easier for cyclists to behave like vehicles if they actually had infrastructure available. No one is going 28 mph on a sidewalk, by the way. Honestly, going much faster than a light jog is sketchy enough with all the utility poles, cracked sidewalks, debris, foliage, and even living situations one is liable to encounter on the sidewalk, imo.
A mental analogy that I personally use for this is cabinets. A section of an app is kind of like a cabinet in a kitchen. Some people like opaque cabinet doors while others prefer glass. Other people go even further, and eschew cabinets for open shelves.
These differences in object and system visibility sometimes reflect specific use cases (e.g., a professional kitchen can operate faster with open shelving than with opaque cabinet doors). Other times, they simply reflect the personal preferences of the designer.
Something I've been thinking about is what happens once the market for software developers becomes saturated. I think the answer is something along the lines of blending traditionally offline skills (illustration, etc.), domain knowledge, and software together. The ability to leverage the scale of the internet is still a form of tech-thinking, even if the software engineering part is not the main focus.
I think software alone has relatively fewer problems left to solve compared to solutions that require multi-domain thinking. So for the situation you've outlined, I think there is tremendous value in blending healthcare and software, or art and software. There are people who have found a niche in blending these ideas together [0], and I think the trend will continue.
So you shouldn't despair that your kids are not interested in "tech" now. It could be that once they are comfortable with their first domain choice, they will recognize the value in tech-enabled growth. You can then be there to help them realize that value by introducing system and algorithmic thinking, as well as tooling like vi or emacs.
My take is that it delivers the benefits you describe for simple sites that might have less infrastructure at their disposal. It seems like you still need a Node runtime to rebuild the files when the cache needs to be regenerated, however.
Serious question: how long does it take for humans to permanently adjust their sense of normalcy? There is a concept of `creeping normality`[0] that gets at this, but there isn't a lot of discussion about how to speed it up and make it permanent.
We know from South Korea that public mask-wearing is probably the most effective way of dropping the R0 quickly. If somehow 100% of Americans could get access to face shields (via 3D printing, for example), then the engineering and logistics problem is solved.
But the bigger problem is the social one. How do you socialize the acceptable use of masks or face shields in everyday public life? If enough people feel enough distress, I could see everyone wearing masks, forever. If that happened, the talking points about this dragging on or immediately rebounding would start to change.
Given enough technical choices in lifestyle design, there has to be some optimal solution that minimizes droplet emission while maximizing freedom of movement.
I would argue that an even more insightful view of money is to go one level up to “wealth”. Paul Graham covers it nicely: http://www.paulgraham.com/wealth.html
PG's arguments seem cogent enough, until you realize you still need to go up one more level, to understand that wealth is power. No, not that wealth gives power, but that wealth _is_ power in and of itself.
And power may not be exactly a zero-sum game, but it is close to one.