
…Compared to 60% of circumstances in the meat-based developer control group? :)


I love that we always use the average here for these justifications. We just slowly chip away at any and all excellence. 10x memes aside, we all know what it's like to work with a truly talented and productive engineer versus your everyday schmoe collecting a paycheck. It's a story as old as time, and yet here we are applying the same big-factory industrialization techniques other industries have: commoditizing the thing that made them exceptional and eliminating artisanship, uniqueness, and ultimately quality and character.

It's a tragedy of the commons of a sort.


Wasn't there a thread here just yesterday about how 6% of some class of AI outperformed a human, but then it turned out that 0% outperformed two humans? That's also literally the lesson Uber learned the hard way when an SDV ran over a person (that zero humans is worse than one, and one is worse than two). This is also the principle behind code review, peer review, QA, middle management bureaucracy, and a whole lot of other things.

The tragedy, IMHO, is that AI models like this encourage centralizing decision making into a single black box (to the extent that external research then benefits the owner of the AI model rather than advancing public commons), whereas in pretty much every other aspect of life, we consider decentralization/redundancy of autonomy to be the solution to robustness problems.


A common quip is that most benchmarks are of the performance of humans who aren’t really paying attention (because while building datasets, they’re doing this repetitive task over and over and over). So better than the average human benchmark isn’t generally great.


I disagree. 40% is not great, but unlike the masses of developers, this is a single system that can improve over time. Further, a system that can do most of the work but requires a security specialist to polish it is still a useful tool. What's important to recognize is that this is not a terribly novel situation: insecure code is written every day.


> unlike the masses of developers, this is a single system that can improve over time

I’d say the exact opposite. Unlike this algorithm, developers can continue to learn. There will likely be future algorithms that are improvements, but this isn’t that.


Individual developers can learn, but they are replaced by new developers that have not learned. Sure, this specific instance of copilot is not the best, but it sure feels to me like people are discussing the concept of it, not the exact implementation right now.


It may be a tragedy, but I fail to see why it is a tragedy of the commons. Which resource that is available to all is being overused? High-paying dev jobs? Those are not a commons in the sense that tragedy of the commons implies, because lower-quality devs don't stand to benefit by taking only a smaller part of the job.


Here's a food analogy: everyone wants to buy the best-looking apples, so farmers are incentivized to breed for looks rather than nutritional value, even though nutritional value is the superior metric.

Similarly, if everyone seeks to "dumb down" programming, you end up with a large pool of "dumbed down" programmers, which is counterproductive precisely because AI is imperfect and you need a higher level of expertise to compensate for its shortcomings. As Kernighan famously said: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." Similarly, if one lets the AI do the thinking in their stead, what hope do they have of being able to debug it?

Ironically, though, programming already suffers from this exact problem in a very fundamental way: every tool exists to make a programmer's life easier, and consequently there are a lot of glue-code programmers. The few that actually impact the industry meaningfully (e.g. most notable software comes out of Bay Area) are very expensive because the supply of experts is limited.


This actually feels to me like a concept worth exploring. I think we lack a concise term or phrase to reference what GP was trying to communicate.

In my heart I feel similarly to GP - and it does feel a lot like how I feel about tragedy of the commons situations. There seems to be a shared opportunity for everyone if these private companies would make the most of their financial capital, market dominance, dominance in human resources, and most especially leverage their network effects.

That would lead to better things for everyone, like the invention of smartphones. But the same corporations can also waste unimaginable resources and achieve very little. Often their failures don't just have little effect, but rather the failures choke/smother the market and prevent better alternatives from being widely used.


and of the population that is likely to use copilot in production for their own work? 90%?


These are made up numbers. A control group is needed.



