
Did these Ocaml maintainers undergo some special course for dealing with difficult people? They show enormous amounts of maturity and patience. I'd just give the offender Torvalds' treatment and block them from the repo, case closed.




In my big tech company, you don't want to be dismissive of AI if you don't want to sound like a pariah. It's hard to believe how much faith leadership has in AI. They really want every engineer to use AI as much as possible. Reviewing is increasingly done by AI as well.

That being said, I don't think that's why reviewers here were so cordial, but this is the tone you'd expect in the corporate world.


I wouldn't say they were dismissive of AI, just that they are unwilling to merge code that they don't have the time or motivation to review.

If you want AI code merged, make it small so it's an easy review.

That being said, I completely understand being unwilling to merge AI code at all.


Why would you be unwilling to merge AI code at all?

Consider my other PR against the Zig compiler [1]... I was careful to make it small and properly document it but there's a strict anti-AI policy for Zig and they closed the PR.

Why?

Is it not small? Not carefully documented? Is there no value in it?

I'm not complaining or arguing for justice. I'm genuinely interested in how people think in this instance. If the sausage looks good and tastes great, and was made observing the proper health standards, do you still care how the sausage was made?!

[1] https://github.com/joelreymont/zig/pull/1 [2] https://ziggit.dev/t/bug-wrong-segment-ordering-for-macos-us...


Personally, I'm skeptical that it's a real bug, and that even if it is, that's the proper fix. For all I know, the LLM hallucinated the whole thing, terminal prompts, output, "fix" and all.

It takes time to review these things, and when you haven't shown yourself to be acting responsibly, there's no reason to give you the benefit of the doubt and spend time even checking if the damn alleged bug is real. It doesn't even link to an existing issue, which I'm pretty sure would exist for something as basic as this.

How do you know it's an issue? I think you're letting the always confident LLM trick you into thinking it's doing something real and useful.


> Why would you be unwilling to merge AI code at all?

Because structurally it's a flag for being highly likely to waste extremely scarce time. It's sort of like avoiding bad neighborhoods, not because everyone there is bad, but because there is enough bad there that it's not worth bothering with.

What stands out for me in these cases is that the AI sticks out like a sore thumb. Go ahead and use AI, but it's as if the low-effort nature of AI sets users on a course of staying low effort throughout the entire cycle of whatever it is they're trying to accomplish.

The AI shouldn't look like AI. The proposed contributions shouldn't stand out from the norm. This includes the entire process, not just the provided code. It's just a bad aesthetic, and for most people it screams "low effort."


I can't even reproduce your supposed "issue" regarding the Zig compiler "bug". I have an Apple Silicon Mac and tried your reproducer and zig compiled and ran the program just fine.

Honestly, I really suggest reading up on what self-reflection means. I read through your various PRs, and the fact that you can't even answer why a random author name shows up in your PR means the code can't be trusted. It's not just about attribution (although that's important). It's that it's such a simple thing that you can't even reason through.

You may claim you have written loads of tests, but that means literally nothing. How do you know they are testing the important parts? Also you haven't demonstrated that it "works" other than in the simplest use cases.


Check the 2nd PR, the one in my repo and not the one that was rejected.

> Why would you be unwilling to merge AI code at all?

Are you leaving the third-party aspect out of your question on purpose?

Not GP but for me, it pretty much boils down to the comment from Mason[0]: "If I wanted an LLM to generate [...] unreviewed code [...], I could do it myself."

To put it bluntly, everybody can generate code via LLMs and writing code isn't what defines the dominant work of an existing project anymore, as the write/verify-balance shifts to become verify-heavy. Who's better equipped to verify generated code than the maintainers themselves?

Instead of prompting LLMs for a feature, one could request the desired feature from the maintainers in the issue tracker and let them decide whether they want to generate the code via LLMs or not, discuss strategies etc. Whether the maintainers will use their time for reviews should remain their choice, and their choice only - anyone besides the maintainers should have no say in this.

There's also the cultural problem where review efforts are under- or unrepresented in any contemporary VCS, and the amount of merged code grants higher authority over a repository than any time spent doing reviews or verification (the Linux kernel might be an exception here?). We might need to rethink that approach moving forward.

[0]: https://discourse.julialang.org/t/ai-generated-enhancements-...


I'm strictly talking about the 10-line Zig PR above.

Well-documented and tested.


That's certainly a way to avoid questions... I mean sure, but everybody else is talking about how your humongous PRs are a burden to review.

Which is something I agreed with and apologized for, and admitted was somewhat of a PR stunt.

Now, what's your question?


> admitted was somewhat of a PR stunt.

You should be blocked, banned, and ignored.

> Now, what was your question?

Your attitude stinks. So does your complete lack of consideration for others.


You are admitting to wasting people’s time on purpose and then can’t understand why they don’t want to deal with you or give you the benefit of the doubt in the future?

It's worth asking yourself something: people have written substantial responses to your questions in this thread. Here you answered four paragraphs with two fucking lines referencing and repeating what you've already said. How do you expect someone to react? How can you expect anybody to take seriously anything you say, write, or commit when you obviously have so little ability, or willingness, to engage with others in a manner that shows respect and thought?

I really, truly don't understand. This isn't just about manners, mores, or self-reflection. The inability or unwillingness to think about your behavior or its likely reception are stupefying.

You need to stop 'contributing' to public projects and stop talking to people in forums until you figure this stuff out.


>I really, truly don't understand. This isn't just about manners, mores, or self-reflection. The inability or unwillingness to think about your behavior or its likely reception are stupefying.

Shower thought: what does a typical conversation with an LLM look like? You ask it a question, or you give a command. The model spends some time writing a large wall of text, or performing some large amount of work, probably asks some follow up questions. Most of the output is repetitive slop so the user scans for the direct answer to the question, or checks if the tests work, promptly ignores the follow-ups and proceeds to the next task.

Then the user goes to an online forum and carries on behaving the same way: all posts are instrumental, all of the replies are just directing, shepherding, shaping and cajoling the other users to his desired end (giving him recognition and a job).

I'm probably reading too much into this one dude but perhaps daily interaction with LLMs also changes how one interacts with other text based entities in their lives.


I'll gladly discuss at length things that are near and dear to my heart.

Facing random people in the public court of opinion is not one of them!

Also, there's long-form writing in my blog posts, Twitter and Reddit.


Well, if you wanna contribute (at least as a proxy) to OSS, you need to deal with people and make them want to deal with you. If you don't do that, no PR, regardless of how perfect it is, will ever be accepted. If you're so sure that your strategy for the future of development is correct, then prove it by building your own project, where you can fully decide which contributions are accepted, even those which are 100% AI-generated. This should be easy, right? Once your project gains widespread adoption, you can show everybody that you've been right all along. Until then, it's just empty talk.

That's exactly their plan, it seems.

Remind me please, when did I sign up to meet your expectations?

My expectations are those of any reasonable, sensible person who has a modicum of software-development experience and any manners at all.

Incidentally, my expectations are also exactly the same as every other person who has commented on your PRs and contributions to discussion.

My expectations, lastly, are those of someone who evaluates job candidates and casts votes for and against hiring for my team.

Your website says repeatedly that you're open to work. Not only would I not hire you; I would do everything in my power to keep you out of my company and off my team. I'd wager good money that many others in this thread would, too.

If you have a problem with my expectations, you have a problem not with my expectations but with your own poor social skills and lack of professional judgment.


> Why would you be unwilling to merge AI code at all?

Because AI code cannot be copyrighted. It is not anyone's IP. That matters when you're creating IP.

edit: Assuming this is a real person I'm responding to, and this isn't just a marketing gimmick, having seen the trail you've left on the internet over the past few weeks, it strikes me of mania, possibly chatbot-induced. I don't know what I can say that could help, so I'm dropping out of this conversation and wish you the best.


This is a position that seems about as unenforceable as the claim that AI can't be trained on code whose copyright owners have not given consent.

The main reason for being unwilling to merge AI code is going to be that it sets a precedent that AI code is acceptable. Suddenly, maintainers need to be able to make judgement calls on a case-by-case basis of what constitutes an acceptable AI contribution, and AI is going to be able to generate far more slop than people will ever have the time to review and agree upon.


> This is a position that seems about as unenforceable as the claim that AI can't be trained on code whose copyright owners have not given consent.

This depends on what courts find; at least one non-precedent-setting case found model training on basically everyone's IP without permission to be fair use. If it's fair use, consent isn't needed, licenses don't matter, and the only way to prevent training on your content is to withhold it and gate it behind contracts that forfeit your clients' rights to fair use.

But that is beside the point, even if what you claim was the case, my point is that AI output isn't property. It's not property whether its training corpus was full of licensed or unlicensed content. This is what the US Copyright Office determined.

If you include AI output in your product, that part of it isn't yours. It isn't anybody's, so anyone can copy it and anyone can do whatever they want with it, including the AI/cloud providers you allowed your code to get slurped up to as context to LLMs.

You want to own your IP, you don't want to say "we own 30% of the product we wrote, but 70% of it is non-property that anyone can copy/use/sell with no strings attached, even open source licenses". This matters if you're building a business or open source project. If you include AI code in your open source project, that part of the project isn't covered by your license. LLMs can't sign CLAs and they can't produce intellectual property that can be licensed or owned. The more of your project that is developed by AI, the more it is not yours, and the more of it cannot be covered by your open source license of choice.


> This is what the US Copyright Office determined.

There are hundreds of countries in the world. Whatever the "US Copyright Office" determines, applies to only one of them.


Find me a jurisdiction where AI output is the IP of the prompter

> This is a position that seems about as unenforceable as the claim that AI can't be trained on code whose copyright owners have not given consent.

Stares at facebook stealing terabytes of copyrighted content to train their models

Also, even if code is trained only on FLOSS-approved licenses, GPL-based ones have some caveats that would disqualify many projects from including that code.


If poo-flinging monkeys are making the sausage, people don't care how good the sausage is.

This is a good point. There's a lot of cheering for the Linus swearing style, but if the average developer did that they'd eventually get a talking-to by HR.

Please name it, so that we can know to avoid it and its products.

I think you naturally undergo that course when you are maintainer of a large OSS project.

Well, you go one of two ways. Classic Torvalds is the other way, until an intervention was staged.

There's a very short list of people who can get away with "Classic Torvalds"

Frankly the originator of this pull request deserves the "classic Torvalds" treatment, no matter who is delivering it.

In fact, anyone can get away with it. What are they going to do, call the police because you called them a moron?

Consider the facts: (1) most open-source maintainers have a real job (2) an unwritten rule of corporate culture is "(appear) nice" (see all of the "corporate speak" memes about how "per my last email" means "fuck off") (3) these developers may eventually need a job at another company (4) their "moron" comment is going to live forever on the internet...

If you're famous enough for that to filter through to whoever is handling your resume, if anything it will be positive.

It's clear some people have had their brain broken by the existence of AI. Some maintainers are definitely too nice, and it's infuriating to see their time get wasted by such delusional people.

That’s why AI (and bad actors in general) is taking advantage of them. It’s sick.

> "It's clear some people have had their brain broken by the existence of AI."

The AI wrote code which worked, for a problem the submitter had, which had not been solved by any human for a long time, and there is limited human developer time/interest/funding available for solving it.

Dumping a mass of code (and work) onto maintainers without prior discussion is the problem[1]. If they had forked the repo, patched it themselves with that PR and used it personally, would they have a broken brain because of AI? They claim to have read the code, tested the code, they know that other people want the functionality; is wanting to share working code a "broken brain"? If the AI code didn't work - if it was slop - and they wanted the maintainers to fix it, or walk them through every step of asking the AI to fix it - that would be a huge problem, but that didn't happen.

[1] copyrightwashing and attribution is another problem but also not one that's "broken brain by the existence of AI" related.


>They claim to have read the code, tested the code, they know that other people want the functionality; is wanting to share working code a "broken brain"?

There is clearly a deviation between the amount of oversight the author thinks they provided and the actual level of oversight. This is clear by the fact that they couldn’t even explain the misattribution. They also mention that this is not their area of expertise.

In general, I think that it is a reasonable assumption that, if you couldn’t have written the code yourself, you’re in no position to claim you can ensure its quality.


If a manager says they provided oversight of their developer employees, and the code was not as good as the manager thought, would you say "the manager has had their brain broken by the existence of employees"?

I'll bite, let's grant for the sake of the argument that equaling the LLM with a person holds.

This manager is directly providing an unrelated team with an unprompted 150-file giant PR dumped at once with no previous discussion. Upon questioning, he says the code has been written by an outside contractor he personally chose.

No one has onboarded this contractor to the team, and checking their online presence shows lots of media appearances, but not a single production project in their CV, much less long time maintenance.

A cursory glance at the files reveals that the code contains copypasted code from stackoverflow to the point that the original author's name is still pasted in comments. The manager can not justify this, but doesn't seem bothered by the fact, and insists that the contractors are amazing because he's been following them in social networks and is infatuated with their media there.

Furthermore, you check the manager's history in slack and you see 15 threads of him doing the same for other teams. The ones that have agreed to review their PRs have closed them for being senseless.

How would you be interacting with this guy?


This was a pretty spot on analogy. In particular “the manager cannot justify this, but doesn't seem bothered by the fact, and insists that the contractors are amazing” is too accurate.

> If a manager says they provided oversight of their developer employees, and the code was not as good as the manager thought, would you say "the manager has had their brain broken by the existence of employees"?

That could be either regular incompetence or a "broken brain." It's more towards the latter if the manager had no clue about what was going on, even after having it explained to him.

This guy is equivalent to a manager who hired two bozos to do a job, but insisted it was good because he had them check each other's work and what they made didn't immediately fall down.


...Well, if you want to make an argument for calling them "useless and incompetent", I'd say you have a great point; a good manager would at least throw it to QA and/or recruit someone better after the failure.

By testing the code I mean that I actually focused on the tests passing and on the output in the examples being produced by the AI running lldb with this modified compiler.

It's clear Claude adapted code directly from the OxCaml implementation (the PR author said he pointed Claude at that code [1] and then provides a ChatGPT analysis [2] that really highlights the plagiarism, but ultimately comes to the conclusion that it isn't plagiarized).

Either that highlights someone who is incompetent or they are willfully being blasé. Neither bodes well for contributing code while respecting copyright (though mixing and matching code on your own private repo that isn't distributed in source or binary form seems reasonable to me).

[1]: https://github.com/ocaml/ocaml/pull/14369#issuecomment-35573...

[2]: https://github.com/ocaml/ocaml/pull/14369#issuecomment-35566...


The key is that AI adapted, not stole.

It's actually capable of reasoning and generating derivative code and not just copying stuff wholesale.

See examples at the bottom of my post:

https://joel.id/ai-will-write-your-next-compiler/


Sorry, this is just ridiculous and shows how fragile people really are. This whole topic, and the whole MR as well.

I am routinely looking into the folly implementation, sometimes into the libstdc++, sometimes into libc++, sometimes into boost or abseil etc. to find inspiration for problems that I tackle in other codebases. By the same standards, this should also be plagiarism, no? I manufacture new ideas by compiling existing knowledge from elsewhere. Literally every engineer in the world does the same. Why is AI any different?


Perhaps because the AI assigned copyright in the files to the author of the library it copied from and the person prompting it told it to look at that library. Without even getting into the comedy AI generated apologia to go with it which makes it look worse rather than better.

From a pragmatic viewpoint, as an engineer you assign the IP you create over to the company you work for, so plagiarism has real-world potential to lose you your job, at best. There's a difference between taking inspiration from something unrelated ("oh, this is a neat algorithmic approach to solving this class of problems") and "I need to implement this specific feature and it exists in this library, so I'll lift it nearly verbatim".


Can you give an example of what exactly was copied? I ask because I took a look at the MR and the original repo, and the conclusion is that the tool only copy-pasted the copyright header but not the code. So I am still wondering: what's wrong with that (it's a silly mistake even a human could make), and where is the copyright infringement everyone is talking about?

> copy-past[ing] the copyright header but not the code [is] a silly mistake even a human can make

Do you mind showing me some examples of that? That seems so implausible to me

Just for reference, here's another example of AI adding phantom contributors and the human just ignoring it or not even noticing: https://github.com/auth0/nextjs-auth0/issues/2432


Oh wow. That's just egregious. Considering the widespread use of Auth0, I'm surprised this isn't a bigger story.

> Do you mind showing me some examples of that? That seems so implausible to me

What's so special about it that I need to show you the example?


You are claiming humans copy-and-paste copyright headers without copying the corresponding code. To prove you're correct, you only need to show one (or a few) examples of it happening. To prove you incorrect, someone would have to go through all code in existence to show the absence of the phenomenon.

Hence the burden of proof is on you.


No code besides the header was copied so I am asking what is so problematic about it?

that was already explained before

None of that matters. The header is there, in writing, and discussed in the PR. It is acknowledged by both parties, and the author gives a clumsy explanation for its existence. The PR is tainted by this alone, not to mention the other pain points.

You may not consider this problematic. But maintainers of this project sure do, given this was one of the immediate concerns of theirs.


OxCaml is a fork of OCaml, they have the same license.

I wasn't able to find any chunks of code copied wholesale from OxCaml which already has a DWARF implementation.

None of that code was written by Mark; the AI just decided to paste his copyright all over it.


It matters because it completely weakens their stance and makes them look unreasonable. The header is irrelevant since it isn't copyright infringement, and FWIW, when it was corrected (in the MR), they decided that the MR was too complex for them and closed the whole issue. Ridiculous.

An incorrect copyright header is a major red flag for non-technical reasons. If you think it is an irrelevant minor matter, then you do not understand several very important social and legal aspects of the issue.

Social, maybe, yes, but what legal aspects? Everybody keeps repeating that, but there is no copyright infringement. Maybe you can point me to one?

I understand that people are uncomfortable with this, and I likely am too, but looking at it objectively, there's technically nothing wrong here, or different from what humans already do.


The point is that it ended up in the PR in the first place. The submitter seemed unaware of its presence and only looked into it after it was pointed out. This is sloppy and is a major red flag.

So there's no point? Sloppy, maybe, but technically incorrect or legally questionable, no. The struggle is real.

If the submitter is sloppy with things that are not complicated, how can one be sure of things that ARE complicated?

The funny thing is that it works, have a look at the MR. It says:

  All existing tests pass. Additional DWARF tests verify:

  DWARF structure (DW_TAG_compile_unit, DW_TAG_subprogram).
  Breakpoints by function and line in both GDB and LLDB.
  Type information and variable visibility.
  Correct multi-object linking.
  Platform-specific relocation handling.
So the burden of proof is obviously no longer on the MR submitter's side but on the other.

Yes?

That is why some people are forbidden from contributing to projects if they have read code from projects with incompatible licenses, in case things ever go to copyright court.


Yes what? Both oxcaml and ocaml have compatible LGPL licenses so I didn't get your argument.

But even if that hadn't been the case, what exactly would be the problem? Are you saying that I cannot learn from a copyrighted book written by some respected and known author, and then apply that knowledge elsewhere because I would be risking to be sued for copyright infringement?


The wider point is that copyright headers are a very important detail and that a) the AI got it wrong b) you did not notice c) you have not taken on board the fact that it is important despite being told several times and have dismissed the issue as unimportant

Which raises the question: how many other important incorrect details are buried in the 13k lines of code that you are unaware of and unable to recognise the significance of? And how much maintainer time would you waste being dismissive of those issues?

People have taken the copyright header as indicative of wider problems in the code.


Yes, then please find those for-now-imaginary issues and drill through them? Sorry, but I haven't seen anyone in that MR calling out technical deficiencies, so this is just crying out loud in public for no concrete reason.

It's the same as if your colleague sitting next to you would not allow the MR to be merged for various political and not technical reasons - this is exactly what is happening here.


> Yes, please then find those for now imaginative issues and drill through them?

No, that is a massive amount of work which would only establish what we already know with a high degree of certainty due to the red flags already mentioned: that this code is too flawed to begin with.

This is not political; this is looking out for warning signs in order to avoid wasting time. At this stage the burden of proof is on the submitter, not the reviewers.


Too flawed? Did you miss that tiny detail that the MR fixes a long-standing issue for OCaml? This is exactly political, because there's no legal or technical issue. Only fluff by scared developers. I have no stakes in this, but I'm sincerely surprised by the amount of unreasonable and unsubstantiated claims and explanations given in this thread and the MR.

"Yes what? Both oxcaml and ocaml have compatible LGPL licenses so I didn't get your argument."

LGPL is a license for distribution, the copyright of the original authors is retained (unless signed away in a contribution agreement, usually to an organization).

"Are you saying that I cannot learn from a copyrighted book written by some respected and known author, and then apply that knowledge elsewhere because I would be risking to be sued for copyright infringement?"

This was not the case here, so not sure how that is related in any way?


Do you understand that no code besides the header copyright was copied? So what copyright exactly are you talking about?

Depends on the license of the original material, which is why they tend to have a list of allowed use cases for copying content.

Naturally there are very flexible ones, very draconian ones, and those in the middle.

Most people get away with them, because it isn't like everyone is taking others to copyright court sessions every single day, unless there are millions at play.


Also, I just took a glance at the PR, and even without knowing the language it took 10 seconds for the first WTF: the .md documents Claude generated are added to .gitignore, including the one for the PR post itself.

That’s the quality he’s vouching for.


People complaining about AI stealing code may not realize that OxCaml is a fork of the code that AI is modifying. Both forks have the same license and there are people working on both projects.

AI did paste Mark's copyright on all the files for whatever reason but it did not lift the DWARF implementation from OxCaml and paste it into the PR.

The PR wasn't one-shot either. I had to steer it to completion over several days, had one AI review the changes made by the other, etc. The end result is that the code _works_ and does what it says on the tin!


You are under a delusion. I’m serious.

I wonder if it's the best outcome? The contributor doesn't seem to have a bad intention, could his energy be redirected more constructively? E.g. encouraging him to split up the PR, make a design proposal etc.

The constructive outcome is the spammer fucks off or learns how to actually code.

Lots of people all over the world learn some basics of music in school, or even learn how to play the recorder, but if you mail The Rolling Stones with your "suggestions" you aren't going to get a response and certainly not a response that encourages you to keep spamming them with "better" recommendations.

The maintainers of an open source project are perfectly capable of coercing an LLM into generating code. You add nothing by submitting AI created code that you don't even understand. The very thought that you are somehow contributing is the highest level of hubris and ego.

No, there is nothing you can submit without understanding the code that they could not equally generate or write, and no, you do not have an idea so immensely valuable that it's necessary to vibe-code a version.

If you CAN understand code, write and submit a PR the standard way. If you cannot understand code, you are wasting everyone's time because you are selfish.

This goes for LLM generated code in companies as well. If it's not clear and obvious from the PR that you went through and engineered the code generated, fixed up the wrong assumptions, cleaned up places where the LLM wasn't given tight enough specs, etc, then your code is not worth spending any time reviewing.

I can prompt Claude myself thank you.

The primary problem with these tools is that assholes are so utterly convinced that their time is infinitely valuable and my time is valueless because these people have stupidly overinflated egos. They believe their trash, unworkable, unmaintainable slop puked out by an LLM is so damn valuable, because that's just how smart they are.

Imagine going up to the Civil Engineer building a bridge and handing them a printout from ChatGPT when you asked it "How do you build a bridge" and feeling smug and successful. That's what this is.


> The maintainers of an open source project are perfectly capable of coercing an LLM into generating code. You add nothing by submitting AI created code that you don't even understand. The very thought that you are somehow contributing is the highest level of hubris and ego.

Thank you! I was struggling to articulate why AI generated PRs annoy me so much and this is exactly it.


I believe there's a flood of people waiting to be able to "contribute" by publishing a lot of LLM-generated code. My question is: what if they manage to grab resources from the original devs?

I think it's on me to redo the PR and break it into smaller pieces.

There's value in the PR in that it does not require you to install the separate OxCaml fork from Jane St, which doesn't work with all the OCaml packages. Or didn't when I tried it back in August.


A big part of software engineering is maintenance not just adding features. When you drop a 22,000 line PR without any discussion or previous work on the project, people will (probably correctly) assume that you aren't there for the long haul to take care of it.

On top of that, there's a huge asymmetry when people use AI to spit out huge PRs and expect thorough review from project maintainers. Of course they're not going to review your PR!


AI actually has the advantage here in my experience. Yes, you can do AI wrong and tell it to just change code, write no documentation, provide no notes on the changes, and not write any tests. But you would be dumb to do it that way.

As it stands now you can set AI to do actual software development with documentation, notes, reasoning for changes, tests, and so on. It isn't exactly easy to do this, and a novice to AI and software development definitely wouldn't set it up this way, but it isn't beyond what the tech can really do. There is a lot to be done in using different AIs to write tests and code (well, don't let an AI that can see the code write the tests, or you'll just get a bunch of change-detector crap), but in general it mostly turns out that all the things SWEs can do to improve their work also work on AI.


Note that this PR works, was tested, etc.

I was careful to have AI run through the examples in the PR, run lldb on the sample code and make sure the output matches.

Some of the changes didn't make it in before the PR was closed but I don't think anyone bothered to actually check the work. All the discussion focused on the inappropriateness of the huge PR itself (yes, I agree), on it being written by AI... and on the AI somehow "stealing" work code.


I'm actually not talking about whether the PR works or was tested. Let's just assume it was bug-free and worked as advertised. I would say that even in that situation, they should not accept the PR. The reason is that no one is the owner of that code. None of the maintainers will want to dedicate some of their volunteer time to owning your code/the AIs code, and the AI itself can't become the owner of the code in any meaningful way. (At least not without some very involved engineering work on building a harness, and since that's still a research-level project, it's clearly something which should be discussed at the project level, not just assumed).

> but I don't think anyone bothered to actually check the work

Including you


I've been finding that the documentation the AI writes isn't so much for humans, but for the AI when it later goes to work on the code again... which is to say, AI benefits from good PRs as much as people do. You could ask the AI to break up the PR next time if possible; it will probably do so much more easily than you could do it manually.

You can ask AI to write documentation for humans.

Also, I'll try to break up the PR sometime but I'm already running Claude using two $200/mo accounts, in addition to another $200/mo ChatGPT, and still running into time limits.

I want to finish my compilers first.


What forces you to publish this work as a PR, or as many PRs? You could have simply kept that for yourself, since you admitted in the PR discussion that you found it useful. Many people seem to think you haven't properly tested it, so that would also be a good way of testing it before publishing it, wouldn't it?

Is that (or should it be) the goal, responsibility, or even purview of the maintainers of this project?

I honestly reread the whole thread in awe.

Not due to the submitter, as clickbaity as it was, but reading the maintainers and comparing their replies with what I would have written in their place.

That was a masterclass of defending your arguments rationally, with empathy, and leaving negative emotions at the door. I wish I was able to communicate like this.

My only doubt is whether this has a good or bad effect overall, given that the PR's author seemed to be having their delusions enabled, if he was genuine.

Would more hostility have been productive? Or is this a good general approach? In any case it is refreshing.


Years back I attended someone doing an NSF outreach tour in support of Next Generation Science Standards. She was breathtaking (literally - bated breath on "how is that question going to be handled?!?"). Heartfelt hostile misguided questions, too confused to even attain wrong, somehow got responses which were, not merely positive and compassionate, but which managed to gracefully pull out constructive insights for the audience and questioner. One of those "How do I learn this? Can I be your apprentice?" moments.

The Wikipedia community (at least 2 decades back) was also notable. You have a world of nuttery making edits. The person off their meds going article by article adding a single letter "a". And yet a community ethos that emphasized dealing with them with gentle compassion, and as potential future positive contributors.

Skimming a recent "why did Perl die" thread, one thing I didn't see mentioned... The Perl community lacked the cultural infrastructure to cope with the eternal September of years of continuous newcomer questions, becoming burned out and snarky. The Python community emphasized its contrast with this: "If you can't answer with friendly professionalism, we don't need your reply today" (or something like that).

Moving from tar files with mailing lists, to now community repos and git and blogs/slack/etc, there's been a lot of tech learned. For example, Ruby's Gems repo was explicitly motivated by "don't be python" (then struggling without a central community repo). But there's also been the social/cultural tech learned, for how to do OSS at scale.

> My only doubt is whether this has a good or bad effect overall

I wonder if a literature has developed around this?


I don't think 'hostility' is called for, but certainly a little bit more... bluntness.

But indeed, huge props to the maintainers for staying so cool.


I work with contractors in construction and often have to throw in vulgarity for them to get the point. This feels very similar to when I'm too nice

I think it's really good for people to have good case studies like this that they can refer to as justification in the case of AI PRs, rather than having to take the time themselves.


