Security starts with deep understanding. Some standards and practices can help avoid certain types of problems, and some are even rather effective (like airgapping your systems), but there is no way to assure security in general other than truly understanding what you are doing.
**
I feel like Copilot is the wrong direction in which to optimize development. It is mostly going to help people who already have a poor understanding of what they are doing create even more crap.
For a good developer those low-level, low-engagement activities are not a problem (except maybe at the learning stage, where you actually want people engaged rather than copy/pasting). What it does not help with are the important parts of development -- defining the domain of your problem, designing good APIs and abstractions, understanding how everything works and fits together, understanding what your client needs, etc.
Also, I feel this is going to increase complexity by making more copies of the same structures throughout the codebase.
My working theory is that this is going to hinder new developers even more than Google and Stack Overflow already do. Every time you give new developers an easier way to copy/paste code without understanding it, you rob them of an opportunity to gain a deeper understanding of what they are doing, and in effect you prevent them from learning and growing.
It is a little bit like giving your kids the answers to their homework without giving them a chance to arrive at the answer themselves or explaining anything about it.
**
Another way I feel this is going to hurt developers is the competition over who can produce the greatest volume of code.
I have already noticed this trend where developers (especially more junior ones aspiring to advance) try to outcompete others by producing more code, closing more tickets, etc. Right now that means skipping understanding of what is going on in favor of getting easy answers from the Internet.
These guys can produce huge amounts of code with relatively little actual engagement.
To management (especially with the wrong incentives) this looks like a perfect worker, because management usually doesn't understand the connection between a lack of engagement and planning at design/development time and the problems that show up later (or they don't feel they are the ones who will pay the price).
Copilot is probably going to make it even more difficult for people who want to do it the right way, because the difference in false productivity measurements will be even starker.
I've seen so much boilerplate in the Java and classic .NET Framework world, it's incredible. So many layers of DTOs, Request/Response models and so on that could simply be generated. Or, most of the time, removed completely (that would cost some "architects" their jobs though).
This is also true for a lot of Redux or Angular/NgRx applications. So much boilerplate that you can't find the relevant code anymore.
(I have been professionally programming Java backends for the past 16 years).
Java is not the culprit here.
I think it is something that happened along the way, something to do with J2EE and the patterns craze we had a decade or two ago.
It doesn't help that frameworks like Spring and their documentation go out of their way to propagate these boilerplate-heavy patterns.
Copying these lazy patterns is the shortest, easiest way to get to a working solution for a person who doesn't want to put in any extra effort. And you can't get punished for doing it. Most developers don't even know there are any possibilities other than the mandatory controller calling a service calling a database layer, plus hordes of DTOs that some people call a "model".
I'm working on a Dart / Flutter project where most devs are coming from Java and Android backgrounds. For me, coming mostly from JavaScript, TypeScript, and Python, the amount of pointless over-engineering is very frustrating.
We need to jam every change through 10 layers now, because of "clean architecture". The team is very slow and can't implement even small changes quickly.
The worst part is that I feel like I'm the idiot for questioning whether the 50 classes (DTOs, models, mappers, blablabla) actually make sense and reduce coupling. I see that any time a tiny requirement changes, I need to update the 50 classes again, so in practice it just doesn't bring anything positive.
When I raise my concerns, they just roll their eyes and make me feel like "I'm just not a senior enough guy" who just accidentally got onto the team.
I feel your pain, I am in much the same situation just in a tech lead position.
It takes a lot of patience to undo this damage and explain that simplicity is much more important than lazily, mindlessly repeating "best" practices. I am using quotes intentionally because they aren't actually best -- "best" would suggest there are no better practices which obviously cannot be true.
The goal should always be to make the application simple and easy to work with. Patterns should be tools to achieve the goal rather than being goals themselves.
Simplicity is important because it allows you to understand your application (which matters for developer efficiency as well as reliability). It also lets you modify your application much more easily (more code usually means more work to change it), which is important for fighting technical debt and reducing the cost of any future development.
> When I raise my concerns, they just roll their eyes and make me feel like "I'm just not a senior enough guy" who just accidentally got onto the team.
Easier to do that than admit to technical debt. Some people cannot acknowledge a problem and live alongside it if it is too large to tackle immediately. It has to be explained away or compartmentalized. How simple it is to blame the messenger. Sorry you had to experience that from your team.
> We need to jam every change through 10 layers now, because of "clean architecture".
I feel like this is a rather unfair comment because it doesn't sound like a situation created by "clean architecture."
Granted, you probably should not try to force every detail into this architecture, just like you should not rewrite a perfectly good library just because it does not fit into it nicely. But even then, drilling through half a dozen or more layers for every change sounds just wrong. There should not be just any kind of separation in your program; there should be a separation of concerns.
The real problem, it seems to me, is a culture in which "We do $X because of $AUTHORITY." is regarded as a sensible answer to criticism. I have worked with exceptionally confident, almost blinkered, people in charge of the big picture and never once have I heard a bullshit answer like that.
I think the other thing is, there's clean architecture and then there's Clean Architecture TM, where the thing is taken literally (leading to slavishly applying all the layers with lots of boilerplate, useless mocks and ridiculously coupled unit tests, over-architected dependency injectors (assemblies), etc.).
I was honestly surprised when I watched a series of lectures from Mr. Uncle (Bob) where he clarified a lot of things, such as "use dependency injection only where it matters", "unit tests should be replaced with integration tests after a system is finished being implemented", and "slavish following of agile 'customs' is unproductive", etc.
I think a lot of issues could be resolved if people took the time to think and listen carefully about these things and didn't stop at the first couple of search hits for "clean architecture".
I totally feel your pain. Go and work somewhere else.
What you can sometimes do is remove all those layers in order to implement or fix something. Then tell the team that you didn't have time "to do it properly" and focused on functionality and efficiency over design. Management loves that.
And after it's done, some of those abstraction nazis can refactor all those abstractions back in, so they don't distract you from the next meaningful task.
But make sure that you understand what benefit the decoupling brings, because sometimes it is useful, just not often enough.
It definitely is the culprit. They didn't even want to add `var` to the language until recently, and let's not even get into the anonymous classes vs. lambdas debacle.
These are just the things they eventually buckled on; Java is extremely boilerplate-heavy, and the bad patterns and XML crap were invented to deal with that problem.
DDD and onion architecture are another issue, mostly coming out of the TDD movement and "make everything unit-testable". If there is one thing I liked about working with Rails, it's that it gave you E2E tests from the start. But if Java/.NET were more flexible (dynamic or FP languages do far better at enabling simple unit testing, in my experience), mocking would be simpler, so the unit-testing part would be simpler too.
The TDD movement came from Smalltalk programmers (not that I think it has anything to do with Smalltalk, it's just where the programmers came from). In my experience it was in Ruby and JavaScript code that I have seen the most inane micro (nano?) unit tests. This was partly because the unit tests were testing what a static typechecker could verify automatically (at the cost of a little verbosity).
I don't see how you figured there is a relationship between DDD and "make everything unit testable". DDD is about high level architecture. It's at the opposite end of the spectrum.
Also, Smalltalkers didn't mean the same thing by "unit". What would be called unit tests in one context may mostly be considered integration tests in another.
I agree, the crazy mock- and stub-heavy unmaintainable micro unit testing thing seems to have been an innovation that came out of the Ruby scene.
Maybe I shouldn't call it the TDD camp, but the people who were proselytizing TDD were also selling onion/DDD in the C# world, so I just lump them together; it's more about onion and layered architectures, testing at different layers, mocking them, etc.
While I consider TDD the wrong approach in 90% of scenarios, in dynamic languages it works out much better because the object model is so flexible you can mock just about anything trivially. In C# and Java it's just boilerplate on top of boilerplate.
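For a concrete sense of what "trivial mocking" means in a dynamic language, here is a minimal Python sketch (the Gateway/checkout names are hypothetical; only the standard-library unittest.mock is used):

    from unittest import mock

    class Gateway:
        def charge(self, amount_cents):
            # imagine this calls a real payment API
            raise RuntimeError("don't call this in tests")

    def checkout(gateway, amount_cents):
        return gateway.charge(amount_cents)

    # No interfaces or DI container needed: just hand in a stand-in object.
    fake = mock.Mock(spec=Gateway)
    fake.charge.return_value = "ok"
    assert checkout(fake, 1999) == "ok"
    fake.charge.assert_called_once_with(1999)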
>They didn't even want to add `var` to the language until recently, and let's not even get into the anonymous classes vs. lambdas debacle.
This realization is always endlessly infuriating to me as someone who was taught way too much Java at university and had to intentionally push myself into other languages to realize why simple things like higher-order functions were subjugated under the tyranny of classes in Java.
But far worse than that is the absolutely abysmal and destructive philosophy around types in Java. Just mash them together with namespaces and classes, and then nerf type inference to the point that it couldn't infer even the most trivial case, where the two types are literally identical: `Object o = new Object()`.
Just because objects are involved doesn't mean it is OOP code... OOP is completely misunderstood, especially in the Java community, to the point where it is pretty difficult to find actually object-oriented code.
A "service" with a bunch of stateless functions (I am intentionally not calling them methods) is really just a library of routines, and the class is used mostly for namespace purposes (to group related functions together) and maybe to deliver access to some dependencies. But those dependencies can be thought of in almost the same way as global variables in a C program, because usually only a single instance of the service exists.
Nor are the DTOs being passed between these services an OOP mechanism -- they are almost C-like structs that make it easier to pass data between functions and to have a single reference to it. The only exceptions maybe are things like equals(), hashCode(), etc., but that is a very shallow use of OOP patterns.
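To make the point concrete, here is a rough sketch of that shape (in Python rather than Java, with hypothetical names): a DTO that is essentially a struct, and a "service" that is essentially a namespaced bag of stateless functions with one dependency.

    from dataclasses import dataclass

    @dataclass
    class OrderDto:                 # a C-like struct: fields and equality, nothing else
        order_id: int
        total_cents: int

    class OrderService:
        # the class is mostly a namespace; typically one instance exists app-wide
        def __init__(self, orders_by_id):
            self._orders_by_id = orders_by_id   # dependency, close to a global

        def get_order(self, order_id):
            return OrderDto(order_id=order_id,
                            total_cents=self._orders_by_id[order_id])

    service = OrderService({42: 1999})          # made-up example data
    assert service.get_order(42) == OrderDto(42, 1999)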
So it is really difficult to say this is abuse of OOP, when there is very little of actual OOP in it.
The pattern you are describing here (Service classes with static'ish methods together with data classes) is a very functional approach (modules and records). I would consider this much better than "real" OOP.
The OOP "abuse" I was referring to is mostly caused by inheritance. Five or more levels of inheritance is not so uncommon in some enterprise business logic. And once you have to work with that, you arrived in hell. Especially if it is split into different projects, that you can't navigate or debug as one easily.
An IDE or other tools will generate correct boilerplate code. This seems like a gripe from someone who prefers hidden, magical code that sets things up behind their back.
The evil is that someone trained an AI on random text, not even on an AST, so it's garbage in and, no surprise, garbage out.
A true AI would understand that "the dev wants to find all lines of text in a file that have this property"; this AI just does "this code string is similar to this other code string according to this `black box metric`".
The problem with boilerplate code is that it is mostly generated once (by the IDE) and then slightly modified. More like a template. In the end you get a lot of meaningless code (the generated part) with some meaningful parts inside. But you can't see anymore what was generated and what was added manually without a deep analysis of the commit log.
It is much better to generate code on the fly during the build, so it doesn't even go into source control and people can't modify it.
Code generated by magic is worse in my experience. With non-magic code you put a breakpoint where the project starts and you can step through it line by line, function by function, and it makes sense. What I hate are magic frameworks that are terrible at reporting issues; say, in Angular 1 you have some bindings and sometimes they don't trigger, and you can't debug the magic strings of the templates in the debugger to see what is happening (it could be a typo, but because it's all Google dev magic it won't warn you about it).
I hope we are talking about the same thing -- that you dislike creating classes and functions explicitly and prefer adding a comment or some templating system that generates a ton of obscure code behind the scenes.
Generated code doesn't have to be magical. The generated code should of course be reviewed by a person from time to time. It must be debuggable and readable/understandable too; otherwise it doesn't make any sense.
A good example of generated code is typed clients for an OpenAPI interface. Instead of writing a REST client on your own based on a spec, you generate it. And if something isn't right in the first place, don't edit the generated code -- fix/configure the generator instead!
Or database models. Either generate the database from the models or the models from the database.
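As a rough illustration (not the output of any particular generator), a typed client for a hypothetical GET /users/{id} endpoint might look something like this; the point is that the paths, parameters, and response model all come from the spec, so you regenerate rather than hand-edit:

    from dataclasses import dataclass
    import requests

    @dataclass
    class User:                       # response model derived from the spec
        id: int
        name: str
        email: str

    class UsersApi:
        def __init__(self, base_url):
            self.base_url = base_url

        def get_user(self, user_id: int) -> User:
            resp = requests.get(f"{self.base_url}/users/{user_id}")
            resp.raise_for_status()
            return User(**resp.json())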
Sure. Another good example is Qt: the Designer tool creates some XML describing the widgets you placed and their properties, and then a tool generates code that is easy to read (not obfuscated or clever).
Can you give some examples of what kind of bad code generation/boilerplate you mean when you think of Java/C#?
Not without trade-offs though; I've been on both ends of the spectrum. Template-generated code allows you to modify things if you know how, while generating on the fly needs a bunch of options, hooks, and, worse, string-based evaluation calls to make it modifiable.
So on-the-fly generation is better for code that needs little to no modification, while boilerplate/templates are better for things that will be modified.
> Template generated code allows you to modify things if you know how
That's exactly the problem. And because templates allow you to modify the code, template creators do no work on generalizing them to cover all the use cases. So, in practice, templates usually require that you modify them.
And now you have a huge codebase, mostly made of the default text but with some changes scattered at random, and one of those changes is breaking it. Good luck finding it.
Or you can use Lisp, call your boilerplate generator a macro, get rid of the boilerplate entirely, and never again have problems with modifying duplicated pieces of code.
Sigh... every time I see people generating code (say, through an external tool or through the IDE) I am always thinking how this could be a simple macro in any Lisp.
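You don't even need Lisp to get a taste of the idea; as a small Python sketch, a class decorator like dataclasses.dataclass derives the repetitive methods from one definition instead of copy/pasting them per class:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Money:
        amount_cents: int
        currency: str

    # __init__, __eq__, __hash__ and __repr__ are all derived; change the fields
    # and every generated method follows, with nothing to hand-edit.
    assert Money(1999, "EUR") == Money(1999, "EUR")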
Templates are a fancy way of copy&paste programming.
So if you need a lot of template code, then your design or your framework is not well suited for the task.
In software development the goal is usually to move commonly used functions or patterns into a library or a framework, instead of copy&pasting them with slight modifications.
Between explicit boilerplate and hidden magic, give me the boilerplate every time. At least then it is obvious (maybe painfully obvious) what code is executing. Hidden magic is the worst, since code should be optimized for reading and diagnosing, not for write-time, which only happens once.
This is because "copy'n'paste" programming gets more and more common.
I see more and more juniors pasting code or shell commands from Stack Overflow with careless ease, without even pretending anymore to be interested in how it actually works.
It is true that lines of code are correlated with bugs. In fact, that's the best predictor of the number of bugs - there was some study somewhere that concluded that.
I wonder if the way we are approaching this is wrong. We are basically putting text through a deep-learning black box. The model might have learned some abstractions, but all in all it is just playing word games and trying to guess the most likely continuation of a string. Maybe we should go in the other direction and base such an AI on a really massive ontology. Instead of unstructured strings, put highly structured facts into the model.
For example, just like in Copilot you'd start with:
def login_user(username, password):
But the ontology would also know things like:
- This is a web application and this function is going to be called after submitting a form
- Security specialist Bob says you should always hash your passwords
- Specialist Anne says you should use bcrypt
- Tom says Anne is 95% trustworthy
... and thousands more facts. And then it would take them all into consideration, build a representation of the problem you are trying to solve, find a strategy, and only at the end generate code.
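For concreteness, the kind of code such a fact-aware system might aim to produce (a sketch only; the in-memory user store is made up and the bcrypt library is assumed):

    import bcrypt

    # Hypothetical user store; real code would query a database.
    USERS = {"alice": {"password_hash": bcrypt.hashpw(b"correct horse", bcrypt.gensalt())}}

    def login_user(username, password):
        user = USERS.get(username)
        if user is None:
            return False
        # Applied facts: never store or compare plaintext; use bcrypt hashes.
        return bcrypt.checkpw(password.encode("utf-8"), user["password_hash"])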
I have a feeling that there was a qualitative leap going from simple neural networks and multivariate methods to "deep learning" and modern machine learning, and that this was mainly driven by scale and available computing power. Now what if we tried the same thing for ontologies, expert systems, and triple-store databases? I think the difference would be between an AI parroting what it read on Wikipedia (direct speech) and a smarter AI able to reason about what it read on Wikipedia (indirect speech).
I think one way this could be improved is, instead of giving an exact answer (which is provably impossible to do correctly), maybe it could point the developer to other repositories where other people were solving a similar problem.
There are already services that do this, and I actually find them useful. For example, I might be trying to use a function from some library and it fails. If I get pointed to public repositories that use the same library and function for a similar purpose, I may learn that I am missing some critical setup. I can also browse different uses of the function/library and see how it is, at the very least, used successfully by others.
>The Programmer's Apprentice Project: A Research Overview
>MIT AI Lab Memo No. 1004, November 1987.
>Rich, Charles; Waters, Richard C.
>Abstract: The goal of the Programmer's Apprentice project is to develop a theory of how expert programmers analyze, synthesize, modify, explain, specify, verify, and document programs. This research goal overlaps both artificial intelligence and software engineering. From the viewpoint of artificial intelligence, we have chosen programming as a domain in which to study fundamental issues of knowledge representation and reasoning. From the viewpoint of software engineering, we seek to automate the programming process by applying techniques from artificial intelligence.
>Plan Recognition in a Programmer's Apprentice. Ph.D. Thesis proposal.
>MIT AI Lab Working Paper 147, May 1977.
>Rich, Charles
>Abstract: Brief Statement of the Problem: Stated most generally, the proposed research is concerned with understanding and representing the teleological structure of engineered devices. More specifically, I propose to study the teleological structure of computer programs written in LISP which perform a wide range of non-numerical computations. The major theoretical goal of the research is to further develop a formal representation for teleological structure, called plans, which will facilitate both the abstract description of particular programs, and the compilation of a library of programming expertise in the domain of non-numerical computation. Adequacy of the theory will be demonstrated by implementing a system (to eventually become part of a LISP Programmer's Apprentice) which will be able to recognize various plans in LISP programs written by human programmers and thereby generate cogent explanations of how the programs work, including the detection of some programming errors.
Thanks for the term and the resources! Sometimes one has a vague idea and it's really nice to see that this is a thing people put thought into.
Funny that I would describe a solution based on machine learning as scruffy and a solution based on Bayesian logic and knowledge databases as neat, whereas Wikipedia defines it the other way around.
There's some fascinating historical back-story and quotes on the talk page of that wikipedia entry, and also an interesting question about how machine learning is neat:
>Roger Schank first used those terms "scruffy" and "neat" at an AI conference in the 1970s. He proudly called himself a scruffy. 71.183.59.144 (talk) 02:17, 26 October 2011 (UTC)
>The terminology is sourced to the late 1970s or early 1980s and originated by Schenk according to this:
>"In particular, certain personality traits go hand and hand with certain styles of research. Schank and Abelson hit upon one such phenomenon along these lines and dubbed it the neats vs. the scruffies. These terms moved into the mainstream AI community during the early 80s, shortly after Abelson presented the phenomenon in a keynote address at the Annual Meeting of the Cognitive Science Society in 1981. Here are some selected excerpts from the accompanying paper in the proceedings:"
>The article quotes a lengthy excerpt of this keynote address, some of which I include below
>“The study of the knowledge in a mental system tends toward both naturalism and phenomenology. The mind needs to represent what is out there in the real word, and it needs to manipulate it for particular purposes. But the world is messy, and purposes are manifold. Models of mind, therefore, can become garrulous and intractable as they become more and more realistic. If one’s emphasis is on science more than on cognition, however, the canons of hard science dictate a strategy of the isolation of idealized subsystems which can be modeled with elegant productive formalisms. Clarity and precision are highly prized, even at the expense of common sense realism. To caricature this tendency with a phrase from John Tukey (1969), the motto of the narrow hard scientist is, “Be exactly wrong, rather than approximately right”.
>The one tendency points inside the mind, to see what might be there. The other points outside the mind, to some formal system which can be logically manipulated [Kintsch et al., 1981]. Neither camp grants the other a legitimate claim on cognitive science.... an unnamed but easily guessed colleague of mine (Schenk?), who claims that the major clashes in human affairs are between the “neats” and the “scruffies”. The primary concern of the neat is that things should be orderly and predictable while the scruffy seeks the rough-and-tumble of life as it comes ... The fusion task is not easy. It is hard to neaten up a scruffy or scruffy up a neat. It is difficult to formalize aspects of human thought that which are variable, disorderly, and seemingly irrational, or to build tightly principled models of realistic language processing in messy natural domains.
>What are the difficulties in starting our from the scruffy side and moving toward the neat? The obvious advantage is that one has the option of letting the problem areas itself, rather than the available methodology, guide us about what is important. The obstacle, of course, is that we may not know how to attack the important problems. More likely, we may think we know how to proceed, but other people may find our methods sloppy. We may have to face accusations of being ad hoc, and scientifically unprincipled, and other awful things."
>Source is Chapter 5 of this book edited by Schenk and published in 1994, titled "Beliefs, Reasoning, and Decision Making: Psycho-logic in Honor of Bob Abelson". Article needs clean-up, which I am doing now.--FeralOink (talk) 13:58, 2 August 2021 (UTC)
>Machine learning is only provably correct for the known examples it was trained for. If that is not an adhoc approach to AI, then I don't know what is. Big data is the epitome of a scruffy. No model, just data, not formalism, besides fitting a curve/model to the given data. It is the exact same approach that scruffies follow: abstracting from examples for specific sub tasks.<unsigned>
>Just because some mathematical methods are employed, like optimization for a sub-problem, i.e. curve fitting, does not make the approach itself neat.
>Obviously, scruffies also use mathematically rigorous approaches, when employing provably correct algorithms, such as searching trees, or certain signal processing approaches.
>So far, the only valid "neats", are those doing GOFAI: they use a minimal model and deduce everything based on it, with no added assumptions or axioms along the way.
>Machine learning is only based on added assumptions/axioms: the training data. New for each problem, no general model.<unsigned>
>Yeah, I noticed that too. Not sure who introduced machine learning to the article. I'm trying to clean up, e.g. removing the jargon about scruffies just being casual hackers throwing stuff together in an ad hoc manner. I don't know enough about the people involved though. I know about the methods you mention (curve fitting, converging series, mathematical modeling) but not necessarily who did what. I don't even know whether most of these guys, the neats OR the scruffies, would be comfortable with "big data" (i.e. lots of specious results with very low cost of being wrong).--FeralOink (talk) 10:48, 3 August 2021 (UTC)
Do you think only experts should be programming? I'm an amateur programmer, and I think Copilot could help me a lot with unimportant things, as you said -- I even tried to install it but I'm not on some list. I can read code and have built a few programs; I've hired around 30 different programmers in my life, and the vast majority are clearly copy-paste-adapters. The way I see it, that happens because programming is still way more complex than it should be -- and Copilot will help with that. Maybe you are thinking about elite developers or developers at big companies, but I think it will be greatly beneficial for us low-level coders: amateurs, freelancers, and fresh people. Am I wrong?
> Do you think only experts should be programming? I'm an amateur programmer (...)
Amateur vs professional and novice vs expert are completely separate things.
You can be a professional novice just as you can be an expert amateur.
Now, the answer to your question is an obvious "NO". To be an expert you have to be a novice first.
The problem rather is "Are you making progress towards being an expert or are you just learning to more efficiently execute your novice workflow?"
> The way I see it, that happens because programming is still way more complex than it should be - and copilot will help with that.
No, it is just an illusion of help.
Just as your son may thank you for help when you give him the answer to his homework. From his point of view you have helped him, true, but from another point of view the point of the task wasn't to deliver an answer to the teacher; it was to imprint something valuable on the mind of the child.
I see, sorry for the wrong words. I am an amateur programmer, not a complete novice, and I have contracted mid-level professional programmers.
I understand your point about learning and getting better at it. All I'm saying is that most programmers won't become experts: the market doesn't demand it, and most just aren't able to or don't want to.
No-code will make a huge impact in the next decade, imo.
In my specific case, I could become an expert programmer, but I don't intend to because I have other career choices. So I think Copilot would be of great help.
Why did you redirect OP's question about amateurs to one about novices?
For amateurs, the homework is a great analogy - they don't need a lesson, they need a calculator so they can get back to the professional work they are doing.
I think the argument is that an amateur with Copilot is going to stay an amateur longer than someone without Copilot, while it simultaneously only helps them create something that no one -- including them -- should rely on: it teaches the wrong habits and helps with the wrong problem.
I sympathize with where you're coming from, but the phrase "unimportant things" bothers me. I'm always seeing clients deploy alpha or beta software in production. I see tech companies accumulating tech debt like nobody's business. None of that should happen. And often disasters involving tech get traced back to a cascading failure that started with something considered unimportant.
I love that software is an accessible discipline to hobbyists and that it empowers people. But it needs to be a discipline, top to bottom. We need deep understanding with security and robustness as fundamentals, good practices, and all of that baked into our tools.
From my experience with tutoring, I would say that it won't help. The best way to learn is to mechanically do the work. Having things fed to you yields poorer results, IME.
Another parallel: language learning. You learn more by speaking and writing than merely reading and listening, because the former actually requires you to actively associate grammar rules to your physical actions, whereas consumption has a lower bar of effort since you can infer things from context, gloss over things, etc.
> I feel like Copilot is the wrong direction in which to optimize development. It is mostly going to help people who already have a poor understanding of what they are doing create even more crap.
Sometimes you don't need an expert to produce highly secure, highly optimized code.
Have you seen the crap that people buy at Walmart? The furniture is not heirloom furniture, the food is not a 3-star artisanal experience. Have you bought tools at Harbor Freight? They're not the lifetime companions of a tradesman, kept in wooden boxes and wrapped in cosmoline after each use. But an awful lot of work gets done with them. Common homeowner wisdom is: if you need a tool, buy it at Harbor Freight; if you use it enough to wear it out, spend 10x to buy a really good one -- but most tools you'll only use once or twice.
At workplaces across the country, right this minute, there are human beings doing rote transcription from one application to another, copy-pasting if they're lucky. That's a waste of effort and intellectual potential, and a hodgepodge of Excel formulas or a crappy bit of Copilot glue code could be just the ticket. Yes, if those become the business's secret sauce and get sold to customers on the Internet, they ought to put some effort into doing it properly, but there's a ton of work that could be accomplished with low-quality code.
>Have you seen the crap that people buy at Walmart? The furniture is not heirloom furniture,
The difference is that your sofa isn't programmable and networked into every other appliance in your house, underpinned by a general-purpose computer ripe for abuse.
Virtually every piece of software you install is an access point to your machine or your sensitive data. One isolated thing in the analog world breaks down: not a problem. One misconfigured password in a VPN client, and whoops, part of your national oil infrastructure goes offline.
> I feel like Copilot is the wrong direction in which to optimize development. It is mostly going to help people who already have a poor understanding of what they are doing create even more crap.
I can't wait until we start seeing Copilot-native devs, who had it enabled from the moment they first opened VSCode at their "become an engineer in 3 months" bootcamp.
> To management (especially with the wrong incentives) this looks like a perfect worker, because management usually doesn't understand the connection between a lack of engagement and planning at design/development time and the problems that show up later (or they don't feel they are the ones who will pay the price).
That's something I really want my competitors to do. Honestly it makes finding stocks to short much easier (or poaching talent...)
>To management (especially with the wrong incentives) this looks like a perfect worker, because management usually doesn't understand the connection between a lack of engagement and planning at design/development time and the problems that show up later (or they don't feel they are the ones who will pay the price).
This is how management is in most places, I feel, especially when it comes to evaluating junior and early-senior engineers.
I too feel this is the wrong direction, from the fundamental aspect that trained algorithms contain no comprehension of what they are doing. They are the classic idiot savant. In a technological economy, comprehension of the environment is everything. I do not see how comprehension can be achieved without the elusive General AI, so I do not see this as anything other than a new area of research exposing how vitally important it is to have comprehension.
For context, I'm a very experienced software engineer (I shipped products before most of my coworkers were born) and I've been using Copilot for 6-8 weeks while creating a challenging (and therefore fun!) new system.
> It is mostly going to help people who already have a poor understanding of what they are doing create even more crap.
I can see how people who haven't used it at length might come to that conclusion, but my experience with it calls the "mostly" part into question. I'm sure there will be cases of that. But as someone who deeply understands my craft, I'm finding significant benefits.
> What it does not help with are the important parts of development -- defining the domain of your problem, designing good APIs and abstractions, understanding how everything works and fits together, understanding what your client needs, etc.
Quite the contrary! The last time a new tool helped me that much with those parts was when I moved from C++ to Python in 1997. What I experienced in my C++ -> Python transition was that an enormous chunk of my brainpower could shift from language gymnastics to the problem domain. Copilot gives me a similar feeling. It frequently suggests exactly the 1-3 lines of code I was about to type and saves me 30-60 seconds (easily 20 minutes over a full day of coding). Much better than that, it lets my focus stay on better abstractions, APIs, etc.
> Also, I feel this is going to increase complexity by making more copies of the same structures throughout the codebase.
We, as engineers, are still responsible for what we produce. Any tool needs to be used with critical thought. Of course there will be those who don't think enough. And it might even make them look better in the short term. But that will be exposed in the medium to long term - `git blame` will point to them as the authors of problematic code and not Copilot. When such problems arise (or even better, before they arise), some of us who are more experienced need to step up and mentor less experienced folks so that they develop good habits.
A small sample of areas it's helping me...
When I decide that I want to use different representations internally and externally for some data in a class, I initialize the internal member variables. Partway through typing Python's `@property` decorator, it's suggesting the name of the property and exactly how to use the member variables to generate the external representation I want. Over half the time, it's exactly what I was about to type. Maybe a quarter of the time it's not, and I just don't accept the suggestion (or do a quick edit). And 5-10% of the time it suggests an approach that is better than what I was thinking. And that's in a very simple use case.
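Roughly the shape being described, with hypothetical names (internal storage in one representation, a @property exposing another):

    class Temperature:
        def __init__(self, celsius):
            self._celsius = celsius          # internal representation

        @property
        def fahrenheit(self):
            # external representation derived from the internal member
            return self._celsius * 9 / 5 + 32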
In other scenarios, it often sets up my loops just as I want them. Sometimes it picks column major when I want row major. I just keep typing and as soon as it's clear I want row major, it's suggesting that. Again, occasionally it surprises me with something better - if I just use that one function I rarely have a need for, the inner loop melts away. Why didn't I think of that? Well, now "I" did. The code I'm producing with Copilot is better than the code I would have written without because I'm thinking as I use it.
Where it really saves me time and focus is when I have some tricky calculation or API call that isn't hard, but has a bunch of little details to get right. One I did yesterday: look up a value in a dict, but the key needs to be mapped through another dict. Between the original key, the two dicts, and the variable receiving the result there are four variable names, plus one more for the mapped key (to spread it across two statements for readability). Before typing anything, I paused for a second to get the names straight in my head. Before I finished my thought, it suggested the lines; I looked at them for a second to make sure they were right, laughed because they were, and hit tab. It wasn't a hard task, but it helped me stay focused on the bigger picture.
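The lookup in question was roughly this shape (hypothetical names and data):

    key_mapping = {"de": "EUR", "us": "USD"}           # maps the original key
    price_by_currency = {"EUR": 1.08, "USD": 1.00}     # the dict holding the value

    original_key = "de"
    mapped_key = key_mapping[original_key]             # statement 1: map the key
    result = price_by_currency[mapped_key]             # statement 2: do the lookup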
Most of the time this doesn't feel at all like boilerplate. It picks up my variable names and properly uses the data structures I set up in other parts of the code. There's a big misconception that it just pastes snippets in. It feels very different from that in real usage. Also, it rewards good naming habits. In the example above, how did it know I wanted to map the key through that dict? `key_mapping` was in the variable name. Easier for others to read later and for Copilot to read now.
The system I'm building is definitely better designed because of Copilot. Not because Copilot did any of the design, but because it freed me up to focus on the design more. It will have downsides, but in experienced hands it can be a great tool. I'm not affiliated with Microsoft / Github / OpenAI in any way. I'm just doing better work because I'm using it and doing better work makes me feel good. When the time comes, I'll pay for Copilot out of my own pocket if my company doesn't pay for it.
I love that we always use the average here for these justifications. We just slowly chip away at any and all excellence. 10x memes aside, we all know what it's like to work with a truly talented and productive engineer versus your everyday schmoe collecting a paycheck. It's a story as old as time, and yet here we are applying the same big-factory industrialization techniques other industries have applied: commoditizing the thing that made them exceptional and eliminating artisanship, uniqueness, and ultimately quality and character.
Wasn't there a thread here just yesterday about how 6% of some class of AI outperformed a human, but then it turned out that 0% outperformed two humans? That's also literally the lesson Uber learned the hard way when an SDV ran over a person (that zero humans is worse than one, and one is worse than two). This is also the principle behind code review, peer review, QA, middle-management bureaucracy, and a whole lot of other things.
The tragedy, IMHO, is that AI models like this encourage centralizing decision-making into a single black box (to the extent that external research then benefits the owner of the AI model rather than advancing the public commons), whereas in pretty much every other aspect of life we consider decentralization and redundancy of autonomy to be the solution to robustness problems.
A common quip is that most benchmarks measure the performance of humans who aren't really paying attention (because while building datasets, they're doing this repetitive task over and over and over). So "better than the average human" isn't generally a great benchmark.
I disagree. 40% is not great, but unlike the masses of developers, this is a single system that can improve over time. Further, a system that can do most of the work but requires a security specialist to polish it is still a useful tool. What's important to recognize is that this is not a terribly novel concept. Insecure code is written every day.
> unlike the masses of developers, this is a single system that can improve over time
I’d say the exact opposite. Unlike this algorithm, developers can continue to learn. There will likely be future algorithms that are improvements, but this isn’t that.
Individual developers can learn, but they are replaced by new developers that have not learned. Sure, this specific instance of copilot is not the best, but it sure feels to me like people are discussing the concept of it, not the exact implementation right now.
It may be a tragedy, but I fail to see why it is a tragedy of the commons. Which resource available to all is being overused? High-paying dev jobs? Those are not a commons in the sense that the tragedy of the commons implies, because lower-quality devs don't stand to benefit by taking only a smaller part of the job.
Here's a food analogy: everyone wants to buy the best-looking apples, so farmers are more incentivized to breed for looks than for nutritional value, even though nutritional value is the superior metric.
Similarly, if everyone seeks to "dumb down" programming, you end up with a large pool of "dumbed down" programmers, which is counterproductive precisely because AI is imperfect and you need a higher level of expertise to compensate for its shortcomings. As Kernighan famously said: "Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." Similarly, if one lets the AI do the thinking in their stead, what hope do they have of being able to debug it?
Ironically, though, programming already suffers from this exact problem in a very fundamental way: every tool exists to make a programmer's life easier, and consequently there are a lot of glue-code programmers. The few that actually impact the industry meaningfully (e.g. most notable software comes out of Bay Area) are very expensive because the supply of experts is limited.
This actually feels to me like a concept worth exploring. I think we lack a concise term or phrase to reference what GP was trying to communicate.
In my heart I feel similarly to GP, and it does feel a lot like how I feel about tragedy-of-the-commons situations. There seems to be a shared opportunity for everyone if these private companies would make the most of their financial capital, market dominance, dominance in human resources, and most especially their network effects.
That would lead to better things for everyone, like the invention of smartphones did. But the same corporations can also waste unimaginable resources and achieve very little. Often their failures don't just have little effect; rather, they choke and smother the market and prevent better alternatives from being widely used.
You are the free-labor copilot training Microsoft GitHub's Copilot tool. You are responsible for any of those insecure-code errors and the diligence they require. You will be on the hook for the resulting problems. But Microsoft and their home-phoning, tracking-embedded editor will get real people to correct and train their machine for free -- with their stated plan of later selling that machine back to us.
I wish there were a “robots.txt” file for Git to disallow certain bots from training on anything I have written.
Merely hosting your code publicly seems like it wouldn't give GitHub the right to train AI models on it. You could even say it's against your terms of use. And to do it, they would have to go out of their way to find your repo on the web and clone it—unlikely.
My impression (NOT A LAWYER) is that by hosting your code in a public repo on GitHub, you agree to their terms and give them the right to "read" your code including training AI models on it. Or at least that's what they're banking on.
Go host on Sourcehut or self-host with Gitea, and I would think it unlikely (but not impossible) that any big company would use your code to train their AI.
It's not even very clear whether training an AI on OSS code is a violation of those licenses. So unless you publish your code under a proprietary license that clearly rejects such use, you can't really prevent people from doing that anyway.
Just imagine: there's really nothing preventing people from scraping your blog to train their natural-language-processing AI or whatever, so why would code be any different? Even if you put up a big sign saying you don't consent to having your data ingested by a neural network, I doubt it would get noticed anyway...
People have been taking large OSS codebases (eg. Linux kernel) for various statistical analyses. AI is just doing the same thing in a more sophisticated manner.
I bet if I trained an AI on some vocalist and released an album I'd get some legal mayhem. I do concede it might go differently for code, but none of these issues are crystal clear for me.
I wish it were easier to convince projects I like and want to help to migrate, for the same reason. Committing to their repos does not put me in the clear -- and that includes mere mirrors.
No, the license that you apply is completely irrelevant, and there's certainly nothing whatsoever special about the GPL. Copilot is completely dependent on being effectively exempt from copyright; if that legal theory falls apart, the entire space (and a lot of other machine learning stuff) is utterly doomed. Trouble is, Copilot can't tell whether it's reproducing copyrightable chunks of your code, or indeed where what it produces came from, by the very nature of machine learning techniques.
That’s not how learning, human or machine, works. Learning is about collecting all kinds of stuff from diverse sources into a great melting-pot, so that you can form something new out of it—but you can’t generally identify where everything comes from. Individual recognisable tricks perhaps, but if you want to say “this code was inspired by X, Y and Z”, well, that inspiration is typically everything, the entire corpus.
I don't think that the GPL gives much more protection than any other FOSS license, here, in practice.
If Copilot were to reproduce a larger part of, say, an MIT-licensed codebase, or one under almost any other permissive license, then they should legally provide attribution. I'm pretty sure they don't even have an option to provide such specific attribution, which means that either they believe the code copied from any one source is below the relevant threshold or they're just ignoring copyright.
I would assume GitHub could supersede your license by laying its own claim to your code in the ToS. I doubt they have done that, but I'm just pointing it out.
Sure they can, search engines produce copyrighted material all the time. The issue comes in when people think this somehow indemnifies them as users of Copilot - my guess is, it doesn't protect you any more than if you use a search engine to copy an entire codebase for your own purposes.
Not possible. Such licenses are founded upon copyright doctrine, and copyright doesn’t protect against learning, natural or machine. As it stands (and this can certainly change), legal consensus in general (regardless of jurisdiction) is that if you publish your code where they can reach it, they can use it.
So would it (theoretically) be legal to train on the JS files services like gmail.com serve to the client? What about decompiled output of proprietary software like certain files in Windows and macOS?
Setting aside any laws or restrictions against decompiling, it would legally be no different from the GPL case. Although personally, since Copilot is capable of redistributing the code, I think the question of whether the GPL permits that specific usage is still unclear.
Or specifically no closed-source or closed-data tools. I wouldn't mind if a non-profit org wanted to help, but it's Microsoft—and they want to sell it back to us in the future.
I've experienced this first hand: the autosuggest is scarily accurate and insidious at the same time. On numerous occasions I've auto-filled a 10-15 line suggestion that looked like exactly what I wanted to do, but it made a very critical mistake (e.g. in a for loop, referencing the wrong array despite calling it the right name). Not really security-related stuff, but head-scratchers that make it harder to debug since I didn't actually write the code.
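A contrived sketch of that kind of slip: the suggestion reads naturally but quietly iterates over the wrong, similarly named list.

    active_user_ids = [1, 2, 3]
    archived_user_ids = [4, 5]

    notified = []
    for user_id in archived_user_ids:   # looks fine at a glance; should be active_user_ids
        notified.append(user_id)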
It seems reasonable to want Copilot to help you produce code of a reasonable quality.
If it’s just helping you crank out the same bad code more quickly, without learning anything in the process, that’s useful to know. Some people might still want a tool like that, I wouldn’t.
Sure. But in order to know whether it's of "reasonable quality" you need some sort of baseline to compare it to. What is reasonable quality? I think what your average human does is probably reasonable.
Like, if your average dev produces insecure code in 80% of samples, then Copilot starts to look really good! But if it's closer to 0.01% of code samples, then Copilot looks more like an intriguing novelty, not to be brought too near serious work. Much like Dippin' Dots in this regard.
That's basically where my gut went when I read the headline - so is that of a junior engineer, or really any engineer who hasn't had to think about it, and we don't promote their code directly to prod, either (if we can avoid it).
Copilot shouldn't be able to generate code destined for prod without review any more than should any line of code written by a human.
> For comparison, what percentage of human-generated code is secure?
Yeah how did they measure? Did static and dynamic analysis find design bugs too?
Maybe - as part of a Copilot-assisted DevSecOps workflow involving static and dynamic analysis run by GitHub Actions CI - create Issues with CWE "Common Weakness Enumeration" URLs from e.g. the CWE Top 25 in order to train the team, and Pull Requests to fix each issue?: https://cwe.mitre.org/top25/
I use Copilot mostly as a replacement for IntelliSense and macros. It helps me automate repetitive tasks. I would never trust Copilot with an algorithm or a snippet; I treat the code just like anything taken from Stack Overflow or GitHub.
It is important to remember that Copilot can improve. 40% is not a bad baseline, but one data point does not give us much info, we should wait and see the rate of improvement.
So far GitHub Copilot is more feasible as a tool for humans doing code coverage of its input code, "given enough eyeballs, all bugs are shallow" style. When a developer goes, "huh, Copilot generated insecure code, better report it to the original project it learned it from" -- if only Copilot were able to link to the original project, it would all be great and useful.
It is useful, since it means Copilot is not taking your job any time soon. I.e., if 40% of the time the human driving the thing needs to intervene and prevent obvious security flaws, then an expert is still required to use the tool.
I think it was obvious from the beginning that it's trained on GitHub code, so it would be surprising if it was better than the average code on GitHub.
In any case, if Copilot can generate code as well as the average programmer without supervision, that means it could already take the jobs of 50% of programmers. A more useful metric, though, is how many programmers a person using Copilot can replace by being more productive.
Also, in how many programming jobs does security matter? In my job for example it doesn't matter at all.
With any commercial software, security matters. You don't want your licensing solution to be cracked by editing a file.
Anything that connects to any server or does any sort of networking.
Anything that needs privileges (like installing drivers) needs security.
Anything that reads/interprets any data given to it needs to have security.
Of course this all depends on what you mean by "matter". I can't come up with a program that doesn't need to think about security at all, except something trivial like hello world.
As many people have pointed out indirectly, this is almost certainly caused by the training set. Without a bias or ranking for quality, it will just churn out the “best fit” or most popular snippets…
Happily having access to GitHub Copilot, I find it very often generates the code that I want. So it saves me typing and often saves a trip to Stack Overflow. I think the libraries/packages you use also have a big influence on how easy it is for Copilot to create security flaws. Still, more training against security holes would be appreciated.
Well, of course. GPT-3 has no underlying model of meaning. It's just autocomplete with a bigger data set. Used on natural language, it produces text that looks reasonable for about three paragraphs. Then you realize it's just blithering and has nothing to communicate. (Like too many bloggers, but that's another issue.)
Unit tests with TONS of assertions, cleaning data from a form into an ORM object, stuff that looks like you're just going through a list and doing the same thing over and over. For these, Copilot is great. I wouldn't trust it to do anything else though.
It's painful to see that GitHub Copilot is called "AI". For god's sake, it is not AI. It's just an advanced auto-complete for coders. GPT-3 is close to AI, GitHub Copilot is not.
Jesus Christ, please make them stop. Stop using AI as a buzzword.
GitHub Copilot is literally GPT adapted for code. The paper on OpenAI Codex, the stuff powering Copilot, makes it very clear in the abstract. https://arxiv.org/abs/2107.03374
Either you call both AI or you call neither AI.
(A previous version of the comment stated that it was tuned from GPT-3. This is incorrect; the simpler GPT was used for faster convergence.)
I wonder if the Copilot model could somehow be repurposed to analyze the quality of a developer’s code. Seeing how Microsoft owns both GitHub and LinkedIn, it’s a good bet this is something they’re actively researching.
It's learning from existing code, right? Doesn't this say something about developers in general, or is the thought that it uses combinations of code that are insecure?
I don't hold the average developer in very high regard. There are tons of developers who are much better than me and I readily read their books, follow their tweets, blog posts and online talks to learn from them. I hold them in high regard, but these people are not the average developer.
If you pick any smaller company with a dev team, a freelancer, or an agency, your chances of finding a developer who understands and upholds quality code are vastly reduced.
Not to mention a lot of beginners will just push their practice projects to GitHub and never look at it again. I'm also guilty of this, but I never realized Microsoft was training AI with this code. If Copilot is learning from these projects then I'd say the code it regurgitates is not average, but even below average.
I will say I’m not looking forward to writing some mundane code today.
It’s interacting with GCS to scan a bucket for an extension, load the data with pandas, and concat some dataframes. It’s something dumb but mildly finicky that’s going to eat up so much time I could be using for higher value work.
Copilot would be very welcome as I do this, instead of annoyingly going off to Google 3 different python libraries and getting it all to work nicely together.
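For reference, the glue code in question is roughly this (assuming CSV files, the google-cloud-storage client library, and made-up bucket/prefix names):

    import io
    import pandas as pd
    from google.cloud import storage

    client = storage.Client()
    frames = []
    for blob in client.list_blobs("my-bucket", prefix="exports/"):
        if blob.name.endswith(".csv"):
            frames.append(pd.read_csv(io.BytesIO(blob.download_as_bytes())))

    combined = pd.concat(frames, ignore_index=True)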
I think this should be pretty much expected. I'm unfamiliar with how this network is trained, but I'm pretty sure the data ranking is not perfect.
I'm guessing the ranking features are based on the repo stats, contributor stats, etc. Even "good" contributors will make rookie mistakes in certain areas.
Interesting to imagine how GH will try to solve this issue.
It might be possible to learn from the change history of projects. There are likely quite a few commits that fix specific security issues, such as SQL injection problems, maybe even with suitable metadata in the issues or commit messages.
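Such fix commits tend to have a very recognizable before/after shape; for example (sqlite3 here), swapping string formatting for a parameterized query:

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT)")
    name = "alice'; DROP TABLE users; --"

    # Before: the input is spliced into the SQL string (injectable).
    # conn.execute(f"SELECT * FROM users WHERE name = '{name}'")

    # After: a parameterized query keeps the input as data, not SQL.
    rows = conn.execute("SELECT * FROM users WHERE name = ?", (name,)).fetchall()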
I think OpenAI Codex (which is what Copilot uses) gets more interesting when they allow you to fine-tune the model on your own (trusted) code. That could help reduce the time it takes for new engineers to get up to speed, for example.
My take on Copilot has not changed.
I believe it will make programmers that produce junk code more productive, by being able to produce more junk code in less time.
Only if 100% of the code was insecure when written by a human. Given that humans can think through what code is doing, I don’t think that’s a reasonable assumption.
But this problem is solved by using GitHub's CodeQL to search and filter the generated code. By combining Copilot with GitHub Semantic and GitHub CodeQL, you have a means of writing and generating the code you want in a secure way. This means that you no longer need the original source code that was used to train Codex. Training Codex and selling it as a product in the form of Copilot steals the essence of the original source code used to train it, to build the future of programming, while paying nothing back to the original authors. Even Elon Musk was opposed to OpenAI exclusively licensing GPT to Microsoft.