This looks incredibly cool; I'm wowed by the fact that the model has learnt to negate words in if-else statements, though I struggle to think of a case where that particular completion would have been useful.
At the same time, I'm less excited about the fact that the model is cloud-only, both for security/privacy reasons and because I spend a not-insignificant amount of my time on limited-bandwidth/high-latency internet connections.
I'm also curious as to why the survey didn't ask about GPU specifications; most of the time I use my laptop to code whilst plugged in, and I'd happily use only LSP completions when on battery, so power consumption wouldn't be an issue (though fan noise might), and allegedly my GPU (a GTX 1050) can pull off almost 2 TFLOPs, which is well over the "10 billion floating point operations" mentioned in the post.
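(Back-of-envelope, assuming full utilization and that the quoted figure is per completion: 2 TFLOP/s divided by 10 GFLOP per completion works out to roughly 200 completions per second.)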
> I'm wowed by the fact that the model has learnt to negate words in if-else statements
I know it learned natural language by building on GPT-2, but I'm surprised it didn't get "confused", since words are used in such a different way in programming.
For example, "strong" appears as the HTML tag <strong> with no corresponding <weak> tag, and "weak" appears in weak_ptr in C++, where there's no such thing as a strong_ptr.
I have been using regular TabNine with vim since January. It is considerably better than anything else I've used, and it saves me a ton of time. Sometimes the suggestions are eerily exactly what I wanted. O_o
I was using TabNine very happily until it started consuming excessive CPU. This bug has been known for more than 6 months but no fix is available at the moment. https://github.com/zxqfl/TabNine/issues/24
Even if it's technically interesting, to me this seems to epitomize two trends I strongly dislike:
1. The road to digital serfdom: removing the ability to run software directly on our own machines.
2. Programming degenerating into internet- and AI-assisted copy-pasta code monkeying around byzantine boilerplate APIs. The upside is that such tooling makes filling in said boilerplate less painful. But I don't want garbage APIs to become even less painful (and hence even less likely to be weeded out) than they already have thanks to Stack Overflow etc. I also wonder if this will end up producing more "plausible" and hence insidiously wrong code, similar to Xerox's infamous JBIG2 "smart" image compression fiasco, where copying sometimes changed the numbers in the document.
This is awesome. I've been thinking for a while that this would be a good idea and it's great to see someone actually do it.
Why not allow individual developers with desktop GPUs to run the model locally? I don't want to run a reduced-size laptop model on my machine with a Titan GPU. It would be awesome to actually harness the GPU power for coding :)
One problem is that deploying GPU neural networks cross-platform is a huge pain. You basically have to get your users to install CUDA and figure out how to dynamically link against it on Windows and Linux; Mac users and people with AMD GPUs are out of luck, of course.
The only way to do cross-platform GPU compute without your users installing a toolkit like CUDA is with Vulkan (and MoltenVK or gfx-rs). But then you don't have the super-optimized GEMM kernels, so your network might run slower than just using a super-optimized GEMM kernel on the CPU.
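For what it's worth, here is a minimal sketch (Python with ctypes; the runtime library names and the fallback logic are illustrative assumptions, not anything TabNine actually ships) of probing for a usable CUDA runtime and falling back to the CPU when it isn't there:

    import ctypes

    def cuda_available() -> bool:
        """Try to load the CUDA runtime and check for at least one usable device."""
        # Common per-platform library names -- an assumption; adjust for your setup.
        for name in ("libcudart.so", "cudart64_110.dll", "libcudart.dylib"):
            try:
                cudart = ctypes.CDLL(name)
            except OSError:
                continue  # runtime not installed under this name
            count = ctypes.c_int(0)
            # cudaGetDeviceCount returns 0 (cudaSuccess) when a device can be queried.
            if cudart.cudaGetDeviceCount(ctypes.byref(count)) == 0 and count.value > 0:
                return True
        return False

    backend = "gpu" if cuda_available() else "cpu"
    print(f"running inference on the {backend}")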
This is a big part of it. CPU-only models are much easier to deploy, and not all developers have GPUs, so the first iteration of the local model will be CPU-only. We might release a GPU version later.
Users don't have to install CUDA if you build your app correctly. Yes, AMD GPU users would be out of luck, but that's really on AMD for failing to get anything like cuDNN integrated with popular frameworks.
Looks cool. IIRC the transformer architecture doesn't allow imposing any constraints on the learned language model. For a code completion setting, a model that is explicitly aware of the (programming) language constructs and then augmented with code samples would be much more efficient (you could greatly reduce the search space for the next token, etc.).
>a model aware of the (programming) language constructs explicitly
You could never include that in the model's training. The best you could do would be to construct an AST from the model output and discard suggestions with invalid syntax, and to provide enough negative examples (invalid syntax) to reduce false positives.
What you proposed would never work with a language model, and makes no sense given how backprop works. The model will learn the grammar (syntax), but it will always output some percentage of false positives (invalid syntax).
You can't hardcode the syntax into the model. Another approach is to encode token types after tokenization, which will give the model more information about the syntax/meaning of tokens.
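To make the AST idea concrete, here is a rough sketch (Python only, using the standard-library ast module; purely illustrative, not how TabNine actually works) of discarding candidate completions that don't parse:

    import ast

    def keep_valid(prefix: str, candidates: list[str]) -> list[str]:
        """Discard completions that would make the buffer syntactically invalid."""
        valid = []
        for completion in candidates:
            try:
                ast.parse(prefix + completion)  # raises SyntaxError on bad syntax
            except SyntaxError:
                continue
            valid.append(completion)
        return valid

    prefix = "if x is not None:\n    "
    # The unfinished "y = x +" gets filtered out; the other two survive.
    print(keep_valid(prefix, ["y = x + 1", "y = x +", "print(x)"]))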
The older TabNine works for any language because it's based on looking at the rest of your project. Deep TabNine is new and requires lots of open source training data for each language to work well. Maybe there is enough open source training data for Elixir though and it would work well, dunno.
This is great; I'm happy there are so many projects applying machine learning to coding, since I plan on applying ML in a future project for coding on phones and other mobile devices.
Cool, finally people can utilize the statistical human average by automatically boilerplating their code with what the average developer would produce.
Sounds like this has a good chance of bringing the "human intelligence on deep-learning auto-pilot" mindset into the world of developers. Don't think, just accept what the computer tells you. Over time, why think at all?
There is no bigger enemy to concise, clean code than too much auto-completion. If it doesn't pain you to repeat patterns, then there is literally no incentive not to litter your code with anti-patterns and copy-paste. Now it's even worse, because this copy-paste gets "smart" enough to look alright, which is in itself the very definition of an anti-pattern.
Language models mimic the style of the surrounding text (as you can see from the examples in [1]), so the model will only try to give you low-quality boilerplate code if your codebase already contains lots of low-quality boilerplate code.
As far as I can tell, it suggests one token at a time and uses its model to help rank these tokens. This is useful, but there is a lot to be gained by suggesting multiple tokens at once.
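As a sketch of what multi-token suggestion could look like (greedy decoding over a hypothetical score_next callable; the function and the confidence threshold are made up for illustration):

    def suggest_phrase(score_next, context, max_tokens=5, min_confidence=0.5):
        """Greedily extend the suggestion one token at a time, stopping once the
        model is no longer confident, so the user gets a whole phrase at once."""
        suggestion = []
        context = list(context)
        for _ in range(max_tokens):
            scores = score_next(context)  # hypothetical: returns {token: probability}
            token, prob = max(scores.items(), key=lambda kv: kv[1])
            if prob < min_confidence:
                break
            suggestion.append(token)
            context.append(token)
        return suggestion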
TabNine has always included a logistic regression model to help rank completions. It uses features such as the occurrence frequency of the token and the number of similar contexts in which it occurs.
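Just to illustrate the general idea (a toy sketch with scikit-learn; the features and data below are made up, and the real feature set and training data are different), ranking candidates with logistic regression could look like this:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical training rows: [token frequency in project, similar-context count],
    # labelled 1 if the user accepted that completion and 0 otherwise.
    X = np.array([[120, 8], [3, 0], [45, 5], [1, 1], [200, 12], [7, 0]])
    y = np.array([1, 0, 1, 0, 1, 0])
    ranker = LogisticRegression().fit(X, y)

    # At completion time, score each candidate and present the best-ranked first.
    candidates = {"self.config": [90, 6], "self.conf": [4, 1], "self.connect": [30, 2]}
    scores = {c: ranker.predict_proba([f])[0, 1] for c, f in candidates.items()}
    print(sorted(scores, key=scores.get, reverse=True))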