This looks incredibly cool; I'm wowed by the fact that the model has learnt to negate words in if-else statements, though I struggle to think of a case where that particular completion would have been useful.
At the same time, I'm less excited about the fact that the model is cloud-only, both for security/privacy reasons and because I spend a not-insignificant amount of my time on limited-bandwidth/high-latency internet connections.
I'm also curious as to why the survey didn't ask about GPU specifications; most of the time I use my laptop to code whilst plugged in, and I'd happily use only LSP completions when on battery, so power consumption wouldn't be an issue (though fan noise might), and allegedly my GPU (a GTX 1050) can pull off almost 2 TFLOPs, which is well over the "10 billion floating point operations" mentioned in the post.
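(Back-of-envelope, assuming full utilization and that the quoted figure is per completion: 2 TFLOP/s divided by 10 GFLOP per completion works out to roughly 200 completions per second.)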
> I'm wowed by the fact that the model has learnt to negate words in if-else statements
I know it learned natural language by building on GPT-2, but I'm surprised it didn't get "confused", since words are used in such a different way in programming.
For example, "strong" appears as the HTML tag <strong> with no corresponding <weak> tag, and "weak" appears in weak_ptr in C++, where there's no such thing as a strong_ptr.
I have been using regular TabNine with vim since January. It is considerably better than anything else I've used, and it saves me a ton of time. Sometimes the suggestions are eerily exactly what I wanted. O_o
I was using TabNine very happily until it started consuming excessive CPU. This bug has been known for more than 6 months but no fix is available at the moment. https://github.com/zxqfl/TabNine/issues/24
Even if it's technically interesting, to me this seems to epitomize two trends I strongly dislike:
1. The road to digital serfdom: removing the ability to run software directly on our own machines.
2. Programming degenerating into internet- and AI-assisted copy-pasta code monkeying around byzantine boilerplate APIs. The upside is that such tooling makes filling in said boilerplate less painful. But I don't want garbage APIs to become even less painful (and hence even less likely to be weeded out) than they already have thanks to Stack Overflow etc. I also wonder if this will end up producing more "plausible" and hence insidiously wrong code, similar to Xerox's infamous JBIG2 "smart" image compression fiasco, where copying sometimes changed the numbers in the document.
This is awesome. I've been thinking for a while that this would be a good idea and it's great to see someone actually do it.
Why not allow individual developers with desktop GPUs to run the model locally? I don't want to run a reduced-size laptop model on my machine with a Titan GPU. It would be awesome to actually harness the GPU power for coding :)
One problem is that deploying GPU neural networks cross-platform is a huge pain. You basically have to get your users to install CUDA and figure out how to dynamically link against it on Windows and Linux; Mac users and people with AMD GPUs are out of luck, of course.
The only way to do cross-platform GPU compute without your users installing a toolkit like CUDA is with Vulkan (and MoltenVK or gfx-rs). But then you don't have the super-optimized GEMM kernels, so your network might run slower than just using a super-optimized GEMM kernel on the CPU.
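For what it's worth, here is a minimal sketch (Python with ctypes; the runtime library names and the fallback logic are illustrative assumptions, not anything TabNine actually ships) of probing for a usable CUDA runtime and falling back to the CPU when it isn't there:

    import ctypes

    def cuda_available() -> bool:
        """Try to load the CUDA runtime and check for at least one usable device."""
        # Common per-platform library names -- an assumption; adjust for your setup.
        for name in ("libcudart.so", "cudart64_110.dll", "libcudart.dylib"):
            try:
                cudart = ctypes.CDLL(name)
            except OSError:
                continue  # runtime not installed under this name
            count = ctypes.c_int(0)
            # cudaGetDeviceCount returns 0 (cudaSuccess) when a device can be queried.
            if cudart.cudaGetDeviceCount(ctypes.byref(count)) == 0 and count.value > 0:
                return True
        return False

    backend = "gpu" if cuda_available() else "cpu"
    print(f"running inference on the {backend}")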
This is a big part of it. CPU-only models are much easier to deploy, and not all developers have GPUs, so the first iteration of the local model will be CPU-only. We might release a GPU version later.
Users don't have to install CUDA if you build your app correctly. Yes, AMD GPU users would be out of luck, but that's really on AMD for failing to get anything like cuDNN integrated with popular frameworks.
Looks cool. IIRC the transformer architecture doesn't allow imposing any constraints on the learned language model. For a code completion setting, a model that is explicitly aware of the (programming) language constructs and then augmented with code samples would be much more efficient (you could greatly reduce the search space for the next token, etc.).
>a model aware of the (programming) language constructs explicitly
You could never include that in the model's training. The best you could do would be to construct an AST from the model output and discard suggestions with invalid syntax, and to provide enough negative examples (invalid syntax) to reduce false positives.
What you proposed would never work with a language model, and makes no sense given how backprop works. The model will learn the grammar (syntax), but it will always output some percentage of false positives (invalid syntax).
You can't hardcode the syntax into the model. Another approach is to encode token types after tokenization, which will give the model more information about the syntax/meaning of tokens.
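To make the AST idea concrete, here is a rough sketch (Python only, using the standard-library ast module; purely illustrative, not how TabNine actually works) of discarding candidate completions that don't parse:

    import ast

    def keep_valid(prefix: str, candidates: list[str]) -> list[str]:
        """Discard completions that would make the buffer syntactically invalid."""
        valid = []
        for completion in candidates:
            try:
                ast.parse(prefix + completion)  # raises SyntaxError on bad syntax
            except SyntaxError:
                continue
            valid.append(completion)
        return valid

    prefix = "if x is not None:\n    "
    # The unfinished "y = x +" gets filtered out; the other two survive.
    print(keep_valid(prefix, ["y = x + 1", "y = x +", "print(x)"]))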
The older TabNine works for any language because it's based on looking at the rest of your project. Deep TabNine is new and requires lots of open source training data for each language to work well. Maybe there is enough open source training data for Elixir though and it would work well, dunno.
This is great; I'm happy there are so many projects applying machine learning to coding, since I plan on applying ML in a future project for coding on phones and other mobile devices.
Cool, finally people can utilize the statistical human average by automatically boilerplating their code with what the average developer would produce.
Sounds like this has a good chance of bringing the "human intelligence on deep-learning auto-pilot" mindset into the world of developers. Don't think, just accept what the computer tells you. Over time, why think at all?
There is no bigger enemy to concise, clean code than too much auto-completion. If it doesn't pain you to repeat patterns, then there is literally no incentive not to litter your code with anti-patterns and copy-paste. Now it's even worse, because this copy-paste gets "smart" enough to look alright, which is in itself the very definition of an anti-pattern.
Language models mimic the style of the surrounding text (as you can see from the examples in [1]), so the model will only try to give you low-quality boilerplate code if your codebase already contains lots of low-quality boilerplate code.
As far as I can tell, it suggests one token at a time and uses its model to help rank these tokens. This is useful, but there is a lot to be gained by suggesting multiple tokens at once.
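As a sketch of what multi-token suggestion could look like (greedy decoding over a hypothetical score_next callable; the function and the confidence threshold are made up for illustration):

    def suggest_phrase(score_next, context, max_tokens=5, min_confidence=0.5):
        """Greedily extend the suggestion one token at a time, stopping once the
        model is no longer confident, so the user gets a whole phrase at once."""
        suggestion = []
        context = list(context)
        for _ in range(max_tokens):
            scores = score_next(context)  # hypothetical: returns {token: probability}
            token, prob = max(scores.items(), key=lambda kv: kv[1])
            if prob < min_confidence:
                break
            suggestion.append(token)
            context.append(token)
        return suggestion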
TabNine has always included a logistic regression model to help rank completions. It uses features such as the occurrence frequency of the token and the number of similar contexts in which it occurs.
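Just to illustrate the general idea (a toy sketch with scikit-learn; the features and data below are made up, and the real feature set and training data are different), ranking candidates with logistic regression could look like this:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical training rows: [token frequency in project, similar-context count],
    # labelled 1 if the user accepted that completion and 0 otherwise.
    X = np.array([[120, 8], [3, 0], [45, 5], [1, 1], [200, 12], [7, 0]])
    y = np.array([1, 0, 1, 0, 1, 0])
    ranker = LogisticRegression().fit(X, y)

    # At completion time, score each candidate and present the best-ranked first.
    candidates = {"self.config": [90, 6], "self.conf": [4, 1], "self.connect": [30, 2]}
    scores = {c: ranker.predict_proba([f])[0, 1] for c, f in candidates.items()}
    print(sorted(scores, key=scores.get, reverse=True))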