I will always wish that Python weren't the lingua franca of sorts for machine learning.
For me, it's always frustrating to have to guess the types of parameters in functions without appropriate documentation, especially considering that I don't know much of the language.
Combined with the fact that static analyzers for Python can very easily get slow (e.g., completions for Tensorflow always take multiple seconds, which IMO is unacceptable), it's just not something that makes the language seem like a reasonable choice to me.
I was working on Tensorflow bindings for Dart, which now has Java-level static typing, and in porting functions from Python found myself absolutely lost as to what any of the code actually did.
I might be the only one who dislikes Python to this degree, though.
Nah, I'm with you. Trying to do the same in Haxe. There's no first-class language for typed tensor ops. Type information could include dimensionality, and remove a whole bunch of stupid, time-consuming errors.
What about Julia? It’s what we’ve been teaching our Stanford ML class with and it’s fast! Also typed and JIT with multiple dispatch. Arrays here don’t have the Numpy/Scipy weirdness of matrices vs. ndarrays and linalg is truly first-class.
This comes from someone who used to despise the language, but it’s truly come a long way.
I'm still amazed that Julia hasn't taken off in six years. It's this great language that solves most of the problems that other (scientific computing) languages have, and hardly anyone uses it. I use it all the time for personal projects, but I use Python at work. Looking forward to the day I can use Julia for everything.
My primary issue with Julia is that it has a relatively high latency in a REPL environment. I and many people I know primarily use REPL environments (e.g. Jupyter Lab) for scientific computing, so this is a pretty relevant use-case. For example, if I start Julia and type
[1 2 3; 4 5 6; 7 8 9] ^ 2
I have to wait about 5 seconds for a response (on a first-generation Core i7, SSD). On the other hand, running the following in a fresh Python interpreter is almost instantaneous:
import numpy as np; np.linalg.matrix_power([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2)
Unfortunately, in most scenarios the actual execution speed (where Julia is far superior) is secondary. People just tend to run larger experiments over night; and as long as you can express your code in terms of numpy matrix operations, Python is fast enough.
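To illustrate the "fast enough when expressed in terms of numpy matrix operations" point with the thread's own 3×3 example: the whole-array call dispatches to compiled routines, while an equivalent triple loop (an invented helper, purely for illustration) runs in the interpreter and scales far worse on real data:

```python
import numpy as np

# The thread's 3x3 example: [[1, 2, 3], [4, 5, 6], [7, 8, 9]] squared.
a = np.arange(1, 10).reshape(3, 3)

def python_loops(m):
    """Matrix product m @ m written with plain Python loops (invented
    helper for illustration; this is the work NumPy saves you from)."""
    n = len(m)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[i][j] += m[i][k] * m[k][j]
    return out

vectorized = np.linalg.matrix_power(a, 2)  # one call into compiled code
looped = python_loops(a.tolist())          # same math, interpreted

assert (vectorized == np.array(looped)).all()
```

For a 3×3 matrix the difference is invisible; for the array sizes typical in ML, the interpreted loop is the part that makes naive Python slow.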
Just tried this on my (much slower) Intel m3-6Y30 (Microsoft Surface) processor and it worked in just over a second. What Julia version are you running? Speed has been improving steadily with new releases.
> Unfortunately, in most scenarios the actual execution speed (where Julia is far superior) is secondary. People just tend to run larger experiments over night;
I don't think there are only two scenarios, one requiring instant feedback and dependent on fast startup time, and the other where programs can be run overnight. There are infinite cases in-between. And crucially what about programs that take several days if not weeks (such as the biomechanical data analysis I do in my work)? Execution speed for me (and many others) is essential and doing this work in Python is a pain (I was using it for the same kind of problems before I switched to Julia).
> For example, if I start Julia and type
[1 2 3; 4 5 6; 7 8 9] ^ 2
Put that in a function and run it twice. The second time will be blazing fast since it's jitted. That's the workaround for REPL/Notebook usage. In my experience with Notebooks, I end up having to rerun the code all the time, so it will only be slow the first time around. And I've had my share of Python code that took 5 seconds or more to complete, every single time.
That taking 5 seconds is very strange. I have an early Core M (mobile laptop chip, much slower than Core i7, which is a desktop chip) and that expression takes 0.7 seconds at a fresh prompt. That's still much worse JIT compilation delay than we'd like it to be, but 5 seconds is either a very bad configuration or perhaps a bit of hyperbole? There are other situations like time-to-first-plot where compile times do cause a really serious delay that is a very real problem—and a top priority to fix.
Tried again this morning after rebooting the computer -- turns out I was low on RAM yesterday evening. After starting Julia a few times to make sure it is cached I get the following results:
time julia -e '[1 2 3; 4 5 6; 7 8 9] ^ 2'
real 0m1.629s
And for Python/Numpy
time python -c 'import numpy as np; print(np.linalg.matrix_power([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 2))'
real 0m0.103s
Edit: Julia Version is 0.6.3, installed directly from the Fedora 28 repositories.
Note that the majority of that time is just loading Perl 6.
time perl6 -e 'Nil'
real 0m0.156s
Perhaps someone could create a slang module for Julia in Perl 6, as that would be a fairly easy way to improve its speed.
(Assuming Julia is easy to parse and doesn't have many features that aren't already in Perl 6)
It’s not a great general-purpose language the way Python is. Neither is Matlab, so that’s okay if your competition is Matlab, but not if the competition is Python, C++, etc.
Meaning the Python standard library and common libraries are geared toward general-purpose use more than Julia's.
I'm not sure that the Julia language itself is lacking in any general sense. It's just more geared toward scientific computing, but there's nothing about the language making it that hard to write general code. It's not R.
Right, if you stick strictly to use cases covered by NumPy, Pandas, Matplotlib etc., there are better options than Python. But many real programs need other things too, and that’s why they start in and stick to Python regardless.
Glad you're enjoying it! "Hardly anyone uses it" isn't really accurate: Julia's in the top 50 languages on the Tiobe index (between Rust and VBScript this month) [1] and in the top 30 of the IEEE Spectrum language rankings (IIRC, paywall) [2]. I'd say that's quite the opposite of "not taking off". Anecdotally, there are a lot of people on StackOverflow giving excellent, accurate answers these days and I have no idea who they are, which feels like a significant milestone for a language to reach. Getting all the way to the top will take a bit of time :)
Honestly, just wanted to give a huge thanks for such an awesome language! I’ve truly been converted as a huge Python person into Julia and it’s been slowly taking over my research workflow since it’s just so fast and actually fun!
My apologies for the wording. I should have just said I wish it was used more in industry nowadays. And thanks for creating Julia btw! It's been a very enjoyable and productive language to program in.
After the mess that was writing a numerical library that interfaced with scipy but used numpy arrays, I’m actually slowly porting most of my daily workflow into Julia. It’s just gotten faster and the code is much easier to read.
As Haskell begins to get more and more dependently typed features this could definitely be an exciting possibility (I think there are folks already working on this).
I have been looking to get into C++ with CNTK, actually... I think this might be the next big thing. People don’t realise how close to a high-level language C++ is nowadays, with C++14 compliance in every major compiler.
In fact I am sure that advocates of Go, Rust et al. are really comparing them to C++98, and would be very pleasantly surprised by C++14.
I love C++ but it is hard to deny the vast workflow improvements that come with languages that utilize a REPL, like Python or Julia or Matlab/Octave. Being able to poke at your code or data and experiment without a compile/run/debug cycle is a huge advantage to productivity.
A few advocates of safer systems programming languages, regardless of which ones, do happen to use C++, stay up to date with C++17, and follow ISO C++'s work.
Thing is, no matter how much better C++ gets, preventing some groups from using it as "C compiled with a C++ compiler" is tilting at windmills.
Python is my preferred language to work in, and has been for many years.
It's not the only language I've used, though, and I've had basically the inverse of your problem from time to time: people who think running an API-documentation generator over their statically-typed code, so that it just spits out a big list of functions with their return types, parameter names and parameter types, is "documentation".
Now, it may be you come from a world where that's the standard way to do things and you've gotten used to it and mostly don't notice just how much further research and digging it takes to figure out from "Frobnitzable bunglify(Widget[] w, int c)" what it is exactly that it, um, does. But it's hard to knock people for not writing documentation when advocating for a language feature that encourages people not to write documentation!
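To make that concrete, here's a hypothetical Python version of that signature (all names invented): a doc generator would happily list both, but only the second tells you what the function actually does:

```python
# Hypothetical example mirroring "Frobnitzable bunglify(Widget[] w, int c)":
# every type is spelled out, yet the behavior is anyone's guess.
def bunglify(widgets: list, count: int) -> str:
    return ",".join(widgets[:count])

# Identical signature, plus the one sentence that actually documents it.
def bunglify_documented(widgets: list, count: int) -> str:
    """Join the names of the first `count` widgets into a comma-separated
    string, silently discarding the rest."""
    return ",".join(widgets[:count])

# Nothing in the typed signature told you the rest would be discarded:
assert bunglify(["frob", "nitz", "able"], 2) == "frob,nitz"
```

The types constrain what you can pass in, but only the prose tells you about the silent discarding, and that's exactly the kind of detail a generated function listing can't surface on its own.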
My experience is that it's easy to fail to produce useful documentation in just about any programming language, and that languages' choice of type systems does not correlate significantly with the quality of documentation of key libraries and tools.
I think there's a fair bit to be said for API discoverability. I was working with the TypeScript compiler API earlier this year, which is an excellent piece of software but not particularly well documented, and I was able to discover a fair bit of what I needed through Intellisense and well-named functions.
Python is also fairly good in this regard, though not quite as convenient, with the help() function (it's been a long time since I saw one of these but I do recall a handful of libraries breaking help with "clever" metaprogramming).
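That failure mode comes down to how help() works: it's plain runtime introspection over signatures and docstrings, so anything that drops or rewrites them leaves it blank. A minimal sketch (`scale` and `wrapper` are invented examples):

```python
# help() is powered by pydoc, which introspects the live object at
# runtime: its signature and docstring are all it has to work with.
import pydoc

def scale(values, factor=2):
    """Return a list with every element multiplied by factor."""
    return [v * factor for v in values]

text = pydoc.render_doc(scale, renderer=pydoc.plaintext)
assert "Return a list" in text    # the docstring shows up in help() output

# "Clever" metaprogramming can defeat this: wrap a function without
# copying its docstring, and help() has nothing left to show.
def wrapper(*args, **kwargs):
    return scale(*args, **kwargs)

assert wrapper.__doc__ is None    # help(wrapper) is now useless
```

(This is also why functools.wraps exists: it copies `__doc__` and friends onto the wrapper so introspection keeps working.)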
Ceteris paribus, I'd rather have the nice completions.
My point is mostly that real, good, documentation is a lot of effort no matter the language or its type system, and that once you've worked in a language long enough you internalize whatever processes you use to compensate for common shortcomings in that language's documentation. Which then means it's often not a fair comparison to be thrown into a new language, requiring new processes to compensate, that you aren't used to and so notice more.
I used to point out the same thing way back in the day when people argued that tables were "easier" for web page layout than CSS, because of all the hacks involved in doing CSS layout 15-ish years ago. It wasn't actually that tables didn't require hacks, it was that people had been doing tables long enough they'd forgotten how much of it consisted of hacks, and so the new and unfamiliar stuff they had to learn to make CSS work only seemed more complicated.
Well-designed APIs can be self-documenting to a significant extent with the right choice of names (and things they name, as well). But it takes as much if not more effort to do this right, as it does to write quality docs.
There's always Go - Gorgonia[0] aims to be a Tensorflow/PyTorch replacement. Performance is somewhat comparable (somewhat because it's always in flux as I keep updating it). Heck, it even uses Hindley–Milner type inference (not that it's particularly useful)[1]
I totally agree, it's kind of depressing. With deep learning we had an opportunity to pick a new language, because people would have used whatever language was necessary to get what TensorFlow (& Theano) provide. If Google had written TensorFlow in Julia, Swift, Nim, etc., our whole world could be different.
I also strongly dislike Python, but I've come to accept that it has its niche, and that my main issue (beyond the parts I believe are design flaws) is that its correct niche is far smaller than the domain it's actually used in.
It's the bash/perl of the 90s, and not the language you should be writing your deliverable in. Unfortunately, python happens to work well for getting something functioning, and turning that into something that functions well gets called "maintenance". I've given up that fight and now I'll just stay in python land for as long as it takes to isolate it into a separate process and then go and write the rest of my code in something I prefer.
And so you know f(a, b, c) accepts floats, but what good is that if you don't actually know what a, b, and c are? Which is the reason I hate the term "type safe". It's type-checked, but the behavior may or may not be safe.
The most appropriate languages for ML should really be functional first, statically typed languages like OCaml or F# or maybe Scala.
The problem might be that many people doing ML don't really know how to program beyond a basic level, and at first glance a language like Python seems "easier". But they get by anyway since the difficult part of ML is often not really the programming as such, and if your program is just a 100 line script calling Tensorflow then any language can work, however annoying it may be.
Things like TensorFlow for Swift are probably partially motivated by the same problem you mention.
Swift overall seems like a good possible static language for ML, but gaining popularity is a slow thing. Actually, I can't really see many alternative languages besides it that could gain popularity, except maybe Julia.
> I will always wish that Python weren't the lingua franca of sorts for machine learning.
Absolutely ... I feel like the ML world is caught in a strong local minimum with Python: it gets people on board due to the exploratory ease, but then the effort of learning a different system once they already understand things in that ecosystem is just too great. Unfortunately, it is highly suboptimal for writing highly structured, complex code.
You should have a look at DL4J and ND4J if these things frustrate you. Particularly with a dynamic JVM language like Groovy they give you something close to the best of both worlds.
Jedi in Vim, PyCharm/IntelliJ, and whatever tool VSCode uses (which I think is Jedi).
Same experience everywhere. Once I type "tf" - a hang for multiple seconds, presumably while it loads the entire Tensorflow library. The completion then works smoothly, until I hit space or otherwise exit the completion.
Once I type "tf" again, the cycle repeats, and I have to wait multiple seconds.
You'd think the tool might cache libraries instead of reloading them every time. Seems like it doesn't at all.
You need YouCompleteMe. It uses Jedi internally, but the completion is async, i.e. the UI is not blocked. It also caches completions, so you should get completion candidates reasonably fast the second time you input tf.
IntelliJ completion should index and cache and it's always async. I just tried it with the current one (and current tf) - the response is not instant (tf has a ridiculous number of toplevel definitions) but it's much faster than 2 seconds.