JupyterLite: a JupyterLab distribution that runs in the browser (github.com/jupyterlite)
326 points by abdullahkhalids on Nov 29, 2022 | 55 comments



JupyterLite is such an incredible piece of software. It came out of the Pyodide project, which first got Python compiled to WebAssembly working in the browser.

It really is astonishing that they've managed to get the full data science Python stack - a lot of it based around custom C extensions (NumPy, pandas, etc.) - running entirely in the browser.


Shameless self-promotion: I made a constrained particle simulator and a numerical integrator visualizer with Pyodide. (They are both for class assignments.)

https://scleox.github.io/constrained-particle-system-simulat...

https://scleox.github.io/integrator-visualizer/index.html


it's neat, but it's kinda slow (25x slower than native when I tested it)

In a JupyterLite notebook:

    import numpy as np
    a = np.random.randn(128,128)
    b = np.random.randn(128,128)
    %timeit a @ b
    1.5 ms ± 3.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

On my command line:

    $ python3 -m timeit --setup 'import numpy as np; a = np.random.randn(128,128); b = np.random.randn(128,128)' 'a @ b'
    5000 loops, best of 5: 59.4 usec per loop


While this is obviously a great demonstration of why you wouldn't do something silly like build an entire code editor in the browser for day-to-day use (because it would be slow and clunky and have all sorts of issues), it's great for accessibility when you don't need performance at all.


Is this a snide comment on Atom and VS Code? (I do think they're "slow", but not necessarily more so than non-web rich IDEs - just compared to something like vim with a conservative config/plug-ins.)


I don't think so.

VSCode, Atom (and Jupyter) all run natively, only using a browser renderer for UI.

Jupyter already uses a browser as the front end, but the kernel runs natively.

JupyterLite runs entirely in the browser, including the kernel. This means you don't need Jupyter installed or running on a server to execute the kernel. But it also means the kernel is running much slower in WebAssembly.


Yeah, I've had a slowly bubbling level of hatred towards vscode for the last few months. Because of its heft it's absorbed the whole plug-in ecosystem, but it's like sublime text left the sunroof open and now everything is soggy.


There's definitely a noticeable input lag in vscode.

I basically nerfed the vscode language features, which makes it more manageable.

But sublime still feels far more responsive.


Sure, but if you want to do a demo or have students do something quick, it's a great start


Not that I expect this to have fantastic performance, but I think your benchmark is cheating. Locally hosted Jupyter is going to have significant overhead compared to a straight Python interpreter.


On my machine, the parent's test gives:

    python command line  247 µs
    local jupyter lab    274 µs
    jupyter lite         4 ms  (about 16 times slower than the other two)

Honestly, considering how jupyter lite works, I still feel this is pretty good.
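
For reference, the slowdown factors implied by the timings quoted in this thread can be checked with a few lines of arithmetic. The numbers are copied from the two benchmark comments; the dictionary and function names are only illustrative.

```python
# Timings quoted in the thread, converted to seconds.
first_test = {"native": 59.4e-6, "jupyterlite": 1.5e-3}
second_test = {"cli": 247e-6, "jupyterlab": 274e-6, "jupyterlite": 4e-3}


def slowdown(slow, fast):
    """Return how many times longer `slow` takes than `fast`."""
    return slow / fast


print(f"first test, lite vs native:  {slowdown(first_test['jupyterlite'], first_test['native']):.1f}x")
print(f"second test, lite vs CLI:    {slowdown(second_test['jupyterlite'], second_test['cli']):.1f}x")
print(f"second test, lite vs lab:    {slowdown(second_test['jupyterlite'], second_test['jupyterlab']):.1f}x")
print(f"local lab overhead vs CLI:   {slowdown(second_test['jupyterlab'], second_test['cli']) - 1:.0%}")
```

That works out to roughly 25x for the first machine and 15-16x for the second, with local JupyterLab adding on the order of 10% over the bare interpreter.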


I’m shocked the local instance is so tight with the interpreter. I assumed jupyter injected itself all over the place to allow halting loops and what have you.

I do not care about the performance of jupyter lite - that it runs at all is a miracle. That it has acceptable performance for doing demos/teaching from a browser is simply magic.


Jupyter is incredibly powerful. I think a lot of the surprise (which I also had) comes when you arrive from compiled languages and expect a huge difference between running compiled machine code and running some faux interpreter. Python doesn't have that distinction. Everything is an interpreter (in a very loose sense). Jupyter, ipython, the shell etc. all typically work similarly. If you think the overhead of rendering a webpage should slow down Jupyter, that is not the case. The "kernel" running your code and whatever handles your interactivity with Jupyter are separate processes.
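
A rough way to picture the "separate processes" point: the front end only ships source text to another interpreter and displays whatever comes back. The sketch below uses a one-shot `subprocess` call as a stand-in; real Jupyter keeps a long-lived kernel and talks to it over ZeroMQ messages, and `run_in_separate_process` is a made-up helper name.

```python
import subprocess
import sys


def run_in_separate_process(code: str) -> str:
    """Execute `code` in a fresh Python interpreter and return its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise if the child interpreter exits with an error
    )
    return completed.stdout.strip()


# The "front end" (this process) never runs the user's code itself.
print(run_in_separate_process("print(sum(range(10)))"))
```

The code in the child runs at normal interpreter speed regardless of how the parent renders the result, which is why the UI layer adds so little to the timings.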


In Jupyter you choose Run Cell or Debug Cell. Debug Cell just uses sys.settrace like pdb does. Run Cell is straight python. Also, the example they gave pretty much calls into C straight away.
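
To get a feel for what a `sys.settrace` hook does (this is only a sketch of the mechanism, not of Jupyter's debugger), the snippet below counts how many times the interpreter calls back into Python while a small loop runs. `count_line_events` and `sum_loop` are illustrative names.

```python
import sys


def count_line_events(func):
    """Run func() under a trace hook; return (result, number of 'line' events)."""
    lines = 0

    def tracer(frame, event, arg):
        nonlocal lines
        if event == "line":
            lines += 1
        return tracer  # keep receiving events for this frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func()
    finally:
        sys.settrace(previous)  # restore whatever hook was installed before
    return result, lines


def sum_loop(n=1000):
    total = 0
    for i in range(n):
        total += i
    return total


result, events = count_line_events(sum_loop)
print(result, events)
```

Every executed line triggers a Python-level callback when tracing is on, and none when it is off. A benchmark like `a @ b` is a single line that hands off to compiled code, so it would barely notice either way.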


Why do you say that? I assume the timeit magic reduces a lot of overhead


What happens if you increase the size of the matrices? I think we might be measuring the overhead.
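
One way to test that would be to run the same sweep in both environments and compare per-size ratios. This is a minimal sketch, assuming NumPy is available in both places; the helper names, the sizes, and the loop counts are arbitrary choices.

```python
import timeit


def best_seconds_per_call(func, number=100, repeat=5):
    """Best-of-`repeat` average time of one func() call, in seconds."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


def sweep_matmul(sizes=(128, 256, 512)):
    """Time `a @ b` for square matrices of each size; returns {n: seconds}."""
    import numpy as np  # imported here so the helper above works without NumPy

    results = {}
    for n in sizes:
        a = np.random.randn(n, n)
        b = np.random.randn(n, n)
        results[n] = best_seconds_per_call(lambda: a @ b, number=20, repeat=3)
    return results
```

Calling `sweep_matmul()` in a JupyterLite cell and again in a local interpreter, then dividing the two results size by size, would show whether the slowdown shrinks as the matrices grow (fixed overhead dominating) or stays roughly constant (the multiply itself being slower).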


Such a compact resource usage in WebAssembly.

I am seeing about 50-60MB resource usage which is fantastic considering that includes pandas, numpy, sklearn and some other non standard modules.

By comparison when I tried to package a hello world pandas project with PyInstaller the single file exe for Windows came out to 500MB.


and Fortran! SciPy is like 20% fortran (as far as I remember).


It really is. I guess this'll be quite useful for people on chromebooks and iPads as well.


Chromebooks can run linux containers out of the box, you don't need to hurt yourself like this. The experience is similar to WSL.

https://support.google.com/chromebook/answer/9145439?hl=en


Not school-issued ones, as my son and I discovered. He told me that the school had assigned a Chromebook to each student. I said: "Great, let's install Python on it."

Nope. Locked down like a crab's arse.

Sad. The Chromebook went under his bed, and came back out at the end of the school year when it had to be returned.


I’ve never mentally rendered a crab’s ass but now your comment got me thinking, is it some sort of chitin-based mechanical trap door? Or maybe an iris like the star gate but with fewer petals? This is a particularly glaring gap in my intuition’s ability to improvise an answer.


I didn't believe it when I first read it, but I can indeed run arbitrary code on my iPhone with JupyterLite. This is great! So finally I can analyse data using the iPad on a plane, hopefully… if I find a way to reliably start JupyterLite from the iPad's local storage, load my data from there, and have a persistent Python environment.


This GitHub Actions + GitHub Pages feature looks like the use case I would be looking to exploit for educational purposes.

From this: https://github.com/jupyterlite/demo you can easily get to https://jupyterlite.github.io/demo

In case anyone wants to be helpful, I would love to see a guide on how to use this to transparently display where files are stored, so that I could figure out how to automatically commit my changes or something like that. Each repository could be a lesson and worksheet for a coding class, or even a collection of lessons. I imagine it would be fairly straightforward to create an interaction mode as well, for non-coders to ogle others' charts, tables and interactive Bokeh plots.


Yes, the reason I posted this is because I was looking into how to host my mini-book online, which has a lot of computational interactions built in.

I don't want to deal with hosting the backend on my server and all the security and performance issues it entails. This is a great solution.

Though not ideal. I want to publish the book as a jupyter-book, and best would be if the code cells in it were interactive. I think my readers can do without the remaining features of Jupyter lab.


There was some work (and POC) on getting pyodide to work as a thebe backend. You can check out this thread: https://github.com/executablebooks/thebe/issues/465 and other issues/PRs backlinked to it. I don't think anything like that has shipped yet, but definitely worth exploring (in the meantime the usual mybinder backend[1] for jupyter books works great, i.e. you don't have to host yourself).

[1] https://jupyterbook.org/en/stable/interactive/launchbuttons....


> I would love to see a guide to how to use this to transparently display where files are stored [...]

I'm not 100% sure I understand your request for a guide, but if you create your own repo based on the jupyterlite/demo template, you can then put your notebooks in `content` dir[1] and they will automatically become available. The CI build step that does this is here[2]. So in some sense you don't need a guide, it just works ;)

[1] https://github.com/jupyterlite/demo/tree/main/content

[2] https://github.com/jupyterlite/demo/blob/main/.github/workfl...


Related:

Tiny Games in JupyterLite - https://news.ycombinator.com/item?id=31315470 - May 2022 (1 comment)

JupyterLite: Jupyter WebAssembly Python - https://news.ycombinator.com/item?id=27823962 - July 2021 (2 comments)

JupyterLite – WASM-powered Jupyter running in the browser - https://news.ycombinator.com/item?id=27323548 - May 2021 (64 comments)


JupyterLite is great! I use it heavily with my middle school math students — wonderful to have NumPy, matplotlib, and friends available without needing to set up accounts. I created a package called jupy5 that provides Turtle graphics and a p5.js-inspired sketchbook for JupyterLite as well.


jupy5 sounds pretty awesome!


Thanks! And big thanks to martinRenou1 for his work on ipycanvas!! He did most of the heavy lifting.


What exactly is JupyterLab and how does it fit into the Jupyter ecosystem? There are notebooks and JupyterLab and JupyterHub, but I haven't found anything documenting what role each fulfills.


Jupyter Notebooks was the start (of things Jupyter)

Then people said it can be tough for people who want to do data analysis to have to figure out installing python locally and getting everything set up to run notebooks just to open a web browser, so JupyterHub was created so that Data Analyst types didn’t have to worry about any of that, they could just open a web browser and work (also running on a server lets you have a beefy server to connect to).

JupyterLab is the next version of Jupyter Notebooks, separating some of the concerns on the back end and giving a bit of a different interface on the front end.


Jupyterlab is just an IDE. You can open up terminals, consoles (which are like notebooks, but are a console), and notebooks, on top of a text editor.

It's actually really nice. Works great over ondemand and is way more responsive than X-based apps.

Jupyterhub is a way to coordinate compute space / config across many users of notebooks or jupyterlab.


This is really great because I could "install" it on my PC at work, which wouldn't let me install any actual downloadable software (not working there as a developer, just office stuff). It almost behaves like a real app; I only need to upload files to the app file system. Totally sufficient for running a small project and figuring things out in pandas or extracting things from a few CSVs. I tried replit.com before, but the connection to the VM never really worked well enough and issuing commands took ages. The only other thing is writing PowerShell, which surprisingly I can run, but that's far from the flexibility of Python.


Starboard.gg is similar but also allows JavaScript, CSS, or HTML cells to execute in the same notebook. Python has access to the DOM and can share variables with JavaScript.


I love the idea of Starboard. But seems like the development has stalled?


Yes. Last commit was 5 months ago [1]. Seems like a great idea though.

What I don't like is that they invented yet another markdown syntax for code cells - it is the opening bracket # %[python] with no closing bracket.

There already is a popular markdown code cell syntax of [2]

```python

```

[1] https://github.com/gzuidhof/starboard-notebook

[2] https://github.github.com/gfm/#fenced-code-blocks


The format [0] is only partially invented: it follows Jupytext [1], but adds support for cell metadata. There is no obvious way to get that in fenced code blocks, especially with the ability to spread it over multiple lines so it plays well with version control.

One more consideration is that it's not "Markdown with code blocks interspersed", one might as well use plaintext or AsciiDoc.

Of course there are tradeoffs.. I wish I had more time to work on it.

[0]: https://github.com/gzuidhof/starboard-notebook/blob/master/d...

[1]: https://github.com/mwouts/jupytext
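
To make the "opening marker, no closing bracket" idea concrete, here is a toy splitter for header-delimited cells. It is a simplified sketch and not Starboard's or Jupytext's actual parser: the regular expression is an assumption that accepts both the `# %[python]` spelling quoted above and a Jupytext-style `# %% [python]`, and it ignores cell metadata entirely.

```python
import re

# Assumed header shape: "# %[type]" or "# %% [type]" alone on a line.
HEADER = re.compile(r"^# %+\s*\[(\w+)\]\s*$")


def split_cells(text):
    """Split notebook text into (cell_type, source) pairs.

    A cell starts at a header line and runs until the next header or the
    end of the text, so no closing delimiter is needed.
    """
    cells = []
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = {"type": match.group(1), "source": []}
            cells.append(current)
        elif current is not None:
            current["source"].append(line)
    return [(c["type"], "\n".join(c["source"]).strip()) for c in cells]


sample = "# %% [markdown]\n# Title\n# %% [python]\nprint(1)\n"
print(split_cells(sample))
```

Because cell boundaries are single lines, adding or reordering cells produces small line-based diffs, which is the version-control argument made above.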


Thanks for the clarifications. I still don't like it, but I understand.


Whatever happened to collaborative editing in these notebooks? It would be a great use case for JupyterLite, but last I saw, the only project for real-time sharing of notebooks over the network was slurped up into Google Colab and any open version was abandoned.



Excellent news, thank you! Imagine telling someone even ten years ago that they could program something in a notebook environment, over the internet, in a slow language, with all the work of setup and computation being done in the browser, and it would still be convenient and fast!


This looks great and as an R guy makes me rather jealous. Any chance that R will get compilation to wasm? There's this but it seems dead: https://github.com/iodide-project/r-wasm



the overall effort could be a game changer for data science education. the barriers have dropped considerably since open source data science stacks based on python, R or julia became widely used but it is still quite a hassle to setup an environment for non-technical people


JupyterLite is a great project!

Addressing some other comments: the main limitation you'll find is probably the filesystem. WASM File System APIs are not extremely well evolved and (i think?) they're still in-memory FSs.


JupyterLite has a binding between the browser local storage and the WASM File System, so you can create a file from Python and it will be saved in the browser local storage for the next time you visit.


Files seem to persist even after quitting chrome, refreshing or closing the tab. I'm not an expert, but assume it's cookie storage.


I think JupyterLite is using localStorage by default. But there are File System APIs for WASM and some external projects[0], I don't think they're yet integrated.

[0] https://github.com/jvilk/BrowserFS


Thanks, yes, I see it now. It stores the notebook, which is JSON, as a value in the localStorage object, with the filename as the key. Interesting, though not quite easy to configure (I haven't read through the documentation or code to find out how).
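
The layout described here (filename as key, notebook JSON as value) can be sketched in a few lines. `FakeLocalStorage` is a plain-Python stand-in for the browser object, and the exact keys and wrapping JupyterLite uses are not reproduced; this only mirrors the description above.

```python
import json


class FakeLocalStorage:
    """Stand-in for the browser's localStorage: string keys, string values."""

    def __init__(self):
        self._items = {}

    def setItem(self, key, value):
        self._items[key] = str(value)

    def getItem(self, key):
        return self._items.get(key)


def save_notebook(storage, filename, notebook):
    """Serialize the notebook dict to JSON and store it under its filename."""
    storage.setItem(filename, json.dumps(notebook))


def load_notebook(storage, filename):
    """Return the stored notebook as a dict, or None if it was never saved."""
    raw = storage.getItem(filename)
    return None if raw is None else json.loads(raw)


# A minimal notebook document in the nbformat 4 shape.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {
            "cell_type": "code",
            "source": "print('hi')",
            "metadata": {},
            "outputs": [],
            "execution_count": None,
        }
    ],
}

storage = FakeLocalStorage()
save_notebook(storage, "demo.ipynb", notebook)
print(load_notebook(storage, "demo.ipynb")["cells"][0]["source"])
```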


> I think JupyterLite is using localStorage by default

This has been a problem when using Konqueror.


Unreal. How does this stack up against a cluster based approach for large datasets?


It runs on your local machine through an intermediary compilation unit, so I think in comparison to a cluster the answer is “poorly”. This is for quickly spinning up an analysis without any of the traditional hosting mechanisms (virtualenvs, docker, etc).

I think of it as a fantastic teaching tool where getting a newbie’s environment configured would otherwise be painful. With this tool, you can share a link and have people coding in moments.


JupyterLite is awesome. But I hope wasm can support grpc.



