
> Expose multiple interpreters to Python code

> Implement PEP 554

> PEP 554 - Multiple Interpreters in the Stdlib

That's going to be fun. Why fight the GIL when multithreading, when you can just get around it with more interpreters?



How is this different from multiprocessing? The examples look like a complete nightmare...

    import textwrap as tw
    import interpreters  # the module proposed by PEP 554

    interp = interpreters.create()
    interp.run(tw.dedent("""
        import some_lib
        import an_expensive_module
        some_lib.set_up()
        """))
    wait_for_request()
    interp.run(tw.dedent("""
        some_lib.handle_request()
        """))
I'm actually shocked this is even being contemplated. We've regressed to evaling?


There's a massive difference from multiprocessing: different subinterpreters can use a C(++)/Rust extension module to talk to shared state. In the current multiprocessing world, the whole C++/Rust state needs to be duplicated for each process (in the case of our app, this means 5 GB of memory usage per core); with subinterpreters, we can share the same C++/Rust state.

The `interpreters` API is just the starting point. Compare it with `subprocess`, not with `multiprocessing`. Once subinterpreters are useful, people will build higher-level APIs for them.
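To illustrate the kind of higher-level API people might build on top, here's a minimal sketch. Everything here is invented for illustration: `FakeInterp` is a stand-in that just `exec`s in an isolated namespace, not a real subinterpreter, but it shows the isolation a wrapper would present.

```python
class FakeInterp:
    """Illustration-only stand-in for a PEP 554 interpreter:
    run() executes source in this instance's own isolated namespace."""

    def __init__(self):
        self._ns = {}

    def run(self, source):
        exec(source, self._ns)


# Two "interpreters" don't share globals, much like real subinterpreters:
a, b = FakeInterp(), FakeInterp()
a.run("x = 1")
b.run("x = 2")
a.run("result = x * 10")  # sees a's x, not b's
```

Real subinterpreters would give this same isolation across OS threads, each with its own GIL, while still living in one process.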


It's just a proof of concept to show independently running code. The exec() is not how you will use it, but it simulates importing Python code and running it (because an exec() is what import does :)).


It's the minimum needed to get something going, and any really sane use of it would be evalling something like

    import worker
    worker.run()
The PEP explicitly mentions this, and that something like subinterpreter.run(func, ...) could be considered in the future: https://peps.python.org/pep-0554/#interpreter-call


I think this is so smart. The main thing holding back replacement of the GIL at the moment is that there is a VAST existing ecosystem of Python packages written in C/etc that would likely break without it.

Multiple interpreters with their own GIL keep all of that existing code working without any changes, and mean we can run a Python program on more than one CPU at the same time.


Only for C extensions that themselves have no global state and don't depend on the GIL for locking (which most of them do). So they will all require some porting, and it will take time, since porting requires newer CPython APIs only available in 3.9+, and some only in 3.11+ (PEP 630).


But isn't this basically just nicer multiprocessing?


It's much nicer if you're using the C API...


No, they are in-process and able to access the same memory, although probably put behind a communication layer.


I do think that there's been a lot of work around GIL removal, and every talk seems to end at the reality that the GIL avoids the need for a loooooot of locking structures; when you remove the GIL, you end up needing many granular locks.


It comes at a cost, of course. You don't really have shared memory state, which is often easiest to conceptually think about.

So you are just transforming the problem into a data sharing problem between interpreters, which requires careful thought on both the language side for abstractions, and the consumer side to use right.

It also makes the tooling and verification much harder in practice - for example, you aren't reasoning about deadlocks in a single process anymore, but both within a single process and across any processes it communicates with.

At an abstract level, they are transformable into each other. At a pragmatic level, well, there is a good reason you mostly see tooling for single-process multi-threaded programs :)


> which is often easiest to conceptually think about

Absolutely, but is also the easiest to shoot yourself in the foot with. Trade-offs! I'm biased though, I'm a big fan of deep-copy channels (which for small shallow objects is still fast), though not having the option at all for shared memory here will be a bit of a pain for certain things of course.
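A minimal single-process sketch of the deep-copy-channel idea (the `DeepCopyChannel` name and API are invented here; real subinterpreter channels would live at the C level):

```python
import copy
import queue


class DeepCopyChannel:
    """Channel that deep-copies every message, so sender and receiver
    never share mutable state (cheap for small, shallow objects)."""

    def __init__(self):
        self._q = queue.Queue()

    def send(self, obj):
        self._q.put(copy.deepcopy(obj))

    def recv(self):
        return self._q.get()


ch = DeepCopyChannel()
msg = {"counts": [1, 2]}
ch.send(msg)
msg["counts"].append(3)  # mutating after send...
received = ch.recv()     # ...does not affect the receiver's copy
```

The copy cost is the trade-off for never having to reason about two threads mutating the same object.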


If all global state is made thread safe, then whether threads are subinterpreters or a single interpreter is conceptually irrelevant and probably easier to implement.


It really really depends on what you mean by global state here. If you mean the global state within the interpreter that's one thing. Preserving the global state of your application is another.

But a weird "global state" (really more a global property) is the semantics between concurrent pieces of code and the expectations about things like setting variables, possible interleavings, etc.

The nice part of different interpreters isn't just getting around the GIL while maintaining similar isolation; it's almost like a Terms of Service agreement: I opened this can of worms, and it's my responsibility to learn what the consequences are.


It is not conceptually irrelevant. With threads, you can create a Python object on one thread, store it into a global (or some shared state), and use that object from a different thread. You can't do that with subinterpreters, though.


I think the last decade or so of programming has taught us that people just plain suck at multithreading. Go and Rust are languages that solve this problem in different ways. It would be a tragedy if Python went back to the old way and didn't have a better solution.


"Went back" implies that threads and shared state are not the status quo. They definitely are in Python (and, realistically, in general, given the degree of Rust adoption vis-à-vis other languages). So Python will have to support them, if only so that we don't have to rewrite all the Python code that's already around. A new language has the luxury of not caring about backwards compatibility like that.

Also, Go doesn't really solve the problem - sure, it has channels, but it still allows for mutable shared state, and unlike Rust, it doesn't make it hard to use.


In my career, I would say 95% of parallelism does not require low level threading primitives like locks. A lot of it is solved by queues which can be provided by the runtime. The rest of the 5% usually takes up 25% of the debugging, lol.
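For example, a queue-based worker pool in stdlib Python needs no user-visible locks at all; the queues do all the synchronization:

```python
import queue
import threading


def worker(tasks, results):
    # Pull work items until the sender signals completion with None.
    while (item := tasks.get()) is not None:
        results.put(item * item)


tasks, results = queue.Queue(), queue.Queue()
threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(4)]
for t in threads:
    t.start()
for n in range(10):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one poison pill per worker
for t in threads:
    t.join()

squares = sorted(results.queue)  # peeking at the internal deque; demo only
```

All the locking lives inside `queue.Queue`, which is exactly the "provided by the runtime" situation described above.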


Couldn't you transfer ownership between subinterpreters with shared memory outside the subinterpreter?

Associate some shared memory with each subinterpreter (the same array or map).


It's not a question of having a mechanism to transfer data. Sure, you can easily use a static global in a native module to easily transfer a reference across subinterpreter boundaries. But the moment you try to increment refcount for the referenced object, things already break, because you're going to be using the wrong (subinterpreter-global) lock.


Oh, I was thinking from a native Python object perspective.

You could have a rule that the refcount must be 1 when sending an object between subinterpreters.

In other words, you cannot use an object after it has been .send() to another subinterpreter.

Then you invalidate the reference in the sending subinterpreter when it calls .send(); ownership is transferred by assignment.

That way, you can transfer any amount of data with zero copies.
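A single-process sketch of that invalidate-on-send rule (all names here are invented; a real implementation would have to enforce the refcount == 1 rule at the C level):

```python
import queue


class OwnedRef:
    """Wrapper enforcing single ownership: after send(), the sender's
    reference is invalidated, mimicking move semantics."""

    def __init__(self, obj):
        self._obj = obj

    def get(self):
        if self._obj is None:
            raise RuntimeError("reference was sent to another interpreter")
        return self._obj


class TransferChannel:
    def __init__(self):
        self._q = queue.Queue()

    def send(self, ref):
        self._q.put(ref._obj)  # zero-copy: hand over the underlying object
        ref._obj = None        # invalidate the sender's reference

    def recv(self):
        return OwnedRef(self._q.get())


ch = TransferChannel()
ref = OwnedRef([1, 2, 3])
ch.send(ref)
moved = ch.recv()
# moved.get() works; ref.get() now raises, since the sender gave up ownership.
```

This is essentially Rust-style move semantics layered on top of a channel, rather than anything CPython offers today.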


Even a completely empty object will contain a reference to its type, which is itself an object. How will you marshal that? Bear in mind that each subinterpreter has its own copy of type objects, and there isn't even a guarantee that those types match even if their names do.


It sounds like a hard problem, but I feel it can be solved with engineering and mathematics. Not saying it would be easy, though.

Couldn't you separate the storage of the refcounts from the objects and use a map to get at them?

As for the identities between types being different: to marshal between subinterpreters without copying the data structures, you need a data structure that is safe to use from any interpreter. We need to decouple the bookkeeping data structures from the underlying data.

We can regenerate the bookkeeping data structures during a .send() or .receive().

Maintaining identity equivalence is an interesting problem. I think it's a more fundamental problem that probably has mathematical solutions.

If we think of objects as vectors in a mathematical space, we have pointers between vectors in this memory space.

For a data structure to be position independent, we need some way of making references global. But we don't want to introduce a layer of indirection on every object reference; that would be slower. We could use an atomic counter to ensure that identifiers are globally unique.

We don't want to serialize access to the global types.

It sounds to me like a many-to-many to many-to-many problem, which even databases don't solve very well.


It occurred to me that I was told about Unison Lang recently, and that language uses content-addressable functions.

In other words, the code for a function is hashed, and that hash is its identity, which never changes while the program is running.

If we used the same approach with Python, each object could have a hash that corresponds to its code only, instead of its data. This would be the object's identity even when added to the bookkeeping data of another subinterpreter.

This requires decoupling bookkeeping information from actual object storage, but it replaces pointers with lookups, which could be inlined to pointer arithmetic.


> conceptually irrelevant and probably easier to implement.

Well, it depends on how it’s implemented.

If “made thread safe” means constantly grabbing locks around large blocks of data then the end result is concurrency (hopefully!) but not parallelism. Meaning you might only have one thread active at a time in practice.

Wrapping the universe in a mutex is thread safe. But it’s not a good solution.
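A quick illustration of why: with one coarse lock, every thread is safe but the "work" sections can never overlap, so you get concurrency without parallelism (the sleep is a hypothetical stand-in for touching shared data):

```python
import threading
import time

BIG_LOCK = threading.Lock()  # one mutex around "the universe"


def safe_but_serial():
    # Thread safe: only one thread can be doing "work" at a time.
    with BIG_LOCK:
        time.sleep(0.05)  # stand-in for work on shared state


threads = [threading.Thread(target=safe_but_serial) for _ in range(4)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
# The four 0.05s work sections cannot overlap, so elapsed is roughly
# 4 * 0.05s, not 0.05s: every thread made progress, but never in parallel.
```

Replace the one big lock with many fine-grained ones and the threads can overlap, at the cost of all the deadlock and ordering hazards discussed above.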


I presume they know what they are doing and won't be doing a big world mutex :-)



