
> Per-interpreter isolation for extension modules

This will break many modules: basically any that use static variables, and pretty much all of them do.



Yes, implementing support for this would be a challenge for extension modules. Here is a discussion between the core dev and the numpy team: https://mail.python.org/archives/list/numpy-discussion@pytho...

It's going to be a bit of a chicken-and-egg problem: core Python will need to prove it's worthwhile for extension devs to implement, but core Python will struggle without support from extension devs. We shall see.


IMO, it was this sort of chicken-and-egg problem that slowed the adoption of 3.x in the first place. Personally, I wasn't able to use 3.x for anything non-trivial until close to 3.7 because some of the 3rd party libs I needed weren't available. I seriously hope this doesn't happen again, though I am really excited for these improvements to CPython.


I don't disagree, but the positive thing about this is it's opt-in for extensions.

If an extension doesn't support it, you just can't use that extension when running multiple interpreters in the same process. Let's see if there's even a good use case for running multiple interpreters in the same process outside of embedded programming; it's not 100% clear yet.


If it's static, would it not get its own allocation within each of the isolated interpreters?


Extension modules are loaded as shared libraries/DLLs. The way operating systems implement this is that each library is loaded once per process and its statically allocated memory is mapped into the virtual address space of the process. You can't load one so/dll multiple times in some sort of container, so each module would have to implement this isolation itself, probably through some sort of API that the Python runtime offers to the module. It's not rocket science, but it will definitely break existing code, where it's common practice to use DLL lifetime hooks as initialization code that allocates some global state that's conveniently shared throughout the module.
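To make the "loaded once per process" point concrete, here is a minimal sketch using `ctypes`, assuming a POSIX system where `ctypes.CDLL` goes through plain `dlopen`. The helper name `load_twice` and the choice of `getpid` as the probe symbol are only for illustration. Opening the same library twice should hand back the same handle and the same symbol address, which is why any static storage behind those symbols ends up shared.

```python
import ctypes
import ctypes.util


def load_twice(symbol="getpid"):
    """Open the C library twice; return both handles and both symbol addresses."""
    # find_library may return None; CDLL(None) then opens the running process itself.
    name = ctypes.util.find_library("c")
    first = ctypes.CDLL(name)
    second = ctypes.CDLL(name)
    # Cast the looked-up function objects to raw addresses so they can be compared.
    addr_first = ctypes.cast(getattr(first, symbol), ctypes.c_void_p).value
    addr_second = ctypes.cast(getattr(second, symbol), ctypes.c_void_p).value
    return (first._handle, second._handle), (addr_first, addr_second)


handles, addresses = load_twice()
print("same handle:", handles[0] == handles[1])
print("same symbol address:", addresses[0] == addresses[1])
```

Two interpreters importing the same extension module in one process are in the same position as `first` and `second` here: they both see the single mapped copy of the library.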


> You can't load one so/dll multiple times in some sort of container

I believe you can do that with `dlmopen` in separate link maps. I have worked with multiple completely isolated Python interpreters in the same process that do not share a GIL using that approach.


Thank you for the hint about dlmopen! I had a problem that can be solved by loading multiple copies of a DLL, and it looks like reading manpages of the dynamic linker would have been a better approach than googling with the wrong keywords.


That's great!

There are a few cases where `dlmopen` has issues. For example, some libraries are written with the assumption that there will only be one copy of them in the process (their use of globals, thread-local variables, etc.), which may result in conflicts across namespaces.

Specifically, `libpthread` has one such issue [1] where `pthread_key_create` will create duplicate keys in separate namespaces. But these keys are later used to index into `THREAD_SELF->specific_1stblock` which is shared between all namespaces, which can cause all sorts of weird issues.

There is a (relatively old, unmerged) patch to glibc where you can specify some libraries to be shared across namespaces [2].

[1]: https://sourceware.org/bugzilla/show_bug.cgi?id=24776#c13

[2]: https://patchwork.ozlabs.org/project/glibc/patch/20211010163...


IIRC glibc is limited to 16 namespaces though.


Currently it is, yes. I am not sure how fundamental it is. I tried patching glibc to support more (128 in my case) and it seemed to work fine.


It's all a single process, and native modules are just shared libraries, so how would it allocate multiple instances for different interpreters?


Does anyone know of a way to load multiple instances of a DLL in the same process on Linux? A few months ago I was googling for a solution and didn't find anything ready-made. I guess the dynamic linker wants to have a unique address for each symbol, but in principle you should be able to load another DLL instance, initialize it and call its functions indirectly by using function pointers.


How would you find said function pointers?


`dlsym` and `RTLD_LOCAL`?
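A rough sketch of what that lookup looks like, again via `ctypes` on a POSIX system (an assumption; `strlen` is just a convenient symbol to probe). `ctypes.CDLL(..., mode=ctypes.RTLD_LOCAL)` passes the flag to `dlopen`, and item access on the handle does the by-name lookup that `dlsym` provides. Note that this only shows the call shape: the C library is already loaded in the process, so this does not create a second instance of it.

```python
import ctypes
import ctypes.util

# find_library may return None; CDLL(None) then opens the running process itself.
libc_name = ctypes.util.find_library("c")

# RTLD_LOCAL: the library's symbols are not added to the global lookup scope,
# so they are reached through this handle rather than by the dynamic linker.
handle = ctypes.CDLL(libc_name, mode=ctypes.RTLD_LOCAL)

# By-name lookup through the handle; the result is a callable function pointer.
strlen = handle["strlen"]
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

# Call the function indirectly through the pointer we just looked up.
length = strlen(b"indirect call")
print(length)
```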



