Removing the GIL won't magically solve any problems.
This has been tried before, with disappointing results, which is why I'm reluctant to put much effort into it myself. In 1999 Greg Stein (with Mark Hammond?) produced a fork of Python (1.5 I believe) that removed the GIL, replacing it with fine-grained locks on all mutable data structures. He also submitted patches that removed many of the reliances on global mutable data structures, which I accepted. However, after benchmarking, it was shown that even on the platform with the fastest locking primitive (Windows at the time) it slowed down single-threaded execution nearly two-fold, meaning that on two CPUs, you could get just a little more work done without the GIL than on a single CPU with the GIL.
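To make the arithmetic in that quote concrete, here is a rough back-of-the-envelope sketch. It assumes perfect scaling across CPUs, and the 1.8 slowdown factor is a hypothetical stand-in for "nearly two-fold", not a measured number; the function name is invented for illustration.

```python
def relative_throughput(cpus, single_thread_slowdown):
    """Work done relative to one CPU running with the GIL.

    Assumes (optimistically) that work scales perfectly across CPUs
    once the GIL is gone.
    """
    return cpus / single_thread_slowdown


# Baseline: one CPU, GIL in place, no slowdown.
one_cpu_gil = relative_throughput(cpus=1, single_thread_slowdown=1.0)

# Two CPUs, no GIL, with a hypothetical 1.8x single-threaded slowdown
# standing in for "nearly two-fold".
two_cpu_no_gil = relative_throughput(cpus=2, single_thread_slowdown=1.8)

print(one_cpu_gil, two_cpu_no_gil)
```

Under those assumptions the second CPU buys only about a tenth more work, which matches the "just a little more work done" in the quote.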
"To reduce memory usage, the garbage collector will now clear internal free lists when garbage-collecting the highest generation of objects. This may return memory to the operating system sooner."
For anyone building long-running processes, or loading data that generates millions of unique ints or floats over and over, this is major (though it's still hard to believe it was ever an issue).
I'm not sure I understand why this is "huge." If the working set of the program is stable over time (or slowly growing), it seems like this isn't a significant win: it's only a win when the working set shrinks, and this change allows the freed memory to be returned to the OS more promptly. Your examples of "long running processes" or "generating millions of unique ints/floats" wouldn't necessarily qualify: ISTM that the normal generational GC should handle both those cases fine. Am I missing something?
Long story short: CPython uses a custom memory allocator on top of the OS malloc because some mallocs are really bad and Python knows more about its memory usage patterns than the OS does. The newer CPython allocator plays better with popular OSes, so repeatedly allocating and freeing lots of objects is more likely to return memory to the system.
The old behavior didn't affect server-sized systems and typical workloads, but it did piss off some embedded apps.
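For the curious, the kind of workload being discussed looks roughly like the sketch below: churn through a lot of short-lived floats, then force a collection of the highest generation. The function name and the object count are made up for illustration, and whether any memory actually goes back to the OS depends on the CPython version and the platform allocator, so this only shows the shape of the pattern and guarantees nothing.

```python
import gc


def churn_floats(n):
    """Allocate n distinct floats, use them once, then drop them all."""
    values = [float(i) + 0.5 for i in range(n)]
    total = sum(values)
    del values  # the floats are now garbage
    return total


total = churn_floats(1_000_000)

# Generation 2 is the highest (oldest) generation in CPython's collector.
# The release note quoted above says this is the point at which internal
# free lists get cleared; gc.collect() returns the number of unreachable
# objects it found.
unreachable = gc.collect(2)
print(total, unreachable)
```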
Abstract Base Classes will finally allow us some semblance of an interface. Glad that this is one less reason to look to Java when teaching the principles and benefits of OO programming.
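A minimal sketch of what that interface-like behavior looks like, using the `abc` module. The `Shape` and `Square` classes are invented for illustration. This uses the modern `abc.ABC` helper class; on the release being discussed you would set `ABCMeta` as the metaclass instead.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Hypothetical 'interface': subclasses must provide area()."""

    @abstractmethod
    def area(self):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# A class with unimplemented abstract methods cannot be instantiated.
try:
    Shape()
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False

square = Square(3)
print(abstract_instantiable, square.area(), isinstance(square, Shape))
```

The check happens at instantiation time rather than at compile time, so it is closer to a runtime contract than to a Java interface, but it does catch a subclass that forgets a required method.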
I would prefer not having a GIL, anyway, but this is far better than nothing.