Accessing a global object is about the simplest benchmark that actually exercises the locks, and it still shows a massive slowdown that is not offset by the moderate general speedups since 3.9:
    x = 0

    def f():
        global x
        for i in range(100000000):
            x += i

    f()
    print(x)
Results:
3.9: 7.1s
3.11: 5.9s
3.14: 6.5s
3.14-nogil: 8.4s
That is a nogil slowdown of 18% compared to 3.9, 42% compared to 3.11, and 29% compared to 3.14. These numbers are in line with previous attempts at GIL removal that were rejected because they didn't come from Facebook.
Please do not complain about the global object. Using a pure function would obviously be a useless benchmark for locking and real world Python code bases have far more intricate access patterns.
>Please do not complain about the global object. Using a pure function would obviously be a useless benchmark for locking and real world Python code bases have far more intricate access patterns.
Just because there's a lot of shit Python code out there, doesn't mean people who want to write clean, performant Python code should suffer for it.
If you are accessing a global in a loop you might want to assign it to a local variable first. From what I remember that should result in a speedup in all python versions.
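A minimal sketch of that local-alias trick (timings left out, since they vary by version; the function names here are made up for illustration):

```python
N = 1_000_000
x = 0

def f_global():
    # Touches the module-level global on every iteration:
    # each += is a dict-based global load plus a global store.
    global x
    for i in range(N):
        x += i

def f_local():
    # Accumulate in a fast local, write the global back once at the end.
    global x
    acc = x
    for i in range(N):
        acc += i
    x = acc

x = 0
f_global()
g_result = x

x = 0
f_local()
l_result = x

assert g_result == l_result  # same answer, far fewer global accesses
```

Both versions compute the same sum; the local-accumulator one just avoids hammering the global namespace (and, on the free-threaded build, its locking) inside the hot loop.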
That's correct. Flask has a global request context object, so by design it can only safely handle a single request at a time per Python interpreter. If you want to parallelize multiple Flask servers, you spin up multiple interpreters.
Web services in Python that want to handle multiple concurrent requests in the same interpreter should use a web framework designed around that expectation, one that doesn't rely on a global request context object, such as FastAPI.
You misunderstand. The "request" or "g" objects in Flask are proxies which access the actual objects through contextvars, which are effectively thread-local storage with some extra sugar. The context stack of a contextvar is already within the TLS and therefore always bound to a specific thread.
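A small sketch of that behavior, using a made-up ContextVar as a stand-in for Flask's request proxy:

```python
import contextvars
import threading

# Hypothetical stand-in for Flask's "request" proxy: a ContextVar holding
# whatever request the current thread/context is handling.
current_request = contextvars.ContextVar("current_request")

seen = {}

def handle(path):
    # Each thread starts with its own top-level context, so this set()
    # is invisible to every other thread -- the thread-local behavior
    # the Flask proxies rely on.
    current_request.set({"path": path})
    seen[path] = current_request.get()["path"]

threads = [threading.Thread(target=handle, args=(p,)) for p in ("/a", "/b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
# Each handler saw only its own value: seen == {"/a": "/a", "/b": "/b"}
```

So the "global" proxy is really per-context state; two threads setting it concurrently never observe each other's value.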
Do you mean to say that mutating globals is not commonly used?
Because literally every import, class definition, or function definition that you make at top-level is a global.
Now some people do in fact do all those things inside a function, too, and then call that function as the only thing that actually happens globally. And I've done such hacks myself to squeeze the last few % of perf out of CPython on the very rare occasions where you need to do that but dropping into C is not an option. But that's certainly not idiomatic Python.
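For concreteness, a minimal sketch of that hack (names invented here): hot module-level code gets moved into a function so name lookups become fast locals instead of dict-based global lookups.

```python
def main():
    # 'total' and 'i' are function locals here, accessed via fast
    # array-indexed LOAD_FAST instead of global dict lookups.
    total = 0
    for i in range(1_000_000):
        total += i
    return total

# The only thing that happens at module level is calling it.
result = main()
```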
This is a silly benchmark though. Look at pyperformance if you want something that might represent real script/application performance. Generally 3.14t is about 0.9x the performance of the default build. That depends on a lot of things though.
> It's also common / Pythonic to use uppercase L for lists.
Variables always start with a lowercase letter in idiomatic Python unless they're constants or types.
Using single-letter uppercase for variables is not unusual in ML Python code, but that also happens to be one of the worst ecosystems when it comes to idiomatic Python and general code quality.
As mentioned in the article, others might have different constraints that make the GIL worth it for them; since both versions of Python are available anyways, it's a win in my book.
Even though technically, everything in Python is an object, I feel strongly that programmers should avoid OOP in Python like the plague. Every object is a petri dish for state corruption.
There is a very solid list of reasons to use pure functions with explicit argument passing wherever humanly possible, and I personally believe there is no comparable list of reasons to use OOP:
* Stack-allocated primitives need no refcounting
* Immutable structures reduce synchronization
* Data locality improves when you pass arrays/structs rather than object graphs
* Pure functions can be parallelized without locks
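The last point can be sketched in a few lines: a pure function applied to disjoint chunks needs no locks, because no call touches shared state (chunking scheme and names are just for illustration).

```python
from concurrent.futures import ThreadPoolExecutor

def square_sum(chunk):
    # Pure: the result depends only on the input chunk; nothing shared
    # is read or written, so no synchronization is needed.
    return sum(i * i for i in chunk)

data = list(range(10_000))
chunks = [data[i::4] for i in range(4)]  # four disjoint slices

with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(square_sum, chunks))

total = sum(partials)
assert total == sum(i * i for i in data)  # no locks anywhere
```

On the GIL build the threads won't actually run this CPU-bound work in parallel, but the point stands either way: purity is what makes the lock-free version correct.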
I'm 100% on board regarding the use of pure functions and eschewing shared mutable state wherever practical. (Local mutable state is fine.)
But OOP does not necessitate mutable state; you can do OOP with immutable objects and pure methods (except the constructor). Objects are collections of partially-applied functions (which is explicit in Python) that also conceal internal details within their own namespace. It is convenient in certain cases.
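A quick sketch of that style with a frozen dataclass (the Vector class is invented for illustration): the constructor is the only "mutation", and every method returns a new object.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # attribute assignment after __init__ raises
class Vector:
    x: float
    y: float

    def scaled(self, k):
        # Pure method: builds a new Vector, never mutates self.
        return Vector(self.x * k, self.y * k)

v = Vector(1.0, 2.0)
w = v.scaled(3.0)
# w is a new object; v is untouched, so sharing v across threads is safe
```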
I haven’t seriously used Python in over 15 years, but I assume the comparison is against using a preforking server with 1+ process per core.
The question is whether 1+ thread per core with GIL-free Python performs as well as 1+ process per core with the GIL.
My understanding is that this global is just meant to demonstrate that the fine-grained locking in the GIL-free version may leave preforking servers more performant after all.
An incrementing global counter is a pretty common scenario if your goal is to have guaranteed-unique IDs assigned to objects within a process, especially if you want them to be sequential too. I've got counters like that in various parts of code I've shipped, typically incremented using atomics.
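CPython has no user-level atomic integer, so a lock-guarded counter is the portable equivalent of those atomics; a minimal sketch (class and names invented here):

```python
import threading

class IdAllocator:
    # Process-wide sequential ID counter; the lock makes the
    # read-increment-return step atomic across threads.
    def __init__(self):
        self._next = 0
        self._lock = threading.Lock()

    def allocate(self):
        with self._lock:
            nid = self._next
            self._next += 1
            return nid

ids = IdAllocator()
seen = []

def worker():
    for _ in range(1000):
        seen.append(ids.allocate())  # list.append is thread-safe in CPython

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert len(set(seen)) == 4000  # every ID unique across all threads
```

Without the lock, two threads can read the same `self._next` before either writes it back and hand out duplicate IDs; that race is exactly what both the GIL-era bytecode granularity and the free-threaded build's per-object locks do not protect you from.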