The limitations of threads in Python
Properties of Thread and Process in Python
Thread
- Only one thread can execute Python bytecode at a time, so threads cannot run in parallel
- When a new thread is spawned, the code segment and global data segment are shared between threads → starting a thread wastes far less time than forking a new process
Process
- When a new process is forked, the entire code and data segments are copied to the new process → time-consuming (a rough timing sketch follows this list)
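As a rough illustration of that cost difference, here is a minimal sketch (the helper names time_thread_start and time_process_start are mine, not part of the benchmark below) that times starting and joining a no-op thread versus a no-op process. Absolute numbers vary by OS and start method; only the gap matters.

import time
import threading
import multiprocessing

def noop():
    pass

def time_thread_start():
    # Starting a thread only needs a new stack; code and globals are shared.
    start = time.time()
    t = threading.Thread(target=noop)
    t.start()
    t.join()
    return time.time() - start

def time_process_start():
    # Starting a process spawns/forks a whole interpreter with its own memory.
    start = time.time()
    p = multiprocessing.Process(target=noop)
    p.start()
    p.join()
    return time.time() - start

if __name__ == "__main__":
    print(f"thread start/join:  {time_thread_start():.6f}s")
    print(f"process start/join: {time_process_start():.6f}s")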
Benchmark
There are 2 types of tasks in Python:
- IO-bound: tasks dominated by IO, such as calling HTTP APIs or reading files
- CPU-bound: tasks dominated by heavy computation that can keep multiple CPUs busy
There are 2 parallel processing methods in Python:
- multithreading
- multiprocessing
We use the following program to compare the efficiency of multithreading vs. single-threading and multiprocessing vs. single-processing on the 2 types of tasks.
- The measure function measures the execution time of a function.
- The fib function computes Fibonacci numbers recursively, representing the CPU-bound task.
- The get_user function calls the GitHub API, representing the IO-bound task.
import threading
import multiprocessing
import time

import requests


def measure(func):
    def f(*args, **kwargs):
        start = time.time()
        ret = func(*args, **kwargs)
        print(f"{func.__name__} takes {time.time()-start} seconds")
        return ret
    return f


def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


def get_user():
    return requests.get("https://api.github.com/users/magiskboy")


@measure
def cpu_bounded_in_multiple_process():
    p1 = multiprocessing.Process(target=fib, args=(40,), daemon=True)
    p2 = multiprocessing.Process(target=fib, args=(40,), daemon=True)
    p1.start()
    p2.start()
    p1.join()
    p2.join()


@measure
def cpu_bounded_in_multiple_thread():
    t1 = threading.Thread(target=fib, args=(40,), daemon=True)
    t2 = threading.Thread(target=fib, args=(40,), daemon=True)
    t1.start()
    t2.start()
    t1.join()
    t2.join()


@measure
def cpu_bounded_in_single_thread():
    fib(40)
    fib(40)


@measure
def io_bounded_in_single_thread(n_iters=10):
    for i in range(n_iters):
        get_user()


@measure
def io_bounded_in_multiple_threads(n_iters=10):
    ts = []
    for i in range(n_iters):
        t = threading.Thread(
            target=get_user,
            daemon=True,
        )
        t.start()
        ts.append(t)

    for t in ts:
        t.join()


if __name__ == "__main__":
    cpu_bounded_in_single_thread()
    cpu_bounded_in_multiple_thread()
    cpu_bounded_in_multiple_process()
    io_bounded_in_single_thread()
    io_bounded_in_multiple_threads()
Observations
- For IO-bound tasks, multithreading is more efficient than single-threading.
- For CPU-bound tasks, multiprocessing is more efficient than multithreading.
- For CPU-bound tasks, single-threading is more efficient than multithreading.

Q&A
Q: Why is multithreading more efficient than single-threading for IO-bound tasks even though we cannot run multiple threads at the same time in Python?
A: When a thread blocks on IO, it releases the GIL, so another thread can run while the first one waits for the IO result. The waiting times of the threads overlap, so we still get concurrency even though only one thread executes Python code at a time.
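A minimal way to observe this overlap, assuming time.sleep stands in for a blocking IO call (sleep, like most blocking calls in the standard library, releases the GIL): two threads that each "wait" 1 second finish in roughly 1 second total, not 2.

import threading
import time

def fake_io():
    # time.sleep releases the GIL while waiting, just like a blocking socket read.
    time.sleep(1)

start = time.time()
threads = [threading.Thread(target=fake_io) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"two overlapping waits took {time.time() - start:.2f}s")  # ~1s, not ~2s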
Q: Why is multiprocessing more efficient than multithreading for CPU-bound tasks?
A: In CPython, threads cannot run Python code in parallel because only 1 thread can hold the GIL at a time. With multiprocessing, each process has its own interpreter and its own GIL, so multiple processes can run on different CPU cores at the same time.
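As an illustration, the same pair of fib(40) calls could also be spread across worker processes with concurrent.futures.ProcessPoolExecutor; this is a sketch equivalent in spirit to cpu_bounded_in_multiple_process above, not code from the original benchmark.

from concurrent.futures import ProcessPoolExecutor

def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

if __name__ == "__main__":
    # Each worker process has its own interpreter and its own GIL,
    # so the two fib(40) calls can run on two CPU cores in parallel.
    with ProcessPoolExecutor(max_workers=2) as pool:
        results = list(pool.map(fib, [40, 40]))
    print(results)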
Q: For tasks that use multiple CPUs, why is single-threading more efficient than multithreading?
A: In Python, only 1 thread can run at a time, so even with multiple threads there is no parallel execution. In fact, multithreading is slower in this case because of the extra overhead of context switching and GIL contention between the threads.
GIL - Global Interpreter Lock
The GIL is a mutex that ensures only 1 thread executes Python bytecode at a time.
While a thread holds the GIL, the interpreter's shared state (the process's data) is locked to all other threads; only the executing thread can access it.
There are 2 reasons for using a single mutex lock for the entire data of the process:
- it avoids deadlock, since there is only one lock to acquire
- it reduces the cost of acquiring, releasing, and tracking many fine-grained locks, avoiding overhead
You can read more about the GIL here: https://github.com/python/cpython/blob/main/Python/ceval_gil.c#L14
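One detail you can inspect from Python itself is the GIL switch interval: sys.getswitchinterval() reports how long a thread may keep the GIL before the interpreter asks it to hand the lock to another thread (0.005 s by default).

import sys

# How long a thread may run before CPython requests a GIL hand-off (default 0.005s).
print(sys.getswitchinterval())

# The interval is tunable, but changing it only trades context-switch overhead
# against responsiveness; it does not let two threads run bytecode at once.
sys.setswitchinterval(0.01)
print(sys.getswitchinterval())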
Conclusion
- For tasks that require a lot of IO interaction, use multithreading (a pooled variant is sketched after this list).
- For tasks that are computation-heavy and need more CPU, do not use multithreading; optimize the program to run single-threaded instead. If the problem is really "heavy", you can use multiprocessing.
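As a practical variant of the IO-bound benchmark, the standard library's concurrent.futures.ThreadPoolExecutor wraps the manual start()/join() loop from io_bounded_in_multiple_threads in a pool; the sketch below only illustrates that pattern and is not code from the original post.

from concurrent.futures import ThreadPoolExecutor

import requests

def get_user(_):
    # IO-bound: the thread releases the GIL while waiting on the network.
    return requests.get("https://api.github.com/users/magiskboy")

if __name__ == "__main__":
    # The pool replaces the manual thread start()/join() loop from the benchmark.
    with ThreadPoolExecutor(max_workers=10) as pool:
        responses = list(pool.map(get_user, range(10)))
    print([r.status_code for r in responses])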