Multithreading vs Multiprocessing in Python
Pre-requisites: Multithreading and Multiprocessing
Remember the golden rule: multithreading for I/O-bound tasks and multiprocessing for CPU-bound tasks.
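The rule can be sketched with the standard concurrent.futures module. This is a minimal illustration, not a benchmark: io_task, cpu_task and the two run_* helpers are hypothetical names, time.sleep stands in for real I/O, and a pure-Python sum stands in for real computation.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def io_task(delay: float) -> float:
    """Stand-in for an I/O-bound call (network, disk); sleeping releases the GIL."""
    time.sleep(delay)
    return delay


def cpu_task(n: int) -> int:
    """Stand-in for CPU-bound work: a pure-Python sum of squares."""
    return sum(i * i for i in range(n))


def run_io_in_threads(delays, max_workers: int = 4):
    # Threads overlap their waiting time, so several waits can be in flight at once.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(io_task, delays))


def run_cpu_in_processes(sizes):
    # Each worker process has its own interpreter and therefore its own GIL.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(cpu_task, sizes))


if __name__ == "__main__":
    print(run_io_in_threads([0.2, 0.2, 0.2, 0.2]))
    print(run_cpu_in_processes([100_000, 200_000]))
```

The process pool is started under the `if __name__ == "__main__":` guard because, on platforms that spawn worker processes, the main module is imported again in each worker.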
Python threads vs processes
The following table shows Python-specific details of programs, processes and threads.
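Global Interpreter Lock (GIL)
- Ensures thread safety inside the interpreter, since only one thread executes Python bytecode at a time
- Improves single-thread performance
- Prevents threads from running Python code in parallel
- Bad for CPU-bound tasks
- I/O-bound threads are hardly affected, because the GIL is released while a thread waits on I/O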
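One way to observe these points is to time the same CPU-bound loop run sequentially and then in several threads. The sketch below assumes a standard CPython build with the GIL enabled; count_down, timed, run_sequential and run_threaded are hypothetical helper names. On such a build the threaded run is usually no faster than the sequential one, but exact timings depend on the machine and Python version, so none are shown.

```python
import sys
import threading
import time


def count_down(n: int) -> None:
    """Pure-Python busy loop: CPU-bound, so it holds the GIL while running."""
    while n > 0:
        n -= 1


def timed(fn, *args) -> float:
    """Return the wall-clock seconds taken by fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


def run_sequential(n: int, repeats: int) -> None:
    for _ in range(repeats):
        count_down(n)


def run_threaded(n: int, repeats: int) -> None:
    # Same total work, but split across `repeats` threads.
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


if __name__ == "__main__":
    n = 2_000_000
    print(f"switch interval: {sys.getswitchinterval()} s")
    print(f"sequential: {timed(run_sequential, n, 4):.2f} s")
    print(f"4 threads:  {timed(run_threaded, n, 4):.2f} s")
```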
Outcomes
The following benchmarks were run on an 8-core / 16-thread CPU.
We measure the number of function calls completed; higher is better.
(NOTE: in the results below, multithreading uses 1 process with 4, 8 or 16 threads. Multiprocessing uses 4, 8 or 16 processes with 1 thread each.)
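The benchmark code itself is not shown here. A harness of this kind could be sketched as follows, under stated assumptions: work is a hypothetical CPU-bound function, each worker counts how many calls it completes in a fixed time window, and the totals are compared for 4, 8 and 16 threads versus processes. This is not the original benchmark, and it produces no reference numbers.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def work() -> int:
    """Hypothetical unit of CPU-bound work; one call counts once in the tally."""
    return sum(i * i for i in range(1_000))


def count_calls(duration: float) -> int:
    """Call work() repeatedly for `duration` seconds and return the number of calls."""
    deadline = time.perf_counter() + duration
    calls = 0
    while time.perf_counter() < deadline:
        work()
        calls += 1
    return calls


def benchmark(executor_cls, workers: int, duration: float) -> int:
    """Total calls completed by `workers` workers running at the same time."""
    with executor_cls(max_workers=workers) as pool:
        return sum(pool.map(count_calls, [duration] * workers))


if __name__ == "__main__":
    for workers in (4, 8, 16):
        threads = benchmark(ThreadPoolExecutor, workers, 1.0)
        procs = benchmark(ProcessPoolExecutor, workers, 1.0)
        print(f"{workers:>2} workers: threads={threads} calls, processes={procs} calls")
```

Each worker starts its own clock, so process start-up time is not counted against the processes in this sketch.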
Conclusion
- Python is multithreaded
- In standard CPython its threads are concurrent but not parallel
- CPython will consider switching threads every 5 ms (the Python 3 default reported by sys.getswitchinterval()) or when an I/O operation is encountered
- Multi-threaded does not strictly mean single-core
- The OS may move the Python process between physical and virtual CPU cores!
- multiprocessing is the standard Python way to increase processing power if needed
- Many numerical libraries (NumPy, SciPy, TensorFlow) run truly parallel threads behind the scenes, in native code that releases the GIL
- Since threads share the data of their process, that data may need to be protected with Lock()
- In multiprocessing, Pipe() and Queue() are used to share data between processes
- Data in a Pipe() may be corrupted if two processes or threads use the same end at the same time
- It is NOT thread safe
- Queue() can be accessed by multiple producers and consumers
- It is thread and process safe
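The last points can be sketched with the standard library: a Lock() guarding data shared by threads, and a Queue() carrying results back from worker processes. The helper names (add_with_lock, locked_count, square_worker) and the sample data are hypothetical, chosen only for illustration.

```python
import threading
from multiprocessing import Process, Queue


def add_with_lock(counter, lock, times):
    for _ in range(times):
        with lock:  # only one thread at a time may update the shared value
            counter["value"] += 1


def locked_count(n_threads, times):
    """Increment one shared counter from several threads, protected by a Lock."""
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_with_lock, args=(counter, lock, times))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


def square_worker(numbers, results):
    # Runs in a child process; results travel back to the parent through the queue.
    for n in numbers:
        results.put((n, n * n))


if __name__ == "__main__":
    print(locked_count(4, 10_000))  # 4 threads x 10,000 increments = 40000

    results = Queue()
    chunks = [[1, 2], [3, 4]]
    procs = [Process(target=square_worker, args=(chunk, results)) for chunk in chunks]
    for p in procs:
        p.start()
    # Drain the queue before join(), so no worker blocks on unsent items.
    collected = dict(results.get() for _ in range(4))
    for p in procs:
        p.join()
    print(sorted(collected.items()))
```

Several processes write to the same Queue() here without extra locking. With Pipe(), each end would need a single reader or writer at a time.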