Python threading and multiprocessing basics
threading module, Thread class, GIL, I/O-bound vs CPU-bound, multiprocessing module, Process class, concurrent.futures, ThreadPoolExecutor
Concurrency in Python
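CPython has a Global Interpreter Lock (GIL) that lets only one thread execute Python bytecode at a time, so threads do not run CPU-bound Python code in parallel. Threads are still effective for I/O-bound tasks, because the GIL is released while a thread waits on I/O. Separate processes each have their own interpreter and their own GIL, so they can run CPU-bound work in parallel.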
import threading
import time

def worker(name, delay):
    time.sleep(delay)
    print(f"{name} done")

t1 = threading.Thread(target=worker, args=("A", 1))
t2 = threading.Thread(target=worker, args=("B", 2))
t1.start()
t2.start()
t1.join()  # wait for both
t2.join()
print("All done")
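To make the I/O-bound claim concrete, the sketch below times the same simulated I/O task run sequentially and then in threads. The helper names (fake_io, run_sequential, run_threaded) and the delays are illustrative assumptions, not part of any library; time.sleep stands in for a real network or disk wait.

```python
import threading
import time


def fake_io(delay):
    """Hypothetical stand-in for a blocking I/O call."""
    time.sleep(delay)


def run_sequential(count, delay):
    """Run the task `count` times back to back; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(count):
        fake_io(delay)
    return time.perf_counter() - start


def run_threaded(count, delay):
    """Run the task in `count` threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"sequential: {run_sequential(4, 0.2):.2f}s")
    print(f"threaded:   {run_threaded(4, 0.2):.2f}s")
```

On a typical machine the sequential run should take roughly four times the delay and the threaded run roughly one delay, since a sleeping thread does not hold the GIL.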
ThreadPoolExecutor (recommended)
import time
from concurrent.futures import ThreadPoolExecutor

urls = ["url1", "url2", "url3"]

def fetch(url):
    time.sleep(0.5)  # simulate I/O
    return f"fetched {url}"

with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(fetch, urls))  # results keep the order of urls
print(results)
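pool.map returns results in input order, and an exception raised by a task surfaces only when that task's result is retrieved. When results should be handled as they finish, or failures handled per task, submit combined with as_completed is an alternative. A minimal sketch, assuming a hypothetical fetch that fails for one placeholder URL:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(url):
    """Hypothetical fetch: sleeps to simulate I/O, fails for one placeholder URL."""
    time.sleep(0.1)
    if url == "bad-url":
        raise ValueError(f"cannot fetch {url}")
    return f"fetched {url}"


def fetch_all(urls, max_workers=3):
    """Return (results, errors) dicts keyed by URL."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each Future back to the URL it was submitted for
        future_to_url = {pool.submit(fetch, url): url for url in urls}
        for future in as_completed(future_to_url):
            url = future_to_url[future]
            try:
                # result() re-raises the task's exception, if there was one
                results[url] = future.result()
            except ValueError as exc:
                errors[url] = str(exc)
    return results, errors


if __name__ == "__main__":
    ok, failed = fetch_all(["url1", "bad-url", "url3"])
    print(ok)
    print(failed)
```

Keying the futures by URL keeps one failed request from hiding the results of the others.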
Multiprocessing
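from concurrent.futures import ProcessPoolExecutor

def cpu_task(n):
    return sum(i**2 for i in range(n))

# The guard is required where workers are started with "spawn" (the default on
# Windows and macOS), because each worker re-imports this module.
if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        results = list(pool.map(cpu_task, [10**6] * 4))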
Use ThreadPoolExecutor for network/file I/O and ProcessPoolExecutor for CPU-heavy computation. Both are cleaner and safer than managing threads or processes manually.
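Since both executors expose the same map and submit methods, the choice between them can be reduced to a single parameter. The sketch below assumes two hypothetical helpers, choose_executor and run_tasks; neither is part of concurrent.futures.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def square_sum(n):
    """Example task; defined at module top level so a process pool could pickle it."""
    return sum(i * i for i in range(n))


def choose_executor(cpu_bound):
    """Hypothetical helper: pick the executor class for the kind of work."""
    return ProcessPoolExecutor if cpu_bound else ThreadPoolExecutor


def run_tasks(executor_cls, func, inputs, max_workers=4):
    """Run func over inputs with whichever executor class is passed in."""
    with executor_cls(max_workers=max_workers) as pool:
        return list(pool.map(func, inputs))


if __name__ == "__main__":
    inputs = [10, 100, 1000]
    print(run_tasks(choose_executor(cpu_bound=False), square_sum, inputs))
    # For CPU-heavy inputs, choose_executor(cpu_bound=True) selects
    # ProcessPoolExecutor; the run_tasks call itself stays the same.
```

The calling code does not change when the executor class does, provided the task function and its arguments can be pickled for the process-based case.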
Concurrency is a broad topic: Python offers threads, processes, and coroutines (asyncio), each suited to different problems. For I/O-bound work with existing synchronous libraries, threads are the pragmatic choice. For CPU-intensive Python code such as image processing or numerical work, processes are needed to get around the GIL. For high-concurrency I/O with thousands of simultaneous connections (web servers, chat systems), asyncio is usually the most efficient approach, because a single thread can service many connections without a thread per connection. concurrent.futures provides a consistent API for both threads and processes, making it easy to swap between them as requirements change.
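The asyncio option mentioned above can be sketched as follows. This is a minimal illustration, not a server: handle_connection and serve_many are hypothetical names, and asyncio.sleep stands in for real network I/O.

```python
import asyncio


async def handle_connection(conn_id, delay):
    """Hypothetical handler: awaits a sleep in place of real network I/O."""
    await asyncio.sleep(delay)
    return f"connection {conn_id} served"


async def serve_many(count, delay=0.1):
    """Run `count` handlers concurrently on one thread and collect the results."""
    handlers = [handle_connection(i, delay) for i in range(count)]
    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*handlers)


if __name__ == "__main__":
    results = asyncio.run(serve_many(1000))
    print(len(results), results[0])
```

All 1000 handlers here share one thread and should finish in roughly one delay, since each handler yields control to the event loop while it waits.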
