Python is widely used for data processing, automation, and scripting, but it’s often criticized for being slow in CPU-bound tasks due to the Global Interpreter Lock (GIL). Fortunately, Python’s built-in multiprocessing module allows you to fully utilize multiple CPU cores, enabling true parallelism and greatly improving performance.
In this article, you’ll learn how multiprocessing works, how it differs from multithreading, and how to implement it to make your Python programs faster and more efficient.
Why Use Multiprocessing?
When to Use:
- Heavy computations (e.g., numerical simulations, data crunching)
- CPU-bound tasks (vs. I/O-bound tasks, which are better served by threading)
- Parallel processing of independent tasks (e.g., converting a batch of images)
Multiprocessing vs. Multithreading
| Feature | Multiprocessing | Multithreading |
|---|---|---|
| Uses multiple… | Processes (separate memory) | Threads (shared memory) |
| Bypasses GIL? | Yes | No (in CPython) |
| Ideal for… | CPU-bound tasks | I/O-bound tasks |
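To make the difference concrete, here is a minimal timing sketch (the count_down function and workload size are illustrative) that runs the same CPU-bound loop with four threads and then with four processes. On a multi-core machine, only the process version should show a real speedup:

import time
from multiprocessing import Pool
from threading import Thread

def count_down(n):
    # Pure-Python CPU-bound loop; threads can't run this in parallel under the GIL
    while n > 0:
        n -= 1

if __name__ == '__main__':
    N = 10_000_000

    start = time.time()
    threads = [Thread(target=count_down, args=(N,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"4 threads:   {time.time() - start:.2f}s")  # roughly sequential time

    start = time.time()
    with Pool(processes=4) as pool:
        pool.map(count_down, [N] * 4)
    print(f"4 processes: {time.time() - start:.2f}s")  # close to a quarter of the time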
Getting Started with multiprocessing
Step 1: Import the module
import multiprocessing
Step 2: Define a function to run in parallel
def square(n):
    return n * n
Step 3: Use a Pool to distribute tasks
if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        numbers = [1, 2, 3, 4, 5]
        results = pool.map(square, numbers)
    print(results)
Output:
[1, 4, 9, 16, 25]
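One note on sizing: calling Pool() with no argument creates one worker per CPU core (os.cpu_count()). The minimal sketch below sets the count explicitly; square is the same function defined in Step 2:

import multiprocessing
import os

def square(n):
    return n * n

if __name__ == '__main__':
    # Pool() with no argument defaults to os.cpu_count() workers
    with multiprocessing.Pool(processes=os.cpu_count()) as pool:
        print(pool.map(square, [1, 2, 3, 4, 5]))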
Example: Parallel Processing with Pool
from multiprocessing import Pool
import time
def slow_task(x):
    time.sleep(1)
    return x * 2
if __name__ == '__main__':
    start = time.time()
    with Pool(processes=4) as pool:  # Use 4 worker processes
        results = pool.map(slow_task, range(8))
    print(f"Results: {results}")
    print(f"Time taken: {time.time() - start:.2f} seconds")
With 4 workers, the 8 one-second tasks finish in about 2 seconds instead of 8: a 4x speedup.
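pool.map passes a single argument to the function. If your function takes several, Pool.starmap unpacks each input tuple into positional arguments; here is a minimal sketch (the scale function is illustrative):

from multiprocessing import Pool

def scale(x, factor):
    return x * factor

if __name__ == '__main__':
    pairs = [(1, 10), (2, 10), (3, 10)]
    with Pool(processes=2) as pool:
        results = pool.starmap(scale, pairs)  # each tuple becomes scale(x, factor)
    print(results)  # [10, 20, 30]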
Using Process for More Control
For more custom behavior:
from multiprocessing import Process
def greet(name):
    print(f"Hello, {name}!")

if __name__ == '__main__':
    p1 = Process(target=greet, args=('Alice',))
    p2 = Process(target=greet, args=('Bob',))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
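If you need per-process state or custom startup logic, you can also subclass Process and override run(). A minimal sketch, with an illustrative Worker class name:

import os
from multiprocessing import Process

class Worker(Process):
    def run(self):
        # run() executes in the child process once start() is called
        print(f"{self.name} running in PID {os.getpid()}")

if __name__ == '__main__':
    w = Worker(name='worker-1')
    w.start()
    w.join()
    print(f"Exit code: {w.exitcode}")  # 0 indicates a clean exit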
Sharing Data Between Processes
Use multiprocessing.Queue or multiprocessing.Value for inter-process communication.
Example with Queue:
from multiprocessing import Process, Queue
def worker(q):
    q.put('Data from worker')

if __name__ == '__main__':
    q = Queue()
    p = Process(target=worker, args=(q,))
    p.start()
    print(q.get())  # Output from worker
    p.join()
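Value was mentioned above but not shown, so here is a minimal sketch of a shared counter (the add_100 function name is illustrative):

from multiprocessing import Process, Value

def add_100(counter):
    for _ in range(100):
        with counter.get_lock():  # Value's built-in lock makes the update safe
            counter.value += 1

if __name__ == '__main__':
    counter = Value('i', 0)  # 'i' = signed int, initialized to 0
    procs = [Process(target=add_100, args=(counter,)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)  # 400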
Tips and Best Practices
- Always guard process-spawning code with if __name__ == "__main__" so child processes don't re-execute it on import (on Windows, a missing guard leads to runaway process creation).
- Use Pool for mapping a function over a list of inputs.
- Don't share regular Python objects across processes; use Queue, Value, or Manager objects instead.
- Use concurrent.futures.ProcessPoolExecutor (Python 3.4+) for a higher-level API, as shown in the sketch after this list.
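As a taste of that higher-level API, here is a minimal sketch that reproduces the earlier square example with concurrent.futures:

from concurrent.futures import ProcessPoolExecutor

def square(n):
    return n * n

if __name__ == '__main__':
    # executor.map works like Pool.map but returns a lazy iterator
    with ProcessPoolExecutor(max_workers=4) as executor:
        results = list(executor.map(square, range(5)))
    print(results)  # [0, 1, 4, 9, 16]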
Performance Comparison
| Task Type | Sequential Time | Multiprocessing Time (4 workers) |
|---|---|---|
| 4 slow tasks (1 s each) | 4 sec | ~1 sec |
| Image processing | 20 sec | ~5 sec |
| Large math calculation | 10 sec | ~2-3 sec |
Summary
| Feature | Description |
|---|---|
| multiprocessing.Pool | Best for parallelizing a function over data |
| multiprocessing.Process | Full control over separate processes |
| Queue / Value | Share data between processes |
| Avoid shared state | Each process has its own memory |
Multiprocessing is a powerful feature in Python that allows your scripts to run faster and more efficiently on multi-core systems. Whether you're processing files, analyzing data, or building a high-performance application, using multiprocessing can drastically reduce execution time and improve throughput.
