Common Pitfalls (pickling, __main__)

1) Missing main guard

Always use:

main_guard.py
if __name__ == "__main__":
    pass

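Without this guard, platforms that start workers with the spawn method (the default on Windows and macOS) re-import your main module in every child process. Any process-creating code at module level then runs again in each child, which typically ends in a RuntimeError or in child processes being spawned repeatedly.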
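As a minimal sketch of the guard in use, the script below starts one child process from inside a main() function. The names greet and main are illustrative and not part of any API.

```python
import multiprocessing as mp


def greet(name):
    """Runs in the child process."""
    print(f"hello from {name}")


def main():
    # Processes are created only here, so importing this module has no side effects.
    worker = mp.Process(target=greet, args=("worker",))
    worker.start()
    worker.join()
    return worker.exitcode


if __name__ == "__main__":
    # Only the original script reaches this line; a re-import in a child does not.
    main()
```

Keeping process creation inside main() means a child that re-imports the module only defines the functions and never starts new processes itself.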

2) Pickling errors

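Multiprocessing must serialize (pickle) the functions and data it sends to worker processes, so everything you pass to a worker or return from one has to be picklable.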

Avoid:

  • lambdas
  • nested functions
  • open file handles
  • database connections

Prefer:

  • top-level functions
  • simple data (numbers/strings/lists/dicts)
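One way to check an object before handing it to a worker is to try pickling it yourself. This is a rough sketch: can_pickle is a hypothetical helper, and it uses the standard pickle module as a stand-in for what multiprocessing does internally.

```python
import pickle


def square(n):
    """Top-level function: pickled by reference to its module and name."""
    return n * n


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except Exception:
        # Unpicklable objects raise different exception types
        # (PicklingError, AttributeError, TypeError), so catch broadly.
        return False


if __name__ == "__main__":
    print("top-level function:", can_pickle(square))
    print("lambda:", can_pickle(lambda n: n * n))
    print("plain data:", can_pickle({"ids": [1, 2, 3]}))
```

For file handles and database connections, a common approach is to pass a path or connection string and open the resource inside the worker.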

3) Oversubscribing CPUs

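Creating more processes than you have CPU cores can slow down CPU-bound work: the processes compete for the same cores, and each one adds startup, memory, and context-switching overhead.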

Guidance:

  • start with os.cpu_count()
  • benchmark for your workloads
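The guidance above might look like the sketch below, which caps the pool size at the reported core count. pick_worker_count is a hypothetical helper, and the cap is only a starting point to benchmark against, not a rule.

```python
import os
from multiprocessing import Pool


def square(n):
    return n * n


def pick_worker_count(requested=None):
    """Return a pool size between 1 and the number of reported CPUs."""
    available = os.cpu_count() or 1  # os.cpu_count() can return None
    if requested is None:
        return available
    return max(1, min(requested, available))


if __name__ == "__main__":
    with Pool(processes=pick_worker_count()) as pool:
        print(pool.map(square, range(10)))
```

Workloads that spend most of their time waiting on I/O may behave differently, which is why measuring with your own tasks matters more than any fixed number.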

4) Returning huge data

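Sending massive arrays through a Queue can be slow, because every item is pickled by the sender, copied through a pipe, and unpickled by the receiver.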

Options:

  • write output to files
  • batch results
  • aggregate inside workers
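As an illustration of batching and aggregating inside workers, the sketch below sends chunks of input to a pool and gets back one number per chunk instead of a full list of results. The function names, chunk size, and process count are assumptions chosen for the example.

```python
from multiprocessing import Pool


def chunked(items, size):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def sum_of_squares(chunk):
    # Aggregate in the worker: only a single integer travels back to the parent.
    return sum(x * x for x in chunk)


def total_sum_of_squares(items, chunk_size=1000, processes=2):
    """Combine the per-chunk totals computed by the pool."""
    with Pool(processes=processes) as pool:
        return sum(pool.map(sum_of_squares, chunked(items, chunk_size)))


if __name__ == "__main__":
    print(total_sum_of_squares(list(range(10_000))))
```

The input chunks still have to be pickled on the way in, so this helps most when the results would otherwise be much larger than a summary of them.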

🧪 Try It Yourself

Exercise 1 – Start a Process

Exercise 2 – Process Pool map()

Exercise 3 – Multiprocessing Queue
