Multiprocessing
This notebook will show you how to use parallel processing with navis.
By default, most NAVis functions use only a single thread/process (although some third-party functions used under the hood might use more). Distributing expensive computations across multiple cores can speed things up considerably.
Many NAVis functions natively support parallel processing. This notebook will illustrate various ways to use parallelism. Before we get started: NAVis uses pathos for multiprocessing. If you installed NAVis with pip install navis[all], you should be all set. If not, you can install pathos separately:
pip install pathos -U
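If you are unsure whether pathos is already available in your environment, a quick check along these lines can help. This is only a sketch using the standard library; has_module is a hypothetical helper name, not part of NAVis or pathos.

```python
import importlib.util


def has_module(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# `has_module` is a hypothetical helper, not a NAVis or pathos function
if has_module("pathos"):
    print("pathos found")
else:
    print("pathos missing, try: pip install pathos -U")
```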
Running NAVis functions in parallel
Since version 0.6.0, many NAVis functions accept a parallel=True and an (optional) n_cores parameter:
import time
import navis
def time_func(func, *args, **kwargs):
    """A function to time the execution of a function."""
    start = time.time()
    func(*args, **kwargs)
    print(f"Execution time: {round(time.time() - start, 2)}s")
# Load example neurons
nl = navis.example_neurons()
Important
This documentation is built on GitHub Actions, where the number of cores can be as low as 2. The speedup on your machine should be more pronounced than what you see below. That said, parallel processing has some overhead, and for small tasks the overhead can be larger than the speed-up.
Without parallel processing:
time_func (
navis.resample_skeleton,
nl,
resample_to=125
)
Out:
Execution time: 1.7s
With parallel processing:
time_func (
navis.resample_skeleton,
nl,
resample_to=125,
parallel=True
)
Out:
Execution time: 1.33s
The same also works for neuron methods!
Without parallel processing:
time_func (
nl.resample, 125
)
Out:
Execution time: 1.73s
With parallel processing:
time_func (
nl.resample, 125, parallel=True
)
Out:
Execution time: 1.28s
By default, parallel=True will use half the available CPU cores. You can adjust that behaviour using the n_cores parameter:
time_func (
nl.resample, 125, parallel=True, n_cores=2
)
Out:
Execution time: 1.33s
Note
The name n_cores is actually a bit misleading, as it determines the number of parallel processes that NAVis will spawn. There is nothing stopping you from setting n_cores to a number higher than the number of available CPU cores. However, doing so will likely over-subscribe your CPU and end up slowing things down.
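To avoid over-subscription, one option is to clamp the requested number of processes to the number of cores before passing it on as n_cores. The sketch below uses only the standard library. safe_n_cores is a hypothetical helper, and reading "half the available CPU cores" as os.cpu_count() // 2 is an assumption made here, not something this page confirms about NAVis internals.

```python
import os


def safe_n_cores(requested=None):
    """Return a process count that never exceeds the logical CPU count.

    Hypothetical helper, not part of NAVis.
    """
    # os.cpu_count() reports logical CPUs and may return None
    available = os.cpu_count() or 1
    if requested is None:
        # Assumption: "half the available cores", but never fewer than 1
        return max(1, available // 2)
    # Clamp the request to the range [1, available]
    return max(1, min(requested, available))


print(safe_n_cores())      # default: half the cores, at least 1
print(safe_n_cores(1000))  # capped at the number of logical CPUs
```

The result could then be passed on, for example as nl.resample(125, parallel=True, n_cores=safe_n_cores(8)).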
Parallelizing generic functions
For non-NAVis functions, you can use NeuronList.apply to parallelize them.
First, let's write a mock function that simply waits one second and then returns the number of nodes:
def my_func(x):
    import time
    time.sleep(1)
    return x.n_nodes
# Without parallel processing
time_func (
nl.apply, my_func
)
Out:
Execution time: 5.01s
# With parallel processing
time_func (
nl.apply, my_func, parallel=True
)
Out:
Execution time: 3.26s
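The timings above are roughly what a back-of-the-envelope estimate predicts. With equally long tasks and no overhead, the wall time is the number of rounds needed to work through all tasks, multiplied by the duration of one task. The sketch below assumes 5 neurons (implied by the 5.01s serial run at one second each) and 2 worker processes. The worker count is an assumption: it is consistent with the measured 3.26s but not confirmed by this page. ideal_wall_time is a hypothetical helper.

```python
import math


def ideal_wall_time(n_tasks, task_seconds, n_workers):
    """Lower bound on wall time for equal-length tasks, ignoring overhead."""
    # Each worker handles one task per round; the last round may be partly idle
    rounds = math.ceil(n_tasks / n_workers)
    return rounds * task_seconds


print(ideal_wall_time(5, 1.0, 1))  # 5.0 (serial)
print(ideal_wall_time(5, 1.0, 2))  # 3.0 (two workers, assumed)
```

The gap between the ideal 3.0s and the measured 3.26s would then be the overhead of starting processes and moving data between them.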