Technical Appendix 17b: Profiling code for speed and an application of the Gillespie algorithm

| Download notebook

# Colab setup ------------------
import os, sys, subprocess
if "google.colab" in sys.modules:
    cmd = "pip install --upgrade biocircuits line_profiler multiprocess watermark"
    process = subprocess.Popen(cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
# ------------------------------

try:
    import multiprocess
except:
    import multiprocessing as multiprocess

import numpy as np
import numba

import biocircuits

# Plotting modules
import bokeh.io
import bokeh.plotting

bokeh.io.output_notebook()

# Line profiler
%load_ext line_profiler

Loading BokehJS ...

It helps to have complete examples as starting points for building systems to simulate with the Gillespie algorithm. In this appendix, we set up a Gillespie simulation of the repressilator. We will then return to a simple example of gene expression to demonstrate how to assess the speed of the calculation and improve it.

Simulating the repressilator

We have seen a structure to how we can set up Gillespie simulations; we need to specify the propensities and updates, along with an initial population. It helps clarify the system being simulated and also avoids bugs if we make tables for

The species whose populations we are describing;
The update-propensity pairs;
The parameter values.

After constructing the tables, coding up the update and propensity functions is much easier.

To demonstrate this procedure, we will perform a Gillespie simulation for the repressilator, as described in Elowitz and Leibler. We will consider both RNA and DNA in a repressilator where gene 1 represses gene 2, gene 2 represses gene 3, and gene 3 represses gene 1. We explicitly consider mRNA and protein. A repressor binds its operon with chemical rate constant \(k_r\), and an operon may have zero, one, or two repressors bound. The unbinding rate of a repressor when one is bound is \(k_{u,1}\) and that of a repressor when two are bound if \(k_{u,2}\), with \(k_{u,2} < k_{u,1}\) to capture cooperativity. Transcription happens when no repressors are bound to a promoter region with rate constant \(k_{m,u}\) and happens when one or two repressors are bound with rate \(k_{m,o}\).

As we build the simulation, let’s start with a table of the populations we are keeping track of.

index	description	variable
0	gene 1 mRNA copy number	`m1`
1	gene 1 protein copy number	`p1`
2	gene 2 mRNA copy number	`m2`
3	gene 2 protein copy number	`p2`
4	gene 3 mRNA copy number	`m3`
5	gene 3 protein copy number	`p3`
6	Number of repressors bound to promoter of gene 3	`n1`
7	Number of repressors bound to promoter of gene 1	`n2`
8	Number of repressors bound to promoter of gene 2	`n3`

Note that we labeled each species with an index, which corresponds to its position in the array of populations in the simulation.

Next, we can set up a table of updates and propensities for the moves we allow in the Gillespie simulation. We also assign an index to each entry here, as this helps keep track of everything.

index	description	update	propensity
0	transcription of gene 1 mRNA	`m1 ⟶ m1 + 1`	`kmu(n3 == 0) + kmo(n3 > 0)`
1	transcription of gene 2 mRNA	`m2 ⟶ m2 + 1`	`kmu(n1 == 0) + kmo(n1 > 0)`
2	transcription of gene 3 mRNA	`m3 ⟶ m3 + 1`	`kmu(n2 == 0) + kmo(n2 > 0)`
3	translation of gene 1 protein	`p1 ⟶ p1 + 1`	`kp * m1`
4	translation of gene 2 protein	`p2 ⟶ p2 + 1`	`kp * m2`
5	translation of gene 3 protein	`p3 ⟶ p3 + 1`	`kp * m3`
6	degradation of gene 1 mRNA	`m1 ⟶ m1 - 1`	`gamma_m * m1`
7	degradation of gene 2 mRNA	`m2 ⟶ m2 - 1`	`gamma_m * m2`
8	degradation of gene 3 mRNA	`m3 ⟶ m3 - 1`	`gamma_m * m3`
9	degradation of unbound gene 1 protein	`p1 ⟶ p1 - 1`	`gamma_p * p1`
10	degradation of unbound gene 2 protein	`p2 ⟶ p2 - 1`	`gamma_p * p2`
11	degradation of unbound gene 3 protein	`p3 ⟶ p3 - 1`	`gamma_p * p3`
12	degradation of bound gene 1 protein	`n1 ⟶ n1 - 1`	`gamma_p * n1`
13	degradation of bound gene 2 protein	`n2 ⟶ n2 - 1`	`gamma_p * n2`
14	degradation of bound gene 3 protein	`n3 ⟶ n3 - 1`	`gamma_p * n3`
15	binding of protein to gene 1 operator	`n3 ⟶ n3 + 1`, `p3 ⟶ p3 - 1`	`kr * p3 * (n3 < 2)`
16	binding of protein to gene 2 operator	`n1 ⟶ n1 + 1`, `p1 ⟶ p1 - 1`	`kr * p1 * (n1 < 2)`
17	binding of protein to gene 3 operator	`n2 ⟶ n2 + 1`, `p2 ⟶ p2 - 1`	`kr * p2 * (n2 < 2)`
18	unbinding of protein to gene 1 operator	`n3 ⟶ n3 - 1`, `p3 ⟶ p3 + 1`	`ku1(n3 == 1) + 2ku2*(n3 == 2)`
19	unbinding of protein to gene 2 operator	`n1 ⟶ n1 - 1`, `p1 ⟶ p1 + 1`	`ku1(n1 == 1) + 2ku2*(n1 == 2)`
20	unbinding of protein to gene 3 operator	`n2 ⟶ n2 - 1`, `p2 ⟶ p2 + 1`	`ku1(n2 == 1) + 2ku2*(n2 == 2)`

Finally, we have parameters that were introduced in the propensities, so we should have a table defining them.

parameter	value	units
`kmu`	0.5	1/sec
`kmo`	5×10\(^{-4}\)	1/sec
`kp`	0.167	1/molec-sec
`gamma_m`	0.005776	1/sec
`gamma_p`	0.001155	1/sec
`kr`	1	1/molec-sec
`ku1`	224	1/sec
`ku2`	9	1/sec

We have now clearly defined all of our parameters, updates, and propensities. With the initial population, out simulation is not completely defined. In practice, we recommend constructing tables like this (and including them in your publications!) if you are going to do Gillespie simulations in your work.

We are now tasked with coding up the propensities and the updates. Starting with the propensities, we recall that the biocircuits module requires a call signature of propensity(propensities, population, t, *args), where the args are the parameters. We find that it is clearest to list the arguments one at a time in the function definition. It is also much clearer to unpack the population into individual variables than to use indexing. Finally, when returning the array of propensities, we recommend having one propensity for each line indexed in order.

@numba.njit
def repressilator_propensity(
    propensities, population, t, kmu, kmo, kp, gamma_m, gamma_p, kr, ku1, ku2
):
    m1, p1, m2, p2, m3, p3, n1, n2, n3 = population

    propensities[0] = kmu if n3 == 0 else kmo
    propensities[1] = kmu if n1 == 0 else kmo
    propensities[2] = kmu if n2 == 0 else kmo
    propensities[3] = kp * m1
    propensities[4] = kp * m2
    propensities[5] = kp * m3
    propensities[6] = gamma_m * m1
    propensities[7] = gamma_m * m2
    propensities[8] = gamma_m * m3
    propensities[9] = gamma_p * p1
    propensities[10] = gamma_p * p2
    propensities[11] = gamma_p * p3
    propensities[12] = gamma_p * n1
    propensities[13] = gamma_p * n2
    propensities[14] = gamma_p * n3
    propensities[15] = kr * p3 * (n3 < 2)
    propensities[16] = kr * p1 * (n1 < 2)
    propensities[17] = kr * p2 * (n2 < 2)
    propensities[18] = ku1 * (n3 == 1) + 2 * ku2 * (n3 == 2)
    propensities[19] = ku1 * (n1 == 1) + 2 * ku2 * (n1 == 2)
    propensities[20] = ku1 * (n2 == 1) + 2 * ku2 * (n2 == 2)

Now, we can code up the update. The update is a matrix where entry i,j is the change in species i due to move j. Since we have indexes both the species and the moves (in the update/propensity table), we can include the indices when we define the update for clarity.

repressilator_update = np.array(
    [   # 0   1   2   3   4   5   6   7   8
        [ 1,  0,  0,  0,  0,  0,  0,  0,  0], # 0
        [ 0,  0,  1,  0,  0,  0,  0,  0,  0], # 1
        [ 0,  0,  0,  0,  1,  0,  0,  0,  0], # 2
        [ 0,  1,  0,  0,  0,  0,  0,  0,  0], # 3
        [ 0,  0,  0,  1,  0,  0,  0,  0,  0], # 4
        [ 0,  0,  0,  0,  0,  1,  0,  0,  0], # 5
        [-1,  0,  0,  0,  0,  0,  0,  0,  0], # 6
        [ 0,  0, -1,  0,  0,  0,  0,  0,  0], # 7
        [ 0,  0,  0,  0, -1,  0,  0,  0,  0], # 8
        [ 0, -1,  0,  0,  0,  0,  0,  0,  0], # 9
        [ 0,  0,  0, -1,  0,  0,  0,  0,  0], # 10
        [ 0,  0,  0,  0,  0, -1,  0,  0,  0], # 11
        [ 0,  0,  0,  0,  0,  0, -1,  0,  0], # 12
        [ 0,  0,  0,  0,  0,  0,  0, -1,  0], # 13
        [ 0,  0,  0,  0,  0,  0,  0,  0, -1], # 14
        [ 0,  0,  0,  0,  0, -1,  0,  0,  1], # 15
        [ 0, -1,  0,  0,  0,  0,  1,  0,  0], # 16
        [ 0,  0,  0, -1,  0,  0,  0,  1,  0], # 17
        [ 0,  0,  0,  0,  0,  1,  0,  0, -1], # 18
        [ 0,  1,  0,  0,  0,  0, -1,  0,  0], # 19
        [ 0,  0,  0,  1,  0,  0,  0, -1,  0], # 20
    ],
    dtype=int)

Next, we specify the parameter values according to the parameter table. Remember that we need to package them in a tuple, after defining them.

# Parameter values
kmu = 0.5
kmo = 5e-4
kp = 0.167
gamma_m = 0.005776
gamma_p = 0.001155
kr = 1.0
ku1 = 224.0
ku2 = 9.0

repressilator_args = (kmu, kmo, kp, gamma_m, gamma_p, kr, ku1, ku2)

Finally, we specify the initial population and the time points we want for the sampling.

# State with 10 copies of everything, nothing bound to operators
repressilator_pop_0 = np.array([10, 10, 10, 10, 10, 10, 0, 0, 0], dtype=np.int64)

repressilator_time_points = np.linspace(0, 80_000, 4001)

And we are all set to perform the calculation. We will make a single trace and then plot it.

# Perform the Gillespie simulation
pop = biocircuits.gillespie_ssa(
    repressilator_propensity,
    repressilator_update,
    repressilator_pop_0,
    repressilator_time_points,
    args=repressilator_args,
)


# Make plot
colors = bokeh.palettes.d3["Category10"][3]
p = bokeh.plotting.figure(
    height=250,
    width=600,
    x_range=[0, 80_000 / 60],
    x_axis_label="time (min)",
    y_axis_label="protein copy number",
)
for c, i in enumerate([1, 3, 5]):
    p.line(repressilator_time_points / 60, pop[0, :, i], color=colors[c])

bokeh.io.show(p)

We observe the oscillations, but the amplitudes are highly variable.

Increasing speed

The calculation we implemented without JITting is much slower than the version in the biocircuits package. Still, the algorithm in that package implements a rather naive version of the Gillespie algorithm.

An obvious speed improvement can be made by only recalculating the propensity for copy numbers we have not visited. For example, for simple gene expression, we do not need to recompute the propensity for mRNA decay if the previous move was a protein decay. The propensity for mRNA decay is the same it was at the previous step. Gibson and Bruck developed the next reaction method, which makes these improvements, among others, and can result in significant speed-up for complicated sets of reactions. Instead of wading into the algorithmic details, we will instead investigate how we can speed up our implementation of the direct Gillespie algorithm. (These speed boosts are all implemented in the biocircuits package.)

For simplicity, we will again consider single gene expression,

\[ \begin{align} \text{DNA} \rightarrow \text{mRNA} \rightarrow \text{protein} \end{align} \tag{1}\]

The reactions and propensities are then

\[ \begin{align} \begin{array}{ll} \text{reaction, }r_i & \text{propensity, } a_i \\ m \rightarrow m+1,\;\;\;\; & \beta_m \\[0.3em] m \rightarrow m-1, \;\;\;\; & m\\[0.3em] p \rightarrow p+1, \;\;\;\; & \beta_p m \\[0.3em] p \rightarrow p-1, \;\;\;\; & \gamma p. \end{array} \end{align} \tag{2}\]

We will first remake the non-JITted version of the propensity function to test its speed.

def simple_propensity(propensities, population, t, beta_m, beta_p, gamma):
    """Updates an array of propensities given a set of parameters
    and an array of populations.
    """
    # Unpack population
    m, p = population

    # Update propensities
    propensities[0] = beta_m  # Make mRNA transcript
    propensities[1] = m  # Degrade mRNA
    propensities[2] = beta_p * m  # Make protein
    propensities[3] = gamma * p  # Degrade protein

We also need the updates.

# Column 0 is change in m, column 1 is change in p
simple_update = np.array(
    [
        [1, 0],   # Make mRNA transcript
        [-1, 0],  # Degrade mRNA
        [0, 1],   # Make protein
        [0, -1],  # Degrade protein
    ],
    dtype=np.int64,
)

We also need functions from when we introduced the Gillespie algorithm to enable non-JITted profiling and to provide a basis of comparison going forward. These function are are in the cell below.

def sample_discrete(probs):
    """Randomly sample an index with probability given by probs."""
    # Generate random number
    q = np.random.rand()

    # Find index
    i = 0
    p_sum = 0.0
    while p_sum < q:
        p_sum += probs[i]
        i += 1
        
    return i - 1


def gillespie_draw(propensity_func, propensities, population, t, args=()):
    """
    Draws a reaction and the time it took to do that reaction.

    Parameters
    ----------
    propensity_func : function
        Function with call signature propensity_func(population, t, *args)
        used for computing propensities. This function must return
        an array of propensities.
    propensities : ndarray
        Propensities for each reaction as a 1D Numpy array.
    population : ndarray
        Current population of particles
    t : float
        Value of the current time.
    args : tuple, default ()
        Arguments to be passed to `propensity_func`.

    Returns
    -------
    rxn : int
        Index of reaction that occured.
    time : float
        Time it took for the reaction to occur.
    """
    # Compute propensities
    propensity_func(propensities, population, t, *args)

    # Sum of propensities
    props_sum = propensities.sum()

    # Compute next time
    time = np.random.exponential(1.0 / props_sum)

    # Compute discrete probabilities of each reaction
    rxn_probs = propensities / props_sum

    # Draw reaction from this distribution
    rxn = sample_discrete(rxn_probs)

    return rxn, time


def gillespie_ssa(propensity_func, update, population_0, time_points, args=()):
    """
    Uses the Gillespie stochastic simulation algorithm to sample
    from probability distribution of particle counts over time.

    Parameters
    ----------
    propensity_func : function
        Function of the form f(params, t, population) that takes the current
        population of particle counts and return an array of propensities
        for each reaction.
    update : ndarray, shape (num_reactions, num_chemical_species)
        Entry i, j gives the change in particle counts of species j
        for chemical reaction i.
    population_0 : array_like, shape (num_chemical_species)
        Array of initial populations of all chemical species.
    time_points : array_like, shape (num_time_points,)
        Array of points in time for which to sample the probability
        distribution.
    args : tuple, default ()
        The set of parameters to be passed to propensity_func.

    Returns
    -------
    sample : ndarray, shape (num_time_points, num_chemical_species)
        Entry i, j is the count of chemical species j at time
        time_points[i].
    """

    # Initialize output
    pop_out = np.empty((len(time_points), update.shape[1]), dtype=np.int64)

    # Initialize and perform simulation
    i_time = 1
    i = 0
    t = time_points[0]
    population = population_0.copy()
    pop_out[0, :] = population
    propensities = np.zeros(update.shape[0])
    while i < len(time_points):
        while t < time_points[i_time]:
            # draw the event and time step
            event, dt = gillespie_draw(
                propensity_func, propensities, population, t, args
            )

            # Update the population
            population_previous = population.copy()
            population += update[event, :]

            # Increment time
            t += dt

        # Update the index
        i = np.searchsorted(time_points > t, True)

        # Update the population
        pop_out[i_time : min(i, len(time_points))] = population_previous

        # Increment index
        i_time = i

    return pop_out

Let’s do some profiling to see what took so long. We will use the magic function %lprun to profile runs of SSAs. The output has line wrapping, so it is kind of hard to read in a Jupyter notebook, unfortunately.

# Specify parameters for calculation
args = (10.0, 10.0, 0.4)
time_points = np.linspace(0, 50, 101)
population_0 = np.array([0, 0], dtype=int)

%lprun -T lp_results.txt -f gillespie_ssa gillespie_ssa(\
        simple_propensity, simple_update, population_0, time_points, args)


*** Profile printout saved to text file 'lp_results.txt'.

We see that 80% of the time is spent doing draws. Nothing else is really worth looking at. Let’s see how we can improve our draw speed.

propensities = np.ones(4)

%lprun -T lp_results.txt -f gillespie_draw \
        [gillespie_draw(simple_propensity, propensities, population_0, 0, args) \
            for _ in range(10_000)]


*** Profile printout saved to text file 'lp_results.txt'.

The propensity function is taking the most time. We will focus on improving that.

Speed boost by JIT compilation with Numba

As we have seen, Numba is a package that does LLVM optimized just-in-time compilation of Python code. The speed-ups can be substantial. We will use numba to compile the parts of the code that we can. For many functions, we just need to decorate the function with

@numba.jit()

and that is it. If possible, we can use the nopython=True keyword argument to get more speed because the compiler can assume that all of the code is JIT-able. Note that using (numba.njit?) is equivalent to using (numba.jit?)(nopython=True).

We can speed that up by sacrificing some readability and directly accessing the arrays instead of setting up views, as we have been doing.

@numba.njit
def simple_propensity_numba(
    propensities, population, t, beta_m, beta_p, gamma
):
    """Updates an array of propensities given a set of parameters
    and an array of populations.
    """
    # Unpack population
    m, p = population

    # Update propensities
    propensities[0] = beta_m  # Make mRNA transcript
    propensities[1] = m  # Degrade mRNA
    propensities[2] = beta_p * m  # Make protein
    propensities[3] = gamma * p  # Degrade protein

    
# Check speeds
print('Old propensity function:')
%timeit simple_propensity(propensities, population_0, 0, *args)

print('\nNumba\'d propensity function:')
%timeit simple_propensity_numba(propensities, population_0, 0, *args)

Old propensity function:
1.57 μs ± 9.71 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Numba'd propensity function:
163 ns ± 1.2 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

We got some speed up! Nice!

Now, we also saw that the sums and division of arrays are slow. Let’s optimize the sum operation using numba.

@numba.njit
def sum_numba(ar):
    return ar.sum()


# Make dummy array for testing
ar = np.array([0.3, 0.4, 0.3, 0.2, 0.15])

print('\nNumPy sum:')
%timeit ar.sum()

print('\nNumba sum:')
%timeit sum_numba(ar)


NumPy sum:
431 ns ± 3.92 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Numba sum:
72.1 ns ± 0.721 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

We get another speed boost, though we should note that this speed boost in the sum is due to numba optimizing sums of a certain size. For sums over large numbers of entries, numba’s performance will not exceed NumPy’s by much at all.

Finally, we will speed up the sampling of the discrete distribution. We will do this in two ways. First, we notice that the division operation on the propensities took a fair amount of time when we profiled the code. We do not need it; we can instead sample from an unnormalized discrete distribution. Secondly, we can use numba to accelerate the while loop in the sampling.

@numba.njit
def sample_discrete_numba(probs, probs_sum):
    q = np.random.rand() * probs_sum

    i = 0
    p_sum = 0.0
    while p_sum < q:
        p_sum += probs[i]
        i += 1
        
    return i - 1


# Make dummy unnormalized probs
probs = np.array([0.1, 0.3, 0.4, 0.05, 0.15, 0.6])
probs_sum = sum_numba(probs)

print('Result from non-JITted method:')
%timeit sample_discrete(probs)

print("\nResults from numba'd version:")
%timeit sample_discrete_numba(probs, probs_sum)

Result from non-JITted method:
360 ns ± 2.09 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Results from numba'd version:
96.5 ns ± 1.87 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

We get a speed-up of about a factor of three. Let’s now make a new gillespie_draw() function that is faster. The fast propensity function is just an argument to the fast draw-er.

def gillespie_draw_fast(propensity_func, propensities, population, t, args):
    """
    Draws a reaction and the time it took to do that reaction.
    """
    # Compute propensities
    propensity_func(propensities, population, t, *args)
    
    # Sum of propensities
    props_sum = sum_numba(propensities)
    
    # Compute time
    time = np.random.exponential(1 / props_sum)

    # Draw reaction given propensities
    rxn = sample_discrete_numba(propensities, props_sum)
    
    return rxn, time


print('Old Gillespie draw:')
%timeit gillespie_draw(simple_propensity, propensities, population_0, 0.0, args)

print('\nFast Gillespie draw:')
%timeit gillespie_draw_fast(simple_propensity_numba, propensities, population_0, 0.0, args)

Old Gillespie draw:
3.16 μs ± 9.11 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

Fast Gillespie draw:
598 ns ± 3.96 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

So, our optimization got us another speed boost. Let’s adjust our SSA function to include the fast draws.

def gillespie_ssa_fast(
    propensity_func, update, population_0, time_points, args=()
):
    """
    Uses the Gillespie stochastic simulation algorithm to sample
    from proability distribution of particle counts over time.
    
    Parameters
    ----------
    propensity_func : function
        Function of the form f(params, population) that takes the current
        population of particle counts and return an array of propensities
        for each reaction.
    update : ndarray, shape (num_reactions, num_chemical_species)
        Entry i, j gives the change in particle counts of species j
        for chemical reaction i.
    population_0 : array_like, shape (num_chemical_species)
        Array of initial populations of all chemical species.
    time_points : array_like, shape (num_time_points,)
        Array of points in time for which to sample the probability
        distribution.
    args : tuple, default ()
        The set of parameters to be passed to propensity_func. 
        
    Returns
    -------
    rxn : int
        Index of reaction that occured.
    time : float
        Time it took for the reaction to occur.        
    Returns
    -------
    sample : ndarray, shape (num_time_points, num_chemical_species)
        Entry i, j is the count of chemical species j at time
        time_points[i].
    """

    # Initialize output
    pop_out = np.empty((len(time_points), update.shape[1]), dtype=np.int64)

    # Initialize and perform simulation
    i_time = 1
    i = 0
    t = time_points[0]
    population = population_0.copy()
    pop_out[0, :] = population
    propensities = np.zeros(update.shape[0])
    while i < len(time_points):
        while t < time_points[i_time]:
            # draw the event and time step
            event, dt = gillespie_draw_fast(
                propensity_func, propensities, population, t, args
            )

            # Update the population
            population_previous = population.copy()
            population += update[event, :]

            # Increment time
            t += dt

        # Update the index
        i = np.searchsorted(time_points > t, True)

        # Update the population
        pop_out[i_time : min(i, len(time_points))] = population_previous

        # Increment index
        i_time = i

    return pop_out

Now we can test the speed of the two SSAs.

print('Gillespie SSA:')
%timeit gillespie_ssa(simple_propensity, simple_update, \
                      population_0, time_points, args)

print('\nFast Gillespie SSA:')
%timeit gillespie_ssa_fast(simple_propensity_numba, simple_update,\
                           population_0, time_points, args)

Gillespie SSA:
42.7 ms ± 539 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

Fast Gillespie SSA:
12.2 ms ± 107 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

So, we are now faster with not too much work. This is still a general solver that you can use with any propensity function and update.

We have constructed a generic tool for doing Gillespie simulations. Specifically, we pass a propensity function into the algorithm. Passing a function as an argument precludes use of numba. This means that we cannot just-in-time compile the entire Gillespie simulation. We could insist that our propensity function be encoded in a global function prop_func(), then we can fully JIT compile the entire simulation. (Note that there are ways to get around this insistence on a global function, and the biocircuits.gillespie_ssa() function does not rely on such a global function, but for the purposes of this demonstration, it is convenient.)

@numba.njit
def prop_func(propensities, population, t, beta_m, beta_p, gamma):
    """Updates an array of propensities given a set of parameters
    and an array of populations.
    """
    # Unpack population
    m, p = population
    
    # Update propensities
    propensities[0] = beta_m      # Make mRNA transcript
    propensities[1] = m           # Degrade mRNA
    propensities[2] = beta_p * m  # Make protein
    propensities[3] = gamma * p   # Degrade protein


@numba.njit
def gillespie_draw_numba(propensities, population, t, args):
    """
    Draws a reaction and the time it took to do that reaction.
    
    Assumes that there is a globally scoped function
    `prop_func` that is Numba'd with nopython=True.
    """
    # Compute propensities
    prop_func(propensities, population, t, *args)

    # Sum of propensities
    props_sum = np.sum(propensities)
    
    # Compute time
    time = np.random.exponential(1 / props_sum)
    
    # Draw reaction given propensities
    rxn = sample_discrete_numba(propensities, props_sum)

    return rxn, time


@numba.njit
def gillespie_ssa_numba(update, population_0, time_points, args):
    """
    Uses the Gillespie stochastic simulation algorithm to sample
    from proability distribution of particle counts over time.
    
    Parameters
    ----------
    update : ndarray, shape (num_reactions, num_chemical_species)
        Entry i, j gives the change in particle counts of species j
        for chemical reaction i.
    population_0 : array_like, shape (num_chemical_species)
        Array of initial populations of all chemical species.
    time_points : array_like, shape (num_time_points,)
        Array of points in time for which to sample the probability
        distribution.
    args : tuple, default ()
        The set of parameters to be passed to propensity_func.
        
    Returns
    -------
    sample : ndarray, shape (num_time_points, num_chemical_species)
        Entry i, j is the count of chemical species j at time
        time_points[i].
        
    Notes
    -----
    .. Assumes that there is a globally scoped function
       `propensity_func` that is Numba'd with nopython=True.
    """
    # Initialize output
    pop_out = np.empty((len(time_points), update.shape[1]), dtype=np.int64)

    # Initialize and perform simulation
    i_time = 1
    i = 0
    t = time_points[0]
    population = population_0.copy()
    pop_out[0,:] = population
    propensities = np.zeros(update.shape[0])
    while i < len(time_points):
        while t < time_points[i_time]:
            # draw the event and time step
            event, dt = gillespie_draw_numba(propensities, population, t, args)
                
            # Update the population
            population_previous = population.copy()
            population += update[event,:]
                
            # Increment time
            t += dt

        # Update the index (Have to be careful about types for Numba)
        i = np.searchsorted((time_points > t).astype(np.int64), 1)

        # Update the population
        for j in np.arange(i_time, min(i, len(time_points))):
            pop_out[j,:] = population_previous
        
        # Increment index
        i_time = i
                           
    return pop_out

Now let’s test the speed of all three of our functions.

print('Gillespie SSA:')
%timeit gillespie_ssa(simple_propensity, simple_update, \
                      population_0, time_points, args)

print('\nFast Gillespie SSA:')
%timeit gillespie_ssa_fast(simple_propensity_numba, simple_update,\
                            population_0, time_points, args)

print('\nTotally numba\'d Gillespie SSA:')
%timeit gillespie_ssa_numba(simple_update, population_0, time_points, args)

Gillespie SSA:
41.5 ms ± 1.11 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Fast Gillespie SSA:
12.5 ms ± 146 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Totally numba'd Gillespie SSA:
472 μs ± 1.68 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

We got an extra order of magnitude speed boost by totally JIT compiling everything. The speed up is significant, so we should probably use Numba’d code, which is what the biocircuits.gillespie_ssa() function uses when possible.

Parallel Gillespie simulations

Sampling by the Gillespie algorithm is trivially parallelizable. We can use the multiprocessing module to parallelize the computation. Syntactically, we need to specify a function that takes a single argument. Below, we set up a parallel calculation of Gillespie simulations for our specific example.

Note

The multiprocessing module is part of the standard Python distribution. As of April 2022, multiprocessing hangs in IPython and Jupyter environments. As an alternative, we will use the multiprocess package. It can be installed by running pip install multiprocess on the command line. In the contexts in which we are doing parallel calculation, the multiprocess package can be used as a drop-in replacement for multiprocessing.

def gillespie_fn(args):
    return gillespie_ssa_numba(*args)
    
    
def gillespie_parallel(fn, update, population_0, time_points, args,
                       n_simulations, n_threads):
    """
    Convenience function to do parallel Gillespie simulations.
    """
    input_args = (update, population_0, time_points, args)

    with multiprocess.Pool(n_threads) as p:
        populations = p.map(fn, [input_args]*n_simulations)
        
    return np.array(populations)

We are paying some overhead in setting up the parallelization. Let’s time it to see how we do with two threads.

n_simulations = 1000

print('\nnumba\'d Gillespie SSA:')
%timeit [gillespie_ssa_numba(simple_update, population_0, time_points, args) \
            for _ in range(n_simulations)]

print('\nParallel numba\'d Gillespie SSA:')
%timeit gillespie_parallel(gillespie_fn, simple_update, population_0, time_points,\
                           args, n_simulations, 2)


numba'd Gillespie SSA:
468 ms ± 5.14 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Parallel numba'd Gillespie SSA:
368 ms ± 27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

We get a factor of two speed boost from two cores. This brings our total speed boost from the optimization to nearly two orders of magnitude.

If we insist on exact sampling out of a probability distribution defined by a master question, we can get significant speed boosts by switching to the Gibson and Bruck algorithm, especially for more complicated systems and propensity functions. If we are willing to approximately sample out of the probability distribution, there are many fast, approximate methods (e.g., Salis and Kaznessis) available.

The function gillespie_ssa() in the biocircuits package uses all of the speed boosts (and a few extra) that we discussed here. You will get the best performance if you pass a Numba’d propensity function.

Computing environment

%load_ext watermark
%watermark -v -p numpy,scipy,numba,tqdm,line_profiler,multiprocess,biocircuits,bokeh,jupyterlab

Python implementation: CPython
Python version       : 3.13.13
IPython version      : 9.12.0

numpy        : 2.2.6
scipy        : 1.17.1
numba        : 0.61.2
tqdm         : 4.67.3
line_profiler: 5.0.2
multiprocess : 0.70.19
biocircuits  : 0.1.19
bokeh        : 3.9.0
jupyterlab   : 4.5.6