Homework 16.2: Noise in a switchable promoter

This problem was inspired by Munsky, et al. (Science, 2012). We will explore some aspects of stochastic gene expression using both the analytical and computational techniques that we described in class. In this problem we will contrast two models of stochastic gene expression.

The first “simple” model consists of a simplely active promoter where transcription happens with a constant stochastic rate, while the second model consists of a promoter switching stochastically between an ON and an OFF state. Transcription in this model can only occur when the promoter is in the ON state. This could, for example, describe the open and closed chromatin state. mRNA degradation is present in both models. We will not consider protein in this problem.

The models may be written as chemical equations,

\begin{align} \require{mhchem} &\text{Simple: } \ce{ON ->[\beta] mRNA ->[\gamma] \varnothing} \\ &\text{Two-state: } \ce{OFF <=>[k_\mathrm{on}][k_\mathrm{off}] ON ->[\beta] mRNA ->[\gamma] \varnothing}. \end{align}

When the promoter is on, transcription proceeds at a rate \(\beta\). Otherwise, no transcripts are made. Transcripts degrade with a rate \(\gamma\).

a) Write down a master equation describing the temporal dynamics of the probability distribution of the number of RNA transcripts, \(P(n;t)\), for the simple model. Nondimensionalize time using the decay rate \(\gamma\).

b) In this part of the problem, you will perform Gillespie simulations of the simple model. (This is very similar to what was done in the chapter, but provides a good baseline for studying the two-state model.) Start by writing down a system of stochastic chemical reactions for the simple model of gene expression. Make sure to enumerate all species, all reactions, and the propensity functions and associated parameters. Then, using the Gillespie stochastic simulation algorithm, implement a simulation of the simple model of gene expression. Use a \(\beta/\gamma\) of 25, where \(\beta\) and \(\gamma\) are the dimensional parameters. Initialize your simulation with no copies of mRNA being present. Perform many simulations to get plenty of samples at each time point.

1. Plot snapshots of the simulated mRNA distribution at various time points. Comment on what you see.
1. Compute and plot the Fano factor and noise \(\eta\), as a function of time from your SSA samples. In Problem 16.1, you derived that the Fano factor is one for all time for this model. Do the computed results match that?

c) Now, we will move on to the two-state model with a switchable promoter. Write down a similar master equation for the two-state case. Now, each “state” is not just the mRNA copy number, but also the state of the promoter (on or off). We will define a variable \(a\in\{0,1\}\) describing this, where \(a = 0\) means the promoter is off and \(a=1\) means it is on. Thus, write a master equation for \(P(a,n,t)\). Again, be sure to nondimensionalize time using \(\gamma\).

d)

Show that at steady state,

\begin{align} \text{probability that promoter is on} = \langle a \rangle = \frac{k_\mathrm{on}/k_\mathrm{off}}{1 + k_\mathrm{on}/k_\mathrm{off}}. \end{align}

Show further that at steady state

\begin{align} \langle n \rangle = \frac{\beta}{\gamma}\,\langle a \rangle = \frac{\beta}{\gamma}\,\frac{k_\mathrm{on}/k_\mathrm{off}}{1 + k_\mathrm{on}/k_\mathrm{off}}, \end{align}

where the parameters are dimensional.

Finally, show that the steady state Fano factor is

\begin{align} F &= \frac{\sigma^2}{\langle n \rangle} = 1 + \frac{\beta}{\gamma}\,(1-\langle a \rangle) \,\frac{1}{1 + k_\mathrm{on}/\gamma + k_\mathrm{off}/\gamma} \\[1em] &= 1 + \frac{\beta}{\gamma}\,\frac{1}{1 + k_\mathrm{on}/k_\mathrm{off}} \,\frac{1}{1 + k_\mathrm{on}/\gamma + k_\mathrm{off}/\gamma}, \end{align}

where again the parameters are dimensional.

What is the significance that the Fano factor of the two-state model can never be less than one? Describe physically what is happening when the Fano factor is large.

e) We will now use a Gillespie algorithm to sample out of the probability distribution defined by the master equation for the two-state system. Start with no transcripts with the promoter being off. There are lots of ways to explore the space of the three dimensionless parameters. To do our exploration, we will vary the parameters \(k_\mathrm{on}\) and \(k_\mathrm{off}\) and choose \(\beta/\gamma\) such that the steady state \(\langle n \rangle = 25\).

Choose \(k_\mathrm{on}/\gamma = 0.05\) and \(k_\mathrm{off}/\gamma = 0.1\). Plot a trajectory for a single trajectory. Do you see bursts of expression? Compute many trajectories and make a plot showing the distribution of steady state transcripts (either an ECDF or histogram). Comment on what you see.
Choose \(k_\mathrm{on}/\gamma = 50\) and \(k_\mathrm{off}/\gamma = 100\). Do you see bursts in trajectories? We again want to investigate the steady state distribution of copy numbers. In this case, we want to compare this to a negative binomial distribution. We can connect the steady state distribution of the two-state model to the burst size \(b\) and burst frequency \(r\) of a negative binomial distribution using the mean and Fano factor derived in part (d). (This is the method of moments.) Specifically,

\begin{align} &\langle n \rangle = \frac{\beta}{\gamma}\,\frac{k_\mathrm{on}/k_\mathrm{off}}{1 + k_\mathrm{on}/k_\mathrm{off}} = rb,\\[1em] &F = 1 + \frac{\beta}{\gamma}\,\frac{1}{1 + k_\mathrm{on}/k_\mathrm{off}} \,\frac{1}{1 + k_\mathrm{on}/\gamma + k_\mathrm{off}/\gamma} = 1 + b. \end{align}

So, the negative binomial distribution that most closely approximates the steady state distribution by the method of moments has a burst size

\begin{align} b = \frac{\beta}{\gamma}\,\frac{1}{1 + k_\mathrm{on}/k_\mathrm{off}} \,\frac{1}{1 + k_\mathrm{on}/\gamma + k_\mathrm{off}/\gamma}, \end{align}

and burst frequency

\begin{align} r = \frac{k_\mathrm{on}}{k_\mathrm{off}}\left(1 + k_\mathrm{on}/\gamma + k_\mathrm{off}/\gamma\right). \end{align}

Make a plot of the distribution of the steady state copy numbers and overlay that of a negative binomial distribution with parameters given by \(b\) and \(r\) above. Hint: To compute the probability mass function of a negative binomial distribution for values of n with burst size b and burst frequency r, you can use scipy.stats.nbinom.pmf(n, r, 1 / (1 + b)). You can use scipy.stats.nbinom.cdf(n, r, 1 / (1 + b)) to get the values for the CDF. Does the steady state distribution resemble a negative binomial distribution?

Do the results you derived in part (d) about the Fano factor and mean hold for the steady states you observe? You can investigate other parameter values as well, if you wish.