0. Configuring your computer to use Python for scientific computing


Why Python?

As will become readily apparent even at the beginning of our journey into biological circuit design, you will need to use your computer to analyze circuits and understand the principles governing their function. There are plenty of approaches we could take, and many languages we could use for computing as well. Indeed, in addition to Python, Matlab/Octave, Mathematica, R, Julia, Java, JavaScript, C++, and others are widely used. We have chosen to use Python. Though we view this as an unessential choice (we believe language wars are counterproductive and welcome anyone to port the code we use to any language of their choice), we nonetheless feel we should explain our choice.

Python is a flexible programming language that is widely used in many applications. This is in contrast to more domain-specific languages like R and Julia. It is easily extendable, which is in many ways responsible for its breadth of use. We find that there is a decent Python-based tool for many applications we can dream up, certainly in systems biology. The Python-based tool is seldom the very best for the particular task at hand, but it is almost always pretty good. Thus, knowing Python is like having a Swiss Army knife; you can wield it to effectively accomplish myriad tasks. Finally, we also find that it has a gentle learning curve for most students.

That said, if we had to choose another language, it would be Julia. Julia is a well-designed, modern language for scientific computing. Its tools for numerical differential equations, an important method we employ, are superb.

Why not use systems biology packages?

There are packages available to streamline systems biology calculations, such as PySB or Matlab’s SimBiology. While these packages are useful, we find that many applications in systems biology, and in genetic circuits in particular, need, or at least benefit from, bespoke computational analyses. We will therefore build all of our code from scratch, using only packages like NumPy, SciPy, and Bokeh, which contain core numerical and plotting data structures and routines. Of course, code we use in one chapter may be reused in another, but our approach is that we build all of the code we need as we go along. This will provide a greater level of mastery and less reliance on black boxes (though there will inevitably be some).
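To give a flavor of what building from scratch with these packages can look like, here is a minimal sketch that numerically integrates a one-variable model of unregulated gene expression, dx/dt = β − γx, using SciPy. The model, the parameter values, and the names `unregulated_rhs`, `beta`, and `gamma` are illustrative assumptions chosen for this example, not code taken from later chapters.

```python
import numpy as np
import scipy.integrate


def unregulated_rhs(t, x, beta, gamma):
    """Right-hand side of dx/dt = beta - gamma * x."""
    return beta - gamma * x


# Illustrative (assumed) parameter values, in arbitrary units
beta = 10.0
gamma = 1.0

# Time points at which to report the solution
t = np.linspace(0.0, 10.0, 200)

# Integrate from an initial concentration of zero
sol = scipy.integrate.solve_ivp(
    unregulated_rhs,
    t_span=(t[0], t[-1]),
    y0=[0.0],
    t_eval=t,
    args=(beta, gamma),
    rtol=1e-8,
    atol=1e-10,
)

x = sol.y[0]

# The concentration should approach the steady state beta / gamma
print(f"final x = {x[-1]:.4f}, steady state = {beta / gamma:.4f}")
```

Because the right-hand side is an ordinary Python function, it is easy to swap in a different model without changing the solver call.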

The biocircuits package

For some of those black boxes, we will use the biocircuits package, which is written to be used with this book. The functions contained therein are presented before being abstracted away into the package. Thus, anything that is in the package is introduced in the main text and you should have a full understanding of how it works.

The documentation for this package appears at the end of this book.

What to do if you are new to Python

As you proceed through the chapters, we assume that you have a basic introduction to computer programming and the Python programming language. We assume further that you have a working knowledge of NumPy. If this is new to you, you can work through Appendix B to get up to speed. In fact, for some of the programming techniques we learn, such as making interactive plots, we will refer to Appendix B.

Jupyter notebooks

This book is constructed from Jupyter notebooks. To quote Jupyter’s documentation,

Jupyter Notebook and its flexible interface extends the notebook beyond code to visualization, multimedia, collaboration, and more. In addition to running your code, it stores code and output, together with markdown notes, in an editable document called a notebook.

This allows for executable documents that have code, but also richly formatted text and graphics, enabling the reader to interact with the material as they read it.

While you read this book, you can read the HTML-rendered version of the notebook. While many of the graphics are interactive, this version is not executable. To execute (and even edit) code in the notebooks, you will need to run them. There are many options available to run Jupyter notebooks. Here are a few we have found useful.

  • JupyterLab: This is a browser-based interface to Jupyter notebooks and more (including a terminal application, text editor, file manager, etc.). As of March 2023, Chrome, Firefox, and Safari are supported. Microsoft Edge is not. Therefore, if you are a Windows user, you need to be sure you have either Chrome or Firefox installed. Because we encourage users to run code on their own machine (and because nearly every computer has a browser application), we expect most of our readers to use JupyterLab. We give instructions below on how to do the necessary installations and launch JupyterLab.

  • VSCode: This is an excellent source code editor that supports Jupyter notebooks. Be sure to read the documentation on how to use Jupyter notebooks in VSCode.

  • Google Colab: Google offers this service to run notebooks in the cloud on their machines. As of March 2023, users can run on two-core machines for free. There are a few caveats, though. First, not all packages and updates are available in Colab. Furthermore, not all interactivity that works natively in Jupyter notebooks works with Colab. Finally, if a notebook sits idle for too long, you will be disconnected from Colab. All of the notebooks in the HTML rendering of this book have an “Open in Colab” button at the upper right that allows you to launch the notebook in Colab. This is a quick-and-easy way to execute the book’s contents.

Installing a Python distribution

Prior to embarking on your journey into biological circuits, you need to have a functioning Python distribution installed on your computer. There are two main ways people set up Python for scientific computing.

  1. By downloading and installing package by package with tools like pip.

  2. By downloading and installing a Python distribution that contains binaries of many of the scientific packages needed. Anaconda is the dominant distribution for scientific and data science computing.

We will use Anaconda, with its associated package manager, conda.

Downloading and installing Anaconda

If you already have Anaconda installed on your computer, you can skip to the next section to install node.js.

Downloading and installing Anaconda is simple.

  1. Go to the Anaconda homepage and download the graphical installer. Be sure to use the most up-to-date version of Python.

  2. Install Anaconda using the graphical installer.

  3. Follow the on-screen instructions for installation. While doing so, be sure that Anaconda is installed in your home directory, as is the default, not in root.

That’s it! After you do that, you will have a functioning Python distribution.

Install node.js

node.js is a platform that enables you to run JavaScript outside of the browser. We will not use it directly, but it needs to be installed for some of the more sophisticated JupyterLab functionality. Install node.js by downloading the appropriate installer for your machine from the node.js website.
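If you would like to confirm that the installers put their command line tools where your system can find them, the following minimal sketch looks for executables on your PATH using only the Python standard library. The function name `find_executables` is our own for this example, and we assume the tools are called `conda` and `node`. On some systems (notably Windows, depending on how Anaconda was installed), `conda` may only be visible from an Anaconda-aware prompt, so a "not found" result is a hint to investigate rather than proof of a failed installation.

```python
import shutil


def find_executables(names):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


# Report where (or whether) each tool was found
for name, path in find_executables(["conda", "node"]).items():
    status = path if path is not None else "not found on PATH"
    print(f"{name:>6}: {status}")
```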

Package installations

Anaconda ships with many (over 100) packages. As a result, the web of dependencies among the packages can be unwieldy, and it is difficult to ensure that each reader of this book has compatible versions. To alleviate this, and because we do not need all of the packages that come with Anaconda, we will create an environment specifically for use with this book. This environment contains only the packages we use in this book and their dependencies and is therefore easier to manage. The environment is specified in a YAML file, biocircuits.yml, which you can download.

There are two ways you can create the environment: on the command line or via the Anaconda Navigator. We generally prefer the command line, but find many students like using the Navigator.

Installation using the command line

conda is a package manager for keeping all of your packages up-to-date and consistent with regards to dependencies. It has plenty of functionality beyond our basic usage, which you can learn more about by reading the docs. Here, we will use its command line interface to create the environment.

To access the command line on macOS, you can use the Terminal application, which is typically in the /Applications/Utilities folder. Alternatively, hit ⌘-space bar, type terminal in the search box, and select the Terminal application. On Windows, you can use PowerShell, which you can launch through the Start menu. If you are using Linux, it’s a good bet you already know how to navigate a terminal, so we will not give specific instructions for Linux.

Once on the command line, you need to navigate to the directory where you have saved the biocircuits.yml file. Let’s say you save it in the directory biological_circuit_design in your home directory. You can navigate to that directory by entering

cd ~/biological_circuit_design

on the command line. (The ~ symbol is a shortcut for your home directory.)

Now, to create the environment, execute the following.

conda env create -f biocircuits.yml

It will take a minute or two for the environment creation to complete.
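Once the command finishes, you may want to confirm that conda now knows about the new environment. The sketch below asks conda for its list of environments in JSON form and checks for one named biocircuits. The helper names `env_names` and `conda_env_exists` are our own for this example, and we assume that the `conda` executable is on your PATH. If it is not, the check returns `None` rather than guessing.

```python
import json
import pathlib
import shutil
import subprocess


def env_names(conda_json):
    """Extract environment names from the JSON text of `conda env list --json`.

    Each environment is reported as a path; its name is the last path component.
    """
    paths = json.loads(conda_json)["envs"]
    return [pathlib.PurePath(p).name for p in paths]


def conda_env_exists(name):
    """Return True or False, or None if conda cannot be found on PATH."""
    conda = shutil.which("conda")
    if conda is None:
        return None

    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return name in env_names(result.stdout)


print("biocircuits environment found:", conda_env_exists("biocircuits"))
```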

Installation using Anaconda Navigator

Anaconda has a GUI-based interface called Anaconda Navigator. If you’re using macOS, this is available in your Applications menu. If you are using Windows, you access it from the Start menu. You can then launch Anaconda Navigator.

After launching the Navigator, click Environments on the left menu panel. You will then see a panel open immediately to the right of the left menu panel with a Search Environments window at the top. At the bottom of that panel, click Import. Select Local drive, and find the biocircuits.yml file you downloaded. Then, click Import. It will take a minute or two for the environment creation to complete.

Activating the environment

After the environment is created, you need to activate it. To do this from the command line, execute the following.

conda activate biocircuits

If you are using the Anaconda Navigator, click Home on the left navigation menu. At the top of the window, you will see two pull-down menus. Select All applications on biocircuits.

You will need to activate the environment every time you open a new terminal (or PowerShell) window or launch Anaconda Navigator. (If you are using the command line, you can have this happen automatically if you like by adding conda activate biocircuits to your configuration file, e.g., .bashrc.)
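Since it is easy to forget this step, here is a minimal sketch for checking, from within Python, which environment is active. It relies on the `CONDA_DEFAULT_ENV` environment variable, which `conda activate` sets to the name of the active environment. The function name `active_conda_env` is our own for this example. We have not verified that the variable is set in every way of launching Python (for example, through the Navigator), so the interpreter prefix is printed as a second check. For the environment created above, we would expect it to end in a folder named biocircuits.

```python
import os
import sys


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if unset."""
    return environ.get("CONDA_DEFAULT_ENV")


print("Active conda environment:", active_conda_env())
print("Python interpreter prefix:", sys.prefix)
```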

Launching JupyterLab

You can launch JupyterLab from the command line. On Windows, this is usually accessed through PowerShell and on macOS through Terminal. To launch JupyterLab, type the following on the command line (again, after you have done conda activate biocircuits).

jupyter lab

If the default browser on your machine is unsupported (e.g., Microsoft Edge), you can specify the browser you want using, for example,

jupyter lab --browser=firefox

Alternatively, you can use the Anaconda Navigator. Upon launching the Navigator, you should see an option to launch JupyterLab on the Home screen. (Be sure that you select All applications on biocircuits in the top pulldown menus.) After clicking Launch for JupyterLab, a new browser window or tab will open with JupyterLab running.

Within the JupyterLab window, you will have the option to launch a notebook, a console, a terminal, or a text editor. As you work through this book, you will use notebooks almost exclusively.

Checking your distribution

Let’s now run a quick test to make sure things are working properly. We will make a quick plot that requires some of the scientific libraries we will use in this book.

Launch a Jupyter notebook in JupyterLab. In the first cell (the box next to the [ ]: prompt), paste the code below. To run the code, press Shift+Enter while the cursor is active inside the cell. You should see a plot that looks like the one below. If you do, you have a functioning Python environment for scientific computing!

import numpy as np
import bokeh.io
import bokeh.models
import bokeh.plotting

bokeh.io.output_notebook()

# Generate plotting values
t = np.linspace(0, 2 * np.pi, 200)
x = 16 * np.sin(t) ** 3
y = 13 * np.cos(t) - 5 * np.cos(2 * t) - 2 * np.cos(3 * t) - np.cos(4 * t)

# Make the plot
p = bokeh.plotting.figure(frame_width=300, frame_height=300)
p.line(x, y, line_width=3, color="red")
source = bokeh.models.ColumnDataSource(
    dict(x=[0], y=[0], text=["Biocircuits"])
)
p.text(
    x="x",
    y="y",
    text="text",
    source=source,
    text_align="center",
    text_font_size="18pt",
)

# Display
bokeh.io.show(p)

Computing environment

At the end of each chapter in this book, all of which are constructed from a Jupyter notebook, we will show what versions of the employed packages were used. This will help you troubleshoot if your outputs look dissimilar from what is on the book website.

We use the handy watermark package to do this.

%load_ext watermark
%watermark -v -p numpy,bokeh,jupyterlab
Python implementation: CPython
Python version       : 3.10.10
IPython version      : 8.10.0

numpy     : 1.23.5
bokeh     : 3.1.0
jupyterlab: 3.5.3
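If the watermark extension is not available in your environment, a similar report can be assembled from the Python standard library. The following is a minimal sketch, not a replacement for watermark. The function name `package_versions` is our own for this example, and the package list simply mirrors the one used above.

```python
import importlib.metadata
import platform


def package_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = importlib.metadata.version(name)
        except importlib.metadata.PackageNotFoundError:
            versions[name] = None
    return versions


print(f"Python version: {platform.python_version()}")
for name, version in package_versions(["numpy", "bokeh", "jupyterlab"]).items():
    shown = version if version is not None else "not installed"
    print(f"{name:<10}: {shown}")
```

Comparing these numbers against those printed at the end of a chapter is a quick first step when your output differs from the book's.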