Google Colab Tutorial For Running Python Notebooks

This post is a short tutorial on Google Colab as an alternative to Jupyter for running Python code. It shows how to bring in, modify, and run a Jupyter notebook from a GitHub repository.

Colab (short for “Colaboratory”) is a Google cloud service that lets users write and execute Python code in a web-based environment without installing anything locally. Within limits, Colab is free to use, and it interacts with a user’s Google Drive, so Colab notebooks can import additional Python libraries from *.py files stored there. The instructions here also work from a Chromebook or from a computer that does not allow local file storage. To demonstrate Colab, we use a case study: running the Jupyter notebook in the Pandas introduction GitHub repository called Pandas_Intro_For_Noncoders. This tutorial walks step by step through using Colab to run the notebook, including modifying the repository notebook to import its data from a repository folder on Google Drive.

Note that several helpful code snippets are available in pasteable format at the bottom of this post.

Opening Colab and Cloning the Pandas_Intro_For_Noncoders GitHub Repository

  1. Open Colab by navigating to https://colab.research.google.com. This opens a blank notebook (e.g. Untitled0.ipynb in the picture below).
  2. To access your Google Drive from the notebook, mount the drive by executing the following two Python statements in a cell (they are also listed under "Useful code snippets" at the bottom of this post).
    • Type the statements into a blank cell (use the +Code button to add cells as needed).
    • Run the cell by clicking its run button (black circle with triangle) or by clicking in the cell and typing Shift+Enter.
  3. Optionally, add a new folder (e.g. Projects_Python in the example) to your Google Drive by clicking on the three-dot menu next to the folder and choosing New Folder.
  4. The picture shows how to access your Google Drive’s folder tree.
  5. It is helpful to also open a browser tab pointing to your Google Drive (https://drive.google.com/).
  6. Use the +Code button at the top of the notebook to add two blank cells.
  7. Enter and run the %cd command to change directory to the desired folder.
  8. Enter and run the !git clone command shown below to clone (i.e., make a copy of) the Pandas Intro GitHub repository directly into the selected Google Drive folder.
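
The mount, change-directory, and clone steps above can also be sketched in plain Python. This is only an illustration under stated assumptions: in a Colab cell the %cd and !git clone forms are shorter, and this version swaps in os.chdir and subprocess so the same code can be read (and safely run) outside a notebook. The Projects_Python folder comes from the example above, the repository URL is the one listed in the snippets at the bottom of this post (substitute the repository you actually want), and the helper function names are hypothetical.

```python
import os
import subprocess

# Assumption: Drive is mounted at the mount point used elsewhere in this post.
DRIVE_MOUNT_POINT = "/content/drive"
# Folder name from the example; adjust to your own Drive layout.
PARENT_FOLDER = "/content/drive/My Drive/Projects_Python"
# URL from the snippets at the bottom of this post; replace as needed.
REPO_URL = "https://github.com/jlandgre/Python_Colab_Template.git"


def running_in_colab():
    """Return True if the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401
        return True
    except ImportError:
        return False


def build_clone_command(repo_url):
    """Return the git command as an argument list for subprocess.run."""
    return ["git", "clone", repo_url]


def mount_and_clone(parent_folder=PARENT_FOLDER, repo_url=REPO_URL):
    """Mount Drive, move into parent_folder, and clone repo_url there."""
    from google.colab import drive
    drive.mount(DRIVE_MOUNT_POINT)
    os.makedirs(parent_folder, exist_ok=True)  # creates the folder if missing
    os.chdir(parent_folder)  # plain-Python counterpart of %cd
    subprocess.run(build_clone_command(repo_url), check=True)  # counterpart of !git clone


if running_in_colab():
    mount_and_clone()
else:
    print("Not running in Colab; skipping Drive mount and clone.")
```

Outside Colab the script only prints a message, so it is safe to try locally before pasting it into a notebook cell.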

Opening and Running the Pandas_Fundamentals.ipynb Notebook in Colab

  1. The Colab window does not have a way to open the notebook directly. Go to your Google Drive tab and right-click (or Control-click) the *.ipynb file.
  2. Choose Open With / Google Colaboratory. This opens the notebook in a separate Colab browser tab.
  3. We are done with the previous Untitled0.ipynb notebook, so it is OK to close its browser tab.
  4. Note that you can run notebook cells individually or choose Run All from Colab’s Runtime menu.
  5. Jupyter notebooks such as this one typically point to files using a local (hard drive) folder path. This causes a FileNotFoundError when the notebook tries to open the sample Excel data several cells into the notebook.
  6. To point the Colab notebook to your Google Drive folder, insert cells as shown below to a) mount the drive and b) create a prefix string for the Google Drive path.
  7. Add the dir_google_drive string as a filename prefix in the read_excel statement as shown. This allows the notebook to run from the copy of the sample data on your Google Drive. Be careful to include the “/” delimiters as shown and pay attention to case sensitivity.
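
As a rough sketch of the prefix idea, the helper below joins the Drive prefix and a repository-relative file path so that exactly one “/” separates them. The dir_google_drive value is the one from the snippets at the bottom of this post; the drive_path helper and the data/sample.xlsx file name are hypothetical stand-ins, since the actual file name depends on the repository's contents.

```python
import os

# Prefix string from the snippets at the bottom of this post.
dir_google_drive = "/content/drive/My Drive/Projects_Python/Pandas_Intro_For_Noncoders/"


def drive_path(relative_path, prefix=dir_google_drive):
    """Join prefix and relative_path with exactly one '/' between them."""
    return prefix.rstrip("/") + "/" + relative_path.lstrip("/")


# Hypothetical file name; use the path from the notebook's read_excel call.
excel_file = drive_path("data/sample.xlsx")
print(excel_file)

# Checking the path first gives a clearer message than a FileNotFoundError.
if not os.path.exists(excel_file):
    print("File not found; check the mount, the '/' delimiters, and letter case.")

# In the notebook, the prefixed path would then be passed to pandas, e.g.:
# df = pd.read_excel(excel_file)
```

Building the path in one place means the notebook can switch between Google Drive and a local folder by changing only the prefix string.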

That gets the repository’s notebook running in Google Colab with the repository’s sample data!

Useful code snippets:

#Mount user's Google Drive without precheck that it is already mounted
from google.colab import drive
drive.mount('/content/drive')

#Clone a GitHub repository as folder to current (Google Drive) working directory
#Use %cd xxx to change directory to desired parent folder (!cd does not persist between commands)
!git clone https://github.com/jlandgre/Python_Colab_Template.git

#Attempt to mount user's Google Drive (skipped if it is already mounted)
import os
is_drive_mounted = os.path.exists('/content/drive')
if not is_drive_mounted:
    try:
        from google.colab import drive
        drive.mount('/content/drive')
    except ModuleNotFoundError:
        #Add statements for case where notebook is run as local Jupyter
        pass

#Set a string prefix for Google Drive directory path
dir_google_drive = '/content/drive/My Drive/Projects_Python/Pandas_Intro_For_Noncoders/'

#Add project's Google Drive path to sys.path to allow importing *.py libraries
import sys
dir_libs = dir_google_drive + 'libs'
if dir_libs not in sys.path: sys.path.append(dir_libs)
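
To see what the sys.path snippet does without needing Google Drive, here is a small self-contained sketch. It assumes a temporary directory as a stand-in for the libs folder, and the my_helpers module and its double function are hypothetical examples, not part of the repository.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for dir_google_drive + 'libs'; on Colab this would be the Drive folder.
dir_libs = tempfile.mkdtemp()

# Write a tiny hypothetical library file into that folder.
with open(os.path.join(dir_libs, "my_helpers.py"), "w") as f:
    f.write("def double(x):\n    return 2 * x\n")

# Same pattern as the snippet above: add the folder only if it is not already listed.
if dir_libs not in sys.path:
    sys.path.append(dir_libs)
importlib.invalidate_caches()  # make sure the new file is visible to the import system

import my_helpers

print(my_helpers.double(21))  # prints 42
```

Once the folder is on sys.path, any *.py file in it can be imported by name, which is how a notebook on Colab can reuse library code kept on Google Drive.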