Jupyter notebook tricks

Aug 29, 2023

Love using Jupyter notebooks, but after a while, they look like a total mess? 😵‍💫

What if I told you there is a quick, simple, and efficient way to make them tidy and shiny?

 

The problem

Jupyter notebooks are one of the most popular environments for developing Machine Learning models.

They are the fastest way to
→ add code
→ fix code
→ re-run code

for your Machine Learning project. However, they quickly turn into a mess... unless you follow these 3 tips.

 

Tip 1. Encapsulate common code as functions

If you do not encapsulate your code, you are doomed to duplicate it.
And code duplication is both a productivity killer and an endless source of bugs.

The solution:
→ Define functionality ONCE.
→ Call it as many times as you need.
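
Here is a minimal sketch of the idea. The function name summarize and the score lists are made up for illustration; the point is only that the logic lives in one place and gets called twice, instead of being copy-pasted into two cells.

```python
import statistics


def summarize(values):
    """Return the mean and standard deviation of a list of numbers."""
    # Defined ONCE: any fix or change here applies to every call below.
    return {
        "mean": round(statistics.mean(values), 2),
        "std": round(statistics.stdev(values), 2),
    }


# Hypothetical scores, only here to show the function being reused.
train_scores = [0.81, 0.79, 0.84]
test_scores = [0.78, 0.75, 0.80]

# Called as many times as needed, with no duplicated logic.
train_summary = summarize(train_scores)
test_summary = summarize(test_scores)

print(train_summary)
print(test_summary)
```

If you later decide to report the median too, you change summarize in one place and every cell that calls it picks up the change.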

Tip 2. Extract common functions into a separate src/ folder

Often you have the same function defined in several notebooks. Which is, again, code duplication.

To solve this, create a source code folder (aka src/) at the same level as your notebooks and move your functions into .py files.

You can group these functions into separate .py files, depending on their main functionality:
→ plotting.py
→ data_transformation.py
→ model_training.py
→ utils.py
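
As a rough sketch, this is what one of those files could look like. The function min_max_scale is a hypothetical example, not something from a real project; any function you keep re-defining across notebooks is a good candidate.

```python
# Hypothetical contents of src/data_transformation.py


def min_max_scale(values):
    """Scale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# In a notebook sitting next to src/, you would then write:
#     from src.data_transformation import min_max_scale
scaled = min_max_scale([2, 4, 6])
print(scaled)
```

Every notebook now imports the same definition, so there is only one copy to maintain.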

 

Tip 3. Add autoreload magic to your notebook

To use your functions inside Jupyter, you need to import them.
For example:

from src.plotting import plot_figure

By default, Python caches imported modules and loads each one only once per kernel session. So when you update the .py files in src/, Jupyter does not pick up the changes unless you restart the kernel.

To solve this, you just add these 2 lines at the beginning of your notebook.

%load_ext autoreload
%autoreload 2
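
If you are curious about what is going on under the hood, the sketch below reproduces the caching behaviour in plain Python, outside Jupyter. It writes a throwaway module (demo_utils, a made-up name) to a temporary folder, edits it on disk, and shows that a second import still returns the old code until the module is reloaded by hand with importlib.reload. The autoreload extension saves you from doing that manually; its actual implementation is more involved than this.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip writing .pyc files so a quick rewrite of the file is never masked
# by stale bytecode (only needed for this demo).
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module standing in for a file under src/.
    module_path = Path(tmp) / "demo_utils.py"
    module_path.write_text("def greet():\n    return 'v1'\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()

    import demo_utils
    first = demo_utils.greet()

    # Edit the file on disk, as you would edit a file under src/.
    module_path.write_text("def greet():\n    return 'version 2'\n")

    # Importing again does nothing: the module is already in sys.modules.
    import demo_utils
    cached = demo_utils.greet()

    # An explicit reload re-executes the file and picks up the edit.
    importlib.reload(demo_utils)
    reloaded = demo_utils.greet()

    sys.path.remove(tmp)

print(first, cached, reloaded)
```

The middle value is still the old one, which is exactly the stale-code problem you hit in a notebook without autoreload.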

 

To sum up

Reduce code duplication by encapsulating common functionality into functions and extracting them into .py files under src/.

Add autoreload magic to your notebook to keep it in sync with your src/ code.

By following these simple steps, you will keep your notebook lightweight, and naturally transition your Python code into a packageable format.

 

That’s it for today.

Have a fantastic day, and keep on learning.

Pau

The Real World ML Newsletter
