Exercises - NumPy for Numerical Performance
Basic NumPy vs Pure Python
Exercise 1 - 🟢 Beginner
Create the same sequence of numbers using both a Python list and a NumPy array, then compare their memory usage:
```python
import sys
import numpy as np

# create both with values 0 to 999
lst = list(range(1_000))
arr = np.arange(1_000)
```
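One way the measurement might look (a sketch, not the only answer; exact byte counts vary across Python versions and platforms, so the printed numbers are illustrative):

```python
import sys
import numpy as np

lst = list(range(1_000))
arr = np.arange(1_000)

# list memory = the container's pointer array plus one PyObject per element
# (small ints are cached by CPython, so this slightly overcounts, but it
# gives the right order of magnitude)
list_bytes = sys.getsizeof(lst) + sum(sys.getsizeof(x) for x in lst)

# sys.getsizeof on an ndarray reports the header plus the data buffer
array_bytes = sys.getsizeof(arr)

print(f"list:  {list_bytes} bytes")
print(f"array: {array_bytes} bytes")
print(f"ratio: {list_bytes / array_bytes:.1f}x")
print(f"bytes per element: {arr.itemsize}")
```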
```python
# tasks:
# 1. measure memory of lst using sys.getsizeof()
#    remember: you need to account for both
#    the list container AND the PyObjects
# 2. measure memory of arr using sys.getsizeof()
# 3. what is the ratio between the two?
# 4. verify arr.itemsize - how many bytes per element?
```

Exercise 2 - 🟢 Beginner
Predict the output of the following code before running it:
```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

print(type(arr))     # ?
print(arr.dtype)     # ?
print(arr.itemsize)  # ?
print(arr.nbytes)    # ?
print(arr.shape)     # ?
print(arr.ndim)      # ?
```

Exercise 3 - 🟢 Beginner
Use timeit to measure the performance difference between a Python list comprehension and a NumPy vectorized operation for squaring 100,000 numbers:
```python
import numpy as np
import timeit

data_list = list(range(100_000))
data_np = np.arange(100_000)

# ❌ python list comprehension
def python_squares(lst):
    return [x**2 for x in lst]

# ✅ numpy vectorized
def numpy_squares(arr):
    return arr ** 2
```
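A sketch of how the timing harness might look (absolute times depend entirely on your hardware; the speedup ratio is the interesting number):

```python
import numpy as np
import timeit

data_list = list(range(100_000))
data_np = np.arange(100_000)

t_python = timeit.timeit(lambda: [x**2 for x in data_list], number=100)
t_numpy = timeit.timeit(lambda: data_np ** 2, number=100)

print(f"python: {t_python:.3f}s  numpy: {t_numpy:.3f}s  "
      f"speedup: {t_python / t_numpy:.0f}x")

# both approaches must agree before the timing means anything
assert np.array_equal([x**2 for x in data_list], data_np ** 2)
```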
```python
# tasks:
# 1. measure both with timeit number=100
# 2. calculate the speedup ratio
# 3. verify both produce identical results
# hint: np.array_equal(python_result, numpy_result)
```

Vectorized Operations
Exercise 4 - 🟢 Beginner
Rewrite these Python loops as NumPy vectorized operations and verify the results are identical:
```python
import numpy as np

data = list(range(1_000))

# ❌ python loops - rewrite each as a NumPy operation
def add_ten(lst):
    return [x + 10 for x in lst]

def multiply_by_two(lst):
    return [x * 2 for x in lst]

def subtract_mean(lst):
    mean = sum(lst) / len(lst)
    return [x - mean for x in lst]

# ✅ rewrite using NumPy
arr = np.array(data)
```
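One possible set of rewrites, with the verification built in as assertions (a sketch; each Python loop collapses to a single array expression that NumPy evaluates in compiled C code):

```python
import numpy as np

data = list(range(1_000))
arr = np.array(data)

# elementwise add and multiply become plain operators on the array
assert np.array_equal([x + 10 for x in data], arr + 10)
assert np.array_equal([x * 2 for x in data], arr * 2)

# the mean-centering loop becomes one expression; arr.mean() computes
# the same value as sum(data) / len(data)
mean = sum(data) / len(data)
assert np.array_equal([x - mean for x in data], arr - arr.mean())
print("all three rewrites verified")
```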
```python
# expected:
# np.array_equal(add_ten(data), arr + 10)                → True
# np.array_equal(multiply_by_two(data), arr * 2)         → True
# np.array_equal(subtract_mean(data), arr - arr.mean())  → True
```

Exercise 5 - 🟡 Intermediate
Use NumPy vectorized filtering to replace these Python list comprehensions and measure the speedup:
```python
import numpy as np
import timeit

data_list = list(range(1_000_000))
data_np = np.arange(1_000_000)

# ❌ python filtering
def python_filter(lst):
    return [x for x in lst if x > 500_000]

# ✅ rewrite using NumPy boolean indexing
def numpy_filter(arr):
    # your implementation here
    pass
```
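A sketch of the boolean-indexing version (one way to do it): `arr > 500_000` builds a boolean mask, an array of True/False with one entry per element, and indexing with that mask keeps only the positions where it is True.

```python
import numpy as np

data_np = np.arange(1_000_000)

def numpy_filter(arr):
    # the comparison produces a boolean mask; indexing with it
    # selects only the elements where the mask is True
    return arr[arr > 500_000]

result = numpy_filter(data_np)
# values 500_001 .. 999_999 survive, i.e. 499_999 elements
print(len(result), result[0], result[-1])
```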
```python
# tasks:
# 1. implement numpy_filter using boolean indexing
# 2. measure both with timeit number=10
# 3. verify results are identical
# 4. explain what a boolean mask is in the context of NumPy
```

Exercise 6 - 🟡 Intermediate
Use NumPy to replace this Python loop that applies multiple operations to a dataset:
```python
import numpy as np

data = list(range(1, 1_001))

# ❌ python - multiple passes, multiple loops
def python_pipeline(lst):
    squared = [x**2 for x in lst]
    filtered = [x for x in squared if x > 1_000]
    total = sum(filtered)
    return total

# ✅ rewrite as a single NumPy pipeline
def numpy_pipeline(arr):
    # your implementation here
    pass

arr = np.arange(1, 1_001)
```
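One way the NumPy pipeline might look (a sketch): note that the filter must apply to the *squared* values, mirroring the Python version, not to the raw input.

```python
import numpy as np

arr = np.arange(1, 1_001)

def python_pipeline(lst):
    squared = [x**2 for x in lst]
    filtered = [x for x in squared if x > 1_000]
    return sum(filtered)

def numpy_pipeline(arr):
    # square, filter, and sum without ever building a Python list
    squared = arr ** 2
    return squared[squared > 1_000].sum()

assert numpy_pipeline(arr) == python_pipeline(list(range(1, 1_001)))
print(numpy_pipeline(arr))
```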
```python
# expected:
# python_pipeline(data) → same result as numpy_pipeline(arr)
# hint: squared = arr ** 2; squared[squared > 1_000].sum()
# (the filter applies to the squared values, not the raw ones)
```

Aggregations
Exercise 7 - 🟢 Beginner
Replace these manual Python aggregations with NumPy equivalents and measure the speedup:
```python
import numpy as np
import timeit

data_list = list(range(1_000_000))
data_np = np.arange(1_000_000, dtype=np.float64)

# ❌ python manual aggregations
def python_stats(lst):
    n = len(lst)
    mean = sum(lst) / n
    minimum = min(lst)
    maximum = max(lst)
    return mean, minimum, maximum

# ✅ rewrite using NumPy
def numpy_stats(arr):
    # your implementation here
    pass
```
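A possible implementation sketch: each NumPy aggregation is a single pass over the contiguous buffer in C, with no per-element PyObject handling.

```python
import numpy as np

data_np = np.arange(1_000_000, dtype=np.float64)

def numpy_stats(arr):
    # each call is one compiled loop over the array's buffer
    return arr.mean(), arr.min(), arr.max()

mean, minimum, maximum = numpy_stats(data_np)
print(mean, minimum, maximum)  # 499999.5 0.0 999999.0
```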
```python
# tasks:
# 1. implement numpy_stats using arr.mean(), arr.min(), arr.max()
# 2. measure both with timeit number=100
# 3. verify results are identical
# 4. calculate the speedup ratio for each operation
```

Exercise 8 - 🟡 Intermediate
Use NumPy to compute descriptive statistics on a dataset and compare with Python's statistics module:
```python
import numpy as np
import statistics
import timeit

data_list = [float(x) for x in range(100_000)]
data_np = np.array(data_list)
```
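A sketch of the comparison (using a smaller n of 10,000 here, my choice to keep the pure-Python statistics calls fast). One pitfall worth knowing before the "verify results are identical" step: `statistics.stdev` is the *sample* standard deviation (divides by n-1), while `ndarray.std()` defaults to the *population* standard deviation (divides by n), so you need `ddof=1` to compare like with like.

```python
import numpy as np
import statistics

data_list = [float(x) for x in range(10_000)]
data_np = np.array(data_list)

# mean, min, max agree exactly for this data
assert statistics.mean(data_list) == data_np.mean()
assert min(data_list) == data_np.min()
assert max(data_list) == data_np.max()

# stdev vs std: pass ddof=1 so both divide by n-1
assert abs(statistics.stdev(data_list) - data_np.std(ddof=1)) < 1e-6
print("statistics module and NumPy agree")
```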
```python
# tasks:
# 1. compute mean, std, min, max using both
#    the statistics module and NumPy
# 2. verify results are identical
# 3. measure performance of each
# 4. which is faster and by how much?
```
```python
# expected:
# statistics.mean(data_list)  vs data_np.mean()
# statistics.stdev(data_list) vs data_np.std(ddof=1)
#   (stdev is the sample std; plain data_np.std() is the population std)
# min(data_list)              vs data_np.min()
# max(data_list)              vs data_np.max()
```

Memory Layout
Exercise 9 - 🟡 Intermediate
Investigate the memory layout of NumPy arrays with different dtypes and explain the trade-offs:
```python
import numpy as np
import sys

data = list(range(1_000))

# create arrays with different dtypes
# note: values above 127 don't fit in int8; np.array(data, dtype=np.int8)
# raises OverflowError on NumPy >= 2.0, so cast with .astype() to see the
# wraparound instead
arr_int8 = np.arange(1_000).astype(np.int8)     # 1 byte per element
arr_int32 = np.array(data, dtype=np.int32)      # 4 bytes per element
arr_int64 = np.array(data, dtype=np.int64)      # 8 bytes per element
arr_float32 = np.array(data, dtype=np.float32)  # 4 bytes per element
arr_float64 = np.array(data, dtype=np.float64)  # 8 bytes per element
```
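A sketch of how the per-dtype comparison might look (using values 0..127 here, my choice so every value fits even in int8 and the arrays hold identical data):

```python
import numpy as np

data = list(range(128))  # values that fit every dtype below

for dtype in (np.int8, np.int32, np.int64, np.float32, np.float64):
    arr = np.array(data, dtype=dtype)
    # itemsize is bytes per element; nbytes is the whole data buffer
    print(f"{dtype.__name__:>8}: itemsize={arr.itemsize}  nbytes={arr.nbytes}")
```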
```python
# tasks:
# 1. verify itemsize for each array
# 2. calculate total memory for each using arr.nbytes
# 3. compare with the Python list memory (~36 bytes per integer)
# 4. when would you choose int8 over int64?
# 5. what happens if you store 1000 in an int8 array?
#    hint: np.array([1000], dtype=np.int8) raises OverflowError on
#    NumPy >= 2.0, while np.array([1000]).astype(np.int8) silently wraps
```

Exercise 10 - 🟡 Intermediate
Use tracemalloc to measure and compare the actual memory allocated by a Python list and a NumPy array at different scales:
```python
import numpy as np
import tracemalloc

sizes = [1_000, 10_000, 100_000, 1_000_000]

for size in sizes:
    # measure Python list
    tracemalloc.start()
    lst = list(range(size))
    snap_list = tracemalloc.take_snapshot()
    tracemalloc.stop()

    # measure NumPy array
    tracemalloc.start()
    arr = np.arange(size)
    snap_arr = tracemalloc.take_snapshot()
    tracemalloc.stop()
```
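One way to extract the numbers from a snapshot (a sketch for a single size; NumPy reports its buffer allocations to tracemalloc, so they do show up in the totals):

```python
import numpy as np
import tracemalloc

def measure(factory):
    # total bytes attributed to allocations made while factory() runs
    tracemalloc.start()
    obj = factory()
    snapshot = tracemalloc.take_snapshot()
    tracemalloc.stop()
    return sum(stat.size for stat in snapshot.statistics("lineno"))

size = 100_000
list_bytes = measure(lambda: list(range(size)))
numpy_bytes = measure(lambda: np.arange(size))

print(f"size={size}  list={list_bytes / 1024:.0f} KB  "
      f"numpy={numpy_bytes / 1024:.0f} KB  "
      f"ratio={list_bytes / numpy_bytes:.1f}x")
```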
```python
# tasks:
# 1. extract memory usage from each snapshot
# 2. calculate the ratio for each size
# 3. does the ratio stay constant as size grows?
# 4. plot or print a table of results

# expected:
# size=1_000      list=XX KB  numpy=XX KB  ratio=X.Xx
# size=10_000     list=XX KB  numpy=XX KB  ratio=X.Xx
# size=100_000    list=XX KB  numpy=XX KB  ratio=X.Xx
# size=1_000_000  list=XX MB  numpy=XX MB  ratio=X.Xx
```

When to Use NumPy
Exercise 11 - 🟡 Intermediate
For each scenario decide whether NumPy is the right tool, justify your answer, and implement the solution using the appropriate approach:
```python
# scenario 1 - sum 10 numbers
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
# numpy or pure python? why?

# scenario 2 - square 1 million numbers
data = list(range(1_000_000))
# numpy or pure python? why?

# scenario 3 - store a list of names
names = ["Alice", "Bob", "Charlie"]
# numpy or pure python? why?

# scenario 4 - compute the mean of 100,000 sensor readings
readings = [float(x) for x in range(100_000)]
# numpy or pure python? why?

# scenario 5 - matrix multiplication of two 1000×1000 matrices
# numpy or pure python? why?
```

Exercise 12 - 🔴 Advanced
Build a benchmark that compares pure Python, NumPy, and built-in functions across different input sizes and produces a summary table:
```python
import numpy as np
import timeit

sizes = [1_000, 10_000, 100_000, 1_000_000]

for size in sizes:
    data_list = list(range(size))
    data_np = np.arange(size, dtype=np.float64)

    # benchmark three approaches for squaring numbers
    t_python = timeit.timeit(
        lambda: [x**2 for x in data_list], number=10
    )
    t_builtin = timeit.timeit(
        lambda: list(map(lambda x: x**2, data_list)), number=10
    )
    t_numpy = timeit.timeit(
        lambda: data_np ** 2, number=10
    )
```
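A sketch of the table-printing harness (I stop at 100,000 elements here to keep the run short; extend `sizes` to 1,000,000 for the full exercise, and expect the numpy column to pull further ahead as size grows):

```python
import numpy as np
import timeit

print(f"{'size':>10}  {'python':>8}  {'builtin':>8}  {'numpy':>8}")
for size in (1_000, 10_000, 100_000):
    data_list = list(range(size))
    data_np = np.arange(size, dtype=np.float64)

    t_python = timeit.timeit(lambda: [x**2 for x in data_list], number=10)
    t_builtin = timeit.timeit(
        lambda: list(map(lambda x: x**2, data_list)), number=10
    )
    t_numpy = timeit.timeit(lambda: data_np ** 2, number=10)

    print(f"{size:>10}  {t_python:8.4f}  {t_builtin:8.4f}  {t_numpy:8.4f}")
```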
```python
# expected output:
# size=1_000      python=X.XXXs  builtin=X.XXXs  numpy=X.XXXs
# size=10_000     python=X.XXXs  builtin=X.XXXs  numpy=X.XXXs
# size=100_000    python=X.XXXs  builtin=X.XXXs  numpy=X.XXXs
# size=1_000_000  python=X.XXXs  builtin=X.XXXs  numpy=X.XXXs

# tasks:
# 1. complete the benchmark and print the table
# 2. at what size does NumPy start winning decisively?
# 3. is builtin map() always faster than a list comprehension?
# 4. does the speedup ratio grow with size?
```

Try measuring both time and memory for every exercise: the goal is to build an intuition for when NumPy is worth the dependency, and when pure Python is the simpler and perfectly adequate choice.