Pure functions
Pure Functions
Section titled “Pure Functions”A pure function has two guarantees:
- Deterministic: given the same input, it always returns the same output
- No side effects: it doesn’t read from or modify anything outside its own scope,
- no global state
- no I/O
- no mutation of arguments
# Pure: depends only on its inputs, changes nothing outsidedef add(a, b): return a + b
def square(n): return n ** 2Impure Functions
Section titled “Impure Functions”An impure function violates one or both of the above rules: it either produces different results for the same input, or it causes side effects.
# 1. Modifies external/global statecount = 0def increment(): global count count += 1 # side effect: mutates global variable
# 2. Depends on external state (non-deterministic output)import randomdef roll_dice(): return random.randint(1, 6) # same call, different results
# 3. Mutates its argumentdef append_item(lst, item): lst.append(item) # side effect: modifies the caller's list return lst
# 4. Performs I/Odef greet(name): print(f"Hello, {name}") # side effect: writes to stdoutImpure vs. pure
Section titled “Impure vs. pure”# ❌ Impure — depends on and modifies external statetotal = 0
def add_to_total(x): global total total += x # Side effect — modifies external state return total
# ❌ Impure — modifies its argumentdef append_zero(lst): lst.append(0) # Mutates input — surprising to caller! return lst# ✅ Pure — no side effects, same input → same outputdef add(a, b): return a + b # Only depends on inputs, touches nothing else
# ✅ Pure — returns new data instead of mutatingdef append_zero(lst): return lst + [0] # Original list untouched
# Pure functions are:# - Easy to test (no setup/teardown)# - Easy to reason about# - Safe to cache (memoize)# - Safe to run in parallel
import functools
@functools.lru_cache(maxsize=None)def fibonacci(n): # Pure → safe to cache! if n <= 1: return n return fibonacci(n-1) + fibonacci(n-2)Side by Side
Section titled “Side by Side”| Property | Pure | Impure |
|---|---|---|
| Same input → same output | ✅ Always | ❌ Not guaranteed |
| Modifies external state | ❌ Never | ✅ May do so |
| Easy to test | ✅ Yes | ⚠️ Harder |
| Safe to cache / memoize | ✅ Yes | ❌ No |
| Safe to parallelize | ✅ Yes | ⚠️ Depends |
Memoization
Section titled “Memoization”Memoization is an optimization technique where a function caches its results. The first time it’s called with a given input, it computes and stores the result; every subsequent call with the same input skips the computation and returns the cached value directly.
A Simple Example
Section titled “A Simple Example”# Without memoization — recomputes every timedef fibonacci(n): if n <= 1: return n return fibonacci(n - 1) + fibonacci(n - 2)Calling fibonacci(40) here makes millions of redundant recursive calls. With memoization:
# Manual memoization using a cache dictcache = {}def fibonacci(n): if n in cache: return cache[n] # return stored result instantly if n <= 1: return n cache[n] = fibonacci(n - 1) + fibonacci(n - 2) return cache[n]Or using Python’s built-in decorator:
from functools import lru_cache
@lru_cache(maxsize=None)def fibonacci(n): if n <= 1: return n return fibonacci(n - 1) + fibonacci(n - 2)lru_cache handles the caching automatically. LRU stands for Least Recently Used: it can optionally discard old cached
results if memory is a concern.
Why It Only Works on Pure Functions
Section titled “Why It Only Works on Pure Functions”Memoization assumes that the same input always produces the same output. If a function is impure (relying on global state, randomness, or I/O) caching its result would return stale or wrong answers. This is why memoizability is listed as a key advantage of pure functions.
The Trade-off
Section titled “The Trade-off”| Detail | |
|---|---|
| Saves | Time (avoids recomputation) |
| Costs | Memory (stores previous results) |
| Best for | Expensive, repeatedly-called pure functions |
| Avoid when | Inputs are always unique, or memory is tight |