Skip to content

Pure functions

A pure function has two guarantees:

  1. Deterministic: given the same input, it always returns the same output
  2. No side effects: it doesn’t read from or modify anything outside its own scope,
    • no global state
    • no I/O
    • no mutation of arguments
# Pure: depends only on its inputs, changes nothing outside
def add(a, b):
return a + b
def square(n):
return n ** 2

An impure function violates one or both of the above rules: it either produces different results for the same input, or it causes side effects.

Common causes of impurity
# 1. Modifies external/global state
count = 0
def increment():
global count
count += 1 # side effect: mutates global variable
# 2. Depends on external state (non-deterministic output)
import random
def roll_dice():
return random.randint(1, 6) # same call, different results
# 3. Mutates its argument
def append_item(lst, item):
lst.append(item) # side effect: modifies the caller's list
return lst
# 4. Performs I/O
def greet(name):
print(f"Hello, {name}") # side effect: writes to stdout
Impure functions
# ❌ Impure — depends on and modifies external state
total = 0
def add_to_total(x):
global total
total += x # Side effect — modifies external state
return total
# ❌ Impure — modifies its argument
def append_zero(lst):
lst.append(0) # Mutates input — surprising to caller!
return lst
Pure functions
# ✅ Pure — no side effects, same input → same output
def add(a, b):
return a + b # Only depends on inputs, touches nothing else
# ✅ Pure — returns new data instead of mutating
def append_zero(lst):
return lst + [0] # Original list untouched
# Pure functions are:
# - Easy to test (no setup/teardown)
# - Easy to reason about
# - Safe to cache (memoize)
# - Safe to run in parallel
import functools
@functools.lru_cache(maxsize=None)
def fibonacci(n): # Pure → safe to cache!
if n <= 1:
return n
return fibonacci(n-1) + fibonacci(n-2)
PropertyPureImpure
Same input → same output✅ Always❌ Not guaranteed
Modifies external state❌ Never✅ May do so
Easy to test✅ Yes⚠️ Harder
Safe to cache / memoize✅ Yes❌ No
Safe to parallelize✅ Yes⚠️ Depends

Memoization is an optimization technique where a function caches its results. The first time it’s called with a given input, it computes and stores the result; every subsequent call with the same input skips the computation and returns the cached value directly.

# Without memoization — recomputes every time
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)

Calling fibonacci(40) here makes millions of redundant recursive calls. With memoization:

# Manual memoization using a cache dict
cache = {}
def fibonacci(n):
if n in cache:
return cache[n] # return stored result instantly
if n <= 1:
return n
cache[n] = fibonacci(n - 1) + fibonacci(n - 2)
return cache[n]

Or using Python’s built-in decorator:

from functools import lru_cache
@lru_cache(maxsize=None)
def fibonacci(n):
if n <= 1:
return n
return fibonacci(n - 1) + fibonacci(n - 2)

lru_cache handles the caching automatically. LRU stands for Least Recently Used: it can optionally discard old cached results if memory is a concern.

Memoization assumes that the same input always produces the same output. If a function is impure (relying on global state, randomness, or I/O) caching its result would return stale or wrong answers. This is why memoizability is listed as a key advantage of pure functions.

Detail
SavesTime (avoids recomputation)
CostsMemory (stores previous results)
Best forExpensive, repeatedly-called pure functions
Avoid whenInputs are always unique, or memory is tight