
Threading — I/O Bound Tasks

Threading shines when your program spends most of its time waiting — for a network response, a file read, a database query. During that wait the GIL is released, allowing other threads to run. The result is that instead of waiting for each operation to complete before starting the next, multiple operations can be in flight simultaneously.

The key insight is that threading does not make individual operations faster — it makes the total waiting time shorter by overlapping it.

Sequential vs Threaded — The Core Difference


The example below simulates fetching data from three URLs, each taking 2 seconds. Without threading the program waits for each fetch to complete before starting the next — 6 seconds total. With threading all three fetches run simultaneously — 2 seconds total:

import threading
import time

def fetch_data(url, results, index):
    """Simulate a network request — 2 seconds of I/O wait."""
    print(f"fetching {url}...")
    time.sleep(2)  # simulated I/O wait — GIL released here
    results[index] = f"data from {url}"
    print(f"done: {url}")

urls = ["site1.com", "site2.com", "site3.com"]
results = [None] * len(urls)

Without threading — sequential, 6 seconds:

start = time.perf_counter()
for i, url in enumerate(urls):
    fetch_data(url, results, i)  # wait 2s, then start next
print(f"sequential: {time.perf_counter() - start:.1f}s")  # ~6.0s

timeline without threading
t=0s fetch site1.com ──────────────────► done at t=2s
t=2s fetch site2.com ──────────────────► done at t=4s
t=4s fetch site3.com ──────────────────► done at t=6s
total: 6s

With threading — concurrent, 2 seconds:

results = [None] * len(urls)
threads = []
start = time.perf_counter()
for i, url in enumerate(urls):
    t = threading.Thread(target=fetch_data, args=(url, results, i))
    threads.append(t)
    t.start()  # start immediately, do not wait
for t in threads:
    t.join()  # wait for ALL threads to finish
print(f"threaded: {time.perf_counter() - start:.1f}s")  # ~2.0s ✅
print(results)

timeline with threading
t=0s fetch site1.com ──────────────────► done at t=2s
t=0s fetch site2.com ──────────────────► done at t=2s ← all start at once
t=0s fetch site3.com ──────────────────► done at t=2s
total: 2s

t.join() is important — it tells the main thread to wait until each worker thread has finished before continuing. Without it the main thread would print results before the fetches complete.
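A minimal sketch of that failure mode, with the sleep shortened to 0.1 seconds for brevity. Reading the shared list before `join()` may observe the unfinished value; reading it after `join()` is always safe:

```python
import threading
import time

results = [None]

def worker():
    time.sleep(0.1)      # simulated I/O wait
    results[0] = "done"

t = threading.Thread(target=worker)
t.start()

before = results[0]  # read before join — the worker has likely not finished yet
t.join()             # block until the worker completes
after = results[0]   # safe to read now

print(before, after)
```

With the 0.1 s wait, `before` is almost always still `None`, while `after` is guaranteed to be `"done"` because `join()` returns only once the worker has run to completion.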


Managing threads manually with threading.Thread is verbose and error-prone for larger workloads. ThreadPoolExecutor from concurrent.futures provides a higher-level API that handles thread creation, management, and cleanup automatically:

from concurrent.futures import ThreadPoolExecutor, as_completed
import time

def fetch(url):
    """Simulate a network request."""
    time.sleep(2)
    return f"data from {url}"

urls = ["site1.com", "site2.com", "site3.com", "site4.com"]

Processing results as they complete — useful when you want to handle each result immediately rather than waiting for all:

with ThreadPoolExecutor(max_workers=4) as executor:
    # submit all tasks — returns a future for each
    futures = {executor.submit(fetch, url): url for url in urls}
    # process each result as it arrives — not in submission order
    for future in as_completed(futures):
        url = futures[future]
        result = future.result()
        print(f"{url}: {result}")

Processing results in order — useful when the order of results matters:

with ThreadPoolExecutor(max_workers=4) as executor:
    # map preserves order — blocks until all tasks complete
    results = list(executor.map(fetch, urls))

print(results)
# ['data from site1.com', 'data from site2.com', ...] ← in original order

The difference between submit and map:

submit + as_completed          map
─────────────────────          ─────────────────────
results arrive as ready,       results arrive in order;
out of order                   blocks until all done
useful when order              useful when order
does not matter                matters
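One practical consequence of the `submit` approach: `future.result()` re-raises any exception the task raised, so each task's failure can be handled individually without losing the other results. A sketch, where `flaky_fetch` and its failure condition are invented for illustration:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def flaky_fetch(url):
    """Hypothetical fetch that fails for certain URLs."""
    if "bad" in url:
        raise ConnectionError(f"could not reach {url}")
    return f"data from {url}"

urls = ["site1.com", "bad-site.com", "site2.com"]
results = {}

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = {executor.submit(flaky_fetch, url): url for url in urls}
    for future in as_completed(futures):
        url = futures[future]
        try:
            results[url] = future.result()  # re-raises the task's exception here
        except ConnectionError as e:
            results[url] = f"failed: {e}"   # handle this task, keep the rest

print(results)
```

With `map`, by contrast, the first exception propagates when you iterate over the results, and the remaining values are lost.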

Because threads share memory, they can read and modify the same variable simultaneously — leading to race conditions where the final value depends on the unpredictable order in which threads execute.

The problem occurs because counter += 1 is not a single atomic operation — it is three steps that can be interrupted between any of them:

counter += 1 expands to:
  step 1: READ current value of counter   ← thread can be interrupted here
  step 2: ADD 1 to the value              ← or here
  step 3: WRITE result back to counter    ← or here

if two threads interleave these steps:
  thread A reads counter = 100
  thread B reads counter = 100   ← reads before A writes
  thread A writes counter = 101
  thread B writes counter = 101  ← overwrites A's result!

expected: 102
actual:   101 ← one increment lost
import threading

# ❌ race condition — counter += 1 is not atomic
counter = 0

def increment():
    global counter
    for _ in range(100_000):
        counter += 1  # read → add → write — not atomic!

threads = [threading.Thread(target=increment) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # should be 500,000 — often much less!

The fix is a Lock — it ensures only one thread can execute the critical section at a time:

# ✅ thread safe — Lock prevents interleaving
counter = 0
lock = threading.Lock()

def safe_increment():
    global counter
    for _ in range(100_000):
        with lock:  # only one thread enters here at a time
            counter += 1  # now atomic — no interleaving possible

threads = [threading.Thread(target=safe_increment) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # always 500,000 ✅
with lock:
  thread A acquires lock ──► executes counter += 1 ──► releases lock
  thread B acquires lock ──► executes counter += 1 ──► releases lock

threads take turns — no interleaving, no lost increments

The with lock: syntax automatically acquires the lock on entry and releases it on exit — even if an exception occurs inside the block.
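Under the hood, the context manager is equivalent to an explicit acquire/release pair wrapped in try/finally. A sketch of that expansion, scaled down to a single increment per thread to keep it short:

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment():
    global counter
    lock.acquire()       # block until the lock is free
    try:
        counter += 1     # critical section
    finally:
        lock.release()   # always runs, even if the critical section raises

threads = [threading.Thread(target=safe_increment) for _ in range(5)]
for t in threads: t.start()
for t in threads: t.join()
print(counter)  # 5
```

The try/finally is the crucial part: if the critical section raised and the lock were never released, every other thread waiting on `lock.acquire()` would block forever. `with lock:` gives you that guarantee with less code.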