Async/Await — The Modern Way

Threading and multiprocessing both achieve concurrency by running multiple execution units simultaneously — threads or processes that the OS schedules and switches between. Async/Await takes a fundamentally different approach: a single thread runs an event loop that switches between tasks cooperatively rather than preemptively.

The key word is cooperative — a task voluntarily yields control at await points when it is waiting for I/O, allowing the event loop to run another task. No OS involvement, no thread switching overhead, no GIL concerns. The event loop is entirely in control.

Threading                              Async/Await
─────────                              ───────────
Thread 1 ──████████░░░░────            Coroutine 1 ──████░░░░████──
Thread 2 ──░░░░████████────            Coroutine 2 ──░░░░████░░░░──
Thread 3 ──░░░░░░░░████────
                                       ░ = waiting at await point
OS switches threads                    event loop switches coroutines
preemptively — any time                cooperatively — only at await
overhead per switch                    no OS involvement — faster

This makes async/await extremely efficient for programs that manage many concurrent I/O operations — thousands of simultaneous HTTP requests, database queries, or WebSocket connections — where threading would require thousands of threads, each carrying memory and switching overhead.
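The scaling claim is easy to check empirically. The sketch below (the task count and `tiny_io` helper are illustrative, not from the original) launches 10,000 coroutines that each wait 0.1s; on a single thread they all overlap, so the whole batch finishes in a fraction of a second rather than 1,000 seconds:

```python
import asyncio
import time

async def tiny_io(i):
    await asyncio.sleep(0.1)   # simulated I/O wait — yields to the event loop
    return i

async def run_all():
    start = time.perf_counter()
    # 10,000 concurrent "connections" on one thread — no thread stacks needed
    results = await asyncio.gather(*(tiny_io(i) for i in range(10_000)))
    print(f"{len(results)} tasks in {time.perf_counter() - start:.2f}s")
    return results

results = asyncio.run(run_all())
```

Trying the same with 10,000 OS threads would allocate gigabytes of stack space before doing any work.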


An async def function is a coroutine — it does not run immediately when called. It returns a coroutine object that the event loop schedules and runs. The await keyword marks points where the coroutine voluntarily yields control:

import asyncio

async def fetch(url):              # async def — defines a coroutine
    print(f"start: {url}")
    await asyncio.sleep(2)         # yield control here — event loop runs others
    print(f"done: {url}")
    return f"data from {url}"

await asyncio.sleep(2)
    coroutine says:  "I am waiting for 2 seconds,
                      run something else in the meantime"
    event loop:      "noted — I will come back in 2 seconds"

    event loop runs other coroutines while this one waits
    2 seconds later — event loop resumes this coroutine
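The "does not run immediately" point can be verified directly: calling a coroutine function just builds a coroutine object, and nothing executes until an event loop drives it (a minimal sketch; the URL is illustrative):

```python
import asyncio

async def fetch(url):
    await asyncio.sleep(0)          # a trivial await point
    return f"data from {url}"

coro = fetch("site1.com")           # nothing runs yet — just a coroutine object
print(type(coro).__name__)          # → coroutine
result = asyncio.run(coro)          # the event loop actually executes it
print(result)                       # → data from site1.com
```

If a coroutine object is created but never awaited, Python emits a `RuntimeWarning: coroutine ... was never awaited` — a common sign of a missing `await`.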

Sequential vs Concurrent — await vs gather
The example below shows the difference between running coroutines sequentially and running them concurrently. The code looks similar — the difference is in how you call them:

Sequential — each waits for the previous to finish:

import asyncio
import time

async def fetch(url, delay):
    print(f"fetching {url}...")
    await asyncio.sleep(delay)
    return f"data from {url}"

async def main_sequential():
    start = time.perf_counter()
    result1 = await fetch("site1.com", 2)   # waits 2s before starting next
    result2 = await fetch("site2.com", 1)   # waits 1s before starting next
    result3 = await fetch("site3.com", 3)   # waits 3s
    print(f"sequential: {time.perf_counter() - start:.1f}s")   # ~6s

asyncio.run(main_sequential())

sequential timeline
t=0s  fetch site1.com ──────────────────► done at t=2s
t=2s  fetch site2.com ──────────────────► done at t=3s
t=3s  fetch site3.com ──────────────────► done at t=6s
                                          total: ~6s

Concurrent — all run simultaneously with gather:

async def main_concurrent():
    start = time.perf_counter()
    # gather starts all coroutines at once and
    # switches between them at every await point
    results = await asyncio.gather(
        fetch("site1.com", 2),
        fetch("site2.com", 1),
        fetch("site3.com", 3),
    )
    print(f"concurrent: {time.perf_counter() - start:.1f}s")   # ~3s ✅
    print(results)

asyncio.run(main_concurrent())

concurrent timeline (gather)
t=0s  fetch site1.com ──────────────────────────► done at t=2s
t=0s  fetch site2.com ──────────────────► done at t=1s
t=0s  fetch site3.com ────────────────────────────────────► done at t=3s
                                          total: ~3s ✅ (longest task)

asyncio.gather starts all coroutines at once and returns when the last one finishes — the total time is determined by the slowest task, not the sum of all tasks.
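One error-handling detail worth knowing: by default, the first exception raised by any task propagates out of the `await` (the other tasks keep running). Passing `return_exceptions=True` makes `gather` return exceptions in the results list instead, so one failure does not hide the other results. A minimal sketch (`might_fail` is a made-up helper for illustration):

```python
import asyncio

async def might_fail(n):
    await asyncio.sleep(0.01)
    if n == 2:
        raise ValueError(f"task {n} failed")
    return n

async def main():
    # return_exceptions=True — exceptions appear in place of results
    return await asyncio.gather(
        might_fail(1), might_fail(2), might_fail(3),
        return_exceptions=True,
    )

results = asyncio.run(main())
print(results)   # [1, ValueError('task 2 failed'), 3]
```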


asyncio.create_task schedules a coroutine on the event loop right away and returns a Task object without waiting for it — the task actually begins running the next time the current coroutine yields at an await point. This is useful when you want to start work and continue doing other things while it runs:

import asyncio

async def background_task(name, delay):
    await asyncio.sleep(delay)
    print(f"task {name} completed")
    return name

async def main():
    # create tasks — scheduled to run as soon as we next yield control
    task1 = asyncio.create_task(background_task("A", 2))
    task2 = asyncio.create_task(background_task("B", 1))
    task3 = asyncio.create_task(background_task("C", 3))
    print("tasks created — doing other work...")
    await asyncio.sleep(0.1)   # yield control so the tasks can start
    print("still doing other work...")
    # wait for all tasks to complete
    results = await asyncio.gather(task1, task2, task3)
    print(f"all done: {results}")

asyncio.run(main())

Expected output — tasks complete in order of their delay, not creation order:

tasks created — doing other work...
still doing other work...
task B completed ← delay=1, finishes first
task A completed ← delay=2, finishes second
task C completed ← delay=3, finishes last
all done: ['A', 'B', 'C'] ← results in creation order (gather preserves order)
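A natural companion to background tasks is bounding how long you will wait for one. `asyncio.wait_for` cancels the awaited coroutine and raises `asyncio.TimeoutError` when the deadline passes — a minimal sketch (`slow` is a made-up helper):

```python
import asyncio

async def slow(delay):
    await asyncio.sleep(delay)
    return "finished"

async def main():
    try:
        # deadline shorter than the task — wait_for cancels it and raises
        return await asyncio.wait_for(slow(0.5), timeout=0.1)
    except asyncio.TimeoutError:
        return "timed out"

print(asyncio.run(main()))   # → timed out
```

(Since Python 3.11, `asyncio.TimeoutError` is an alias for the built-in `TimeoutError`, so the `except` works on both older and newer versions.)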

The difference between gather and create_task:

asyncio.gather(coro1, coro2)       asyncio.create_task(coro1)
────────────────────────────       ──────────────────────────
starts coroutines and              starts coroutine immediately
waits for all to finish            returns a task object
blocks until done                  you decide when to await it
useful for known set of tasks      useful for background work

Async/await extends to context managers and iterators — useful for resources that require async operations to open or close, like database connections or streaming data:

import asyncio

# async context manager — for resources that need async setup/teardown
class AsyncDatabase:
    async def __aenter__(self):
        print("connecting to database...")
        await asyncio.sleep(0.1)   # async connection
        return self

    async def __aexit__(self, *args):
        print("closing connection...")
        await asyncio.sleep(0.1)   # async cleanup

    async def query(self, sql):
        await asyncio.sleep(0.1)   # async query
        return f"results for: {sql}"

# async generator — for streaming data
async def stream_data(n):
    for i in range(n):
        await asyncio.sleep(0.1)   # simulate streaming delay
        yield i ** 2

async def main():
    # async with — calls __aenter__ and __aexit__ with await
    async with AsyncDatabase() as db:
        result = await db.query("SELECT * FROM users")
        print(result)
    # async for — calls __anext__ with await on each iteration
    async for value in stream_data(5):
        print(value)   # 0, 1, 4, 9, 16

asyncio.run(main())

async with and async for are direct equivalents of with and for — the only difference is that their setup, teardown, and iteration steps are themselves awaitable.
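For simple cases, the `__aenter__`/`__aexit__` boilerplate can be replaced with `contextlib.asynccontextmanager`, which builds an async context manager from an async generator — the code before `yield` acts as setup and the code after as teardown. A sketch (`open_database` is a made-up name for illustration):

```python
import asyncio
from contextlib import asynccontextmanager

@asynccontextmanager
async def open_database():
    print("connecting...")
    await asyncio.sleep(0.01)        # async setup
    try:
        yield "connection"           # the value bound by `as`
    finally:
        print("closing...")
        await asyncio.sleep(0.01)    # async teardown — runs even on error

async def main():
    async with open_database() as conn:
        return f"using {conn}"

print(asyncio.run(main()))   # → using connection
```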


Real World Example — Concurrent HTTP Requests

The most common real-world use of async/await is fetching multiple URLs concurrently. The aiohttp library provides an async HTTP client that integrates naturally with asyncio:

import asyncio
import aiohttp   # pip install aiohttp

async def fetch_url(session, url):
    """Fetch a single URL using an existing session."""
    async with session.get(url) as response:
        return await response.json()

async def fetch_all(urls):
    """Fetch all URLs concurrently using a shared session."""
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_url(session, url) for url in urls]
        return await asyncio.gather(*tasks)   # all fetched concurrently

urls = [
    "https://api.example.com/user/1",
    "https://api.example.com/user/2",
    "https://api.example.com/user/3",
]
results = asyncio.run(fetch_all(urls))
without asyncio — sequential requests
──────────────────────────────────────
request 1 ──────────────────► response 1 (200ms)
request 2 ──────────────────► response 2 (200ms)
request 3 ──────────────────► response 3 (200ms)
total: ~600ms
with asyncio.gather — concurrent requests
──────────────────────────────────────────
request 1 ──────────────────► response 1 (200ms)
request 2 ──────────────────► response 2 (200ms) ← all in flight
request 3 ──────────────────► response 3 (200ms) simultaneously
total: ~200ms ✅

A shared ClientSession is used for all requests — creating a session per request would be wasteful as sessions manage connection pooling internally.
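With thousands of URLs you usually do not want them all in flight at once — servers and file-descriptor limits push back. A common pattern is to cap concurrency with `asyncio.Semaphore`. The sketch below simulates the network call with `asyncio.sleep` so it runs without aiohttp (`fetch_limited` and the URLs are made up for illustration):

```python
import asyncio

async def fetch_limited(sem, url):
    async with sem:                 # at most `limit` fetches at once
        await asyncio.sleep(0.01)   # stand-in for session.get(url)
        return f"data from {url}"

async def fetch_all(urls, limit=5):
    sem = asyncio.Semaphore(limit)
    return await asyncio.gather(*(fetch_limited(sem, u) for u in urls))

results = asyncio.run(fetch_all([f"site{i}.com" for i in range(20)]))
print(len(results))   # → 20
```

All 20 coroutines are created up front, but the semaphore ensures only 5 are past the `async with` at any moment; the rest wait at that line.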


Threading vs Async/Await — When to Use Which

Both handle I/O-bound work, but they do it differently:

                       Threading                     Async/Await
Execution model        multiple threads              single thread, event loop
Switching              OS preemptive                 cooperative at await
Overhead per task      ~8MB per thread               ~1KB per coroutine
Max concurrent tasks   hundreds                      thousands
Code style             normal functions              async def + await
Best for               existing blocking libraries   I/O-heavy, many connections

The memory difference is significant — a thread costs ~8MB of stack space while a coroutine costs ~1KB. This is why async/await scales to thousands of concurrent connections where threading would exhaust memory.
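The "best for existing blocking libraries" row is not an either/or choice. `asyncio.to_thread` (Python 3.9+) runs a blocking call in a worker thread and wraps it in an awaitable, so legacy blocking code can participate in an async program. A sketch (`blocking_read` is a made-up stand-in for a blocking library call):

```python
import asyncio
import time

def blocking_read():
    time.sleep(0.1)          # a blocking call that would stall the event loop
    return "blocking result"

async def main():
    # to_thread moves the blocking call to a worker thread, so it can
    # overlap with a native coroutine on the event loop
    blocking, quick = await asyncio.gather(
        asyncio.to_thread(blocking_read),
        asyncio.sleep(0.1, result="async result"),
    )
    return blocking, quick

print(asyncio.run(main()))   # → ('blocking result', 'async result')
```

Both operations take ~0.1s, and because they overlap, the whole `main` finishes in ~0.1s rather than ~0.2s.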