Tutorial

How to Parse US Addresses in Django

Turn a freeform address column into clean, structured model fields. Free API, a small requests service module, and a management command to backfill your existing rows.

sthan.io Team
June 27, 2026 · 11 min read

Your Django model has a single address field full of freeform strings: "1600 Pennsylvania Ave NW Apt 4B, Washington DC 20500". It is fine for display, but useless for anything structured - you can't filter(city="..."), sort by street, group by ZIP, or de-duplicate, because every part of the address is mashed into one column.

Address parsing fixes that. You send the raw string and get back discrete components: the primary number, the street name, the suffix, the directional, the unit, the city, the state, and the ZIP - each in its own field, ready to drop into its own model column.

This tutorial shows you how to parse US addresses in Django using sthan.io's address API. We wrap the requests library in a small service module, expose it through a view, and - the part Django makes easy - backfill an existing table with a management command.

Quick summary: Send your API key as a Bearer token, call GET /v2/address-parser/usa/{address}, and read the components from the Result object - addressNumber, streetName, streetPostType, unitType, unitNumber, city, stateCode, zipCode. The free tier gives you 100 lookups/month - no credit card required.

What you'll need: a Django 4+ project on Python 3.8+ and a free sthan.io account. No credit card, no approval queue. The free parser tier is 100 lookups/month; paid plans start at $8/month if you need more.

Try it first

You can try the parser in the live demo on sthan.io first - no signup required. Type a messy one-line address and the API hands back every component as a separate, standardized field.

That's what you're building.

What the API returns

Every response is wrapped in a standard envelope. For parsing, the Result field is a single object whose properties are the address components. This is a real response for 1600 Pennsylvania Ave NW Apt 4B:

{
  "Id": "2737f8f3-af83-4ba1-b9c3-e29d2ba9e03b",
  "Result": {
    "inputAddress": "1600 Pennsylvania Ave NW Apt 4B Washington DC 20500",
    "addressLine1": "1600 PENNSYLVANIA AVE NW",
    "addressLine2": null,
    "addressNumber": "1600",
    "streetPreDir": "",
    "streetName": "PENNSYLVANIA",
    "streetPostType": "AVE",
    "streetPostDir": "NW",
    "unitType": "apt",
    "unitNumber": "4b",
    "city": "WASHINGTON",
    "stateCode": "DC",
    "zipCode": "20500",
    "zip4": null,
    "county": null,
    "matchMode": "Speculative",
    "matchTier": "Near",
    "confidence": 0.7,
    "matchCode": {
      "houseNumber": "Matched",
      "street": "Matched",
      "unit": "Matched",
      "city": "Matched",
      "state": "Matched",
      "zipCode": "Matched",
      "zip4": "NotApplicable"
    },
    "footnotes": ["recovered: standardized via correction, not an exact match"]
  },
  "ClientSessionId": null,
  "StatusCode": 200,
  "IsError": false,
  "Errors": []
}

Casing matters in Python. The envelope keys (Id, Result, StatusCode, IsError, Errors) are PascalCase, while the component fields inside Result are camelCase (addressNumber, streetName, streetPostType). Read each key exactly as shown - a Python dict lookup is case-sensitive, so result["AddressNumber"] would raise KeyError.

The fields you'll use most often are covered in the component map below. Note that some fields are empty or null when they don't apply - there's no leading directional here, so streetPreDir is an empty string, and the parser didn't append a zip4 or county, so those are null.
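
To make the casing and null rules concrete, here is a minimal sketch that reads a trimmed copy of the response above as a plain dict. The envelope literal is abbreviated sample data, and read_component is a hypothetical helper introduced for illustration, not part of any sthan.io client.

```python
# Trimmed copy of the sample response; real responses carry more fields.
envelope = {
    "Result": {
        "addressNumber": "1600",
        "streetPreDir": "",
        "streetName": "PENNSYLVANIA",
        "zip4": None,
    },
    "StatusCode": 200,
    "IsError": False,
    "Errors": [],
}


def read_component(result, key):
    """Return a component as a string, treating null and missing as empty."""
    value = result.get(key)
    return "" if value is None else str(value)


result = envelope["Result"]                     # PascalCase envelope key
street = read_component(result, "streetName")   # camelCase component key
plus4 = read_component(result, "zip4")          # JSON null becomes ""

try:
    result["AddressNumber"]                     # wrong casing
    wrong_case_found = True
except KeyError:
    wrong_case_found = False
```

Normalizing null to an empty string at the edge keeps later code from having to check for None on every field.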

Get your API key

  1. Sign up at sthan.io and subscribe to the free Address Parser tier
  2. Open your dashboard and create an API key
  3. Copy the key - it looks like sthan_live_xxxxxxxxxxxxxxxx

You get the key immediately, with no approval queue. An API key is the simplest way to authenticate: you send it as a Bearer token on every request and there is no separate login step. (If you prefer a short-lived token, there is a JWT flow covered later.)

Configure the project

Install requests and keep your key out of source control by reading it from the environment in settings.py:

pip install requests

# Set the key in your shell (or a .env file loaded by python-dotenv)
export STHAN_API_KEY="sthan_live_xxxxxxxxxxxxxxxx"

Security tip: Never hard-code the key in settings.py. Read it from an environment variable in production and a .env file (loaded with python-dotenv) locally - and add .env to .gitignore.

# settings.py
import os

STHAN_BASE_URL = "https://api.sthan.io"
STHAN_API_KEY = os.environ["STHAN_API_KEY"]  # raises early if it isn't set
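
A bare os.environ lookup fails with a KeyError that names the variable but not the fix. If you would rather see a clearer message, one option is a small helper like the sketch below. The name require_env and its error text are assumptions made for illustration, not Django or sthan.io conventions.

```python
import os


def require_env(name, environ=os.environ):
    """Return a non-empty environment value or fail with a readable message."""
    value = environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"Set the {name} environment variable before starting Django"
        )
    return value


# In settings.py this would replace the direct lookup:
# STHAN_API_KEY = require_env("STHAN_API_KEY")
```

The environ parameter exists only so the helper can be exercised with a plain dict in tests.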

Build the parser service

Keep the API call in one small module so views, tasks, and management commands all share it. A single module-level requests.Session reuses the connection pool and attaches the auth header once. The service URL-encodes the address, calls the endpoint, unwraps the envelope, and returns the Result object:

# addresses/sthan.py
import requests
from urllib.parse import quote
from django.conf import settings

# One Session for the whole app: connection pooling + a default auth header
_session = requests.Session()
_session.headers.update({"Authorization": f"Bearer {settings.STHAN_API_KEY}"})


class ParserError(Exception):
    """Raised when the API reports a business error in the envelope."""


def parse_address(address, mode="speculative", timeout=10.0):
    encoded = quote(address.strip(), safe="")
    url = f"{settings.STHAN_BASE_URL}/v2/address-parser/usa/{encoded}"

    response = _session.get(url, params={"match": mode}, timeout=timeout)
    response.raise_for_status()  # surfaces 4xx/5xx as an exception

    envelope = response.json()
    if envelope.get("IsError"):
        # Tolerate a null Errors value and non-string entries
        raise ParserError(", ".join(str(e) for e in envelope.get("Errors") or []))

    return envelope["Result"]
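
The safe="" argument in the service is easy to overlook, so here is a small standalone sketch of what it does. Because the address travels as a URL path segment, a slash inside it (as in "12 1/2 Main St", a sample input chosen for illustration) has to be percent-encoded. encode_address is a hypothetical wrapper around the same quote call.

```python
from urllib.parse import quote


def encode_address(address):
    """Percent-encode an address so it is safe as a single URL path segment."""
    return quote(address.strip(), safe="")


with_comma = encode_address(" 1600 Pennsylvania Ave NW Apt 4B, Washington DC 20500 ")
with_slash = encode_address("12 1/2 Main St")

# For comparison: quote() keeps "/" unescaped by default (safe="/"),
# which would split the address across two path segments.
default_safe = quote("12 1/2 Main St")
```

With safe="", spaces become %20, the comma becomes %2C, and the slash becomes %2F, so the whole address stays inside one path segment.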

Expose it through a thin view. Parse on the server - the API does not enable CORS for browser requests, and your key must never reach client-side JavaScript. Django's CSRF middleware protects the form, and the key stays in settings:

# addresses/views.py
import requests
from django.http import JsonResponse
from django.views.decorators.http import require_POST
from .sthan import parse_address, ParserError


@require_POST
def parse_view(request):
    raw = request.POST.get("address", "").strip()
    if not raw:
        return JsonResponse({"ok": False, "reason": "empty"}, status=400)

    try:
        result = parse_address(raw)
    except ParserError:
        return JsonResponse({"ok": False, "reason": "parse_failed"}, status=502)
    except requests.RequestException:
        # Timeout, connection error, or a 4xx/5xx raised by raise_for_status()
        return JsonResponse({"ok": False, "reason": "upstream_error"}, status=502)

    return JsonResponse({
        "ok": True,
        "addressNumber": result["addressNumber"],
        "streetName": result["streetName"],
        "unit": {"type": result["unitType"], "number": result["unitNumber"]},
        "city": result["city"],
        "state": result["stateCode"],
        "zip": result["zipCode"],
    })

# urls.py
from django.urls import path
from addresses import views

urlpatterns = [
    path("addresses/parse", views.parse_view, name="parse_address"),
]

Choose a match mode

The match parameter controls how much typo tolerance the parser applies while standardizing components. The same call supports four modes, from strictest to loosest:

| Mode | Behavior | Use when |
|---|---|---|
| strict | Only confident, exact-component matches; returns little or nothing when the input is ambiguous. | You only want components you can fully trust. |
| balanced | Exact plus typo-corrected components. Returns the best parse, flagging corrected fields. | Typical cleanup of user-entered addresses. |
| fuzzy | Wider recovery for messy or partial input. Higher recall, more corrections. | Backfilling a column of inconsistent legacy data. |
| speculative | Loosest recovery, with extra tolerance for heavily misspelled street names. Best-effort parses are flagged matchTier = "Speculative". | Maximum recovery / agent tooling. This is the default. |

If you omit match, the endpoint defaults to speculative for the widest recovery. Whichever mode you pick, the location-defining parts of the address - the primary number, ordinal, directional, state, and the street's core name - are never substituted. A looser mode only widens tolerance for misspellings of the same street.
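
Since the mode is just a string, a typo such as "fuzy" would only surface once the request is made. A client-side guard is one way to catch it earlier. The sketch below is an assumption-laden convenience: normalize_mode is a hypothetical helper, and how the API itself treats unknown or differently cased values is not covered here.

```python
MATCH_MODES = ("strict", "balanced", "fuzzy", "speculative")  # strictest to loosest


def normalize_mode(mode=None):
    """Return a lowercase match mode, or raise ValueError for an unknown one."""
    if mode is None:
        return "speculative"  # mirrors the endpoint's documented default
    cleaned = str(mode).strip().lower()
    if cleaned not in MATCH_MODES:
        raise ValueError(
            f"Unknown match mode {mode!r}; expected one of {MATCH_MODES}"
        )
    return cleaned
```

Calling parse_address(raw, mode=normalize_mode(user_choice)) would then fail fast, before any lookup is spent.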

Map the components to model fields

Each component comes back as its own field, so mapping them to model columns is a direct copy. The fields you'll use most:

| Field | Meaning | Example |
|---|---|---|
| addressNumber | Primary (house/building) number | 1600 |
| streetPreDir | Leading directional | N in "N Main St" |
| streetName | Core street name | PENNSYLVANIA |
| streetPostType | Street suffix / type | AVE, ST, BLVD |
| streetPostDir | Trailing directional | NW |
| unitType / unitNumber | Secondary unit designator and value | apt / 4b |
| city, stateCode, zipCode, zip4 | City, two-letter state, 5-digit ZIP, +4 | WASHINGTON, DC, 20500 |

A model with one column per component lines up directly with the response. (A few extra fields - streetPreType, streetPreMod, streetPreSep, streetPostMod - capture modifiers in unusual street names like "Avenue of the Americas" and are empty for ordinary addresses.)

# addresses/models.py
from django.db import models


class ParsedAddress(models.Model):
    raw = models.CharField(max_length=255)
    address_number = models.CharField(max_length=16, blank=True)
    street_pre_dir = models.CharField(max_length=4, blank=True)
    street_name = models.CharField(max_length=128, blank=True)
    street_type = models.CharField(max_length=16, blank=True)
    street_post_dir = models.CharField(max_length=4, blank=True)
    unit_type = models.CharField(max_length=16, blank=True)
    unit_number = models.CharField(max_length=16, blank=True)
    city = models.CharField(max_length=128, blank=True)
    state = models.CharField(max_length=2, blank=True)
    zip_code = models.CharField(max_length=5, blank=True)
    zip4 = models.CharField(max_length=4, blank=True)
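
If you copy components in more than one place (a view, a task, a command), it can help to keep the name translation in a single table. The following is a sketch under that assumption: FIELD_MAP and to_model_fields are hypothetical names, and null components are stored as empty strings to suit the blank=True CharFields above.

```python
# API component name -> model field name
# (street_type is the model's name for streetPostType)
FIELD_MAP = {
    "addressNumber": "address_number",
    "streetPreDir": "street_pre_dir",
    "streetName": "street_name",
    "streetPostType": "street_type",
    "streetPostDir": "street_post_dir",
    "unitType": "unit_type",
    "unitNumber": "unit_number",
    "city": "city",
    "stateCode": "state",
    "zipCode": "zip_code",
    "zip4": "zip4",
}


def to_model_fields(result):
    """Translate a Result dict into keyword arguments for ParsedAddress."""
    return {field: (result.get(key) or "") for key, field in FIELD_MAP.items()}
```

A new row could then be created with ParsedAddress.objects.create(raw=raw, **to_model_fields(result)). The helper does not truncate values, so a component longer than a column's max_length would still need handling.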

The natural Django pattern for an existing table is a one-off management command. It walks the rows that haven't been parsed yet, calls the service for each, copies the components across, and saves - and it handles a rate-limit hiccup without dropping a row. Pair the parse with matchCode if you want to record which fields were trusted as-is (Matched) versus Corrected, Inferred, Unmatched, or NotApplicable.

# addresses/management/commands/backfill_addresses.py
import time
import requests
from django.core.management.base import BaseCommand
from addresses.models import ParsedAddress
from addresses.sthan import parse_address, ParserError


class Command(BaseCommand):
    help = "Parse the raw address on each row into structured columns."

    def handle(self, *args, **options):
        for row in ParsedAddress.objects.filter(street_name=""):
            try:
                result = parse_address(row.raw)
            except requests.HTTPError as exc:
                if exc.response is not None and exc.response.status_code == 429:
                    time.sleep(2)          # rate limited - wait and skip for now
                    continue
                raise
            except ParserError:
                continue                   # unparseable input - leave it for review

            # Any component can be null when it doesn't apply; CharFields need ""
            row.address_number = result["addressNumber"] or ""
            row.street_pre_dir = result["streetPreDir"] or ""
            row.street_name = result["streetName"] or ""
            row.street_type = result["streetPostType"] or ""
            row.street_post_dir = result["streetPostDir"] or ""
            row.unit_type = result["unitType"] or ""
            row.unit_number = result["unitNumber"] or ""
            row.city = result["city"] or ""
            row.state = result["stateCode"] or ""
            row.zip_code = result["zipCode"] or ""
            row.zip4 = result["zip4"] or ""
            row.save()
            self.stdout.write(f"Parsed: {row.raw}")

Run it with python manage.py backfill_addresses. Because the components come back as discrete, standardized fields, the whole table becomes queryable - ParsedAddress.objects.filter(city="WASHINGTON") now works.
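
To act on matchCode during a backfill, one approach is to list the parts that were not trusted as-is and flag those rows for review. The sketch below assumes the status names given earlier (Matched, Corrected, Inferred, Unmatched, NotApplicable). fields_needing_review is a hypothetical helper, and where you store its output is up to you.

```python
TRUSTED = {"Matched", "NotApplicable"}


def fields_needing_review(result):
    """List matchCode entries that were corrected, inferred, or unmatched."""
    match_code = result.get("matchCode") or {}
    return sorted(
        name for name, status in match_code.items() if status not in TRUSTED
    )
```

For the sample response shown earlier, every entry is Matched or NotApplicable, so the list is empty. A response with a corrected street would yield ["street"].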

Alternative: JWT authentication

An API key is the simplest option and is all most apps need. If your security policy prefers short-lived credentials, the platform also supports a 2-step JWT flow. You call GET /Auth/Token once with your profileName and profilePassword headers, receive a token valid for up to 60 minutes, then send that token as the Bearer value on subsequent calls:

# addresses/sthan.py (token variant)
import os


def get_token():
    response = _session.get(
        f"{settings.STHAN_BASE_URL}/Auth/Token",
        headers={
            "profileName": os.environ["STHAN_PROFILE_NAME"],
            "profilePassword": os.environ["STHAN_PROFILE_PASSWORD"],
        },
        timeout=10,
    )
    response.raise_for_status()
    return response.json()["Result"]["access_token"]


# Swap the static key for a token. Note that this fetches once, at import time;
# a long-running process must fetch a new token before the old one expires.
_session.headers.update({"Authorization": f"Bearer {get_token()}"})

Everything else - the endpoint, the envelope, the parsing - stays exactly the same. Cache the token and refresh it shortly before the 60-minute expiry rather than fetching one per request.
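
One way to cache and refresh is a small wrapper that remembers when the token was fetched. The sketch below is illustrative only: TokenCache is a hypothetical class, it assumes the full 60-minute lifetime (the source says "up to" 60 minutes, so a shorter ttl_seconds may be safer), and it is not thread-safe.

```python
import time


class TokenCache:
    """Cache a bearer token and refresh it shortly before it expires."""

    def __init__(self, fetch, ttl_seconds=60 * 60, margin_seconds=5 * 60,
                 clock=time.monotonic):
        self._fetch = fetch            # callable returning a fresh token string
        self._ttl = ttl_seconds        # assumed lifetime of a token
        self._margin = margin_seconds  # refresh this long before expiry
        self._clock = clock            # injectable so tests can control time
        self._token = None
        self._expires_at = 0.0

    def get(self):
        """Return the cached token, fetching a new one when it is near expiry."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

With tokens = TokenCache(get_token), each request would pass headers={"Authorization": f"Bearer {tokens.get()}"} instead of relying on a header set once at import time.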

Handle errors

Two status codes are worth handling explicitly so a hiccup never stalls a backfill:

  • 401 - The key or token was rejected. Check the value and, on the JWT flow, refresh and retry once.
  • 429 - Rate limit reached. Back off and retry rather than dropping the row. (The backfill command above takes the simpler route: it waits, skips the row, and leaves it unparsed for the next run.)

import time

import requests

from addresses.sthan import parse_address


def parse_with_retry(address, mode="speculative", retries=2):
    for attempt in range(retries + 1):
        try:
            return parse_address(address, mode)
        except requests.HTTPError as exc:
            if exc.response is not None and exc.response.status_code == 429 \
                    and attempt < retries:
                time.sleep(2 ** attempt)  # 1s, then 2s
                continue
            raise  # not a 429, or out of retries
    return None

The exponential back-off (1s, then 2s) is enough for transient limits. For a large backfill, add a small delay between rows and a circuit breaker so one bad minute doesn't stall the whole queue.
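
A circuit breaker for this purpose can be very small. The sketch below is one possible shape, not a prescribed design: CircuitBreaker, its thresholds, and the cooldown are all assumptions chosen for illustration.

```python
import time


class CircuitBreaker:
    """Pause calls after repeated failures, then allow a retry after a cooldown."""

    def __init__(self, max_failures=5, cooldown=60.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.cooldown = cooldown          # seconds to stay open
        self._clock = clock               # injectable so tests can control time
        self._failures = 0
        self._opened_at = None

    def allow(self):
        """Return True if a call may proceed right now."""
        if self._opened_at is None:
            return True
        if self._clock() - self._opened_at >= self.cooldown:
            self._opened_at = None        # cooldown over: close and try again
            self._failures = 0
            return True
        return False

    def record(self, ok):
        """Record the outcome of one call."""
        if ok:
            self._failures = 0
            return
        self._failures += 1
        if self._failures >= self.max_failures:
            self._opened_at = self._clock()
```

In the backfill loop, you would check breaker.allow() before each row, call breaker.record(...) after it, and add a short time.sleep between rows to stay under the rate limit.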

What's next: confirm the parsed address is deliverable

Parsing gives you clean, structured components fast. It does not, on its own, confirm that mail or a package will actually arrive - a well-formed address can still point at a unit that no longer accepts delivery. When deliverability matters, run the address through the Address Verification API, which returns a deliverability status and appends ZIP+4 and county. It's the same envelope and the same requests service pattern - one GET against /v2/address-verification/usa/{address}. The Django walkthrough is here: Verify US Addresses in Django.

If you want users to enter a clean address in the first place, Address Autocomplete suggests complete addresses as they type. For the parser in plain Python without Django, see Parse US Addresses in Python.

Frequently Asked Questions

How do I parse US addresses in Django?
Send your sthan.io API key as a Bearer token and call GET /v2/address-parser/usa/{address} with the requests library from a small service module. Read the structured components from the Result object - addressNumber, streetName, streetPostType, unitType, unitNumber, city, stateCode, and zipCode - and copy them into your model fields. The full working example is in the sections above.

Is the address parser free?
The free tier includes 100 lookups per month with no credit card required. Paid plans start at $8/month. There is no trial period; the free tier is permanent. See pricing for higher-volume plans.

What components does the parser return?
The parser breaks a freeform address into discrete fields: addressNumber, streetPreDir and streetPostDir (leading and trailing directionals), streetName, streetPostType (the suffix like Ave or St), unitType and unitNumber, city, stateCode, zipCode, and zip4. Each field is returned separately so you can store it in its own model column.

What is the difference between parsing and verifying an address?
Parsing splits a freeform address into structured components and standardizes their format. Verification goes a step further and confirms the address is real and deliverable, returning a deliverability status. Parse when you need clean, column-ready fields; verify when you need to know whether mail or a package will actually arrive. See Verify US Addresses in Django.

Can I backfill an existing table of addresses?
Yes - a one-off management command is the natural fit. Iterate the rows that still have a raw, unparsed address, call the parser for each, copy the components into the model fields, and save. The backfill command above includes rate-limit handling so a single 429 never drops a row.

Written by sthan.io Team

The sthan.io engineering team builds and maintains address verification, parsing, geocoding, and autocomplete APIs. With deep expertise in postal addressing standards and spatial data systems, we help businesses improve address data quality and reduce failed deliveries.
