Getting Started with the World Bank API

Source on EconIndx: World Bank Open Data — free, CC BY 4.0, 1,600+ indicators across 260+ economies.

Access & Pricing

Completely free, with no API key or registration required: the v2 API does not use keys. No contracts, no enterprise tier. The CC BY 4.0 license permits commercial use with attribution.

Your First Data Pull

No authentication needed. Pull your first indicator in under a minute:

import requests
import pandas as pd

WB_BASE = "https://api.worldbank.org/v2"

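def fetch_indicator(indicator: str, per_page: int = 5000) -> pd.DataFrame:
    """Fetch an indicator for all countries, most recent 50 values each."""
    url = f"{WB_BASE}/country/all/indicator/{indicator}"
    params = {
        "format": "json",
        "per_page": per_page,
        "mrv": 50,   # most recent 50 values per country
        "page": 1,
    }
    columns = ["country_code", "country_name", "indicator", "year", "value"]
    rows = []
    while True:
        r = requests.get(url, params=params, timeout=30)
        r.raise_for_status()
        payload = r.json()
        # Errors come back as a one-element list, so there is no data page
        if len(payload) < 2 or not payload[1]:
            break
        meta, data = payload
        for d in data:
            if d["value"] is not None:  # null observations are dropped here
                rows.append({
                    "country_code": d["countryiso3code"],
                    "country_name": d["country"]["value"],
                    "indicator": indicator,
                    "year": int(d["date"]),
                    "value": d["value"],
                })
        # A full pull exceeds one page, so walk every page in the metadata
        if params["page"] >= int(meta["pages"]):
            break
        params["page"] += 1
    return pd.DataFrame(rows, columns=columns)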

# Pull GDP (current USD) for all countries
gdp = fetch_indicator("NY.GDP.MKTP.CD")
print(f"Rows: {len(gdp)}")
print(f"Countries: {gdp['country_code'].nunique()}")
print(f"Year range: {gdp['year'].min()} – {gdp['year'].max()}")

First Pull: What to Expect

Indicator	Description	Rows (all countries, 50yr)	Country coverage
`NY.GDP.MKTP.CD`	GDP, current USD	~9,000–11,000	~215 (sparse early)
`SP.POP.TOTL`	Population	~12,000	260+
`FP.CPI.TOTL.ZG`	Inflation, CPI %	~7,500	~180
`SL.UEM.TOTL.ZS`	Unemployment, %	~6,000	~150
`NY.GDP.PCAP.CD`	GDP per capita, USD	~10,000	~215

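One full indicator pull (all countries, 50 years) takes 2–5 seconds. Paginate with ?page=N whenever the total in the metadata exceeds your per_page setting; at per_page=5000, an all-country, 50-year pull spans more than one page, so a single request returns only part of the data.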

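Watch for aggregates: The response includes regional and income groups (WLD, ECS, EAP, etc.) alongside country rows. Aggregate codes are also three characters long, so code length alone does not identify them. The indicator response carries no region field; the country list at /v2/country does, and it marks aggregates with region.id == "NA", so filter or flag rows against that list.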
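One way to separate aggregates from countries is to build the set of aggregate codes once from the country list and flag rows against it. The sketch below is illustrative, not an official recipe: the helper names (aggregate_codes, flag_aggregates) are made up for this example, and it assumes each /v2/country entry exposes id and region.id. The entries in sample_countries are hand-written stand-ins for the real response so the example runs offline.

```python
import pandas as pd

def aggregate_codes(countries: list[dict]) -> set[str]:
    """Return the ids of entries whose region.id is "NA" (aggregates)."""
    return {c["id"] for c in countries if c.get("region", {}).get("id") == "NA"}

def flag_aggregates(df: pd.DataFrame, agg_codes: set[str]) -> pd.DataFrame:
    """Add an is_aggregate column instead of dropping aggregate rows."""
    out = df.copy()
    out["is_aggregate"] = out["country_code"].isin(agg_codes)
    return out

# Hand-written entries shaped like /v2/country rows (assumed shape)
sample_countries = [
    {"id": "WLD", "region": {"id": "NA", "value": "Aggregates"}},
    {"id": "BRA", "region": {"id": "LCN", "value": "Latin America & Caribbean"}},
]
sample_df = pd.DataFrame({"country_code": ["WLD", "BRA"], "value": [1.0, 2.0]})

flagged = flag_aggregates(sample_df, aggregate_codes(sample_countries))
countries_only = flagged[~flagged["is_aggregate"]]
print(flagged)
```

In a real run, countries would be the second element of the JSON returned by /v2/country?format=json, requested with a per_page large enough to cover the full list of a few hundred entries.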

Key Indicators to Start With

Output & income:

NY.GDP.MKTP.CD — GDP, current USD
NY.GDP.MKTP.KD.ZG — GDP growth rate, annual %
NY.GDP.PCAP.PP.CD — GDP per capita, PPP (best for cross-country comparison)

People:

SP.POP.TOTL — total population
SP.DYN.LE00.IN — life expectancy at birth
SE.ADT.LITR.ZS — adult literacy rate

Economy:

FP.CPI.TOTL.ZG — inflation, CPI %
SL.UEM.TOTL.ZS — unemployment, total %
BX.KLT.DINV.WD.GD.ZS — FDI net inflows, % of GDP

Trade & finance:

NE.EXP.GNFS.ZS — exports of goods & services, % of GDP
GC.DOD.TOTL.GD.ZS — central government debt, % of GDP

Data Tolerance & Validation

What’s normal:

Null rates are high for low-income countries and early years. For NY.GDP.MKTP.CD, expect ~20–30% nulls across all country-years (many early years are missing). This is not a bug.
Regional and income aggregates (WLD, LIC, MIC, etc.) are present in every response. Store an is_aggregate flag rather than filtering them out: they are useful for benchmarking.
Annual cadence: most indicators update once a year in April/May (World Development Indicators release). Don’t poll more than monthly.
Country coverage varies by indicator — some have 260+, others only 100–150 countries.

⚠️ Revision note: World Bank indicators are updated annually, typically in April–May. Always check the lastupdated field in the metadata envelope — a stale date means your pipeline ran before the annual refresh, not that data is missing.
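The lastupdated check can be automated with a small helper that reads the metadata envelope (the first element of the JSON response). This is a minimal sketch under stated assumptions: the function name refresh_is_stale and the 400-day threshold (one annual cycle plus slack) are choices made for this example, and the date is assumed to arrive as a YYYY-MM-DD string. The sample_meta values are invented for illustration.

```python
from datetime import date, datetime

def refresh_is_stale(meta: dict, today: date, max_age_days: int = 400) -> bool:
    """True when the envelope's lastupdated date is older than max_age_days."""
    raw = meta.get("lastupdated")
    if not raw:
        return True  # no date in the envelope: treat as stale and investigate
    last = datetime.strptime(raw, "%Y-%m-%d").date()
    return (today - last).days > max_age_days

# Invented envelope with the assumed field layout
sample_meta = {"page": 1, "pages": 3, "per_page": 5000, "total": 13300,
               "lastupdated": "2025-04-15"}

print(refresh_is_stale(sample_meta, today=date(2025, 6, 1)))  # False: 47 days old
print(refresh_is_stale(sample_meta, today=date(2026, 9, 1)))  # True
```

Passing today explicitly keeps the check deterministic in tests; a pipeline would pass date.today().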

Validation checks:

from datetime import date

def validate_wb_pull(df: pd.DataFrame, indicator: str) -> dict:
    if df.empty:
        # An empty pull has nothing to inspect, so flag it outright
        return {"indicator": indicator, "row_count": 0, "country_count": 0,
                "null_rate": 1.0, "latest_year": None, "stale_alert": True}

    latest_year = int(df["year"].max())
    # Compare against the current year instead of a hard-coded one
    years_stale = date.today().year - latest_year

    return {
        "indicator": indicator,
        "row_count": len(df),
        "country_count": int(df["country_code"].nunique()),
        "null_rate": round(float(df["value"].isna().mean()), 4),
        "latest_year": latest_year,
        "stale_alert": years_stale > 2,
    }

report = validate_wb_pull(gdp, "NY.GDP.MKTP.CD")
print(report)
# Expected: row_count ~10000, country_count ~250
# null_rate is 0.0 here because fetch_indicator drops null values;
# keep those rows in the fetch step if you want to measure the real null rate

Alert thresholds:

Country count below 180 for a mainstream indicator: check the API or indicator status
latest_year more than 2 years behind the current year: data may be deprecated
Null rate above 50% for a core macro indicator: investigate — may be a parsing issue
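The thresholds above can be turned into a small checker that takes a validation report and returns the alerts that fired. This is a sketch only: check_alerts and its core_macro switch are hypothetical names, the report keys mirror the ones produced by validate_wb_pull, and sample_report holds made-up numbers.

```python
def check_alerts(report: dict, current_year: int, core_macro: bool = True) -> list[str]:
    """Apply the three alert thresholds to a validation report."""
    alerts = []
    if report["country_count"] < 180:
        alerts.append("country_count below 180: check the API or indicator status")
    latest = report.get("latest_year")
    if latest is None or current_year - latest > 2:
        alerts.append("latest_year more than 2 years behind: data may be deprecated")
    if core_macro and report["null_rate"] > 0.5:
        alerts.append("null_rate above 50%: investigate a possible parsing issue")
    return alerts

# Made-up report for illustration
sample_report = {"indicator": "NY.GDP.MKTP.CD", "row_count": 10000,
                 "country_count": 150, "null_rate": 0.12, "latest_year": 2021}

for msg in check_alerts(sample_report, current_year=2025):
    print(msg)
```

Returning a list rather than raising lets a scheduler decide whether one alert is enough to fail the run.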

Loading Multiple Indicators Efficiently

import time

indicators = {
    "NY.GDP.MKTP.CD": "gdp_current_usd",
    "NY.GDP.MKTP.KD.ZG": "gdp_growth_pct",
    "SP.POP.TOTL": "population",
    "FP.CPI.TOTL.ZG": "inflation_cpi_pct",
    "SL.UEM.TOTL.ZS": "unemployment_pct",
}

all_frames = []
for code, label in indicators.items():
    df = fetch_indicator(code)
    df["indicator_label"] = label
    all_frames.append(df)
    time.sleep(1)  # polite — no published rate limit but respect the API

master = pd.concat(all_frames, ignore_index=True)
print(f"Total rows: {len(master)}")
# Expected: 45,000–60,000 rows for 5 indicators
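The combined frame is in long form, with indicators stacked on top of each other. For analysis it is often handier in wide form, one row per country-year and one column per indicator. The sketch below shows one way to reshape it with pandas; sample_master is a toy frame with placeholder values and the same columns as master, and aggfunc="first" is an assumption that each country-year-indicator combination appears once.

```python
import pandas as pd

# Toy stand-in for master: placeholder values, not real observations
sample_master = pd.DataFrame({
    "country_code": ["BRA", "BRA", "BRA", "IND"],
    "year": [2022, 2022, 2023, 2022],
    "indicator_label": ["gdp_current_usd", "population", "population", "population"],
    "value": [1.0, 2.0, 3.0, 4.0],
})

wide = (
    sample_master
    .pivot_table(index=["country_code", "year"], columns="indicator_label",
                 values="value", aggfunc="first")
    .reset_index()
)
wide.columns.name = None  # drop the leftover "indicator_label" axis label
print(wide)
```

Country-years missing an indicator show up as NaN in the wide frame, which makes coverage gaps easy to spot.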

Schema Stability

The Indicators API schema has been stable for years. Indicator codes occasionally retire — check the indicator metadata endpoint (/v2/indicator/{code}) for "sourceNote" and status. Country codes follow ISO 3166 alpha-3 with World Bank extensions (Kosovo = XKX, Channel Islands = CHI). Map these to your geography dimension on first load; the mapping rarely changes.

📌 Note: Some World Bank codes are not official ISO 3166-1 alpha-3 assignments (e.g., Kosovo = XKX, Channel Islands = CHI). Store a geography mapping table in your data lake rather than assuming every code is ISO 3166-1 alpha-3.
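A geography mapping table can start as a de-duplicated list of the codes seen in a pull, with a flag for the World Bank extensions. The sketch below is a starting point under assumptions: build_geo_mapping and the is_wb_extension column are hypothetical names, and WB_EXTENSION_CODES lists only the two codes named above, so it is not a complete list.

```python
import pandas as pd

# Only the extensions named in this guide; add others as you encounter them
WB_EXTENSION_CODES = {"XKX": "Kosovo", "CHI": "Channel Islands"}

def build_geo_mapping(df: pd.DataFrame) -> pd.DataFrame:
    """One row per code seen in a pull, flagged if it is a known extension."""
    geo = (
        df[["country_code", "country_name"]]
        .drop_duplicates(subset="country_code")
        .sort_values("country_code")
        .reset_index(drop=True)
    )
    geo["is_wb_extension"] = geo["country_code"].isin(set(WB_EXTENSION_CODES))
    return geo

# Toy pull with the same columns fetch_indicator returns
sample_pull = pd.DataFrame({
    "country_code": ["XKX", "BRA", "BRA"],
    "country_name": ["Kosovo", "Brazil", "Brazil"],
    "year": [2023, 2022, 2023],
    "value": [1.0, 2.0, 3.0],
})

geo = build_geo_mapping(sample_pull)
print(geo)
```

Persisting this table on first load gives later pulls something to diff against, so a new or retired code is noticed rather than silently joined away.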
