Source on EconIndx: World Bank Open Data — free, CC BY 4.0, 1,600+ indicators across 260+ economies.
Access & Pricing
Completely free, no API key required for standard use. Optionally register at data.worldbank.org for higher rate limits. No contracts, no enterprise tier. The CC BY 4.0 license permits commercial use with attribution.
Your First Data Pull
No authentication needed. Pull your first indicator in under a minute:
import requests
import pandas as pd

WB_BASE = "https://api.worldbank.org/v2"


def fetch_indicator(indicator: str, per_page: int = 5000) -> pd.DataFrame:
    """Fetch an indicator for all countries, most recent 50 years."""
    url = f"{WB_BASE}/country/all/indicator/{indicator}"
    params = {
        "format": "json",
        "per_page": per_page,
        "mrv": 50,  # most recent 50 values per country
        "page": 1,
    }
    rows = []
    while True:
        r = requests.get(url, params=params, timeout=30)
        r.raise_for_status()
        payload = r.json()
        # Error responses are a one-element list, so check before unpacking.
        if len(payload) < 2 or not payload[1]:
            break
        meta, data = payload
        for d in data:
            if d["value"] is not None:  # null observations are dropped here
                rows.append({
                    "country_code": d["countryiso3code"],
                    "country_name": d["country"]["value"],
                    "indicator": indicator,
                    "year": int(d["date"]),
                    "value": d["value"],
                })
        # A 50-year pull for all countries can exceed one page of 5000 records.
        if params["page"] >= int(meta["pages"]):
            break
        params["page"] += 1
    return pd.DataFrame(rows)


# Pull GDP (current USD) for all countries
gdp = fetch_indicator("NY.GDP.MKTP.CD")
print(f"Rows: {len(gdp)}")
print(f"Countries: {gdp['country_code'].nunique()}")
print(f"Year range: {gdp['year'].min()} – {gdp['year'].max()}")
First Pull: What to Expect
| Indicator | Description | Rows (all countries, 50yr) | Country coverage |
|---|---|---|---|
| NY.GDP.MKTP.CD | GDP, current USD | ~9,000–11,000 | ~215 (sparse early) |
| SP.POP.TOTL | Population | ~12,000 | 260+ |
| FP.CPI.TOTL.ZG | Inflation, CPI % | ~7,500 | ~180 |
| SL.UEM.TOTL.ZS | Unemployment, % | ~6,000 | ~150 |
| NY.GDP.PCAP.CD | GDP per capita, USD | ~10,000 | ~215 |
One full indicator pull (all countries, 50 years) takes 2–5 seconds. Paginate using ?page=N if the total in the metadata exceeds your per_page setting.
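Watch for aggregates: The response includes regional and income groups (WLD, ECS, EAP, etc.) alongside country rows. Aggregates also use three-character codes, so code length alone cannot separate them from countries. Instead, look each code up in the country metadata endpoint (/v2/country), where aggregates carry region.id == "NA", and keep only the rows whose region.id != "NA".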
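The region.id check can be sketched as follows. This is a minimal illustration on hand-written sample data, assuming country metadata entries shaped like the /v2/country response (an "id" field holding the three-letter code and a nested "region" object). The helper names aggregate_codes and flag_aggregates are hypothetical, and the field names are worth confirming against a live response before relying on them.

```python
def aggregate_codes(country_meta):
    """Return the codes of entries whose region id is "NA" (aggregates)."""
    return {
        entry["id"]
        for entry in country_meta
        if entry.get("region", {}).get("id") == "NA"
    }


def flag_aggregates(rows, agg_codes):
    """Copy each row and add an is_aggregate flag instead of dropping it."""
    return [
        {**row, "is_aggregate": row["country_code"] in agg_codes}
        for row in rows
    ]


# Hand-written sample in the assumed shape, not real API output.
sample_meta = [
    {"id": "BRA", "name": "Brazil", "region": {"id": "LCN"}},
    {"id": "WLD", "name": "World", "region": {"id": "NA"}},
]
sample_rows = [
    {"country_code": "BRA", "year": 2020, "value": 1.0},
    {"country_code": "WLD", "year": 2020, "value": 2.0},
]

agg = aggregate_codes(sample_meta)
flagged = flag_aggregates(sample_rows, agg)
countries_only = [r for r in flagged if not r["is_aggregate"]]
print(sorted(agg), [r["country_code"] for r in countries_only])
```

Flagging rather than deleting keeps the aggregate rows available for benchmarking while still allowing a countries-only view.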
Key Indicators to Start With
Output & income:
- NY.GDP.MKTP.CD: GDP, current USD
- NY.GDP.MKTP.KD.ZG: GDP growth rate, annual %
- NY.GDP.PCAP.PP.CD: GDP per capita, PPP (best for cross-country comparison)
People:
- SP.POP.TOTL: total population
- SP.DYN.LE00.IN: life expectancy at birth
- SE.ADT.LITR.ZS: adult literacy rate
Economy:
- FP.CPI.TOTL.ZG: inflation, CPI %
- SL.UEM.TOTL.ZS: unemployment, total %
- BX.KLT.DINV.WD.GD.ZS: FDI net inflows, % of GDP
Trade & finance:
- NE.EXP.GNFS.ZS: exports of goods & services, % of GDP
- GC.DOD.TOTL.GD.ZS: central government debt, % of GDP
Data Tolerance & Validation
What’s normal:
- Null rates are high for low-income countries and early years. For NY.GDP.MKTP.CD, expect ~20–30% nulls across all country-years (many early years are missing). This is not a bug.
- Regional aggregates (WLD, LIC, MIC, etc.) are present in every response. Store an is_aggregate flag rather than filtering them out; they’re useful for benchmarking.
- Annual cadence: most indicators update once a year in April/May (World Development Indicators release). Don’t poll more than monthly.
- Country coverage varies by indicator — some have 260+, others only 100–150 countries.
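Because fetch_indicator drops null observations as it builds its rows, a null rate has to be measured on the raw API records instead. The sketch below assumes records shaped like the indicator response used above ("date" as a year string, "value" possibly None). The function names and the sample records are illustrative, not real data.

```python
from collections import defaultdict


def overall_null_rate(records):
    """Share of raw records whose value is None (1.0 if there are none)."""
    if not records:
        return 1.0
    return sum(rec["value"] is None for rec in records) / len(records)


def null_rates_by_year(records):
    """Null share per year, which shows whether gaps cluster in early years."""
    totals = defaultdict(int)
    nulls = defaultdict(int)
    for rec in records:
        year = int(rec["date"])
        totals[year] += 1
        if rec["value"] is None:
            nulls[year] += 1
    return {year: nulls[year] / totals[year] for year in sorted(totals)}


# Hand-written sample records, not real API output.
sample = [
    {"countryiso3code": "AAA", "date": "1975", "value": None},
    {"countryiso3code": "BBB", "date": "1975", "value": 10.0},
    {"countryiso3code": "AAA", "date": "2020", "value": 12.0},
    {"countryiso3code": "BBB", "date": "2020", "value": 15.0},
]

print(overall_null_rate(sample))   # 0.25
print(null_rates_by_year(sample))  # {1975: 0.5, 2020: 0.0}
```

A per-year breakdown makes it easier to tell expected early-year sparsity from a gap in recent data.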
Validation checks:
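from datetime import date


def validate_wb_pull(df: pd.DataFrame, indicator: str) -> dict:
    """Summarize a pull so it can be checked against the alert thresholds."""
    if df.empty:
        # Nothing came back: flag it instead of raising a KeyError below.
        return {
            "indicator": indicator, "row_count": 0, "country_count": 0,
            "null_rate": 1.0, "latest_year": None, "stale_alert": True,
        }
    latest_year = int(df["year"].max())
    # Use the current year rather than a hard-coded one.
    years_stale = date.today().year - latest_year
    return {
        "indicator": indicator,
        "row_count": len(df),
        "country_count": int(df["country_code"].nunique()),
        "null_rate": round(float(df["value"].isna().mean()), 4),
        "latest_year": latest_year,
        "stale_alert": years_stale > 2,
    }


report = validate_wb_pull(gdp, "NY.GDP.MKTP.CD")
print(report)
# Expected: row_count ~10000, country_count ~250 (aggregates included).
# null_rate is 0.0 here because fetch_indicator drops null values;
# a rate of ~0.15-0.30 applies only if nulls are kept in the frame.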
Alert thresholds:
- Country count below 180 for a mainstream indicator: check the API or indicator status
- latest_year more than 2 years behind the current year: data may be deprecated
- Null rate above 50% for a core macro indicator: investigate, as it may be a parsing issue
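These thresholds can be applied mechanically to the dictionary that validate_wb_pull returns. The following is a minimal sketch: check_alerts is a hypothetical helper, the default limits are simply the numbers listed above, and the two sample reports are hand-written rather than real results.

```python
def check_alerts(report, min_countries=180, max_null_rate=0.5):
    """Return a list of alert messages for one validation report."""
    alerts = []
    if report["country_count"] < min_countries:
        alerts.append(
            f"{report['indicator']}: only {report['country_count']} countries"
        )
    if report["stale_alert"]:
        alerts.append(
            f"{report['indicator']}: latest year {report['latest_year']} is stale"
        )
    if report["null_rate"] > max_null_rate:
        alerts.append(
            f"{report['indicator']}: null rate {report['null_rate']:.0%}"
        )
    return alerts


# Hand-written sample reports using the keys produced by validate_wb_pull.
healthy = {
    "indicator": "NY.GDP.MKTP.CD", "row_count": 10000, "country_count": 250,
    "null_rate": 0.2, "latest_year": 2024, "stale_alert": False,
}
suspect = {
    "indicator": "XX.FAKE.CODE", "row_count": 900, "country_count": 40,
    "null_rate": 0.7, "latest_year": 2015, "stale_alert": True,
}

print(check_alerts(healthy))  # []
for message in check_alerts(suspect):
    print(message)
```

Returning messages instead of raising lets a scheduled job log every failed check for an indicator in one pass.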
Loading Multiple Indicators Efficiently
import time

indicators = {
    "NY.GDP.MKTP.CD": "gdp_current_usd",
    "NY.GDP.MKTP.KD.ZG": "gdp_growth_pct",
    "SP.POP.TOTL": "population",
    "FP.CPI.TOTL.ZG": "inflation_cpi_pct",
    "SL.UEM.TOTL.ZS": "unemployment_pct",
}

all_frames = []
for code, label in indicators.items():
    df = fetch_indicator(code)
    df["indicator_label"] = label
    all_frames.append(df)
    time.sleep(1)  # polite: no published rate limit, but respect the API

master = pd.concat(all_frames, ignore_index=True)
print(f"Total rows: {len(master)}")
# Expected: 45,000–60,000 rows for 5 indicators
Schema Stability
The Indicators API schema has been stable for years. Indicator codes occasionally retire — check the indicator metadata endpoint (/v2/indicator/{code}) for "sourceNote" and status. Country codes follow ISO 3166 alpha-3 with World Bank extensions (Kosovo = XKX, Channel Islands = CHI). Map these to your geography dimension on first load; the mapping rarely changes.
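A first-load check along these lines might look like the sketch below. It only builds the metadata URL described above and compares codes seen in a pull against a local geography dimension. The names WB_EXTENSION_CODES, indicator_metadata_url, unmapped_codes, and geography_dim are hypothetical, and the extension list contains only the two codes named in the text, so it is not exhaustive.

```python
WB_BASE = "https://api.worldbank.org/v2"

# World Bank extension codes mentioned above; extend as needed.
WB_EXTENSION_CODES = {"XKX": "Kosovo", "CHI": "Channel Islands"}


def indicator_metadata_url(code):
    """Build the metadata URL for one indicator code (JSON format)."""
    return f"{WB_BASE}/indicator/{code}?format=json"


def unmapped_codes(seen_codes, geography_dim):
    """Codes from a pull that have no entry in the geography dimension."""
    known = set(geography_dim) | set(WB_EXTENSION_CODES)
    return sorted(set(seen_codes) - known)


# Hypothetical local dimension and a hand-written list of codes from a pull.
geography_dim = {"BRA": "Brazil", "FRA": "France"}
seen = ["BRA", "FRA", "XKX", "CHI", "ZZZ"]

print(indicator_metadata_url("NY.GDP.MKTP.CD"))
print(unmapped_codes(seen, geography_dim))  # ['ZZZ']
```

Running the comparison on each load surfaces new or changed codes early, even though the mapping rarely changes.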
Next Steps
- Full details on bulk download, rate limits, and the WDI bulk CSV at the World Bank source page on EconIndx
- Python package: pip install wbdata or pip install world_bank_data
- Browse all 1,600+ indicators at data.worldbank.org/indicator