Reshaping Data (pivot, pivot_table, melt)
Long vs wide (why reshaping exists)
- Long format: one row per observation (better for analysis/plotting)
- Wide format: values spread across columns (good for reporting)
Example: long format
Long format
import pandas as pd
df = pd.DataFrame({
"date": ["2025-01-01", "2025-01-01", "2025-01-02", "2025-01-02"],
"city": ["pune", "delhi", "pune", "delhi"],
"sales": [100, 120, 90, 150],
})
print(df)Long format
import pandas as pd
df = pd.DataFrame({
"date": ["2025-01-01", "2025-01-01", "2025-01-02", "2025-01-02"],
"city": ["pune", "delhi", "pune", "delhi"],
"sales": [100, 120, 90, 150],
})
print(df)pivotpivot: reshape when data is unique
pivot
wide = df.pivot(index="date", columns="city", values="sales")
print(wide)pivot
wide = df.pivot(index="date", columns="city", values="sales")
print(wide)If the combination of (indexindex, columnscolumns) isn’t unique, pivotpivot throws an error.
pivot_tablepivot_table: reshape with aggregation
pivot_table
wide = df.pivot_table(index="date", columns="city", values="sales", aggfunc="sum")
print(wide)pivot_table
wide = df.pivot_table(index="date", columns="city", values="sales", aggfunc="sum")
print(wide)meltmelt: wide → long
melt
wide = pd.DataFrame({
"date": ["2025-01-01", "2025-01-02"],
"pune": [100, 90],
"delhi": [120, 150],
})
long = wide.melt(id_vars=["date"], var_name="city", value_name="sales")
print(long)melt
wide = pd.DataFrame({
"date": ["2025-01-01", "2025-01-02"],
"pune": [100, 90],
"delhi": [120, 150],
})
long = wide.melt(id_vars=["date"], var_name="city", value_name="sales")
print(long)Practical notes
- Many ML/EDA workflows prefer long format.
- Pivoted tables often have a multi-index for columns—use
reset_index()reset_index()orwide.columnswide.columnsrenaming if needed.
If this helped you, consider buying me a coffee ☕
Buy me a coffeeWas this page helpful?
Let us know how we did
