Data Type Conversion and Validation
Why dtype conversion matters
Many datasets arrive with wrong dtypes:
- numbers stored as strings
- dates stored as strings
- categories stored inconsistently
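If dtypes are wrong, your statistics and charts can be wrong too. For example, numbers stored as strings are compared and sorted alphabetically rather than by value.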
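To make the risk concrete, here is a minimal sketch using a made-up `amount` column (the data is an assumption for illustration only). It compares the maximum of the column while it is still text with the maximum after conversion.

```python
import pandas as pd

# Hypothetical data: numbers that arrived as strings.
df = pd.DataFrame({"amount": ["100", "20", "3"]})

# On strings, max() compares character by character, so "3" wins.
text_max = df["amount"].max()

# After conversion, max() compares by numeric value.
numeric_max = pd.to_numeric(df["amount"]).max()

print(text_max)     # 3
print(numeric_max)  # 100
```

The same column gives two different answers, and neither call raises an error, which is why this kind of problem is easy to miss.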
Convert to numeric safely
to_numeric
import pandas as pd
df = pd.DataFrame({"amount": ["1,200", "500", "oops", " 700 "]})
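# remove thousands separators and surrounding whitespace before parsing
df["amount"] = df["amount"].astype(str).str.replace(",", "", regex=False).str.strip()
# errors="coerce" turns values that cannot be parsed (such as "oops") into NaN
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")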
print(df)
print(df.isna().sum())
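Counting NaN values tells you how many conversions failed, but not which inputs caused them. One possible approach, sketched below, is to write the converted values to a separate column so the original text stays available for inspection. The column name `amount_num` is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({"amount": ["1,200", "500", "oops", " 700 "]})

# Clean a copy of the text, leaving the original column untouched.
cleaned = df["amount"].astype(str).str.replace(",", "", regex=False).str.strip()

# Store the numeric result next to the raw text.
df["amount_num"] = pd.to_numeric(cleaned, errors="coerce")

# Rows where conversion failed still show what the raw value was.
failed = df.loc[df["amount_num"].isna(), "amount"]
print(failed.tolist())  # ['oops']
```

Once the failed values have been reviewed, the raw column can be dropped or the numeric column renamed.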
Convert to datetime
to_datetime
import pandas as pd
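df = pd.DataFrame({"date": ["2025-01-01", "2025/01/02", "invalid"]})
# errors="coerce" turns values that cannot be parsed into NaT instead of raising
# note: pandas 2.0+ infers one format from the first value, so "2025/01/02" may also become NaT
df["date"] = pd.to_datetime(df["date"], errors="coerce")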
print(df)
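When a column mixes several date formats, how a single `to_datetime` call handles the second format can depend on the pandas version. A more explicit option, sketched here, is to try each expected format in turn and keep whatever parses. The list `known_formats` and the column name `date_parsed` are assumptions for this example.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2025-01-01", "2025/01/02", "invalid"]})

# Formats we expect to see in this column (an assumption for this example).
known_formats = ["%Y-%m-%d", "%Y/%m/%d"]

# Parse with the first format, then fill remaining gaps with the next ones.
parsed = pd.to_datetime(df["date"], format=known_formats[0], errors="coerce")
for fmt in known_formats[1:]:
    attempt = pd.to_datetime(df["date"], format=fmt, errors="coerce")
    parsed = parsed.fillna(attempt)

df["date_parsed"] = parsed

# Anything still missing matched none of the expected formats.
unparsed = df.loc[parsed.isna(), "date"].tolist()
print(df)
print(unparsed)  # ['invalid']
```

Listing the formats explicitly also documents what the column is supposed to contain, so an unexpected new format shows up as a missing value rather than being guessed at.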
Categories
category dtype
import pandas as pd
df = pd.DataFrame({"city": ["Pune", "Delhi", "Pune"]})
# category dtype stores each distinct value once and an integer code per row
df["city"] = df["city"].astype("category")
print(df.dtypes)
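Converting to `category` does not fix inconsistent spellings by itself: " pune" and "Pune" would become two separate categories. A minimal sketch of normalizing the text first is shown below. The messy sample values are made up, and title-casing is only one possible convention.

```python
import pandas as pd

# Hypothetical data with inconsistent case and stray whitespace.
df = pd.DataFrame({"city": ["Pune", " pune", "DELHI", "Delhi ", "Pune"]})

# Trim whitespace and unify case before converting to category.
df["city"] = df["city"].str.strip().str.title().astype("category")

print(df["city"].cat.categories.tolist())  # ['Delhi', 'Pune']
print(df["city"].value_counts())
```

Checking `cat.categories` after conversion is a quick way to spot leftover variants that the normalization did not catch.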
Validate assumptions
Typical checks:
- ID columns contain no duplicates
- numeric columns are non-negative
- date columns are within expected range
Validation examples
# no duplicate ids
# assert df["id"].is_unique
# non-negative values
# assert (df["amount"] >= 0).all()
# date range
# assert df["date"].min() >= pd.Timestamp("2020-01-01")
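A bare `assert` stops at the first failure. As an alternative, the sketch below runs the same three checks and collects every problem it finds. The function name `find_problems`, the sample data, and the column names `id`, `amount`, and `date` are assumptions for illustration.

```python
import pandas as pd

def find_problems(df, min_date):
    """Return a list of problem descriptions; an empty list means all checks passed."""
    problems = []

    # IDs must be unique.
    if not df["id"].is_unique:
        dupes = df.loc[df["id"].duplicated(keep=False), "id"].unique().tolist()
        problems.append(f"duplicate ids: {dupes}")

    # Amounts must be non-negative (NaN values are not flagged by this check).
    negative = df["amount"] < 0
    if negative.any():
        problems.append(f"{int(negative.sum())} negative amount(s)")

    # Dates must not be earlier than min_date.
    too_early = df["date"] < min_date
    if too_early.any():
        problems.append(f"{int(too_early.sum())} date(s) before {min_date.date()}")

    return problems

# Hypothetical data that violates all three checks.
df = pd.DataFrame({
    "id": [1, 2, 2],
    "amount": [10.0, -5.0, 30.0],
    "date": pd.to_datetime(["2021-03-01", "2019-12-31", "2022-07-15"]),
})

problems = find_problems(df, pd.Timestamp("2020-01-01"))
for problem in problems:
    print(problem)
```

Returning a list instead of raising makes it possible to log every issue in one pass and decide afterwards whether to stop the pipeline.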
Tip
Convert and validate right after loading the data, before any analysis. Bad values then surface immediately instead of causing subtle bugs later.
