Iris Flower Classification

Goal

Use the Iris dataset to:

  • Perform exploratory data analysis (EDA)
  • Visualize relationships between features
  • Train a simple classifier

Load dataset

Load iris
import pandas as pd
from sklearn.datasets import load_iris
 
iris = load_iris(as_frame=True)
df = iris.frame
print(df.head())
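Beyond `df.head()`, a few summary calls cover most of the EDA goal. The following is a minimal sketch, not a prescribed checklist: it checks the shape, looks for missing values, counts samples per class, and compares per-class feature means. The variable names `class_counts` and `class_means` are illustrative.

```python
import pandas as pd
from sklearn.datasets import load_iris

iris = load_iris(as_frame=True)
df = iris.frame

# Overall size and missing values
print(df.shape)
print(df.isna().sum())

# Number of samples per class label
class_counts = df["target"].value_counts().sort_index()
print(class_counts)

# Mean of each feature within each class; large gaps between rows
# hint at features that separate the classes well
class_means = df.groupby("target").mean()
print(class_means)
```

Features whose class means differ widely relative to their spread are good candidates to inspect first in the pair plot.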

Pair plot

Pair plot
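import matplotlib.pyplot as plt
import seaborn as sns

# Scatter plots for every feature pair, colored by class label
sns.pairplot(df, hue="target")
plt.show()  # needed when running as a script rather than in a notebook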

Train/test split + model

Train a classifier
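from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
 
# Features are every column except the class label
X = df.drop(columns=["target"])
y = df["target"]
 
# Hold out 20% for testing; stratify keeps class proportions equal in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
 
# max_iter is raised so the solver has room to converge
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
print("Accuracy:", clf.score(X_test, y_test))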
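A single accuracy number does not show which species get confused with each other. As a hedged extension of the step above, the sketch below repeats the same split and model, then prints a confusion matrix and a per-class report. The variables `y_pred` and `cm` are names introduced here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Same split and model settings as the training step
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Precision, recall, and F1 per species
print(classification_report(y_test, y_pred, target_names=iris.target_names))
```

Off-diagonal entries in the matrix point to the species pairs that overlap in feature space.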

Deliverable

  • Which features separate the classes best?
  • Is there any overlap between species?
  • What baseline accuracy does the classifier reach on the test set?
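To back these answers with numbers rather than visual impressions alone, one possible approach is sketched below. It ranks features by a one-way ANOVA F-score (`f_classif`) and estimates accuracy with 5-fold cross-validation, which is less sensitive to a single 30-sample test split. Both choices are assumptions made for this sketch, not something the exercise requires, and `ranking` and `cv_scores` are illustrative names.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.feature_selection import f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

iris = load_iris(as_frame=True)
X, y = iris.data, iris.target

# Higher F-score: class means differ more relative to within-class spread
f_scores, p_values = f_classif(X, y)
ranking = pd.Series(f_scores, index=X.columns).sort_values(ascending=False)
print(ranking)

# Mean and spread of accuracy across 5 stratified folds
cv_scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
print("CV accuracy: %.3f +/- %.3f" % (cv_scores.mean(), cv_scores.std()))
```

The F-score looks at one feature at a time, so it complements the pair plot rather than replacing it.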
