15. Regression Models in Machine Learning#

15.1. Introduction#

Regression analysis is a fundamental method in statistics and machine learning that lets us model relationships between variables, predict outcomes, identify trends, and make informed decisions. In this notebook, we explore several common regression techniques, each with its own strengths and typical use cases, along with code examples and real-life applications that illustrate their power.

15.2. Linear Regression#

15.2.1. Theory#

Linear Regression is the simplest form of regression. It assumes a linear relationship between the independent variable (feature) $X$ and the dependent variable (output) $Y$. The relationship is modeled by the equation:

$$ Y = \beta_0 + \beta_1 X + \epsilon $$

where:

  • $Y$ is the dependent variable.

  • $\beta_0$ is the y-intercept.

  • $\beta_1$ is the slope of the line.

  • $\epsilon$ represents the error term (residuals).

The goal of Linear Regression is to find the parameters $\beta_0$ and $\beta_1$ that minimize the sum of squared residuals.
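For a single feature, this least-squares problem has a standard closed-form solution; as a brief refresher (a general textbook result, not tied to any particular library), the estimates are

$$ \hat{\beta}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}, \qquad \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} $$

where $\bar{x}$ and $\bar{y}$ are the sample means. In practice, libraries such as scikit-learn compute these estimates for us.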

15.2.2. Examples#

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Example data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([2, 3, 5, 7, 11])

# Create a linear regression model
model = LinearRegression()
model.fit(X, Y)

# Plot the results
plt.scatter(X, Y, color='blue', label="Data Points")
plt.plot(X, model.predict(X), color='red', label="Linear Fit")
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Linear Regression Example')
plt.legend()
plt.show()
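After fitting, the estimated parameters can be read directly off the model; a quick check using the model fitted above (scikit-learn exposes them as intercept_ and coef_):

# Estimated parameters of the fitted line
print(f'Intercept (beta_0): {model.intercept_:.2f}')
print(f'Slope (beta_1): {model.coef_[0]:.2f}')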

Real-world Application: Predicting house prices based on various features such as size, location, and number of rooms.

15.3. Polynomial Regression#

15.3.1. Theory#

Polynomial Regression is an extension of Linear Regression that models the relationship as an $n^{th}$ degree polynomial. It is represented as:

$$ Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \ldots + \beta_n X^n + \epsilon $$

This type of regression is useful for capturing non-linear patterns in the data.
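In scikit-learn this is typically implemented by expanding the inputs into polynomial features and then fitting an ordinary linear model on the expanded matrix. A minimal sketch of just the expansion step (degree 2, single feature):

import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[1], [2], [3]])
# Each row [x] is expanded to [1, x, x^2]
print(PolynomialFeatures(degree=2).fit_transform(x))
# [[1. 1. 1.]
#  [1. 2. 4.]
#  [1. 3. 9.]]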

15.3.2. Examples#

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Example data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([1, 4, 9, 16, 25])  # Quadratic relationship

# Create a polynomial regression model
degree = 2
polynomial_model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
polynomial_model.fit(X, Y)

# Plot the results
plt.scatter(X, Y, color='green', label="Data Points")
plt.plot(X, polynomial_model.predict(X), color='orange', label=f"Degree {degree} Polynomial Fit")
plt.xlabel('X')
plt.ylabel('Y')
plt.title('Polynomial Regression Example')
plt.legend()
plt.show()

Real-world Application: Modeling the growth of populations where growth rates change over time.

15.4. Ridge and Lasso Regression#

15.4.1. Theory#

Ridge Regression adds a penalty proportional to the sum of the squared coefficients (an L2 penalty), which shrinks the coefficients and reduces model complexity.

$$ \text{Loss} = \sum_i (y_i - \hat{y}_i)^2 + \lambda \sum_j \beta_j^2 $$

Lasso Regression adds a penalty proportional to the sum of the absolute values of the coefficients (an L1 penalty), which can shrink some coefficients exactly to zero, allowing for feature selection.

$$ \text{Loss} = \sum_i (y_i - \hat{y}_i)^2 + \lambda \sum_j |\beta_j| $$

The term $\lambda$ is a hyperparameter that controls the strength of the penalty.

15.4.2. Examples#

from sklearn.linear_model import Ridge, Lasso

# Example data
X = np.random.rand(10, 3)
Y = np.dot(X, np.array([1.5, -2, 3])) + np.random.normal(0, 0.1, 10)

# Ridge Regression
ridge_model = Ridge(alpha=1.0)
ridge_model.fit(X, Y)
ridge_coef = ridge_model.coef_

# Lasso Regression
lasso_model = Lasso(alpha=0.1)
lasso_model.fit(X, Y)
lasso_coef = lasso_model.coef_

print(f'Ridge Coefficients: {ridge_coef}')
print(f'Lasso Coefficients: {lasso_coef}')
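In practice, $\lambda$ (exposed as alpha in scikit-learn) is usually chosen by cross-validation rather than fixed by hand. A minimal sketch using the cross-validated estimators, continuing with the same X and Y (the alpha grid here is only illustrative):

from sklearn.linear_model import RidgeCV, LassoCV

# Search over a small grid of penalty strengths via cross-validation
ridge_cv = RidgeCV(alphas=[0.01, 0.1, 1.0, 10.0]).fit(X, Y)
lasso_cv = LassoCV(alphas=[0.01, 0.1, 1.0, 10.0], cv=3).fit(X, Y)

print(f'Best Ridge alpha: {ridge_cv.alpha_}')
print(f'Best Lasso alpha: {lasso_cv.alpha_}')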

Real-world Application: Ridge is often used for multicollinear data where predictor variables are highly correlated, while Lasso is useful for producing smaller, more interpretable models by selecting only the most influential features.

15.5. Logistic Regression#

15.5.1. Theory#

Logistic Regression is used for binary classification problems. It predicts the probability of occurrence of an event by fitting data to a logistic (sigmoid) curve. The logistic model is given by:

$$ P(Y=1 \mid X) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X)}} $$

where $P(Y=1 \mid X)$ is the probability that the dependent variable $Y$ equals 1. A common decision rule assigns class 1 whenever this probability exceeds 0.5, which corresponds to $\beta_0 + \beta_1 X > 0$.

15.5.2. Examples#

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load dataset (iris)
iris = load_iris()
X = iris.data[:, :2]  # using only two features for simplicity
Y = (iris.target != 0) * 1  # making it a binary classification

# Split data
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=42)

# Create and train the model
log_model = LogisticRegression()
log_model.fit(X_train, Y_train)

# Evaluate the model
predictions = log_model.predict(X_test)
accuracy = accuracy_score(Y_test, predictions)
print(f'Logistic Regression Accuracy: {accuracy:.2f}')
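Beyond hard class labels, the fitted model also exposes per-class probabilities via predict_proba, which is often what you actually want from logistic regression; a quick illustration with the model trained above:

# Probabilities of class 0 and class 1 for the first few test samples
print(log_model.predict_proba(X_test[:3]))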

Real-world Application: Predicting whether an email is spam or not is a typical application of logistic regression.

15.6. Model Evaluation Techniques#

15.6.1. Theory#

To ensure a model’s reliability, different model evaluation techniques are employed:

  1. Training and Testing Split: Divides data into training and testing subsets.

  2. Cross-validation: Splits the data into k folds so that every observation is used for both training and testing in rotation (see the sketch after this list).

  3. Metrics:

    • Mean Squared Error (MSE): Evaluates the average of the squared differences between the observed and predicted values.

    • R-squared: Measures the proportion of variance in the dependent variable predictable from the independent variable(s).
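To make the rotation concrete, here is a minimal sketch of what k-fold cross-validation does under the hood, using scikit-learn's KFold splitter on the small dataset from the linear regression example (variable names are illustrative):

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([2, 3, 5, 7, 11])

# Each fold serves once as the test set while the remaining folds train the model
for fold, (train_idx, test_idx) in enumerate(KFold(n_splits=5).split(X)):
    fold_model = LinearRegression().fit(X[train_idx], Y[train_idx])
    fold_mse = mean_squared_error(Y[test_idx], fold_model.predict(X[test_idx]))
    print(f'Fold {fold}: test MSE = {fold_mse:.2f}')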

15.6.2. Examples#

from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score

# Re-create the data and model from the linear regression example,
# since X and Y were reassigned in the sections above
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
Y = np.array([2, 3, 5, 7, 11])
model = LinearRegression().fit(X, Y)

# Error metrics on the training data
mse = mean_squared_error(Y, model.predict(X))
r2 = r2_score(Y, model.predict(X))

# 2-fold cross-validation (with only 5 samples, larger k leaves too few
# points per fold for a meaningful R-squared score)
cross_val_scores = cross_val_score(model, X, Y, cv=2)

print(f'Mean Squared Error: {mse:.2f}')
print(f'R-squared: {r2:.2f}')
print(f'Cross-Validation Scores: {cross_val_scores}')

Insight: These techniques are invaluable in gauging a model’s performance on unseen data and understanding its generalization capabilities.

In conclusion, regression models are indispensable in machine learning, each bringing unique benefits to tackle a variety of challenges across domains. These foundational techniques pave the way for building more sophisticated models tailored to specific needs.