Regression searches for relationships among variables. In other words, you need to find a function that maps some features or variables to others sufficiently well. The dependent features are called the dependent variables, outputs, or responses. The independent features are called the independent variables, inputs, regressors, or predictors.
The coefficient of determination, denoted as R², tells you what amount of variation in y can be explained by the dependence on x, using the particular regression model. A larger R² indicates a better fit and means that the model can better explain the variation of the output with different inputs.

The value R² = 1 corresponds to SSR = 0. That's the perfect fit, since the values of predicted and actual responses fit completely to each other.
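To make the relationship between R² and SSR concrete, here's a minimal sketch, using illustrative made-up values rather than results from any model in this section:

```python
import numpy as np

# Illustrative actual and predicted responses (made-up values)
y_actual = np.array([5.0, 20.0, 14.0, 32.0, 22.0, 38.0])
y_predicted = np.array([8.3, 13.7, 19.1, 24.5, 29.9, 35.3])

ssr = np.sum((y_actual - y_predicted) ** 2)      # sum of squared residuals
sst = np.sum((y_actual - y_actual.mean()) ** 2)  # total sum of squares
r_squared = 1 - ssr / sst                        # equals 1 exactly when SSR = 0
print(r_squared)
```

When the predictions match the actual responses exactly, SSR drops to zero and the ratio SSR/SST vanishes, which is why R² = 1 means a perfect fit.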
Linear regression is probably one of the most important and widely used regression techniques. It's among the simplest regression methods. One of its main advantages is the ease of interpreting results.

Simple, or single-variate, linear regression is the simplest case of linear regression.
(venv) $ python -m pip install numpy scikit-learn statsmodels
There are five basic steps when you're implementing linear regression:

1. Import the packages and classes that you need.
2. Provide data to work with, and eventually do appropriate transformations.
3. Create a regression model and fit it with existing data.
4. Check the results of model fitting to know whether the model is satisfactory.
5. Apply the model for predictions.
You need to call .reshape() on x because this array must be two-dimensional, or more precisely, it must have one column and as many rows as necessary. That's exactly what the argument (-1, 1) of .reshape() specifies.

The next step is to create an instance of the class LinearRegression, which will represent the regression model. You can provide several optional parameters to LinearRegression, such as fit_intercept, copy_X, and n_jobs.

With .fit(), you calculate the optimal values of the weights b₀ and b₁, using the existing input and output, x and y, as the arguments. In other words, .fit() fits the model.

When you apply .score(), the arguments are also the predictor x and response y, and the return value is R².

The fitted model has the attributes .intercept_, which represents the coefficient b₀, and .coef_, which represents b₁.

With .predict(), you pass the regressor as the argument and get the corresponding predicted response.

import numpy as np
from sklearn.linear_model import LinearRegression
x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])
model = LinearRegression().fit(x, y)
r_sq = model.score(x, y)
print(f"coefficient of determination: {r_sq}")
print(f"intercept: {model.intercept_}")
print(f"slope: {model.coef_}")
y_pred = model.predict(x)
print(f"predicted response:\n{y_pred}")
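Once fitted, the same model can also predict responses for inputs it hasn't seen. A minimal, self-contained sketch, where x_new is an arbitrary set of new inputs and not part of the original data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([5, 15, 25, 35, 45, 55]).reshape((-1, 1))
y = np.array([5, 20, 14, 32, 22, 38])
model = LinearRegression().fit(x, y)

# New inputs follow the same shape rule: one column, any number of rows
x_new = np.arange(5).reshape((-1, 1))
y_new = model.predict(x_new)
print(y_new)
```

Note that the first new input is 0, so its predicted response is simply the intercept b₀.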
You can implement multiple linear regression following the same steps as you would for simple regression. The main difference is that your x array will now have two or more columns.
import numpy as np
from sklearn.linear_model import LinearRegression
x = [
[0, 1], [5, 1], [15, 2], [25, 5], [35, 11], [45, 15], [55, 34], [60, 35]
]
y = [4, 5, 20, 14, 32, 22, 38, 43]
x, y = np.array(x), np.array(y)
model = LinearRegression().fit(x, y)
r_sq = model.score(x, y)
print(f"coefficient of determination: {r_sq}")
print(f"intercept: {model.intercept_}")
print(f"coefficients: {model.coef_}")
y_pred = model.predict(x)
print(f"predicted response:\n{y_pred}")
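One detail worth checking after a multiple-regression fit: .intercept_ is still a scalar, while .coef_ now holds one weight per input column. A quick sketch with the same data, which also verifies that each prediction is the intercept plus the dot product of the coefficients and the inputs:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

x = np.array([[0, 1], [5, 1], [15, 2], [25, 5],
              [35, 11], [45, 15], [55, 34], [60, 35]])
y = np.array([4, 5, 20, 14, 32, 22, 38, 43])
model = LinearRegression().fit(x, y)

# Two input columns -> two regression coefficients
print(model.coef_.shape)

# A prediction is b0 + b1*x1 + b2*x2, i.e. intercept plus dot product
manual = model.intercept_ + x @ model.coef_
```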
Implementing polynomial regression with scikit-learn is very similar to linear regression. There's only one extra step: you need to transform the array of inputs to include nonlinear terms such as x².
You can keep the transformer in a variable, as an instance of PolynomialFeatures that you reuse to transform the input x; in the snippet below, the transformation is instead applied in a single step with .fit_transform().
You can provide several optional parameters to PolynomialFeatures, such as degree, interaction_only, and include_bias.
# Step 1: Import packages and classes
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
# Step 2a: Provide data
x = [
[0, 1], [5, 1], [15, 2], [25, 5], [35, 11], [45, 15], [55, 34], [60, 35]
]
y = [4, 5, 20, 14, 32, 22, 38, 43]
x, y = np.array(x), np.array(y)
# Step 2b: Transform input data
x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
# Step 3: Create a model and fit it
model = LinearRegression().fit(x_, y)
# Step 4: Get results
r_sq = model.score(x_, y)
intercept, coefficients = model.intercept_, model.coef_
# Step 5: Predict response
y_pred = model.predict(x_)
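If you're unsure what the transformation produced, inspect x_ directly. With degree=2, include_bias=False, and two input columns, each row of x_ contains the terms x₁, x₂, x₁², x₁x₂, and x₂², in that order. A minimal sketch with a single made-up row:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

x = np.array([[2, 3]])
x_ = PolynomialFeatures(degree=2, include_bias=False).fit_transform(x)
print(x_)  # columns: x1, x2, x1**2, x1*x2, x2**2
```

Because include_bias=False, no constant column of ones is added; LinearRegression supplies the intercept itself.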