# What is regression and linear regression?

## What is regression? Which models can you use to solve a regression problem?

Regression is a branch of supervised machine learning. Regression models investigate the relationship between a dependent variable (the target) and one or more independent variables (the predictors). Here are some common regression models:

- *Linear Regression* establishes a linear relationship between the target and the predictor(s). It predicts a numeric value, and with a single predictor the fitted model is a straight line.
- *Polynomial Regression* has a regression equation in which the power of the independent variable is greater than 1. The fit is a curve through the data points.
- *Ridge Regression* helps when predictors are highly correlated (the multicollinearity problem). It penalizes the squares of the regression coefficients (L2 regularization), shrinking them toward zero without letting them reach exactly zero.
- *Lasso Regression* penalizes the absolute values of the regression coefficients (L1 regularization) and allows some of the coefficients to become exactly zero, thereby performing feature selection.
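The difference between the ridge and lasso penalties can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration: the synthetic data, the `alpha` values, the omission of an intercept, and the helper names `fit_ols`, `fit_ridge`, `fit_lasso`, and `soft_threshold`. The lasso is fitted with a plain coordinate-descent loop, which is one common approach rather than the only one.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x0 = rng.normal(size=n)                    # relevant predictor
x1 = rng.normal(size=n)                    # irrelevant predictor
X = np.column_stack([x0, x1])
y = 3.0 * x0 + 0.1 * rng.normal(size=n)    # y depends on x0 only

def fit_ols(X, y):
    # Ordinary least squares, no penalty.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def fit_ridge(X, y, alpha):
    # Minimizes ||y - X B||^2 + alpha * ||B||^2 (L2 penalty).
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)

def soft_threshold(rho, alpha):
    # Shrinks rho toward zero and clips it to exactly zero when |rho| <= alpha.
    return np.sign(rho) * max(abs(rho) - alpha, 0.0)

def fit_lasso(X, y, alpha, n_iters=100):
    # Minimizes (1 / (2n)) * ||y - X B||^2 + alpha * ||B||_1 (L1 penalty)
    # by cycling over the coordinates one at a time.
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_iters):
        for j in range(p):
            # Residual with the contribution of feature j removed.
            residual = y - X @ beta + X[:, j] * beta[j]
            rho = X[:, j] @ residual / n
            z = X[:, j] @ X[:, j] / n
            beta[j] = soft_threshold(rho, alpha) / z
    return beta

beta_ols = fit_ols(X, y)
beta_ridge = fit_ridge(X, y, alpha=50.0)
beta_lasso = fit_lasso(X, y, alpha=0.5)

print("OLS:  ", beta_ols)
print("Ridge:", beta_ridge)
print("Lasso:", beta_lasso)
```

In this setup the ridge coefficients come out smaller in magnitude than the unpenalized ones but stay nonzero, while the lasso sets the coefficient of the irrelevant predictor to exactly zero.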

## What is linear regression? When do we use it?

Linear regression is a model that assumes a linear relationship between the input variables (X) and the single output variable (y).

It can be written as a simple equation:

```
y = B0 + B1*x1 + ... + Bn*xn
```

B0, ..., Bn are the regression coefficients (B0 is the intercept), the x values are the independent (explanatory) variables, and y is the dependent variable.
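As a small illustration of how the equation turns coefficients into predictions, here is a sketch that assumes NumPy and uses made-up values for the intercept `B0`, the coefficient vector `B`, and the feature matrix `X`:

```python
import numpy as np

B0 = 1.0                        # intercept (hypothetical value)
B = np.array([2.0, 3.0])        # one coefficient per feature (hypothetical values)

# Two observations, two features each.
X = np.array([
    [1.0, 2.0],
    [3.0, 4.0],
])

# y = B0 + B1*x1 + B2*x2, evaluated for every row of X at once.
y_pred = B0 + X @ B
print(y_pred)
```

The first row gives 1 + 2*1 + 3*2 = 9 and the second gives 1 + 2*3 + 3*4 = 19.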

The case of one explanatory variable is called simple linear regression. For more than one explanatory variable, the process is called multiple linear regression.

Simple linear regression:

```
y = B0 + B1*x1
```

Multiple linear regression:

```
y = B0 + B1*x1 + ... + Bn*xn
```
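For the simple (one-variable) case, the least-squares coefficients have a well-known closed form: the slope is the covariance of x and y divided by the variance of x, and the intercept follows from the means. The sketch below assumes NumPy and uses a small made-up dataset that lies exactly on a line:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 + 3.0 * x               # toy data generated from y = 2 + 3x

x_mean, y_mean = x.mean(), y.mean()

# Slope: sum of cross-deviations divided by the sum of squared deviations of x.
B1 = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
# Intercept: the fitted line passes through the point of means.
B0 = y_mean - B1 * x_mean

print(B0, B1)
```

Because the data are noise-free, the fit recovers the intercept 2 and the slope 3 up to floating-point rounding.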

## What methods for solving linear regression do you know?

To solve linear regression, you need to find the coefficients which minimize the sum of squared errors.
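To make the objective concrete, here is a minimal sketch of the sum of squared errors for a candidate set of coefficients. It assumes NumPy; the function name `sum_squared_errors` and the toy data are illustrative only.

```python
import numpy as np

def sum_squared_errors(X, y, B0, B):
    # Residual = observed value minus the value predicted by the linear model.
    residuals = y - (B0 + X @ B)
    return float(np.sum(residuals ** 2))

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])          # generated from y = 1 + 2x

# Coefficients that match the data exactly versus an intercept that is off by 1.
sse_good = sum_squared_errors(X, y, B0=1.0, B=np.array([2.0]))
sse_bad = sum_squared_errors(X, y, B0=0.0, B=np.array([2.0]))
print(sse_good, sse_bad)
```

The exact coefficients give an error of 0, while shifting the intercept by 1 makes every residual equal to 1, for a total of 4. Solving linear regression means finding the coefficients with the smallest such value.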

Matrix algebra method: Let’s say you have `X`, a matrix of features, and `y`, a vector with the values you want to predict. After going through the matrix algebra and the minimization problem, you get this solution: $B = (X^T X)^{-1} X^T y$.
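A minimal sketch of the matrix algebra method, assuming NumPy and synthetic noise-free data. It solves the normal equations `(X^T X) B = X^T y` with `np.linalg.solve` instead of forming the inverse explicitly; the column of ones is added so that the first coefficient acts as the intercept.

```python
import numpy as np

rng = np.random.default_rng(42)
features = rng.normal(size=(100, 2))

# Toy target generated from known coefficients: intercept 1, slopes 2 and -3.
y = 1.0 + 2.0 * features[:, 0] - 3.0 * features[:, 1]

# Prepend a column of ones so the intercept is estimated with the other coefficients.
X = np.column_stack([np.ones(len(features)), features])

# Solve (X^T X) B = X^T y for B.
B = np.linalg.solve(X.T @ X, X.T @ y)
print(B)
```

With noise-free data the recovered coefficients match the ones used to generate `y` up to floating-point rounding. This approach breaks down when `X.T @ X` is singular or badly conditioned.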

But solving this requires you to find an inverse, which can be time-consuming, if not impossible (for example, when the matrix to invert is singular). Luckily, there are methods like Singular Value Decomposition (SVD) or QR Decomposition that can reliably calculate this part (called the pseudo-inverse) without actually needing to find an inverse. The popular Python ML library `sklearn` uses SVD to solve least squares.
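The following sketch shows how an SVD-based pseudo-inverse still produces a least-squares solution when the plain inverse does not exist. It assumes NumPy; the toy data deliberately contain a duplicated (scaled) feature, and the tolerance used to discard tiny singular values is a common convention rather than a required choice.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 3.0 * x

# The third column is exactly twice the second, so X^T X is singular
# and the plain normal-equation inverse does not exist.
X = np.column_stack([np.ones_like(x), x, 2.0 * x])

# Thin SVD: X = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Invert only the singular values that are meaningfully above zero.
tol = max(X.shape) * np.finfo(float).eps * s.max()
s_inv = np.array([1.0 / v if v > tol else 0.0 for v in s])

# Pseudo-inverse applied to y: B = V @ diag(s_inv) @ U^T @ y.
B_svd = Vt.T @ (s_inv * (U.T @ y))

# NumPy's built-in least-squares solver, for comparison.
B_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

print(B_svd)
print(B_lstsq)
```

Both routes return the minimum-norm solution, which splits the slope of 3 across the two collinear columns while still reproducing `y` exactly.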

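Alternative method: gradient descent, which starts from an initial guess for the coefficients and repeatedly updates them in the direction that reduces the sum of squared errors.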
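A minimal sketch of batch gradient descent for linear regression, assuming NumPy. The function name `gradient_descent`, the learning rate, and the iteration count are illustrative choices tuned for this toy dataset, not general recommendations. The gradient is taken of the mean squared error, which has the same minimizer as the sum of squared errors.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 20)
y = 2.0 + 3.0 * x                              # toy data from y = 2 + 3x
X = np.column_stack([np.ones_like(x), x])      # column of ones for the intercept

def gradient_descent(X, y, learning_rate=0.1, n_iters=5000):
    n, p = X.shape
    B = np.zeros(p)                            # initial guess: all coefficients zero
    for _ in range(n_iters):
        residuals = X @ B - y
        # Gradient of the mean squared error with respect to B.
        gradient = (2.0 / n) * (X.T @ residuals)
        # Step against the gradient to reduce the error.
        B -= learning_rate * gradient
    return B

B = gradient_descent(X, y)
print(B)
```

On this small, well-scaled problem the iterates converge to the same coefficients as the closed-form solution. Gradient descent becomes attractive when the dataset is too large for the matrix methods to be practical.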