Python makes it straightforward to analyze data and make predictions with linear regression. **Linear Regression** is one of the most basic machine learning and statistical techniques: it models the relationship between variables as a straight line.

Regression is one of the most important areas of machine learning and data science, and many regression methods are available today; **Linear Regression** is one of them. Regression is used to find the relationship among variables. For example, given historical data that pairs each year with an income figure, we can use linear regression to predict the income for a future year. I’ll be using the **scikit-learn** library to implement linear regression.
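To make the idea of "fitting a straight line" concrete, here is a minimal sketch of the ordinary least-squares formulas for a single feature, written with plain `numpy`. The numbers are made up for illustration and are not the World Bank data; the variable names are my own.

```python
import numpy as np

# Made-up, perfectly linear data (illustration only)
years = np.array([2000, 2001, 2002, 2003, 2004], dtype=float)
income = np.array([20000, 21000, 22000, 23000, 24000], dtype=float)

# Least-squares slope: sum of co-deviations divided by sum of squared x-deviations
dx = years - years.mean()
dy = income - income.mean()
slope = (dx * dy).sum() / (dx ** 2).sum()

# The fitted line always passes through the point (mean of x, mean of y)
intercept = income.mean() - slope * years.mean()

# Use the line y = slope * x + intercept to estimate a future year
estimate_2005 = slope * 2005 + intercept
print(f"slope={slope}, intercept={intercept}, estimate for 2005={estimate_2005}")
```

scikit-learn performs an equivalent fit for us, so the rest of this post relies on the library instead of these hand-written formulas.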

## Import the relevant packages

Import the package `numpy`, `matplotlib` for charts, `pandas` for reading CSV files, and the class `LinearRegression` from `sklearn.linear_model` to implement linear regression.

```
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model
```

### Read the CSV file

You can download Canada's per capita income data from the World Bank or from other data sources. I have gathered the data from the World Bank, and it’s included in the CSV file below.

Read the **CSV** file with the pandas `read_csv` function.

`csv = pd.read_csv("gdp-per-capita-us.csv")`
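Before plotting, it can help to check that the file loaded the way you expect. The sketch below uses a few made-up rows held in memory instead of the real file, so it runs on its own; only the column names `year` and `income` come from the code in this post, and the values are placeholders.

```python
import io
import pandas as pd

# Made-up rows standing in for the real CSV file (illustration only).
# The column names `year` and `income` match the ones used in this post.
sample_csv = io.StringIO(
    "year,income\n"
    "2000,20000.0\n"
    "2001,21000.0\n"
    "2002,22000.0\n"
)
df = pd.read_csv(sample_csv)

print(df.head())        # first rows, to confirm the columns parsed correctly
print(df.dtypes)        # both columns should be numeric
print(df.isna().sum())  # count of missing values per column
```

With the real file, you would call the same three checks on the `csv` DataFrame.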

### Display using Scatter Chart

Display the CSV data in a scatter chart with the help of the **Matplotlib** library.

```
plt.scatter(csv.year, csv.income, marker="*",color="green")
plt.plot(csv.year,csv.income, color="yellow")
plt.xlabel("Year")
plt.ylabel("Income")
plt.show()
```

### Run Linear Regression

Create the linear regression model and predict the income of the year 2020.

```
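l_r = linear_model.LinearRegression()
# Fit on a 2D feature frame (year) and a 1D target (income)
l_r.fit(csv[['year']], csv.income)
# Predict with a DataFrame so the feature name matches the one used in fit
l_r.predict(pd.DataFrame({'year': [2020]}))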
```
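A prediction is more useful when you also know how well the line fits the data. The sketch below fits the same kind of model on made-up, perfectly linear data (not the World Bank figures) and reports the R² score from `LinearRegression.score`; a value close to 1 means the line explains most of the variation.

```python
import pandas as pd
from sklearn import linear_model

# Made-up, perfectly linear data (illustration only)
data = pd.DataFrame({
    "year": [2000, 2001, 2002, 2003, 2004],
    "income": [20000.0, 21000.0, 22000.0, 23000.0, 24000.0],
})

model = linear_model.LinearRegression()
model.fit(data[["year"]], data["income"])

# Predict for a future year, passing the same column name used in fit
future = pd.DataFrame({"year": [2005]})
predicted = model.predict(future)[0]

# R^2 on the training data
r2 = model.score(data[["year"]], data["income"])

print(f"Predicted income for 2005: {predicted:.2f}")
print(f"R^2 on training data: {r2:.3f}")
```

On real data the score will be lower than on this toy example, and scoring on the training data alone tends to be optimistic.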

You can also use the equation of the straight line, y = coefficient × year + intercept, to predict the income of 2020 manually, as shown below:

```
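l_r.coef_       # array holding one slope per feature
l_r.intercept_  # scalar intercept
# Index coef_ so y is a scalar, which %d formatting expects
y = l_r.coef_[0]*2020 + l_r.intercept_
print("Predicted Income of 2020 is %d" % y)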
```
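As a quick sanity check, the manual calculation and `predict` should give the same number. Here is a small sketch on made-up data (again, not the World Bank figures) that compares the two with `numpy.isclose`.

```python
import numpy as np
from sklearn import linear_model

# Made-up data with some noise (illustration only)
years = np.array([[2000], [2001], [2002], [2003], [2004]])
income = np.array([20500.0, 20900.0, 22100.0, 22800.0, 24200.0])

model = linear_model.LinearRegression().fit(years, income)

# coef_ holds one slope per feature, so take the first (and only) entry
slope = model.coef_[0]
intercept = model.intercept_

# Straight-line equation versus the library's own prediction
manual = slope * 2005 + intercept
from_predict = model.predict(np.array([[2005]]))[0]

matches = np.isclose(manual, from_predict)
print(f"manual={manual:.2f}, predict={from_predict:.2f}, match={matches}")
```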

So the final program will look like this:

```
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from sklearn import linear_model
csv = pd.read_csv("gdp-per-capita-us.csv")
plt.scatter(csv.year, csv.income, marker="*",color="green")
plt.plot(csv.year,csv.income, color="yellow")
plt.xlabel("Year")
plt.ylabel("Income")
plt.show()
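l_r = linear_model.LinearRegression()
l_r.fit(csv[['year']], csv.income)
# Predict with a DataFrame so the feature name matches the one used in fit
l_r.predict(pd.DataFrame({'year': [2020]}))
# To predict manually using the equation of the straight line
l_r.coef_
l_r.intercept_
# Index coef_ so y is a scalar, which %d formatting expects
y = l_r.coef_[0]*2020 + l_r.intercept_
print("Predicted Income of 2020 is %d" % y)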
```