Simple linear regression predicts a response from a single feature, under the assumption that the two variables are linearly related. The goal is to find a linear function that predicts the response value (y) as accurately as possible from the feature, or independent variable (x). To implement machine learning algorithms in Python, we need to install the modules numpy, scipy, pandas, matplotlib, statsmodels and sklearn (distributed as the scikit-learn package). Let us look at an implementation of simple linear regression in Python. As an example, we use linear regression to predict a stock index price, Stock_Index_Price (the dependent variable), from a single input variable, Interest_Rate. The example uses the stock market data stored in stock.csv.
Before applying a linear regression model, we have to validate that several assumptions are met. Most notably, a linear relationship must exist between the dependent variable and the independent variable(s). In this example, we want to check whether a linear relationship exists between the Stock_Index_Price (dependent variable) and the Interest_Rate (independent variable).
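Besides a scatter plot, a quick numeric check of linearity is the Pearson correlation coefficient. The following is a minimal sketch, assuming the two columns are typed in by hand from stock.csv rather than read from the file; the variable names are illustrative only.

```python
import numpy as np

# Interest_Rate and Stock_Index_Price columns copied from stock.csv
# (entered by hand here so the snippet runs without the file)
interest_rate = np.array([
    2.75, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.25, 2.25, 2.25, 2.0, 2.0,
    2.0, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75,
])
stock_index_price = np.array([
    1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167, 1130, 1075,
    1047, 965, 943, 958, 971, 949, 884, 866, 876, 822, 704, 719,
])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between the two variables
r = np.corrcoef(interest_rate, stock_index_price)[0, 1]
print('Pearson correlation:', r)
```

A value close to +1 or -1 suggests a strong linear relationship, while a value near 0 suggests that a straight line is a poor fit. The correlation alone does not validate the other regression assumptions.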
The following Python code shows how to implement simple linear regression with sklearn.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model

# Read the input data from a CSV file
df = pd.read_csv("stock.csv")

# Scatter plot of Interest_Rate vs Stock_Index_Price
# to see whether the relationship is linear
plt.scatter(df['Interest_Rate'], df['Stock_Index_Price'], color='red')
plt.title('Stock Index Price Vs Interest Rate', fontsize=14)
plt.xlabel('Interest Rate', fontsize=14)
plt.ylabel('Stock Index Price', fontsize=14)
plt.grid(True)
plt.show()

# Simple linear regression: one independent variable
X = df[['Interest_Rate']]
Y = df['Stock_Index_Price']

# Fit the model with sklearn
regr = linear_model.LinearRegression()
regr.fit(X, Y)

# Display the intercept and the coefficient
print('Intercept: \n', regr.intercept_)
print('Coefficients: \n', regr.coef_)

# Predict with sklearn for all the interest rates
New_Interest_Rate = df[['Interest_Rate']]
df1 = pd.DataFrame(regr.predict(New_Interest_Rate))
print('Predicted Stock Index Price: \n')
print(df1)
The Python code in IDLE is pasted below.
The stock.csv file looks as follows.
Year | Month | Interest_Rate | Unemployment_Rate | Stock_Index_Price |
2017 | 12 | 2.75 | 5.3 | 1464 |
2017 | 11 | 2.5 | 5.3 | 1394 |
2017 | 10 | 2.5 | 5.3 | 1357 |
2017 | 9 | 2.5 | 5.3 | 1293 |
2017 | 8 | 2.5 | 5.4 | 1256 |
2017 | 7 | 2.5 | 5.6 | 1254 |
2017 | 6 | 2.5 | 5.5 | 1234 |
2017 | 5 | 2.25 | 5.5 | 1195 |
2017 | 4 | 2.25 | 5.5 | 1159 |
2017 | 3 | 2.25 | 5.6 | 1167 |
2017 | 2 | 2 | 5.7 | 1130 |
2017 | 1 | 2 | 5.9 | 1075 |
2016 | 12 | 2 | 6 | 1047 |
2016 | 11 | 1.75 | 5.9 | 965 |
2016 | 10 | 1.75 | 5.8 | 943 |
2016 | 9 | 1.75 | 6.1 | 958 |
2016 | 8 | 1.75 | 6.2 | 971 |
2016 | 7 | 1.75 | 6.1 | 949 |
2016 | 6 | 1.75 | 6.1 | 884 |
2016 | 5 | 1.75 | 6.1 | 866 |
2016 | 4 | 1.75 | 5.9 | 876 |
2016 | 3 | 1.75 | 6.2 | 822 |
2016 | 2 | 1.75 | 6.2 | 704 |
2016 | 1 | 1.75 | 6.1 | 719 |
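To reproduce the example without the original file, the table above can be written out as stock.csv. The following is a minimal sketch using Python's standard csv module. It assumes the file is comma-separated, which is what pd.read_csv expects by default. The helper name write_stock_csv is hypothetical and not part of the original code.

```python
import csv

# Header and rows copied from the table above
HEADER = ['Year', 'Month', 'Interest_Rate', 'Unemployment_Rate', 'Stock_Index_Price']
ROWS = [
    (2017, 12, 2.75, 5.3, 1464),
    (2017, 11, 2.5, 5.3, 1394),
    (2017, 10, 2.5, 5.3, 1357),
    (2017, 9, 2.5, 5.3, 1293),
    (2017, 8, 2.5, 5.4, 1256),
    (2017, 7, 2.5, 5.6, 1254),
    (2017, 6, 2.5, 5.5, 1234),
    (2017, 5, 2.25, 5.5, 1195),
    (2017, 4, 2.25, 5.5, 1159),
    (2017, 3, 2.25, 5.6, 1167),
    (2017, 2, 2, 5.7, 1130),
    (2017, 1, 2, 5.9, 1075),
    (2016, 12, 2, 6, 1047),
    (2016, 11, 1.75, 5.9, 965),
    (2016, 10, 1.75, 5.8, 943),
    (2016, 9, 1.75, 6.1, 958),
    (2016, 8, 1.75, 6.2, 971),
    (2016, 7, 1.75, 6.1, 949),
    (2016, 6, 1.75, 6.1, 884),
    (2016, 5, 1.75, 6.1, 866),
    (2016, 4, 1.75, 5.9, 876),
    (2016, 3, 1.75, 6.2, 822),
    (2016, 2, 1.75, 6.2, 704),
    (2016, 1, 1.75, 6.1, 719),
]


def write_stock_csv(path='stock.csv'):
    """Write the table to a comma-separated file (hypothetical helper)."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(ROWS)


if __name__ == '__main__':
    # Creates stock.csv in the current working directory
    write_stock_csv()
```

Running the script once in the same folder as the regression code should make the call to pd.read_csv("stock.csv") work as written.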
The output is pasted below.
Interpretation of the Result
Simple linear regression is of the form y = w0 + w1*x. The output shows w0 (Intercept) as -99.46431881371655 and w1 (Coefficient) as 564.20389249. For the above example, the equation becomes
Stock_Index_Price = w0 + w1 * Interest_Rate
i.e., Stock_Index_Price = -99.46431881371655 + 564.20389249 * Interest_Rate
For the first record, Interest_Rate = 2.75, which gives Stock_Index_Price = 1452.09638554, exactly the predicted stock index price for the first record in the output. Likewise, the output shows the predicted Stock_Index_Price for the remaining records in the stock.csv file.
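The intercept and coefficient can also be checked by hand. sklearn's LinearRegression fits by ordinary least squares, and for a single feature the solution has a simple closed form. The following is a minimal sketch with numpy, assuming the two columns are typed in from stock.csv; the variable names w0, w1 and first_prediction are illustrative.

```python
import numpy as np

# Interest_Rate and Stock_Index_Price columns copied from stock.csv
interest_rate = np.array([
    2.75, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.25, 2.25, 2.25, 2.0, 2.0,
    2.0, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75,
])
stock_index_price = np.array([
    1464, 1394, 1357, 1293, 1256, 1254, 1234, 1195, 1159, 1167, 1130, 1075,
    1047, 965, 943, 958, 971, 949, 884, 866, 876, 822, 704, 719,
])

# Closed-form ordinary least squares for one feature:
#   w1 = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   w0 = mean_y - w1 * mean_x
x_mean = interest_rate.mean()
y_mean = stock_index_price.mean()
w1 = (np.sum((interest_rate - x_mean) * (stock_index_price - y_mean))
      / np.sum((interest_rate - x_mean) ** 2))
w0 = y_mean - w1 * x_mean

# Prediction for the first record (Interest_Rate = 2.75)
first_prediction = w0 + w1 * interest_rate[0]

print('w0 (intercept):', w0)
print('w1 (coefficient):', w1)
print('Prediction for the first record:', first_prediction)
```

The printed values should agree with the intercept, coefficient and first prediction reported above, up to floating-point rounding.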