Linear Regression in Depth Tutorial with Soumil Shah

In this lesson you will learn in depth about the LinearRegression model from scikit-learn (sklearn), using the popular USA Housing dataset.

Documentation

class sklearn.linear_model.LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=None)

Parameters:

fit_intercept : boolean, optional, default True

Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. data is expected to be already centered).
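To make the effect of fit_intercept concrete, here is a minimal sketch on synthetic data (the data and variable names are assumptions for illustration, not part of the housing example). It fits the same points with and without an intercept.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data (an assumption for illustration): y = 3*x + 10
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 3.0 * X.ravel() + 10.0

# With an intercept the model can recover both the slope and the offset.
with_intercept = LinearRegression(fit_intercept=True).fit(X, y)

# Without an intercept the line is forced through the origin, so the slope
# has to absorb the offset and ends up larger than 3.
without_intercept = LinearRegression(fit_intercept=False).fit(X, y)

print("fit_intercept=True :", with_intercept.coef_, with_intercept.intercept_)
print("fit_intercept=False:", without_intercept.coef_, without_intercept.intercept_)
```

Because this data is not centered, the model fitted with fit_intercept=False reports an intercept of 0.0 and a distorted slope, which is why that setting is only appropriate for data that is already centered.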

normalize : boolean, optional, default False

This parameter is ignored when fit_intercept is set to False. If True, the regressors X will be normalized before regression by subtracting the mean and dividing by the l2-norm. If you wish to standardize, please use sklearn.preprocessing.StandardScaler before calling fit on an estimator with normalize=False.
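The difference between "normalize" and "standardize" is easy to miss, so here is a small NumPy sketch of both operations on a made-up feature matrix (the matrix is hypothetical). Note that the normalize argument was deprecated in scikit-learn 1.0 and removed in 1.2, so this sketch reproduces the arithmetic by hand rather than calling the parameter.

```python
import numpy as np

# Hypothetical feature matrix: 4 samples, 2 features on very different scales.
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0],
              [4.0, 700.0]])

# Both approaches start by subtracting the column means.
centered = X - X.mean(axis=0)

# normalize=True style: divide each centered column by its l2-norm,
# so every column ends up with unit length.
l2_scaled = centered / np.linalg.norm(centered, axis=0)

# Standardization (what StandardScaler computes): divide by the
# population standard deviation, so every column has unit variance.
standardized = centered / X.std(axis=0)

n_samples = X.shape[0]
print(l2_scaled)
print(standardized)
```

The two results differ only by a constant factor of sqrt(n_samples), since the l2-norm of a centered column equals its standard deviation times sqrt(n_samples).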

copy_X : boolean, optional, default True

If True, X will be copied; else, it may be overwritten.

n_jobs : int or None, optional (default=None)

The number of jobs to use for the computation. This will only provide a speedup for n_targets > 1 and sufficiently large problems. None means 1 unless in a joblib.parallel_backend context. -1 means using all processors. See the Glossary for more details.

Attributes

coef_ : array

Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.
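The two shapes described above can be checked directly. This is a small sketch on random data (the data is synthetic and only the shapes matter here).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(0)
X = rng.rand(20, 3)        # 20 samples, 3 features (synthetic)
y_single = rng.rand(20)    # one target  -> y is 1D
y_multi = rng.rand(20, 2)  # two targets -> y is 2D

single = LinearRegression().fit(X, y_single)
multi = LinearRegression().fit(X, y_multi)

print(single.coef_.shape)  # 1D: one coefficient per feature
print(multi.coef_.shape)   # 2D: one row of coefficients per target
```

In the housing example later in this lesson there is a single target (Price) and five features, so coef_ is a 1D array of length 5.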

intercept_ : array

Independent term in the linear model.
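Together, coef_ and intercept_ fully describe the fitted model: a prediction is the dot product of the features with coef_, plus intercept_. The sketch below checks this against predict() on synthetic data (the data-generating formula is an assumption for illustration).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.RandomState(42)
X = rng.rand(30, 2)
# Synthetic target: known slopes 4 and -2, offset 7, plus a little noise.
y = 4.0 * X[:, 0] - 2.0 * X[:, 1] + 7.0 + 0.01 * rng.randn(30)

model = LinearRegression().fit(X, y)

# Rebuild the predictions by hand from the fitted attributes.
manual_pred = X @ model.coef_ + model.intercept_

print(np.allclose(manual_pred, model.predict(X)))
```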

Step 1:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

df = pd.read_csv('USA_Housing.csv')
df.head(2)  # preview the first two rows

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

class NN():
    """Wraps the data split, model creation, training and testing steps."""

    def __init__(self):
        self.X_Train, self.X_Test, self.Y_Train, self.Y_Test = self.preprocess()
        self.model = self.create_model()

    def preprocess(self):
        """
        return : X_Train, X_Test, Y_Train, Y_Test (60/40 train/test split)
        """
        df = pd.read_csv('USA_Housing.csv')
        X_Data = df[['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
                     'Avg. Area Number of Bedrooms', 'Area Population']]
        Y_Data = df['Price']
        X_Train, X_Test, Y_Train, Y_Test = train_test_split(X_Data, Y_Data, test_size=0.4, random_state=101)
        return X_Train, X_Test, Y_Train, Y_Test

    def create_model(self):
        """
        return : Model Object
        """
        # Note: the normalize argument was deprecated in scikit-learn 1.0 and
        # removed in 1.2; on newer versions, omit it.
        model = LinearRegression(fit_intercept=True, normalize=True)
        return model

    def train(self):
        """
        return : None. Trains the model on the training split.
        """
        self.model.fit(self.X_Train, self.Y_Train)

    def test(self):
        """
        return : pred [array], coef_ [array], intercept_ [float]
        """
        pred = self.model.predict(self.X_Test)
        return pred, self.model.coef_, self.model.intercept_

neural = NN()
neural.train()
pred, coef_, intercept_ = neural.test()

# Feature names, in the same order as the entries of coef_
columns = ['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
           'Avg. Area Number of Bedrooms', 'Area Population']

df1 = pd.DataFrame(data=coef_, columns=["Coef"])
df1

Output of df.head(2):

	Avg. Area Income	Avg. Area House Age	Avg. Area Number of Rooms	Avg. Area Number of Bedrooms	Area Population	Price	Address
0	79545.458574	5.682861	7.009188	4.09	23086.800503	1.059034e+06	208 Michael Ferry Apt. 674\nLaurabury, NE 3701...
1	79248.642455	6.002900	6.730821	3.09	40173.072174	1.505891e+06	188 Johnson Views Suite 079\nLake Kathleen, CA...

Output of df1 (one coefficient per feature, in the order of the columns list):

	Coef
0	21.528276
1	164883.282027
2	122368.678027
3	2233.801864
4	15.150420
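The coefficient table is easier to read when each row carries its feature name. The sketch below reuses the columns list and copies the coefficient values from the output above (rounded as printed), so it runs without the CSV file. The name coef_table and the 1,000-unit example are illustrative assumptions.

```python
import pandas as pd

columns = ['Avg. Area Income', 'Avg. Area House Age', 'Avg. Area Number of Rooms',
           'Avg. Area Number of Bedrooms', 'Area Population']

# Values copied from the printed output above, in the same order as columns.
coef_ = [21.528276, 164883.282027, 122368.678027, 2233.801864, 15.150420]

# Use the feature names as the index instead of 0..4.
coef_table = pd.DataFrame({"Coef": coef_}, index=columns)
print(coef_table)

# Reading a coefficient: with the other features held fixed, a 1,000-unit
# increase in 'Avg. Area Income' changes the predicted price by coef * 1000.
income_effect = coef_table.loc["Avg. Area Income", "Coef"] * 1000
print(income_effect)
```

Each coefficient is expressed in price units per one unit of its feature, so values for features on different scales (income versus number of rooms) are not directly comparable in size.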

Pythonist

Saturday, July 27, 2019