[Python-ML] Evaluating Multiple Linear Regression Performance with scikit-learn
# -*- coding: utf-8 -*-
'''
Created on 2018-01-24
@author: Jason.F
@summary: Supervised regression learning - performance evaluation of multiple linear regression
'''
import pandas as pd
import numpy as np
import time
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
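# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split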
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score
if __name__ == "__main__":
    # time.clock() was removed in Python 3.8; perf_counter() is the replacement
    start = time.perf_counter()

    # Load the Boston housing dataset
    df = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/housing/housing.data', header=None, sep=r'\s+')
    df.columns = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
    X = df.iloc[:, :-1].values  # all 13 explanatory variables
    y = df['MEDV'].values  # target: house price (MEDV)

    # Hold out 30% of the samples as a test set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

    # Fit the multiple linear regression model and predict on both splits
    slr = LinearRegression()
    slr.fit(X_train, y_train)
    y_train_pred = slr.predict(X_train)
    y_test_pred = slr.predict(X_test)

    # Visualize the residuals against the predicted values
    plt.scatter(y_train_pred, y_train_pred - y_train, c='blue', marker='o', label='Training data')
    plt.scatter(y_test_pred, y_test_pred - y_test, c='lightgreen', marker='s', label='Test data')
    plt.xlabel('Predicted values')
    plt.ylabel('Residuals')
    plt.legend(loc='upper left')
    plt.hlines(y=0, xmin=-10, xmax=50, lw=2, colors='red')
    plt.xlim([-10, 50])
    plt.show()

    # Evaluate the mean squared error (MSE)
    print('MSE train: %.3f,test:%.3f' % (mean_squared_error(y_train, y_train_pred), mean_squared_error(y_test, y_test_pred)))

    # Evaluate the coefficient of determination (R^2), a standardized version of the MSE
    print('R^2 train: %.3f,test:%.3f' % (r2_score(y_train, y_train_pred), r2_score(y_test, y_test_pred)))

    end = time.perf_counter()
    print('finish all in %s' % str(end - start))
Result:
MSE train: 19.958,test:27.196
R^2 train: 0.765,test:0.673
finish all in 12.8996295796
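The script's comment describes R^2 as a standardized version of the MSE. One way to see this is the identity R^2 = 1 - MSE / Var(y), where Var(y) is the population variance of the true targets. The sketch below is not part of the original script: the helper name `mse_and_r2` and the toy numbers are made up for illustration, and it uses only NumPy so it runs without the dataset.

```python
import numpy as np


def mse_and_r2(y_true, y_pred):
    """Return (MSE, R^2) computed by hand; hypothetical helper for illustration."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Mean squared error: average of the squared residuals
    mse = np.mean((y_true - y_pred) ** 2)

    # Population variance of the targets (ddof=0), i.e. SST / n
    var_y = np.var(y_true)

    # R^2 = 1 - SSE / SST = 1 - MSE / Var(y)
    r2 = 1.0 - mse / var_y
    return mse, r2


# Toy data, invented for this example only
toy_mse, toy_r2 = mse_and_r2([3, 5, 7, 9], [2.5, 5.5, 7.0, 9.5])
print('MSE: %.4f, R^2: %.4f' % (toy_mse, toy_r2))
```

Under this definition, a model that always predicts the mean of y has MSE equal to Var(y) and therefore R^2 = 0, while a perfect fit gives R^2 = 1. On the same inputs, these values should agree with `mean_squared_error` and `r2_score` from `sklearn.metrics`.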