Machine Learning Foundations – Homework 4 – Code Part
This homework mainly exercises the closed-form solution of linear regression with a regularization term: model selection, splitting the data into a training set and a validation set, and k-fold cross-validation. Everything follows directly from the formulas; the only real question is whether the bias term participates in the regularization.
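The fold-splitting pattern used for the cross-validation part can be sketched as follows (a minimal illustration, assuming 200 training rows split into 5 folds of 40, as in the homework data; the index array stands in for the real feature matrix):

```python
import numpy as np

n, fold = 200, 40          # assumed sizes: 200 training rows, 5 folds of 40
data = np.arange(n)        # stand-in for the rows of the real feature matrix
for j in range(0, n, fold):
    val_idx = data[j:j + fold]                                # current validation fold
    train_idx = np.concatenate((data[:j], data[j + fold:]))  # the remaining folds
    print(len(train_idx), len(val_idx))
```

Each iteration holds out one contiguous block of 40 points and trains on the other 160, which is exactly the slicing the homework code below performs by hand.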
In Prof. Lin's course, the final closed-form solution is:

w_reg = (XᵀX + λI)⁻¹ Xᵀy

Under this formula, the bias b of the linear scoring function is regularized as well.
In Andrew Ng's course, and in many other places, the usual practice is that b is not regularized. Under that convention, letting w₀ = b, the formula becomes:

w_reg = (XᵀX + λI₀)⁻¹ Xᵀy, where I₀ is the identity matrix with its (0, 0) entry set to 0.
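The difference between the two conventions can be checked numerically. The sketch below uses synthetic data and hypothetical variable names (`w_full`, `w_nobias`); it is an illustration of the two closed forms above, not part of the homework code:

```python
import numpy as np

rng = np.random.default_rng(0)
# First column of X is the constant bias feature, matching w0 = b.
X = np.column_stack([np.ones(50), rng.normal(size=(50, 2))])
y = X @ np.array([3.0, 1.5, -2.0]) + rng.normal(scale=0.1, size=50)
lam = 10.0

# Lin's formula: the bias weight is regularized along with the rest.
w_full = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Ng's convention: zero out the regularization on the bias entry.
I0 = np.eye(3)
I0[0, 0] = 0.0
w_nobias = np.linalg.solve(X.T @ X + lam * I0, X.T @ y)

print(w_full)
print(w_nobias)
```

The two solutions differ only in how strongly the bias weight is shrunk toward zero; the second leaves it unpenalized.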
I personally lean toward the latter, but to stay consistent with Prof. Lin's answers, the program uses the first formula.
The full code is below; it is a direct application of the formulas:
```python
# -*- coding:utf-8 -*-
# Author: Evan Mi
import numpy as np


def load_data(file_name):
    x = []
    y = []
    with open(file_name, 'r+') as f:
        for line in f:
            line = line.rstrip("\n")
            temp = line.split(" ")
            temp.insert(0, '1')  # prepend the constant bias feature
            x_temp = [float(val) for val in temp[:-1]]
            y_tem = [int(val) for val in temp[-1:]][0]
            x.append(x_temp)
            y.append(y_tem)
    nx = np.array(x)
    ny = np.array(y)
    return nx, ny


def sign_zero_as_neg(x):
    """
    Modified version of np.sign: when the input is 0 it returns -1
    instead of 0, i.e. points exactly on the boundary are treated as
    negative examples.
    :param x:
    :return:
    """
    result = np.sign(x)
    result[result == 0] = -1
    return result


def get_w_reg(x, y, lambdas):
    w_reg = np.dot(np.linalg.pinv(np.dot(np.transpose(x), x)
                                  + lambdas * np.eye(np.size(x, axis=1))),
                   np.dot(np.transpose(x), y))
    return w_reg.flatten()


def e_counter(x, y, w):
    local_result = sign_zero_as_neg(np.dot(x, w))
    e = np.where(local_result == y, 0, 1)
    return e.sum() / np.size(e)


def exe_13():
    print('#13:')
    train_x, train_y = load_data("data/train.txt")
    test_x, test_y = load_data("data/test.txt")
    w_reg_one = get_w_reg(train_x, train_y, 10)
    e_in = e_counter(train_x, train_y, w_reg_one)
    e_out = e_counter(test_x, test_y, w_reg_one)
    print("E_IN:", e_in)
    print("E_OUT:", e_out)


def exe_14_15():
    print('#14,#15')
    train_x, train_y = load_data("data/train.txt")
    test_x, test_y = load_data("data/test.txt")
    for i in range(-10, 3):
        lambda_tem = 10 ** i
        w_reg_tem = get_w_reg(train_x, train_y, lambda_tem)
        e_in_tem = e_counter(train_x, train_y, w_reg_tem)
        e_out_tem = e_counter(test_x, test_y, w_reg_tem)
        print("log_10(%d)" % i, e_in_tem, e_out_tem)


def exe_16_17():
    print('#16,17')
    x_tem, y_tem = load_data("data/train.txt")
    test_x, test_y = load_data("data/test.txt")
    train_x = x_tem[:120, :]
    val_x = x_tem[120:, :]
    train_y = y_tem[:120]
    val_y = y_tem[120:]
    for i in range(-10, 3):
        lambda_tem = 10 ** i
        w_reg_tem = get_w_reg(train_x, train_y, lambda_tem)
        e_in_tem = e_counter(train_x, train_y, w_reg_tem)
        e_val_tem = e_counter(val_x, val_y, w_reg_tem)
        e_out_tem = e_counter(test_x, test_y, w_reg_tem)
        print("log_10(%d)" % i, e_in_tem, e_val_tem, e_out_tem)


def exe_18():
    print('#18:')
    train_x, train_y = load_data("data/train.txt")
    test_x, test_y = load_data("data/test.txt")
    # lambda = log_10(0), i.e. lambda = 1
    w_reg_one = get_w_reg(train_x, train_y, 1)
    e_in = e_counter(train_x, train_y, w_reg_one)
    e_out = e_counter(test_x, test_y, w_reg_one)
    print("E_IN:", e_in)
    print("E_OUT:", e_out)


def exe_19():
    print('#19')
    train_x, train_y = load_data("data/train.txt")
    for i in range(-10, 3):
        lambda_tem = 10 ** i
        e_cross = []
        for j in range(0, 200, 40):
            x_val = train_x[j:j+40, :]
            y_val = train_y[j:j+40]
            x_remain_left = train_x[0:j, :]
            x_remain_right = train_x[j+40:, :]
            y_remain_left = train_y[0:j]
            y_remain_right = train_y[j + 40:]
            if np.size(x_remain_left, axis=0) == 0:
                x_train = x_remain_right
                y_train = y_remain_right
            elif np.size(x_remain_right, axis=0) == 0:
                x_train = x_remain_left
                y_train = y_remain_left
            else:
                x_train = np.concatenate((train_x[0:j, :], train_x[j + 40:, :]), axis=0)
                y_train = np.concatenate((train_y[0:j], train_y[j + 40:]), axis=0)
            w_reg_tem = get_w_reg(x_train, y_train, lambda_tem)
            e_cross.append(e_counter(x_val, y_val, w_reg_tem))
        print("lambda:", "log_10(%d)" % i, "E_CV", np.array(e_cross).mean())


def exe_20():
    print('#20:')
    train_x, train_y = load_data("data/train.txt")
    test_x, test_y = load_data("data/test.txt")
    # lambda = log_10(-8), i.e. lambda = 10^-8
    w_reg_one = get_w_reg(train_x, train_y, 10 ** -8)
    e_in = e_counter(train_x, train_y, w_reg_one)
    e_out = e_counter(test_x, test_y, w_reg_one)
    print("E_IN:", e_in)
    print("E_OUT:", e_out)


if __name__ == '__main__':
    exe_13()
    exe_14_15()
    exe_16_17()
    exe_18()
    exe_19()
    exe_20()
```

The full code and the data it uses are available at: 機器基石作業四