Machine Learning Algorithms: Stochastic Gradient Descent

This post has been sitting around for a long time, and the full write-up kept getting put off.

Code first; a proper tutorial will follow when I find the time.
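Briefly, for context (my one-line paraphrase while the tutorial is pending): at each step, plain SGD draws one random training sample $(x_r, y_r)$ and nudges the parameters against the gradient of that single sample's squared error,

$$\beta \leftarrow \beta - \alpha \, \nabla_{\beta}\,(\beta_0 + \beta_1 x_r - y_r)^2,$$

where $\alpha$ is the learning rate.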

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

train = pd.read_csv('data/train.csv')

beta = [1, 1]   # initial parameters [intercept, slope]
alpha = 0.2     # learning rate
tol_L = 0.1     # convergence tolerance on the change in RMSE

# Scale x into [0, 1] so a single learning rate suits both parameters
max_x = max(train['id'])
x = train['id'] / max_x
y = train['questions']

def compute_grad_SGD(beta, x, y):
    # Gradient of the squared error at a single randomly chosen sample
    grad = [0, 0]
    r = np.random.randint(0, len(x))
    grad[0] = 2. * (beta[0] + beta[1] * x[r] - y[r])
    grad[1] = 2. * x[r] * (beta[0] + beta[1] * x[r] - y[r])
    return np.array(grad)
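For reference, the derivation behind those two lines (my own working, consistent with the code): with the per-sample loss $\ell_r(\beta) = (\beta_0 + \beta_1 x_r - y_r)^2$, the partial derivatives are

$$\frac{\partial \ell_r}{\partial \beta_0} = 2\,(\beta_0 + \beta_1 x_r - y_r), \qquad \frac{\partial \ell_r}{\partial \beta_1} = 2\,x_r\,(\beta_0 + \beta_1 x_r - y_r),$$

which is exactly what compute_grad_SGD returns.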

def update_beta(beta, alpha, grad):
    # One gradient step: move beta against the gradient, scaled by alpha
    new_beta = np.array(beta) - alpha * grad
    return new_beta

def rmse(beta, x, y):
    # Root mean squared error over the full training set
    squared_err = (beta[0] + beta[1] * x - y) ** 2
    res = np.sqrt(np.mean(squared_err))
    return res
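For completeness, this is the usual root mean squared error over all $n$ training samples:

$$\mathrm{RMSE}(\beta) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\bigl(\beta_0 + \beta_1 x_i - y_i\bigr)^2}.$$

Because it is evaluated on the whole dataset, it gives a stable convergence signal even though each gradient step only looks at one sample.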

# Initial step: compute one gradient, take one step, and measure the loss change
grad = compute_grad_SGD(beta, x, y)
loss = rmse(beta, x, y)
beta = update_beta(beta, alpha, grad)
loss_new = rmse(beta, x, y)
diff = np.abs(loss_new - loss)

i = 1
while diff > tol_L:
    beta = update_beta(beta, alpha, grad)
    grad = compute_grad_SGD(beta, x, y)
    # SGD is noisy, so check convergence on the full-data RMSE only every 100 rounds
    if i % 100 == 0:
        loss = loss_new
        loss_new = rmse(beta, x, y)
        diff = np.abs(loss_new - loss)
        # print(f"Round {i} Diff RMSE: {diff}")
    i = i + 1
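Because each step samples a random index, two runs will generally stop at slightly different beta. If you want reproducible runs, you can seed NumPy's random number generator once at the top of the script (my addition, not in the original code):

np.random.seed(42)  # any fixed seed makes the sampling, and thus the result, repeatable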

print(f"Beta0: {beta[0]},Beta1: {beta[1]}")
Beta0: 833.1011122802754,Beta1: 4931.514502396765
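One thing worth noting (the plotting code below relies on it): beta was fit against the scaled feature x = id / max_x, so to draw the line in original id units the slope must be divided back by max_x. A minimal sketch of the rescaling:

# Coefficients on the original (unscaled) id axis
intercept = beta[0]
slope = beta[1] / max_x  # undo the x = id / max_x scaling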
plt.figure(figsize=(10, 7))
plt.scatter(train['id'], train['questions'], s=10, color='c', alpha=0.4, label='Data')
plt.plot(train['id'], beta[1] / max_x * train['id'] + beta[0], color='r', ls='--', label='prediction')
plt.legend(loc="best") 
plt.xlabel('id')
plt.ylabel('questions')
plt.show()

Regression plot: [image not preserved: the fitted line over the id/questions scatter]
