TensorFlow Learning Notes (3): Logistic Regression


Preface

This post trains a logistic regression model with TensorFlow and compares it against scikit-learn. The dataset comes from the exercises of Andrew Ng's online open course.

Code

#!/usr/bin/env python
# -*- coding=utf-8 -*-
# @author: 陈水平
# @date: 2017-01-04
# @description: compare the logistic regression of tensorflow with sklearn based on the exercise of the deep learning course of Andrew Ng.
# @ref: http://openclassroom.stanford.edu/MainFolder/DocumentPage.php?course=DeepLearning&doc=exercises/ex4/ex4.html

import tensorflow as tf
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing

# Read x and y
x_data = np.loadtxt("ex4x.dat").astype(np.float32)
y_data = np.loadtxt("ex4y.dat").astype(np.float32)

scaler = preprocessing.StandardScaler().fit(x_data)
x_data_standard = scaler.transform(x_data)

# We evaluate the x and y by sklearn to get a sense of the coefficients.
reg = LogisticRegression(C=999999999, solver="newton-cg")  # Set C as a large positive number to minimize the regularization effect
reg.fit(x_data, y_data)
print "Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_)

# Now we use tensorflow to get similar results.
W = tf.Variable(tf.zeros([2, 1]))
b = tf.Variable(tf.zeros([1, 1]))
y = 1 / (1 + tf.exp(-(tf.matmul(x_data_standard, W) + b)))  # sigmoid of the linear score W*x + b
loss = tf.reduce_mean(- y_data.reshape(-1, 1) * tf.log(y) - (1 - y_data.reshape(-1, 1)) * tf.log(1 - y))

optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for step in range(100):
    sess.run(train)
    if step % 10 == 0:
        print step, sess.run(W).flatten(), sess.run(b).flatten()

print "Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten())
print "Coefficients of tensorflow (raw input): K=%s, b=%s" % (sess.run(W).flatten() / scaler.scale_, sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W)))

# Problem solved and we are happy. But...
# I'd like to implement the logistic regression from a multi-class viewpoint instead of binary.
# In the machine learning domain, it is called softmax regression.
# In the economics and statistics domain, it is called the multinomial logit (MNL) model, proposed by Daniel McFadden, who shared the 2000 Nobel Memorial Prize in Economic Sciences.
print "------------------------------------------------"
print "We solve this binary classification problem again from the viewpoint of multinomial classification"
print "------------------------------------------------"

# As a tradition, sklearn first
reg = LogisticRegression(C=9999999999, solver="newton-cg", multi_class="multinomial")
reg.fit(x_data, y_data)
print "Coefficients of sklearn: K=%s, b=%f" % (reg.coef_, reg.intercept_)
print "A little bit difference at first glance. What about multiply them with 2?"

# Then try tensorflow
W = tf.Variable(tf.zeros([2, 2]))  # first 2 is feature number, second 2 is class number
b = tf.Variable(tf.zeros([1, 2]))
V = tf.matmul(x_data_standard, W) + b
y = tf.nn.softmax(V)  # tensorflow provides a utility function to calculate the probability of observer n choosing alternative i; you can replace it with `y = tf.exp(V) / tf.reduce_sum(tf.exp(V), keep_dims=True, reduction_indices=[1])`

# Encode the y label in one-hot manner
lb = preprocessing.LabelBinarizer()
lb.fit(y_data)
y_data_trans = lb.transform(y_data)
y_data_trans = np.concatenate((1 - y_data_trans, y_data_trans), axis=1)  # Only necessary for binary class

loss = tf.reduce_mean(-tf.reduce_sum(y_data_trans * tf.log(y), reduction_indices=[1]))
optimizer = tf.train.GradientDescentOptimizer(1.3)
train = optimizer.minimize(loss)

init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for step in range(100):
    sess.run(train)
    if step % 10 == 0:
        print step, sess.run(W).flatten(), sess.run(b).flatten()

print "Coefficients of tensorflow (input should be standardized): K=%s, b=%s" % (sess.run(W).flatten(), sess.run(b).flatten())
print "Coefficients of tensorflow (raw input): K=%s, b=%s" % ((sess.run(W) / scaler.scale_).flatten(), sess.run(b).flatten() - np.dot(scaler.mean_ / scaler.scale_, sess.run(W)))

The output is as follows:

Coefficients of sklearn: K=[[ 0.14834077  0.15890845]], b=-16.378743
0 [ 0.33699557  0.34786162] [ -4.84287721e-09]
10 [ 1.15830743  1.22841871] [ 0.02142336]
20 [ 1.3378191   1.42655993] [ 0.03946959]
30 [ 1.40735555  1.50197577] [ 0.04853692]
40 [ 1.43754184  1.53418231] [ 0.05283691]
50 [ 1.45117068  1.54856908] [ 0.05484771]
60 [ 1.45742035  1.55512536] [ 0.05578374]
70 [ 1.46030474  1.55814099] [ 0.05621871]
80 [ 1.46163988  1.55953443] [ 0.05642065]
90 [ 1.46225858  1.56017959] [ 0.0565144]
Coefficients of tensorflow (input should be standardized): K=[ 1.46252561  1.56045783], b=[ 0.05655487]
Coefficients of tensorflow (raw input): K=[ 0.14831361  0.15888004], b=[-16.26265144]
------------------------------------------------
We solve this binary classification problem again from the viewpoint of multinomial classification
------------------------------------------------
Coefficients of sklearn: K=[[ 0.07417039  0.07945423]], b=-8.189372
A little bit difference at first glance. What about multiply them with 2?
0 [-0.33699557  0.33699557 -0.34786162  0.34786162] [  6.05359674e-09  -6.05359674e-09]
10 [-0.68416572  0.68416572 -0.72988117  0.72988123] [ 0.02157043 -0.02157041]
20 [-0.72234094  0.72234106 -0.77087188  0.77087194] [ 0.02693938 -0.02693932]
30 [-0.72958517  0.72958535 -0.7784785   0.77847856] [ 0.02802362 -0.02802352]
40 [-0.73103166  0.73103184 -0.77998811  0.77998811] [ 0.02824244 -0.02824241]
50 [-0.73132294  0.73132324 -0.78029168  0.78029174] [ 0.02828659 -0.02828649]
60 [-0.73138171  0.73138207 -0.78035289  0.78035301] [ 0.02829553 -0.02829544]
70 [-0.73139352  0.73139393 -0.78036523  0.78036535] [ 0.02829732 -0.0282972 ]
80 [-0.73139596  0.73139632 -0.78036767  0.78036791] [ 0.02829764 -0.02829755]
90 [-0.73139644  0.73139679 -0.78036815  0.78036839] [ 0.02829781 -0.02829765]
Coefficients of tensorflow (input should be standardized): K=[-0.7313965   0.73139679 -0.78036827  0.78036839], b=[ 0.02829777 -0.02829769]
Coefficients of tensorflow (raw input): K=[-0.07417037  0.07446811 -0.07913655  0.07945422], b=[ 8.1893692  -8.18937111]

Thoughts

  • For logistic regression, the loss function is a bit more involved than for linear regression. First, the sigmoid function maps the linear score to a probability between 0 and 1. We then write down the probability (likelihood) of each sample; the probability of observing all samples is the product of the per-sample probabilities. To make differentiation easier, we take the logarithm of this joint probability: monotonicity is preserved, and the product becomes a sum (the derivative of a sum is much simpler than that of a product). Maximum likelihood estimation maximizes this log-likelihood over all samples; by machine learning convention the objective is called a loss, so the loss is defined as the negative log-likelihood, turning the problem into a minimization. A small sketch of this loss is given after this list.

  • When we speak of logistic regression, we usually mean a binary classification problem; however, the same idea extends easily to multi-class problems. In machine learning this is generally called softmax regression. Since the author's background is in statistics and econometrics, he usually refers to it as the MNL (multinomial logit) model. A softmax sketch is also given below.
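
As a minimal sketch of the loss described in the first point (plain NumPy; the function and variable names are my own and not from the original code), this mirrors the tf.reduce_mean expression used in the binary-classification part of the script:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def negative_log_likelihood(w, b, x, y):
    # P(y_i = 1 | x_i): the linear score passed through the sigmoid
    p = sigmoid(x.dot(w) + b)
    # The likelihood of the whole sample is the product of per-sample probabilities;
    # taking the log turns the product into a sum, and negating it turns
    # maximizing the likelihood into minimizing a loss.
    return -np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))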
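
And a minimal sketch of the softmax probabilities behind the multinomial formulation in the second point (again plain NumPy with illustrative names, not the script's own functions):

import numpy as np

def softmax(v):
    # Subtract the row-wise maximum before exponentiating for numerical stability.
    e = np.exp(v - v.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def predict_proba(x, W, b):
    # One column of W and b per class. Only the differences between class
    # coefficients are identified, which is why the binary case produces the
    # symmetric +/- pairs seen in the output above; their difference recovers
    # the single coefficient vector of plain binary logistic regression.
    return softmax(x.dot(W) + b)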
