Running TensorFlow

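This post follows the book Hands-On Machine Learning with Scikit-Learn and TensorFlow. The code uses the TensorFlow 1.x graph and session API (tf.Session, tf.placeholder).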

Installation

pip3 install --upgrade tensorflow

Creating and running a computation graph

import tensorflow as tf

# Build the computation graph
x = tf.Variable(3, name='x')
y = tf.Variable(4, name='y')

f = x * x * y + y + 2

# Create a session and evaluate f
with tf.Session() as sess:
    x.initializer.run()
    y.initializer.run()
    result = f.eval()

# Alternatively, initialize all variables in one step
init = tf.global_variables_initializer()
with tf.Session() as sess:
    init.run()
    result = f.eval()

Lifecycle of node values

When you evaluate a node, TensorFlow automatically determines which nodes it depends on and evaluates those first.

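w = tf.constant(3)
x = w + 2
y = x + 5
z = x * 3

with tf.Session() as sess:
    print(y.eval())  # w and x are evaluated here...
    print(z.eval())  # ...and again here, so they are computed twice

with tf.Session() as sess:
    y_val, z_val = sess.run([y, z])  # w and x are computed only once
    print(y_val)
    print(z_val)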

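All node values are dropped between runs of the graph, except for variable values, which are maintained by the session. A variable's lifetime starts when its initializer is run and ends when the session is closed.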
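The difference between the two sessions can be illustrated without TensorFlow. The following is a minimal pure-Python sketch, not TensorFlow's actual implementation: the toy graph (w = 3, x = w + 2, y = x + 5, z = x * 3), the `evaluate` helper and the `calls` counter are hypothetical names introduced here for illustration.

```python
# Toy dependency graph: each node maps to (function, names of its inputs).
graph = {
    "w": (lambda: 3, []),
    "x": (lambda w: w + 2, ["w"]),
    "y": (lambda x: x + 5, ["x"]),
    "z": (lambda x: x * 3, ["x"]),
}
calls = {name: 0 for name in graph}  # how many times each node was computed


def evaluate(name, cache):
    """Evaluate a node, computing its dependencies first; reuse values in cache."""
    if name in cache:
        return cache[name]
    func, inputs = graph[name]
    args = [evaluate(dep, cache) for dep in inputs]
    calls[name] += 1
    cache[name] = func(*args)
    return cache[name]


# Two separate runs: node values are discarded between runs,
# so w and x are computed twice.
y_val = evaluate("y", {})
z_val = evaluate("z", {})
separate_calls = dict(calls)

# One run that fetches both y and z: w and x are computed once.
for name in calls:
    calls[name] = 0
shared = {}
y_val2, z_val2 = evaluate("y", shared), evaluate("z", shared)
joint_calls = dict(calls)

print(y_val, z_val)          # 10 15
print(separate_calls["x"])   # 2
print(joint_calls["x"])      # 1
```

The empty dictionary passed to each separate call plays the role of a fresh graph run, while the shared dictionary corresponds to fetching both nodes in a single sess.run call.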

Linear regression with TensorFlow

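Operations in TensorFlow (ops for short) can take any number of inputs and produce any number of outputs. Constants and variables (called source ops) take no inputs. Inputs and outputs are multidimensional arrays called tensors (hence the name TensorFlow).
Using the California housing data as an example, the following code fits a linear regression model: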

import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing(data_home='.')  # download and cache the data in the current directory
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
# reshape(-1, 1): -1 means "unspecified", so that dimension is inferred
# from the array length and the remaining dimensions
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    theta_val = theta.eval()
    print(theta_val)
# Output:
[[-3.7171074e+01]
 [ 4.3633682e-01]
 [ 9.3871783e-03]
 [-1.0717344e-01]
 [ 6.4540231e-01]
 [-4.1238391e-06]
 [-3.7809242e-03]
 [-4.2373490e-01]
 [-4.3720812e-01]]
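The expression assigned to theta in the code above is the normal equation, the closed-form least-squares solution. Written in the notation of the code, where X is the data matrix including the bias column and y is the column vector of targets:

```latex
\hat{\theta} = (X^{\top} X)^{-1} X^{\top} y
```

Each factor corresponds to one call in the code: tf.transpose gives $X^{\top}$, tf.matrix_inverse gives the inverse of $X^{\top} X$, and the nested tf.matmul calls form the products.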

Gradient descent

When using gradient descent, remember to normalize the input feature vectors first; otherwise training can be much slower.
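As a rough illustration of what this scaling step does, here is a minimal pure-Python sketch of per-feature standardization (subtract the mean, divide by the standard deviation). The helper name `standardize_columns` and the sample data are hypothetical; in practice a library routine such as scikit-learn's StandardScaler would normally be used.

```python
import math


def standardize_columns(rows):
    """Return a copy of rows where each column has mean 0 and (population) std 1."""
    n_rows = len(rows)
    n_cols = len(rows[0])
    means = [sum(row[j] for row in rows) / n_rows for j in range(n_cols)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in rows) / n_rows)
        for j in range(n_cols)
    ]
    # A constant column has std 0; keep its scale at 1 to avoid dividing by zero.
    stds = [s if s > 0 else 1.0 for s in stds]
    return [[(row[j] - means[j]) / stds[j] for j in range(n_cols)] for row in rows]


# Two features on very different scales (made-up numbers).
data = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 5000.0]]
scaled = standardize_columns(data)
for row in scaled:
    print(row)
```

Note that a constant column, such as a bias column of ones, becomes all zeros under this transformation, so a bias column should be added after scaling rather than before.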

Computing the gradients manually

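from sklearn.preprocessing import StandardScaler

n_epochs = 1000
learning_rate = 0.01

# Scale the features first, then add the bias column. Scaling the bias column
# itself would turn it into all zeros, leaving the model without an intercept.
scaled_housing_data = StandardScaler().fit_transform(housing.data)
scaled_housing_data_plus_bias = np.c_[np.ones((m, 1)), scaled_housing_data]
X = tf.constant(scaled_housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name='y')
theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0), name='theta')
y_pred = tf.matmul(X, theta, name="predictions")
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name='mse')
gradients = 2 / m * tf.matmul(tf.transpose(X), error)
training_op = tf.assign(theta, theta - learning_rate * gradients)

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(n_epochs):
        if epoch % 100 == 0:
            print('Epoch', epoch, 'MSE = ', mse.eval())
        sess.run(training_op)
    best_theta = theta.eval()
    print(best_theta)
# Output of the original run, in which the bias column was also passed through
# StandardScaler; with the bias effectively zero, the MSE levels off near 4.8.
# With the scaling above, the MSE should settle at a much lower value.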
Epoch 0 MSE =  6.2640386
Epoch 100 MSE =  4.952924
Epoch 200 MSE =  4.905432
Epoch 300 MSE =  4.8819094
Epoch 400 MSE =  4.864438
Epoch 500 MSE =  4.8511105
Epoch 600 MSE =  4.8408785
Epoch 700 MSE =  4.832979
Epoch 800 MSE =  4.8268504
Epoch 900 MSE =  4.822071
[[-0.5391607 ]
 [ 0.91223806]
 [ 0.15870295]
 [-0.37587142]
 [ 0.37682113]
 [ 0.00895804]
 [-0.04446738]
 [-0.53106695]
 [-0.50912493]]
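The gradients line in the code above computes (2/m) Xᵀ(Xθ − y), and the training op subtracts learning_rate times that vector from theta. The same update rule can be sketched in pure Python on a tiny made-up dataset. The function name `batch_gradient_descent`, the data and the hyperparameters are illustrative assumptions, and theta starts at zero here rather than at random values.

```python
def batch_gradient_descent(X, y, learning_rate=0.1, n_epochs=1000):
    """Minimize the MSE of X @ theta against y with batch gradient descent."""
    m, n = len(X), len(X[0])
    theta = [0.0] * n
    for _ in range(n_epochs):
        # error_i = prediction_i - target_i
        errors = [sum(X[i][j] * theta[j] for j in range(n)) - y[i] for i in range(m)]
        # gradient_j = 2/m * sum_i X[i][j] * error_i
        gradients = [2.0 / m * sum(X[i][j] * errors[i] for i in range(m)) for j in range(n)]
        theta = [theta[j] - learning_rate * gradients[j] for j in range(n)]
    return theta


# Made-up data generated by y = 1 + 2 * x; the first column is the bias term.
X_toy = [[1.0, -1.0], [1.0, 0.0], [1.0, 1.0]]
y_toy = [-1.0, 1.0, 3.0]
theta_toy = batch_gradient_descent(X_toy, y_toy)
print(theta_toy)  # close to [1.0, 2.0]
```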

Using automatic differentiation

gradients = tf.gradients(mse, [theta])[0]
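tf.gradients relies on reverse-mode automatic differentiation. To give a feel for the idea, here is a minimal scalar sketch in pure Python. It is not how TensorFlow is implemented; the `Node` class and `backward` function are hypothetical names, and only addition and multiplication are supported. It differentiates the function f = x * x * y + y + 2 from the first example.

```python
class Node:
    """A scalar value in a computation graph that remembers how it was produced."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents  # pairs of (parent node, local derivative)
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value * other.value, ((self, other.value), (other, self.value)))


def backward(output):
    """Accumulate d(output)/d(node) into node.grad for every node in the graph."""
    # Order the nodes so that each one is processed after all nodes that use it.
    order, seen = [], set()

    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)

    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad  # chain rule


x_node, y_node = Node(3.0), Node(4.0)
f_node = x_node * x_node * y_node + y_node + 2
backward(f_node)
print(f_node.value)  # 42.0
print(x_node.grad)   # 24.0  (df/dx = 2xy)
print(y_node.grad)   # 10.0  (df/dy = x^2 + 1)
```

A single backward pass yields the derivative with respect to every input, which is why this mode suits a scalar loss with many parameters.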

Using an optimizer

optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)

# Other optimizers can be swapped in, for example a momentum optimizer:
optimizer = tf.train.MomentumOptimizer(learning_rate=learning_rate, momentum=0.9)
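For intuition about what a momentum optimizer does differently from plain gradient descent, the following pure-Python sketch applies the basic momentum update (accumulate gradients, then step along the accumulator) to a one-dimensional quadratic. The function `minimize_with_momentum`, the objective and the step count are assumptions chosen for illustration, not part of the original example.

```python
def minimize_with_momentum(grad_fn, start, learning_rate=0.01, momentum=0.9, n_steps=500):
    """Basic momentum update: accumulate past gradients, then step along the accumulator."""
    theta, accumulator = start, 0.0
    for _ in range(n_steps):
        accumulator = momentum * accumulator + grad_fn(theta)
        theta = theta - learning_rate * accumulator
    return theta


# Hypothetical objective f(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
theta_min = minimize_with_momentum(lambda t: 2.0 * (t - 3.0), start=0.0)
print(theta_min)  # close to 3.0
```

With momentum set to 0 the accumulator equals the current gradient and the update reduces to ordinary gradient descent.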

Feeding data to the training algorithm

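For mini-batch gradient descent, X and y must be replaced with the next mini-batch at every iteration. The simplest way to do this is with placeholder nodes.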

A = tf.placeholder(tf.float32, shape=(None, 3))  # None means any size
B = A + 5

with tf.Session() as sess:
    B_val_1 = B.eval(feed_dict={A: [[1, 2, 3]]})  # a value must be fed for A
    B_val_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})

print(B_val_1)
print(B_val_2)
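To use placeholders for mini-batch training, something has to produce the batches. Below is one possible sketch in pure Python; the generator name `iterate_minibatches`, the `seed` parameter and the sample data are hypothetical and not part of the original post.

```python
import random


def iterate_minibatches(X, y, batch_size, seed=None):
    """Yield (X_batch, y_batch) pairs that together cover the data once, in shuffled order."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        yield [X[i] for i in batch], [y[i] for i in batch]


# Made-up data: 10 instances with 3 features each.
X_data = [[float(i), float(i) + 0.5, float(i) * 2] for i in range(10)]
y_data = [float(i) for i in range(10)]

batches = list(iterate_minibatches(X_data, y_data, batch_size=4, seed=42))
print([len(X_batch) for X_batch, _ in batches])  # [4, 4, 2]
```

In a training loop, each pair would then be passed as feed_dict={X: X_batch, y: y_batch}, assuming X and y have been defined as placeholders with a None first dimension.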

Saving and restoring models

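saver = tf.train.Saver()
# To save only theta, under the name "weights", use this instead:
# saver = tf.train.Saver({"weights": theta})

with tf.Session() as sess:
    sess.run(init)
    # [...] train the model
    saver.save(sess, 'model.ckpt')     # save

with tf.Session() as sess:
    saver.restore(sess, 'model.ckpt')  # restore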

Visualizing the graph and training curves with TensorBoard

from datetime import datetime

now = datetime.utcnow().strftime("%Y%m%d%H%M%S")
root_logdir = "tf_logs"
logdir = "{}/run-{}/".format(root_logdir, now)

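mse_summary = tf.summary.scalar('MSE', mse)  # node that evaluates the MSE and writes it as a summary
file_writer = tf.summary.FileWriter(logdir, tf.get_default_graph())  # writes summaries to log files in logdir

# Inside the session, during training (X and y are placeholders here):
summary_str = mse_summary.eval(feed_dict={X: X_batch, y: y_batch})
file_writer.add_summary(summary_str, epoch)

# After training:
file_writer.close()

# To view the results, run in a shell:
tensorboard --logdir tf_logs/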

Viewing the computation graph:

Name scopes

with tf.name_scope("loss") as scope:
    error = y_pred - y
    mse = tf.reduce_mean(tf.square(error), name="mse")
print(error.op.name)  # loss/sub
print(mse.op.name)    # loss/mse

Modularity

Suppose we define a rectified linear unit (ReLU):
$h_{w, b}(X) = \max (X \cdot w + b, 0)$

def relu(X):
    with tf.name_scope('relu'):
        w_shape = (int(X.get_shape()[1]), 1)
        w = tf.Variable(tf.random_normal(w_shape), name='weights')
        b = tf.Variable(0.0, name='bias')
        z = tf.add(tf.matmul(X, w), b, name='z')
        return tf.maximum(z, 0, name='relu')


n_features = 3
X = tf.placeholder(tf.float32, shape=(None, n_features), name='X')
relus = [relu(X) for i in range(5)]
output = tf.add_n(relus, name='output')  # add_n creates an op that sums a list of tensors
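To make the ReLU formula concrete, here is a small pure-Python version that evaluates max(X · w + b, 0) row by row. The function name `relu_unit` and the numbers are made up for illustration; unlike the TensorFlow code, the weights here are fixed rather than randomly initialized.

```python
def relu_unit(X, w, b):
    """Compute max(X . w + b, 0) for each row of X (w is a weight vector, b a scalar bias)."""
    return [max(sum(x_j * w_j for x_j, w_j in zip(row, w)) + b, 0.0) for row in X]


X_rows = [[1.0, 2.0, 3.0], [-1.0, -2.0, -3.0]]
weights = [0.5, -1.0, 2.0]
outputs = relu_unit(X_rows, weights, b=0.5)
print(outputs)  # [5.0, 0.0]
```

The first row gives 0.5 - 2 + 6 + 0.5 = 5, which passes through unchanged; the second gives -4, which is clipped to 0.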

Sharing variables

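The idea is to use get_variable() to create the shared variable if it does not exist yet, or to reuse it if it already exists. Whether the call creates or reuses a variable is controlled by an attribute (reuse) of the current variable_scope().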

def relu(X):
    with tf.variable_scope("relu", reuse=True):
        threshold = tf.get_variable("threshold")  # reuse existing variable 
        # [...] 
        return tf.maximum(z, threshold, name="max")


X = tf.placeholder(tf.float32, shape=(None, n_features), name="X")
with tf.variable_scope("relu"):  # create the variable 
    threshold = tf.get_variable("threshold", shape=(), initializer=tf.constant_initializer(0.0))
relus = [relu(X) for relu_index in range(5)]
output = tf.add_n(relus, name="output")

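Variables created with get_variable() are always named with their variable_scope as a prefix (for example "relu/threshold"). For all other nodes, including variables created with tf.Variable(), the variable scope behaves like a new name scope. In particular, if a name scope with the same name has already been created, a suffix is added to keep the name unique. For example, all nodes in the example above (except the threshold variable) have a prefix from "relu_1/" to "relu_5/"; see Figure 9-8.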
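The suffixing behavior can be modeled with a simple counter. The sketch below is a simplified pure-Python model, not TensorFlow's actual naming code; the `ScopeNamer` class is a hypothetical name introduced for illustration.

```python
class ScopeNamer:
    """Hand out unique scope names by appending _1, _2, ... to names already in use."""

    def __init__(self):
        self._counts = {}

    def unique_name(self, name):
        count = self._counts.get(name, 0)
        self._counts[name] = count + 1
        return name if count == 0 else "{}_{}".format(name, count)


namer = ScopeNamer()
first = namer.unique_name("relu")  # scope opened when the shared variable is created
later = [namer.unique_name("relu") for _ in range(5)]  # one scope per relu() call
print(first)  # relu
print(later)  # ['relu_1', 'relu_2', 'relu_3', 'relu_4', 'relu_5']
```

This mirrors why the five relu() calls end up under "relu_1" through "relu_5": the plain "relu" name was already taken by the scope in which threshold was created.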