Introduction

Google Inception Net first appeared in the ILSVRC 2014 competition (the same year as VGGNet) and won first place by a clear margin. The Inception Net from that competition is usually called Inception V1. Its biggest feature is that it achieved excellent classification performance, a top-5 error rate of 6.67%, less than half that of AlexNet, while keeping both computation and the number of parameters under control. Inception V1 has 22 layers, deeper than AlexNet's 8 layers or VGGNet's 19 layers, yet it needs only 1.5 billion floating-point operations and has only 5 million parameters, about 1/12 of AlexNet's 60 million, while reaching far better accuracy than AlexNet. It is therefore a very good and practical model. There are two main motivations for reducing the number of parameters: first, the more parameters a model has, the larger it is and the more data it needs to learn, and high-quality data is expensive; second, more parameters consume more computing resources. Besides being deeper and more expressive, Inception V1 achieves good results with few parameters for two further reasons. First, it removes the final fully connected layers and replaces them with a global average pooling layer; the fully connected layers account for almost 90% of the parameters in AlexNet or VGGNet and tend to cause overfitting, so removing them makes the model train faster and less prone to overfitting. Replacing the fully connected layers with global average pooling was borrowed from the Network In Network (NIN) paper. Second, the carefully designed Inception Module in Inception V1 improves the efficiency with which parameters are used.
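To make the idea of replacing the fully connected layers with global average pooling more concrete, here is a minimal stand-alone sketch (not part of the original model; the feature-map shape and class count are assumed purely for illustration):
import tensorflow as tf

# Assume `feature_map` is the last convolutional output, e.g. 8 x 8 x 2048.
feature_map = tf.placeholder(tf.float32, [None, 8, 8, 2048])

# Fully connected head (AlexNet/VGGNet style): flatten, then one huge weight matrix.
# 8*8*2048*1000 is roughly 131 million parameters in this single layer.
fc_logits = tf.layers.dense(tf.layers.flatten(feature_map), 1000)

# Global average pooling head (NIN/Inception style): average each channel over the
# spatial dimensions, then project 2048 -> 1000 (about 2 million parameters).
gap = tf.reduce_mean(feature_map, axis=[1, 2])   # shape: [batch, 2048]
gap_logits = tf.layers.dense(gap, 1000)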
Let us look at the basic structure of the Inception Module. It has 4 branches. The first branch applies a 1×1 convolution to the input, which is itself an important structure proposed in NIN. The 1×1 convolution is an excellent building block: it organizes information across channels, improves the expressive power of the network, and can increase or reduce the number of output channels. All 4 branches of the Inception Module use 1×1 convolutions to perform low-cost cross-channel feature transformations (the computation is much smaller than that of a 3×3 convolution). The second branch uses a 1×1 convolution followed by a 3×3 convolution, which amounts to two feature transformations. The third branch is similar: a 1×1 convolution followed by a 5×5 convolution. The last branch applies 3×3 max pooling and then a 1×1 convolution. Some branches use only a 1×1 convolution, and branches using other kernel sizes also apply a 1×1 convolution, because the 1×1 convolution is very cost-effective: it adds one more layer of feature transformation and nonlinearity with very little computation. The 4 branches are merged at the end by a concatenation operation (along the output-channel dimension). The Inception Module contains convolutions of 3 different sizes plus one max pooling, which increases the network's adaptability to different scales; this is similar to the Multi-Scale idea. In early computer vision research, inspired by the primate visual neural system, Serre used Gabor filters of different sizes to process images at different scales, and Inception V1 borrowed this idea. The Inception V1 paper points out that the Inception Module allows the depth and width of the network to be expanded efficiently, improving accuracy without overfitting.
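The branch structure described above can be sketched in a few lines with tf.contrib.slim. This is a simplified, hypothetical Inception-V1-style module; the channel counts are made up for illustration and do not come from the paper:
import tensorflow as tf
slim = tf.contrib.slim

def naive_inception_module(net, scope='NaiveInception'):
    # A simplified 4-branch Inception-style module; channel counts are illustrative.
    with tf.variable_scope(scope):
        with slim.arg_scope([slim.conv2d, slim.max_pool2d], stride=1, padding='SAME'):
            branch_0 = slim.conv2d(net, 64, [1, 1])            # 1x1 convolution only
            branch_1 = slim.conv2d(net, 48, [1, 1])            # 1x1 followed by 3x3
            branch_1 = slim.conv2d(branch_1, 64, [3, 3])
            branch_2 = slim.conv2d(net, 48, [1, 1])            # 1x1 followed by 5x5
            branch_2 = slim.conv2d(branch_2, 64, [5, 5])
            branch_3 = slim.max_pool2d(net, [3, 3])            # 3x3 max pool, then 1x1
            branch_3 = slim.conv2d(branch_3, 32, [1, 1])
        # Merge the 4 branches along the output-channel dimension.
        return tf.concat([branch_0, branch_1, branch_2, branch_3], 3)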
Within an Inception Module, the 1×1 convolutions usually account for the largest share of output channels, while the 3×3 and 5×5 convolutions account for slightly less. In the whole network there are many stacked Inception Modules, and we want the later ones to capture higher-level, more abstract features. The spatial concentration of the convolutions in later Inception Modules should therefore gradually decrease so that larger-area features can be captured, which means that in later Inception Modules the proportion of output channels from the larger 3×3 and 5×5 kernels should increase.
Inception V2 learned from VGGNet and replaced the large 5×5 convolution with two 3×3 convolutions (to reduce the number of parameters and mitigate overfitting), and it also introduced the well-known Batch Normalization (BN) method. BN is a very effective regularization method that can speed up the training of large convolutional networks many times over, and the classification accuracy after convergence is also substantially improved. When BN is applied to a layer of a neural network, it standardizes the internal activations of each mini-batch, normalizing the output toward a standard normal distribution N(0, 1), which reduces Internal Covariate Shift (the changing distributions of internal neurons). The BN paper points out that when training a traditional deep convolutional network, the input distribution of every layer keeps changing, which makes training difficult, and only a very small learning rate can cope with it. With BN applied to every layer this problem is effectively solved: the learning rate can be increased many times, the number of iterations needed to reach the previous accuracy drops to 1/14, and training time is greatly shortened. Training can then continue beyond that accuracy and eventually reach a performance far better than Inception V1, a top-5 error rate of 4.8%, which is already better than human-level performance. Because BN also acts as a regularizer in some sense, Dropout can be reduced or removed, simplifying the network structure.
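As a rough illustration of the per-mini-batch standardization that BN performs, here is a minimal sketch; it is a simplified version that omits the moving-average bookkeeping of a real BN layer, and the epsilon value and axes are assumptions:
import tensorflow as tf

def simple_batch_norm(x, epsilon=0.001):
    # Standardize each channel over the batch and spatial dimensions to roughly N(0, 1),
    # then apply a learned scale (gamma) and shift (beta).
    mean, variance = tf.nn.moments(x, axes=[0, 1, 2], keep_dims=True)
    gamma = tf.get_variable('gamma', x.get_shape()[-1:], initializer=tf.ones_initializer())
    beta = tf.get_variable('beta', x.get_shape()[-1:], initializer=tf.zeros_initializer())
    x_hat = (x - mean) / tf.sqrt(variance + epsilon)
    return gamma * x_hat + beta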
Of course, simply switching on BN does not bring the full gain by itself; some corresponding adjustments are needed: increase the learning rate and accelerate the learning-rate decay to suit the BN-normalized data; remove Dropout and reduce L2 regularization (because BN already acts as a regularizer); remove LRN; shuffle the training samples more thoroughly; and reduce the optical distortions used in data augmentation (because BN trains faster and each sample is seen fewer times, more realistic samples help training more). With these measures, Inception V2 reached Inception V1's accuracy 14 times faster, and the model converged to a higher accuracy ceiling.
Inception V3 mainly introduces two kinds of improvements. The first is the idea of Factorization into small convolutions: a larger two-dimensional convolution is split into two smaller one-dimensional convolutions, for example splitting a 7×7 convolution into a 1×7 convolution and a 7×1 convolution. This saves more parameters than splitting it into three 3×3 convolutions, while adding one more layer of nonlinearity and expanding the model's expressive power. The paper points out that this asymmetric factorization works better than splitting symmetrically into several identical small kernels: it can handle more and richer spatial features and increases feature diversity.
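A quick back-of-the-envelope count (a sketch only, assuming a single input and output channel and ignoring biases) shows the savings from this factorization:
full_7x7 = 7 * 7                    # 49 weights for a plain 7x7 kernel
asym_1x7_plus_7x1 = 1 * 7 + 7 * 1   # 14 weights, i.e. 2/7 of the 7x7 kernel
three_3x3 = 3 * (3 * 3)             # 27 weights for three stacked 3x3 convolutions
print(full_7x7, asym_1x7_plus_7x1, three_3x3)   # 49 14 27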
The second improvement is that Inception V3 optimizes the structure of the Inception Module: there are now three different module structures, for 35×35, 17×17 and 8×8 feature maps. These Inception Modules appear only in the later part of the network; the earlier part still consists of ordinary convolutional layers. In addition to using branches inside the Inception Module, Inception V3 also uses branches inside branches (in the 8×8 structure), which can be described as a Network In Network style design.
Inception V4, compared with V3, mainly incorporates Microsoft's ResNet, which will be covered separately in Section 6.4 and is not discussed further here. Next we will implement Inception V3.
Because Google Inception Net V3 is relatively complex, we use tf.contrib.slim to help design the network. The functions and components in contrib.slim can greatly reduce the amount of code needed to design an Inception Net; with only a small amount of code we can build the 42-layer-deep Inception V3.

  1. First define a simple function trunc_normal that produces a truncated normal distribution initializer.
#coding=utf-8
#Inception-V3.py
import tensorflow as tf
from datetime import datetime
import math
import time
slim = tf.contrib.slim
# Generate a truncated normal distribution initializer
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)
  2. Next we define the function inception_v3_arg_scope, which generates default parameters for functions that are used frequently in the network, such as the activation function of convolutions, the weight-initialization method and the normalizer. The default weight_decay for L2 regularization is set to 0.00004, the default standard deviation stddev to 0.1, and the default value of batch_norm_var_collection to moving_vars. We then define the parameter dictionary for batch normalization: the decay coefficient decay is 0.9997, epsilon is 0.001, updates_collections is tf.GraphKeys.UPDATE_OPS, and in the dictionary variables_collections both beta and gamma are set to None while moving_mean and moving_variance are set to the batch_norm_var_collection defined above.
    Next we use slim.arg_scope, a very useful tool that automatically assigns given values to the parameters of the listed functions. For example, with slim.arg_scope([slim.conv2d, slim.fully_connected], weights_regularizer=slim.l2_regularizer(weight_decay)) automatically assigns values to the parameters of the two functions slim.conv2d and slim.fully_connected, setting the default of weights_regularizer to slim.l2_regularizer(weight_decay). With slim.arg_scope we no longer have to set these parameters on every call; we only set them when they need to change. We then nest another slim.arg_scope that assigns default values to several parameters of the convolution-layer generator slim.conv2d: the weight initializer weights_initializer is set to trunc_normal(stddev), the activation function to ReLU, the normalizer to slim.batch_norm, and the normalizer parameters to the batch_norm_params defined earlier. Finally the scope defined in this way is returned. (A small stand-alone example of how slim.arg_scope works is given right after the code below.)
    Because the default parameters of slim.conv2d, including the activation function and the normalizer, are defined in advance, defining a convolutional layer later becomes very convenient: a single line of code defines one convolutional layer, the overall code stays concise and readable, and the workload of designing the network is greatly reduced.
# Define inception_v3_arg_scope, which generates default parameters for functions frequently used in the network
def inception_v3_arg_scope(weight_decay=0.00004, stddev=0.1, batch_norm_var_collection='moving_vars'):
    batch_norm_params = {
        'decay': 0.9997,
        'epsilon': 0.001,
        'updates_collections': tf.GraphKeys.UPDATE_OPS,
        'variables_collections':{
            'beta' : None,
            'gamma' : None,
            'moving_mean': [batch_norm_var_collection],
            'moving_variance': [batch_norm_var_collection]
        }
    }
    with slim.arg_scope([slim.conv2d, slim.fully_connected], weights_regularizer=slim.l2_regularizer(weight_decay)):
        with slim.arg_scope([slim.conv2d], weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                            activation_fn=tf.nn.relu,
                            normalizer_fn=slim.batch_norm,
                            normalizer_params=batch_norm_params) as sc:
            return sc
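To see how slim.arg_scope supplies defaults, the following stand-alone sketch may help (the layer sizes and scope names here are arbitrary and only for demonstration):
import tensorflow as tf
slim = tf.contrib.slim

# Inside the arg_scope, padding='SAME' and the weights_regularizer are applied to
# every slim.conv2d call automatically; an individual call can still override them,
# as the second call does with padding='VALID'.
demo_images = tf.random_uniform((1, 32, 32, 3))
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_regularizer=slim.l2_regularizer(0.0001)):
    demo_net = slim.conv2d(demo_images, 16, [3, 3], scope='demo_conv1')
    demo_net = slim.conv2d(demo_net, 16, [3, 3], padding='VALID', scope='demo_conv2')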
  3. Next we define the function inception_v3_base, which generates the convolutional part of the Inception V3 network. The parameter inputs is the tensor of input image data, and scope is the environment containing the function's default parameters. We define a dictionary end_points to store important nodes for later use. We then use slim.arg_scope to set default parameter values for the three functions slim.conv2d, slim.max_pool2d and slim.avg_pool2d, setting stride to 1 and padding to VALID. Then we formally start defining the Inception V3 network structure, beginning with the ordinary (non-Inception-Module) convolutional layers at the front. Here we create convolutional layers directly with slim.conv2d: its first argument is the input tensor, the second the number of output channels, the third the kernel size, the fourth the stride, and the fifth the padding mode. Our first convolutional layer has 32 output channels, a 3×3 kernel, stride 2 and the default VALID padding. The following layers take the same form and are defined layer by layer according to the paper. Because we use slim and slim.arg_scope, one line of code defines one convolutional layer, which is much more convenient than the earlier AlexNet implementation, where several lines were needed per convolutional layer, or the VGGNet implementation, where a dedicated function was written for that purpose.
    We can observe that the first few ordinary (non-Inception-Module) convolutional layers mainly use small 3×3 kernels, fully borrowing from the structure of VGGNet. The Inception V3 paper also proposes the Factorization into small convolutions idea, using two one-dimensional convolutions to simulate a large two-dimensional convolution, reducing parameters while increasing nonlinearity. Among the first few layers there is also a 1×1 convolution, one of the structures frequently used in the Inception Module, which combines features across channels at low cost. Except for the first convolutional layer, whose stride is 2, all other convolutional layers here have stride 1, while the pooling layers are overlapping max pooling with a 3×3 window and stride 2, a structure used in AlexNet. The input data size is 299×299×3, and after three layers with stride 2 the size shrinks to 35×35×192: the spatial dimensions are greatly reduced while the number of output channels grows substantially. This part of the code contains 5 convolutional layers and 2 pooling layers, which compress the size of the input data and abstract the image features (the shape changes are traced in comments appended at the end of the code below).
def inception_v3_base(inputs, scope=None):
    end_points = {}
    with tf.variable_scope(scope, 'InceptionV3', [inputs]):
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='VALID'):
            net = slim.conv2d(inputs, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3')
            net = slim.conv2d(net, 32, [3, 3], scope='Conv2d_2a_3x3')
            net = slim.conv2d(net, 64, [3, 3], padding='SAME', scope='Conv2d_2b_3x3')
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_3a_3x3')
            net = slim.conv2d(net, 80, [1, 1], scope='Conv2d_3b_1x1')
            net = slim.conv2d(net, 192, [3, 3], scope='Conv2d_4a_3x3')
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_5a_3x3')
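            # Shape trace for the layers above, assuming a 299x299x3 input:
            #   Conv2d_1a_3x3 (stride 2, VALID): 299 -> 149, 32 channels
            #   Conv2d_2a_3x3 (stride 1, VALID): 149 -> 147, 32 channels
            #   Conv2d_2b_3x3 (stride 1, SAME):  147 -> 147, 64 channels
            #   MaxPool_3a_3x3 (stride 2, VALID): 147 -> 73, 64 channels
            #   Conv2d_3b_1x1 (stride 1, VALID): 73 -> 73, 80 channels
            #   Conv2d_4a_3x3 (stride 1, VALID): 73 -> 71, 192 channels
            #   MaxPool_5a_3x3 (stride 2, VALID): 71 -> 35, i.e. 35x35x192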
  4. Next come three consecutive Inception module groups. Each group contains several Inception Modules, and this part of the network is the essence of Inception V3. The Inception Modules inside one group have very similar structures but differ in some details.
    The first Inception module group contains three structurally similar Inception Modules, the first of which is named Mixed_5b. We first use slim.arg_scope to set the default parameters for all Inception module groups, giving every convolutional, max pooling and average pooling layer a stride of 1 and SAME padding. We then set the variable_scope name of this Inception Module to Mixed_5b. It has 4 branches, Branch_0 to Branch_3. The first branch is a 1×1 convolution with 64 output channels; the second branch is a 1×1 convolution with 48 output channels followed by a 5×5 convolution with 64 output channels; the third branch is a 1×1 convolution with 64 output channels followed by two consecutive 3×3 convolutions with 96 output channels; the fourth branch is a 3×3 average pooling followed by a 1×1 convolution with 32 output channels. Finally, tf.concat merges the outputs of the 4 branches along the third dimension (the output channels) to produce the final output of this Inception Module. Because every layer here has stride 1 and SAME padding, the spatial size does not shrink and stays at 35×35, and with 64+64+96+32 output channels the output is 35×35×256. Note that all Inception Modules in the first module group output 35×35 feature maps, but the number of channels changes in the latter two modules.
#Inception Module
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'):
            # The first module of the first Inception module group
            with tf.variable_scope('Mixed_5b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
  5. Next is the second Inception Module of the first module group, Mixed_5c. It still uses the default parameters set earlier: stride 1 and SAME padding. This Inception Module also has 4 branches; the only difference is that the last layer of the fourth branch is a 1×1 convolution with 64 output channels instead of the previous 32. The final output tensor is therefore 35×35×288, with 32 more output channels than before.
# The second module of the first Inception module group
            with tf.variable_scope('Mixed_5c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0b_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0c_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
  6. The third Inception Module of the first module group, Mixed_5d, is exactly the same as the previous one: the structure and parameters of its 4 branches are identical, and so is the size of the output tensor.
# The third module of the first Inception module group
            with tf.variable_scope('Mixed_5d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
  7. The second Inception module group is a very large group containing 5 Inception Modules, of which the second to the fifth have very similar structures. The first one is named Mixed_6a and has 3 branches. The first branch is a 3×3 convolution with 384 output channels; the channel count of this single branch already exceeds the previous total, but its stride is 2, so the feature map is compressed, and with VALID padding the spatial size shrinks to 17×17. The second branch has 3 layers: a 1×1 convolution with 64 output channels and two 3×3 convolutions with 96 output channels; note that the last layer has stride 2 and VALID padding, so the feature map is also compressed and the final output of this branch is 17×17×96. The third branch is a 3×3 max pooling layer, also with stride 2 and VALID padding; pooling does not change the number of channels, so its output is 17×17×288. Finally, tf.concat merges the 3 branches along the output channels, giving an output of 17×17×(384+96+288) = 17×17×768. Throughout the second Inception module group, the output tensors of all 5 Inception Modules stay at 17×17×768, i.e. neither the spatial size nor the number of channels changes.
# The first module of the second Inception module group
            with tf.variable_scope('Mixed_6a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 384, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_1a_1x1')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], stride=2, padding='VALID', scope='Conv2d_1b_1x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', scope='MaxPool_1a_3x3')
                net = tf.concat([branch_0, branch_1, branch_2], 3)
  8. Next is the second Inception Module of the second module group, Mixed_6b, which has 4 branches. The first branch is a simple 1×1 convolution with 192 output channels. The second branch consists of 3 convolutional layers: a 1×1 convolution with 128 output channels, a 1×7 convolution with 128 output channels, and a 7×1 convolution with 192 output channels. This is the Factorization into small convolutions idea mentioned earlier: the serial 1×7 and 7×1 convolutions are equivalent to a 7×7 convolution, but the number of parameters is greatly reduced (only 2/7 of the latter), and the extra activation function adds another nonlinear feature transformation. The third branch has as many as 5 convolutional layers: a 1×1 convolution with 128 output channels, a 7×1 convolution with 128 output channels, a 1×7 convolution with 128 output channels, a 7×1 convolution with 128 output channels and a 1×7 convolution with 192 output channels. This branch is a textbook application of Factorization into small convolutions, repeatedly splitting the 7×7 convolution. Finally, the fourth branch is a 3×3 average pooling layer followed by a 1×1 convolution with 192 output channels. The 4 branches are merged, and the output tensor of this layer is 17×17×(192+192+192+192) = 17×17×768.
# The second module of the second Inception module group
            with tf.variable_scope('Mixed_6b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 128, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 128, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
  9. Next is the third Inception Module of the second module group, Mixed_6c. Mixed_6c is very similar to the previous Inception Module, with only one difference: the output channels of the first few convolutional layers in the second and third branches change from 128 to 160, while the final output channels of those two branches remain 192. Everything else is identical. Note that each time the network passes through an Inception Module, even though the size of the output tensor stays the same, the features are effectively refined once more, and the rich convolutions and nonlinearities in the module contribute a great deal to the network's performance.
# The third module of the second Inception module group
            with tf.variable_scope('Mixed_6c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
  10. Mixed_6d is exactly the same as the preceding Mixed_6c; its purpose is likewise to refine the features through the carefully designed Inception Module structure, adding more convolutions and nonlinearity.
# The fourth module of the second Inception module group
            with tf.variable_scope('Mixed_6d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
  11. Mixed_6e is also identical to the previous two Inception Modules. It is the last Inception Module of the second module group. We store Mixed_6e in end_points so that it can be used by the Auxiliary Classifier.
# The fifth module of the second Inception module group; Mixed_6e is stored in end_points for the auxiliary classifier
            with tf.variable_scope('Mixed_6e'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            end_points['Mixed_6e'] = net
  12. The third Inception module group contains 3 Inception Modules, of which the last two have very similar structures. The first one is named Mixed_7a and contains 3 branches. The first branch is a 1×1 convolution with 192 output channels followed by a 3×3 convolution with 320 output channels, but with stride 2 and VALID padding, so the feature map shrinks to 8×8. The second branch has 4 convolutional layers: a 1×1 convolution with 192 output channels, a 1×7 convolution with 192 output channels, a 7×1 convolution with 192 output channels, and a 3×3 convolution with 192 output channels; note that the last convolutional layer also has stride 2 and VALID padding, so the final output tensor of this branch is 8×8×192. The third branch is a 3×3 max pooling layer with stride 2 and VALID padding; pooling does not change the number of channels, so the output of this branch is 8×8×768. Finally, the 3 branches are merged along the output channels, giving an output tensor of 8×8×(320+192+768) = 8×8×1280. From this Inception Module on, the spatial size of the feature map is reduced again while the number of channels increases, and the total size of the tensor keeps decreasing.
# The first module of the third Inception module group
            with tf.variable_scope('Mixed_7a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_0 = slim.conv2d(branch_0, 320, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_3x3')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                    branch_1 = slim.conv2d(branch_1, 192, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_3x3')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', scope='MaxPool_1a_3x3')
                net = tf.concat([branch_0, branch_1, branch_2], 3)
  13. Next is the second Inception Module of the third module group, which has 4 branches. The first branch is a simple 1×1 convolution with 320 output channels. The second branch starts with a 1×1 convolution with 384 output channels and then splits into 2 sub-branches: a 1×3 convolution with 384 output channels and a 3×1 convolution with 384 output channels; tf.concat then merges the 2 sub-branches, giving an output tensor of 8×8×(384+384) = 8×8×768. The third branch is more complex: first a 1×1 convolution with 448 output channels, then a 3×3 convolution with 384 output channels, and then it likewise splits into 2 sub-branches, a 1×3 convolution with 384 output channels and a 3×1 convolution with 384 output channels, which are merged into an 8×8×768 output tensor. The fourth branch is a 3×3 average pooling layer followed by a 1×1 convolution with 192 output channels. Finally, the 4 branches of this very complex Inception Module are merged, giving an output tensor of 8×8×(320+768+768+192) = 8×8×2048. With this Inception Module, the number of output channels increases from 1280 to 2048.
# The second module of the third Inception module group
            with tf.variable_scope('Mixed_7b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                                          slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0b_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                                          slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)

                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
  14. Mixed_7c is the last Inception Module of the third module group, and it is exactly the same as the preceding Mixed_7b; its output tensor is also 8×8×2048. Finally we return the result of this Inception Module as the output of the inception_v3_base function.
# The third module of the third Inception module group
            with tf.variable_scope('Mixed_7c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                                          slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0b_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                                          slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)

                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            return net, end_points
  15. At this point the core of the Inception V3 network, the convolutional part, is complete. Next we implement the final part of the network: global average pooling, Softmax and the Auxiliary Logits. Let us first look at the input parameters of the function inception_v3. num_classes is the number of classes to predict; the default of 1000 is the number of classes in the ILSVRC competition dataset. is_training marks whether this is the training phase, which affects Batch Normalization and Dropout: they are only enabled during training. dropout_keep_prob is the fraction of nodes kept by Dropout during training, defaulting to 0.8. prediction_fn is the function used for the final classification, defaulting to slim.softmax. spatial_squeeze marks whether to apply a squeeze operation to the output, i.e. remove dimensions of size 1. reuse marks whether the network and its Variables are reused, and scope is the environment containing the function's default parameters. We first use tf.variable_scope to define defaults such as the network's name and reuse, then use slim.arg_scope to define the default value of the is_training flag for Batch Normalization and Dropout. Finally we build the convolutional part of the whole network with the previously defined inception_v3_base and obtain the output net of the last layer and the dictionary end_points of important nodes.
# Define the inception_v3 function, which implements global average pooling, Softmax and the Auxiliary Logits
def inception_v3(inputs, num_classes=1000, is_training=True, dropout_keep_prob=0.8, prediction_fn=slim.softmax,
                 spatial_squeeze=True, reuse=None, scope='InceptionV3'):
    with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes], reuse=reuse) as scope:
        with slim.arg_scope([slim.batch_norm, slim.dropout], is_training=is_training):
            net, end_points = inception_v3_base(inputs, scope=scope)
  16. Next we handle the Auxiliary Logits part. The Auxiliary Logits serve as an auxiliary classification node and help the prediction of the final classification result. We first use slim.arg_scope to set the default stride of convolutions, max pooling and average pooling to 1 and the default padding mode to SAME (this block stays nested inside the variable_scope and arg_scope defined in the previous step, so the is_training flag still applies to its Batch Normalization layers). We then take Mixed_6e from end_points and append a 5×5 average pooling with stride 3 and VALID padding, changing the output size from 17×17×768 to 5×5×768. Next we connect a 1×1 convolution with 128 output channels and a 5×5 convolution with 768 output channels; for the latter the weight initializer is reset to a truncated normal distribution with standard deviation 0.01 and the padding mode is set to VALID, so the output size becomes 1×1×768. We then connect a 1×1 convolution with num_classes output channels, with no activation function and no normalizer, and with the weight initializer reset to a truncated normal distribution with standard deviation 0.001, so the output becomes 1×1×1000. Then tf.squeeze removes the two dimensions of size 1 from the output tensor, and finally the output aux_logits of the auxiliary classification node is stored in the dictionary end_points.
            # Implement the Auxiliary Logits part
            with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'):
                aux_logits = end_points['Mixed_6e']
                with tf.variable_scope('AuxLogits'):
                    aux_logits = slim.avg_pool2d(aux_logits, [5, 5], stride=3, padding='VALID', scope='AvgPool_1a_5x5')
                    aux_logits = slim.conv2d(aux_logits, 128, [1, 1], scope='Conv2d_1b_1x1')
                    aux_logits = slim.conv2d(aux_logits, 768, [5, 5], weights_initializer=trunc_normal(0.01),
                                             padding='VALID', scope='Conv2d_2a_5x5')
                    aux_logits = slim.conv2d(aux_logits, num_classes, [1, 1], activation_fn=None, normalizer_fn=None,
                                             weights_initializer=trunc_normal(0.001), scope='Conv2d_2b_1x1')
                    if spatial_squeeze:
                        aux_logits = tf.squeeze(aux_logits, [1, 2], name='SpatialSqueeze')
                    end_points['AuxLogits'] = aux_logits
  17. Next we handle the normal classification prediction logic. We apply an 8×8 global average pooling with VALID padding directly to net, the output of the last convolutional layer Mixed_7c, so the output tensor becomes 1×1×2048. We then connect a Dropout layer with keep probability dropout_keep_prob. Next we connect a 1×1 convolution with num_classes (1000) output channels, with the activation function and normalizer set to None. We then use tf.squeeze to remove the dimensions of size 1 from the output tensor and finally connect a Softmax to obtain the class predictions. The function returns the output logits and the end_points dictionary containing the auxiliary nodes.
            # Implement the normal classification prediction logic
            with tf.variable_scope('Logits'):
                net = slim.avg_pool2d(net, [8, 8], padding='VALID', scope='AvgPool_1a_8x8')
                net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
                end_points['PreLogits'] = net
                logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, normalizer_fn=None, scope='Conv2d_1c_1x1')
                if spatial_squeeze:
                    logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
                end_points['Logits'] = logits
                end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
    return logits, end_points
  18. At this point the construction of the entire Inception V3 network is complete. Inception V3 is a very complex and elaborate model that draws on a great deal of accumulated experience and many techniques for designing large convolutional networks. Although the Inception V3 paper gives several principles for designing convolutional networks, many of its hyperparameter choices, including the number of layers, kernel sizes, positions of pooling, stride settings, when to apply factorization, and the design of the branches, are hard to explain one by one.
    Next we test the computational performance of Inception V3 using the time_tensorflow_run function. Because the Inception V3 network is large, batch_size is set to 32 to avoid running out of GPU memory. The image size is set to 299×299, and tf.random_uniform generates random images as the input. We then use slim.arg_scope to load the previously defined inception_v3_arg_scope(), which contains the default parameters for Batch Normalization as well as the defaults for the activation function and parameter initialization. Under this arg_scope we call the inception_v3 function with inputs to obtain logits and end_points. We then create a Session and initialize all model parameters. Finally we set the number of test batches to 100 and use time_tensorflow_run to measure the forward performance of the Inception V3 network.
# Define the per-batch time evaluation function (the same helper used in the AlexNet implementation)
def time_tensorflow_run(session, target, info_string):
    num_steps_burn_in = 10 # warm-up iterations that are excluded from timing
    total_durations = 0.0
    total_duration_squared = 0.0
    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f'%(datetime.now(), i - num_steps_burn_in, duration))
            total_durations += duration
            total_duration_squared += duration * duration
    # Compute the mean time per batch and the standard deviation sd, then print the results
    mn = total_durations / num_batches
    vr = total_duration_squared / num_batches - mn * mn
    sd = math.sqrt(vr)
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch'%(datetime.now(), info_string, num_batches, mn, sd))
# Test the computational performance of Inception V3
batch_size = 32
height, width = 299, 299
inputs = tf.random_uniform((batch_size, height, width, 3))
with slim.arg_scope(inception_v3_arg_scope()):
    logits, end_points = inception_v3(inputs, is_training=False)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
num_batches = 100
time_tensorflow_run(sess, logits, "Foward")

Complete implementation code

#coding=utf-8
#Inception-V3.py
import tensorflow as tf
from datetime import datetime
import math
import time
slim = tf.contrib.slim
# Generate a truncated normal distribution initializer
trunc_normal = lambda stddev: tf.truncated_normal_initializer(0.0, stddev)
# Define inception_v3_arg_scope, which generates default parameters for functions frequently used in the network
def inception_v3_arg_scope(weight_decay=0.00004, stddev=0.1, batch_norm_var_collection='moving_vars'):
    batch_norm_params = {
        'decay': 0.9997,
        'epsilon': 0.001,
        'updates_collections': tf.GraphKeys.UPDATE_OPS,
        'variables_collections':{
            'beta' : None,
            'gamma' : None,
            'moving_mean': [batch_norm_var_collection],
            'moving_variance': [batch_norm_var_collection]
        }
    }
    with slim.arg_scope([slim.conv2d, slim.fully_connected], weights_regularizer=slim.l2_regularizer(weight_decay)):
        with slim.arg_scope([slim.conv2d], weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                            activation_fn=tf.nn.relu,
                            normalizer_fn=slim.batch_norm,
                            normalizer_params=batch_norm_params) as sc:
            return sc
# Define inception_v3_base, which generates the convolutional part of the Inception V3 network (the initial conv/pool layers and the Inception module groups)
def inception_v3_base(inputs, scope=None):
    end_points = {}
    with tf.variable_scope(scope, 'InceptionV3', [inputs]):
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='VALID'):
            net = slim.conv2d(inputs, 32, [3, 3], stride=2, scope='Conv2d_1a_3x3')
            net = slim.conv2d(net, 32, [3, 3], scope='Conv2d_2a_3x3')
            net = slim.conv2d(net, 64, [3, 3], padding='SAME', scope='Conv2d_2b_3x3')
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_3a_3x3')
            net = slim.conv2d(net, 80, [1, 1], scope='Conv2d_3b_1x1')
            net = slim.conv2d(net, 192, [3, 3], scope='Conv2d_4a_3x3')
            net = slim.max_pool2d(net, [3, 3], stride=2, scope='MaxPool_5a_3x3')
        #Inception Module
        with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'):
            # The first module of the first Inception module group
            with tf.variable_scope('Mixed_5b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 32, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # The second module of the first Inception module group
            with tf.variable_scope('Mixed_5c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0b_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0c_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # The third module of the first Inception module group
            with tf.variable_scope('Mixed_5d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 48, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 64, [5, 5], scope='Conv2d_0b_5x5')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = slim.conv2d(branch_2, 96, [3, 3], scope='Conv2d_0c_3x3')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 64, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # The first module of the second Inception module group
            with tf.variable_scope('Mixed_6a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 384, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 64, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], scope='Conv2d_1a_1x1')
                    branch_1 = slim.conv2d(branch_1, 96, [3, 3], stride=2, padding='VALID', scope='Conv2d_1b_1x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', scope='MaxPool_1a_3x3')
                net = tf.concat([branch_0, branch_1, branch_2], 3)
            # The second module of the second Inception module group
            with tf.variable_scope('Mixed_6b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 128, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 128, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 128, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 128, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # The third module of the second Inception module group
            with tf.variable_scope('Mixed_6c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # The fourth module of the second Inception module group
            with tf.variable_scope('Mixed_6d'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # The fifth module of the second Inception module group; Mixed_6e is stored in end_points for the auxiliary classifier
            with tf.variable_scope('Mixed_6e'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 160, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 160, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0b_7x1')
                    branch_2 = slim.conv2d(branch_2, 160, [1, 7], scope='Conv2d_0c_1x7')
                    branch_2 = slim.conv2d(branch_2, 160, [7, 1], scope='Conv2d_0d_7x1')
                    branch_2 = slim.conv2d(branch_2, 192, [1, 7], scope='Conv2d_0e_1x7')
                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            end_points['Mixed_6e'] = net
            # The first module of the third Inception module group
            with tf.variable_scope('Mixed_7a'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_0 = slim.conv2d(branch_0, 320, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_3x3')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 192, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = slim.conv2d(branch_1, 192, [1, 7], scope='Conv2d_0b_1x7')
                    branch_1 = slim.conv2d(branch_1, 192, [7, 1], scope='Conv2d_0c_7x1')
                    branch_1 = slim.conv2d(branch_1, 192, [3, 3], stride=2, padding='VALID', scope='Conv2d_1a_3x3')
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.max_pool2d(net, [3, 3], stride=2, padding='VALID', scope='MaxPool_1a_3x3')
                net = tf.concat([branch_0, branch_1, branch_2], 3)
            # The second module of the third Inception module group
            with tf.variable_scope('Mixed_7b'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                                          slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0b_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                                          slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)

                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            # The third module of the third Inception module group
            with tf.variable_scope('Mixed_7c'):
                with tf.variable_scope('Branch_0'):
                    branch_0 = slim.conv2d(net, 320, [1, 1], scope='Conv2d_0a_1x1')
                with tf.variable_scope('Branch_1'):
                    branch_1 = slim.conv2d(net, 384, [1, 1], scope='Conv2d_0a_1x1')
                    branch_1 = tf.concat([slim.conv2d(branch_1, 384, [1, 3], scope='Conv2d_0b_1x3'),
                                          slim.conv2d(branch_1, 384, [3, 1], scope='Conv2d_0b_3x1')], 3)
                with tf.variable_scope('Branch_2'):
                    branch_2 = slim.conv2d(net, 448, [1, 1], scope='Conv2d_0a_1x1')
                    branch_2 = slim.conv2d(branch_2, 384, [3, 3], scope='Conv2d_0b_3x3')
                    branch_2 = tf.concat([slim.conv2d(branch_2, 384, [1, 3], scope='Conv2d_0c_1x3'),
                                          slim.conv2d(branch_2, 384, [3, 1], scope='Conv2d_0d_3x1')], 3)

                with tf.variable_scope('Branch_3'):
                    branch_3 = slim.avg_pool2d(net, [3, 3], scope='AvgPool_0a_3x3')
                    branch_3 = slim.conv2d(branch_3, 192, [1, 1], scope='Conv2d_0b_1x1')
                net = tf.concat([branch_0, branch_1, branch_2, branch_3], 3)
            return net, end_points
# Define the inception_v3 function, which implements global average pooling, Softmax and the Auxiliary Logits
def inception_v3(inputs, num_classes=1000, is_training=True, dropout_keep_prob=0.8, prediction_fn=slim.softmax,
                 spatial_squeeze=True, reuse=None, scope='InceptionV3'):
    with tf.variable_scope(scope, 'InceptionV3', [inputs, num_classes], reuse=reuse) as scope:
        with slim.arg_scope([slim.batch_norm, slim.dropout], is_training=is_training):
            net, end_points = inception_v3_base(inputs, scope=scope)
            # Implement the Auxiliary Logits part
            with slim.arg_scope([slim.conv2d, slim.max_pool2d, slim.avg_pool2d], stride=1, padding='SAME'):
                aux_logits = end_points['Mixed_6e']
                with tf.variable_scope('AuxLogits'):
                    aux_logits = slim.avg_pool2d(aux_logits, [5, 5], stride=3, padding='VALID', scope='AvgPool_1a_5x5')
                    aux_logits = slim.conv2d(aux_logits, 128, [1, 1], scope='Conv2d_1b_1x1')
                    aux_logits = slim.conv2d(aux_logits, 768, [5, 5], weights_initializer=trunc_normal(0.01),
                                             padding='VALID', scope='Conv2d_2a_5x5')
                    aux_logits = slim.conv2d(aux_logits, num_classes, [1, 1], activation_fn=None, normalizer_fn=None,
                                             weights_initializer=trunc_normal(0.001), scope='Conv2d_2b_1x1')
                    if spatial_squeeze:
                        aux_logits = tf.squeeze(aux_logits, [1, 2], name='SpatialSqueeze')
                    end_points['AuxLogits'] = aux_logits
            # Implement the normal classification prediction logic
            with tf.variable_scope('Logits'):
                net = slim.avg_pool2d(net, [8, 8], padding='VALID', scope='AvgPool_1a_8x8')
                net = slim.dropout(net, keep_prob=dropout_keep_prob, scope='Dropout_1b')
                end_points['PreLogits'] = net
                logits = slim.conv2d(net, num_classes, [1, 1], activation_fn=None, normalizer_fn=None, scope='Conv2d_1c_1x1')
                if spatial_squeeze:
                    logits = tf.squeeze(logits, [1, 2], name='SpatialSqueeze')
                end_points['Logits'] = logits
                end_points['Predictions'] = prediction_fn(logits, scope='Predictions')
    return logits, end_points

# Define the per-batch time evaluation function (the same helper used in the AlexNet implementation)
def time_tensorflow_run(session, target, info_string):
    num_steps_burn_in = 10 # warm-up iterations that are excluded from timing
    total_durations = 0.0
    total_duration_squared = 0.0
    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f'%(datetime.now(), i - num_steps_burn_in, duration))
            total_durations += duration
            total_duration_squared += duration * duration
    # Compute the mean time per batch and the standard deviation sd, then print the results
    mn = total_durations / num_batches
    vr = total_duration_squared / num_batches - mn * mn
    sd = math.sqrt(vr)
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch'%(datetime.now(), info_string, num_batches, mn, sd))
# Test the computational performance of Inception V3
batch_size = 32
height, width = 299, 299
inputs = tf.random_uniform((batch_size, height, width, 3))
with slim.arg_scope(inception_v3_arg_scope()):
    logits, end_points = inception_v3(inputs, is_training=False)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
num_batches = 100
time_tensorflow_run(sess, logits, "Foward")