
TensorFlow Slim Explained (Part 1)


I have wanted to dig deeper into TensorFlow's slim for a while, hence this article: partly to keep notes for myself, partly to help others understand slim. My time is limited, so the article will be updated gradually and will probably take about a week to finish. I am also still learning, so there are bound to be mistakes; please bear with me and do not hesitate to point them out. Note: reading the whole thing takes roughly three hours, so it is split into two parts; I suggest reading the English original alongside my commentary.

TensorFlow-Slim

TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow. Components of tf-slim can be freely mixed with native tensorflow, as well as other frameworks, such as tf.contrib.learn.

TF-Slim is a lightweight TensorFlow library for defining, training and evaluating complex models. Its components can be freely mixed with native TensorFlow as well as with other frameworks such as tf.contrib.learn.


Usage

import tensorflow.contrib.slim as slim

Importing tensorflow.contrib.slim is all it takes to start using slim.


Why TF-Slim?

TF-Slim is a library that makes building, training and evaluating neural networks simple:

TF-Slim is a library that makes it simple to build, train and evaluate neural networks:

  • Allows the user to define models much more compactly by eliminating boilerplate code. This is accomplished through the use of argument scoping and numerous high level layers and variables. These tools increase readability and maintainability, reduce the likelihood of an error from copy-and-pasting hyperparameter values and simplifies hyperparameter tuning.

It lets developers define models much more compactly by eliminating boilerplate code, which is achieved through argument scoping and a large number of high-level layers and variables. These tools improve readability and maintainability, reduce the chance of errors from copy-pasting hyperparameter values, and simplify hyperparameter tuning.

  • Makes developing models simple by providing commonly used regularizers.

It makes developing models simpler by providing commonly used regularizers.

  • Several widely used computer vision models (e.g., VGG, AlexNet) have been developed in slim, and are available to users. These can either be used as black boxes, or can be extended in various ways, e.g., by adding "multiple heads" to different internal layers.

Several widely used computer vision models (e.g. VGG, AlexNet) have already been implemented in slim and are available to users. They can either be used as black boxes or be extended in various ways, for example by attaching "multiple heads" to different internal layers.

  • Slim makes it easy to extend complex models, and to warm start training algorithms by using pieces of pre-existing model checkpoints.

Slim makes it easy to extend complex models, and to warm-start training algorithms from pieces of pre-existing model checkpoints (see the sketch below).
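
To make that last point concrete, here is a minimal warm-start sketch of my own (assuming TF 1.x with tf.contrib.slim available); the checkpoint path, the layer sizes and the 'new_head' scope are illustrative placeholders, not values from the original text:

import tensorflow as tf

slim = tf.contrib.slim

images = tf.placeholder(tf.float32, [None, 224, 224, 3])

# Suppose this is the extended model; 'new_head' did not exist in the old checkpoint.
net = slim.conv2d(images, 64, [3, 3], scope='conv1')
net = slim.conv2d(net, 10, [1, 1], scope='new_head')

# Restore only the variables that also exist in the old model.
variables_to_restore = slim.get_variables_to_restore(exclude=['new_head'])
init_fn = slim.assign_from_checkpoint_fn('/path/to/old_model.ckpt',  # placeholder path
                                         variables_to_restore)

# Inside a tf.Session, calling init_fn(sess) loads the old weights before training resumes.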


What are the various components of TF-Slim?

TF-Slim is composed of several parts which were designed to exist independently. These include the following main pieces (explained in detail below).

TF-Slim consists of several independently usable parts, including the following main pieces (explained in detail below).

  • arg_scope: provides a new scope named arg_scope that allows a user to define default arguments for specific operations within that scope.

Provides a new scope named arg_scope, inside which a user can define default arguments for specific operations.

  • data: contains TF-slim's dataset definition, data providers, parallel_reader, and decoding utilities.

Contains TF-slim's dataset definitions, data providers, parallel_reader and the decoding utilities.

  • evaluation: contains routines for evaluating models.

Contains routines for evaluating models.

  • layers: contains high level layers for building models using tensorflow.

Contains high-level layers for building models with tensorflow.

  • learning: contains routines for training models.

Contains routines for training models.

  • losses: contains commonly used loss functions.

Contains commonly used loss functions.

  • metrics: contains popular evaluation metrics.

Contains popular evaluation metrics.

  • nets: contains popular network definitions such as VGG and AlexNet models.

Contains popular network definitions such as the VGG and AlexNet models.

  • queues: provides a context manager for easily and safely starting and closing QueueRunners.

Provides a context manager for starting and closing QueueRunners easily and safely.

  • regularizers: contains weight regularizers.

Contains weight regularizers.

  • variables: provides convenience wrappers for variable creation and manipulation.

Provides convenient wrappers for creating and manipulating variables.


Defining Models

Models can be succinctly defined using TF-Slim by combining its variables, layers and scopes. Each of these elements are defined below.

By combining its variables, layers and scopes, TF-Slim lets models be defined succinctly. Each of these elements is defined below.

Variables

Creating [Variables](https://link.zhihu.com/?target=https%3A//www.tensorflow.org/how_tos/variables/index.html) in native tensorflow requires either a predefined value or an initialization mechanism (e.g. randomly sampled from a Gaussian). Furthermore, if a variable needs to be created on a specific device, such as a GPU, the specification must be made explicit. To alleviate the code required for variable creation, TF-Slim provides a set of thin wrapper functions in variables.py which allow callers to easily define variables.

In native tensorflow, creating a variable requires either a predefined value or an initialization mechanism (for example, random sampling from a Gaussian). Moreover, if a variable has to be created on a specific device such as a GPU, that has to be specified explicitly. To cut down the code needed for variable creation, TF-Slim provides in variables.py a set of thin wrapper functions that let callers define variables easily.

For example, to create a weights variable, initialize it using a truncated normal distribution, regularize it with an l2_loss and place it on the CPU, one need only declare the following:

For example, to create a weights variable, initialize it with a truncated normal distribution, regularize it with an l2 loss and place it on the CPU, the following declaration is all that is needed:

weights = slim.variable('weights',
                             shape=[10, 10, 3, 3],
                             initializer=tf.truncated_normal_initializer(stddev=0.1),
                             regularizer=slim.l2_regularizer(0.05),
                             device='/CPU:0')

Note that in native TensorFlow, there are two types of variables: regular variables and local (transient) variables. The vast majority of variables are regular variables: once created, they can be saved to disk using a saver. Local variables are those variables that only exist for the duration of a session and are not saved to disk.

Native tensorflow has two kinds of variables: regular variables and local (transient) variables. The vast majority are regular variables: once created, they can be saved to disk with a saver. Local variables only exist for the duration of a session and are not saved to disk.
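
To make the distinction concrete, here is a small sketch of my own (assuming TF 1.x with tf.contrib available); the variable names are illustrative:

import tensorflow as tf

# A regular variable: picked up by tf.train.Saver and saved to checkpoints.
regular = tf.Variable(tf.zeros([3]), name='regular_var')

# A local (transient) variable: lives only for the duration of a session.
local = tf.contrib.framework.local_variable(tf.zeros([3]), name='local_var')

saver = tf.train.Saver()  # with no arguments, only the regular (global) variables are saved
print([v.op.name for v in tf.global_variables()])  # ['regular_var']
print([v.op.name for v in tf.local_variables()])   # ['local_var']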

TF-Slim further differentiates variables by defining model variables, which are variables that represent parameters of a model. Model variables are trained or fine-tuned during learning and are loaded from a checkpoint during evaluation or inference. Examples include the variables created by a slim.fully_connected or slim.conv2d layer. Non-model variables are all other variables that are used during learning or evaluation but are not required for actually performing inference. For example, the global_step is a variable used during learning and evaluation but it is not actually part of the model. Similarly, moving average variables might mirror model variables, but the moving averages are not themselves model variables.


By defining model variables, TF-Slim draws a further distinction between variables: model variables are the ones that represent a model's parameters. They are trained or fine-tuned during learning and loaded from a checkpoint during evaluation or inference; the variables created by slim.fully_connected or slim.conv2d layers are examples. Non-model variables are all the other variables used during learning or evaluation that are not needed to actually perform inference. For example, global_step is used during learning and evaluation but is not part of the model itself. Likewise, moving-average variables may mirror model variables, but the moving averages themselves are not model variables. (A moving average is a tool from time-series analysis, commonly computed over prices, returns or trading volume; it smooths out short-term fluctuations to reveal longer-term trends or cycles, and mathematically it can be seen as a convolution.)

Both model variables and regular variables can be easily created and retrieved via TF-Slim:

With TF-Slim, both model variables and regular variables can be created and retrieved easily:

# Model Variables
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()

How does this work? When you create a model variable via TF-Slim's layers or directly via the slim.model_variable function, TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have your own custom layers or variable creation routine but still want TF-Slim to manage or be aware of your model variables? TF-Slim provides a convenience function for adding the model variable to its collection:


How does this work? When you create a model variable through TF-Slim's layers, or directly with slim.model_variable(), TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have your own custom layers or variable-creation routine but still want TF-Slim to manage, or at least be aware of, your model variables? TF-Slim provides a convenience function for adding a model variable to its collection:

my_model_variable = CreateViaCustomCode()

# Letting TF-Slim know about the additional variable.
slim.add_model_variable(my_model_variable)
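
As a quick sanity check (my own addition, not part of the original snippet), the registered variable now shows up through the standard getter:

# my_model_variable was registered above via slim.add_model_variable.
assert my_model_variable in slim.get_model_variables()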

Layers

While the set of TensorFlow operations is quite extensive, developers of neural networks typically think of models in terms of higher level concepts like "layers", "losses", "metrics", and "networks". A layer, such as a Convolutional Layer, a Fully Connected Layer or a BatchNorm Layer are more abstract than a single TensorFlow operation and typically involve several operations. Furthermore, a layer usually (but not always) has variables (tunable parameters) associated with it, unlike more primitive operations. For example, a Convolutional Layer in a neural network is composed of several low level operations:

Although the set of TensorFlow operations is extremely broad, neural-network developers usually think about models in terms of higher-level concepts such as layers, losses, metrics and networks. A layer, for example a convolutional layer, a fully connected layer or a BatchNorm layer (for an introduction to BatchNorm see the post "Batch Normalization导读" on CSDN), is more abstract than a single TensorFlow operation and typically involves several operations. Moreover, unlike the more primitive operations, a layer usually (though not always) has variables (tunable parameters) associated with it. For example, a convolutional layer in a neural network is composed of several low-level operations:

  1. Creating the weight and bias variables

Create the weight and bias variables.

  2. Convolving the weights with the input from the previous layer

Convolve the weights with the input from the previous layer.

  3. Adding the biases to the result of the convolution

Add the biases to the result of the convolution.

  4. Applying an activation function

Apply an activation function.

Using only plain TensorFlow code, this can be rather laborious:

Doing this with only plain Tensorflow code is rather laborious:

input = ...
with tf.name_scope('conv1_1') as scope:
     kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                           stddev=1e-1), name='weights')
     conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
     biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                       trainable=True, name='biases')
     bias = tf.nn.bias_add(conv, biases)
     conv1 = tf.nn.relu(bias, name=scope)

To alleviate the need to duplicate this code repeatedly, TF-Slim provides a number of convenient operations defined at the more abstract level of neural network layers. For example, compare the code above to an invocation of the corresponding TF-Slim code:

To cut down on how often this code has to be copied, TF-Slim provides a large number of convenient operations defined at the more abstract level of neural-network layers. For example, compare the code above with the corresponding TF-Slim call:

input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')

TF-Slim provides standard implementations for numerous components for building neural networks. These include:

  • BiasAdd: slim.bias_add
  • BatchNorm: slim.batch_norm
  • Conv2d: slim.conv2d
  • Conv2dInPlane: slim.conv2d_in_plane
  • Conv2dTranspose (Deconv): slim.conv2d_transpose
  • FullyConnected: slim.fully_connected
  • AvgPool2D: slim.avg_pool2d
  • Dropout: slim.dropout
  • Flatten: slim.flatten
  • MaxPool2D: slim.max_pool2d
  • OneHotEncoding: slim.one_hot_encoding
  • SeparableConv2d: slim.separable_conv2d
  • UnitNorm: slim.unit_norm

To build neural networks, TF-Slim provides standard implementations of a large number of components; the layers and their TF-Slim wrappers are listed above.
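
As a small illustration of a few of these wrappers working together, here is a hedged sketch of my own of a tiny classifier head; the shapes, layer sizes and scope names are made up for the example:

import tensorflow as tf

slim = tf.contrib.slim

images = tf.placeholder(tf.float32, [None, 28, 28, 1])
labels = tf.placeholder(tf.int64, [None])

net = slim.conv2d(images, 32, [5, 5], scope='conv1')                        # Conv2d
net = slim.max_pool2d(net, [2, 2], scope='pool1')                           # MaxPool2D
net = slim.flatten(net)                                                     # Flatten
logits = slim.fully_connected(net, 10, activation_fn=None, scope='logits')  # FullyConnected
onehot_labels = slim.one_hot_encoding(labels, 10)                           # OneHotEncoding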

TF-Slim also provides two meta-operations called repeat and stack that allow users to repeatedly perform the same operation. For example, consider the following snippet from the VGG network whose layers perform several convolutions in a row between pooling layers:

TF-Slim also provides two meta-operations, repeat and stack, that let users perform the same operation repeatedly. For example, consider the following snippet from the VGG network, which performs several convolutions in a row between pooling layers:

net = ...
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')

One way to reduce this code duplication would be via a for loop:

One way to reduce this duplication is a for loop:

net = ...
for i in range(3):
     net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')

This can be made even cleaner by using TF-Slim's repeat operation:

Using TF-Slim's repeat operation makes this cleaner still:


net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')

Notice that the slim.repeat not only applies the same argument in-line, it also is smart enough to unroll the scopes such that the scopes assigned to each subsequent call of slim.conv2d are appended with an underscore and iteration number. More concretely, the scopes in the example above would be named 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3'.

slim.repeat not only applies the same arguments in-line, it is also smart enough to unroll the scopes: each subsequent call of slim.conv2d gets a scope with an underscore and the iteration number appended. Concretely, the scopes in the example above are named 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3'.
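
The scope naming is easy to verify; the following small sketch of my own (TF 1.x, tf.contrib.slim assumed, shapes illustrative) prints the variable names that slim.repeat creates:

import tensorflow as tf

slim = tf.contrib.slim

inputs = tf.placeholder(tf.float32, [None, 32, 32, 64])
net = slim.repeat(inputs, 3, slim.conv2d, 256, [3, 3], scope='conv3')

# Prints names under conv3/conv3_1/..., conv3/conv3_2/... and conv3/conv3_3/...
for v in slim.get_model_variables():
    print(v.op.name)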

Furthermore, TF-Slim's slim.stack operator allows a caller to repeatedly apply the same operation with different arguments to create a stack or tower of layers. slim.stack also creates a new tf.variable_scope for each operation created. For example, a simple way to create a Multi-Layer Perceptron (MLP):

In addition, TF-Slim's slim.stack lets a caller apply the same operation repeatedly with different arguments to create a stack or tower of layers. slim.stack also creates a new tf.variable_scope for every operation it creates. For example, a simple way to build a multi-layer perceptron (MLP):

# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')

# Equivalent, TF-Slim way using slim.stack:
slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')

In this example, slim.stack calls slim.fully_connected three times passing the output of one invocation of the function to the next. However, the number of hidden units in each invocation changes from 32 to 64 to 128. Similarly, one can use stack to simplify a tower of multiple convolutions:

In this example, slim.stack calls slim.fully_connected three times, feeding the output of one call into the next, while the number of hidden units changes from 32 to 64 to 128 across the calls. Similarly, stack can be used to simplify a tower of multiple convolutions:

# Verbose way:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')

# Using stack:
slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')

Scopes

In addition to the types of scope mechanisms in TensorFlow (name_scope, variable_scope), TF-Slim adds a new scoping mechanism called arg_scope. This new scope allows a user to specify one or more operations and a set of arguments which will be passed to each of the operations defined in the arg_scope. This functionality is best illustrated by example. Consider the following code snippet:

Besides TensorFlow's scoping mechanisms (name_scope, variable_scope), TF-Slim adds a new one called arg_scope. It lets a developer specify one or more operations together with a set of arguments, and that set of arguments is passed to every one of the operations defined inside the arg_scope. Its behaviour is easiest to show with an example. Consider the following snippet:

net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')

It should be clear that these three convolution layers share many of the same hyperparameters. Two have the same padding, all three have the same weights_initializer and weights_regularizer. This code is hard to read and contains a lot of repeated values that should be factored out. One solution would be to specify default values using variables:

Clearly, these three convolutional layers share many of the same hyperparameters: two use the same padding, and all three use the same weight initializer and weight regularizer. The code is hard to read and full of repeated values that ought to be factored out. One solution is to specify default values with variables:

padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')

This solution ensures that all three convolutions share the exact same parameter values but doesn't reduce completely the code clutter. By using anarg_scope, we can both ensure that each layer uses the same values and simplify the code:

This approach ensures that all three convolutions share exactly the same parameter values, but it does not completely remove the code clutter. By using an arg_scope we can both ensure that every layer uses the same values and simplify the code:

with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')

As the example illustrates, the use of arg_scope makes the code cleaner, simpler and easier to maintain. Notice that while argument values are specified in the arg_scope, they can be overwritten locally. In particular, while the padding argument has been set to 'SAME', the second convolution overrides it with the value of 'VALID'.

As the example shows, using arg_scope makes the code cleaner, simpler and easier to maintain. Note that although argument values are specified in the arg_scope, they can be overridden locally: while padding is set to 'SAME', the second convolution overrides it with 'VALID'.

One can also nest arg_scopes and use multiple operations in the same scope. For example:

Developers can also nest arg_scopes and use multiple operations in the same scope. For example:

with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
   with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
       net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
       net = slim.conv2d(net, 256, [5, 5],
                      weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                      scope='conv2')
       net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')

In this example, the first arg_scope applies the same weights_initializer and weights_regularizer arguments to the conv2d and fully_connected layers in its scope. In the second arg_scope, additional default arguments to conv2d only are specified.

In this example, the first arg_scope applies the same weight initializer and weight regularizer to the conv2d and fully_connected layers within its scope. In the second arg_scope, additional default arguments are specified for conv2d only.
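
A related pattern worth noting (this sketch is my own; the helper name my_conv_arg_scope is illustrative): an arg_scope can be captured in a helper function and reused when building a model, which is roughly how the bundled nets expose their *_arg_scope helpers:

import tensorflow as tf

slim = tf.contrib.slim

def my_conv_arg_scope(weight_decay=0.0005):
    # Capture the scope and return it so that callers can re-enter it later.
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_regularizer=slim.l2_regularizer(weight_decay)) as sc:
        return sc

inputs = tf.placeholder(tf.float32, [None, 224, 224, 3])
with slim.arg_scope(my_conv_arg_scope()):
    net = slim.conv2d(inputs, 64, [3, 3], scope='conv1')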

Working Example: Specifying the VGG16 Layers

By combining TF-Slim Variables, Operations and scopes, we can write a normally very complex network with very few lines of code. For example, the entire VGG architecture can be defined with just the following snippet:

By combining TF-Slim's variables, operations and scopes, a network that would normally be quite complex can be written in only a few lines of code. For example, the entire VGG architecture can be defined with just the following snippet:

def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      activation_fn=tf.nn.relu,
                      weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                      weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
    return net

Training Models

Training Tensorflow models requires a model, a loss function, the gradient computation and a training routine that iteratively computes the gradients of the model weights relative to the loss and updates the weights accordingly. TF-Slim provides both common loss functions and a set of helper functions that run the training and evaluation routines.

Training a Tensorflow model requires a model, a loss function, the gradient computation and a training routine that iteratively computes the gradients of the model weights with respect to the loss and updates the weights accordingly. TF-Slim provides both common loss functions and a set of helper functions that run the training and evaluation routines.

Losses

The loss function defines a quantity that we want to minimize. For classification problems, this is typically the cross entropy between the true distribution and the predicted probability distribution across classes. For regression problems, this is often the sum-of-squares differences between the predicted and true values.

The loss function defines a quantity that we want to minimize. For classification problems this is typically the cross entropy between the true distribution and the predicted probability distribution over the classes. For regression problems it is usually the sum of squared differences between the predicted and true values.
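
For the classification case, the following sketch of my own (shapes are illustrative) spells out the cross entropy that slim's wrapper computes, up to weighting and reduction details:

import tensorflow as tf

slim = tf.contrib.slim

logits = tf.placeholder(tf.float32, [None, 5])          # unnormalized class scores
onehot_labels = tf.placeholder(tf.float32, [None, 5])   # true distribution per example

# Cross entropy between the true and predicted distributions, written out by hand.
manual = tf.reduce_mean(
    -tf.reduce_sum(onehot_labels * tf.nn.log_softmax(logits), axis=1))

# slim's wrapper computes essentially the same quantity and also registers the
# loss in TF-Slim's loss collection (see the Losses examples below).
wrapped = slim.losses.softmax_cross_entropy(logits, onehot_labels)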

Certain models, such as multi-task learning models, require the use of multiple loss functions simultaneously. In other words, the loss function ultimately being minimized is the sum of various other loss functions. For example, consider a model that predicts both the type of scene in an image as well as the depth from the camera of each pixel. This model's loss function would be the sum of the classification loss and depth prediction loss.

Certain models, such as multi-task learning models, need several loss functions at the same time; in other words, the loss that is ultimately minimized is the sum of various other losses. Consider, for example, a model that predicts both the type of scene in an image and the depth from the camera at every pixel: its loss function is the sum of a classification loss and a depth-prediction loss.

TF-Slim provides an easy-to-use mechanism for defining and keeping track of loss functions via the losses module. Consider the simple case where we want to train the VGG network:

TF-Slim provides an easy-to-use mechanism, the losses module, for defining and keeping track of loss functions. Consider the simple case where we want to train the VGG network:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss functions and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)

In this example, we start by creating the model (using TF-Slim's VGG implementation), and add the standard classification loss. Now, lets turn to the case where we have a multi-task model that produces multiple outputs:

In this example, we start by creating the model (using TF-Slim's VGG implementation) and add the standard classification loss. Now let us turn to the case of a multi-task model that produces multiple outputs:

# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)

In this example, we have two losses which we add by calling slim.losses.softmax_cross_entropy and slim.losses.sum_of_squares. We can obtain the total loss by adding them together (total_loss) or by calling slim.losses.get_total_loss(). How did this work? When you create a loss function via TF-Slim, TF-Slim adds the loss to a special TensorFlow collection of loss functions. This enables you to either manage the total loss manually, or allow TF-Slim to manage them for you.

In this example we have two losses, one added by calling slim.losses.softmax_cross_entropy and the other by calling slim.losses.sum_of_squares. We can get the total loss either by adding the two together directly or by calling slim.losses.get_total_loss(). How does this work? When you create a loss function through TF-Slim, TF-Slim adds it to a special TensorFlow collection of loss functions. This lets you either manage the total loss by hand or let TF-Slim manage it for you.
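
A short sketch of my own of that collection bookkeeping, using the getters the losses module exposes (the placeholder shapes are illustrative):

import tensorflow as tf

slim = tf.contrib.slim

predictions = tf.placeholder(tf.float32, [None, 5])
onehot_labels = tf.placeholder(tf.float32, [None, 5])

# Creating a loss through slim.losses.* also registers it in a TensorFlow collection.
slim.losses.softmax_cross_entropy(predictions, onehot_labels)

losses = slim.losses.get_losses()                     # the individual loss tensors
reg_losses = slim.losses.get_regularization_losses()  # weight-regularization terms
total = slim.losses.get_total_loss()                  # their sum, by default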

What if you want to let TF-Slim manage the losses for you but have a custom loss function? loss_ops.py also has a function that adds this loss to TF-Slims collection. For example:


What if you want TF-Slim to manage the losses for you but you have a custom loss function? loss_ops.py also has a function that adds such a loss to TF-Slim's collection. For example:

# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions and get the total loss.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss) # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization Loss is included in the total loss by default).
total_loss2 = slim.losses.get_total_loss()

In this example, we can again either produce the total loss function manually or let TF-Slim know about the additional loss and let TF-Slim handle the losses.

In this example, we can again either compute the total loss by hand, or tell TF-Slim about the additional loss and let TF-Slim handle all of the losses.

Training Loop

TF-Slim provides a simple but powerful set of tools for training models found in learning.py. These include a Train function that repeatedly measures the loss, computes gradients and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once we've specified the model, the loss function and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization:

TF-Slim provides a simple but very powerful set of tools for training models, found in learning.py. It includes a train function that repeatedly measures the loss, computes gradients and saves the model to disk, along with several convenience functions for manipulating gradients. For example, once the model, the loss function and the optimization scheme have been specified, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization:

g = tf.Graph()

# Create the model and specify the losses...
...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ... # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)

In this example, slim.learning.train is provided with the train_op which is used to (a) compute the loss and (b) apply the gradient step. logdir specifies the directory where the checkpoints and event files are stored. We can limit the number of gradient steps taken to any number. In this case, we've asked for 1000 steps to be taken. Finally, save_summaries_secs=300 indicates that we'll compute summaries every 5 minutes and save_interval_secs=600 indicates that we'll save a model checkpoint every 10 minutes.

In this example, the train_op passed to slim.learning.train is used both to compute the loss and to apply the gradient step. logdir specifies the directory where the checkpoint and event files are saved. The number of gradient steps can be capped at any value; here we ask for 1000 steps. Finally, save_summaries_secs=300 means summaries are computed every five minutes, and save_interval_secs=600 means a model checkpoint is saved every ten minutes.
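
The comment in the snippet above mentions update_ops; as a hedged sketch of my own, they can also be passed to create_train_op explicitly (total_loss and optimizer are the tensors from the example above):

# Ops collected in tf.GraphKeys.UPDATE_OPS (e.g. batch-norm moving-average updates)
# can be handed to create_train_op directly instead of relying on the default collection.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
train_op = slim.learning.create_train_op(total_loss, optimizer, update_ops=update_ops)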

Working Example: Training the VGG16 Model

To illustrate this, lets examine the following sample of training the VGG network:

To illustrate these training tools, let us look at the following example of training the VGG network:

import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

...

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
  tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
  # Set up the data loading:
  images, labels = ...

  # Define the model:
  predictions = vgg.vgg_16(images, is_training=True)

  # Specify the loss function:
  slim.losses.softmax_cross_entropy(predictions, labels)

  total_loss = slim.losses.get_total_loss()
  tf.summary.scalar('losses/total_loss', total_loss)

  # Specify the optimization scheme:
  optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

  # create_train_op that ensures that when we evaluate it to get the loss,
  # the update_ops are done and the gradient updates are computed.
  train_tensor = slim.learning.create_train_op(total_loss, optimizer)

  # Actually runs training.
  slim.learning.train(train_tensor, train_log_dir)


Tags

Deep learning, neural networks, TensorFlow, convolutional neural networks