Densely Connected Convolutional Networks

Preface

In the article on residual networks we saw that an important reason residual networks can be used in very deep models is that information passes from one layer to the next essentially without loss, both for the activations in the forward pass and for the gradients in the backward pass. If our goal is to guarantee unobstructed information flow, however, the residual network's design of stacking residual blocks is not the structure best suited to it.

Starting from this principle of information flow, the simplest idea is to feed every convolution in the network with all the features of the layers below it, i.e., to add $\frac{L(L+1)}{2}$ shortcut connections to an $L$-layer network, as shown in Figure 1. To better preserve the low-level features, DenseNet concatenates the outputs of the different layers, whereas the residual network uses element-wise addition. This is the motivation behind DenseNet.

Figure 1: Design of a Dense Block in DenseNet
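To make the difference between the two kinds of shortcut concrete, the snippet below (an illustrative sketch, not from the original text; it only assumes the standard Keras Add and Concatenate layers) builds both on a 16-channel feature map:

from tensorflow.keras.layers import Input, Conv2D, Add, Concatenate

inputs = Input(shape=(32, 32, 16))                                # a 16-channel feature map
features = Conv2D(16, (3, 3), padding='same', activation='relu')(inputs)

residual = Add()([inputs, features])             # ResNet shortcut: element-wise sum, still 16 channels
dense = Concatenate(axis=3)([inputs, features])  # DenseNet shortcut: concatenation, now 32 channels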

1. DenseNet Algorithm Analysis and Implementation

If DenseNet used the structure of Figure 1 throughout the whole network, the input of the $L$-th layer would be the concatenation of all preceding feature maps. Given the memory / GPU-memory constraints of current hardware, such a design clearly cannot be applied to very deep models, so DenseNet instead stacks Dense Blocks as shown in Figure 2. Below we analyze the DenseNet algorithm in detail with reference to Figure 2.

Figure 2: The DenseNet network architecture

1.1 Dense Block

Figure 1 shows a single Dense Block. Inside a Dense Block, the input $x_l$ of the $l$-th layer is formed from the outputs of all preceding layers in the block:

$$x_l = [y_0, y_1, \dots, y_{l-1}]$$

$$y_l = H_l(x_l)$$

Here the bracket $[y_0, y_1, \dots, y_{l-1}]$ denotes concatenation, i.e., the $l$ inputs $y_0, \dots, y_{l-1}$ are concatenated along the feature-map (channel) dimension into a single tensor, and $H_l(\cdot)$ is the composite function. In my implementation I use the variable stored_features to store the output of each composite function.

from tensorflow.keras.layers import concatenate

def dense_block(x, depth=5, growth_rate=3):
    # Each node applies the composite function to everything produced so far
    # and appends its growth_rate new feature maps to the running concatenation.
    stored_features = x
    for i in range(depth):
        feature = composite_function(stored_features, growth_rate=growth_rate)
        stored_features = concatenate([stored_features, feature], axis=3)
    return stored_features

1.2 Composite Function

The composite function sits at every node of a Dense Block. Its input is the concatenated feature maps and its output is the result of passing them through the three steps BN -> ReLU -> 3*3 convolution, where the number of feature maps produced by the convolution is the growth rate. In DenseNet the growth rate $k$ is usually a small integer; the paper uses $k=12$. The number of concatenated input feature maps, however, is typically large, so to improve computational efficiency DenseNet first applies a $1\times1$ convolution to reduce the input to $4k$ feature maps and then extracts features with a $3\times3$ convolution. The authors standardize this as BN -> ReLU -> 1*1 convolution -> BN -> ReLU -> 3*3 convolution, and a network with this structure is called DenseNet-B.

from tensorflow.keras.layers import BatchNormalization, Activation, Conv2D

def composite_function(x, growth_rate):
    # DenseNetB is a global flag: when True, a 1x1 bottleneck convolution first
    # reduces the concatenated input to 4 * growth_rate feature maps (DenseNet-B).
    if DenseNetB:
        x = BatchNormalization()(x)
        x = Activation('relu')(x)
        x = Conv2D(kernel_size=(1, 1), strides=1, filters=4 * growth_rate, padding='same')(x)
    # BN -> ReLU -> 3x3 convolution producing growth_rate feature maps
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    output = Conv2D(kernel_size=(3, 3), strides=1, filters=growth_rate, padding='same')(x)
    return output

1.3 Growth Rate

The growth rate $k$ is a hyper-parameter of DenseNet that reflects how fast the input size of each node in a Dense Block grows. Every node in a Dense Block outputs $k$ feature maps. If the input to the whole Dense Block has $k_0$ feature maps, the input to the $l$-th node has $k_0 + k\times(l-1)$ feature maps. The authors verified experimentally that a fairly small value works well and set $k=12$ in the paper.
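As a quick check of this arithmetic, the snippet below (illustrative only; $k_0 = 16$ and $k = 12$ are assumed example values) prints the number of input feature maps seen by each node in a five-node block:

# Channel bookkeeping inside one Dense Block: node l receives k0 + k * (l - 1) feature maps.
k0, k, depth = 16, 12, 5

for l in range(1, depth + 1):
    print(f"node {l}: {k0 + k * (l - 1)} input feature maps -> {k} output feature maps")

# The block's concatenated output carries k0 + k * depth feature maps.
print("block output:", k0 + k * depth, "feature maps")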

1.4 Compression

This completes the description of the Dense Block. In Figure 2, the structure between two Dense Blocks is called the compression layer (the transition layer in the paper), which serves two purposes: reducing the number of feature maps and down-sampling. If a Dense Block outputs $m$ feature maps, the next Dense Block receives $\lfloor \theta m \rfloor$ of them, where the compression factor $\theta$ is a user-chosen hyper-parameter. When $\theta = 1$ the number of feature maps entering and leaving the transition is unchanged; when $\theta < 1$ the network is called DenseNet-C, and the paper uses $\theta = 0.5$. A DenseNet that uses both the bottleneck layers and compression is called DenseNet-BC. The pooling operation is $2\times2$ average pooling.

from tensorflow.keras.layers import Conv2D, AveragePooling2D, Flatten, Dense

def dense_net(input_image, nb_blocks=2):
    # Initial convolution before the first Dense Block
    x = Conv2D(kernel_size=(3, 3), filters=8, strides=1, padding='same', activation='relu')(input_image)
    for block in range(nb_blocks):
        x = dense_block(x, depth=NB_DEPTH, growth_rate=GROWTH_RATE)
        if not block == nb_blocks - 1:
            # Transition between blocks: 1x1 convolution compressing the feature maps by theta
            theta = COMPRESSION_FACTOR if DenseNetC else 1.0
            nb_transition_filter = int(int(x.shape[3]) * theta)
            x = Conv2D(kernel_size=(1, 1), filters=nb_transition_filter, strides=1, padding='same', activation='relu')(x)
        # 2x2 average pooling down-samples after every block
        x = AveragePooling2D(pool_size=(2, 2), strides=2)(x)
    # Classification head for 10 classes (MNIST)
    x = Flatten()(x)
    x = Dense(100, activation='relu')(x)
    outputs = Dense(10, activation='softmax', kernel_initializer='he_normal')(x)
    return outputs
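Note that the demo above simplifies the transition to a single ReLU convolution and applies pooling after every block. For reference, a transition layer closer to the paper's BN -> ReLU -> 1*1 convolution -> 2*2 average pooling formulation could look like the sketch below (my own illustrative code reusing the same Keras layers; not part of the original demo):

from tensorflow.keras.layers import BatchNormalization, Activation, Conv2D, AveragePooling2D

def transition_layer(x, theta=0.5):
    # Compress the m feature maps leaving a Dense Block down to floor(theta * m)
    nb_filters = int(int(x.shape[3]) * theta)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = Conv2D(kernel_size=(1, 1), strides=1, filters=nb_filters, padding='same')(x)
    # Halve the spatial resolution
    x = AveragePooling2D(pool_size=(2, 2), strides=2)(x)
    return x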

2. Analysis

DenseNet has the following advantages:

  1. Smoother information flow;

  2. Feature reuse;

  3. A narrower network (fewer feature maps per layer).

Because DenseNet has to keep the output of every node of a Dense Block in memory, a large DenseNet requires a very large amount of GPU memory, which is one reason why residual networks remain the mainstream choice in industry.

The demo code in this section is a DenseNet for the MNIST dataset; the complete code is available at:

https://github.com/senliuy/CNN-Structures/blob/master/DenseNet.ipynb
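To tie the pieces together, here is a minimal usage sketch (my own illustration; the hyper-parameter values below are assumptions, not taken from the notebook) that defines the global configuration flags used above and builds the model for 28x28 MNIST images:

from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model

# Global configuration flags referenced by the functions above (example values)
DenseNetB = True          # add the 1x1 bottleneck in the composite function
DenseNetC = True          # compress feature maps between blocks
COMPRESSION_FACTOR = 0.5  # theta
NB_DEPTH = 5              # nodes per Dense Block
GROWTH_RATE = 12          # k, the growth rate

inputs = Input(shape=(28, 28, 1))
outputs = dense_net(inputs, nb_blocks=2)
model = Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()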