Initialization
Discover how the ResNet model is structured for each of its configurations.
We'll cover the following
Chapter Goals:
- Learn about the ResNet model structure
- Understand the difference between channels-first and channels-last
A. Model overview
The ResNet model follows a repetitive structure built around its main building block. The overall model contains four layers of blocks, where the number of blocks per layer depends on how large we want our ResNet model to be. In our code, the block_layer_sizes
dictionary gives the number of blocks for each of the four block layers, based on the total number of weight layers in the ResNet model.
There are six options for how many weight layers (convolution + one fully-connected layer) are in the ResNet model: 18, 34, 50, 101, 152, 200. When the number of layers is less than 50, the model uses a regular building block. However, when the number of layers is 50 or more, the model uses a bottleneck block. We’ll discuss what a bottleneck block is in a later chapter.
B. Filters and strides
The number of filters used in the model’s convolution layers starts off at 64. In each subsequent block layer, we double the number of filters used. This follows the practice of increasing the number of filters for convolution layers deeper in the model.
Most of the blocks in ResNet use a stride size of 1 for their convolution layers. Only the first block of each block layer may use a different stride size. The stride sizes for the first block in each layer are defined in self.block_strides
.
C. Channel placement
So far, all our model data has been in the NHWC format. This is referred to as channels-last placement because the channels are the last dimension. However, TensorFlow also supports channels-first placement, i.e. NCHW format. Since the NCHW format is optimized for GPU training and the NHWC format is a bit better when using CPUs, we create our model to allow for both formats.
When we initialize our model, the channel format ('channels_first'
or 'channels_last'
) is specified through the data_format
argument. This matches the data_format
keyword argument that can be used in tf.layers.conv2d
, tf.layers.max_pooling2d
, and tf.layers.average_pooling2d
.
Time to Code!
In this section of the course, you’ll be creating the ResNetModel
class, which represents a ResNet model.
Specifically in this chapter, you’ll be completing the initialization of the ResNetModel
. The __init__
function is already completed with some variable initialization. Your task will be to set up the parts specific to the ResNet model: the blocks per layer and bottleneck usage.
We’ll first set up the number of blocks to use per block layer.
Set self.block_layer_sizes
equal to block_layer_sizes[num_layers]
.
Now we set self.bottleneck
to a boolean, which is True
if the model uses bottleneck blocks and False
if it uses regular blocks.
Set self.bottleneck
equal to num_layers >= 50
.
import tensorflow as tfblock_layer_sizes = {18: [2, 2, 2, 2],34: [3, 4, 6, 3],50: [3, 4, 6, 3],101: [3, 4, 23, 3],152: [3, 8, 36, 3],200: [3, 24, 36, 3]}class ResNetModel(object):# Model Initializationdef __init__(self, min_aspect_dim, resize_dim, num_layers, output_size,data_format='channels_last'):self.min_aspect_dim = min_aspect_dimself.resize_dim = resize_dimself.filters_initial = 64self.block_strides = [1, 2, 2, 2]self.data_format = data_formatself.output_size = output_size# CODE HERE
Get hands-on with 1300+ tech skills courses.