Initialization
Learn about the CIFAR-10 dataset along with the input and output sizes of the data.
Chapter Goals:
- Learn about the CIFAR-10 dataset
- Initialize the model with data dimensions
A. CIFAR-10
The CIFAR-10 (Canadian Institute for Advanced Research) dataset contains 60,000 color images with dimensions 32x32. The images are distributed evenly across 10 categories: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck. We split the dataset into 50,000 images for training and 10,000 images for testing.
The CIFAR-10 dataset is available for download on Alex Krizhevsky's website, along with the 100-category version, CIFAR-100.
(Fun fact: Alex Krizhevsky helped invent the AlexNet model mentioned in the previous chapter, which incidentally was named after him. The LeNet model from the CNN section was also named after its inventor, Yann LeCun. SqueezeNet, however, is not named after its developers.)
B. Inputs and labels
Each image has dimensions 32x32, meaning original_dim will be 32 for the CIFAR-10 dataset. The images have color, so they follow the RGB format, meaning each pixel contains three integers (one each for the red, green, and blue channels). In total, an image is represented by 3 x 32 x 32 = 3072 integers between 0 and 255. The first 1024 integers represent the red channel pixel values, the next 1024 represent the green channel pixel values, and the final 1024 represent the blue channel pixel values.
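To make the flat layout concrete, here is a minimal sketch in plain Python. It assumes the channel order of the original CIFAR-10 release (red, then green, then blue) with each channel stored row by row. The helper pixel_rgb and the sample image are hypothetical names used only for illustration.

```python
# Assumed layout: 1024 red values, then 1024 green, then 1024 blue,
# with each channel stored row by row for a 32x32 image.
ORIGINAL_DIM = 32
CHANNEL_SIZE = ORIGINAL_DIM ** 2  # 1024 values per channel


def pixel_rgb(flat_image, row, col):
    """Return the (red, green, blue) values of the pixel at (row, col)."""
    offset = row * ORIGINAL_DIM + col
    return tuple(flat_image[c * CHANNEL_SIZE + offset] for c in range(3))


# Hypothetical flat image: red channel all 255, green all 128, blue all 0.
flat_image = [255] * CHANNEL_SIZE + [128] * CHANNEL_SIZE + [0] * CHANNEL_SIZE

print(len(flat_image))              # 3072
print(pixel_rgb(flat_image, 5, 7))  # (255, 128, 0)
```

Each channel value for a pixel sits exactly 1024 positions after the previous channel's value, which is why the total length is 3 * 32**2.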
The labels this time are just single integers corresponding to the class index of the image, rather than one-hot vectors. This is referred to as a sparse representation of the labels. However, output_size is still equal to the number of image categories, in this case 10.
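The difference between the sparse and one-hot representations can be sketched as follows. The helper to_one_hot and the sample label values are hypothetical; the lesson's model consumes the sparse labels directly, so this conversion is shown only for comparison.

```python
OUTPUT_SIZE = 10  # number of CIFAR-10 categories


def to_one_hot(sparse_labels, output_size=OUTPUT_SIZE):
    """Expand integer class indices into one-hot vectors of length output_size."""
    return [
        [1 if i == label else 0 for i in range(output_size)]
        for label in sparse_labels
    ]


# Hypothetical sparse labels: one integer class index per image.
sparse_labels = [3, 0, 9]
one_hot_labels = to_one_hot(sparse_labels)

print(sparse_labels)      # [3, 0, 9]
print(one_hot_labels[0])  # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

Both forms carry the same information, but the sparse form stores one integer per image instead of output_size values.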
For a batch of input data, the shape of inputs is (batch_size, 3 * self.original_dim**2) and the shape of labels is (batch_size,), where batch_size represents the size of the batch. Due to the sparse representation, labels is a 1-D tensor (1-D tensors have a trailing comma in their shape).
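As a quick check of these shapes, the sketch below builds them as ordinary Python tuples. The values 32 for original_dim and batch_size are taken from the lesson; no TensorFlow is needed to see why the trailing comma matters.

```python
original_dim = 32
batch_size = 32

# Shape of a batch of flattened images: one row of 3072 integers per image.
inputs_shape = (batch_size, 3 * original_dim ** 2)

# Shape of a batch of sparse labels: the trailing comma makes this a
# one-element tuple, i.e. the shape of a 1-D tensor.
labels_shape = (batch_size,)

print(inputs_shape)       # (32, 3072)
print(labels_shape)       # (32,)
print(len(labels_shape))  # 1

# Without the trailing comma, the parentheses are just grouping and the
# result is a plain integer rather than a shape.
print(isinstance((batch_size), int))  # True
```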
import tensorflow as tf

# dataset is assumed to be an existing tf.data.Dataset of (inputs, labels) pairs
batch_size = 32
dataset = dataset.batch(batch_size)
it = tf.compat.v1.data.make_one_shot_iterator(dataset)
inputs, labels = it.get_next()
with tf.compat.v1.Session() as sess:
    # Batch of data size 32
    input_arr, label_arr = sess.run((inputs, labels))
In the example, inputs represents the input data tensor, while labels represents the 1-D label tensor. The batch size is set to 32.
Time to Code!
In this section of the course you'll be creating a Python class, SqueezeNetModel, that represents the SqueezeNet model you'll be building.
We'll set the original height/width dimension of the image data (original_dim) and the number of classes (output_size).
Inside the __init__ function, set self.original_dim equal to original_dim and set self.output_size equal to output_size.
In the next chapter we'll be doing image processing, so we need to set the resized dimension of the images.
Set self.resize_dim equal to resize_dim.
import tensorflow as tf

class SqueezeNetModel(object):
    # Model Initialization
    def __init__(self, original_dim, resize_dim, output_size):
        # CODE HERE
        pass
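For reference, one possible way to complete the exercise is sketched below. It follows the three assignments described above and nothing more. The resize_dim value of 128 in the usage lines is a hypothetical placeholder, not a value given in this chapter, and the TensorFlow import is left out because the constructor alone does not need it.

```python
class SqueezeNetModel(object):
    # Model Initialization
    def __init__(self, original_dim, resize_dim, output_size):
        # Height/width of the raw images (32 for CIFAR-10)
        self.original_dim = original_dim
        # Height/width after the resizing step covered in the next chapter
        self.resize_dim = resize_dim
        # Number of image categories (10 for CIFAR-10)
        self.output_size = output_size


# Usage with CIFAR-10 dimensions; resize_dim=128 is an illustrative assumption.
model = SqueezeNetModel(original_dim=32, resize_dim=128, output_size=10)
print(3 * model.original_dim ** 2)  # 3072
print(model.output_size)            # 10
```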