Parsing Examples
Set up a function for parsing Example objects with and without labels.
Chapter Goals:
- Create a function to parse feature data from serialized Example objects
A. Extracting feature data
The input pipeline consists of reading serialized Example objects from the TFRecords file and parsing feature data from the serialized Examples. We parse feature data for a single Example (which represents data for one DataFrame row) using the tf.io.parse_single_example function.
    import tensorflow as tf

    def create_example_spec(has_labels):
        example_spec = {}
        int_vals = ['Store', 'Dept', 'IsHoliday', 'Size']
        float_vals = ['Temperature', 'Fuel_Price', 'CPI', 'Unemployment']
        if has_labels:
            float_vals.append('Weekly_Sales')
        for feature_name in int_vals:
            example_spec[feature_name] = tf.io.FixedLenFeature((), tf.int64)
        for feature_name in float_vals:
            example_spec[feature_name] = tf.io.FixedLenFeature((), tf.float32)
        example_spec['Type'] = tf.io.FixedLenFeature((), tf.string)
        return example_spec

    example_spec = create_example_spec(True)
    parsed_example = tf.io.parse_single_example(ser_ex, example_spec)
    print(parsed_example)
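The snippet above assumes that ser_ex already holds a serialized Example read from the TFRecords file. As a minimal sketch of where such a value could come from, the block below builds one by hand and parses it with a reduced spec. The helper name make_serialized_example and all feature values are hypothetical and chosen only for illustration; they are not taken from the dataset.

```python
import tensorflow as tf

def make_serialized_example():
    # Hypothetical helper: builds one Example with made-up values
    # for three of the lesson's feature names.
    feature = {
        'Store': tf.train.Feature(
            int64_list=tf.train.Int64List(value=[1])),
        'Temperature': tf.train.Feature(
            float_list=tf.train.FloatList(value=[42.5])),
        'Type': tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[b'A'])),
    }
    example = tf.train.Example(features=tf.train.Features(feature=feature))
    # Serialize the protocol buffer to a byte string
    return example.SerializeToString()

# Reduced spec: one scalar entry per feature, with the matching dtype
example_spec = {
    'Store': tf.io.FixedLenFeature((), tf.int64),
    'Temperature': tf.io.FixedLenFeature((), tf.float32),
    'Type': tf.io.FixedLenFeature((), tf.string),
}

ser_ex = make_serialized_example()
parsed_example = tf.io.parse_single_example(ser_ex, example_spec)
print(parsed_example)
```

The result is a dictionary that maps each feature name in the spec to a scalar tensor of the declared type.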
B. Labeled data
For both training and evaluation, we require the data to be labeled in order to calculate the loss for our machine learning model. Since our model is trained to predict weekly sales, we use the 'Weekly_Sales' feature as the label for each data observation.
This means that every feature apart from 'Weekly_Sales' is used as input for the model, which then predicts a 'Weekly_Sales' value based on these input features. The value is compared to the actual 'Weekly_Sales' label to assess the model’s performance.
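The split between input features and label can be illustrated with an ordinary Python dictionary before any TensorFlow is involved. This is only a sketch: the function name split_label and the row values are made up for illustration, and only the feature names come from the lesson.

```python
# Made-up values for one row; only the keys follow the lesson's feature names
row = {
    'Store': 1,
    'Dept': 1,
    'IsHoliday': 0,
    'Size': 151315,
    'Temperature': 42.31,
    'Fuel_Price': 2.572,
    'CPI': 211.096,
    'Unemployment': 8.106,
    'Type': 'A',
    'Weekly_Sales': 24924.5,
}

def split_label(row, label_key='Weekly_Sales'):
    # Hypothetical helper: everything except the label key is a model input
    inputs = {name: value for name, value in row.items() if name != label_key}
    label = row[label_key]
    return inputs, label

inputs, label = split_label(row)
print(sorted(inputs))  # input feature names, without 'Weekly_Sales'
print(label)           # the value the model is trained to predict
```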
Time to Code!
In this chapter you’ll be completing the parse_features function, which parses features from a serialized Example object.
We first use the input Example spec argument, example_spec, to parse a dictionary of features from the input serialized Example, ser_ex.
Set parsed_features equal to tf.io.parse_single_example with ser_ex and example_spec as the required arguments.
The 'Weekly_Sales' feature is not actually used as an input for the machine learning model. Instead, it is used as a label during training and evaluation.
Therefore, we’ll separate the 'Weekly_Sales' feature from the rest of the features (if it’s even present).
Create a dictionary called features which is identical to parsed_features, but does not contain the 'Weekly_Sales' key.
If has_labels is False (meaning the 'Weekly_Sales' feature is not present), we’ll just return the rest of the features.
Create an if block that checks if has_labels is False. Inside the if block, return features.
If has_labels is True, we can extract the 'Weekly_Sales' feature from the parsed features, which is used as a label.
Outside the if block, set label equal to the value in the parsed_features dictionary corresponding to 'Weekly_Sales'. Return the tuple (features, label).
    import tensorflow as tf

    # Helper function to convert serialized Example objects into features
    def parse_features(ser_ex, example_spec, has_labels):
        # CODE HERE
        pass
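For reference after attempting the exercise, here is one possible way the steps above could come together. It is a sketch rather than the official solution. The toy Example and the two-entry spec used to try it out are hypothetical and much smaller than the full spec from create_example_spec.

```python
import tensorflow as tf

# Helper function to convert serialized Example objects into features
def parse_features(ser_ex, example_spec, has_labels):
    # Parse a dictionary of feature tensors from the serialized Example
    parsed_features = tf.io.parse_single_example(ser_ex, example_spec)
    # Copy every entry except the label
    features = {k: parsed_features[k] for k in parsed_features
                if k != 'Weekly_Sales'}
    if not has_labels:
        return features
    label = parsed_features['Weekly_Sales']
    return features, label

# Toy serialized Example with made-up values (illustration only)
toy_example = tf.train.Example(features=tf.train.Features(feature={
    'Store': tf.train.Feature(int64_list=tf.train.Int64List(value=[1])),
    'Weekly_Sales': tf.train.Feature(
        float_list=tf.train.FloatList(value=[100.0])),
}))
toy_ser_ex = toy_example.SerializeToString()

# Labeled case: the spec includes 'Weekly_Sales'
labeled_spec = {
    'Store': tf.io.FixedLenFeature((), tf.int64),
    'Weekly_Sales': tf.io.FixedLenFeature((), tf.float32),
}
features, label = parse_features(toy_ser_ex, labeled_spec, True)

# Unlabeled case: the spec omits 'Weekly_Sales', so only features come back
unlabeled_spec = {'Store': tf.io.FixedLenFeature((), tf.int64)}
features_only = parse_features(toy_ser_ex, unlabeled_spec, False)
```

Note that the return type differs between the two cases: a (features, label) tuple when has_labels is True, and a plain dictionary otherwise, so callers need to know which mode they requested.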