Loss
Calculate the model's sigmoid cross entropy loss.
Chapter Goals:
- Calculate the model's loss using sigmoid cross entropy
A. Sigmoid cross entropy
The task for our model is to classify input text sequences as either negative (label 0) or positive (label 1). This is equivalent to binary classification. As with regular binary classification, we use sigmoid cross entropy to calculate the model's loss.
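As a quick refresher, for a binary label y ∈ {0, 1} and a model logit z, the sigmoid cross entropy loss for a single example is

$$\ell(y, z) = -\big[\,y\log\sigma(z) + (1 - y)\log\big(1 - \sigma(z)\big)\,\big], \qquad \sigma(z) = \frac{1}{1 + e^{-z}}$$

In TensorFlow, tf.nn.sigmoid_cross_entropy_with_logits computes a numerically stable equivalent of this expression directly from the logits, which is why we pass in raw logits rather than sigmoid probabilities.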
For an in-depth and intuitive explanation of sigmoid cross entropy and binary classification, check out the Machine Learning for Software Engineers course on Educative.
Time to Code!
In this chapter you'll be completing the calculate_loss function, which calculates model loss based on the outputs of the BiLSTM.
The first step in calculating the model's loss is to calculate the logits. We can use the calculate_logits function we completed in the previous chapter.

Set logits equal to self.calculate_logits applied with lstm_outputs, batch_size, and sequence_lengths as arguments.
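Inside calculate_loss, that step could look like the following sketch (using the method's parameters described above):

```python
# Logits from the BiLSTM outputs, via the previous chapter's helper
logits = self.calculate_logits(lstm_outputs, batch_size, sequence_lengths)
```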
Since we're performing binary classification, we use sigmoid cross entropy for the loss. We also need to convert the integer labels into floats.
Set float_labels equal to tf.cast applied with labels as the first argument and tf.float32 as the second argument.

Set batch_loss equal to tf.nn.sigmoid_cross_entropy_with_logits applied with float_labels and logits for the labels and logits keyword arguments, respectively.
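A sketch of these two lines inside calculate_loss:

```python
# Sigmoid cross entropy expects float labels
float_labels = tf.cast(labels, tf.float32)
# Per-sequence loss for each element of the batch
batch_loss = tf.nn.sigmoid_cross_entropy_with_logits(
    labels=float_labels, logits=logits)
```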
The output of the function is the overall aggregate loss. This means we need to sum together each individual sequence's loss in the batch.
Set overall_loss equal to tf.reduce_sum applied to batch_loss. Then return overall_loss.
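Putting the final step together, a sketch of how calculate_loss could end:

```python
# Sum the per-sequence losses into a single aggregate loss for the batch
overall_loss = tf.reduce_sum(batch_loss)
return overall_loss
```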
```python
import tensorflow as tf

# Text classification model
class ClassificationModel(object):
    # Model initialization
    def __init__(self, vocab_size, max_length, num_lstm_units):
        self.vocab_size = vocab_size
        self.max_length = max_length
        self.num_lstm_units = num_lstm_units
        # See the Word Embeddings Lab for details on the Tokenizer
        self.tokenizer = tf.keras.preprocessing.text.Tokenizer(
            num_words=self.vocab_size)

    def make_lstm_cell(self, dropout_keep_prob):
        cell = tf.keras.layers.LSTMCell(self.num_lstm_units, dropout=dropout_keep_prob)
        return cell

    # Use feature columns to create input embeddings
    def get_input_embeddings(self, input_sequences):
        input_col = tf.compat.v1.feature_column \
            .categorical_column_with_identity('inputs', self.vocab_size)
        embed_size = int(self.vocab_size**0.25)
        embed_col = tf.compat.v1.feature_column.embedding_column(input_col, embed_size)
        input_dict = {'inputs': input_sequences}
        input_embeddings = tf.compat.v1.feature_column \
            .input_layer(input_dict, [embed_col])
        sequence_lengths = tf.compat.v1.placeholder(
            "int64", shape=(None,),
            name="input_layer/input_embedding/sequence_length")
        return input_embeddings, sequence_lengths

    # Create and run a BiLSTM on the input sequences
    def run_bilstm(self, input_sequences, is_training):
        input_embeddings, sequence_lengths = self.get_input_embeddings(input_sequences)
        dropout_keep_prob = 0.5 if is_training else 1.0
        cell = self.make_lstm_cell(dropout_keep_prob)
        rnn = tf.keras.layers.RNN(
            cell, return_sequences=True, go_backwards=True, return_state=True)
        input_embeddings = tf.compat.v1.placeholder(tf.float32, shape=(None, 10, 12))
        Bi_rnn = tf.keras.layers.Bidirectional(rnn, merge_mode=None)
        outputs = Bi_rnn(input_embeddings)
        return outputs, sequence_lengths

    def get_gather_indices(self, batch_size, sequence_lengths):
        row_indices = tf.range(batch_size)
        final_indexes = tf.cast(sequence_lengths - 1, tf.int32)
        return tf.transpose([row_indices, final_indexes])

    # Calculate the loss for the BiLSTM
    def calculate_logits(self, lstm_outputs, batch_size, sequence_lengths):
        lstm_outputs_fw = lstm_outputs[0]
        lstm_outputs_bw = lstm_outputs[1]
        combined_outputs = tf.concat([lstm_outputs_fw, lstm_outputs_bw], -1)
        gather_indices = self.get_gather_indices(batch_size, sequence_lengths)
        final_outputs = tf.gather_nd(combined_outputs, gather_indices)
        logits = tf.keras.layers.Dense(1)(final_outputs)
        return logits

    # Calculate loss
    def calculate_loss(self, lstm_outputs, batch_size, sequence_lengths, labels):
        # CODE HERE
        pass
```