Loss

Calculate the model's sigmoid cross entropy loss.

Chapter Goals:

  • Calculate the model's loss using sigmoid cross entropy

A. Sigmoid cross entropy

The task for our model is to classify input text sequences as either negative (label 0) or positive (label 1). This is equivalent to binary classification. As with regular binary classification, we use sigmoid cross entropy to calculate the model's loss.
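
Concretely, sigmoid cross entropy passes each logit through a sigmoid and then measures binary cross entropy against the 0/1 label. As a minimal standalone sketch (the logit and label values below are made up purely for illustration), this is what tf.nn.sigmoid_cross_entropy_with_logits computes per element:

import tensorflow as tf

# Made-up logits and binary labels for three sequences
logits = tf.constant([2.0, -1.0, 0.5])
labels = tf.constant([1.0, 0.0, 1.0])

# Built-in op: per-element sigmoid cross entropy
loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

# Equivalent numerically stable form: max(x, 0) - x*z + log(1 + exp(-|x|))
manual = tf.maximum(logits, 0.0) - logits * labels + tf.math.log(1.0 + tf.exp(-tf.abs(logits)))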

For an in-depth and intuitive explanation of sigmoid cross entropy and binary classification, check out the Machine Learning for Software Engineers course on Educative.

Time to Code!

In this chapter you'll be completing the calculate_loss function, which calculates model loss based on the outputs of the BiLSTM.

The first step in calculating the model's loss is to compute the logits. We can use the calculate_logits function we completed in the previous chapter.

Set logits equal to self.calculate_logits applied with lstm_outputs, batch_size, and sequence_lengths as arguments.

Since we're performing binary classification, we use sigmoid cross entropy for the loss. We also need to convert the integer labels into floats.

Set float_labels equal to tf.cast applied with labels as the first argument and tf.float32 as the second argument.

Set batch_loss equal to tf.nn.sigmoid_cross_entropy_with_logits applied with float_labels and logits for the labels and logits keyword arguments, respectively.

The function's output is the overall aggregate loss, so we need to sum the individual sequence losses in the batch.

Set overall_loss equal to tf.reduce_sum applied to batch_loss. Then return overall_loss.
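
Putting the steps together, a sketch of the completed function might look like the following (it follows the steps above exactly; the rest of the class is assumed unchanged):

    def calculate_loss(self, lstm_outputs, batch_size, sequence_lengths, labels):
        # Per-sequence logits from the BiLSTM outputs
        logits = self.calculate_logits(lstm_outputs, batch_size, sequence_lengths)
        # Sigmoid cross entropy expects float labels
        float_labels = tf.cast(labels, tf.float32)
        # Per-sequence sigmoid cross entropy loss
        batch_loss = tf.nn.sigmoid_cross_entropy_with_logits(
            labels=float_labels, logits=logits)
        # Sum the per-sequence losses into the overall batch loss
        overall_loss = tf.reduce_sum(batch_loss)
        return overall_loss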

import tensorflow as tf

# Text classification model
class ClassificationModel(object):
    # Model initialization
    def __init__(self, vocab_size, max_length, num_lstm_units):
        self.vocab_size = vocab_size
        self.max_length = max_length
        self.num_lstm_units = num_lstm_units
        # See the Word Embeddings Lab for details on the Tokenizer
        self.tokenizer = tf.keras.preprocessing.text.Tokenizer(num_words=self.vocab_size)

    def make_lstm_cell(self, dropout_keep_prob):
        cell = tf.keras.layers.LSTMCell(self.num_lstm_units, dropout=dropout_keep_prob)
        return cell

    # Use feature columns to create input embeddings
    def get_input_embeddings(self, input_sequences):
        input_col = tf.compat.v1.feature_column \
            .categorical_column_with_identity(
                'inputs', self.vocab_size)
        embed_size = int(self.vocab_size**0.25)
        embed_col = tf.compat.v1.feature_column.embedding_column(
            input_col, embed_size)
        input_dict = {'inputs': input_sequences}
        input_embeddings = tf.compat.v1.feature_column \
            .input_layer(
                input_dict, [embed_col])
        sequence_lengths = tf.compat.v1.placeholder(
            "int64", shape=(None,),
            name="input_layer/input_embedding/sequence_length")
        return input_embeddings, sequence_lengths

    # Create and run a BiLSTM on the input sequences
    def run_bilstm(self, input_sequences, is_training):
        input_embeddings, sequence_lengths = self.get_input_embeddings(input_sequences)
        dropout_keep_prob = 0.5 if is_training else 1.0
        cell = self.make_lstm_cell(dropout_keep_prob)
        rnn = tf.keras.layers.RNN(
            cell, return_sequences=True,
            go_backwards=True, return_state=True)
        input_embeddings = tf.compat.v1.placeholder(
            tf.float32, shape=(None, 10, 12))
        Bi_rnn = tf.keras.layers.Bidirectional(
            rnn,
            merge_mode=None)
        outputs = Bi_rnn(input_embeddings)
        return outputs, sequence_lengths

    def get_gather_indices(self, batch_size, sequence_lengths):
        row_indices = tf.range(batch_size)
        final_indexes = tf.cast(sequence_lengths - 1, tf.int32)
        return tf.transpose([row_indices, final_indexes])

    # Calculate the logits from the BiLSTM outputs
    def calculate_logits(self, lstm_outputs, batch_size, sequence_lengths):
        lstm_outputs_fw = lstm_outputs[0]
        lstm_outputs_bw = lstm_outputs[1]
        combined_outputs = tf.concat([lstm_outputs_fw, lstm_outputs_bw], -1)
        gather_indices = self.get_gather_indices(batch_size, sequence_lengths)
        final_outputs = tf.gather_nd(combined_outputs, gather_indices)
        logits = tf.keras.layers.Dense(1)(final_outputs)
        return logits

    # Calculate the loss for the BiLSTM
    def calculate_loss(self, lstm_outputs, batch_size, sequence_lengths, labels):
        # CODE HERE
        pass
