Logits
Dive into the inner layers of a neural network and understand the importance of logits.
Chapter Goals:
- Build a single fully-connected model layer
- Output logits, AKA log-odds
A. Fully-connected layer
Before we can get into multilayer perceptrons, we need to start off with a single layer perceptron. A single fully-connected layer means that the input layer, i.e. `self.inputs`, is directly connected to the output layer, which has `output_size` neurons. Each of the `input_size` neurons in the input layer has a connection to each neuron in the output layer, hence the name fully-connected layer.
In TensorFlow, this type of fully-connected neuron layer is implemented using `tf.keras.layers.Dense`, which takes the output size as its required argument; applying the resulting layer object to a neuron layer adds a fully-connected output layer of the given size to the computation graph.
In addition to the input layer neurons, `tf.keras.layers.Dense` adds another neuron called the bias, which always has a value of 1 and has full connections to the output layer. The bias neuron helps our neural network produce better results, by allowing each fully-connected layer to model a true linear combination of the input values.
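As a quick illustration, here is a minimal sketch of `tf.keras.layers.Dense` in action; the layer size, tensor shape, and values are arbitrary and chosen only for demonstration:

```python
import tensorflow as tf

# A fully-connected layer with 2 output neurons. The bias is
# included by default (use_bias=True).
dense = tf.keras.layers.Dense(2)

x = tf.ones((4, 3))  # a batch of 4 data points with 3 features each
y = dense(x)         # output shape: (4, 2)
```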
B. Weighted connections
The forces that drive a neural network are the real-valued weights attached to each connection. The weight on a connection from neuron A into neuron B tells us how strongly A affects B, as well as whether that effect is positive or negative, i.e. a direct vs. inverse relationship.
The diagram above has three weighted connections:
- A → B: Direct relationship.
- A → C: No relationship.
- A → D: Inverse relationship.
The weighted connections allow fully-connected layers to model a linear combination of the inputs. Let's take a look at an example fully-connected layer with weights: the output of such a layer is a linear combination of the input neuron values.
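In general form, using illustrative symbols $x_1, \dots, x_n$ for the input values, $w_1, \dots, w_n$ for the connection weights, and $b$ for the bias weight:

$$\text{logits} = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b$$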
The logits produced by our single layer perceptron are therefore just a linear combination of the input data feature values.
Connection weights can be optimized through training (which we will cover in a later chapter), so that the logits produced by the neural network allow the model to make highly accurate predictions based on the input data.
C. Logits
So what exactly are logits? In classification problems they represent log-odds, which maps a probability between 0 and 1 to a real number. When `output_size = 1`, our model outputs a single logit per data point. The logits will then be converted to probabilities representing how likely it is for the data point to be labeled 1 (as opposed to 0).
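Concretely, for a probability $p$, the log-odds function and its inverse (the sigmoid, which converts a logit back into a probability) are:

$$\text{logit}(p) = \log\left(\frac{p}{1-p}\right), \qquad p = \frac{1}{1 + e^{-\text{logit}(p)}}$$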
In the above diagram, the x-axis represents the probability and the y-axis represents the logit.
Note the vertical asymptotes at x = 0 and x = 1.
We want our neural network to produce logits with large absolute values: a large negative logit corresponds to a probability near 0 (meaning we are very sure the label is 0/False), while a large positive logit corresponds to a probability near 1 (meaning we are very sure the label is 1/True).
D. Regression
In the next chapter, you'll be producing actual probabilities from the logits. This makes our single layer perceptron model equivalent to logistic regression. Despite the name, logistic regression is used for classification problems, not regression. If we wanted a model for regression problems (i.e. predicting a real number such as a stock price), we would have our model directly output the logits rather than convert them to probabilities. In this case, it would be better to rename `logits`, since they no longer map to a probability.
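As a brief sketch of the contrast (the `tf.math.sigmoid` call here only previews the logit-to-probability conversion covered in the next chapter, and the example values are arbitrary):

```python
import tensorflow as tf

logits = tf.constant([[2.0], [-1.5]])  # example logits for two data points

# Classification: squash logits into probabilities (next chapter's topic).
probabilities = tf.math.sigmoid(logits)

# Regression: use the layer's raw outputs directly as predictions,
# where a name like `predictions` fits better than `logits`.
predictions = logits
```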
Time to Code!
The code for this chapter will build a single layer perceptron, whose output is `logits`. The code goes inside the `model_layers` function.
We're going to obtain the `logits` by applying a dense layer to `inputs` (the placeholder from Chapter 2) to return a tensor with shape `(None, output_size)`.
Set `logits` equal to `tf.keras.layers.Dense`, passing in the required argument `output_size` and the keyword argument `name='logits'`, then apply the layer to `inputs`.
Then return `logits`.
```python
def model_layers(inputs, output_size):
    pass
```
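A minimal sketch of the completed function, assuming TensorFlow's Keras API as described above:

```python
import tensorflow as tf

def model_layers(inputs, output_size):
    # Dense takes the number of output neurons as its required argument;
    # applying the layer to `inputs` produces the logits, a tensor with
    # shape (None, output_size).
    logits = tf.keras.layers.Dense(output_size, name='logits')(inputs)
    return logits
```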