Introduction

In this section you will be building an LSTM (long short-term memory) network for language modeling.

A. Language modeling

The main idea of machine learning in NLP is to train a model to understand a text corpus well enough so that it can automatically perform tasks such as text classification or text generation. In order to do this, a machine learning model must be able to quantify what counts as a "good" or "bad" sequence of words.

Like many machine learning tasks, quantifying text quality comes down to calculating probabilities. To measure how "good" a sentence is, we calculate how likely each word in the sentence is. This is exactly the task a language model performs.

A language model can tell us the likelihood of each word in a given sentence or text passage based on the words that came before it. We can then determine how likely a sentence or text passage is by aggregating its individual word probabilities.
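As a minimal sketch of this aggregation step, suppose we already have conditional word probabilities from some trained model (the probabilities below are made-up placeholders). We can score a sentence by multiplying them together, which in practice is done by summing log-probabilities to avoid numerical underflow:

```python
import math

# Hypothetical conditional probabilities P(word | previous words),
# standing in for what a trained language model would produce.
word_probs = {
    ("the",): 0.06,               # P("the" | start of sentence)
    ("the", "cat"): 0.002,        # P("cat" | "the")
    ("the", "cat", "sat"): 0.01,  # P("sat" | "the", "cat")
}

def sentence_log_prob(sentence):
    """Aggregate per-word probabilities into one sentence score.

    Summing log-probabilities is equivalent to multiplying the raw
    probabilities, but keeps the numbers in a stable range.
    """
    total = 0.0
    for i in range(len(sentence)):
        context = tuple(sentence[: i + 1])
        total += math.log(word_probs[context])
    return total

score = sentence_log_prob(["the", "cat", "sat"])
```

A higher (less negative) score means the model considers the sentence more likely.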

B. Language model tasks

Language models are useful for both text classification and generation. In text classification, we can use the language model's probability calculations to separate texts into different categories. For example, if we trained a language model on spam email subject lines, the model would assign a relatively high probability to the subject "CLICK HERE FOR FREE EASY MONEY", signaling that the text closely resembles spam.
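One simple way to turn this idea into a classifier is to train one language model per category and label a text by whichever model scores it higher. The unigram probabilities below are illustrative placeholders, not values from a real model:

```python
import math

# Hypothetical word probabilities from two language models: one
# trained on spam subject lines, one on ordinary subject lines.
spam_lm = {"click": 0.05, "free": 0.08, "money": 0.06, "meeting": 0.001}
ham_lm = {"click": 0.002, "free": 0.004, "money": 0.003, "meeting": 0.04}

def log_prob(words, lm, unk=1e-6):
    # Sum of log word probabilities; unseen words get a small floor value.
    return sum(math.log(lm.get(w, unk)) for w in words)

def classify(words):
    # Pick the category whose language model finds the text more likely.
    return "spam" if log_prob(words, spam_lm) > log_prob(words, ham_lm) else "ham"
```

Here `classify(["click", "free", "money"])` returns `"spam"` because the spam-trained model assigns those words far higher probabilities than the other model does.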

In text generation, a language model completes a sentence by generating text based on the incomplete input sentence. This is the idea behind the autocomplete feature when texting on a phone or typing in a search engine. The model will give suggestions to complete the sentence based on the words it predicts with the highest probabilities.
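The autocomplete behavior described above can be sketched as ranking candidate next words by their predicted probabilities. The distributions below are hand-written stand-ins for a trained model's output:

```python
# Hypothetical next-word distributions keyed by the last word typed,
# standing in for a trained language model's predictions.
next_word_probs = {
    "how": {"are": 0.4, "is": 0.3, "to": 0.2},
    "are": {"you": 0.6, "we": 0.2, "they": 0.1},
}

def suggest(last_word, k=2):
    """Return the k most probable next words, like an autocomplete bar."""
    candidates = next_word_probs.get(last_word, {})
    return sorted(candidates, key=candidates.get, reverse=True)[:k]
```

For example, `suggest("how")` returns the two highest-probability continuations, `["are", "is"]`, just as a phone keyboard surfaces its top suggestions.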
