About the dataset

First, we need to understand what the dataset looks like so that, when we see the generated text, we can judge whether it makes sense given the training data. We’ll download the first 100 books from “Grimms’ Fairy Tales.” These are English translations of books originally written in German by the Brothers Grimm.

Initially, we’ll download all 209 books from the website with an automated script as follows:
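A minimal sketch of such a download script is shown here. The lesson doesn't name the website or the file layout, so `BASE_URL` is a placeholder and the zero-padded file names (`001.txt` to `209.txt`) are an assumption; the helper names `book_filenames`, `fetch_bytes`, and `download_books` are hypothetical and chosen only for illustration.

```python
import os
from urllib.request import urlopen

# Assumption: placeholder address. Replace it with the site that hosts the books.
BASE_URL = "https://example.com/grimm/"
# The lesson states that the collection has 209 books.
NUM_BOOKS = 209


def book_filenames(count=NUM_BOOKS):
    """Return assumed file names: 001.txt, 002.txt, ..., up to `count`."""
    return [f"{i:03d}.txt" for i in range(1, count + 1)]


def fetch_bytes(url):
    """Download the raw bytes found at `url`."""
    with urlopen(url) as response:
        return response.read()


def download_books(base_url=BASE_URL, dest_dir="data", count=NUM_BOOKS,
                   fetch=fetch_bytes):
    """Save the first `count` books into `dest_dir` and return their paths.

    Files that already exist locally are skipped, so the script can be
    rerun without downloading everything again.
    """
    os.makedirs(dest_dir, exist_ok=True)
    paths = []
    for name in book_filenames(count):
        path = os.path.join(dest_dir, name)
        if not os.path.exists(path):
            data = fetch(base_url + name)
            with open(path, "wb") as f:
                f.write(data)
        paths.append(path)
    return paths
```

With a real base URL in place, calling `download_books()` would fetch all 209 files, while `download_books(count=100)` would fetch only the first 100. The `fetch` parameter is there so the network call can be swapped out, for example when testing offline.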
