Transformers Building Blocks
Learn why we use skip connections and layer normalization inside a transformer.
Short residual skip connections
In language, understanding depends on a wider model of the world and on our ability to combine ideas. Humans extensively use these top-down influences (our expectations) to combine words differently in different contexts.
Roughly speaking, skip connections give a transformer a small capacity to let representations from different levels of processing interact.
By forming multiple paths, we can "pass" the higher-level understanding of the later layers back to the earlier layers. This allows the model to re-modulate how it understands the input. Again, this is the same idea as human top-down understanding, which is nothing more than expectations.
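The idea above can be sketched in a few lines. This is a minimal, dependency-free illustration; `sublayer` is a hypothetical stand-in for an attention or feed-forward block, not an actual transformer implementation:

```python
def sublayer(x):
    # Hypothetical stand-in for a transformer sublayer
    # (e.g., attention or feed-forward).
    return [0.1 * v for v in x]

def residual_block(x):
    # Skip (residual) connection: add the input back to the
    # sublayer's output, so the original representation has a
    # direct path past the sublayer.
    return [a + b for a, b in zip(x, sublayer(x))]

print(residual_block([1.0, 2.0]))  # approximately [1.1, 2.2]
```

Because the input is added back unchanged, information (and, during training, gradients) can flow around each sublayer rather than only through it.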
Layer normalization
Let’s open the Layer Norm black box.
In Layer Normalization (LN), the mean and variance are computed across channels and spatial dimensions.
In language, each word is a vector. Since we are dealing with vectors, we only have one spatial dimension, so the statistics are computed over the features of each word vector.
$$\mu = \frac{1}{K}\sum_{k=1}^{K} x_k \qquad \sigma^2 = \frac{1}{K}\sum_{k=1}^{K} (x_k - \mu)^2$$

$$\text{LN}(x) = \gamma \, \frac{x - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta$$

where $\gamma$ and $\beta$ are trainable parameters and $\epsilon$ is a small constant for numerical stability.
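The formula can be sketched directly for a single word vector. This is a minimal pure-Python illustration (deep learning frameworks provide optimized layer-norm implementations); the default `gamma` and `beta` values here are the usual initialization, not parameters from the source:

```python
import math

def layer_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Mean and variance over the features of one word vector.
    mu = sum(x) / len(x)
    var = sum((v - mu) ** 2 for v in x) / len(x)
    # Normalize, then scale and shift with the trainable
    # parameters gamma and beta.
    return [gamma * (v - mu) / math.sqrt(var + eps) + beta for v in x]

out = layer_norm([1.0, 2.0, 3.0, 4.0])
# The output has (approximately) zero mean and unit variance.
```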
In a 4D tensor with merged spatial dimensions, we can visualize this with the following figure:
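The same computation on a 4D tensor can be sketched as follows. This is an illustrative pure-Python version using nested lists of shape (N, C, H, W), assuming `gamma` and `beta` are omitted for brevity:

```python
import math

def layer_norm_4d(x, eps=1e-5):
    # x: nested lists of shape (N, C, H, W). For each sample,
    # normalize over the channel and spatial dims (C, H, W),
    # i.e., the merged dimensions the figure describes.
    out = []
    for sample in x:
        flat = [v for ch in sample for row in ch for v in row]
        mu = sum(flat) / len(flat)
        var = sum((v - mu) ** 2 for v in flat) / len(flat)
        out.append([[[(v - mu) / math.sqrt(var + eps)
                      for v in row] for row in ch] for ch in sample])
    return out
```

Note that the statistics are per sample: each element of the batch is normalized independently over its own channels and spatial positions.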