Gain insights into basic and intermediate deep learning concepts, including CNNs, RNNs, GANs, and transformers. Delve into fundamental architectures to enhance your machine learning model training skills.

This course is an accumulation of well-grounded knowledge and experience in deep learning. It provides you with the basic concepts you need in order to start working with and training various machine learning models. 

You will cover both basic and intermediate concepts, including but not limited to convolutional neural networks, recurrent neural networks, generative adversarial networks, and transformers.

After completing this course, you will have a comprehensive understanding of the fundamental architectural components of deep learning. Whether you’re a data or computer scientist, a computer or big data engineer, a solution architect, or a software engineer, you will benefit from this course.

Introduction to Deep Learning & Neural Networks

## Batch normalization
If you open any introductory machine learning textbook, you will find the idea of **input scaling**. Training a model with **gradient descent** on non-normalized input features is undesirable.

Let's start with an intuitive example to understand why we want normalization inside any model.

Suppose you have an input feature $x_1$ in the range [0, 10000] and another feature $x_2$ in the range [0, 1]. Any linear combination would effectively ignore $x_2$: $x_1 w_1 + x_2 w_2 \approx x_1 w_1$, since our weights are initialized in a small range like [-1, 1] and cannot compensate for the difference in scale.
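To make the scale mismatch concrete, here is a minimal sketch in plain Python. The sample size, the random seed, the weight values `w1 = w2 = 0.5`, and the helper names `mean_abs_contribution` and `standardize` are assumptions chosen for illustration; they are not part of the lesson.

```python
import random
import statistics

random.seed(0)

# Two hypothetical features on very different scales.
x1 = [random.uniform(0, 10000) for _ in range(1000)]
x2 = [random.uniform(0, 1) for _ in range(1000)]

# Weights of comparable magnitude (assumed values inside [-1, 1]).
w1, w2 = 0.5, 0.5


def mean_abs_contribution(values, weight):
    """Average magnitude that one weighted feature adds to the sum."""
    return statistics.fmean([abs(v * weight) for v in values])


def standardize(values):
    """Shift to zero mean and scale to unit standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


# How much larger is the x1 term than the x2 term, before and after scaling?
raw_ratio = mean_abs_contribution(x1, w1) / mean_abs_contribution(x2, w2)
scaled_ratio = (mean_abs_contribution(standardize(x1), w1)
                / mean_abs_contribution(standardize(x2), w2))

print(f"raw ratio: {raw_ratio:.1f}")
print(f"standardized ratio: {scaled_ratio:.2f}")
```

On the raw features, the `x1` term is several orders of magnitude larger than the `x2` term. After standardization, both terms contribute on a comparable scale.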

We encounter the same issue inside the layers of deep neural networks. In this lesson, we will carry this idea over to the layers inside the network.

> If we think outside the box, any intermediate layer is conceptually the same as the input layer; it accepts features and transforms them.



## Notations 

Throughout this lesson, $N$ will be the batch size, $H$ will refer to the height, $W$ to the width, and $C$ to the feature channels. The Greek letter $\mu()$ refers to the mean and the Greek letter $\sigma()$ to the standard deviation.

The batch features are denoted by $x$ with a shape of [N, C, H, W]. 

$$ x \in \mathbb{R}^{N \times C \times H \times W} $$

We will visualize the 4D activation maps **x** by **merging the spatial dimensions** $H$ and $W$ into a single axis. This gives a 3D shape of [N, C, H·W].
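The merge of the spatial dimensions can be sketched with nested Python lists. The tiny sizes (`N=2`, `C=3`, `H=2`, `W=2`), the arbitrary fill values, and the helper name `channel_stats` are assumptions for illustration. The choice to compute μ and σ over all samples and positions of a single channel is also an assumption made for this example.

```python
import statistics

# Hypothetical tiny batch: N=2 samples, C=3 channels, H=2, W=2.
N, C, H, W = 2, 3, 2, 2

# Build x as nested lists indexed x[n][c][h][w]; the values are arbitrary.
x = [[[[float(n * 100 + c * 10 + h * W + w) for w in range(W)]
       for h in range(H)]
      for c in range(C)]
     for n in range(N)]

# Merge the spatial dimensions: [N, C, H, W] -> [N, C, H*W].
x_merged = [[[v for row in channel for v in row] for channel in sample]
            for sample in x]

shape = (len(x_merged), len(x_merged[0]), len(x_merged[0][0]))
print(shape)  # (2, 3, 4)


def channel_stats(batch, c):
    """Mean and standard deviation of channel c over batch and positions."""
    values = [v for sample in batch for v in sample[c]]
    return statistics.fmean(values), statistics.pstdev(values)


mu0, sigma0 = channel_stats(x_merged, 0)
print(mu0, sigma0)
```

With a tensor library, the same merge is usually a single reshape call rather than a nested comprehension; the lists are used here only to keep every index explicit.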
