Delve into practical machine learning with NumPy, pandas, scikit-learn, and more. Gain insights into data analysis, feature engineering, and deep learning using industry-standard frameworks. Basic Python required.

If you're a software engineer looking to add machine learning to your skillset, this is the place to start. 

This course will teach you to write useful code and create impactful machine learning applications immediately. From the start, you'll be given all the tools that you need to create industry-level machine learning projects. Rather than reading through dense theory, you’ll learn practical skills and gain actionable insights. Topics covered include data analysis/visualization, feature engineering, supervised learning, unsupervised learning, and deep learning. All of these topics are taught using industry-standard frameworks: NumPy, pandas, scikit-learn, XGBoost, TensorFlow, and Keras.

Basic knowledge of Python is a prerequisite to this course.

This course was created by AdaptiLab, a company specializing in evaluating, sourcing, and upskilling enterprise machine learning talent. It is built in collaboration with industry machine learning experts from Google, Microsoft, Amazon, and Apple.

Machine Learning with NumPy, pandas, scikit-learn, and More

## A. Standard data format
Data can contain all sorts of different values. For example, Olympic 100m sprint times will range from 9.5 to 10.5 seconds, while calorie counts in large pepperoni pizzas can range from 1500 to 3000 calories. Even data measuring the exact same quantities can range in value (e.g. weight in kilograms vs. weight in pounds).

When data can take on any range of values, it becomes difficult to interpret and compare. Data scientists therefore convert the data into a standard format to make it easier to understand. The standard format refers to data that has a mean of 0 and unit variance (i.e. standard deviation = 1), and the process of converting data into this format is called *data standardization*.

Data standardization is a relatively simple process. For each data value, *x*, we subtract the overall mean of the data, &mu;, then divide by the overall standard deviation, &sigma;. The new value, *z*, represents the standardized data value. Thus, the formula for data standardization is:

$$z = \frac{x - \mu}{\sigma}$$
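The standardization step described above (subtract the mean, then divide by the standard deviation) can be sketched in Python as follows. This is a minimal illustration, not the course's own code: the `weights_kg` values are made up, `standardize` is a hypothetical helper name, and the comparison uses scikit-learn's `scale` function, which works column by column on 2-D input.

```python
import numpy as np
from sklearn.preprocessing import scale

# Hypothetical data, made up for illustration: weights in kilograms.
weights_kg = np.array([50.0, 60.0, 70.0, 80.0, 90.0])

def standardize(x):
    """Return z = (x - mean) / std, using the population standard deviation."""
    return (x - x.mean()) / x.std()

# Standardize by hand with NumPy.
z_manual = standardize(weights_kg)

# Standardize with scikit-learn; reshape to one column of samples.
z_sklearn = scale(weights_kg.reshape(-1, 1)).ravel()

# The same measurements in pounds standardize to the same values,
# because the unit conversion factor cancels in (x - mean) / std.
weights_lb = weights_kg * 2.20462
z_lb = standardize(weights_lb)

print(z_manual)
print(z_manual.mean(), z_manual.std())
```

After standardization the values have a mean of approximately 0 and a standard deviation of approximately 1, and the kilogram and pound versions of the data become directly comparable.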

Learn about data standardization and implement it with scikit-learn.

What you'll learn from this course

Data Manipulation with NumPy

Data Analysis with pandas

Data Preprocessing with scikit-learn

Data Modeling with scikit-learn

Clustering with scikit-learn

Gradient Boosting with XGBoost

Deep Learning with TensorFlow

Deep Learning with Keras

Standardizing Data
