Learn about risk management in ML pipelines across data, bias, and security. Explore data privacy, attacks, and AI alternatives like causal AI and federated learning.

Disaster-Proofing Machine Learning Pipelines

The machine learning (ML) pipeline involves a complex relationship between the data, the model, and its implementation—each with its own risks that can adversely affect the utility and profitability of the solution. This course is a primer on what these risks are, where they come from, and how to mitigate them effectively.

In this course, you’ll start with a comprehensive look at the data side of the pipeline, including data privacy, data drift, and more. You’ll learn how to mitigate these risks in theory and in practice. You’ll also explore problems that affect ML models themselves, such as bias, security weaknesses, and adversarial attacks. Finally, you’ll learn about some of the alternative AI paradigms in use today: causal AI, federated learning, and generative AI.

A deep understanding of where problems can arise is a critical part of a data engineer’s or data scientist’s ML knowledge. From a career perspective, this course prepares you to address the real risks that developers face when setting up ML pipelines.

Mitigating Disasters in ML Pipelines

Learn how to identify data drift with statistical and algorithmic methods.

Introduction

Disasters in Data

Disasters in Models

Measuring Causal Relations with Python

Alternatives to Traditional ML

Adversarial Robustness of Neural Networks

Conclusion

Assessment: Disasters in ML Pipelines

Detecting Data Drift

Statistical methods

Kolmogorov-Smirnov
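
As a rough illustration of how a two-sample Kolmogorov-Smirnov check might be used to flag drift, the sketch below compares a reference sample (for example, training data) with a newer sample of the same feature. It is a minimal sketch, not the course's own implementation: the function names `ks_statistic` and `drift_detected`, the synthetic Gaussian data, and the 0.05 significance level are all assumptions made for illustration, and the decision rule relies on the standard large-sample critical value rather than an exact p-value.

```python
import math
import random
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    d = 0.0
    # Both empirical CDFs are step functions, so the largest gap
    # always occurs at one of the observed values.
    for x in ref + cur:
        gap = abs(bisect_right(ref, x) / n - bisect_right(cur, x) / m)
        d = max(d, gap)
    return d


def drift_detected(reference, current, alpha=0.05):
    """Flag drift when D exceeds the large-sample critical value for alpha."""
    n, m = len(reference), len(current)
    # Asymptotic critical value: c(alpha) * sqrt((n + m) / (n * m)),
    # with c(alpha) = sqrt(-ln(alpha / 2) / 2), about 1.358 for alpha = 0.05.
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    critical = c_alpha * math.sqrt((n + m) / (n * m))
    return ks_statistic(reference, current) > critical


# Synthetic data (assumed for illustration only).
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(500)]    # reference feature values
same = [rng.gauss(0.0, 1.0) for _ in range(500)]     # same distribution
shifted = [rng.gauss(1.0, 1.0) for _ in range(500)]  # mean shifted by one std dev

print("D (same distribution):", round(ks_statistic(train, same), 3))
print("D (shifted):          ", round(ks_statistic(train, shifted), 3))
print("Drift flagged on shifted sample:", drift_detected(train, shifted))
```

The statistic is computed with the standard library only so that the mechanics stay visible. In practice, `scipy.stats.ks_2samp` returns both the statistic and a p-value, and is likely the more convenient choice where SciPy is available.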