Amazon SageMaker Clarify
Learn how to create AI models that are fair, transparent, and explainable by detecting biases and improving model interpretability.
Machine learning is widely used across industries to improve decision-making and enhance customer experiences. It also improves operational efficiency by analyzing vast amounts of data and uncovering patterns that drive better predictions and automation.
Consider a bank that uses an ML model to evaluate loan applications. The model evaluates an applicant’s profile and predicts the probability that the loan will be repaid. Let’s look at the challenges that can arise in this context:
Fairness: The model might favor or disadvantage specific demographic groups.
Bias: The model might overly rely on patterns from historical loan decisions, which may not apply to current economic conditions. For example, high inflation rates and economic recessions might make the model’s predictions less reliable.
Explainability: The model predicts loan approval but does not provide clear reasons for its decision.
Before relying on the model’s predictions, it is important to evaluate the model for fairness, bias, and explainability. Amazon SageMaker Clarify is a tool that helps make AI models fairer and more understandable. With Clarify, data scientists and machine learning practitioners can also evaluate large language models (LLMs) using automatic or human-in-the-loop evaluation jobs.
Features of SageMaker Clarify
SageMaker Clarify offers a range of capabilities for analyzing data and models in both training and production environments. These capabilities make model predictions more interpretable and build user trust.
The core functionality of Clarify can be divided into two main categories:
Identify and address biases
SageMaker Clarify helps identify biases at every stage of the ML lifecycle:
Data preparation: It can identify imbalances in sensitive features (e.g., race, gender) in the data and provide visual reports to mitigate pretraining bias (a sketch of such a check follows this list).
Model training: It can assess prediction variations across demographic groups to monitor fairness post-training.
In production: It can work with SageMaker Model Monitor to track bias shifts in real-time data and alert when model fairness changes.
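To make the data-preparation stage concrete, here is a minimal sketch of a pretraining bias check using the SageMaker Python SDK. The dataset path, column names, and facet values describe a hypothetical loan dataset; `role` and `session` are assumed to come from your own AWS setup.

```python
from sagemaker import clarify

# Processor that runs the Clarify analysis as a SageMaker Processing job.
clarify_processor = clarify.SageMakerClarifyProcessor(
    role=role,                      # assumed: your SageMaker execution role
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,      # assumed: an existing sagemaker.Session()
)

# Where the (hypothetical) loan dataset lives and where reports should go.
data_config = clarify.DataConfig(
    s3_data_input_path="s3://my-bucket/loans/train.csv",   # hypothetical path
    s3_output_path="s3://my-bucket/loans/clarify-output",  # hypothetical path
    label="loan_approved",
    headers=["loan_approved", "income", "credit_score", "gender", "race"],
    dataset_type="text/csv",
)

# Which outcome counts as favorable, and which group (facet) to examine.
bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1],          # 1 = loan approved
    facet_name="gender",
    facet_values_or_threshold=["female"],   # hypothetical sensitive group
)

# Compute pretraining bias metrics such as class imbalance (CI) and
# difference in proportions of labels (DPL).
clarify_processor.run_pre_training_bias(
    data_config=data_config,
    data_bias_config=bias_config,
    methods=["CI", "DPL"],
)
```

The job writes an analysis report with the selected bias metrics to the S3 output path; the same report can be explored in SageMaker Studio.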
Improve transparency and explainability
ML models often work like black boxes, making predictions from input data without revealing how those predictions are reached. However, it is critical to ensure that a model’s predictions are understandable both before and after deployment. SageMaker Clarify supports this through two main techniques:
Shapley values: These provide insights into the contribution of each feature to the model’s predictions, helping to explain why the model made a specific decision. For example, in evaluating the loan application, the Shapley values can tell us how different features—such as income level, credit score, debt-to-income ratio, gender, and race—contributed to the model’s decision. If the Shapley values show that gender or race significantly influenced the outcome, it could indicate potential bias in the model (see the sketch after this list).
Partial dependence plots (PDPs): These show how the predictions change when a specific feature varies while all other features are held constant, revealing the model’s sensitivity to that feature. Extending the loan application example, the PDP of the income level feature might show the following:
A low probability of loan approval when the income level is below $30,000.
The probability of loan approval increases gradually for income levels between $30,000 and $70,000.
For income levels above $70,000, the approval probability plateaus, showing no further change.
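Here is a sketch of how both techniques could be run with the SageMaker Python SDK, reusing the `clarify_processor` and `data_config` from the earlier bias example. The model name, baseline row, and feature names are hypothetical.

```python
from sagemaker import clarify

# Describes the deployed model Clarify should query for predictions.
model_config = clarify.ModelConfig(
    model_name="loan-approval-model",   # hypothetical SageMaker model name
    instance_count=1,
    instance_type="ml.m5.xlarge",
    accept_type="text/csv",
)

# SHAP (Shapley value) configuration: the baseline is a reference row
# against which each feature's contribution is measured.
shap_config = clarify.SHAPConfig(
    baseline=[[50000, 680, "male", "group_a"]],  # hypothetical baseline row
    num_samples=100,
    agg_method="mean_abs",
)

# PDP configuration: vary `income` over a grid while other features stay fixed.
pdp_config = clarify.PDPConfig(
    features=["income"],
    grid_resolution=10,
)

# Run both analyses in a single Clarify processing job.
clarify_processor.run_explainability(
    data_config=data_config,
    model_config=model_config,
    explainability_config=[shap_config, pdp_config],
)
```

In recent versions of the SDK, run_explainability accepts a list of configurations, so the per-feature SHAP values and the income PDP are computed in one job and written to the same analysis report.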
These capabilities are especially valuable in areas like banking, healthcare, and customer support, where responsible and ethical AI is essential.
How SageMaker Clarify works
Here’s a simplified overview of how Amazon SageMaker Clarify works:
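At a high level, Clarify runs as a dedicated SageMaker Processing job: you provide a data configuration (and, for post-training bias and explainability analyses, a model configuration), select the bias metrics or explainability methods to compute, and Clarify writes an analysis report to Amazon S3 that can also be viewed in SageMaker Studio.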