Introduction to Jupyter and pandas

Learn about Jupyter and pandas.

We'll cover the following

- What is a Jupyter notebook?
- Loading the case study data with pandas
- Python DataFrame

Now it’s time to take a first look at the data we will use in our case study. We won’t do anything in this lesson other than ensure that we can load the data into a Jupyter notebook correctly. Examining the data, and understanding the problem you will solve with it, will come later.

The data file is an Excel spreadsheet called default_of_credit_card_clients__courseware_version_1_21_19.xls. We recommend that you first open the spreadsheet in Excel or the spreadsheet program of your choice. Note the number of rows and columns, and look at some example values. This will help you confirm that you have loaded the data correctly in the Jupyter notebook.
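As a minimal sketch of that check, the snippet below loads the spreadsheet with pandas and reports the row and column counts so they can be compared with what the spreadsheet program shows. It assumes the file sits in the current working directory and that a reader for legacy .xls files (typically the optional xlrd package) is installed. The helper name describe_load is hypothetical and only for illustration.

```python
from pathlib import Path

import pandas as pd

# File name as given in the lesson. Its location (the current working
# directory) is an assumption; adjust the path to where you saved the file.
DATA_PATH = Path("default_of_credit_card_clients__courseware_version_1_21_19.xls")


def describe_load(df: pd.DataFrame) -> dict:
    """Return basic facts to compare against the spreadsheet program's view."""
    return {
        "n_rows": df.shape[0],
        "n_columns": df.shape[1],
        "column_names": list(df.columns),
    }


if DATA_PATH.exists():
    # Reading legacy .xls files typically needs the optional xlrd package.
    df = pd.read_excel(DATA_PATH)
    print(describe_load(df))
    print(df.head())  # a few example values to compare by eye
else:
    print(f"{DATA_PATH} not found; adjust DATA_PATH to where you saved the file.")
```

If the counts or the first few values differ from what you saw in the spreadsheet, the load likely needs adjusting, for example a different sheet or header row.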

Note: The dataset we will be using is a modified version of the original dataset, which has been sourced from the UCI Machine Learning Repository.

What is a Jupyter notebook?

Jupyter notebooks are interactive coding environments that allow for inline text and graphics. They are great tools for data scientists to communicate and preserve their results because both the methods (code) and the message (text and graphics) are integrated. You can think of the environment as a kind of web page where you can write and execute code. Jupyter notebooks can, in fact, be rendered as web pages.
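To make the integration of code and text concrete, here is a rough sketch of how a notebook is stored on disk: a JSON document containing a list of cells, each marked as markdown (the message) or code (the method). The cell contents below are invented for illustration, and the structure is a simplified instance of the version 4 notebook format rather than a complete description of it.

```python
import json

# A tiny notebook with one text cell and one code cell.
# The cell contents are illustrative, not taken from the course materials.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Loading the case study data"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["import pandas as pd"],
        },
    ],
}

# Text and code live side by side in the same document.
cell_types = [cell["cell_type"] for cell in notebook["cells"]]
print(cell_types)

# Serialized, this is what would be saved as an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
```

Because the text, code, and outputs are stored together in one file, tools can render the same file as a web page.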
