Erdincuzunlu/python-data-cleaning-cheatsheet

python-data-cleaning-cheatsheet

This project is based on the 11-step data cleaning framework shared by Data Scientist Dawn Choo (ex-Meta, ex-Amazon).
It summarizes the essential steps for turning messy raw data into a dataset that is ready for analysis.

Steps Covered

  1. Import libraries
  2. Understand the data structure
  3. Explore the dataset
  4. Standardize data formats
  5. Remove duplicates
  6. Handle missing values
  7. Standardize string values
  8. Filter out bad data
  9. Remove outliers
  10. Rename columns
  11. Save cleaned data
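
The steps above can be sketched in pandas roughly as follows. This is a minimal illustration and not the contents of `data_cleaning_cheatsheet.py`: the `raw` table, its column names, the `clean` function, and the specific rules (dropping rows with an unparseable date or amount, keeping only positive amounts, the 1.5 × IQR outlier cutoff) are all assumptions made for the example. Visual exploration with Seaborn is left out.

```python
# Step 1: import libraries.
import pandas as pd

# Hypothetical messy input; every value here is made up for illustration.
raw = pd.DataFrame({
    "Order Date": ["2024-01-05", "2024-01-05", "2024-01-07", "not a date",
                   "2024-01-09", "2024-01-10", "2024-01-11", "2024-01-12"],
    "City": [" berlin", " berlin", "PARIS ", "Rome", None,
             "Madrid", "Oslo", "Lisbon"],
    "Amount": ["10.5", "10.5", "12", "11", "9.5", "-3", "5000", "abc"],
})

# Steps 2-3: understand the structure and explore the dataset.
raw.info()
print(raw.describe(include="all"))


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Apply steps 4-10 to a copy of df (assumed rules, see lead-in)."""
    df = df.copy()

    # Step 4: standardize formats; unparseable values become NaT / NaN.
    df["Order Date"] = pd.to_datetime(df["Order Date"], errors="coerce")
    df["Amount"] = pd.to_numeric(df["Amount"], errors="coerce")

    # Step 5: remove exact duplicate rows.
    df = df.drop_duplicates()

    # Step 6: handle missing values. Here: drop rows missing a date or
    # amount, and fill a missing city with a placeholder.
    df = df.dropna(subset=["Order Date", "Amount"])
    df["City"] = df["City"].fillna("Unknown")

    # Step 7: standardize strings (trim whitespace, consistent casing).
    df["City"] = df["City"].str.strip().str.title()

    # Step 8: filter out bad data (assumed rule: amounts must be positive).
    df = df[df["Amount"] > 0]

    # Step 9: remove outliers with the 1.5 * IQR rule.
    q1, q3 = df["Amount"].quantile([0.25, 0.75])
    iqr = q3 - q1
    df = df[df["Amount"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

    # Step 10: rename columns to snake_case.
    df = df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))
    return df.reset_index(drop=True)


cleaned = clean(raw)

# Step 11: save the cleaned data. Without a path, to_csv returns the CSV
# as a string; pass a file name to write it to disk instead.
csv_text = cleaned.to_csv(index=False)
print(csv_text)
```

The order follows the list, so duplicates are removed before strings are standardized. If near-duplicates that differ only in whitespace or casing should also collapse, running step 7 before step 5 is a reasonable variation.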

Technologies Used

  • Python
  • Pandas
  • NumPy
  • Seaborn

Usage

To run the example and generate the cleaned dataset:

python data_cleaning_cheatsheet.py

---

About

A practical 11-step Python cheatsheet for data cleaning tasks in data science.
