Data Science in Python: Data Prep & EDA

Learn how to use Python & Pandas to gather, clean, explore and analyze data for Data Science and Machine Learning

Created by Maven Analytics, Alice Zhao | 8.5-hour on-demand video course

This is a hands-on, project-based course designed to help you master the core building blocks of Python for data science. We’ll start by introducing the fields of data science and machine learning, discussing the difference between supervised and unsupervised learning, and reviewing the data science workflow we’ll be using throughout the course.

From there we’ll do a deep dive into the data prep & EDA steps of the workflow. You’ll learn how to scope a data science project, use Pandas to gather data from multiple sources and handle common data cleaning issues, and perform exploratory data analysis using techniques like filtering, grouping, and visualizing data.
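As a rough illustration of those steps, here is a minimal sketch of gathering, cleaning, and exploring a small table with Pandas. The dataset, column names, and values are invented for this example and do not come from the course materials; the cleaning choices (such as filling missing minutes with 0) are assumptions, not the course's prescribed method.

```python
import io
import pandas as pd

# Hypothetical listening data; columns and values are invented for illustration.
raw_csv = io.StringIO(
    "customer_id,plan,signup_date,minutes_listened\n"
    "1,Premium,2023-01-15,320\n"
    "2,Basic,2023-02-03,\n"
    "2,Basic,2023-02-03,\n"
    "3,premium ,2023-02-20,145\n"
    "4,Basic,2023-03-11,60\n"
)

# Gather: read a flat file into a DataFrame.
df = pd.read_csv(raw_csv)

# Clean: drop exact duplicate rows, convert data types, standardize text,
# and fill missing numeric values (filling with 0 is an assumption here).
df = df.drop_duplicates().reset_index(drop=True)
df["signup_date"] = pd.to_datetime(df["signup_date"])
df["plan"] = df["plan"].str.strip().str.title()
df["minutes_listened"] = df["minutes_listened"].fillna(0).astype(int)

# Explore: filter rows, then group and aggregate.
active = df[df["minutes_listened"] > 0]
minutes_by_plan = active.groupby("plan")["minutes_listened"].mean()
print(minutes_by_plan)
```

In a real project the source would more likely be a file path, an Excel workbook, or a SQL query result rather than an in-memory string, and the right way to treat missing values depends on what they mean in the data.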

Throughout the course, you’ll play the role of a Jr. Data Scientist for Maven Music, a streaming service that’s been struggling with customer churn. You’ll apply the skills you learn to gather, clean, and explore the company’s data with Python and provide insights about its customers. Last but not least, you’ll practice preparing data for machine learning models by joining multiple tables, adjusting row granularity, and engineering useful fields and features.

What you’ll learn

  • Master the core building blocks of Python for data science BEFORE applying machine learning algorithms
  • Scope data science projects by clearly defining the goals, techniques, and data sources needed for your analysis
  • Import and export flat files, Excel workbooks, and SQL database tables using Pandas
  • Clean data by converting data types, handling common data issues, and creating new columns for analysis
  • Perform exploratory data analysis (EDA) by sorting, filtering, grouping, and visualizing data to discover patterns and insights
  • Prepare data for machine learning models by joining tables, aggregating rows, and applying feature engineering techniques
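To make the last item more concrete, the sketch below shows one way joining tables, aggregating rows, and engineering features might look in Pandas. The `customers` and `sessions` tables, their columns, and the derived features are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical tables; names, columns, and values are invented for illustration.
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "plan": ["Premium", "Basic", "Basic"],
    "cancelled": [False, True, False],
})
sessions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "minutes": [30, 45, 10, 20, 25, 15],
})

# Aggregate rows: go from one row per session to one row per customer.
usage = (
    sessions.groupby("customer_id")
    .agg(session_count=("minutes", "size"), total_minutes=("minutes", "sum"))
    .reset_index()
)

# Join tables: a left join keeps every customer, even one with no sessions.
model_df = customers.merge(usage, on="customer_id", how="left")

# Engineer features: a ratio and a numeric encoding of a categorical field.
model_df["avg_minutes_per_session"] = (
    model_df["total_minutes"] / model_df["session_count"]
)
model_df["is_premium"] = (model_df["plan"] == "Premium").astype(int)
print(model_df)
```

The result has one row per customer, which is the granularity a churn model would typically need if `cancelled` were the target column.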

