Python Data Wrangling Training

Course 1273

  • Duration: 3 days
  • Sandbox: Yes
  • Language: English
  • Level: Intermediate

In this Python Data Wrangling course, you will learn how to use Python to extract and transform data from various sources, including large database vaults and Excel financial tables.

You will also learn why you should avoid the traditional data cleaning methods used in other languages and instead take advantage of the specialised functions in NumPy and Pandas.

Python Data Wrangling Training Delivery Methods

  • In-Person

  • Online

Python Data Wrangling Training Information

In this Python Data Wrangling course, you will learn how to do the following:

  • Extract and parse data from various sources.
  • Transform and clean data using NumPy and Pandas.
  • Summarise and visualise data with Matplotlib.
  • Read HTML, XML, and JSON data from internet resources.
  • Search and filter data sets.
  • Apply Python tools and techniques to process data sets efficiently.
  • Continue learning and face new challenges with after-course one-on-one instructor coaching.

Prerequisites:

You should know Python basics, including data structures, importing and using modules, creating functions, and using the Jupyter Notebook platform.

Python Data Wrangling Training Outline

In this module, you will learn about the following:

  • Python for Data Wrangling
  • Lists, Sets, Strings, Tuples, and Dictionaries

In this module, you will learn about the following:

  • Advanced Data Structures
  • Basic File Operations in Python

In this module, you will learn about the following: 

  • NumPy Arrays 
  • Pandas DataFrames 
  • Statistics and Visualisation with NumPy and Pandas 
  • Using NumPy and Pandas to Calculate Basic Descriptive Statistics on the DataFrame 

In this module, you will learn about the following: 

  • Subsetting, Filtering, and Grouping 
  • Detecting Outliers and Handling Missing Values 
  • Concatenating, Merging, and Joining 
  • Useful Methods of Pandas 

In this module, you will learn about the following: 

  • Reading Data from Different Text-Based (and Non-Text-Based) Sources 
  • Introduction to BeautifulSoup4 and Web Page Parsing 

In this module, you will learn about the following: 

  • Advanced List Comprehension and the zip function 
  • Data Formatting 
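
As a rough illustration of the list comprehension and zip topics in this module, the sketch below pairs two parallel lists and keeps only the rows that can be converted to numbers. The sample lists and the names `names`, `scores`, and `records` are hypothetical and are not taken from the course material.

```python
# Hypothetical sample data: two parallel lists of raw column values.
names = ["alice", "bob", "carol"]
scores = ["82", "n/a", "91"]

# zip pairs the items position by position; the comprehension
# skips non-numeric scores and converts the rest to integers.
records = [
    {"name": name.title(), "score": int(score)}
    for name, score in zip(names, scores)
    if score.isdigit()
]

print(records)
```

The filter and the type conversion happen in a single pass, which is often easier to read than an explicit loop with an accumulator list.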

In this module, you will learn about the following: 

  • Basics of Web Scraping and the BeautifulSoup Library
  • Reading Data from XML 
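
Reading data from XML can be sketched with the standard library alone. The example below uses `xml.etree.ElementTree` rather than BeautifulSoup, purely to stay dependency-free; the document structure (`inventory`, `item`, `sku`, `name`, `price`) is an assumption made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document; the tag and attribute names are illustrative only.
xml_text = """
<inventory>
  <item sku="A1"><name>Widget</name><price>2.50</price></item>
  <item sku="B2"><name>Gadget</name><price>10.00</price></item>
</inventory>
""".strip()

root = ET.fromstring(xml_text)

# Flatten each <item> element into a plain dictionary, converting price to float.
rows = [
    {
        "sku": item.get("sku"),
        "name": item.findtext("name"),
        "price": float(item.findtext("price")),
    }
    for item in root.iter("item")
]

print(rows)
```

A list of dictionaries like `rows` is a convenient intermediate form because it can be passed directly to a tabular structure such as a Pandas DataFrame.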

In this module, you will learn about the following:

  • Refresher of RDBMS and SQL
  • Using an RDBMS (MySQL/PostgreSQL/SQLite)
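
For the SQLite option listed above, a minimal sketch using Python's built-in `sqlite3` module might look like the following. The `orders` table, its columns, and the sample rows are hypothetical, and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# Hypothetical table and rows, for illustration only.
conn.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 20.0), ("bob", 5.5), ("ann", 12.5)],
)

# Aggregate in SQL, then pull the result into Python as a list of tuples.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
conn.close()

print(totals)
```

Parameterised inserts (the `?` placeholders) avoid building SQL strings by hand, which matters once the values come from untrusted or messy source data.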

In this module, you will learn about the following:

  • Applying Your Knowledge to a Real-life Data Wrangling Task
  • An Extension to Data Wrangling

Python Data Wrangling Training Course FAQs

This course is for data analysts and data scientists who want to use Python to extract data from various sources and prepare it for machine learning modelling.

To succeed in this course, you should have a working knowledge of Python basics, including data structures, importing and using modules, creating functions, and using the Jupyter Notebook platform.

No. We will read and write Excel spreadsheets, but we will not use the Excel application itself.

Data wrangling is the process of ingesting, cleaning, and unifying raw data sources into a format that is easier to analyse.

No. This is not a programming class but rather instruction in data management and processing. The Jupyter Notebook/Lab applications are used for their interactive features, which speed up development.

Yes, these skills are fundamental to creating a data analytics pipeline. However, additional training may be required to perform visualisation, modelling, and prediction.

The software is based on the Anaconda distribution, which combines Python, Jupyter, and many data analytics libraries. All software tools are platform-independent and work on Windows, Linux, or macOS. The class runs in a Linux environment, but the skills and tools apply to any platform.

No. Although we strive to keep our software up to date, we often use older versions of packages for interoperability. All the software packages used are available for all major operating systems.