Data Cleaning and Normalisation: From Lake to Warehouse

Of the vast amounts of data generated every day, hour and minute, little is usable in its raw form.

This course explores ways in which analysts can clean data and normalise it to make it amenable to analysis.

Learning Outcomes

By the end of this course, participants should be able to:

  • Describe various strategies and decisions around cleaning data
  • Clean datasets in Python using standard toolsets, e.g. pandas
  • Merge and standardise disparate datasets representing the same problem space
  • Make informed decisions about edge cases where simple numerical approaches are insufficient
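
As a flavour of the second outcome, the following is a minimal sketch of threshold-based cleaning with pandas' isna, dropna-style column filtering and interpolate. The sample data, the column names and the 50% threshold are all assumptions made for illustration; they are not taken from the course material.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "branch": ["A", "B", "C", "D", "E"],
    "headcount": [10.0, np.nan, 14.0, np.nan, 18.0],
    "notes": [np.nan, np.nan, np.nan, "ok", np.nan],
})

# Share of missing values in each column (0.0 to 1.0).
missing_share = df.isna().mean()

# Assumed threshold: drop any column with more than half its values missing.
MAX_MISSING = 0.5
cleaned = df.loc[:, missing_share <= MAX_MISSING].copy()

# Fill the remaining numeric gaps by linear interpolation between neighbours.
cleaned["headcount"] = cleaned["headcount"].interpolate()

print(cleaned)
```

Whether to drop, fill or interpolate depends on the dataset: interpolation only makes sense where neighbouring rows are meaningfully ordered, which is itself one of the edge-case decisions listed above.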

Course Content

  • Orientation: What can unstructured data look like? What is data cleaning? Is it just removing missing or incongruous values?
  • Simple library implementations of numerical cleaning approaches: pandas’ isna, fillna, dropna and related functionality
  • Workshop 1: extend the use of these functions to drop fields, rows or columns or interpolate missing values based on thresholds
  • Text conversion: correcting misspelt entries or converting categories (e.g. titles in language A -> titles in English)
  • Workshop 2: how many tellers are there in your global organisation?
    • Standardise two or more datasets with differing criteria (e.g. job titles, salary brackets) so that they are commensurate, and perform the resulting analysis
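
The kind of task Workshop 2 describes might be sketched as follows. This is an illustrative example only: the two regional datasets, the lookup table `TITLE_MAP` and the helper `standardise_titles` are hypothetical, and a real exercise would need a far larger mapping plus a review step for unmapped titles.

```python
import pandas as pd

# Hypothetical extracts from two regional HR systems.
uk = pd.DataFrame({
    "employee_id": [1, 2, 3],
    "job_title": ["Teller", "bank teller ", "Analyst"],
})
fr = pd.DataFrame({
    "employee_id": [4, 5, 6],
    "job_title": ["Guichetier", "guichetière", "Analyste"],
})

# Assumed lookup table mapping local titles to one English vocabulary.
TITLE_MAP = {
    "teller": "Teller",
    "bank teller": "Teller",
    "guichetier": "Teller",
    "guichetière": "Teller",
    "analyst": "Analyst",
    "analyste": "Analyst",
}

def standardise_titles(frame: pd.DataFrame, region: str) -> pd.DataFrame:
    """Return a copy with a normalised 'standard_title' and a 'region' column."""
    out = frame.copy()
    # Normalise whitespace and case before looking the title up.
    key = out["job_title"].str.strip().str.lower()
    out["standard_title"] = key.map(TITLE_MAP)  # unmapped titles become NaN
    out["region"] = region
    return out

combined = pd.concat(
    [standardise_titles(uk, "UK"), standardise_titles(fr, "FR")],
    ignore_index=True,
)

# Titles that the lookup table did not cover need manual review.
unmapped = combined[combined["standard_title"].isna()]

teller_count = int((combined["standard_title"] == "Teller").sum())
print(teller_count, len(unmapped))
```

Keeping the unmapped rows visible, rather than silently dropping them, is a deliberate choice here: a missing mapping is a data-quality finding in its own right.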

100% of our clients would recommend Alpha to a colleague or peer.

90% of a cohort was promoted within two years of participating in an Alpha course.

96% of participants felt more confident in their role following an Alpha commercial leadership course.

Why study with Alpha?

Innovation and creativity born from experience

We are thought leaders in instructional and learning journey design, and holistic solution architects. We combine extensive finance and investments experience with skills application to deliver performance-improving results. We develop immersive learning environments that shorten time to productivity, support talent retention and add value by improving the quality of hires.

Knowledge Exchange Evangelists

We are focused on mining embedded organisational intellectual capital for the benefit of the next generation. We create and curate best-in-class practice gathered from our experience with leading financial institutions. We design our programmes with the end in mind: what results are you trying to achieve with this intervention, and what metrics will we set ourselves to achieve them?

Generation Proof

Quality and innovation, using current market and industry best practices, have made us a trusted partner in delivering dynamic and motivating training for the financial and capital markets. Our programmes are generation proof and responsive to evolving learner needs and styles. Our solutions use a multi-stakeholder engagement strategy that extends beyond the relationship between the learner and the learning provider. We create connections with managers, peers and the wider business to drive an impactful return on investment.

The team are so friendly and pleasant to work with, everyone is very professional and keen to help us. Building a relationship over the past couple of years helps us to feel like the Alpha team are even more able to understand our needs and provide more proactive solutions.