DeepLearning.AI, Coursera, and AWS Launch a New 3-Course Specialization for Practical Data Science with Amazon SageMaker

By Antje Barth, Chris Fregly, Shelbee Eigenbrode, and Sireesha Muppala

Vocareum Deploys Hands-On Labs for the Practical Data Science Specialization

Vocareum is a cloud lab platform built specifically for learning, research, and assessment. Its architecture combines cost-effective compute, powerful assessment capability, and integration into various learning workflows. Our labs are delivered through the browser and run on a flexible infrastructure that supports a range of use cases, including interactive computing (via notebooks), cloud computing, cybersecurity, and programming. They are deployed in many learning contexts, including MOOCs, large residential courses, bootcamps, and instructor-led corporate training. Vocareum is also used for performance-based assessments and certification exams.

Amazon Web Services (AWS), Coursera, and DeepLearning.AI are excited to announce Practical Data Science, a 3-course, 10-week, hands-on Specialization designed for data professionals to quickly master the essentials of machine learning in the AWS cloud. DeepLearning.AI was founded in 2017 by Andrew Ng, a machine learning and education pioneer, to fill a need for world-class AI education. DeepLearning.AI teamed up with an all-female team of instructors including AWS machine learning solution architects and developer advocates to develop and deliver the 3-course Specialization on Coursera’s education platform. Sign up for the Practical Data Science Specialization today on Coursera.

Moving data science projects from idea to production requires a new set of skills to address the scale and operational efficiencies required by today’s machine learning problems. This Specialization addresses common challenges we hear from our customers and teaches you the practical knowledge needed to efficiently deploy your data science projects at scale in the AWS cloud.

Specialization Overview

The Practical Data Science Specialization is designed for data-focused developers, scientists, and analysts who are familiar with Python and want to learn how to build, train, and deploy scalable, end-to-end machine learning pipelines (both automated and human-in-the-loop) in the AWS cloud. Each of the 10 weeks features a comprehensive, hands-on lab developed specifically for this Specialization and hosted by AWS Partner Vocareum. The labs provide hands-on experience with state-of-the-art algorithms for natural language processing (NLP) and natural language understanding (NLU) using Amazon SageMaker and Hugging Face’s highly optimized implementation of the BERT algorithm.

In the first course, you will learn foundational concepts for exploratory data analysis (EDA), automated machine learning (AutoML), and text-classification algorithms. With Amazon SageMaker Clarify and Amazon SageMaker Data Wrangler, you will analyze a dataset for statistical bias, transform the dataset into machine-readable features, and select the most important features to train a multi-class text classifier. You will then use Amazon SageMaker Autopilot to automatically train, tune, and deploy the best text-classification algorithm for the given dataset. Next, you will work with Amazon SageMaker BlazingText, a highly optimized and scalable implementation of the popular FastText algorithm, to train a text classifier with very little code.
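To make "analyzing a dataset for statistical bias" a little more concrete, here is a minimal sketch of one simple measure, class imbalance, computed as the normalized difference between the size of one group and the size of everyone else. This is plain Python and does not call the SageMaker Clarify API; the function name, the parameter names, and the choice of this particular metric are assumptions made for illustration and are not taken from the course material.

```python
from collections import Counter
from typing import Hashable, Iterable


def class_imbalance(facet_values: Iterable[Hashable], group: Hashable) -> float:
    """Normalized size difference between `group` and all other rows.

    Returns a value in [-1, 1]: 0 means the two sides are the same size,
    positive means `group` is over-represented, negative under-represented.
    (Hypothetical helper for illustration only.)
    """
    counts = Counter(facet_values)
    n_group = counts[group]                      # rows that belong to `group`
    n_rest = sum(counts.values()) - n_group      # every other row
    total = n_group + n_rest
    if total == 0:
        raise ValueError("facet_values must not be empty")
    return (n_group - n_rest) / total


# Example: three reviews in one product category, one in another.
print(class_imbalance(["shoes", "shoes", "shoes", "hats"], group="shoes"))
```

A value far from zero suggests that a model trained on the data will see one group much more often than the rest, which is one reason to rebalance or reweight before training.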

In the second course, you will learn to automate a natural language processing task by building an end-to-end machine learning pipeline using BERT with Amazon SageMaker Pipelines. Your pipeline will first transform the dataset into BERT-readable features and store the features in the Amazon SageMaker Feature Store. It will then fine-tune a text classification model to the dataset using a Hugging Face pre-trained model that has learned to understand human language from millions of Wikipedia documents. Finally, your pipeline will evaluate the model’s accuracy and only deploy the model if the accuracy exceeds a given threshold.
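The last step of that pipeline, deploying only when accuracy exceeds a threshold, can be sketched in a few lines of plain Python. This is an illustration of the gating logic only, not the Amazon SageMaker Pipelines API: the function names, the `deploy` callback, and the default threshold of 0.9 are all hypothetical.

```python
from typing import Callable, Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and the same length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def evaluate_and_maybe_deploy(
    predictions: Sequence[int],
    labels: Sequence[int],
    deploy: Callable[[], None],
    min_accuracy: float = 0.9,  # hypothetical threshold, chosen for illustration
) -> bool:
    """Call `deploy` only if accuracy is strictly above `min_accuracy`.

    Returns True when the model was deployed, False otherwise.
    """
    score = accuracy(predictions, labels)
    if score > min_accuracy:
        deploy()
        return True
    return False


# Example: a stand-in deploy step that just records that it ran.
deployed_models = []
was_deployed = evaluate_and_maybe_deploy(
    predictions=[1, 0, 1, 1],
    labels=[1, 0, 1, 1],
    deploy=lambda: deployed_models.append("text-classifier"),
)
print(was_deployed, deployed_models)
```

Keeping the gate as a separate step means a model that regresses on the evaluation set never reaches production, regardless of how the training step behaved.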

In the third course, you will learn a series of performance-improvement and cost-reduction techniques to automatically tune model accuracy, compare prediction performance, and generate new training data with human intelligence. After tuning your text classifier using Amazon SageMaker Hyperparameter Tuning (HPT), you will deploy two model candidates into an A/B test to compare their real-time prediction performance and automatically scale the winning model using Amazon SageMaker Hosting. Lastly, you will set up a human-in-the-loop pipeline to fix misclassified predictions and generate new training data using Amazon Augmented AI and Amazon SageMaker Ground Truth.
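The idea behind the A/B test can be sketched without any AWS services: split incoming requests between two model variants by weight, record whether each prediction was correct, and pick the variant with the higher observed accuracy. The code below is a minimal, assumption-laden illustration in plain Python. The variant names, weights, and helper functions are hypothetical and are not part of Amazon SageMaker Hosting.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def route_request(weights: Dict[str, float], rng: random.Random) -> str:
    """Pick a variant name at random, in proportion to its traffic weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def accuracy_by_variant(outcomes: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Turn (variant, was_correct) records into per-variant accuracy."""
    served = defaultdict(int)
    correct = defaultdict(int)
    for variant, was_correct in outcomes:
        served[variant] += 1
        if was_correct:
            correct[variant] += 1
    return {variant: correct[variant] / served[variant] for variant in served}


def pick_winner(observed: Dict[str, float]) -> str:
    """Return the variant with the highest observed accuracy."""
    return max(observed, key=observed.get)


# Example with hand-written outcome records (illustrative values only).
rng = random.Random(0)
print(route_request({"VariantA": 0.5, "VariantB": 0.5}, rng))

records = [("VariantA", True), ("VariantA", False), ("VariantB", True), ("VariantB", True)]
observed = accuracy_by_variant(records)
print(observed, pick_winner(observed))
```

In practice a winner should only be declared after enough traffic has been observed for the difference to be meaningful; this sketch leaves out any significance testing.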

“The field of data science is constantly evolving with new tools, technologies, and methods,” said Betty Vandenbosch, Chief Content Officer at Coursera. “We’re excited to expand our partnership with DeepLearning.AI and AWS to help data scientists around the world keep up with the many tools at their disposal. Through hands-on learning, cutting-edge technology, and expert instruction, this new content will help learners acquire the latest job-relevant data science skills.”

Register Today

The Practical Data Science Specialization from DeepLearning.AI, AWS, and Coursera teaches AI and machine learning essentials in the cloud. The 3-course Specialization is a practical starting point for building and operationalizing data science projects efficiently with the depth and breadth of AWS machine learning services. Improve your data science skills by signing up for the Practical Data Science Specialization today at Coursera!