Get hands-on experience designing and building data processing systems on Google Cloud. This course uses lectures, demos, and hands-on labs to show you how to design data processing systems, build end-to-end data pipelines, analyze data, and implement machine learning. It covers structured, unstructured, and streaming data.
Who should attend
This class is intended for experienced developers who are responsible for managing big data transformations, including:
- Extracting, loading, transforming, cleaning, and validating data.
- Designing pipelines and architectures for data processing.
- Creating and maintaining machine learning and statistical models.
- Querying datasets, visualizing query results, and creating reports.
This course is part of the following Certifications:
To get the most out of this course, participants should have:
- Completed the Google Cloud Fundamentals: Big Data and Machine Learning (GCF-BDM) course or have equivalent experience.
- Basic proficiency with a common query language such as SQL.
- Experience with data modeling and with extract, transform, load (ETL) activities.
- Experience developing applications using a common programming language such as Python.
- Familiarity with basic statistics.
Course objectives
- Design and build data processing systems on Google Cloud Platform.
- Leverage unstructured data using Spark and ML APIs on Cloud Dataproc.
- Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow.
- Derive business insights from extremely large datasets using Google BigQuery.
- Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML.
- Enable instant insights from streaming data.
Course outline
- Module 1: Introduction to Data Engineering
- Module 2: Building a Data Lake
- Module 3: Building a Data Warehouse
- Module 4: Introduction to Building Batch Data Pipelines
- Module 5: Executing Spark on Cloud Dataproc
- Module 6: Serverless Data Processing with Cloud Dataflow
- Module 7: Manage Data Pipelines with Cloud Data Fusion and Cloud Composer
- Module 8: Introduction to Processing Streaming Data
- Module 9: Serverless Messaging with Cloud Pub/Sub
- Module 10: Cloud Dataflow Streaming Features
- Module 11: High-Throughput BigQuery and Bigtable Streaming Features
- Module 12: Advanced BigQuery Functionality and Performance
- Module 13: Introduction to Analytics and AI
- Module 14: Prebuilt ML model APIs for Unstructured Data
- Module 15: Big Data Analytics with Cloud AI Platform Notebooks
- Module 16: Production ML Pipelines with Kubeflow
- Module 17: Custom Model Building with SQL in BigQuery ML
- Module 18: Custom Model Building with Cloud AutoML