<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE FL_Course SYSTEM "https://www.flane.de/dtd/fl_course095.dtd"><?xml-stylesheet type="text/xsl" href="https://portal.flane.ch/css/xml-course.xsl"?><course productid="25495" language="en" source="https://portal.flane.ch/swisscom/en/xml-course/google-sdpf" lastchanged="2025-09-30T15:14:28+02:00" parent="https://portal.flane.ch/swisscom/en/xml-courses"><title>Serverless Data Processing with Dataflow</title><productcode>SDPF</productcode><vendorcode>GO</vendorcode><vendorname>Google</vendorname><fullproductcode>GO-SDPF</fullproductcode><version>1.0</version><objective>&lt;ul&gt;
&lt;li&gt;Demonstrate how Apache Beam and Dataflow work together to fulfill your organization&amp;rsquo;s data processing needs.&lt;/li&gt;&lt;li&gt;Summarize the benefits of the Beam Portability Framework and enable it for your Dataflow pipelines.&lt;/li&gt;&lt;li&gt;Enable Shuffle and Streaming Engine, for batch and streaming pipelines respectively, for maximum performance.&lt;/li&gt;&lt;li&gt;Enable Flexible Resource Scheduling for more cost-efficient performance.&lt;/li&gt;&lt;li&gt;Select the right combination of IAM permissions for your Dataflow job.&lt;/li&gt;&lt;li&gt;Implement best practices for a secure data processing environment.&lt;/li&gt;&lt;li&gt;Select and tune the I/O of your choice for your Dataflow pipeline.&lt;/li&gt;&lt;li&gt;Use schemas to simplify your Beam code and improve the performance of your pipeline.&lt;/li&gt;&lt;li&gt;Develop a Beam pipeline using SQL and DataFrames.&lt;/li&gt;&lt;li&gt;Perform monitoring, troubleshooting, testing and CI/CD on Dataflow pipelines.&lt;/li&gt;&lt;/ul&gt;</objective><essentials>&lt;p&gt;To get the most out of this course, participants should have completed the following courses:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Building Batch Data Pipelines&lt;/li&gt;&lt;li&gt;Building Resilient Streaming Analytics Systems&lt;/li&gt;&lt;/ul&gt;</essentials><audience>&lt;ul&gt;
&lt;li&gt;Data engineers&lt;/li&gt;&lt;li&gt;Data analysts and data scientists aspiring to develop data engineering skills&lt;/li&gt;&lt;/ul&gt;</audience><outline>&lt;h5&gt;Module 1: Introduction&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Introduce the course objectives.&lt;/li&gt;&lt;li&gt;Demonstrate how Apache Beam and Dataflow work together to fulfill your organization&amp;rsquo;s data processing needs.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 2: Beam Portability&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Summarize the benefits of the Beam Portability Framework.&lt;/li&gt;&lt;li&gt;Customize the data processing environment of your pipeline using custom containers.&lt;/li&gt;&lt;li&gt;Review use cases for cross-language transformations.&lt;/li&gt;&lt;li&gt;Enable the Portability Framework for your Dataflow pipelines.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 3: Separating Compute and Storage with Dataflow&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Enable Shuffle and Streaming Engine, for batch and streaming pipelines respectively, for maximum performance.&lt;/li&gt;&lt;li&gt;Enable Flexible Resource Scheduling for more cost-efficient performance.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 4: IAM, Quotas, and Permissions&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Select the right combination of IAM permissions for your Dataflow job.&lt;/li&gt;&lt;li&gt;Determine your capacity needs by inspecting the relevant quotas for your Dataflow jobs.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 5: Security&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Select your zonal data processing strategy using Dataflow, depending on your data locality needs.&lt;/li&gt;&lt;li&gt;Implement best practices for a secure data processing environment.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 6: Beam Concepts Review&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Review the main Apache Beam concepts (Pipeline, PCollections, PTransforms, Runner, reading/writing data, utility PTransforms, side inputs), as well as bundles and the DoFn lifecycle.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 7: Windows, Watermarks, Triggers&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Implement logic to handle your late data.&lt;/li&gt;&lt;li&gt;Review different types of triggers.&lt;/li&gt;&lt;li&gt;Review core streaming concepts (unbounded PCollections, windows).&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 8: Sources and Sinks&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Write the I/O of your choice for your Dataflow pipeline.&lt;/li&gt;&lt;li&gt;Tune your source/sink transformation for maximum performance.&lt;/li&gt;&lt;li&gt;Create custom sources and sinks using SDF.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 9: Schemas&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Introduce schemas, which give developers a way to express structured data in their Beam pipelines.&lt;/li&gt;&lt;li&gt;Use schemas to simplify your Beam code and improve the performance of your pipeline.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 10: State and Timers&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Identify use cases for state and timer API implementations.&lt;/li&gt;&lt;li&gt;Select the right type of state and timers for your pipeline.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 11: Best Practices&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Implement best practices for Dataflow pipelines.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 12: Dataflow SQL and DataFrames&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Develop a Beam pipeline using SQL and DataFrames.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 13: Beam Notebooks&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Prototype your pipeline in Python using Beam notebooks.&lt;/li&gt;&lt;li&gt;Launch a job to Dataflow from a notebook.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 14: Monitoring&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Navigate the Dataflow Job Details UI.&lt;/li&gt;&lt;li&gt;Interpret Job Metrics charts to diagnose pipeline regressions.&lt;/li&gt;&lt;li&gt;Set alerts on Dataflow jobs using Cloud Monitoring.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 15: Logging and Error Reporting&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Use the Dataflow logs and diagnostics widgets to troubleshoot pipeline issues.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 16: Troubleshooting and Debug&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Use a structured approach to debug your Dataflow pipelines.&lt;/li&gt;&lt;li&gt;Examine common causes for pipeline failures.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 17: Performance&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Understand performance considerations for pipelines.&lt;/li&gt;&lt;li&gt;Consider how the shape of your data can affect pipeline performance.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 18: Testing and CI/CD&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Review testing approaches for your Dataflow pipeline.&lt;/li&gt;&lt;li&gt;Review frameworks and features available to streamline your CI/CD workflow for Dataflow pipelines.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 19: Reliability&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Implement reliability best practices for your Dataflow pipelines.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 20: Flex Templates&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Use Flex Templates to standardize and reuse Dataflow pipeline code.&lt;/li&gt;&lt;/ul&gt;&lt;h5&gt;Module 21: Summary&lt;/h5&gt;&lt;ul&gt;
&lt;li&gt;Summary.&lt;/li&gt;&lt;/ul&gt;</outline><objective_plain>- Demonstrate how Apache Beam and Dataflow work together to fulfill your organization’s data processing needs.
- Summarize the benefits of the Beam Portability Framework and enable it for your Dataflow pipelines.
- Enable Shuffle and Streaming Engine, for batch and streaming pipelines respectively, for maximum performance.
- Enable Flexible Resource Scheduling for more cost-efficient performance.
- Select the right combination of IAM permissions for your Dataflow job.
- Implement best practices for a secure data processing environment.
- Select and tune the I/O of your choice for your Dataflow pipeline.
- Use schemas to simplify your Beam code and improve the performance of your pipeline.
- Develop a Beam pipeline using SQL and DataFrames.
- Perform monitoring, troubleshooting, testing and CI/CD on Dataflow pipelines.</objective_plain><essentials_plain>To get the most out of this course, participants should have completed the following courses:


- Building Batch Data Pipelines
- Building Resilient Streaming Analytics Systems</essentials_plain><audience_plain>- Data engineers
- Data analysts and data scientists aspiring to develop data engineering skills</audience_plain><outline_plain>Module 1: Introduction


- Introduce the course objectives.
- Demonstrate how Apache Beam and Dataflow work together to fulfill your organization’s data processing needs.
Module 2: Beam Portability


- Summarize the benefits of the Beam Portability Framework.
- Customize the data processing environment of your pipeline using custom containers.
- Review use cases for cross-language transformations.
- Enable the Portability Framework for your Dataflow pipelines.
Module 3: Separating Compute and Storage with Dataflow


- Enable Shuffle and Streaming Engine, for batch and streaming pipelines respectively, for maximum performance.
- Enable Flexible Resource Scheduling for more cost-efficient performance.
Module 4: IAM, Quotas, and Permissions


- Select the right combination of IAM permissions for your Dataflow job.
- Determine your capacity needs by inspecting the relevant quotas for your Dataflow jobs.
Module 5: Security


- Select your zonal data processing strategy using Dataflow, depending on your data locality needs.
- Implement best practices for a secure data processing environment.
Module 6: Beam Concepts Review


- Review the main Apache Beam concepts (Pipeline, PCollections, PTransforms, Runner, reading/writing data, utility PTransforms, side inputs), as well as bundles and the DoFn lifecycle.
Module 7: Windows, Watermarks, Triggers


- Implement logic to handle your late data.
- Review different types of triggers.
- Review core streaming concepts (unbounded PCollections, windows).
Module 8: Sources and Sinks


- Write the I/O of your choice for your Dataflow pipeline.
- Tune your source/sink transformation for maximum performance.
- Create custom sources and sinks using SDF.
Module 9: Schemas


- Introduce schemas, which give developers a way to express structured data in their Beam pipelines.
- Use schemas to simplify your Beam code and improve the performance of your pipeline.
Module 10: State and Timers


- Identify use cases for state and timer API implementations.
- Select the right type of state and timers for your pipeline.
Module 11: Best Practices


- Implement best practices for Dataflow pipelines.
Module 12: Dataflow SQL and DataFrames


- Develop a Beam pipeline using SQL and DataFrames.
Module 13: Beam Notebooks


- Prototype your pipeline in Python using Beam notebooks.
- Launch a job to Dataflow from a notebook.
Module 14: Monitoring


- Navigate the Dataflow Job Details UI.
- Interpret Job Metrics charts to diagnose pipeline regressions.
- Set alerts on Dataflow jobs using Cloud Monitoring.
Module 15: Logging and Error Reporting


- Use the Dataflow logs and diagnostics widgets to troubleshoot pipeline issues.
Module 16: Troubleshooting and Debug


- Use a structured approach to debug your Dataflow pipelines.
- Examine common causes for pipeline failures.
Module 17: Performance


- Understand performance considerations for pipelines.
- Consider how the shape of your data can affect pipeline performance.
Module 18: Testing and CI/CD


- Review testing approaches for your Dataflow pipeline.
- Review frameworks and features available to streamline your CI/CD workflow for Dataflow pipelines.
Module 19: Reliability


- Implement reliability best practices for your Dataflow pipelines.
Module 20: Flex Templates


- Use Flex Templates to standardize and reuse Dataflow pipeline code.
Module 21: Summary


- Summary.</outline_plain><duration unit="d" days="3">3 days</duration><pricelist><price country="DE" currency="EUR">1950.00</price><price country="US" currency="USD">1995.00</price><price country="CH" currency="CHF">2220.00</price><price country="AT" currency="EUR">1950.00</price><price country="IT" currency="EUR">1950.00</price><price country="GB" currency="GBP">1980.00</price><price country="IL" currency="ILS">6770.00</price><price country="BE" currency="EUR">2095.00</price><price country="NL" currency="EUR">2095.00</price><price country="GR" currency="EUR">2050.00</price><price country="MK" currency="EUR">2050.00</price><price country="HU" currency="EUR">2050.00</price><price country="SI" currency="EUR">1950.00</price><price country="CA" currency="CAD">2755.00</price><price country="FR" currency="EUR">2450.00</price></pricelist><miles/></course>