{"course":{"productid":18642,"modality":1,"active":true,"language":"fr","title":"Data Engineering on Google Cloud Platform","productcode":"DEGCP","vendorcode":"GO","vendorname":"Google","fullproductcode":"GO-DEGCP","courseware":{"has_ekit":false,"has_printkit":true,"language":"en"},"url":"https:\/\/portal.flane.ch\/course\/google-degcp","objective":"<p>This course teaches participants the following skills:<\/p>\n<ul>\n<li>Design and build data processing systems on Google Cloud Platform<\/li><li>Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow<\/li><li>Derive business insights from extremely large datasets using Google BigQuery<\/li><li>Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML<\/li><li>Leverage unstructured data using Spark and ML APIs on Cloud Dataproc<\/li><li>Enable instant insights from streaming data<\/li><\/ul>","essentials":"<p>To get the most out of this course, participants should have:<\/p>\n<ul>\n<li>Completed <span class=\"attentionbbcode\" title=\"inactive or disabled course: GO-GCF-BDM\">!<\/span>Google Cloud Fundamentals: Big Data and Machine Learning <span class=\"fl-prod-pcode\">(GCF-BDM)<\/span> course OR have equivalent experience<\/li><li>Basic proficiency with a common query language such as SQL<\/li><li>Experience with data modeling and extract, transform, load (ETL) activities<\/li><li>Experience developing applications using a common programming language such as Python<\/li><li>Familiarity with machine learning and\/or statistics<\/li><\/ul>","audience":"<p>This class is intended for experienced developers who are responsible for managing big data transformations, including:<\/p>\n<ul>\n<li>Extracting, loading, transforming, cleaning, and validating data<\/li><li>Designing pipelines and architectures for data processing<\/li><li>Creating and maintaining machine learning and statistical models<\/li><li>Querying datasets, visualizing query results and creating 
reports<\/li><\/ul>","contents":"<h5>Module 1: Google Cloud Dataproc Overview<\/h5><ul>\n<li>Creating and managing clusters.<\/li><li>Leveraging custom machine types and preemptible worker nodes.<\/li><li>Scaling and deleting Clusters.<\/li><li>Lab: Creating Hadoop Clusters with Google Cloud Dataproc.<\/li><\/ul><h5>Module 2: Running Dataproc Jobs<\/h5><ul>\n<li>Running Pig and Hive jobs.<\/li><li>Separation of storage and compute.<\/li><li>Lab: Running Hadoop and Spark Jobs with Dataproc.<\/li><li>Lab: Submit and monitor jobs.<\/li><\/ul><h5>Module 3: Integrating Dataproc with Google Cloud Platform<\/h5><ul>\n<li>Customize cluster with initialization actions.<\/li><li>BigQuery Support.<\/li><li>Lab: Leveraging Google Cloud Platform Services.<\/li><\/ul><h5>Module 4: Making Sense of Unstructured Data with Google&rsquo;s Machine Learning APIs<\/h5><ul>\n<li>Google&rsquo;s Machine Learning APIs.<\/li><li>Common ML Use Cases.<\/li><li>Invoking ML APIs.<\/li><li>Lab: Adding Machine Learning Capabilities to Big Data Analysis.<\/li><\/ul><h5>Module 5: Serverless data analysis with BigQuery<\/h5><ul>\n<li>What is BigQuery.<\/li><li>Queries and Functions.<\/li><li>Lab: Writing queries in BigQuery.<\/li><li>Loading data into BigQuery.<\/li><li>Exporting data from BigQuery.<\/li><li>Lab: Loading and exporting data.<\/li><li>Nested and repeated fields.<\/li><li>Querying multiple tables.<\/li><li>Lab: Complex queries.<\/li><li>Performance and pricing.<\/li><\/ul><h5>Module 6: Serverless, autoscaling data pipelines with Dataflow<\/h5><ul>\n<li>The Beam programming model.<\/li><li>Data pipelines in Beam Python.<\/li><li>Data pipelines in Beam Java.<\/li><li>Lab: Writing a Dataflow pipeline.<\/li><li>Scalable Big Data processing using Beam.<\/li><li>Lab: MapReduce in Dataflow.<\/li><li>Incorporating additional data.<\/li><li>Lab: Side inputs.<\/li><li>Handling stream data.<\/li><li>GCP Reference architecture.<\/li><\/ul><h5>Module 7: Getting started with Machine 
Learning<\/h5><ul>\n<li>What is machine learning (ML).<\/li><li>Effective ML: concepts, types.<\/li><li>ML datasets: generalization.<\/li><li>Lab: Explore and create ML datasets.<\/li><\/ul><h5>Module 8: Building ML models with Tensorflow<\/h5><ul>\n<li>Getting started with TensorFlow.<\/li><li>Lab: Using tf.learn.<\/li><li>TensorFlow graphs and loops + lab.<\/li><li>Lab: Using low-level TensorFlow + early stopping.<\/li><li>Monitoring ML training.<\/li><li>Lab: Charts and graphs of TensorFlow training.<\/li><\/ul><h5>Module 9: Scaling ML models with CloudML<\/h5><ul>\n<li>Why Cloud ML?<\/li><li>Packaging up a TensorFlow model.<\/li><li>End-to-end training.<\/li><li>Lab: Run a ML model locally and on cloud.<\/li><\/ul><h5>Module 10: Feature Engineering<\/h5><ul>\n<li>Creating good features.<\/li><li>Transforming inputs.<\/li><li>Synthetic features.<\/li><li>Preprocessing with Cloud ML.<\/li><li>Lab: Feature engineering.<\/li><\/ul><h5>Module 11: Architecture of streaming analytics pipelines<\/h5><ul>\n<li>Stream data processing: Challenges.<\/li><li>Handling variable data volumes.<\/li><li>Dealing with unordered\/late data.<\/li><li>Lab: Designing streaming pipeline.<\/li><\/ul><h5>Module 12: Ingesting Variable Volumes<\/h5><ul>\n<li>What is Cloud Pub\/Sub?<\/li><li>How it works: Topics and Subscriptions.<\/li><li>Lab: Simulator.<\/li><\/ul><h5>Module 13: Implementing streaming pipelines<\/h5><ul>\n<li>Challenges in stream processing.<\/li><li>Handle late data: watermarks, triggers, accumulation.<\/li><li>Lab: Stream data processing pipeline for live traffic data.<\/li><\/ul><h5>Module 14: Streaming analytics and dashboards<\/h5><ul>\n<li>Streaming analytics: from data to decisions.<\/li><li>Querying streaming data with BigQuery.<\/li><li>What is Google Data Studio?<\/li><li>Lab: build a real-time dashboard to visualize processed data.<\/li><\/ul><h5>Module 15: High throughput and low-latency with Bigtable<\/h5><ul>\n<li>What is Cloud Spanner?<\/li><li>Designing 
Bigtable schema.<\/li><li>Ingesting into Bigtable.<\/li><li>Lab: streaming into Bigtable.<\/li><\/ul>","outline":"<h5>Module 1: Google Cloud Dataproc Overview<\/h5><ul>\n<li>Creating and managing clusters.<\/li><li>Leveraging custom machine types and preemptible worker nodes.<\/li><li>Scaling and deleting Clusters.<\/li><li>Lab: Creating Hadoop Clusters with Google Cloud Dataproc.<\/li><\/ul><h5>Module 2: Running Dataproc Jobs<\/h5><ul>\n<li>Running Pig and Hive jobs.<\/li><li>Separation of storage and compute.<\/li><li>Lab: Running Hadoop and Spark Jobs with Dataproc.<\/li><li>Lab: Submit and monitor jobs.<\/li><\/ul><h5>Module 3: Integrating Dataproc with Google Cloud Platform<\/h5><ul>\n<li>Customize cluster with initialization actions.<\/li><li>BigQuery Support.<\/li><li>Lab: Leveraging Google Cloud Platform Services.<\/li><\/ul><h5>Module 4: Making Sense of Unstructured Data with Google&rsquo;s Machine Learning APIs<\/h5><ul>\n<li>Google&rsquo;s Machine Learning APIs.<\/li><li>Common ML Use Cases.<\/li><li>Invoking ML APIs.<\/li><li>Lab: Adding Machine Learning Capabilities to Big Data Analysis.<\/li><\/ul><h5>Module 5: Serverless data analysis with BigQuery<\/h5><ul>\n<li>What is BigQuery.<\/li><li>Queries and Functions.<\/li><li>Lab: Writing queries in BigQuery.<\/li><li>Loading data into BigQuery.<\/li><li>Exporting data from BigQuery.<\/li><li>Lab: Loading and exporting data.<\/li><li>Nested and repeated fields.<\/li><li>Querying multiple tables.<\/li><li>Lab: Complex queries.<\/li><li>Performance and pricing.<\/li><\/ul><h5>Module 6: Serverless, autoscaling data pipelines with Dataflow<\/h5><ul>\n<li>The Beam programming model.<\/li><li>Data pipelines in Beam Python.<\/li><li>Data pipelines in Beam Java.<\/li><li>Lab: Writing a Dataflow pipeline.<\/li><li>Scalable Big Data processing using Beam.<\/li><li>Lab: MapReduce in Dataflow.<\/li><li>Incorporating additional data.<\/li><li>Lab: Side inputs.<\/li><li>Handling stream data.<\/li><li>GCP Reference 
architecture.<\/li><\/ul><h5>Module 7: Getting started with Machine Learning<\/h5><ul>\n<li>What is machine learning (ML).<\/li><li>Effective ML: concepts, types.<\/li><li>ML datasets: generalization.<\/li><li>Lab: Explore and create ML datasets.<\/li><\/ul><h5>Module 8: Building ML models with Tensorflow<\/h5><ul>\n<li>Getting started with TensorFlow.<\/li><li>Lab: Using tf.learn.<\/li><li>TensorFlow graphs and loops + lab.<\/li><li>Lab: Using low-level TensorFlow + early stopping.<\/li><li>Monitoring ML training.<\/li><li>Lab: Charts and graphs of TensorFlow training.<\/li><\/ul><h5>Module 9: Scaling ML models with CloudML<\/h5><ul>\n<li>Why Cloud ML?<\/li><li>Packaging up a TensorFlow model.<\/li><li>End-to-end training.<\/li><li>Lab: Run a ML model locally and on cloud.<\/li><\/ul><h5>Module 10: Feature Engineering<\/h5><ul>\n<li>Creating good features.<\/li><li>Transforming inputs.<\/li><li>Synthetic features.<\/li><li>Preprocessing with Cloud ML.<\/li><li>Lab: Feature engineering.<\/li><\/ul><h5>Module 11: Architecture of streaming analytics pipelines<\/h5><ul>\n<li>Stream data processing: Challenges.<\/li><li>Handling variable data volumes.<\/li><li>Dealing with unordered\/late data.<\/li><li>Lab: Designing streaming pipeline.<\/li><\/ul><h5>Module 12: Ingesting Variable Volumes<\/h5><ul>\n<li>What is Cloud Pub\/Sub?<\/li><li>How it works: Topics and Subscriptions.<\/li><li>Lab: Simulator.<\/li><\/ul><h5>Module 13: Implementing streaming pipelines<\/h5><ul>\n<li>Challenges in stream processing.<\/li><li>Handle late data: watermarks, triggers, accumulation.<\/li><li>Lab: Stream data processing pipeline for live traffic data.<\/li><\/ul><h5>Module 14: Streaming analytics and dashboards<\/h5><ul>\n<li>Streaming analytics: from data to decisions.<\/li><li>Querying streaming data with BigQuery.<\/li><li>What is Google Data Studio?<\/li><li>Lab: build a real-time dashboard to visualize processed data.<\/li><\/ul><h5>Module 15: High throughput and low-latency with 
Bigtable<\/h5><ul>\n<li>What is Cloud Spanner?<\/li><li>Designing Bigtable schema.<\/li><li>Ingesting into Bigtable.<\/li><li>Lab: streaming into Bigtable.<\/li><\/ul>","summary":"<p>This four-day instructor-led class provides participants with a hands-on introduction to designing and building data processing systems on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants will learn how to design data processing systems, build end-to-end data pipelines, analyze data and carry out machine learning. The course covers structured, unstructured, and streaming data.<\/p>","objective_plain":"This course teaches participants the following skills:\n\n\n- Design and build data processing systems on Google Cloud Platform\n- Process batch and streaming data by implementing autoscaling data pipelines on Cloud Dataflow\n- Derive business insights from extremely large datasets using Google BigQuery\n- Train, evaluate, and predict with machine learning models using TensorFlow and Cloud ML\n- Leverage unstructured data using Spark and ML APIs on Cloud Dataproc\n- Enable instant insights from streaming data","essentials_plain":"To get the most out of this course, participants should have:\n\n\n- Completed Google Cloud Fundamentals: Big Data and Machine Learning (GCF-BDM) course OR have equivalent experience\n- Basic proficiency with a common query language such as SQL\n- Experience with data modeling and extract, transform, load (ETL) activities\n- Experience developing applications using a common programming language such as Python\n- Familiarity with machine learning and\/or statistics","audience_plain":"This class is intended for experienced developers who are responsible for managing big data transformations, including:\n\n\n- Extracting, loading, transforming, cleaning, and validating data\n- Designing pipelines and architectures for data processing\n- Creating and maintaining machine learning and statistical models\n- Querying datasets, visualizing query 
results and creating reports","contents_plain":"Module 1: Google Cloud Dataproc Overview\n\n\n- Creating and managing clusters.\n- Leveraging custom machine types and preemptible worker nodes.\n- Scaling and deleting Clusters.\n- Lab: Creating Hadoop Clusters with Google Cloud Dataproc.\nModule 2: Running Dataproc Jobs\n\n\n- Running Pig and Hive jobs.\n- Separation of storage and compute.\n- Lab: Running Hadoop and Spark Jobs with Dataproc.\n- Lab: Submit and monitor jobs.\nModule 3: Integrating Dataproc with Google Cloud Platform\n\n\n- Customize cluster with initialization actions.\n- BigQuery Support.\n- Lab: Leveraging Google Cloud Platform Services.\nModule 4: Making Sense of Unstructured Data with Google\u2019s Machine Learning APIs\n\n\n- Google\u2019s Machine Learning APIs.\n- Common ML Use Cases.\n- Invoking ML APIs.\n- Lab: Adding Machine Learning Capabilities to Big Data Analysis.\nModule 5: Serverless data analysis with BigQuery\n\n\n- What is BigQuery.\n- Queries and Functions.\n- Lab: Writing queries in BigQuery.\n- Loading data into BigQuery.\n- Exporting data from BigQuery.\n- Lab: Loading and exporting data.\n- Nested and repeated fields.\n- Querying multiple tables.\n- Lab: Complex queries.\n- Performance and pricing.\nModule 6: Serverless, autoscaling data pipelines with Dataflow\n\n\n- The Beam programming model.\n- Data pipelines in Beam Python.\n- Data pipelines in Beam Java.\n- Lab: Writing a Dataflow pipeline.\n- Scalable Big Data processing using Beam.\n- Lab: MapReduce in Dataflow.\n- Incorporating additional data.\n- Lab: Side inputs.\n- Handling stream data.\n- GCP Reference architecture.\nModule 7: Getting started with Machine Learning\n\n\n- What is machine learning (ML).\n- Effective ML: concepts, types.\n- ML datasets: generalization.\n- Lab: Explore and create ML datasets.\nModule 8: Building ML models with Tensorflow\n\n\n- Getting started with TensorFlow.\n- Lab: Using tf.learn.\n- TensorFlow graphs and loops + lab.\n- Lab: Using 
low-level TensorFlow + early stopping.\n- Monitoring ML training.\n- Lab: Charts and graphs of TensorFlow training.\nModule 9: Scaling ML models with CloudML\n\n\n- Why Cloud ML?\n- Packaging up a TensorFlow model.\n- End-to-end training.\n- Lab: Run a ML model locally and on cloud.\nModule 10: Feature Engineering\n\n\n- Creating good features.\n- Transforming inputs.\n- Synthetic features.\n- Preprocessing with Cloud ML.\n- Lab: Feature engineering.\nModule 11: Architecture of streaming analytics pipelines\n\n\n- Stream data processing: Challenges.\n- Handling variable data volumes.\n- Dealing with unordered\/late data.\n- Lab: Designing streaming pipeline.\nModule 12: Ingesting Variable Volumes\n\n\n- What is Cloud Pub\/Sub?\n- How it works: Topics and Subscriptions.\n- Lab: Simulator.\nModule 13: Implementing streaming pipelines\n\n\n- Challenges in stream processing.\n- Handle late data: watermarks, triggers, accumulation.\n- Lab: Stream data processing pipeline for live traffic data.\nModule 14: Streaming analytics and dashboards\n\n\n- Streaming analytics: from data to decisions.\n- Querying streaming data with BigQuery.\n- What is Google Data Studio?\n- Lab: build a real-time dashboard to visualize processed data.\nModule 15: High throughput and low-latency with Bigtable\n\n\n- What is Cloud Spanner?\n- Designing Bigtable schema.\n- Ingesting into Bigtable.\n- Lab: streaming into Bigtable.","outline_plain":"Module 1: Google Cloud Dataproc Overview\n\n\n- Creating and managing clusters.\n- Leveraging custom machine types and preemptible worker nodes.\n- Scaling and deleting Clusters.\n- Lab: Creating Hadoop Clusters with Google Cloud Dataproc.\nModule 2: Running Dataproc Jobs\n\n\n- Running Pig and Hive jobs.\n- Separation of storage and compute.\n- Lab: Running Hadoop and Spark Jobs with Dataproc.\n- Lab: Submit and monitor jobs.\nModule 3: Integrating Dataproc with Google Cloud Platform\n\n\n- Customize cluster with initialization actions.\n- BigQuery 
Support.\n- Lab: Leveraging Google Cloud Platform Services.\nModule 4: Making Sense of Unstructured Data with Google\u2019s Machine Learning APIs\n\n\n- Google\u2019s Machine Learning APIs.\n- Common ML Use Cases.\n- Invoking ML APIs.\n- Lab: Adding Machine Learning Capabilities to Big Data Analysis.\nModule 5: Serverless data analysis with BigQuery\n\n\n- What is BigQuery.\n- Queries and Functions.\n- Lab: Writing queries in BigQuery.\n- Loading data into BigQuery.\n- Exporting data from BigQuery.\n- Lab: Loading and exporting data.\n- Nested and repeated fields.\n- Querying multiple tables.\n- Lab: Complex queries.\n- Performance and pricing.\nModule 6: Serverless, autoscaling data pipelines with Dataflow\n\n\n- The Beam programming model.\n- Data pipelines in Beam Python.\n- Data pipelines in Beam Java.\n- Lab: Writing a Dataflow pipeline.\n- Scalable Big Data processing using Beam.\n- Lab: MapReduce in Dataflow.\n- Incorporating additional data.\n- Lab: Side inputs.\n- Handling stream data.\n- GCP Reference architecture.\nModule 7: Getting started with Machine Learning\n\n\n- What is machine learning (ML).\n- Effective ML: concepts, types.\n- ML datasets: generalization.\n- Lab: Explore and create ML datasets.\nModule 8: Building ML models with Tensorflow\n\n\n- Getting started with TensorFlow.\n- Lab: Using tf.learn.\n- TensorFlow graphs and loops + lab.\n- Lab: Using low-level TensorFlow + early stopping.\n- Monitoring ML training.\n- Lab: Charts and graphs of TensorFlow training.\nModule 9: Scaling ML models with CloudML\n\n\n- Why Cloud ML?\n- Packaging up a TensorFlow model.\n- End-to-end training.\n- Lab: Run a ML model locally and on cloud.\nModule 10: Feature Engineering\n\n\n- Creating good features.\n- Transforming inputs.\n- Synthetic features.\n- Preprocessing with Cloud ML.\n- Lab: Feature engineering.\nModule 11: Architecture of streaming analytics pipelines\n\n\n- Stream data processing: Challenges.\n- Handling variable data volumes.\n- Dealing 
with unordered\/late data.\n- Lab: Designing streaming pipeline.\nModule 12: Ingesting Variable Volumes\n\n\n- What is Cloud Pub\/Sub?\n- How it works: Topics and Subscriptions.\n- Lab: Simulator.\nModule 13: Implementing streaming pipelines\n\n\n- Challenges in stream processing.\n- Handle late data: watermarks, triggers, accumulation.\n- Lab: Stream data processing pipeline for live traffic data.\nModule 14: Streaming analytics and dashboards\n\n\n- Streaming analytics: from data to decisions.\n- Querying streaming data with BigQuery.\n- What is Google Data Studio?\n- Lab: build a real-time dashboard to visualize processed data.\nModule 15: High throughput and low-latency with Bigtable\n\n\n- What is Cloud Spanner?\n- Designing Bigtable schema.\n- Ingesting into Bigtable.\n- Lab: streaming into Bigtable.","summary_plain":"This four-day instructor-led class provides participants with a hands-on introduction to designing and building data processing systems on Google Cloud Platform. Through a combination of presentations, demos, and hands-on labs, participants will learn how to design data processing systems, build end-to-end data pipelines, analyze data and carry out machine learning. 
The course covers structured, unstructured, and streaming data.","skill_level":"Intermediate","version":"3.0","duration":{"unit":"d","value":4,"formatted":"4 jours"},"pricelist":{"List Price":{"IT":{"country":"IT","currency":"EUR","taxrate":20,"price":2600},"DE":{"country":"DE","currency":"EUR","taxrate":19,"price":2600},"NL":{"country":"NL","currency":"EUR","taxrate":21,"price":2695},"BE":{"country":"BE","currency":"EUR","taxrate":21,"price":2695},"AT":{"country":"AT","currency":"EUR","taxrate":20,"price":2600},"US":{"country":"US","currency":"USD","taxrate":null,"price":2495},"ES":{"country":"ES","currency":"EUR","taxrate":18,"price":1950},"SG":{"country":"SG","currency":"SGD","taxrate":8,"price":3450},"SE":{"country":"SE","currency":"EUR","taxrate":25,"price":2600},"AE":{"country":"AE","currency":"USD","taxrate":5,"price":2600},"CH":{"country":"CH","currency":"CHF","taxrate":8.1,"price":3380},"IN":{"country":"IN","currency":"USD","taxrate":12.36,"price":1500},"RU":{"country":"RU","currency":"RUB","taxrate":18,"price":221000},"IL":{"country":"IL","currency":"ILS","taxrate":17,"price":9020},"GR":{"country":"GR","currency":"EUR","taxrate":null,"price":1950},"MK":{"country":"MK","currency":"EUR","taxrate":null,"price":1950},"HU":{"country":"HU","currency":"EUR","taxrate":20,"price":1950},"SI":{"country":"SI","currency":"EUR","taxrate":20,"price":2600},"GB":{"country":"GB","currency":"GBP","taxrate":20,"price":2640},"CA":{"country":"CA","currency":"CAD","taxrate":null,"price":3445},"FR":{"country":"FR","currency":"EUR","taxrate":19.6,"price":2990}}},"lastchanged":"2025-11-18T18:18:14+01:00","parenturl":"https:\/\/portal.flane.ch\/swisscom\/fr\/json-courses","nexturl_course_schedule":"https:\/\/portal.flane.ch\/swisscom\/fr\/json-course-schedule\/18642","source_lang":"fr","source":"https:\/\/portal.flane.ch\/swisscom\/fr\/json-course\/google-degcp"}}