<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE FL_Course SYSTEM "https://www.flane.de/dtd/fl_course095.dtd"><?xml-stylesheet type="text/xsl" href="https://portal.flane.ch/css/xml-course.xsl"?><course productid="34489" language="en" source="https://portal.flane.ch/swisscom/en/xml-course/nvidia-edsoew" lastchanged="2025-07-29T12:18:27+02:00" parent="https://portal.flane.ch/swisscom/en/xml-courses"><title>Enhancing Data Science Outcomes With Efficient Workflow</title><productcode>EDSOEW</productcode><vendorcode>NV</vendorcode><vendorname>Nvidia</vendorname><fullproductcode>NV-EDSOEW</fullproductcode><version>1.0</version><objective>&lt;ul&gt;
&lt;li&gt;Develop and deploy an accelerated end-to-end data processing pipeline for large datasets&lt;/li&gt;&lt;li&gt;Scale data science workflows using distributed computing&lt;/li&gt;&lt;li&gt;Perform DataFrame transformations that take advantage of hardware acceleration and avoid hidden slowdowns&lt;/li&gt;&lt;li&gt;Enhance machine learning solutions through feature engineering and rapid experimentation&lt;/li&gt;&lt;li&gt;Improve data processing pipeline performance by optimizing memory management and hardware utilization&lt;/li&gt;&lt;/ul&gt;</objective><essentials>&lt;ul&gt;
&lt;li&gt;Basic knowledge of a standard data science workflow on tabular data. To gain an adequate understanding, we recommend this article.&lt;/li&gt;&lt;li&gt;Knowledge of distributed computing using Dask. To gain an adequate understanding, we recommend the &amp;ldquo;Get Started&amp;rdquo; guide from Dask.&lt;/li&gt;&lt;li&gt;Completion of the DLI&amp;rsquo;s Fundamentals of Accelerated Data Science course or an ability to manipulate data using cuDF and some experience building machine learning models using cuML.&lt;/li&gt;&lt;/ul&gt;</essentials><outline>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;	
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Meet the instructor.&lt;/li&gt;&lt;li&gt;Create an account at courses.nvidia.com/join&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Advanced Extract, Transform, and Load (ETL)&lt;/strong&gt;	
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Learn to process large volumes of data efficiently for downstream analysis:&lt;ul&gt;
&lt;li&gt;Discuss current challenges of growing data sizes.&lt;/li&gt;&lt;li&gt;Perform ETL efficiently on large datasets.&lt;/li&gt;&lt;li&gt;Discuss hidden slowdowns and perform DataFrame transformations properly.&lt;/li&gt;&lt;li&gt;Discuss diagnostic tools to monitor and optimize hardware utilization.&lt;/li&gt;&lt;li&gt;Persist data in a way that&amp;rsquo;s conducive to downstream analytics.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Training on Multiple GPUs With PyTorch Distributed Data Parallel (DDP)&lt;/strong&gt;	
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Learn how to improve data analysis on large datasets:&lt;ul&gt;
&lt;li&gt;Build and compare classification models.&lt;/li&gt;&lt;li&gt;Perform feature selection based on predictive power of new and existing features.&lt;/li&gt;&lt;li&gt;Perform hyperparameter tuning.&lt;/li&gt;&lt;li&gt;Create embeddings using deep learning and clustering on embeddings.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Deployment&lt;/strong&gt;	
&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Learn how to deploy and measure the performance of an accelerated data processing pipeline:&lt;ul&gt;
&lt;li&gt;Deploy a data processing pipeline with Triton Inference Server.&lt;/li&gt;&lt;li&gt;Discuss various tuning parameters to optimize performance.&lt;/li&gt;&lt;/ul&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;&lt;strong&gt;Assessment and Q&amp;amp;A&lt;/strong&gt;&lt;/p&gt;</outline><objective_plain>- Develop and deploy an accelerated end-to-end data processing pipeline for large datasets
- Scale data science workflows using distributed computing
- Perform DataFrame transformations that take advantage of hardware acceleration and avoid hidden slowdowns
- Enhance machine learning solutions through feature engineering and rapid experimentation
- Improve data processing pipeline performance by optimizing memory management and hardware utilization</objective_plain><essentials_plain>- Basic knowledge of a standard data science workflow on tabular data. To gain an adequate understanding, we recommend this article.
- Knowledge of distributed computing using Dask. To gain an adequate understanding, we recommend the “Get Started” guide from Dask.
- Completion of the DLI’s Fundamentals of Accelerated Data Science course or an ability to manipulate data using cuDF and some experience building machine learning models using cuML.</essentials_plain><outline_plain>Introduction

- Meet the instructor.
- Create an account at courses.nvidia.com/join
Advanced Extract, Transform, and Load (ETL)

- Learn to process large volumes of data efficiently for downstream analysis:
  - Discuss current challenges of growing data sizes.
  - Perform ETL efficiently on large datasets.
  - Discuss hidden slowdowns and perform DataFrame transformations properly.
  - Discuss diagnostic tools to monitor and optimize hardware utilization.
  - Persist data in a way that’s conducive to downstream analytics.
Training on Multiple GPUs With PyTorch Distributed Data Parallel (DDP)

- Learn how to improve data analysis on large datasets:
  - Build and compare classification models.
  - Perform feature selection based on predictive power of new and existing features.
  - Perform hyperparameter tuning.
  - Create embeddings using deep learning and clustering on embeddings.
Deployment

- Learn how to deploy and measure the performance of an accelerated data processing pipeline:
  - Deploy a data processing pipeline with Triton Inference Server.
  - Discuss various tuning parameters to optimize performance.
Assessment and Q&amp;A</outline_plain><duration unit="d" days="0">0.5 days</duration><pricelist><price country="US" currency="USD">500.00</price><price country="DE" currency="EUR">500.00</price><price country="AT" currency="EUR">500.00</price><price country="SE" currency="EUR">500.00</price><price country="SI" currency="EUR">500.00</price><price country="GB" currency="GBP">420.00</price><price country="IT" currency="EUR">500.00</price><price country="CA" currency="CAD">690.00</price></pricelist><miles/></course>