Tech Overview Part 2: Onboard

Content
7 modules

Difficulty
Intermediate

Course Length
120 mins

Instructor
Leo Schuman

Released
13 Oct 2021

Price
Free

Description

This course is designed for technical audiences who have some experience with enterprise data management but are new to Infoworks. The material and quiz questions will help you understand Infoworks and begin ingesting data into your target compute environment and storage location.

Objectives

Introduction to Onboarding

  • List and describe metrics identifying the value of automated data onboarding
  • Describe the role of historic and ongoing ingestion processes
  • List change data capture (CDC) methods supported by Infoworks
  • Describe the purpose of segmented ingestion 
  • List categories and examples of source types Infoworks can ingest
  • Distinguish Infoworks behavior when crawling vs ingesting a source

Source configuration and full default ingestion

  • Define a new data source
  • Configure a data source for ingestion, including connection URL, credentials, and schema
  • Configure and describe the behavior of a Full Refresh ingestion using default settings
  • Define and launch an ingestion job using a specified compute cluster
  • View and download ingestion job logs
  • Enhance the data catalog with source- and table-specific metadata

Table-specific configured ingestion

  • Truncate ingested data to enable reconfiguration of a prior ingestion
  • Track current and historic jobs using Job Queue and Job Search views
  • Define reformatting to be applied to data during ingestion
  • Locate sample data and table-specific audit records
  • Define target table name and storage format
  • Implement a partition column to optimize later query parallelization
  • Implement a split-by column to optimize ingestion with parallel workers
  • Describe and implement a filter to ingest only a subset of the source table
  • Locate and export ingestion metrics
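
As a rough illustration of the split-by idea named above, the sketch below divides a numeric column's value range into contiguous chunks, one per parallel worker. It is plain Python, not Infoworks code; the function name `split_ranges` and the assumption of an integer split-by column with known minimum and maximum values are hypothetical, chosen only for the example.

```python
def split_ranges(min_value: int, max_value: int, workers: int) -> list[tuple[int, int]]:
    """Divide the inclusive range [min_value, max_value] into contiguous chunks.

    Hypothetical helper: each chunk is the slice of split-by column values
    that one parallel worker would read.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    if max_value < min_value:
        raise ValueError("max_value must not be smaller than min_value")

    total = max_value - min_value + 1
    base, extra = divmod(total, workers)

    ranges = []
    start = min_value
    for i in range(workers):
        # The first `extra` workers take one additional value each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            break  # more workers than values; remaining workers get nothing
        ranges.append((start, start + size - 1))
        start += size
    return ranges


chunks = split_ranges(1, 10, 3)
```

Here `chunks` holds three non-overlapping ranges that together cover the values 1 through 10, so each worker could read its own slice of the source table independently.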

Incremental ingestion

  • Distinguish full refresh vs incremental ingestion
  • Distinguish and describe merge vs append mode in incremental ingestion
  • Identify and describe the role of natural keys in incremental merge ingestion
  • Identify and describe the role of a watermark column in incremental ingestion
  • Describe and implement a table group from an ingestion job
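
The difference between append and merge mode, and the roles of the watermark column and natural key, can be sketched in a few lines of plain Python. This is a conceptual model only, not Infoworks code or its API; the function names, the `id` natural key, and the `updated_at` watermark column are assumptions made for the example.

```python
from typing import Any

Row = dict[str, Any]


def select_increment(source: list[Row], watermark_col: str, last_watermark: Any) -> list[Row]:
    """Return only source rows whose watermark value is newer than the last ingested one."""
    return [row for row in source if row[watermark_col] > last_watermark]


def append_rows(target: list[Row], increment: list[Row]) -> list[Row]:
    """Append mode: incoming rows are added; existing target rows are left untouched."""
    return target + increment


def merge_rows(target: list[Row], increment: list[Row], natural_key: str) -> list[Row]:
    """Merge mode: rows matching on the natural key are replaced, all others are inserted."""
    merged = {row[natural_key]: row for row in target}
    for row in increment:
        merged[row[natural_key]] = row
    return list(merged.values())


# Target as it stands after an earlier ingestion (hypothetical data).
target = [
    {"id": 1, "status": "new", "updated_at": 10},
    {"id": 2, "status": "new", "updated_at": 11},
]
# Source now: row 2 was updated and row 3 was added.
source = [
    {"id": 1, "status": "new", "updated_at": 10},
    {"id": 2, "status": "shipped", "updated_at": 15},
    {"id": 3, "status": "new", "updated_at": 16},
]

increment = select_increment(source, "updated_at", last_watermark=11)
appended = append_rows(target, increment)
merged = merge_rows(target, increment, natural_key="id")
```

In this model, append leaves both versions of row 2 in the target, while merge keeps a single, current row per natural key.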

Structured file (CSV) ingestion

  • Identify and describe source file locations accessible by Infoworks
  • Configure a secure cloud storage source for structured file ingestion
  • Define source base and relative pathing for accessing structured files
  • Define and configure source file format and file name patterns
  • Define a table to be created in storage from ingested file data
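
As a loose sketch of how a base path, a relative path, and a file name pattern combine to select source files, consider the following plain Python. It is not Infoworks code; the helper name `resolve_matches`, the example paths, and the use of glob-style patterns (rather than whatever pattern syntax the product actually accepts) are all assumptions made for illustration.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def resolve_matches(base_path: str, relative_path: str, pattern: str, file_names: list[str]) -> list[str]:
    """Join the base and relative paths, then keep file names matching the glob pattern."""
    folder = PurePosixPath(base_path) / relative_path
    return [str(folder / name) for name in file_names if fnmatch(name, pattern)]


# Hypothetical listing of one folder in a storage location.
files = ["orders_2021_10.csv", "orders_2021_11.csv", "readme.txt"]
matches = resolve_matches("/landing/sales", "orders", "orders_*.csv", files)
```

Only the two CSV files match the pattern, so `readme.txt` would be excluded from the table built from this folder.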

 

Certificate

By completing and passing this course, you will earn the PTC Course Certificate.

1. Introduction to Data Onboarding
2. Tutorial - Source configuration and full default ingestion
3. Tutorial - Table specific configured ingestion
4. Tutorial - Incremental ingestion
5. Tutorial - Structured file (CSV) ingestion
6. Quiz - Ingestion
7. End of Course Survey
