t.BA.DS.DE2.20HS (Data Engineering 2) 
Module: Data Engineering 2
This information was generated on: 29 March 2024
No.
t.BA.DS.DE2.20HS
Title
Data Engineering 2
Organised by
T InIT
Credits
4

Description

Version: 2.0 start 01 August 2021
 

Short description

Data Engineering topics are essential components of successful data products and data projects. Students learn the requirements for running successful data engineering pipelines, the key methods, and both the theoretical foundations and practical implementation of different methods and applications.

Module coordinator

Weiler Andreas (wele) (ad interim)

Learning objectives (competencies)

Objectives | Competences | Taxonomy levels
You understand the fundamentals and special characteristics of data engineering, especially in contrast to software development projects. | F, M | K1, K2
You know various data engineering topics, especially in the area of data transformation and delivery. | F, M | K1, K2, K3
You know the differences between data engineering for structured and unstructured data, and between batch and stream processing. | F, M | K1, K2, K3
You know the perspectives and opportunities of current research and development in the domain of data engineering. | F | K1

Module contents

The digitalization of processes and environments is a difficult challenge for computer scientists. The primary problem is not software development but rather the professional processing and analysis of different data types and volumes. For this purpose, a certain amount of fundamental experience in data engineering is essential.
This module gives you a practical introduction to elementary data engineering. The focus is on a good overview and a clean methodology; proofs and details of the methods are not part of this course and are expected to be discussed in later courses. Participants apply the lecture content in practice in several projects.

The following contents will be discussed in the module:

Storing Structured Data:  
  Relational  
  NoSQL
  Big Data / Hadoop etc.  

Transforming Data:
  ETL Jobs
  Schedulers
  Cleaning (noise removal, outlier detection, interpolation)
  Anonymization  
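
As an illustration of the cleaning topics listed above, the following minimal Python sketch flags outliers and then fills the resulting gaps by linear interpolation. It is not taken from the course material: the function names, the sample readings, and the choice of the interquartile-range (IQR) rule with a factor of 1.5 are assumptions made for this example only.

```python
from statistics import quantiles


def flag_outliers_iqr(values, k=1.5):
    """Return one boolean per value: True if it lies outside the IQR fences.

    Missing values (None) are ignored when computing the quartiles
    and are never flagged themselves.
    """
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [v is not None and not (low <= v <= high) for v in values]


def interpolate_linear(values):
    """Fill None gaps linearly between the nearest known neighbours.

    Gaps at the very start or end have only one neighbour and stay None.
    """
    result = list(values)
    known = [i for i, v in enumerate(result) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (result[right] - result[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = result[left] + step * (i - left)
    return result


# Hypothetical sensor readings with one missing value and one obvious spike.
readings = [20.1, 20.3, None, 20.9, 95.0, 21.3, 21.5]
flags = flag_outliers_iqr(readings)

# Treat flagged outliers as missing, then interpolate all gaps.
cleaned = interpolate_linear([None if f else v for v, f in zip(readings, flags)])
```

In this sketch the spike is replaced by a value interpolated from its neighbours rather than dropped, which keeps the series length unchanged. Whether that is appropriate depends on the dataset.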

Sourcing Data:
  APIs
  Web crawling, Web scraping 
  Other Sources  

Querying Data:
  Advanced Queries
  Query Optimization
  Distributed Queries  

Ingesting Data:
  Batch vs. Real-Time Data Streams
  Queues
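
The contrast between batch and stream ingestion can be sketched with the Python standard library alone. The example below is an assumption-laden illustration, not course material: the function names, the sample records, and the use of None as an end-of-stream marker are all hypothetical choices.

```python
import queue
import threading


def batch_mean(records):
    """Batch style: the full dataset is available before processing starts."""
    return sum(records) / len(records)


def stream_means(source):
    """Stream style: consume records one at a time from a queue.

    Yields the running mean after each record and stops when the
    end-of-stream marker (None, an assumption of this sketch) arrives.
    """
    count, total = 0, 0.0
    while True:
        item = source.get()
        if item is None:
            break
        count += 1
        total += item
        yield total / count


def produce(records, sink):
    """Simulate a data source that pushes records into a queue."""
    for record in records:
        sink.put(record)
    sink.put(None)  # signal end of stream


records = [4.0, 8.0, 6.0, 2.0]

q = queue.Queue()
producer = threading.Thread(target=produce, args=(records, q))
producer.start()
running = list(stream_means(q))  # an intermediate result exists after every record
producer.join()

print(running)              # running means, one per record
print(batch_mean(records))  # single result once all data is present
```

The queue decouples the producer from the consumer. The streaming consumer can report a result after every record, while the batch function produces a single result only after all data has arrived.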

Accompanying Assignments
The lecture is accompanied by practical assignments consisting of implementations in Python and related tools and libraries, using real-world datasets. For about half of the semester, students work in small groups on individual projects, which are presented at the end of the semester.

Teaching materials

  • Slides of the lecture
  • Additional material to the practical assignments

Supplementary literature

tbd

Prerequisites

tbd

Teaching language

(X) German ( ) English

Part of International Profile

( ) Yes (X) No

Module structure

Type 3a
  For more details please click on this link: T_CL_Modulauspraegungen_SM2025

Exams

Description | Type | Form | Scope | Grade | Weighting
Graded assignments during teaching semester | practical exercises (with mark) | written | 2 practical exercises | max. 20 points | 20%
End-of-semester exam | exam | written | 90 minutes | max. 80 points | 80%

Remarks

 

Legal basis

The module description is part of the legal basis in addition to the general academic regulations. It is binding. During the first week of the semester a written and communicated supplement can specify the module description in more detail.
Course: Operating Systems - Praktikum
No.
t.BA.DS.DE2.20HS.P
Title
Operating Systems - Praktikum

Note

  • No module description is available in the system for the cut-off date of 02 August 2099.
Course: Data Engineering 2 - Vorlesung
No.
t.BA.DS.DE2.20HS.V
Title
Data Engineering 2 - Vorlesung

Note

  • No module description is available in the system for the cut-off date of 02 August 2099.