t.BA.DS.DE1.20HS (Data Engineering 1) 
Module: Data Engineering 1
This information was generated on: 27 September 2021
No.
t.BA.DS.DE1.20HS
Title
Data Engineering 1
Organised by
T InIT
Credits
4

Description

Version: 2.0 start 01 August 2020
 

Short description

The field of "Data Engineering" covers the crucial steps from the acquisition of raw data to making the validated, cleaned data available for exploitation. The "Data Engineering 1" module covers the basics of this field and the handling of unstructured data.

Module coordinator

Braschler, Martin (bram) (ad interim)

Learning objectives (competencies)

Objectives | Competences | Taxonomy levels
You know the basics of Data Engineering | F | K1
You understand how data pipelines are used for acquiring, transforming and cleaning raw data, and you know how to design and implement such pipelines | F | K2, K3
You know how information retrieval (IR) is used to leverage unstructured data | F | K2
You can use named entity recognition and mechanisms such as knowledge graphs to bridge the gap between unstructured and structured data | F | K3

Module contents

We live in a world in which the collection, transformation and exploitation of data are more central than ever. The field of "Data Engineering" covers the crucial steps from the acquisition of raw data to making the validated, cleaned data available for exploitation, such as interpretation, learning and visual rendering. The module "Data Engineering 1" covers the basics of the field and the handling of unstructured data.

1. Introduction (3 weeks)
- What is Data Engineering?
- Data Engineering in the broader context of Data Science
- Data (Processing) Pipelines
- Different forms of data: Big Data, Small Data, Smart Data, ...

2. Working with data (4 weeks)
- Data formats and file formats (XML, JSON, CSV, ...)
- Navigating XML/JSON data (XPath, JSONPath)
- Tools
- Structured vs. unstructured data
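
To illustrate the "Navigating XML/JSON data" topic, here is a minimal sketch in Python. The sample records and the function names are invented for this illustration and are not taken from the course materials. The standard library's ElementTree supports only a subset of XPath, and JSONPath is not part of the standard library, so plain dictionary access stands in for it here.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for this illustration.
XML_DOC = """
<readings>
  <reading sensor="a1"><value>21.5</value></reading>
  <reading sensor="b2"><value>19.0</value></reading>
</readings>
"""
JSON_DOC = (
    '{"readings": [{"sensor": "a1", "value": 21.5},'
    ' {"sensor": "b2", "value": 19.0}]}'
)


def xml_values(doc: str, sensor: str) -> list[float]:
    """Return the <value> contents of all readings with the given sensor id."""
    root = ET.fromstring(doc)
    # ElementTree implements only a limited subset of XPath;
    # attribute predicates like [@sensor='a1'] are part of that subset.
    path = f".//reading[@sensor='{sensor}']/value"
    return [float(node.text) for node in root.findall(path)]


def json_values(doc: str, sensor: str) -> list[float]:
    """Return the values of all readings with the given sensor id."""
    data = json.loads(doc)
    # Plain traversal; a JSONPath library would express this as a path query.
    return [r["value"] for r in data["readings"] if r["sensor"] == sensor]
```

Both functions answer the same question against the two formats, which makes the structural difference between a path query (XML) and manual traversal (JSON) visible.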

3. Handling unstructured data (7 weeks)
- Information Retrieval (IR) basics
- Indexing Pipelines
- Named Entity Recognition
- Vector Space Model, Probabilistic Retrieval Model
- Knowledge Graphs as a bridge to structured data
- IR Evaluation 
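
As an illustration of the Vector Space Model listed above, the following is a minimal sketch in Python. It assumes whitespace tokenization and a simple TF-IDF weighting with `log(N/df) + 1`; these choices, the function names and the sample documents are assumptions made for this example and are not prescribed by the module.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def build_index(docs):
    """Return (idf, vectors): IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter(term for toks in tokenized for term in set(toks))
    # Assumed IDF variant: log(N / df) + 1, so frequent terms keep some weight.
    idf = {t: math.log(n / c) + 1.0 for t, c in df.items()}
    vectors = [
        {t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized
    ]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return indices of matching documents, best match first."""
    idf, vectors = build_index(docs)
    # Query terms unknown to the collection are ignored.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0.0]
```

For example, `rank("knowledge graphs", docs)` returns the indices of the documents that share terms with the query, ordered by cosine similarity; documents without any overlap are left out.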

 

Teaching materials

Set of slides

Supplementary literature

 

Prerequisites

 

Teaching language

(X) German ( ) English

Part of International Profile

( ) Yes (X) No

Module structure

Type 3a
  For more details, see: T_CL_Modulauspraegungen_SM2025

Exams

Description | Type | Form | Scope | Grade | Weighting
Graded assignments during teaching semester | practical exercises (graded) | written, at home | 3 practical exercises | | 20%
End-of-semester exam | | written | 90 minutes | | 80%

Remarks

 

Legal basis

The module description is part of the legal basis in addition to the general academic regulations. It is binding. During the first week of the semester a written and communicated supplement can specify the module description in more detail.
Course: Data Engineering 1 - Praktikum
No.
t.BA.DS.DE1.20HS.P
Title
Data Engineering 1 - Praktikum

Note

  • No module description is available in the system for the cut-off date of 02 August 2099.
Course: Data Engineering 1 - Vorlesung
No.
t.BA.DS.DE1.20HS.V
Title
Data Engineering 1 - Vorlesung

Note

  • No module description is available in the system for the cut-off date of 02 August 2099.