GIOVANNI SIMONINI

Associate Professor
Department of Engineering "Enzo Ferrari"
SUBSTITUTE LECTURER
Department of Economics "Marco Biagi"

Course: Big Data Management and Governance

Computer Engineering (2023 course catalogue)

Learning objectives

To provide fundamental notions of database management and architecture, management of massive data, and big data quality and integration.
To provide advanced skills in database system implementation and large-scale data management systems.

Prerequisites

The courses of Database and Lab. of the Computer Engineering degree at the "Enzo Ferrari" Engineering Department in Modena.

For students coming from other universities: basics of SQL and relational DBMSs.

Course syllabus

Relational Database Programming and Advanced Database Modelling (1.5 CFU)
- Advanced SQL
- Views, CTE, recursive queries
- SQL in big data: SparkSQL
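
As an illustration of the recursive queries listed above (a minimal sketch, not official course material), the following uses Python's built-in `sqlite3` module to walk a management hierarchy with a recursive CTE. The `employee` table, its columns, the sample rows, and the `reports_of` helper are all hypothetical, chosen only for this example.

```python
import sqlite3


def reports_of(conn, manager):
    """Return (name, depth) for every direct and indirect report of `manager`."""
    query = """
        WITH RECURSIVE chain(name, depth) AS (
            -- Anchor member: direct reports of the given manager.
            SELECT name, 1 FROM employee WHERE manager = ?
            UNION ALL
            -- Recursive member: reports of people already in the chain.
            SELECT e.name, c.depth + 1
            FROM employee AS e
            JOIN chain AS c ON e.manager = c.name
        )
        SELECT name, depth FROM chain ORDER BY depth, name
    """
    return conn.execute(query, (manager,)).fetchall()


# Hypothetical sample data: ada manages bob and cy, bob manages dee.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, manager TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("ada", None), ("bob", "ada"), ("cy", "ada"), ("dee", "bob")],
)

print(reports_of(conn, "ada"))  # [('bob', 1), ('cy', 1), ('dee', 2)]
```

The anchor member seeds the result with direct reports, and the recursive member keeps joining until no new rows appear.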

Database System Implementation (3.0 CFU)
- Database Management Systems
- How to store the data: Storage and management of very large amounts of data on secondary memory devices
- How to access the data: Primary and secondary file indexing techniques, hash indexing
- How to retrieve information: Query Execution, the Query Compiler & Optimizer
- How to perform transactions: Concurrency Control, Serializability and Recoverability

Management of Massive Data (4.0 CFU)
- Probabilistic data structures: Bloom filters, count-min sketch, HyperLogLog, locality-sensitive hashing
- Distributed databases and distributed transaction processing: two-phase commit, from ACID to BASE properties
- Data preparation and integration:
  - Data Lakes, ETL/ELT, Data Discovery
  - Data fusion: Entity Resolution
- Architectures and technologies for Big Data management (MapReduce, Hadoop, Spark); NoSQL data models and DBMSs; cloud DBMSs
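
As an illustration of the probabilistic data structures listed in this module (a minimal sketch, not official course material), a Bloom filter can be written in a few lines of Python. The class name, the default sizes, and the choice of seeded SHA-256 digests as hash functions are assumptions made for this example only.

```python
import hashlib


class BloomFilter:
    """Set membership with possible false positives but no false negatives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive num_hashes bit positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True only if every bit for this item is set; may be a false positive.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


bf = BloomFilter()
bf.add("spark")
bf.add("hadoop")
print("spark" in bf)  # True: added items are always reported as present
```

An item that was never added is usually reported as absent, but it can occasionally be reported as present; the false-positive rate depends on the number of bits, the number of hash functions, and the number of inserted items.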

Data Governance (0.5 CFU)
- General notions on architectures, policies, practices and procedures to properly manage the full data life cycle of an enterprise
- General notions on Data Security and Quality Management

Teaching methods

Lectures, practical exercises, laboratory activities.

Reference texts

- Course slides
- A. Silberschatz, H. F. Korth, S. Sudarshan: "Database System Concepts", 7th edition, McGraw-Hill, ISBN 9780078022159
- D. Beneventano, S. Bergamaschi, F. Guerra, M. Vincini: "Progetto di Basi di Dati Relazionali: lezioni ed esercizi", Pitagora Editrice, Bologna, 2007 edition (in Italian)

Assessment

Test on Advanced SQL [30%]
- Advanced SQL
- Stored Procedures
- Triggers

Group project [30%]
- Project on topics proposed by the instructors, or by the candidates upon approval
- Output: Short paper
- Evaluation: Discussion **during the oral examination**

Oral examination [40%]
- Questions on the entire program: theory and whiteboard exercises

Expected learning outcomes

Knowledge and understanding: Through lectures, students will gain deep knowledge and understanding of relational technology, including its implementation techniques, as well as the basics of distributed databases.
Applying knowledge and understanding: Through classroom exercises and practical computer exercises, students will be able to use the advanced features of the standard language for DBMSs and to apply the knowledge gained to the design and implementation of distributed databases.
Making judgments: By solving individual exercises and practical laboratory exercises, students will be able to critically evaluate the design and implementation choices made and the results obtained.
Communication skills: The oral exam, which includes an in-depth topic of the student's choice, will prepare students to organize and clearly present the results of their work using appropriate technical language.
Learning skills: The activities carried out during the course and the examination give students the tools to update their knowledge independently. This is especially important in the field of advanced data management, where technology is constantly evolving.