GIOVANNI SIMONINI
Associate Professor, Dipartimento di Ingegneria "Enzo Ferrari"; Substitute Lecturer, Dipartimento di Economia "Marco Biagi"
Course: Big Data Management and Governance
Computer Engineering (D.M.270/04) (2023 course catalogue)
Learning objectives
To provide fundamental notions of: database management & architecture, management of massive data, and big data quality and integration.
To provide advanced skills in: database system implementation; large-scale data management systems.
Prerequisites
The Database and Lab. courses of the Computer Engineering degree at the "Enzo Ferrari" Engineering Department in Modena.
For students coming from other universities: basics of SQL and relational DBMSs.
Course syllabus
Relational Database Programming and Advanced Database Modelling (1.5 CFU)
- Advanced SQL
- Views, CTE, recursive queries
- SQL in big data: SparkSQL
Database System Implementation (3.0 CFU)
- Database Management Systems
- How to store the data: Storage and management of very large amounts of data on secondary memory devices
- How to access the data: Primary and secondary file indexing techniques, hash indexing
- How to retrieve information: Query Execution, the Query Compiler & Optimizer
- How to perform transactions: Concurrency Control, Serializability and Recoverability
Management of Massive Data (4 CFU)
- Probabilistic data structures: Bloom filters, count-min sketch, HyperLogLog, locality-sensitive hashing
- Distributed databases and distributed transaction processing: two-phase commit, from ACID to BASE properties
- Data preparation and integration:
  - Data Lakes, ETL/ELT, Data Discovery
  - Data fusion: Entity Resolution
- Architectures and Technologies for Big Data Management (MapReduce, Hadoop, Spark); NoSQL data models and DBMSs; Cloud DBMSs
Data Governance (0.5 CFU)
- General notions on the architectures, policies, practices and procedures needed to properly manage the full data life cycle of an enterprise
- General notions on Data Security and Quality Management
Teaching methods
Lectures, practical exercises, laboratory activities.
Reference texts
- Course slides
- A. Silberschatz, H. F. Korth, S. Sudarshan: "Database System Concepts" – 7th edition – McGraw-Hill ISBN 9780078022159
- D. Beneventano, S. Bergamaschi, F. Guerra, M. Vincini: "Progetto di Basi di Dati Relazionali: lezioni ed esercizi" – Pitagora Editrice, Bologna (2007 edition). (in Italian)
Assessment
Test on Advanced SQL [30%]
- Advanced SQL
- Stored Procedures
- Triggers
Group project [30%]
- Project on topics proposed by the instructors, or by the candidates upon approval
- Output: Short paper
- Evaluation: Discussion **during the oral examination**
Oral examination [40%]
- Questions on the entire program: theory and whiteboard exercises
Expected learning outcomes
Knowledge and understanding: Through lectures, students will gain in-depth knowledge and understanding of relational technology, including its implementation techniques, as well as the basics of distributed databases.
Applying knowledge and understanding: Through classroom exercises and practical computer exercises, students will be able to use the advanced features of the standard DBMS language and to apply the knowledge gained to the design and implementation of distributed databases.
Making judgments: By solving individual exercises and practical laboratory exercises, students will be able to critically evaluate the design and implementation choices made and the results obtained.
Communication skills: The oral exam, which includes an in-depth topic of the student's choice, will train students to organize and clearly present the results of their work using appropriate technical language.
Learning skills: The activities carried out during the course and the examination allow students to acquire the tools to update their knowledge independently. This is especially crucial in the field of advanced data management, where technology is constantly evolving.