Nuova ricerca


Dipartimento di Ingegneria "Enzo Ferrari"

Home |


2022 - Big Data Integration for Data-Centric AI [Abstract in Atti di Convegno]
Bergamaschi, Sonia; Beneventano, Domenico; Simonini, Giovanni; Gagliardelli, Luca; Aslam, Adeel; De Sabbata, Giulio; Zecchini, Luca

Big data integration represents one of the main challenges for the use of techniques and tools based on Artificial Intelligence (AI) in several crucial areas: eHealth, energy management, enterprise data, etc. In this context, Data-Centric AI plays a primary role in guaranteeing the quality of the data on which these tools and techniques operate. Thus, the activities of the Database Research Group (DBGroup) of the “Enzo Ferrari” Engineering Department of the University of Modena and Reggio Emilia are moving in this direction. Therefore, we present the main research projects of the DBGroup, which are part of collaborations in various application sectors.

2022 - Progressive Entity Resolution with Node Embeddings [Relazione in Atti di Convegno]
Simonini, Giovanni; Gagliardelli, Luca; Rinaldi, Michele; Zecchini, Luca; De Sabbata, Giulio; Aslam, Adeel; Beneventano, Domenico; Bergamaschi, Sonia

Entity Resolution (ER) is the task of finding records that refer to the same real-world entity, which are called matches. ER is a fundamental pre-processing step when dealing with dirty and/or heterogeneous datasets; however, it can be very time-consuming when employing complex machine learning models to detect matches, as state-of-the-art ER methods do. Thus, when time is a critical component and having a partial ER result is better than having no result at all, progressive ER methods are employed to try to maximize the number of detected matches as a function of time. In this paper, we study how to perform progressive ER by exploiting graph embeddings. The basic idea is to represent candidate matches in a graph: each node is a record and each edge is a possible comparison to check—we build that on top of a well-known, established graph-based ER framework. We experimentally show that our method performs better than existing state-of-the-art progressive ER methods on real-world benchmark datasets.