

Dipartimento di Ingegneria "Enzo Ferrari"



2023 - Interpretable Entity Matching with WYM [Conference Proceedings]
Baraldi, A.; Del Buono, F.; Guerra, F.; Guiduzzi, G.; Paganelli, M.; Vincini, M.

2022 - A Framework to Evaluate the Quality of Integrated Datasets [Journal Article]
Del Buono, Francesco; Faggioli, Guglielmo; Paganelli, Matteo; Baraldi, Andrea; Guerra, Francesco; Ferro, Nicola

2022 - Analyzing How BERT Performs Entity Matching [Journal Article]
Paganelli, M.; Del Buono, F.; Baraldi, A.; Guerra, F.

State-of-the-art Entity Matching (EM) approaches rely on transformer architectures, such as BERT, to generate highly contextualized term embeddings. These embeddings are then used to predict whether pairs of entity descriptions refer to the same real-world entity. BERT-based EM models have proven effective, but they act as black boxes for users, who have limited insight into the motivations behind their decisions. In this paper, we perform a multi-faceted analysis of the components of pre-trained and fine-tuned BERT architectures applied to an EM task. The main findings of our extensive experimental evaluation are: (1) fine-tuning for the EM task mainly modifies the last layers of the BERT components, but in different ways for tokens belonging to descriptions of matching and non-matching entities; (2) the special structure of EM datasets, where each record is a pair of entity descriptions, is recognized by BERT; (3) the pair-wise semantic similarity of tokens is not a key piece of knowledge exploited by BERT-based EM models.
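As a rough illustration of how such models consume EM data, the sketch below serializes a pair of entity records into a single BERT-style input sequence. The attribute-tagged "COL ... VAL ..." layout and the function name are assumptions modeled on common BERT-based EM pipelines, not this paper's exact preprocessing:

```python
def serialize_pair(left, right):
    """Serialize a record pair into one BERT-style input sequence.

    left/right are dicts mapping attribute names to values. The
    attribute-tagged layout is an illustrative assumption; a real
    pipeline would tokenize this string and feed it to the model.
    """
    def one(record):
        return " ".join(f"COL {attr} VAL {val}" for attr, val in record.items())
    # The two descriptions share one sequence, separated by [SEP],
    # which is the "special structure" the abstract says BERT recognizes.
    return f"[CLS] {one(left)} [SEP] {one(right)} [SEP]"

pair = serialize_pair(
    {"title": "iphone 12", "price": "799"},
    {"title": "apple iphone 12", "price": "799.00"},
)
```

A classifier head on top of the [CLS] embedding would then predict match / non-match for the serialized pair.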

2022 - Landmark Explanation: A Tool for Entity Matching [Conference Proceedings]
Baraldi, A.; Del Buono, F.; Paganelli, M.; Guerra, F.

We introduce Landmark Explanation, a framework that extends the capabilities of a post-hoc perturbation-based explainer to the EM scenario. Landmark Explanation leverages the specific schema typically adopted by EM datasets, which represent pairs of entity descriptions, to generate word-based explanations that effectively describe the matching model.

2022 - Novelty Detection with Autoencoders for System Health Monitoring in Industrial Environments [Journal Article]
Del Buono, Francesco; Calabrese, Francesca; Baraldi, Andrea; Paganelli, Matteo; Guerra, Francesco

Predictive Maintenance (PdM) is the newest strategy for maintenance management in industrial contexts. It aims to predict the occurrence of failures in order to minimize unexpected downtime and maximize the useful life of components. In data-driven approaches, PdM uses Machine Learning (ML) algorithms to extract relevant features from signals, identify and classify possible faults (diagnostics), and predict the components' remaining useful life (prognostics). The major challenge lies in the high complexity of industrial plants, where operational conditions change over time and a large number of unknown operating modes occur. A solution to this problem is offered by novelty detection, in which a representation of the machinery's normal operating state is learned and compared with online measurements to identify new operating conditions. In this paper, we conduct a systematic study of autoencoder-based methods for novelty detection. We introduce an architecture template, which includes a classification layer to detect and separate the operating conditions, and a localizer for identifying the most influential signals. Four implementations, based on different deep learning models, are described and used to evaluate the approach on data collected from a test rig. The evaluation shows the effectiveness of the architecture and that the autoencoders outperform the current baselines.
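A minimal, linear stand-in for the reconstruction-based detector described above: a PCA projection plays the role of a tied-weight encoder/decoder, the training data fixes an error threshold, and a sample is flagged as novel when its reconstruction error exceeds it. All names, dimensions, and the percentile rule are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
# "Normal" operating data: 200 samples in 5-D lying near a 2-D subspace.
latent = rng.normal(size=(200, 2))
basis = rng.normal(size=(2, 5))
X = latent @ basis + 0.01 * rng.normal(size=(200, 5))

# Linear autoencoder = projection onto the top-2 principal components.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:2]  # shared encoder/decoder weights

def reconstruction_error(x):
    xc = x - mean
    recon = (xc @ components.T) @ components  # encode, then decode
    return float(np.linalg.norm(xc - recon))

# Threshold from the training-error distribution (hypothetical 99th-percentile rule).
threshold = np.percentile([reconstruction_error(x) for x in X], 99)

def is_novel(x):
    return reconstruction_error(x) > threshold

normal_sample = latent[0] @ basis                                    # on the learned subspace
novel_sample = normal_sample + np.array([5.0, -5.0, 5.0, -5.0, 5.0])  # off-subspace shift
```

The deep autoencoders studied in the paper replace the linear projection with nonlinear encoders/decoders, but the detection logic — learn the normal state, threshold the reconstruction error — is the same.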

2021 - Using Landmarks for Explaining Entity Matching Models [Conference Proceedings]
Baraldi, Andrea; Del Buono, Francesco; Paganelli, Matteo; Guerra, Francesco

State-of-the-art approaches to Entity Matching (EM) rely on machine and deep learning models to infer pairs of matching / non-matching entities. Although experimental evaluations demonstrate that these approaches are effective, their adoption in real scenarios is limited by the fact that they are difficult to interpret. Explainable AI systems have recently been proposed to complement deep learning approaches. Their application to EM is still new and requires addressing the specificities of this task: particular dataset schemas, which describe a pair of entities, and imbalanced classes. This paper introduces Landmark Explanation, a generic and extensible framework that extends the capabilities of a post-hoc perturbation-based explainer to the EM scenario. Landmark Explanation generates perturbations that take advantage of the particular schemas of EM datasets, producing explanations that are more accurate and more interesting to users than those generated by competing approaches.
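The landmark idea can be sketched with a toy example: hold one entity description fixed as the landmark and perturb only the other, scoring each word by how much its removal changes the match score. The Jaccard matcher and leave-one-word-out scheme below are simplifications standing in for the learned EM model and the LIME-style sampling used by the actual framework:

```python
def match_score(left, right):
    """Toy EM model: Jaccard overlap of word sets (stand-in for a learned matcher)."""
    a, b = set(left.lower().split()), set(right.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0

def landmark_importances(landmark, target):
    """Score each word of `target` by the drop in match score when it is
    removed, holding `landmark` fixed -- a leave-one-out simplification
    of perturbation-based explanation."""
    base = match_score(landmark, target)
    words = target.split()
    scores = {}
    for i, word in enumerate(words):
        perturbed = " ".join(words[:i] + words[i + 1:])
        scores[word] = base - match_score(landmark, perturbed)
    return scores

scores = landmark_importances("apple iphone 12 128gb", "iphone 12 black 128gb")
```

Words shared with the landmark (e.g. "iphone") get positive importance, since removing them hurts the match, while words absent from the landmark (e.g. "black") get negative importance.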