
FEDERICA ROLLO

Fixed-term researcher (art. 24, para. 3, letter A)
Department of Engineering "Enzo Ferrari"




Publications

2024 - A Comparative Analysis of Word Embeddings Techniques for Italian News Categorization [Journal article]
Rollo, Federica; Bonisoli, Giovanni; Po, Laura


2024 - HypeAIR: A novel framework for real-time low-cost sensor calibration for air quality monitoring in smart cities [Journal article]
Bachechi, Chiara; Rollo, Federica; Po, Laura


2023 - Anomaly Detection and Repairing for Improving Air Quality Monitoring [Journal article]
Rollo, F.; Bachechi, C.; Po, L.
abstract

Clean air in cities improves our health and overall quality of life and helps fight climate change and preserve our environment. High-resolution measures of pollutants’ concentrations can support the identification of urban areas with poor air quality and raise citizens’ awareness while encouraging more sustainable behaviors. Recent advances in Internet of Things (IoT) technology have led to extensive use of low-cost air quality sensors for hyper-local air quality monitoring. As a result, public administrations and citizens increasingly rely on information obtained from sensors to make decisions in their daily lives and mitigate pollution effects. Unfortunately, in most sensing applications, sensors are known to be error-prone. Thanks to Artificial Intelligence (AI) technologies, it is possible to devise computationally efficient methods that can automatically pinpoint anomalies in those data streams in real time. In order to enhance the reliability of air quality sensing applications, we believe that it is highly important to set up a data-cleaning process. In this work, we propose AIrSense, a novel AI-based framework for obtaining reliable pollutant concentrations from raw data collected by a network of low-cost sensors. It enacts an anomaly detection and repairing procedure on raw measurements before applying the calibration model, which converts raw measurements to concentration measurements of gasses. There are very few studies of anomaly detection in raw air quality sensor data (millivolts). Our approach is the first that proposes to detect and repair anomalies in raw data before they are calibrated by considering the temporal sequence of the measurements and the correlations between different sensor features. If at least some previous measurements are available and not anomalous, it trains a model and uses the prediction to repair the observations; otherwise, it exploits the previous observation. 
First, a majority voting system based on three different algorithms detects anomalies in raw data. Then, anomalies are repaired to avoid missing values in the measurement time series. Finally, the calibration model provides the pollutant concentrations. Experiments conducted on a real dataset of 12,000 observations produced by 12 low-cost sensors demonstrated the importance of the data-cleaning process in improving the performance of calibration algorithms.


2023 - CEM: an Ontology for Crime Events in Newspaper Articles [Conference proceedings paper]
Rollo, Federica; Po, Laura; Castellucci, Alessandro


2023 - DICE: a Dataset of Italian Crime Event news [Database]
Rollo, Federica; Bonisoli, Giovanni; Po, Laura
abstract

DICE is a collection of 10,395 Italian news articles describing 13 types of crime events that happened in the province of Modena, Italy, between the end of 2011 and 2021.


2023 - DICE: a Dataset of Italian Crime Event news [Conference proceedings paper]
Bonisoli, Giovanni; Pia Di Buono, Maria; Po, Laura; Rollo, Federica


2023 - Italian FastText models [Software]
Rollo, Federica; Bonisoli, Giovanni; Po, Laura


2023 - Italian GloVe models [Software]
Rollo, Federica; Bonisoli, Giovanni; Po, Laura


2023 - Italian Word2Vec models [Software]
Rollo, Federica; Bonisoli, Giovanni; Po, Laura


2023 - Modeling Event-Centric Knowledge Graph for Crime Analysis on Online News [Book chapter/Essay]
Rollo, F.; Po, L.


2022 - Big Data Analytics and Visualization in Traffic Monitoring [Journal article]
Bachechi, Chiara; Po, Laura; Rollo, Federica
abstract

This paper presents a system that employs information visualization techniques to analyze urban traffic data and the impact of traffic emissions on urban air quality. Effective visualizations allow citizens and public authorities to identify trends, detect congested road sections at specific times, and perform monitoring and maintenance of traffic sensors. Since road transport is a major source of air pollution, the impact of traffic on air quality has also emerged as a new issue that traffic visualizations should address. Trafair Traffic Dashboard exploits traffic sensor data and traffic flow simulations to create an interactive layout focused on investigating the evolution of traffic in the urban area over time and space. The dashboard is the last step of a complex data framework that covers the ingestion of traffic sensor observations, anomaly detection, traffic modeling, and air quality impact analysis. We present the results of applying our proposed framework to two cities (Modena, in Italy, and Santiago de Compostela, in Spain), demonstrating the potential of the dashboard in identifying trends, seasonal events, and abnormal behaviors, and in understanding how the urban vehicle fleet affects air quality. We believe that the framework provides a powerful environment that may guide public decision-makers through effective analysis of traffic trends, with the aim of reducing traffic issues and mitigating the polluting effect of transportation.


2022 - Crime Event Model [Software]
Rollo, Federica; Po, Laura; Castellucci, Alessandro


2022 - Detection and Classification of Sensor Anomalies for Simulating Urban Traffic Scenarios [Journal article]
Bachechi, Chiara; Rollo, Federica; Po, Laura


2022 - Knowledge Graphs for Community Detection in Textual Data [Conference proceedings paper]
Rollo, F.; Po, L.
abstract

Online sources produce a huge amount of textual data, i.e., free-form text. To derive insightful information from them and facilitate the application of Machine Learning algorithms, textual data need to be processed and structured. Knowledge Graphs (KGs) are intelligent systems for the analysis of documents. In recent years, they have been adopted in multiple contexts, including text mining for the development of data-driven solutions to different problems. The scope of this paper is to provide a methodology to build KGs from textual data and apply algorithms to group similar documents in communities. The methodology exploits semantic and statistical approaches to extract relevant insights from each document; these data are then organized in a KG that allows for their interconnection. The methodology has been successfully tested on news articles related to crime events that occurred in the city of Modena, Italy. The promising results demonstrate how KG-based analysis can improve the management of information coming from online sources.


2022 - Online News Event Extraction for Crime Analysis [Conference proceedings paper]
Rollo, F.; Po, L.; Bonisoli, G.
abstract

Event Extraction is a complex and interesting topic in Information Extraction that includes methods for identifying an event's type, participants, location, and date from free text or web data. The results of event extraction systems can be used in several fields, such as online monitoring systems or decision support tools. In this paper, we introduce a framework that combines several techniques (lexical, semantic, machine learning, neural networks) to extract events from Italian news articles for crime analysis purposes. Furthermore, we focus on representing the extracted events in a Knowledge Graph. An evaluation on crimes in the province of Modena is reported.


2022 - Semi Real-time Data Cleaning of Spatially Correlated Data in Traffic Sensor Networks [Conference proceedings paper]
Rollo, Federica; Bachechi, Chiara; Po, Laura


2022 - Supervised and Unsupervised Categorization of an Imbalanced Italian Crime News Dataset [Conference proceedings paper]
Rollo, F.; Bonisoli, G.; Po, L.
abstract

The automatic categorization of crime news is useful to create statistics on the type of crimes occurring in a certain area. This assignment can be treated as a text categorization problem. Several studies have shown that the use of word embeddings improves outcomes in many Natural Language Processing (NLP) tasks, including text categorization. The scope of this paper is to explore the use of word embeddings for Italian crime news text categorization. The approach followed is to compare different document pre-processing techniques, Word2Vec models, and methods to obtain word embeddings, including the extraction of bigrams and keyphrases. Then, supervised and unsupervised Machine Learning categorization algorithms have been applied and compared. In addition, the imbalance issue of the input dataset has been addressed by using the Synthetic Minority Oversampling Technique (SMOTE) to oversample the elements in the minority classes. Experiments conducted on an Italian dataset of 17,500 crime news articles collected from 2011 to 2021 show very promising results. The supervised categorization has proven to be better than the unsupervised categorization, exceeding 80% in both precision and recall and reaching an accuracy of 0.86. Furthermore, lemmatization, bigrams, and keyphrase extraction turned out not to be decisive. Finally, the availability of our model on GitHub, together with the code we used to extract word embeddings, allows our approach to be replicated on other corpora, either in Italian or in other languages.


2021 - Air Quality Sensor Network Data Acquisition, Cleaning, Visualization, and Analytics: A Real-world IoT Use Case [Conference proceedings paper]
Rollo, Federica; Sudharsan, Bharath; Po, Laura; Breslin, John
abstract

Monitoring and analyzing air quality is of primary importance to encourage more sustainable lifestyles and plan corrective actions. This paper presents the design and end-to-end implementation of a real-world urban air quality data collection and analytics use case, which is part of the TRAFAIR (Understanding Traffic Flows to Improve Air Quality) European project [1, 2]. The implementation relates to the project work done in the city of Modena, Italy, and covers the installation of distributed low-cost multi-sensor IoT devices, LoRa network setup, data collection in the LoRa server database, ML-based detection and cleaning of anomalous measurements, sensor calibration, and central control and visualization using the designed SenseBoard [3].


2021 - Anomaly Detection in Multivariate Spatial Time Series: A Ready-to-Use implementation [Conference proceedings paper]
Bachechi, Chiara; Rollo, Federica; Po, Laura; Quattrini, Fabio


2021 - ElastiCL: Elastic Quantization for Communication Efficient Collaborative Learning in IoT [Conference proceedings paper]
Sudharsan, B.; Sheth, D.; Arya, S.; Rollo, F.; Yadav, P.; Patel, P.; Breslin, J. G.; Ali, M. I.


2021 - SenseBoard: Sensor monitoring for air quality experts [Conference proceedings paper]
Rollo, F.; Po, L.
abstract

Air quality monitoring is crucial within cities since air pollution is one of the main causes of premature death in Europe. However, performing trustworthy monitoring of urban air quality is not a simple process, especially when the goal is extensive and timely monitoring of the entire urban area using low-cost sensors. In order to collect reliable measurements from low-cost sensors, a lot of work is required from environmental experts, who deploy and maintain the air quality network and calibrate, control, and clean up the data generated by these sensors on a daily basis. In this paper, we describe SenseBoard, an interactive dashboard created to support environmental experts in sensor network control, management of sensor data calibration, and anomaly detection.


2021 - Using Word Embeddings for Italian Crime News Categorization [Conference proceedings paper]
Bonisoli, Giovanni; Rollo, Federica; Po, Laura


2020 - Crime event localization and deduplication [Conference proceedings paper]
Rollo, Federica; Po, Laura


2020 - Real-time data cleaning in traffic sensor networks [Conference proceedings paper]
Bachechi, Chiara; Rollo, Federica; Po, Laura


2020 - Semantic Traffic Sensor Data: The TRAFAIR Experience [Journal article]
Desimoni, Federico; Ilarri, Sergio; Po, Laura; Rollo, Federica; Trillo Lado, Raquel
abstract

Modern cities face pressing problems with transportation systems including, but not limited to, traffic congestion, safety, health, and pollution. To tackle them, public administrations have implemented roadside infrastructures such as cameras and sensors to collect data about environmental and traffic conditions. In the case of traffic sensor data, not only are the real-time data essential, but historical values also need to be preserved and published. When real-time and historical data of smart cities become available, everyone can join an evidence-based debate on the city’s future evolution. The TRAFAIR (Understanding Traffic Flows to Improve Air Quality) project seeks to understand how traffic affects urban air quality. The project develops a platform to provide real-time and predicted values on air quality in several cities in Europe, encompassing tasks such as the deployment of low-cost air quality sensors, data collection and integration, modeling and prediction, the publication of open data, and the development of applications for end-users and public administrations. This paper explicitly focuses on the modeling and semantic annotation of traffic data. We present the tools and techniques used in the project and validate our strategies for data modeling and its semantic enrichment over two cities: Modena (Italy) and Zaragoza (Spain). An experimental evaluation shows that our approach to publishing Linked Data is effective.


2020 - Using real sensors data to calibrate a traffic model for the city of Modena [Conference proceedings paper]
Bachechi, Chiara; Rollo, Federica; Desimoni, Federico; Po, Laura
abstract

In Italy, road vehicles are the preferred means of transport. In recent years, the passenger car fleet has increased in almost all EU Member States. The high number of vehicles complicates urban planning and often results in traffic congestion and areas of increased air pollution. Overall, efficient traffic control is profitable in individual, societal, financial, and environmental terms. Traffic management solutions typically require the use of simulators able to capture in detail all the characteristics and dependencies associated with real-life traffic. Therefore, the realization of a traffic model can help to discover and control traffic bottlenecks in the urban context. In this paper, we analyze how to better simulate vehicle flows measured by traffic sensors in the streets. A dynamic traffic model was set up starting from traffic sensor data collected every minute at about 300 locations in the city of Modena. The reliability of the model is discussed and assessed through a comparison between simulated values and real values from traffic sensors. This analysis pointed out some critical issues. Therefore, to better understand the origin of fake jams and inconsistencies with real data, we explored different configurations of the model as possible solutions.


2019 - From Sensors Data to Urban Traffic Flow Analysis [Conference proceedings paper]
Po, Laura; Rollo, Federica; Bachechi, Chiara; Corni, Alberto
abstract

By 2050, almost 70% of the population will live in cities. As the population grows, travel demand increases and this might affect air quality in urban areas. Traffic is among the main sources of pollution within cities. Therefore, monitoring urban traffic means not only identifying congestion and managing accidents but also preventing the impact on air pollution. Urban traffic modeling and analysis is part of the advanced intelligent traffic management technologies that have become a crucial sector for smart cities. Its main purpose is to predict congestion states of a specific urban transport network and propose improvements in the traffic network that might result in a decrease in travel times, air pollution, and fuel consumption. This paper describes the implementation of an urban traffic flow model in the city of Modena based on real traffic sensor data. This is part of a wide European project that aims at studying the correlation between traffic and air pollution, and therefore at combining traffic and air pollution simulations for testing various urban scenarios and raising citizen awareness about air quality where necessary.


2019 - TRAFAIR: Understanding Traffic Flow to Improve Air Quality [Conference proceedings paper]
Po, Laura; Rollo, Federica; Ramón Ríos Viqueira, José; Trillo Lado, Raquel; Bigi, Alessandro; Cacheiro López, Javier; Paolucci, Michela; Nesi, Paolo
abstract

Environmental impacts of traffic are of major concern throughout many European metropolitan areas. Air pollution causes 400,000 deaths per year, making it the leading environmental cause of premature death in Europe. Among the main sources of air pollution in Europe are road traffic, domestic heating, and industrial combustion. The TRAFAIR project brings together 9 partners from two European countries (Italy and Spain) to develop innovative and sustainable services combining air quality, weather conditions, and traffic flow data to produce new information for the benefit of citizens and government decision-makers. The project started in November 2018 and lasts two years. It is motivated by the huge number of deaths caused by air pollution. Nowadays, the situation is particularly critical in some member states of Europe. In February 2017, the European Commission warned five countries, including Spain and Italy, of continued air pollution breaches. In this context, public administrations and citizens suffer from the lack of comprehensive and fast tools to estimate the level of pollution on an urban scale resulting from varying traffic flow conditions, tools that would allow them to optimize control strategies and increase air quality awareness. The goals of the project are twofold: monitoring urban air quality by using sensors in 6 European cities and making urban air quality predictions thanks to simulation models. The project is co-financed by the European Commission under the CEF TELECOM call on Open Data.


2018 - Building an Urban Theft Map by Analyzing Newspaper Crime Reports [Conference proceedings paper]
Po, Laura; Rollo, Federica
abstract

One of the main issues in today's cities is public safety, which can be improved through crime mapping, i.e., a systematic analysis for identifying and analyzing patterns and trends in crime. Mapping crime allows police analysts to identify crime hot spots; moreover, it increases public confidence and citizen engagement and promotes transparency. This paper focuses on analyzing and mapping thefts reported in online newspapers, using text mining techniques, for an Italian city.


2017 - Student research abstract: A key-entity graph for clustering multichannel news [Conference proceedings paper]
Rollo, Federica
abstract

Social networks (SN) have gained a very important role in the dissemination of news, since they allow news to be shared more widely than web sites do and provide updates in a more timely way, publishing several updated versions of the same news item on the same day. The use of a variety of communication media (or channels) stimulates the need for integration and analysis of the huge amount of information published globally. The scale and heterogeneity of these messages make the analysis of news very challenging. This paper presents an in-progress research work: the definition of a tool for clustering news according to their topics in order to understand whether there are correlations between news published by different newspapers on the same channel or by the same newspaper on different channels. We started the implementation of a method [3] based on the Keygraph algorithm [4] in order to perform multichannel clustering of news according to their topics. In this paper, we extend the proposed method [3] by considering entities in addition to the keywords to detect topics. We argue that each event can be described by entities such as times, locations, persons, things, and topics. Detecting entities in a news item might improve the clustering results.


2017 - Topic detection in multichannel Italian newspapers [Conference proceedings paper]
Po, Laura; Rollo, Federica; Lado, Raquel Trillo
abstract

Nowadays, any person, company, or public institution uses and exploits different channels to share private or public information with other people (friends, customers, relatives, etc.) or institutions. This context has changed journalism: the major newspapers now report news not just on their own web sites, but also on several social media such as Twitter or YouTube. The use of multiple communication media stimulates the need for integration and analysis of the content published globally and not just at the level of a single medium. An analysis is needed to achieve a comprehensive overview of the information that reaches the end users and of how they consume it. This analysis should identify the main topics in the news flow and reveal the mechanisms of publication of news on different media (e.g. news timeline). Currently, most of the work in this area is still focused on a single medium, so an analysis across different media (channels) should improve the result of topic detection. This paper shows the application of a graph analytical approach, called Keygraph, to a set of very heterogeneous documents such as the news published on various media. A preliminary evaluation on the news published in a 5-day period was able to identify the main topics within the publications of a single newspaper, and also within the publications of 20 newspapers on several online channels.