
Costantino GRANA

Full Professor
Department of Engineering "Enzo Ferrari"




Publications

2025 - Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios [Conference Proceedings Paper]
Pipoli, Vittorio; Bolelli, Federico; Sarto, Sara; Cornia, Marcella; Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita; Ficarra, Elisa
abstract

This paper tackles multimodal prompting for visual recognition, specifically when dealing with missing modalities in multimodal Transformers. It makes two main contributions: (i) we introduce a novel prompt learning module designed to produce sample-specific prompts, and (ii) we show that modality-agnostic prompts can effectively adjust to diverse missing modality scenarios. Our model, termed SCP, exploits the semantic representation of the available modalities to query a learnable memory bank, which allows prompts to be generated based on the semantics of the input. Notably, SCP differs from existing methodologies in its capacity to self-adjust to both the missing modality scenario and the semantic context of the input, without prior knowledge of which modality is missing or of the number of modalities. Through extensive experiments, we show the effectiveness of the proposed prompt learning framework and demonstrate enhanced performance and robustness across a spectrum of missing modality cases.


2024 - A State-of-the-Art Review with Code about Connected Components Labeling on GPUs [Journal Article]
Bolelli, Federico; Allegretti, Stefano; Lumetti, Luca; Grana, Costantino
abstract

This article is about Connected Components Labeling (CCL) algorithms developed for GPU accelerators. The task itself is employed in many modern image-processing pipelines and represents a fundamental step in different scenarios, whenever object recognition is required. For this reason, considerable effort has gone into many different proposals devoted to improving algorithm performance using different kinds of hardware accelerators. This paper focuses on GPU-based algorithmic solutions published in the last two decades, highlighting their distinctive traits and the improvements they leverage. The proposed state-of-the-art review is accompanied by the source code, which allows all the algorithms to be straightforwardly reproduced in different experimental settings. A comprehensive evaluation on multiple environments is also provided, including different operating systems, compilers, and GPUs. Our assessments are performed by means of several tests, including real-case images and synthetically generated ones, highlighting the strengths and weaknesses of each proposal. Overall, the experimental results revealed that block-based algorithms outperform all the other algorithmic solutions on both 2D images and 3D volumes, regardless of the selected environment.


2024 - BarBeR: A Barcode Benchmarking Repository [Conference Proceedings Paper]
Vezzali, Enrico; Bolelli, Federico; Santi, Stefano; Grana, Costantino
abstract

Since their invention in 1949, barcodes have remained the preferred method for automatic data capture, playing a crucial role in supply chain management. To detect a barcode in an image, multiple algorithms have been proposed in the literature, with a significant increase of interest in the topic since the rise of deep learning. However, research in the field suffers from many limitations, including the scarcity of public datasets and code implementations, which hampers the reproducibility and reliability of published results. For this reason, we developed "BarBeR" (Barcode Benchmark Repository), a benchmark designed for testing and comparing barcode detection algorithms. This benchmark includes the code implementation of various detection algorithms for barcodes, along with a suite of useful metrics. It offers a range of test setups and can be expanded to include any localization algorithm. In addition, we provide a large, annotated dataset of 8748 barcode images, combining multiple public barcode datasets with standardized annotation formats for both detection and segmentation tasks. Finally, we share the results obtained from running the benchmark on our dataset, offering valuable insights into the performance of different algorithms.


2024 - Enhancing Patch-Based Learning for the Segmentation of the Mandibular Canal [Journal Article]
Lumetti, Luca; Pipoli, Vittorio; Bolelli, Federico; Ficarra, Elisa; Grana, Costantino
abstract

Segmentation of the Inferior Alveolar Canal (IAC) is a critical aspect of dentistry and maxillofacial imaging, garnering considerable attention in recent research. Deep learning techniques have shown promising results in this domain, yet their efficacy is still significantly hindered by the limited availability of 3D maxillofacial datasets. An inherent challenge is posed by the size of input volumes, which necessitates a patch-based processing approach that compromises neural network performance due to the absence of global contextual information. This study introduces a novel approach that harnesses the spatial information within the extracted patches and incorporates it into a Transformer architecture, thereby enhancing the segmentation process through the use of prior knowledge about the patch location. Our method improves the Dice score by 4 points with respect to the previous work proposed by Cipriano et al., while also reducing the training steps required by the entire pipeline. By integrating spatial information and leveraging the power of Transformer architectures, this research not only advances the accuracy of IAC segmentation, but also streamlines the training process, offering a promising direction for improving dental and maxillofacial image analysis.


2024 - Identifying Impurities in Liquids of Pharmaceutical Vials [Conference Proceedings Paper]
Rosati, Gabriele; Marchesini, Kevin; Lumetti, Luca; Sartori, Federica; Balboni, Beatrice; Begarani, Filippo; Vescovi, Luca; Bolelli, Federico; Grana, Costantino
abstract

The presence of visible particles in pharmaceutical products is a critical quality issue that demands strict monitoring. Recently, Convolutional Neural Networks (CNNs) have been widely used in industrial settings to detect defects, but there remains a gap in the literature concerning the detection of particles floating in liquid substances, mainly due to the lack of publicly available datasets. In this study, we focus on the detection of foreign particles in pharmaceutical liquid vials, leveraging two state-of-the-art deep-learning approaches adapted to our specific multiclass problem. The first methodology employs a standard ResNet-18 architecture, while the second exploits a Multi-Instance Learning (MIL) technique to efficiently deal with multiple images (sequences) of the same sample. To address the lack of available data, we devised and partially released an annotated dataset consisting of sequences of 19 images per sample, captured from rotating vials, both with and without impurities. The dataset comprises 2,426 sequences for a total of 46,094 images, labeled at the sequence level and covering five distinct classes. The proposed methodologies, trained on this new extensive dataset, represent advancements in the field, offering promising strategies to improve the safety and quality control of pharmaceutical products and setting a benchmark for future comparisons.


2024 - Investigating the ABCDE Rule in Convolutional Neural Networks [Conference Proceedings Paper]
Bolelli, Federico; Lumetti, Luca; Marchesini, Kevin; Candeloro, Ettore; Grana, Costantino
abstract

Convolutional Neural Networks (CNNs) have been broadly employed in dermoscopic image analysis, mainly due to the large amount of data gathered by the International Skin Imaging Collaboration (ISIC). But where do neural networks look? Several authors have claimed that the ISIC dataset is affected by strong biases, i.e., spurious correlations between samples that machine learning models unfairly exploit while discarding the useful patterns they are expected to learn. These strong claims have been supported by showing that deep learning models maintain excellent performance even when "no information about the lesion remains" in the debased input images. With this paper, we explore the interpretability of CNNs in dermoscopic image analysis by analyzing which characteristics are considered by autonomous classification algorithms. Starting from a standard setting, the experiments presented in this paper gradually conceal well-known crucial dermoscopic features and thoroughly investigate how CNN performance subsequently evolves. Experiments carried out on two well-known CNNs, EfficientNet-B3 and ResNet-152, demonstrate that neural networks autonomously learn to extract features that are known to be important for melanoma detection. Even when some of these features are removed, the others are still enough to achieve satisfactory classification performance. The results obtained show that the literature claims on biases are not supported by the experiments carried out. Finally, to demonstrate the generalization capabilities of state-of-the-art CNN models for skin lesion classification, a large private dataset has been employed as an additional test set.


2024 - Location Matters: Harnessing Spatial Information to Enhance the Segmentation of the Inferior Alveolar Canal in CBCTs [Conference Proceedings Paper]
Lumetti, Luca; Pipoli, Vittorio; Bolelli, Federico; Ficarra, Elisa; Grana, Costantino
abstract

The segmentation of the Inferior Alveolar Canal (IAC) plays a central role in maxillofacial surgery, drawing significant attention in the current research. Because of their outstanding results, deep learning methods are widely adopted in the segmentation of 3D medical volumes, including the IAC in Cone Beam Computed Tomography (CBCT) data. One of the main challenges when segmenting large volumes, including those obtained through CBCT scans, arises from the use of patch-based techniques, mandatory to fit memory constraints. Such training approaches compromise neural network performance due to a reduction in the global contextual information. Performance degradation is prominently evident when the target objects are small with respect to the background, as it happens with the inferior alveolar nerve that develops across the mandible, but involves only a few voxels of the entire scan. In order to target this issue and push state-of-the-art performance in the segmentation of the IAC, we propose an innovative approach that exploits spatial information of extracted patches and integrates it into a Transformer architecture. By incorporating prior knowledge about patch location, our model improves state of the art by ~2 points on the Dice score when integrated with the standard U-Net architecture. The source code of our proposal is publicly released.


2024 - Sustainable Use of Resources in Hospitals: A Machine Learning-Based Approach to Predict Prolonged Length of Stay at the Time of Admission [Abstract in Conference Proceedings]
Perliti, Paolo; Giovanetti, Anita; Bolelli, Federico; Grana, Costantino


2023 - Annotating the Inferior Alveolar Canal: the Ultimate Tool [Conference Proceedings Paper]
Lumetti, Luca; Pipoli, Vittorio; Bolelli, Federico; Grana, Costantino
abstract

The Inferior Alveolar Nerve (IAN) is of primary interest in the maxillofacial field, as accurate localization of this nerve reduces the risk of injury during surgical procedures. Although recent literature has focused on developing novel deep learning techniques to produce accurate segmentation masks of the canal containing the IAN, there are still strong limitations due to the scarcity of publicly available 3D maxillofacial datasets. In this paper, we present an improved version of a previously released tool, IACAT (Inferior Alveolar Canal Annotation Tool), currently used by medical experts to produce 3D ground truth annotations. In addition, we release a new dataset, ToothFairy, which is part of the homonymous MICCAI2023 challenge hosted by the Grand-Challenge platform, as an extension of the previously released Maxillo dataset, which was the only publicly available one. With ToothFairy, both the number of annotations and the quality of the existing data have been increased.


2023 - Artificial intelligence evaluation of confocal microscope prostate images: our preliminary experience [Journal Article]
Bianchi, G.; Puliatti, S.; Rodriguez, N.; Micali, S.; Bertoni, L.; Reggiani Bonetti, L.; Caramaschi, S.; Bolelli, F.; Pinamonti, M.; Rozze, D.; Grana, C.


2023 - Inferior Alveolar Canal Automatic Detection with Deep Learning CNNs on CBCTs: Development of a Novel Model and Release of Open-Source Dataset and Algorithm [Journal Article]
Di Bartolomeo, Mattia; Pellacani, Arrigo; Bolelli, Federico; Cipriano, Marco; Lumetti, Luca; Negrello, Sara; Allegretti, Stefano; Minafra, Paolo; Pollastri, Federico; Nocini, Riccardo; Colletti, Giacomo; Chiarini, Luigi; Grana, Costantino; Anesi, Alexandre
abstract

Introduction: The need for accurate three-dimensional data of anatomical structures is increasing in the surgical field. The development of convolutional neural networks (CNNs) has been helping to fill this gap by trying to provide efficient tools to clinicians. Nonetheless, the lack of fully accessible datasets and open-source algorithms is slowing improvements in this field. In this paper, we focus on the fully automatic segmentation of the Inferior Alveolar Canal (IAC), which is of immense interest in dental and maxillofacial surgery. Conventionally, only a bidimensional annotation of the IAC is used in common clinical practice. A reliable convolutional neural network (CNN) might save time in daily practice and improve the quality of assistance. Materials and methods: Cone Beam Computed Tomography (CBCT) volumes obtained from a single radiological center using the same machine were gathered, and the course of the IAC was annotated on them. A secondary dataset with sparse annotations and a primary dataset with both dense and sparse annotations were generated. Three separate experiments were conducted in order to evaluate the CNN. The IoU and Dice scores of every experiment were recorded as the primary endpoint, while the time needed to achieve the annotation was assessed as the secondary endpoint. Results: A total of 347 CBCT volumes were collected, then divided into primary and secondary datasets. Among the three experiments, an IoU score of 0.64 and a Dice score of 0.79 were obtained thanks to the pre-training of the CNN on the secondary dataset and the creation of a novel deep label propagation model, followed by proper training on the primary dataset. To the best of our knowledge, these results are the best ever published in the segmentation of the IAC. The datasets are publicly available and the algorithm is published as open-source software. On average, the CNN could produce a 3D annotation of the IAC in 6.33 s, compared to the 87.3 s needed by the radiology technician to produce a bidimensional annotation. Conclusions: To summarize, the following achievements have been reached. A new state of the art in terms of Dice score was achieved, exceeding the threshold of 0.75 commonly considered necessary for use in clinical practice. The CNN could fully automatically and rapidly produce an accurate three-dimensional segmentation of the IAC, in contrast to the bidimensional annotations commonly used in clinical practice and generated in a time-consuming manner. We introduced our innovative deep label propagation method to optimize the performance of the CNN in the segmentation of the IAC. For the first time in this field, the datasets and the source code used were publicly released, granting reproducibility of the experiments and helping in the improvement of IAC segmentation.
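
The IoU and Dice scores used as the primary endpoint above are standard overlap measures between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they are commonly computed, not the authors' evaluation code; the function name, the flat 0/1 mask representation, and the empty-mask convention are assumptions made for illustration.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized flat binary masks (0/1 values)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    # Voxels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_size = sum(1 for p in pred if p)
    target_size = sum(1 for t in target if t)
    union = pred_size + target_size - intersection
    if union == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_size + target_size)
    iou = intersection / union
    return dice, iou


# Toy example with six voxels (hypothetical data).
pred = [1, 1, 1, 0, 0, 0]
target = [0, 1, 1, 1, 0, 0]
dice, iou = dice_and_iou(pred, target)
print(dice, iou)
```

For a single pair of masks the two scores are linked by Dice = 2 * IoU / (1 + IoU). Scores averaged over many volumes need not satisfy this identity exactly.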


2022 - Applications of AI and HPC in the Health Domain [Book Chapter/Essay]
Oniga, D.; Cantalupo, B.; Tartaglione, E.; Perlo, D.; Grangetto, M.; Aldinucci, M.; Bolelli, F.; Pollastri, F.; Cancilla, M.; Canalini, L.; Grana, C.; Alcalde, C. M.; Cardillo, F. A.; Florea, M.


2022 - Automated Prediction of Kidney Failure in IgA Nephropathy with Deep Learning from Biopsy Images [Journal Article]
Testa, F.; Fontana, F.; Pollastri, F.; Chester, J.; Leonelli, M.; Giaroni, F.; Gualtieri, F.; Bolelli, F.; Mancini, E.; Nordio, M.; Sacco, P.; Ligabue, G.; Giovanella, S.; Ferri, M.; Alfano, G.; Gesualdo, L.; Cimino, S.; Donati, G.; Grana, C.; Magistroni, R.
abstract

Background and objectives: Digital pathology and artificial intelligence offer new opportunities for automatic histologic scoring. We applied a deep learning approach to IgA nephropathy biopsy images to develop an automatic histologic prognostic score, assessed against ground truth (kidney failure) among patients with IgA nephropathy who were treated over 39 years. We assessed noninferiority in comparison with the histologic component of currently validated predictive tools. We correlated additional histologic features with our deep learning predictive score to identify potential additional predictive features. Design, setting, participants, & measurements: Training for deep learning was performed with randomly selected, digitized images of cortical periodic acid–Schiff–stained sections (363 kidney biopsy specimens) to develop our deep learning predictive score. We estimated noninferiority using the area under the receiver operating characteristic curve (AUC) in a randomly selected group (95 biopsy specimens) against the gold standard Oxford classification (MEST-C) scores used by the International IgA Nephropathy Prediction Tool and the clinical decision supporting system for estimating the risk of kidney failure in IgA nephropathy. We assessed additional potential predictive histologic features against a subset (20 kidney biopsy specimens) with the strongest and weakest deep learning predictive scores. Results: We enrolled 442 patients; the 10-year kidney survival was 78%, and the study median follow-up was 6.7 years. Manual MEST-C showed no prognostic relationship for the endocapillary parameter only. The deep learning predictive score was not inferior to MEST-C applied using the International IgA Nephropathy Prediction Tool and the clinical decision supporting system (AUC of 0.84 versus 0.77 and 0.74, respectively) and confirmed a good correlation with the tubulointerstitial score (r=0.41, P<0.01). We observed no correlations between the deep learning prognostic score and the mesangial, endocapillary, segmental sclerosis, and crescent parameters. Additional potential predictive histopathologic features incorporated by the deep learning predictive score included (1) inflammation within areas of interstitial fibrosis and tubular atrophy and (2) hyaline casts. Conclusions: The deep learning approach was noninferior to manual histopathologic reporting and considered prognostic features not currently included in MEST-C assessment.


2022 - Connected Components Labeling on Bitonal Images [Conference Proceedings Paper]
Bolelli, Federico; Allegretti, Stefano; Grana, Costantino


2022 - Deep Segmentation of the Mandibular Canal: a New 3D Annotated Dataset of CBCT Volumes [Journal Article]
Cipriano, Marco; Allegretti, Stefano; Bolelli, Federico; Di Bartolomeo, Mattia; Pollastri, Federico; Pellacani, Arrigo; Minafra, Paolo; Anesi, Alexandre; Grana, Costantino
abstract

Inferior Alveolar Nerve (IAN) canal detection has been the focus of multiple recent works in dentistry and maxillofacial imaging. Deep learning-based techniques have reached interesting results in this research field, although the small size of 3D maxillofacial datasets has strongly limited the performance of these algorithms. Researchers have been forced to build their own private datasets, thus precluding any opportunity for reproducing results and fairly comparing proposals. This work describes a novel, large, and publicly available mandibular Cone Beam Computed Tomography (CBCT) dataset, with 2D and 3D manual annotations provided by expert clinicians. Leveraging this dataset and employing deep learning techniques, we are able to improve the state of the art in 3D mandibular canal segmentation. The source code, which allows all the reported experiments to be reproduced exactly, is released as an open-source project along with this article.


2022 - Improving Segmentation of the Inferior Alveolar Nerve through Deep Label Propagation [Conference Proceedings Paper]
Cipriano, Marco; Allegretti, Stefano; Bolelli, Federico; Pollastri, Federico; Grana, Costantino


2022 - Long-Range 3D Self-Attention for MRI Prostate Segmentation [Conference Proceedings Paper]
Pollastri, Federico; Cipriano, Marco; Bolelli, Federico; Grana, Costantino
abstract

Prostate segmentation from Magnetic Resonance Imaging (MRI) is an intense research area, due to the increased use of MRI in the diagnosis and treatment planning of prostate cancer. The lack of clear boundaries and the huge variation of texture and shape between patients make the task very challenging, and the 3D nature of the data makes 2D segmentation algorithms suboptimal for the task. With this paper, we propose a novel architecture to fill the gap between the most recent advances in 2D computer vision and 3D semantic segmentation. In particular, the designed model retrieves multi-scale 3D features with dilated convolutions and makes use of a self-attention transformer to gain a global field of view. The proposed Long-Range 3D Self-Attention block allows the convolutional neural network to build significant features by merging contextual information collected at various scales. Experimental results show that the proposed method improves the state-of-the-art segmentation accuracy on MRI prostate segmentation.


2022 - One DAG to Rule Them All [Journal Article]
Bolelli, Federico; Allegretti, Stefano; Grana, Costantino
abstract

In this paper, we present novel strategies for optimizing the performance of many binary image processing algorithms. These strategies are collected in an open-source framework, GRAPHGEN, that is able to automatically generate optimized C++ source code implementing the desired optimizations. Simply starting from a set of rules, the algorithms introduced with the GRAPHGEN framework can generate decision trees with minimum average path-length, possibly considering image pattern frequencies, and apply state prediction and code compression through the use of Directed Rooted Acyclic Graphs (DRAGs). Moreover, the proposed algorithmic solutions allow different optimization techniques to be combined, significantly improving performance. Our proposal is showcased on three classical and widely employed algorithms (namely Connected Components Labeling, Thinning, and Contour Tracing). When compared to existing approaches, in both 2D and 3D, implementations using the generated optimal DRAGs perform significantly better than previous state-of-the-art algorithms, on both CPU and GPU.


2022 - Quest for Speed: The Epic Saga of Record-Breaking on OpenCV Connected Components Extraction [Conference Proceedings Paper]
Bolelli, Federico; Allegretti, Stefano; Grana, Costantino
abstract

Connected Components Labeling (CCL) represents an essential part of many Image Processing and Computer Vision pipelines. Given its relevance in the field, it has been part of most cutting-edge Computer Vision libraries. In this paper, all the algorithms included in OpenCV over the years are reviewed, from sequential to parallel/GPU-based implementations. Our goal is to provide a better understanding of what has changed and why one algorithm should be preferred to another, in terms of both memory usage and execution speed.
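
To make the task concrete, the sketch below labels a small binary image with a textbook two-scan approach backed by union-find. It is an illustration of what CCL computes, assuming 4-connectivity and a list-of-rows image, and is not one of the OpenCV algorithms reviewed in the paper; the function name is hypothetical.

```python
def label_components(image):
    """Label 4-connected foreground components of a binary image (rows of 0/1).

    Returns (labels, count), where labels has the same shape as image and
    background pixels keep label 0.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i]: union-find parent of provisional label i (0 = background)

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # First scan: assign provisional labels and record label equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up and left:
                labels[r][c] = min(up, left)
                union(up, left)
            elif up or left:
                labels[r][c] = up or left
            else:
                parent.append(len(parent))  # new provisional label, its own root
                labels[r][c] = len(parent) - 1

    # Second scan: map each provisional label to a consecutive final label.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)


# A U-shaped object: two provisional labels that merge on the last row.
u_shape = [[1, 0, 1],
           [1, 0, 1],
           [1, 1, 1]]
u_labels, u_count = label_components(u_shape)
```

The two arms of the U receive different provisional labels during the first scan and are unified only when the bottom row connects them, which is why the equivalence bookkeeping is needed.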


2021 - A Cone Beam Computed Tomography Annotation Tool for Automatic Detection of the Inferior Alveolar Nerve Canal [Conference Proceedings Paper]
Mercadante, Cristian; Cipriano, Marco; Bolelli, Federico; Pollastri, Federico; Di Bartolomeo, Mattia; Anesi, Alexandre; Grana, Costantino
abstract

In recent years, deep learning has been employed in several medical fields, achieving impressive results. Unfortunately, these algorithms require a huge amount of annotated data to ensure the correct learning process. When dealing with medical imaging, collecting and annotating data can be cumbersome and expensive. This is mainly related to the nature of data, often three-dimensional, and to the need for well-trained expert technicians. In maxillofacial imagery, recent works have been focused on the detection of the Inferior Alveolar Nerve (IAN), since its position is of great relevance for avoiding severe injuries during surgery operations such as third molar extraction or implant installation. In this work, we introduce a novel tool for analyzing and labeling the alveolar nerve from Cone Beam Computed Tomography (CBCT) 3D volumes.


2021 - A Deep Analysis on High Resolution Dermoscopic Image Classification [Journal Article]
Pollastri, Federico; Parreño, Mario; Maroñas, Juan; Bolelli, Federico; Paredes, Roberto; Ramos, Daniel; Grana, Costantino
abstract

Convolutional Neural Networks (CNNs) have been broadly employed in dermoscopic image analysis, mainly as a result of the large amount of data gathered by the International Skin Imaging Collaboration (ISIC). As in many other medical imaging domains, state-of-the-art methods take advantage of architectures developed for other tasks, frequently assuming full transferability between enormous sets of natural images (e.g., ImageNet) and dermoscopic images, which is not always the case. With this paper we provide a comprehensive analysis of the effectiveness of state-of-the-art deep learning techniques when applied to dermoscopic image analysis. To this end, we consider several CNN architectures and analyze how their performance is affected by the size of the network, image resolution, data augmentation process, amount of available data, and model calibration. Moreover, taking advantage of the analysis performed, we design a novel ensemble method to further increase the classification accuracy. The proposed solution achieved the third best result in the 2019 official ISIC challenge, with an accuracy of 0.593.


2021 - A Heuristic-Based Decision Tree for Connected Components Labeling of 3D Volumes [Conference Proceedings Paper]
Söchting, Maximilian; Allegretti, Stefano; Bolelli, Federico; Grana, Costantino
abstract

Connected Components Labeling represents a fundamental step for many Computer Vision and Image Processing pipelines. Since the first appearance of the task in the sixties, many algorithmic solutions to optimize the computational load needed to label an image have been proposed. Among them, block-based scan approaches and decision trees proved to be some of the most valuable strategies. However, due to the cost of the manual construction of optimal decision trees and the computational limitations of automatic strategies employed in the past, the application of blocks and decision trees has been restricted to small masks, and thus to 2D algorithms. With this paper we present a novel heuristic algorithm based on decision tree learning methodology, called Entropy Partitioning Decision Tree (EPDT). It allows near-optimal decision trees to be computed for large scan masks. Experimental results demonstrate that algorithms based on the generated decision trees outperform state-of-the-art competitors.


2021 - A Heuristic-Based Decision Tree for Connected Components Labeling of 3D Volumes: Implementation and Reproducibility Notes [Conference Proceedings Paper]
Bolelli, Federico; Allegretti, Stefano; Grana, Costantino
abstract

This paper provides a detailed description of how to install, set up, and use the YACCLAB benchmark to test the algorithms published in "A Heuristic-Based Decision Tree for Connected Components Labeling of 3D Volumes," underlining how the parameters affect experimental results.


2021 - Confidence Calibration for Deep Renal Biopsy Immunofluorescence Image Classification [Conference Proceedings Paper]
Pollastri, Federico; Maroñas, Juan; Bolelli, Federico; Ligabue, Giulia; Paredes, Roberto; Magistroni, Riccardo; Grana, Costantino
abstract

With this work we tackle immunofluorescence classification in renal biopsy, employing state-of-the-art Convolutional Neural Networks. In this setting, the aim of the probabilistic model is to assist an expert practitioner towards identifying the location pattern of antibody deposits within a glomerulus. Since modern neural networks often provide overconfident outputs, we stress the importance of having a reliable prediction, demonstrating that Temperature Scaling (TS), a recently introduced re-calibration technique, can be successfully applied to immunofluorescence classification in renal biopsy. Experimental results demonstrate that the designed model yields good accuracy on the specific task, and that TS is able to provide reliable probabilities, which are highly valuable for such a task given the low inter-rater agreement.
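
As a rough illustration of Temperature Scaling, the sketch below divides the logits by a scalar temperature before the softmax. It is a generic example rather than the authors' code: the function name and the logit values are made up, and in practice the temperature is fitted on held-out validation data instead of being chosen by hand.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits divided by a scalar temperature (must be positive)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for a three-class problem.
logits = [4.0, 1.0, 0.0]
uncalibrated = softmax_with_temperature(logits, temperature=1.0)
softened = softmax_with_temperature(logits, temperature=2.0)
print(uncalibrated)
print(softened)
```

A temperature above 1 lowers the top probability without changing which class ranks first, so accuracy is unaffected while overconfident outputs are tempered.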


2021 - Fast Run-Based Connected Components Labeling for Bitonal Images [Conference Proceedings Paper]
Wonsang, Lee; Allegretti, Stefano; Bolelli, Federico; Grana, Costantino
abstract

Connected Components Labeling (CCL) is a fundamental task in binary image processing. Since its introduction in the sixties, several algorithmic strategies have been proposed to optimize its execution time. Most CCL algorithms in literature, including the current state-of-the-art, are designed to work on an input stored with 1-byte per pixel, even if the most memory-efficient format for a binary input only uses 1-bit per pixel. This paper deals with connected components labeling on 1-bit per pixel images, also known as 1bpp or bitonal images. An existing run-based CCL strategy is adapted to this input format, and optimized with Find First Set hardware operations and a smart management of provisional labels, giving birth to an efficient solution called Bit-Run Two Scan (BRTS). Then, BRTS is further optimized by merging pairs of consecutive lines through bitwise OR, and finding runs on this reduced data. This modification is the basis for another new algorithm on bitonal images, Bit-Merge-Run Scan (BMRS). When evaluated on a public benchmark, the two proposals outperform all the fastest competitors in literature, and therefore represent the new state-of-the-art for connected components labeling on bitonal images.
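
The two bit-level primitives mentioned above, locating runs with find-first-set and merging consecutive lines with bitwise OR, can be sketched as follows. This is only an illustration of the primitives, not an implementation of BRTS or BMRS. It assumes that a Python integer stands in for a packed row with bit i holding pixel i, and the helper names are hypothetical.

```python
def find_first_set(x):
    """Index of the least significant set bit of a nonzero integer."""
    return (x & -x).bit_length() - 1


def find_runs(row_bits, width):
    """Return the [start, end) runs of set bits in a packed row of `width` pixels."""
    runs = []
    x = row_bits & ((1 << width) - 1)  # ignore bits beyond the row width
    pos = 0  # pixel index corresponding to bit 0 of x
    while x:
        start = find_first_set(x)    # skip the zeros before the next run
        x >>= start
        length = find_first_set(~x)  # first zero bit of x = length of the run
        runs.append((pos + start, pos + start + length))
        x >>= length
        pos += start + length
    return runs


# Row with pixels 0, 2, 3, 5 and 6 set (bit 0 is the leftmost pixel here).
row = 0b01101101
runs = find_runs(row, 8)

# OR-ing two consecutive rows yields one row whose runs cover both lines.
merged_runs = find_runs(0b00011 | 0b00110, 5)
```

Each loop iteration consumes a whole run, so the number of iterations depends on the number of runs rather than on the number of pixels.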


2021 - Supporting Skin Lesion Diagnosis with Content-Based Image Retrieval [Conference Proceedings Paper]
Allegretti, Stefano; Bolelli, Federico; Pollastri, Federico; Longhitano, Sabrina; Pellacani, Giovanni; Grana, Costantino
abstract

In recent years, many attempts have been made to create automated devices that could assist both expert and beginner dermatologists towards fast and early diagnosis of skin lesions. Tasks such as skin lesion classification and segmentation have been extensively addressed with deep learning algorithms, which in some cases reach a diagnostic accuracy comparable to that of expert physicians. However, the general lack of interpretability and reliability severely hinders the ability of those approaches to actually support dermatologists in the diagnosis process. In this paper a novel skin image retrieval system is presented, which exploits features extracted by Convolutional Neural Networks to gather similar images from a publicly available dataset, in order to assist the diagnosis process of both expert and novice practitioners. In the proposed framework, ResNet-50 is initially trained for the classification of dermoscopic images; then, the feature extraction part is isolated, and an embedding network is built on top of it. The embedding learns an alternative representation, which allows image similarity to be checked by means of a distance measure. Experimental results reveal that the proposed method is able to select meaningful images, which can effectively boost the classification accuracy of human dermatologists.


2021 - The DeepHealth Toolkit: A Key European Free and Open-Source Software for Deep Learning and Computer Vision Ready to Exploit Heterogeneous HPC and Cloud Architectures [Capitolo/Saggio]
Aldinucci, Marco; Atienza, David; Bolelli, Federico; Caballero, Mónica; Colonnelli, Iacopo; Flich, José; Gómez, Jon A.; González, David; Grana, Costantino; Grangetto, Marco; Leo, Simone; López, Pedro; Oniga, Dana; Paredes, Roberto; Pireddu, Luca; Quiñones, Eduardo; Silva, Tatiana; Tartaglione, Enzo; Zapater, Marina
abstract

At the present time, we are immersed in the convergence between Big Data, High-Performance Computing and Artificial Intelligence. Technological progress in these three areas has accelerated in recent years, forcing different players like software companies and stakeholders to move quickly. The European Union is dedicating a lot of resources to maintain its relevant position in this scenario, funding projects to implement large-scale pilot testbeds that combine the latest advances in Artificial Intelligence, High-Performance Computing, Cloud and Big Data technologies. The DeepHealth project is an example focused on the health sector whose main outcome is the DeepHealth toolkit, a European unified framework that offers deep learning and computer vision capabilities, completely adapted to exploit underlying heterogeneous High-Performance Computing, Big Data and cloud architectures, and ready to be integrated into any software platform to facilitate the development and deployment of new applications for specific problems in any sector. This toolkit is intended to be one of the European contributions to the field of AI. This chapter introduces the toolkit with its main components and complementary tools, providing a clear view to facilitate and encourage its adoption and wide use by the European community of developers of AI-based solutions and data scientists working in the healthcare sector and others.


2021 - The DeepHealth Toolkit: A Unified Framework to Boost Biomedical Applications [Relazione in Atti di Convegno]
Cancilla, Michele; Canalini, Laura; Bolelli, Federico; Allegretti, Stefano; Carrión, Salvador; Paredes, Roberto; Ander Gómez, Jon; Leo, Simone; Enrico Piras, Marco; Pireddu, Luca; Badouh, Asaf; Marco-Sola, Santiago; Alvarez, Lluc; Moreto, Miquel; Grana, Costantino
abstract

Given the overwhelming impact of machine learning on the last decade, several libraries and frameworks have been developed in recent years to simplify the design and training of neural networks, providing array-based programming, automatic differentiation and user-friendly access to hardware accelerators. None of those tools, however, was designed with native and transparent support for Cloud Computing or heterogeneous High-Performance Computing (HPC). The DeepHealth Toolkit is an open source Deep Learning toolkit aimed at boosting productivity of data scientists operating in the medical field by providing a unified framework for the distributed training of neural networks, which is able to leverage hybrid HPC and cloud environments in a transparent way for the user. The toolkit is composed of a Computer Vision library, a Deep Learning library, and a front-end for non-expert users; all of the components are focused on the medical domain, but they are general purpose and can be applied to any other field. In this paper, the principles driving the design of the DeepHealth libraries are described, along with details about the implementation and the interaction between the different elements composing the toolkit. Finally, experiments on common benchmarks prove the efficiency of each separate component and of the DeepHealth Toolkit overall.


2020 - A Warp Speed Chain-Code Algorithm Based on Binary Decision Trees [Relazione in Atti di Convegno]
Allegretti, Stefano; Bolelli, Federico; Grana, Costantino
abstract

Contour extraction, also known as chain-code extraction, is one of the most common algorithms of binary image processing. Although raster scan is the most cache-friendly and, consequently, the fastest way to scan an image, the most commonly used chain-code algorithms perform contour tracing, and therefore tend to be fairly inefficient. In this paper, we took a rarely used algorithm that extracts contours in raster scan, and optimized its execution time through template functions, look-up tables and decision trees, in order to reduce code branches and the average number of load/store operations required. The result is a very fast solution that outperforms the state-of-the-art contour extraction algorithm implemented in OpenCV, on a collection of real case datasets. Contribution: This paper significantly improves the performance of existing chain-code algorithms, by smartly introducing decision trees to reduce code branches and the average number of load/store operations required.


2020 - Augmenting data with GANs to segment melanoma skin lesions [Articolo su rivista]
Pollastri, Federico; Bolelli, Federico; Paredes Palacios, Roberto; Grana, Costantino
abstract

This paper presents a novel strategy that employs Generative Adversarial Networks (GANs) to augment data in the skin lesion segmentation task, which is a fundamental first step in the automated melanoma detection process. The proposed framework generates both skin lesion images and their segmentation masks, making the data augmentation process extremely straightforward. In order to thoroughly analyze how the quality and diversity of synthetic images impact the efficiency of the method, we remodel two different well-known GANs: a Deep Convolutional GAN (DCGAN) and a Laplacian GAN (LAPGAN). Experimental results reveal that, by introducing this kind of synthetic data into the training process, the overall accuracy of a state-of-the-art Convolutional/Deconvolutional Neural Network for melanoma skin lesion segmentation is increased.


2020 - Evaluation of the Classification Accuracy of the Kidney Biopsy Direct Immunofluorescence through Convolutional Neural Networks [Articolo su rivista]
Ligabue, Giulia; Pollastri, Federico; Fontana, Francesco; Leonelli, Marco; Furci, Luciana; Giovanella, Silvia; Alfano, Gaetano; Cappelli, Gianni; Testa, Francesca; Bolelli, Federico; Grana, Costantino; Magistroni, Riccardo
abstract

Background and objectives: Immunohistopathology is an essential technique in the diagnostic workflow of a kidney biopsy. Deep learning is an effective tool in the elaboration of medical imaging. We wanted to evaluate the role of a convolutional neural network as a support tool for kidney immunofluorescence reporting. Design, setting, participants, & measurements: High-magnification (×400) immunofluorescence images of kidney biopsies performed from the year 2001 to 2018 were collected. The report, adopted at the Division of Nephrology of the AOU Policlinico di Modena, describes the specimen in terms of “appearance,” “distribution,” “location,” and “intensity” of the glomerular deposits identified with fluorescent antibodies against IgG, IgA, IgM, C1q and C3 complement fractions, fibrinogen, and κ- and λ-light chains. The report was used as ground truth for the training of the convolutional neural networks. Results: In total, 12,259 immunofluorescence images of 2542 subjects undergoing kidney biopsy were collected. The test set analysis showed accuracy values between 0.79 (“irregular capillary wall” feature) and 0.94 (“fine granular” feature). The agreement test of the results obtained by the convolutional neural networks with respect to the ground truth showed similar values to three pathologists of our center. Convolutional neural networks were 117 times faster than human evaluators in analyzing 180 test images. A web platform, where it is possible to upload digitized images of immunofluorescence specimens, is available to evaluate the potential of our approach. Conclusions: The data showed that the accuracy of convolutional neural networks is comparable with that of pathologists experienced in the field.


2020 - Optimized Block-Based Algorithms to Label Connected Components on GPUs [Articolo su rivista]
Allegretti, Stefano; Bolelli, Federico; Grana, Costantino
abstract

Connected Components Labeling (CCL) is a crucial step of several image processing and computer vision pipelines. Many efficient sequential strategies exist, among which one of the most effective is the use of a block-based mask to drastically cut the number of memory accesses. In the last decade, aided by the fast development of Graphics Processing Units (GPUs), a lot of data parallel CCL algorithms have been proposed along with sequential ones. Applications that run entirely on the GPU can benefit from parallel implementations of CCL that avoid expensive memory transfers between host and device. In this paper, two new eight-connectivity CCL algorithms are proposed, namely Block-based Union Find (BUF) and Block-based Komura Equivalence (BKE). These algorithms optimize existing GPU solutions by introducing a block-based approach. Extensions for three-dimensional datasets are also discussed. In order to produce a fair comparison with previously proposed alternatives, YACCLAB, a public CCL benchmarking framework, has been extended and made suitable for evaluating GPU algorithms as well. Moreover, three-dimensional datasets have been added to its collection. Experimental results on real cases and synthetically generated datasets demonstrate the superiority of the new proposals with respect to state-of-the-art, both on 2D and 3D scenarios.


2020 - Spaghetti Labeling: Directed Acyclic Graphs for Block-Based Connected Components Labeling [Articolo su rivista]
Bolelli, Federico; Allegretti, Stefano; Baraldi, Lorenzo; Grana, Costantino
abstract

Connected Components Labeling is an essential step of many Image Processing and Computer Vision tasks. Since the first proposal of a labeling algorithm, which dates back to the sixties, many approaches have optimized the computational load needed to label an image. In particular, the use of decision forests and state prediction have recently appeared as valuable strategies to improve performance. However, due to the overhead of the manual construction of prediction states and the size of the resulting machine code, the application of these strategies has been restricted to small masks, thus ignoring the benefit of using a block-based approach. In this paper, we combine a block-based mask with state prediction and code compression: the resulting algorithm is modeled as a Directed Rooted Acyclic Graph with multiple entry points, which is automatically generated without manual intervention. When tested on synthetic and real datasets, in comparison with optimized implementations of state-of-the-art algorithms, the proposed approach shows superior performance, surpassing the results obtained by all compared approaches in all settings.


2020 - Towards Reliable Experiments on the Performance of Connected Components Labeling Algorithms [Articolo su rivista]
Bolelli, Federico; Cancilla, Michele; Baraldi, Lorenzo; Grana, Costantino
abstract

The problem of labeling the connected components of a binary image is well-defined and several proposals have been presented in the past. Since an exact solution to the problem exists, algorithms mainly differ on their execution speed. In this paper, we propose and describe YACCLAB, Yet Another Connected Components Labeling Benchmark. Together with a rich and varied dataset, YACCLAB contains an open source platform to test new proposals and to compare them with publicly available competitors. Textual and graphical outputs are automatically generated for many kinds of tests, which analyze the methods from different perspectives. An extensive set of experiments among state-of-the-art techniques is reported and discussed.


2019 - A Block-Based Union-Find Algorithm to Label Connected Components on GPUs [Relazione in Atti di Convegno]
Allegretti, Stefano; Bolelli, Federico; Cancilla, Michele; Grana, Costantino
abstract

In this paper, we introduce a novel GPU-based Connected Components Labeling algorithm: the Block-based Union Find. The proposed strategy significantly improves an existing GPU algorithm, taking advantage of a block-based approach. Experimental results on real cases and synthetically generated datasets demonstrate the superiority of the new proposal with respect to state-of-the-art.


2019 - Connected Components Labeling on DRAGs: Implementation and Reproducibility Notes [Relazione in Atti di Convegno]
Bolelli, Federico; Cancilla, Michele; Baraldi, Lorenzo; Grana, Costantino
abstract

In this paper we describe the algorithmic implementation details of "Connected Components Labeling on DRAGs" (Directed Rooted Acyclic Graphs), studying the influence of parameters on the results. Moreover, a detailed description of how to install, set up and use YACCLAB (Yet Another Connected Components LAbeling Benchmark) to test DRAG is provided.


2019 - How does Connected Components Labeling with Decision Trees perform on GPUs? [Relazione in Atti di Convegno]
Allegretti, Stefano; Bolelli, Federico; Cancilla, Michele; Pollastri, Federico; Canalini, Laura; Grana, Costantino
abstract

In this paper the problem of Connected Components Labeling (CCL) in binary images using Graphic Processing Units (GPUs) is tackled from a different perspective. In the last decade, many novel algorithms have been released, specifically designed for GPUs. Because CCL literature concerning sequential algorithms is very rich, and includes many efficient solutions, designers of parallel algorithms were often inspired by techniques that had already proved successful in a sequential environment, such as the Union-Find paradigm for solving equivalences between provisional labels. However, the use of decision trees to minimize memory accesses, which is one of the main features of the best performing sequential algorithms, was never taken into account when designing parallel CCL solutions. In fact, branches in the code tend to cause thread divergence, which usually leads to inefficiency. However, this consideration does not necessarily apply to every possible scenario. Are we sure that the advantages of decision trees do not compensate for the cost of thread divergence? In order to answer this question, we chose three well-known sequential CCL algorithms, which employ decision trees as the cornerstone of their strategy, and we built a data-parallel version of each of them. Experimental tests on real case datasets show that, in most cases, these solutions outperform state-of-the-art algorithms, thus demonstrating the effectiveness of decision trees also in a parallel environment.
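
For readers unfamiliar with the Union-Find paradigm mentioned in this abstract, the following is a minimal sequential sketch of how it can resolve equivalences between provisional labels in a classical two-pass, 8-connectivity labeling. It is a textbook-style illustration with hypothetical function names, not one of the decision-tree or GPU algorithms evaluated in the paper.

```python
def find(parent, x):
    # Follow parent links to the root, halving the path on the way.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    # Merge two label sets, keeping the smaller root as representative.
    ra, rb = find(parent, a), find(parent, b)
    if ra < rb:
        parent[rb] = ra
    elif rb < ra:
        parent[ra] = rb


def label(image):
    """Two-pass 8-connectivity labeling of a list-of-lists binary image."""
    h, w = len(image), len(image[0])
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # index 0 is reserved for the background
    # First pass: assign provisional labels and record equivalences.
    for y in range(h):
        for x in range(w):
            if not image[y][x]:
                continue
            # Already-visited neighbours: left, upper-left, up, upper-right.
            neigh = []
            for dy, dx in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny and 0 <= nx < w and labels[ny][nx]:
                    neigh.append(labels[ny][nx])
            if not neigh:
                parent.append(len(parent))  # new provisional label
                labels[y][x] = len(parent) - 1
            else:
                labels[y][x] = min(neigh)
                for n in neigh:
                    union(parent, labels[y][x], n)
    # Second pass: replace provisional labels with consecutive final labels.
    final = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                root = find(parent, labels[y][x])
                labels[y][x] = final.setdefault(root, len(final) + 1)
    return labels


if __name__ == "__main__":
    for row in label([[1, 0, 1], [1, 0, 1], [1, 1, 1]]):
        print(row)
```

In the U-shaped example, the two arms first receive different provisional labels, and the union performed on the bottom row merges them into a single component.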


2019 - Improving the Performance of Thinning Algorithms with Directed Rooted Acyclic Graphs [Relazione in Atti di Convegno]
Bolelli, Federico; Grana, Costantino
abstract

In this paper we propose a strategy to optimize the performance of thinning algorithms. This solution is obtained by combining three proven strategies for binary images neighborhood exploration, namely modeling the problem with an optimal decision tree, reusing pixels from the previous step of the algorithm, and reducing the code footprint by means of Directed Rooted Acyclic Graphs. A complete and open-source benchmarking suite is also provided. Experimental results confirm that the proposed algorithms clearly outperform classical implementations.


2019 - Skin Lesion Segmentation Ensemble with Diverse Training Strategies [Relazione in Atti di Convegno]
Canalini, Laura; Pollastri, Federico; Bolelli, Federico; Cancilla, Michele; Allegretti, Stefano; Grana, Costantino
abstract

This paper presents a novel strategy to perform skin lesion segmentation from dermoscopic images. We design an effective segmentation pipeline, and explore several pre-training methods to initialize the features extractor, highlighting how different procedures lead the Convolutional Neural Network (CNN) to focus on different features. An encoder-decoder segmentation CNN is employed to take advantage of each pre-trained features extractor. Experimental results reveal how multiple initialization strategies can be exploited, by means of an ensemble method, to obtain state-of-the-art skin lesion segmentation accuracy.


2018 - A Hierarchical Quasi-Recurrent approach to Video Captioning [Relazione in Atti di Convegno]
Bolelli, Federico; Baraldi, Lorenzo; Grana, Costantino
abstract

Video captioning has picked up a considerable measure of attention thanks to the use of Recurrent Neural Networks, since they can be utilized to both encode the input video and to create the corresponding description. In this paper, we present a recurrent video encoding scheme which can find and exploit the layered structure of the video. Differently from the established encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose to employ Quasi-Recurrent Neural Networks, further extending their basic cell with a boundary detector which can recognize discontinuity points between frames or segments and likewise modify the temporal connections of the encoding layer. We assess our approach on a large scale dataset, the Montreal Video Annotation dataset. Experiments demonstrate that our approach can find suitable levels of representation of the input information, while reducing the computational requirements.


2018 - Aligning Text and Document Illustrations: towards Visually Explainable Digital Humanities [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Cornia, Marcella; Grana, Costantino; Cucchiara, Rita
abstract

While several approaches to bring vision and language together are emerging, none of them has yet addressed the digital humanities domain, which, nevertheless, is a rich source of visual and textual data. To foster research in this direction, we investigate the learning of visual-semantic embeddings for historical document illustrations, devising both supervised and semi-supervised approaches. We exploit the joint visual-semantic embeddings to automatically align illustrations and textual elements, thus providing an automatic annotation of the visual content of a manuscript. Experiments are performed on the Borso d'Este Holy Bible, one of the most sophisticated illuminated manuscripts from the Renaissance, which we manually annotate aligning every illustration with textual commentaries written by experts. Experimental results quantify the domain shift between ordinary visual-semantic datasets and the proposed one, validate the proposed strategies, and devise future works on the same line.


2018 - Connected Components Labeling on DRAGs [Relazione in Atti di Convegno]
Bolelli, Federico; Baraldi, Lorenzo; Cancilla, Michele; Grana, Costantino
abstract

In this paper we introduce a new Connected Components Labeling (CCL) algorithm which exploits a novel approach to model decision problems as Directed Acyclic Graphs with a root, which will be called Directed Rooted Acyclic Graphs (DRAGs). This structure supports the use of sets of equivalent actions, as required by CCL, and optimally leverages these equivalences to reduce the number of nodes (decision points). The advantage of this representation is that a DRAG, differently from decision trees usually exploited by the state-of-the-art algorithms, will contain only the minimum number of nodes required to reach the leaf corresponding to a set of condition values. This combines the benefits of using binary decision trees with a reduction of the machine code size. Experiments show a consistent improvement of the execution time when the model is applied to CCL.


2018 - Improving Skin Lesion Segmentation with Generative Adversarial Networks [Relazione in Atti di Convegno]
Pollastri, Federico; Bolelli, Federico; Paredes, Roberto; Grana, Costantino
abstract

This paper proposes a novel strategy that employs Generative Adversarial Networks (GANs) to augment data in the image segmentation field, and a Convolutional-Deconvolutional Neural Network (CDNN) to automatically generate lesion segmentation masks from dermoscopic images. Training the CDNN with our GAN-generated data effectively improves the state-of-the-art.


2018 - Optimizing GPU-Based Connected Components Labeling Algorithms [Relazione in Atti di Convegno]
Allegretti, Stefano; Bolelli, Federico; Cancilla, Michele; Grana, Costantino
abstract

Connected Components Labeling (CCL) is a fundamental image processing technique, widely used in various application areas. The computational throughput of Graphical Processing Units (GPUs) makes them well suited to this kind of algorithm. In the last decade, many approaches to compute CCL on GPUs have been proposed. Unfortunately, most of them have focused on 4-way connectivity, neglecting the importance of 8-way connectivity. This paper aims to extend state-of-the-art GPU-based algorithms from 4 to 8-way connectivity and to improve them with additional optimizations. Experimental results revealed the effectiveness of the proposed strategies.


2018 - SACHER Project: A Cloud Platform and Integrated Services for Cultural Heritage and for Restoration [Relazione in Atti di Convegno]
Bertacchi, Silvia; Al Jawarneh, Isam Mashhour; Apollonio, Fabrizio Ivan; Bertacchi, Gianna; Cancilla, Michele; Foschini, Luca; Grana, Costantino; Martuscelli, Giuseppe; Montanari, Rebecca
abstract

The SACHER project provides a distributed, open source and federated cloud platform able to support the life-cycle management of various kinds of data concerning tangible Cultural Heritage. The paper describes the SACHER platform and, in particular, among the various integrated service prototypes, the most important ones to support restoration processes and cultural asset management: (i) 3D Life Cycle Management for Cultural Heritage (SACHER 3D CH), based on 3D digital models of architecture and dedicated to the management of Cultural Heritage and to the storage of the numerous data generated by the team of professionals involved in the restoration process; (ii) Multidimensional Search Engine for Cultural Heritage (SACHER MuSE CH), an advanced multi-level search system designed to manage Heritage data from heterogeneous sources.


2018 - XDOCS: An Application to Index Historical Documents [Relazione in Atti di Convegno]
Bolelli, Federico; Borghi, Guido; Grana, Costantino
abstract

Dematerialization and digitalization of historical documents are key elements for their availability, preservation and diffusion. Unfortunately, the conversion from handwritten to digitalized documents presents several technical challenges. The XDOCS project was created with the main goal of making historical documents available and extending their usability for a wide variety of audiences, like scholars, institutions and libraries. In this paper the core elements of XDOCS, i.e. page dewarping and word spotting technique, are described and two new applications, i.e. annotation/indexing and search tool, are presented.


2017 - A Video Library System Using Scene Detection and Automatic Tagging [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

We present a novel video browsing and retrieval system for edited videos, in which videos are automatically decomposed into meaningful and storytelling parts (i.e. scenes) and tagged according to their transcript. The system relies on a Triplet Deep Neural Network which exploits multimodal features, and has been implemented as a set of extensions to the eXo Platform Enterprise Content Management System (ECMS). This set of extensions enables the interactive visualization of a video, its automatic and semi-automatic annotation, as well as a keyword-based search inside the video collection. The platform also allows a natural integration with third-party add-ons, so that automatic annotations can be exploited outside the proposed platform.


2017 - Affective Classification of Gaming Activities Coming From RPG Gaming Sessions [Relazione in Atti di Convegno]
Balducci, Fabrizio; Grana, Costantino
abstract

Each human activity involves feelings and subjective emotions: different people will perform and sense the same task with different outcomes and experience; to understand this experience, concepts like Flow or Boredom must be investigated using objective data provided by methods like electroencephalography. This work carries on the analysis of EEG data coming from brain-computer interface and videogame "Neverwinter Nights 2": we propose an experimental methodology comparing results coming from different off-the-shelf machine learning techniques, employed on the gaming activities, to check if each affective state corresponds to the hypothesis fixed in their formal design guidelines.


2017 - Affective level design for a role-playing videogame evaluated by a brain–computer interface and machine learning methods [Articolo su rivista]
Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita
abstract

Game science has become a research field, which attracts industry attention due to a worldwide rich sell-market. To understand the player experience, concepts like flow or boredom mental states require formalization and empirical investigation, taking advantage of the objective data that psychophysiological methods like electroencephalography (EEG) can provide. This work studies the affective ludology and shows two different game levels for Neverwinter Nights 2 developed with the aim to manipulate emotions; two sets of affective design guidelines are presented, with a rigorous formalization that considers the characteristics of role-playing genre and its specific gameplay. An empirical investigation with a brain–computer interface headset has been conducted: by extracting numerical data features, machine learning techniques classify the different activities of the gaming sessions (task and events) to verify if their design differentiation coincides with the affective one. The observed results, also supported by subjective questionnaires data, confirm the goodness of the proposed guidelines, suggesting that this evaluation methodology could be extended to other evaluation tasks.


2017 - Hierarchical Boundary-Aware Neural Encoder for Video Captioning [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

The use of Recurrent Neural Networks for video captioning has recently gained a lot of attention, since they can be used both to encode the input video and to generate the corresponding description. In this paper, we present a recurrent video encoding scheme which can discover and leverage the hierarchical structure of the video. Unlike the classical encoder-decoder approach, in which a video is encoded continuously by a recurrent layer, we propose a novel LSTM cell, which can identify discontinuity points between frames or segments and modify the temporal connections of the encoding layer accordingly. We evaluate our approach on three large-scale datasets: the Montreal Video Annotation dataset, the MPII Movie Description dataset and the Microsoft Video Description Corpus. Experiments show that our approach can discover appropriate hierarchical representations of input videos and improve the state of the art results on movie description datasets.


2017 - Historical Handwritten Text Images Word Spotting through Sliding Window HOG Features [Relazione in Atti di Convegno]
Bolelli, Federico; Borghi, Guido; Grana, Costantino
abstract

In this paper we present an innovative technique to semi-automatically index handwritten word images. The proposed method is based on HOG descriptors and exploits the Dynamic Time Warping technique to compare feature vectors computed from single handwritten words. Our strategy is applied to a new challenging dataset extracted from Italian civil registries of the XIX century. Experimental results, compared with some previously developed word spotting strategies, confirmed that our method outperforms competitors.
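
As a hedged illustration of how Dynamic Time Warping can compare two variable-length sequences of per-window feature vectors, the sketch below computes the classical DTW cost with a Euclidean local distance. The function name dtw_distance and the toy one-dimensional features are assumptions made for the example; the paper's actual HOG descriptors, local cost and any path constraints are not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Classical DTW cost between two sequences of feature vectors.

    a and b are lists of equal-dimension vectors (e.g. one per sliding window).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean local cost
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    short = [[0.0], [1.0], [2.0]]
    stretched = [[0.0], [1.0], [1.0], [2.0]]
    print(dtw_distance(short, stretched))
```

Because the alignment may repeat elements, a word written more widely (the stretched sequence above) can still match its narrower counterpart at low cost, which is the property that makes DTW attractive for word spotting.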


2017 - Layout analysis and content classification in digitized books [Relazione in Atti di Convegno]
Corbelli, Andrea; Baraldi, Lorenzo; Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita
abstract

Automatic layout analysis has proven to be extremely important in the process of digitization of large amounts of documents. In this paper we present a mixed approach to layout analysis, introducing a SVM-aided layout segmentation process and a classification process based on local and geometrical features. The final output of the automatic analysis algorithm is a complete and structured annotation in JSON format, containing the digitalized text as well as all the references to the illustrations of the input page, and which can be used by visualization interfaces as well as annotation interfaces. We evaluate our algorithm on a large dataset built upon the first volume of the “Enciclopedia Treccani”.


2017 - NeuralStory: an Interactive Multimedia System for Video Indexing and Re-use [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

In recent years video has been swamping the Internet: websites, social networks, and business multimedia systems are adopting video as the most important form of communication and information. Videos are normally accessed as a whole and are not indexed in the visual content. Thus, they are often uploaded as short, manually cut clips with user-provided annotations, keywords and tags for retrieval. In this paper, we propose a prototype multimedia system which addresses these two limitations: it overcomes the need of human intervention in the video setting, thanks to fully deep learning-based solutions, and decomposes the storytelling structure of the video into coherent parts. These parts can be shots, key-frames, scenes and semantically related stories, and are exploited to provide an automatic annotation of the visual content, so that parts of video can be easily retrieved. This also allows a principled re-use of the video itself: users of the platform can indeed produce new storytelling by means of multi-modal presentations, add text and other media, and propose a different visual organization of the content. We present the overall solution, and some experiments on the re-use capability of our platform in edutainment by conducting an extensive user evaluation.


2017 - Pixel classification methods to detect skin lesions on dermoscopic medical images [Relazione in Atti di Convegno]
Balducci, Fabrizio; Grana, Costantino
abstract

In recent years the interest of the biomedical and computer vision communities in the acquisition and analysis of epidermal images has increased, because melanoma is one of the deadliest forms of skin cancer and its early identification could save lives, reducing unnecessary medical treatments. User-friendly automatic tools can be very useful for physicians and dermatologists; in fact, high-resolution images and their annotated data, combined with analysis pipelines and machine learning techniques, represent the base to develop intelligent and proactive diagnostic systems. In this work we present two skin lesion detection pipelines on dermoscopic medical images, by exploiting standard techniques combined with workarounds that improve results; moreover, to highlight the performance, we consider a set of metrics combined with pixel labeling and classification. A preliminary but functional evaluation phase has been conducted with a sub-set of hard-to-treat images, in order to check which proposed detection pipeline reaches the best results.


2017 - Preface [Relazione in Atti di Convegno]
Grana, C.; Baraldi, L.
abstract


2017 - Recognizing and Presenting the Storytelling Video Structure with Deep Multimodal Networks [Articolo su rivista]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

In this paper, we propose a novel scene detection algorithm which employs semantic, visual, textual and audio cues. We also show how the hierarchical decomposition of the storytelling video structure can improve retrieval results presentation with semantically and aesthetically effective thumbnails. Our method is built upon two advancements of the state of the art: 1) semantic feature extraction which builds video specific concept detectors; 2) multimodal feature embedding learning, that maps the feature vector of a shot to a space in which the Euclidean distance has task specific semantic properties. The proposed method is able to decompose the video in annotated temporal segments which allow for a query specific thumbnail extraction. Extensive experiments are performed on different data sets to demonstrate the effectiveness of our algorithm. An in-depth discussion on how to deal with the subjectivity of the task is conducted and a strategy to overcome the problem is suggested.


2017 - SACHER: Smart Architecture for Cultural Heritage in Emilia Romagna [Relazione in Atti di Convegno]
Apollonio, F. I.; Rizzo, F.; Bertacchi, S.; Dall'Osso, G.; Corbelli, A.; Grana, C.
abstract

The current Cultural Heritage management system lacks ICT platforms for the management and integration of heterogeneous and fragmented data sources and for the interconnection between the private and public subjects involved in the process. The SACHER project intends to fill this gap, working both on a technological level and on a business model level: firstly, by providing a platform based on an open-source distributed cloud-computing environment for the management of the complete data lifecycle related to cultural assets; secondly, by providing new models based on participatory design for Cultural Heritage data directed towards social entrepreneurship. This paper presents the first implementation of a system for managing data based on the 3D model of the cultural object, with a focus on the process for cultural assets management and the interface design for cultural services.


2017 - Segmentation models diversity for object proposals [Articolo su rivista]
Manfredi, Marco; Grana, Costantino; Cucchiara, Rita; Smeulders, Arnold W. M.
abstract

In this paper we present a segmentation proposal method which employs a box-hypotheses generation step followed by a lightweight segmentation strategy. Inspired by interactive segmentation, for each automatically placed bounding-box we compute a precise segmentation mask. We introduce diversity in segmentation strategies enhancing a generic model performance exploiting class-independent regional appearance features. Foreground probability scores are learned from groups of objects with peculiar characteristics to specialize segmentation models. We demonstrate results comparable to the state-of-the-art on PASCAL VOC 2012 and a further improvement by merging our proposals with those of a recent solution. The ability to generalize to unseen object categories is demonstrated on Microsoft COCO 2014.


2017 - Two More Strategies to Speed Up Connected Components Labeling Algorithms [Relazione in Atti di Convegno]
Bolelli, Federico; Cancilla, Michele; Grana, Costantino
abstract

This paper presents two strategies that can be used to improve the speed of Connected Components Labeling algorithms. The first one operates on optimal decision trees considering the occurrences of image patterns, while the second one articulates how two-scan algorithms can be parallelized using multi-threading. Experimental results demonstrate that the proposed methodologies reduce the total execution time of state-of-the-art two-scan algorithms.
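
For readers unfamiliar with the term, the following is a minimal sketch of a generic two-scan labeling with union-find on a binary image, assuming 8-connectivity. It is not the decision-tree or multi-threaded method of the paper; the function names (`find`, `union`, `label_two_scan`) are hypothetical and chosen only for illustration.

```python
def find(parent, x):
    """Return the root of x, compressing the path along the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def union(parent, a, b):
    """Merge the sets of a and b, keeping the smaller root as representative."""
    ra, rb = find(parent, a), find(parent, b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


def label_two_scan(img):
    """Label 8-connected foreground components of a binary image (list of rows)."""
    h = len(img)
    w = len(img[0]) if h else 0
    labels = [[0] * w for _ in range(h)]
    parent = [0]  # index 0 is reserved for the background
    # First scan: assign provisional labels and record equivalences.
    for y in range(h):
        for x in range(w):
            if not img[y][x]:
                continue
            neigh = []
            # Already visited neighbours: upper-left, upper, upper-right, left.
            for dy, dx in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if ny >= 0 and 0 <= nx < w and labels[ny][nx]:
                    neigh.append(labels[ny][nx])
            if not neigh:
                parent.append(len(parent))  # create a new provisional label
                labels[y][x] = len(parent) - 1
            else:
                m = min(neigh)
                labels[y][x] = m
                for n in neigh:
                    union(parent, m, n)
    # Second scan: replace provisional labels with consecutive final labels.
    remap = {}
    for y in range(h):
        for x in range(w):
            if labels[y][x]:
                root = find(parent, labels[y][x])
                labels[y][x] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

The first scan only looks at pixels that were already visited, which is why a second scan is needed to resolve the recorded label equivalences.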


2016 - A Browsing and Retrieval System for Broadcast Videos using Scene Detection and Automatic Annotation [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Messina, Alberto; Cucchiara, Rita
abstract

This paper presents a novel video access and retrieval system for edited videos. The key element of the proposal is that videos are automatically decomposed into semantically coherent parts (called scenes) to provide a more manageable unit for browsing, tagging and searching. The system features an automatic annotation pipeline, with which videos are tagged by exploiting both the transcript and the video itself. Scenes can also be retrieved with textual queries; the best thumbnail for a query is selected according to both semantics and aesthetics criteria.


2016 - Analysis and Re-use of Videos in Educational Digital Libraries with Automatic Scene Detection [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

The advent of modern approaches to education, like Massive Open Online Courses (MOOCs), has made video the basic medium for educating and transmitting knowledge. However, IT tools are still not adequate to allow video content re-use, tagging, annotation and personalization. In this paper we analyze the problem of identifying coherent sequences, called scenes, in order to provide the users with a more manageable editing unit. A simple spectral clustering technique is proposed and compared with state-of-the-art results. We also discuss correct ways to evaluate the performance of automatic scene detection algorithms.


2016 - Dynamic Optical Coherence Tomography in Dermatology [Articolo su rivista]
Ulrich, Martina; Themstrup, Lotte; de Carvalho, Nathalie; Manfredi, Marco; Grana, Costantino; Ciardo, Silvana; Kästle, Raphaela; Holmes, Jon; Whitehead, Richard; Jemec, Gregor B. E; Pellacani, Giovanni; Welzel, Julia
abstract

Optical coherence tomography (OCT) represents a non-invasive imaging technology, which may be applied to the diagnosis of non-melanoma skin cancer and which has recently been shown to improve the diagnostic accuracy of basal cell carcinoma. Technical developments of OCT continue to expand the applicability of OCT for different neoplastic and inflammatory skin diseases. Of these, dynamic OCT (D-OCT) based on speckle variance OCT is of special interest as it allows the in vivo evaluation of blood vessels and their distribution within specific lesions, providing additional functional information and consequently greater density of data. In an effort to assess the potential of D-OCT for future scientific and clinical studies, we have therefore reviewed the literature and preliminary unpublished data on the visualization of the microvasculature using D-OCT. Information on D-OCT in skin cancers including melanoma, as well as in a variety of other skin diseases, is presented in an atlas. Possible diagnostic features are suggested, although these require additional validation.


2016 - Guest editorial: Multimedia for cultural heritage [Articolo su rivista]
Grana, C.; Serra, G.
abstract


2016 - Historical Document Digitization through Layout Analysis and Deep Content Classification [Relazione in Atti di Convegno]
Corbelli, Andrea; Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

Document layout segmentation and recognition is an important task in the creation of digitized document collections, especially when dealing with historical documents. This paper presents a hybrid approach to layout segmentation as well as a strategy to classify document regions, which is applied to the process of digitization of a historical encyclopedia. Our layout analysis method merges a classic top-down approach and a bottom-up classification process based on local geometrical features, while regions are classified by means of features extracted from a Convolutional Neural Network and fed to a Random Forest classifier. Experiments are conducted on the first volume of the "Enciclopedia Treccani", a large dataset containing 999 manually annotated pages from the historical Italian encyclopedia.


2016 - Layout analysis and content enrichment of digitized books [Articolo su rivista]
Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Coppi, Dalia; Cucchiara, Rita
abstract

In this paper we describe a system for automatically analyzing old documents and creating hyperlinks between different epochs, thus opening ancient documents to young people and making them available on the web with old and current content. We propose a supervised learning approach to segment the text and illustrations of digitized old documents using a texture feature based on local correlation, aimed at detecting the repeating patterns of text regions and differentiating them from pictorial elements. Moreover, we present a solution to help the user find contemporary content connected to what is automatically extracted from the ancient documents.


2016 - Optimized Connected Components Labeling with Pixel Prediction [Relazione in Atti di Convegno]
Grana, Costantino; Baraldi, Lorenzo; Bolelli, Federico
abstract

In this paper we propose a new paradigm for connected components labeling, which employs a general approach to minimize the number of memory accesses, by exploiting the information provided by already seen pixels, removing the need to check them again. The scan phase of our proposed algorithm is ruled by a forest of decision trees connected into a single graph. Every tree derives from a reduction of the complete optimal decision tree. Experimental results demonstrated that on low density images our method is slightly faster than the fastest conventional labeling algorithms.


2016 - Scene-driven Retrieval in Edited Videos using Aesthetic and Semantic Deep Features [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

This paper presents a novel retrieval pipeline for video collections, which aims to retrieve the most significant parts of an edited video for a given query, and represent them with thumbnails which are at the same time semantically meaningful and aesthetically remarkable. Videos are first segmented into coherent and story-telling scenes, then a retrieval algorithm based on deep learning is proposed to retrieve the most significant scenes for a textual query. A ranking strategy based on deep features is finally used to tackle the problem of visualizing the best thumbnail. Qualitative and quantitative experiments are conducted on a collection of edited videos to demonstrate the effectiveness of our approach.


2016 - Shot, scene and keyframe ordering for interactive video re-use [Relazione in Atti di Convegno]
Baraldi, L.; Grana, C.; Borghi, G.; Vezzani, R.; Cucchiara, R.
abstract

This paper presents a complete system for shot and scene detection in broadcast videos, as well as a method to select the best representative key-frames, which could be used in new interactive interfaces for accessing large collections of edited videos. The final goal is to enable an improved access to video footage and the re-use of video content with the direct management of user-selected video-clips.


2016 - Skin Surface Reconstruction and 3D Vessels Segmentation in Speckle Variance Optical Coherence Tomography [Relazione in Atti di Convegno]
Manfredi, Marco; Grana, Costantino; Pellacani, Giovanni
abstract

In this paper we present a method for in vivo surface reconstruction and 3D vessel segmentation from Speckle-Variance Optical Coherence Tomography imaging, applied to dermatology. This novel technology makes it possible to capture motion underneath the skin surface, revealing the presence of blood vessels. Standard OCT visualization techniques are inappropriate for this new source of information, which is crucial in early skin cancer diagnosis. We investigate 3D reconstruction techniques for better visualization of both the external and internal structure of skin lesions, as a tool to help clinicians in the task of qualitative tumor evaluation.


2016 - YACCLAB - Yet Another Connected Components Labeling Benchmark [Relazione in Atti di Convegno]
Grana, Costantino; Bolelli, Federico; Baraldi, Lorenzo; Vezzani, Roberto
abstract

The problem of labeling the connected components (CCL) of a binary image is well-defined and several proposals have been presented in the past. Since an exact solution to the problem exists and must be provided as output, algorithms mainly differ in their execution speed. In this paper, we propose and describe YACCLAB, Yet Another Connected Components Labeling Benchmark. Together with a rich and varied dataset, YACCLAB contains an open source platform to test new proposals and to compare them with publicly available competitors. Textual and graphical outputs are automatically generated for three kinds of test, which analyze the methods from different perspectives. The fairness of the comparisons is guaranteed by running on the same system and over the same datasets. Examples of usage and the corresponding comparisons among state-of-the-art techniques are reported to confirm the potential of the benchmark.


2015 - A Deep Siamese Network for Scene Detection in Broadcast Videos [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

We present a model that automatically divides broadcast videos into coherent scenes by learning a distance measure between shots. Experiments are performed to demonstrate the effectiveness of our approach by comparing our algorithm against recent proposals for automatic scene segmentation. We also propose an improved performance measure that aims to reduce the gap between numerical evaluation and expected results, and propose and release a new benchmark dataset.


2015 - Classification of Affective Data to Evaluate the Level Design in a Role-Playing Videogame [Relazione in Atti di Convegno]
Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita
abstract

This paper presents a novel approach to evaluate game level design strategies, applied to role-playing games. Following a set of well-defined guidelines, two game levels were designed for Neverwinter Nights 2 to manipulate particular emotions like boredom or flow, and tested by 13 subjects wearing a brain-computer interface helmet. A set of features was extracted from the affective data logs and used to classify different parts of the gaming sessions, to verify the correspondence between the original level aims and the actual effects on people's emotions. The very interesting correlations observed suggest that the technique is extensible to other similar evaluation tasks.


2015 - GOLD: Gaussians of Local Descriptors for Image Representation [Articolo su rivista]
Serra, Giuseppe; Grana, Costantino; Manfredi, Marco; Cucchiara, Rita
abstract

The Bag of Words paradigm has been the baseline from which several successful image classification solutions were developed in the last decade. These represent images by quantizing local descriptors and summarizing their distribution. The quantization step introduces a dependency on the dataset which, even if it significantly boosts the performance in some contexts, severely limits its generalization capabilities. In contrast, in this paper we propose to model the local feature distribution with a multivariate Gaussian, without any quantization. The full rank covariance matrix, which lies on a Riemannian manifold, is projected on the tangent Euclidean space and concatenated to the mean vector. The resulting representation, a Gaussian of local descriptors (GOLD), allows the dot product to be used to closely approximate a distance between distributions without the need for expensive kernel computations. We describe an image by an improved spatial pyramid, which avoids boundary effects with soft assignment: local descriptors contribute to neighboring Gaussians, forming a weighted spatial pyramid of GOLD descriptors. In addition, we extend the model by leveraging dataset characteristics in a mixture-of-Gaussians formulation, further improving the classification accuracy. To deal with large scale datasets and high dimensional feature spaces, the Stochastic Gradient Descent solver is adopted. Experimental results on several publicly available datasets show that the proposed method obtains state-of-the-art performance.


2015 - Measuring scene detection performance [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

In this paper we evaluate the performance of scene detection techniques, starting from the classic precision/recall approach, moving to the better designed coverage/overflow measures, and finally proposing an improved metric, in order to solve frequently observed cases in which the numeric interpretation is different from the expected results. Numerical evaluation is performed on two recent proposals for automatic scene detection, which are compared with a simple but effective novel approach. Experiments are conducted to show how different measures may lead to different interpretations.


2015 - Scene segmentation using temporal clustering for accessing and re-using broadcast video [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

Scene detection is a fundamental tool for allowing effective video browsing and re-using. In this paper we present a model that automatically divides videos into coherent scenes, which is based on a novel combination of local image descriptors and temporal clustering techniques. Experiments are performed to demonstrate the effectiveness of our approach, by comparing our algorithm against two recent proposals for automatic scene segmentation. We also propose improved performance measures that aim to reduce the gap between numerical evaluation and expected results.


2015 - Shot and Scene Detection via Hierarchical Clustering for Re-using Broadcast Video [Relazione in Atti di Convegno]
Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita
abstract

Video decomposition techniques are fundamental tools for allowing effective video browsing and re-using. In this work, we consider the problem of segmenting broadcast videos into coherent scenes, and propose a scene detection algorithm based on hierarchical clustering, along with a very fast state-of-the-art shot segmentation approach. Experiments are performed to demonstrate the effectiveness of our algorithms, by comparing against recent proposals for automatic shot and scene segmentation.


2015 - Standards in dermatologic imaging [Articolo su rivista]
Marghoob, A. A.; Soyer, H. P.; Curiel, C.; Dasilva, D.; High, W. A.; Morrison, L. H.; Zirato, J.; Kittler, H.; Argenziano, G.; Braun, R. P.; Haenssle, H.; Menzies, S. W.; Puig, S.; Scope, A.; Stolz, W.; Thomas, L.; Zalaudek, I.; Malvehy, J.; Abedini, M.; Chen, Q.; Garnavi, R.; Sun, X.; Canfield, D.; Codella, N. C. F.; Garcia, R.; Quintana, J.; Grana, C.; Pellacani, G.; Josipovic, M.; Klar, P.; Mayer, A.; Molenda, M. A.; Mullani, N.; Skladnev, V.; Stoecker, W. V.; Hoffman-Wellenhof, R.
abstract



2014 - A complete system for garment segmentation and color classification [Articolo su rivista]
Manfredi, Marco; Grana, Costantino; Calderara, Simone; Cucchiara, Rita
abstract

In this paper, we propose a general approach for automatic segmentation, color-based retrieval and classification of garments in fashion store databases, exploiting shape and color information. The garment segmentation is automatically initialized by learning geometric constraints and shape cues, then it is performed by modeling both skin and accessory colors with Gaussian Mixture Models. For color similarity retrieval and classification, to adapt the color description to the users’ perception and the company marketing directives, a color histogram with an optimized binning strategy, learned on the given color classes, is introduced and combined with HOG features for garment classification. Experiments validating the proposed strategy, and a free-to-use dataset publicly available for scientific purposes, are finally detailed.


2014 - Covariance of Covariance Features for Image Classification [Relazione in Atti di Convegno]
Serra, Giuseppe; Grana, Costantino; Manfredi, Marco; Cucchiara, Rita
abstract

In this paper we propose a novel image descriptor built by computing the covariance of pixel level features on densely sampled patches and encoding them using their covariance. Appropriate projections to the Euclidean space and feature normalizations are employed in order to provide a strong descriptor usable with linear classifiers. In order to remove border effects, we further enhance the Spatial Pyramid representation with bilinear interpolation. Experimental results on two common datasets for object and texture classification show that the performance of our method is comparable with state-of-the-art techniques, while removing any dataset-specific dependency in the feature encoding step.


2014 - Illustrations Segmentation in Digitized Documents Using Local Correlation Features [Relazione in Atti di Convegno]
Coppi, Dalia; Grana, Costantino; Cucchiara, Rita
abstract

In this paper we propose an approach for Document Layout Analysis based on local correlation features. We identify and extract illustrations in digitized documents by learning the discriminative patterns of textual and pictorial regions. The proposal has been demonstrated to be effective on historical datasets and to outperform the state of the art in the presence of challenging documents with a large variety of pictorial elements.


2014 - Learning Graph Cut Energy Functions for Image Segmentation [Relazione in Atti di Convegno]
Manfredi, Marco; Grana, Costantino; Cucchiara, Rita
abstract

In this paper we address the task of learning how to segment a particular class of objects, by means of a training set of images and their segmentations. In particular we propose a method to overcome the extremely high training time of a previously proposed solution to this problem, Kernelized Structural Support Vector Machines. We employ a one-class SVM working with joint kernels to robustly learn significant support vectors (representative image-mask pairs) and accordingly weight them to build a suitable energy function for the graph cut framework. We report results obtained on two public datasets and a comparison of training times on different training set sizes.


2014 - Learning Superpixel Relations for Supervised Image Segmentation [Relazione in Atti di Convegno]
Manfredi, Marco; Grana, Costantino; Cucchiara, Rita
abstract

In this paper we propose to extend the well known graph cut segmentation framework by learning superpixel relations and use them to weight superpixel-to-superpixel edges in a superpixel graph. Adjacent superpixel-pairs are analyzed to build an object boundary model, able to discriminate between superpixel-pairs belonging to the same object or placed on the edge between the foreground object and the background. Several superpixel-pair features are investigated and exploited to build a non-linear SVM to learn object boundary appearance. The adoption of this modified graph cut enhances the performance of a previously proposed segmentation method on two publicly available datasets, reaching state-of-the-art results.


2014 - Miniature illustrations retrieval and innovative interaction for digital illuminated manuscripts [Articolo su rivista]
Borghesani, Daniele; Grana, Costantino; Cucchiara, Rita
abstract

In this paper we propose a multimedia solution for the interactive exploration of illuminated manuscripts. We leverage the joint exploitation of content-based image retrieval and relevance feedback to provide an effective mechanism to navigate through the manuscript and add custom knowledge in the form of tags. The similarity retrieval between miniature illustrations is based on covariance descriptors, integrating color, spatial and gradient information. The proposed relevance feedback technique, namely Query Remapping Feature Space Warping, accounts for the user’s opinions by accordingly warping the data points. This is obtained by means of a remapping strategy (from the Riemannian space where covariance matrices lie, referring back to Euclidean space) useful to boost the retrieval performance. Experiments are reported to show the quality of the proposal. Moreover, the complete prototype with user interaction, as already showcased at museums and exhibitions, is presented.


2014 - Truncated Isotropic Principal Component Classifier for Image Classification [Relazione in Atti di Convegno]
Rozza, A.; Serra, Giuseppe; Grana, Costantino
abstract

This paper reports a novel approach to deal with the problem of Object and Scene recognition, extending the traditional Bag of Words approach in two ways. Firstly, a dataset-independent method of summarizing local features, based on multivariate Gaussian descriptors, is employed. Secondly, a recently proposed classification technique, particularly suited for high dimensional feature spaces without any dimensionality reduction step, makes it possible to effectively exploit these features. Experiments are performed on two publicly available datasets and demonstrate the effectiveness of our approach when compared to state-of-the-art methods.


2013 - A Fast Approach for Integrating ORB Descriptors in the Bag of Words Model [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Manfredi, Marco; Cucchiara, Rita
abstract

In this paper we propose to integrate the recently introduced ORB descriptors in the currently favored approach for image classification, that is, the Bag of Words model. In particular, the problem to be solved is to provide a clustering method able to deal with the binary string nature of the ORB descriptors. We suggest using a k-means-like approach, called k-majority, which replaces the Euclidean distance with the Hamming distance and takes the majority-selected vector as the new cluster center. Results combining this new approach with other features are provided over the ImageCLEF 2011 dataset.
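
The clustering idea described in this abstract can be illustrated with a minimal sketch, assuming binary descriptors stored as Python integers and a bitwise majority vote for the center update. The names `hamming`, `majority_center` and `k_majority`, as well as the `init_centers` parameter and the tie-breaking rule (a bit is set only on a strict majority), are assumptions made for illustration and are not taken from the paper.

```python
import random


def hamming(a, b):
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def majority_center(members, n_bits):
    """Bitwise majority vote: bit i is set if more than half of the members set it."""
    center = 0
    for i in range(n_bits):
        ones = sum((m >> i) & 1 for m in members)
        if 2 * ones > len(members):
            center |= 1 << i
    return center


def nearest(d, centers):
    """Index of the center closest to d in Hamming distance (first one on ties)."""
    return min(range(len(centers)), key=lambda c: hamming(d, centers[c]))


def k_majority(descriptors, k, n_bits, iters=20, init_centers=None, seed=0):
    """k-means-like clustering with Hamming distance and majority-vote centers."""
    if init_centers is None:
        centers = random.Random(seed).sample(descriptors, k)
    else:
        centers = list(init_centers)
    for _ in range(iters):
        assign = [nearest(d, centers) for d in descriptors]
        new_centers = []
        for c in range(k):
            members = [d for d, a in zip(descriptors, assign) if a == c]
            # An empty cluster keeps its previous center.
            new_centers.append(majority_center(members, n_bits) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    # Final assignment against the centers that are returned.
    return centers, [nearest(d, centers) for d in descriptors]
```

Because both the distance and the center update work directly on bits, the descriptors never need to be converted to floating-point vectors.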


2013 - Automatic Single-Image People Segmentation and Removal for Cultural Heritage Imaging [Relazione in Atti di Convegno]
Manfredi, Marco; Grana, Costantino; Cucchiara, Rita
abstract

In this paper, the problem of automatic people removal from digital photographs is addressed. Removing unintended people from a scene can be very useful to focus further steps of image analysis only on the object of interest. A supervised segmentation algorithm is presented and tested in several scenarios.


2013 - Beyond Bag of Words for Concept Detection and Search of Cultural Heritage Archives [Relazione in Atti di Convegno]
Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita
abstract

Several local features have become quite popular for concept detection and search, due to their ability to capture distinctive details. Typically a Bag of Words approach is followed, where a codebook is built by quantizing the local features. In this paper, we propose to represent SIFT local features extracted from an image as a multivariate Gaussian distribution, obtaining a mean vector and a covariance matrix. Differently from common techniques based on the Bag of Words model, our solution does not rely on the construction of a visual vocabulary, thus removing the dependence of the image descriptors on the specific dataset and allowing the features to be immediately retargeted to different classification and search problems. Experiments are conducted on two very different Cultural Heritage image archives, composed of illuminated manuscript miniatures and pictures of architectural elements collected from the web, on which the proposed approach outperforms the Bag of Words technique both in classification and retrieval.


2013 - Image Classification with Multivariate Gaussian Descriptors [Relazione in Atti di Convegno]
Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita
abstract

Techniques based on the Bag of Words approach represent images by quantizing local descriptors and summarizing their distribution in a histogram. Differently, in this paper we describe an image as a multivariate Gaussian distribution, estimated over the extracted local descriptors. The estimated distribution is mapped to a high-dimensional descriptor, by concatenating the mean vector and the projection of the covariance matrix on the Euclidean space tangent to the Riemannian manifold. To deal with large scale datasets and high dimensional feature spaces, the Stochastic Gradient Descent solver is adopted. The experimental results on Caltech-101 and ImageCLEF2011 show that the method obtains competitive performance with state-of-the-art approaches.


2013 - Lightweight Sign Recognition for Mobile Devices [Relazione in Atti di Convegno]
Fornaciari, Michele; Prati, Andrea; Grana, Costantino; Cucchiara, Rita
abstract

The diffusion of powerful mobile devices has laid the basis for new applications that implement sophisticated computer vision and pattern recognition algorithms directly on the devices (which are embedded devices). This paper describes the implementation of a complete system for automatic recognition of places localized on a map through the recognition of significant signs by means of the camera of a mobile device (smartphone, tablet, etc.). The paper proposes a novel classification algorithm based on the innovative use of bag-of-words on ORB features. The recognition is achieved using a simple yet effective search scheme which exploits GPS localization to limit the possible matches. This simple solution brings several advantages, such as speed even on limited-resource devices, usability even with limited training samples, and ease of adaptation to new training samples and classes. The overall architecture of the system is based on a REST-JSON client-server architecture. The experiments have been conducted in a real scenario, evaluating the different parameters which influence the performance.


2013 - Modeling Local Descriptors with Multivariate Gaussians for Object and Scene Recognition [Relazione in Atti di Convegno]
Serra, Giuseppe; Grana, Costantino; Manfredi, Marco; Cucchiara, Rita
abstract

Common techniques represent images by quantizing local descriptors and summarizing their distribution in a histogram. In this paper we propose to employ a parametric description and compare its capabilities to histogram-based approaches. We use the multivariate Gaussian distribution, applied over the SIFT descriptors, extracted with dense sampling on a spatial pyramid. Every distribution is converted to a high-dimensional descriptor, by concatenating the mean vector and the projection of the covariance matrix on the Euclidean space tangent to the Riemannian manifold. Experiments on Caltech-101 and ImageCLEF2011 are performed using the Stochastic Gradient Descent solver, which makes it possible to deal with large scale datasets and high dimensional feature spaces.


2013 - UNIMORE at ImageCLEF 2013: Scalable Concept Image Annotation [Relazione in Atti di Convegno]
Grana, Costantino; Serra, Giuseppe; Manfredi, Marco; Cucchiara, Rita; Martoglia, Riccardo; Mandreoli, Federica
abstract

In this paper we propose a large-scale image annotation system for the Scalable Concept Image Annotation task. For each concept to be detected, a separate classifier is built using the provided textual annotation. Images are represented as a multivariate Gaussian distribution of a set of local features extracted over a dense regular grid. Textual analysis of the web pages containing training images is performed to retrieve a relevant set of samples for learning each concept classifier. An online SVM solver based on Stochastic Gradient Descent is used to manage the large amount of training data. Experimental results show that the combination of different kinds of local features encoded with our strategy achieves very competitive performance both in terms of mAP and mean F-measure.


2012 - 2D Images Map Warping for Improved User Interaction [Relazione in Atti di Convegno]
Borghesani, Daniele; Grana, Costantino; Cucchiara, Rita
abstract

In this paper, we suggest an interaction model designed to fit users' expectations in front of an image retrieval system. A lightweight relevance feedback strategy, working directly on the 2D projection of image features, allows the user to spatially navigate the media collection maintaining the real-time constraint. A preliminary evaluation of this relevance feedback strategy shows good performance compared with other known approaches.


2012 - A human vs. machine challenge in fashion color classification [Relazione in Atti di Convegno]
Grana, C.; Borghesani, D.; Cucchiara, R.
abstract

For this demo, we present a set of stark applications designed to evaluate the performance of a color similarity retrieval system against human operators' performance in the same tasks. The proposed series of tests gives some interesting insights about the perception of color classes and the reliability of manual annotation in the fashion context.


2012 - Class-based color bag of words for fashion retrieval [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

Color signatures, histograms and bags of colors are basic and effective strategies for describing the color content of images, for retrieving images by their color appearance or providing color annotation. In some domains, colors assume a specific meaning for users, and color-based classification and retrieval should mirror the initial suggestions given by users in the training set. For instance, in the fashion world, the names given to the dominant color of a garment or a dress reflect the fashion dictate and not a uniform division of the color space. In this paper we propose a general approach to implement a color signature as a trained bag of words, defined on the basis of user-defined color classes. The novel Class-based Color Bag of Words is an easily computable bag of words of color, constructed following an approach similar to the Median Cut algorithm, but biased by the color distribution in the trained classes. Moreover, to dramatically reduce the computational effort we propose 3D integral histograms, a 3D extension of integral images, easily extensible to many histogram-based signatures in 3D color space. Several comparisons on large fashion datasets confirm the discriminant power of this signature.
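
The idea of a 3D integral histogram can be illustrated with a minimal sketch. It assumes, as one plausible reading of the abstract, that the structure is a cumulative sum over a quantized RGB histogram, so that the number of pixels falling in any axis-aligned box of the color space is obtained from 8 lookups. The function names, the bin count, and the half-open box convention are illustrative assumptions, not the authors' implementation.

```python
def build_integral_histogram(pixels, bins=8):
    """Return a (bins+1)^3 cumulative histogram of (r, g, b) pixels in 0..255.

    integral[i][j][k] holds the number of pixels whose quantized color
    has r-bin < i, g-bin < j and b-bin < k (hypothetical layout).
    """
    step = 256 // bins  # assumes bins divides 256
    n = bins + 1
    integral = [[[0] * n for _ in range(n)] for _ in range(n)]
    # Plain histogram, stored with a +1 offset so index 0 stays zero.
    for r, g, b in pixels:
        integral[r // step + 1][g // step + 1][b // step + 1] += 1
    # Cumulative sums along the b, g and r axes in turn.
    for i in range(1, n):
        for j in range(1, n):
            for k in range(1, n):
                integral[i][j][k] += integral[i][j][k - 1]
    for i in range(1, n):
        for j in range(1, n):
            for k in range(1, n):
                integral[i][j][k] += integral[i][j - 1][k]
    for i in range(1, n):
        for j in range(1, n):
            for k in range(1, n):
                integral[i][j][k] += integral[i - 1][j][k]
    return integral


def box_count(integral, lo, hi):
    """Count pixels with bin indices in [lo, hi) on each axis (inclusion-exclusion)."""
    (r0, g0, b0), (r1, g1, b1) = lo, hi
    s = integral
    return (s[r1][g1][b1]
            - s[r0][g1][b1] - s[r1][g0][b1] - s[r1][g1][b0]
            + s[r0][g0][b1] + s[r0][g1][b0] + s[r1][g0][b0]
            - s[r0][g0][b0])


pixels = [(0, 0, 0), (255, 255, 255), (10, 200, 30), (130, 130, 130)]
ih = build_integral_histogram(pixels)
print(box_count(ih, (0, 0, 0), (8, 8, 8)))  # whole color cube
```

Once the table is built, each box query costs a constant number of additions regardless of the box size, which is the property a Median-Cut-like splitting procedure would exploit when evaluating many candidate boxes.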


2012 - Learning Non-Target Items for Interesting Clothes Segmentation in Fashion Images [Relazione in Atti di Convegno]
Grana, Costantino; Calderara, Simone; Borghesani, Daniele; Cucchiara, Rita
abstract

In this paper we propose a color-based approach for skin detection and interest garment selection aimed at an automatic segmentation of pieces of clothing. For both purposes, the color description is extracted by an iterative energy minimization approach and an automatic initialization strategy is proposed by learning geometric constraints and shape cues. Experiments confirm the good performance of this technique both in the context of skin removal and in the context of classification of garments.


2012 - Multimedia for Cultural Heritage: Key Issues [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Borghesani, Daniele; M., Agosti; A. D., Bagdanov
abstract

Multimedia technologies have recently created the conditions for a true revolution in the Cultural Heritage domain, particularly in reference to the study, exploitation, and fruition of artistic works. New opportunities are arising for researchers in the field of multimedia to share their research results with people coming from the field of art and culture, and vice versa. This paper gathers together opinions and ideas shared during the final discussion session at the 1st International Workshop on Multimedia for Cultural Heritage, as a summary of the problems and possible directions to solve them.


2012 - Optimal Decision Trees for Local Image Processing Algorithms [Articolo su rivista]
Grana, Costantino; Montangero, Manuela; Borghesani, Daniele
abstract

In this paper we present a novel algorithm to synthesize an optimal decision tree from OR-decision tables, an extension of standard decision tables, complete with the formal proof of optimality and computational cost analysis. As many problems that require recognizing particular patterns can be modeled with this formalism, we select two common binary image processing algorithms, namely connected components labeling and thinning, to show how these can be represented with decision tables, and the benefits of their implementation as optimal decision trees in terms of reduced memory accesses. Experiments are reported to show the computational time improvements over state-of-the-art implementations.


2012 - Preface [Relazione in Atti di Convegno]
Grana, C.; Cucchiara, R.
abstract


2012 - Relevance Feedback as an Interactive Navigation Tool [Relazione in Atti di Convegno]
Borghesani, Daniele; Grana, Costantino; Cucchiara, Rita
abstract

Image collections are searched in common retrieval systems in many different ways, but the typical presentation is by means of a grid styled view. In this paper we try to suggest a novel use of relevance feedback as a tool to warp the view and allow the user to spatially navigate the image collection, and at the same time focus on his retrieval aim. This is obtained by the use of a distance based space warping on the 2D projection of the distance matrix.


2012 - Special Issue: Recent Achievements in Multimedia for Cultural Heritage - Guest Editorial [Articolo su rivista]
Cucchiara, Rita; Grana, Costantino
abstract

For quite some time, libraries, document and historical centers from opposite corners of the world have been the caretakers of our rich and assorted social legacy. They have protected and furnished access to the testimonies of knowledge, beauty and inspiration, such as sculptures, paintings, music and literature. The new information technologies have created unbelievable opportunities to make this common heritage more accessible for all. Culture is following the digital path and “memory institutions” are adapting the way in which they communicate with their public. Multimedia technologies have recently created the conditions for a true revolution in the cultural heritage area, with reference to the study, valorization, and fruition of artistic works. New multimedia technologies shall be able to be utilized to plan unique approaches to the perception and fulfillment of the masterful legacy, for instance, through smart cultural objects and new interfaces with the backing of items such as story-telling, gaming and learning. All the plurality of masterpieces (paintings, books, manuscripts, even photos of sculptures and architecture) can be effectively embedded into a unique “paradigm” through digitization. This allows a significant reduction in costs, an enormous expansion of public accessibility (and therefore income), and at the same time a tremendous freedom for data elaboration. In brief, digitization enhances pleasure for the public and usefulness to experts on cultural heritage assets.


2012 - Towards Artistic Collections Navigation Tools based on Relevance Feedback [Relazione in Atti di Convegno]
Borghesani, Daniele; Grana, Costantino; Cucchiara, Rita
abstract

Artistic image collections are usually managed via textual metadata in standard content management systems. More sophisticated searches can be performed using image retrieval technologies based on visual content. Nevertheless, the problem of the information presentation remains. In this paper we try to move beyond the classic grid-styled presentation model, suggesting a novel use of relevance feedback as a navigation tool. Relevance feedback is therefore used to warp the view and allow the user to spatially navigate the image collection, and at the same time focus on his retrieval aim. This is obtained by exploiting a distance-based space warping on the 2D projection of the distance matrix. Multitouch gestures are employed to provide feedback by natural interaction with the system.


2012 - Veiling Luminance estimation on FPGA-based embedded smart camera [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Santinelli, Paolo; Cucchiara, Rita
abstract

This paper describes the design and development of a Veiling Luminance estimation system based on the use of a CMOS image sensor, fully implemented on FPGA. The system is composed of the CMOS image sensor, FPGA, DDR SDRAM, USB controller and SPI (Serial Peripheral Interface) Flash. The FPGA is used to build a system-on-chip integrating a soft processor (Xilinx MicroBlaze) and all the hardware blocks needed to handle the external peripherals and memory. The soft processor is used to handle image acquisition and all computational tasks needed to compute the Veiling Luminance value. The advantages of this single-chip FPGA implementation include the reduction of the hardware requirements, power consumption, and system complexity. The problem of high dynamic range images has been addressed with multiple acquisitions at different exposure times. Vignetting, radial distortion and angular weighting, as required by the veiling luminance definition, are handled by a single integer look-up table (LUT) access. Results are compared with a state-of-the-art certified instrument.


2011 - A low-cost system and calibration method for veiling luminance measurement [Relazione in Atti di Convegno]
Cattini, Stefano; Grana, Costantino; Cucchiara, Rita; Rovati, Luigi
abstract

A CCD-based measuring instrument aimed at veiling luminance estimation and the related low-cost calibration method are described. The system may allow the estimation of the optimum luminance levels in road-tunnel lighting, thus both increasing drivers' safety and avoiding wasted energy and, hence, unjustified higher lighting costs.


2011 - Automatic segmentation of digitalized historical manuscripts [Articolo su rivista]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

The artistic content of historical manuscripts provides a lot of challenges in terms of automatic text extraction, picture segmentation and retrieval by similarity. In particular this work addresses the problem of automatic extraction of meaningful pictures, distinguishing them from handwritten text and floral and abstract decorations. The proposed solution firstly employs a circular statistics description of a directional histogram in order to extract text. Then visual descriptors are computed over the pictorial regions of the page: the semantic content is distinguished from the decorative parts using color histograms and a novel texture feature called Gradient Spatial Dependency Matrix. The feature vectors are finally processed using an embedding procedure which allows increased performance in later SVM classification. Results for both feature extraction and embedding based classification are reported, supporting the effectiveness of the proposal on high resolution replicas of artistic manuscripts.


2011 - Feature Space Warping Relevance Feedback with Transductive Learning [Relazione in Atti di Convegno]
Borghesani, Daniele; Coppi, Dalia; Grana, Costantino; Calderara, Simone; Cucchiara, Rita
abstract

Relevance feedback is a widely adopted approach to improve content-based information retrieval systems by keeping the user in the retrieval loop. Among the fundamental relevance feedback approaches, feature space warping has been proposed as an effective approach for bridging the gap between high-level semantics and the low-level features. Recently, a combination of feature space warping and query point movement techniques has been proposed in contrast to learning-based approaches, showing good performance under different data distributions. In this paper we propose to merge feature space warping and transductive learning, in order to benefit from both the ability of adapting data to the user hints and the information coming from unlabeled samples. Experimental results on an image retrieval task reveal significant performance improvements from the proposed method.


2011 - Optimal Decision Trees Generation from OR-Decision Tables [Relazione in Atti di Convegno]
Grana, Costantino; Montangero, Manuela; Borghesani, Daniele; Cucchiara, Rita
abstract

In this paper we present a novel dynamic programming algorithm to synthesize an optimal decision tree from OR-decision tables, an extension of standard decision tables, which allow choosing between several alternative actions in the same rule. Experiments are reported, showing the computational time improvements over state-of-the-art implementations of connected components labeling, using this modelling technique.


2011 - Probabilistic people tracking with appearance models and occlusion classification: The AD-HOC system [Articolo su rivista]
Vezzani, Roberto; Grana, Costantino; Cucchiara, Rita
abstract

AD-HOC (Appearance Driven Human tracking with Occlusion Classification) is a complete framework for multiple people tracking in video surveillance applications in the presence of large occlusions. The appearance-based approach allows the estimation of the pixel-wise shape of each tracked person even during the occlusion. This peculiarity can be very useful for higher level processes, such as action recognition or event detection. A first step predicts the position of all the objects in the new frame while a MAP framework provides a solution for best placement. A second step associates each candidate foreground pixel to an object according to mutual object position and color similarity. A novel definition of non-visible regions accounts for the parts of the objects that are not detected in the current frame, classifying them as dynamic, scene or apparent occlusions. Results on surveillance videos are reported, using in-house produced videos and the PETS2006 test set.


2011 - Relevance feedback strategies for artistic image collections tagging [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

This paper provides an analysis of relevance feedback techniques in a multimedia system designed for the interactive exploration and annotation of artistic collections, in particular illuminated manuscripts. The relevance feedback is presented not only as a very effective technique to improve the performance of the system, but also as a clever way to improve the user experience, mixing the interactive surfing through the artistic content with the possibility to gather valuable information from the user, and consequently improving his retrieval satisfaction. We compare a modification of the Mean-Shift Feature Space Warping algorithm, as representative of the standard RF procedures, and a learning-based technique based on transduction, considered in order to overcome some limitations of the previous technique. Experiments are reported regarding the adopted visual features based on covariance matrices.


2011 - Workshop IMPRESS 2011: Preface [Relazione in Atti di Convegno]
Decker, H.; Grana, C.; Perez, J. -C.
abstract


2010 - Bag-Of-Words Classification of Miniature Illustrations [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Gualdi, Giovanni; Cucchiara, Rita
abstract

In this paper a system for illuminated manuscripts images analysis is presented. In particular the bag-of-keypoints strategy, commonly adopted for object recognition, image classification and scene recognition, is applied to the classification of automatically extracted miniatures. Pictures are characterized by SURF descriptors, and a classification procedure is performed, comparing the results of Naive Bayes and histogram intersection distance measures.
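
As an illustration of the histogram intersection measure mentioned above, the following minimal sketch compares bag-of-keypoints histograms (counts of visual words) and assigns a query to the most similar class prototype. The prototype dictionary, the class names, and the nearest-prototype rule are assumptions made for the example; the abstract does not specify how the measure is embedded in the classification procedure.

```python
def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: sum of bin-wise minima of the normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    s1, s2 = sum(h1), sum(h2)
    if s1 == 0 or s2 == 0:
        return 0.0  # an empty histogram shares no mass with anything
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


def classify(query, prototypes):
    """Return the label of the prototype histogram most similar to the query."""
    return max(prototypes,
               key=lambda label: histogram_intersection(query, prototypes[label]))


# Hypothetical visual-word counts over a 3-word vocabulary.
prototypes = {"miniature": [2, 1, 0], "decoration": [0, 1, 3]}
print(classify([3, 1, 0], prototypes))  # prints "miniature"
```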


2010 - Decision Trees for Fast Thinning Algorithms [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

We propose a new efficient approach for neighborhood exploration, optimized with decision tables and decision trees, suitable for local algorithms in image processing. In this work, it is employed to speed up two widely used thinning techniques. The performance gain is shown over a large freely available dataset of scanned document images.


2010 - High Performance Connected Components Labeling on FPGA [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Santinelli, Paolo; Cucchiara, Rita
abstract

This paper proposes a comparison of the two most advanced algorithms for connected components labeling, highlighting how they perform on a soft-core SoC architecture based on FPGA. In particular we test our block-based connected components labeling algorithm, optimized with decision tables and decision trees. The embedded system is composed of the CMOS image sensor, FPGA, DDR SDRAM, USB controller and SPI Flash. Results highlight the importance of caching and of instruction and data cache sizes for high performance image processing tasks.


2010 - Improving classification and retrieval of illuminated manuscripts with semantic information [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this paper we detail a proposal of exploitation of expert-made commentaries in a unified system for illuminated manuscripts images analysis. In particular we will explore the possibility to improve the automatic segmentation of meaningful pictures, as well as the retrieval by similarity search engine, using clusters of keywords extracted from commentaries as semantic information.


2010 - Message from the IMPRESS 2010 Workshop Chairs [Relazione in Atti di Convegno]
H., Decker; Grana, Costantino; J. C., Pérez; E., Vidal
abstract



2010 - Message from the organizers [Prefazione o Postfazione]
Decker, H.; Grana, C.; Perez, J. C.; Vidal, E.
abstract


2010 - Optimized Block-based Connected Components Labeling with Decision Trees [Articolo su rivista]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this paper we define a new paradigm for 8-connection labeling, which employs a general approach to improve neighborhood exploration and minimizes the number of memory accesses. Firstly we exploit and extend the decision table formalism introducing OR-decision tables, in which multiple alternative actions are managed. An automatic procedure to synthesize the optimal decision tree from the decision table is used, providing the most effective conditions evaluation order. Secondly we propose a new scanning technique that moves on a 2x2 pixel grid over the image, which is optimized by the automatically generated decision tree. An extensive comparison with the state-of-the-art approaches is proposed, both on synthetic and real datasets. The synthetic dataset is composed of random images of different sizes and densities, while the real datasets are an artistic image analysis dataset, a document analysis dataset for text detection and recognition, and finally a standard resolution dataset for picture segmentation tasks. The algorithm provides an impressive speedup over the state-of-the-art algorithms.


2010 - Rerum Novarum: Interactive Exploration of Illuminated Manuscripts [Relazione in Atti di Convegno]
Borghesani, Daniele; Grana, Costantino; Cucchiara, Rita
abstract

This paper describes an interactive application for the exploration and annotation of illuminated manuscripts, which typically contain thousands of pictures, used to comment or embellish the manuscript Gothic text. The system is composed of a modern user interface for browsing, surfing and querying, an automatic segmentation module, to ease the initial picture extraction task, and a similarity-based retrieval engine, used to provide visually assisted tagging capabilities. A relevance feedback procedure is included to further refine the results.


2010 - Surfing on Artistic Documents with Visually Assisted Tagging [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

This paper describes a complete architecture for the interactive exploration and annotation of artistic collections. In particular the focus is on Renaissance illuminated manuscripts, which typically contain thousands of pictures, used to comment or embellish the manuscript Gothic text. The final aim is to create a human-centered multimedia application allowing non-practitioners to enjoy these masterpieces and expert users to share their knowledge. The system is composed of a modern user interface for browsing, surfing and querying, an automatic segmentation module, to ease the initial picture extraction task, and a similarity-based retrieval engine, used to provide visually assisted tagging capabilities. A relevance feedback procedure is included to further refine the results. Experiments are reported regarding the adopted visual features based on covariance matrices and the Mean Shift Feature Space Warping relevance feedback. Finally some hints on the user interface for museum installations are discussed.


2009 - Automatic Analysis of Historical Manuscripts [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this paper a document analysis tool for historical manuscripts is proposed. The goal is to automatically segment layout components of the page, that is text, pictures and decorations. We specifically focused on the pictures, proposing a set of visual features able to identify significant pictures and separating them from all the floral and abstract decorations. The analysis is performed by blocks using a limited set of color and texture features, including a new texture descriptor particularly effective for this task, namely Gradient Spatial Dependency Matrix. The feature vectors are processed by an embedding procedure which allows increased performance in later SVM classification.


2009 - Color features performance comparison for image retrieval [Relazione in Atti di Convegno]
Borghesani, Daniele; Grana, Costantino; Cucchiara, Rita
abstract

This paper proposes a comparison of color features for image retrieval. In particular the UCID image database has been employed to compare the retrieval capabilities of different color descriptors. The set of descriptors comprises global and spatially related features, and the tests show that HSV based global features provide the best performance at varying brightness and contrast settings.


2009 - Connected component labeling techniques on modern architectures [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this paper we present an overview of the historical evolution of connected component labeling algorithms, and in particular the ones applied on images stored in raster scan order. This brief survey aims at providing a comprehensive comparison of their performance on modern architectures, since the high availability of memory and the presence of caches make some solutions more suitable and fast. Moreover we propose a new strategy for label propagation based on 2x2 blocks, which improves the performance of many existing algorithms. The tests are conducted on high resolution images obtained from digitized historical manuscripts and a set of transformations is applied in order to show the algorithms' behavior at different image resolutions and with a varying number of labels.


2009 - Dynamic Pictorially Enriched Ontologies for Digital Video Libraries [Articolo su rivista]
M., Bertini; A., Del Bimbo; Serra, Giuseppe; C., Torniai; Cucchiara, Rita; Grana, Costantino; Vezzani, Roberto
abstract

This article presents a framework for automatic semantic annotation of video streams with an ontology that includes concepts expressed using linguistic terms and visual data.


2009 - Fast Block Based Connected Components Labeling [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this paper we present a new optimization technique for the neighborhood computation in connected component labeling focused on images stored in raster scan order. This new technique is based on a 2x2 square block analysis of the image, and it exploits the fact that, when using 8-connection, the pixels of a 2x2 square are all connected to each other. This implies that they will share the same label at the end of the computation. To prove the effectiveness of our proposal, we show a comprehensive comparison of the most used and advanced connected components labeling techniques presented so far. The tests are conducted on high resolution images obtained from digitized historical manuscripts and a set of transformations is applied in order to show the algorithms' behavior at different image resolutions and with a varying number of labels.
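
The 2x2 block property can be sketched as follows. Since any two foreground pixels in the same 2x2 block are 8-neighbors, label equivalences only need to be tracked per block rather than per pixel. The code below is a deliberately naive union-find illustration of that idea; it is not the optimized scanning procedure evaluated in the paper, and all names in it are hypothetical.

```python
def label_by_blocks(img):
    """8-connected labeling of a binary image using one union-find node per 2x2 block."""
    h, w = len(img), len(img[0])
    bw = (w + 1) // 2                      # blocks per row
    parent = list(range(((h + 1) // 2) * bw))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    def fg(y, x):
        return 0 <= y < h and 0 <= x < w and img[y][x] != 0

    def block(y, x):
        return (y // 2) * bw + x // 2

    # Merge blocks that contain 8-adjacent foreground pixels.
    # Only "forward" neighbors are visited, so each pair is seen once.
    for y in range(h):
        for x in range(w):
            if fg(y, x):
                for dy, dx in ((0, 1), (1, -1), (1, 0), (1, 1)):
                    if fg(y + dy, x + dx):
                        union(block(y, x), block(y + dy, x + dx))

    # Every foreground pixel inherits the label of its block's root.
    labels = [[0] * w for _ in range(h)]
    ids = {}
    for y in range(h):
        for x in range(w):
            if fg(y, x):
                root = find(block(y, x))
                labels[y][x] = ids.setdefault(root, len(ids) + 1)
    return labels


image = [[1, 0, 0, 1],
         [0, 1, 0, 0],
         [0, 0, 0, 0],
         [1, 1, 0, 1]]
for row in label_by_blocks(image):
    print(row)
```

The equivalence structure holds one entry per block instead of one per pixel, which is where the reduction in label operations comes from in this simplified setting.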


2009 - Optimal decision tree synthesis for efficient neighborhood computation [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele
abstract

This work proposes a general approach to optimize the time required to perform a choice in a decision support system, with particular reference to image processing tasks with neighborhood analysis. The decisions are encoded in a decision table paradigm that allows multiple equivalent procedures to be performed for the same situation. An automatic synthesis of the optimal decision tree is implemented in order to generate the most efficient order in which conditions should be considered to minimize the computational requirements. To test our approach, the connected component labeling scenario is considered. Results will show the speedup introduced using an automatically built decision system able to efficiently analyze and explore the neighborhood.


2009 - Picture Extraction from Digitized Historical Manuscripts [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this work we propose a system for automatic document segmentation to extract graphical elements from historical manuscripts and then to identify significant pictures from them, removing floral and abstract decorations. The system performs a block based analysis by means of color and texture features. The Gradient Spatial Dependency Matrix, a new texture operator particularly effective for this task, is proposed. The feature vectors are processed by an embedding procedure which allows increased performance in later SVM classification. Results for both feature extraction and embedding based classification are reported, supporting the effectiveness of the proposal.


2008 - "Inside the Bible": Segmentation, Annotation and Retrieval for a New Browsing Experience [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Calderara, Simone; Cucchiara, Rita
abstract

In this paper we present a system for automatic segmentation, annotation and image retrieval based on content, focused on illuminated manuscripts and in particular the Borso D'Este Holy Bible. To enhance the interaction possibilities with this work, full of decorations and illustrations, we exploit some well known document analysis techniques in addition to some new approaches, in order to achieve good segmentation of pages into meaningful visual objects with the relative annotation. We wanted to extend the standard keyword-based retrieval approach in a commentary with a modern visual-based retrieval by appearance similarity: an entire software user interface for exploration and visual search of illuminated manuscripts.


2008 - Describing Texture Directions with Von Mises Distributions [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this work we describe a new approach for texture characterization. Starting from the autocorrelation matrix, an elegant description through a mixture of Von Mises distributions is proposed. A compact 6-valued descriptor is produced for each block and serves as input to an SVM classifier. Tests are carried out on high resolution illuminated manuscript images.


2007 - Colour clusters for computer diagnosis of melanocytic lesions [Articolo su rivista]
Seidenari, Stefania; Grana, Costantino; Pellacani, Giovanni
abstract

Background: To overcome subjectivity and variability in the interpretation of dermoscopic images, image analysis programs, enabling the numerical description of melanocytic lesion images, have been developed. Objectives: Our aim was to assess a method for the description of colours in melanocytic lesion images, based on the subdivision of image colours into red, green and blue clusters. Methods: Melanomas and naevi of the test set were described by means of 23 colour clusters previously selected by a training set comprising 369 melanocytic lesion images. The diagnostic performance obtained by this automated method was compared to sensitivity and specificity of diagnosis of 4 dermatologists. Results: Colour cluster values significantly differed between melanomas and naevi. Moreover, sensitivity and specificity values of computer diagnosis were similar to those achieved by the dermatologists. Conclusion: Our image analysis program based on the assessment of one single parameter has the diagnostic accuracy of dermatologists employing dermoscopy on a regular basis.


2007 - Compressed Domain Features Extraction for Shot Characterization [Relazione in Atti di Convegno]
Grana, Costantino; Vezzani, Roberto; Borghesani, Daniele; Cucchiara, Rita
abstract

In this work, we propose a system for shot comparison working directly on the MPEG-1 stream in the compressed domain, extracting color, texture and motion features considering all frames with a reasonable computational cost, and with results comparable to those obtained on uncompressed keyframes. In particular a summary descriptor for each Group Of Pictures (GOP) is computed and employed for shot characterization and comparison. The Mallows distance allows matching clips of different lengths in a unified framework.


2007 - Dynamic Pictorial Ontologies for Video Digital libraries Annotation [Relazione in Atti di Convegno]
M., Bertini; A., Del Bimbo; C., Torniai; Grana, Costantino; Cucchiara, Rita
abstract

In this paper, we present the dynamic pictorial ontology paradigm for video annotation. Ontologies are often used to describe a given domain for different goals, including description of multimedia data. In the case of video annotation, the visual knowledge cannot be described using only abstract concepts but is more effectively represented in a visual form. To this aim, we introduce visual concepts, elicited from the data set as the most representative prototypes that specialize abstract concepts. The ontology created is intrinsically dynamic since it must embrace the perceptual and visual experience during annotation. Thus visual concepts can change, adapting to the multimedia content analyzed. Motivations for this new ontology paradigm are discussed together with a proposal of a framework for ontology creation, maintenance, and automatic annotation of video. The creation and usage of dynamic pictorial ontologies have been tested for the soccer domain, exploiting low level perceptual features and higher level domain features.


2007 - Early Detection of Melanoma by Image Analysis [Capitolo/Saggio]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

The worldwide incidence of cutaneous melanoma has increased dramatically over the past decades. It is well known that a good prognosis of melanoma is only expected for thin lesions. Preventive effort has therefore been concentrated on identification of early lesions facilitated by the introduction and dissemination of standardized clinical criteria and by the use of dermoscopy (epiluminescence microscopy). However, the interpretation of dermoscopic criteria is often confusing especially for the inexperienced observer. This chapter summarizes recent computer techniques to help the clinician in these tasks.


2007 - Enhancing HSV Histograms with Achromatic Points Detection for Video Retrieval [Relazione in Atti di Convegno]
Grana, Costantino; Vezzani, Roberto; Cucchiara, Rita
abstract

Color is one of the most meaningful features used in content based retrieval of visual data. In video content based retrieval, color features computed on selected frames are integrated with other low-level features concerning texture, shape and motion in order to find clip similarities. For example, the Scalable Color feature defined in the MPEG-7 standard exploits HSV histograms to create color feature vectors. HSV is a widely adopted space in image and video retrieval, but its quantization for histogram generation can create misleading errors in classification of achromatic and low saturated colors. In this paper we propose an Enhanced HSV Histogram with achromatic point detection based on a single Hue and Saturation parameter that can correct this limitation. The enhanced histograms have proven to be effective in color analysis and they have been used in a system for automatic clip annotation called PEANO, where pictorial concepts are extracted by a clip clustering and used for similarity based automatic annotation.


2007 - Linear Transition Detection as a Unified Shot Detection Approach [Articolo su rivista]
Grana, Costantino; Cucchiara, Rita
abstract

In this paper, we propose an automatic system for video shot segmentation, called Linear Transition Detector (LTD), which uses a single approach to detect both cuts and linear transitions. A comparison with publicly available shot detection systems is reported on different sports (Formula 1, basketball, soccer and cycling), and TRECVID 2005 results are also reported.


2007 - Network patterns recognition for automatic dermatologic images classification [Relazione in Atti di Convegno]
Grana, C.; Daniele, V.; Pellacani, G.; Seidenari, S.; Cucchiara, R.
abstract

In this paper we focus on the problem of automatic classification of melanocytic lesions, aiming at identifying the presence of reticular patterns. The recognition of reticular lesions is an important step in the description of the pigmented network, in order to obtain meaningful diagnostic information. Parameters like color, size or symmetry could benefit from the knowledge of having a reticular or non-reticular lesion. The detection of network patterns is performed with a three-step procedure. The first step is the localization of line points, by means of the line points detection algorithm first described by Steger. The second step is the linking of such points into a line, considering the direction of the line at its endpoints and the number of line points connected to these. Finally, a third step discards the meshes which couldn't be closed at the end of the linking procedure and the ones characterized by anomalous values of area or circularity. The number of valid meshes left and their area with respect to the whole area of the lesion are the inputs of a discriminant function which classifies the lesions into reticular and non-reticular. This approach was tested on two balanced training and testing sets (each formed by 50 reticular and 50 non-reticular images). We obtained above 86% correct classification of the reticular and non-reticular lesions on real skin images, with a specificity value never lower than 92%.


2007 - Network patterns recognition for automatic dermatoscopic images classification [Relazione in Atti di Convegno]
Grana, Costantino; Vanini, Daniele; Seidenari, Stefania; Pellacani, Giovanni; Cucchiara, Rita
abstract

In this paper we focus on the problem of automatic classification of melanocytic lesions, aiming at identifying the presence of reticular patterns. The recognition of reticular lesions is an important step in the description of the pigmented network, in order to obtain meaningful diagnostic information. Parameters like color, size or symmetry could benefit from the knowledge of having a reticular or non-reticular lesion. The detection of network patterns is performed with a three-step procedure. The first step is the localization of line points, by means of the line points detection algorithm first described by Steger. The second step is the linking of such points into a line, considering the direction of the line at its endpoints and the number of line points connected to these. Finally, a third step discards the meshes which couldn’t be closed at the end of the linking procedure and the ones characterized by anomalous values of area or circularity. The number of valid meshes left and their area with respect to the whole area of the lesion are the inputs of a discriminant function which classifies the lesions into reticular and non-reticular. This approach was tested on two balanced training and testing sets (each formed by 50 reticular and 50 non-reticular images). We obtained above 86% correct classification of the reticular and non-reticular lesions on real skin images, with a specificity value never lower than 92%.


2007 - Prototypes Selection with Context Based Intra-class Clustering for Video Annotation with Mpeg7 Features [Relazione in Atti di Convegno]
Grana, Costantino; Vezzani, Roberto; Cucchiara, Rita
abstract

In this work, we analyze the effectiveness of perceptual features to automatically annotate video clips in domain-specific video digital libraries. Typically, automatic annotation is provided by computing clip similarity with respect to given examples, which constitute the knowledge base, in accordance with a given ontology or a classification scheme. Since the number of training clips is normally very large, we propose to automatically extract some prototypes, or visual concepts, for each class instead of using the whole knowledge base. The prototypes are generated after a Complete Link clustering based on perceptual features with an automatic selection of the number of clusters. Context-based information is used in an intra-class clustering framework to select more discriminative clips. Reducing the number of samples makes the matching process faster and lessens the storage requirements. Clips are annotated following the MPEG-7 directives to provide easier portability. Results are provided on videos taken from sports and news digital libraries.


2007 - Semi-automatic Video Digital Library Annotation Tools [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Vezzani, Roberto
abstract

In this work, we present a general purpose system for hierarchical structural segmentation and automatic annotation of video clips, by means of standardized low level features. We propose to automatically extract some prototypes for each class with a context based intra-class clustering. Clips are annotated following the MPEG-7 standard directives to provide easier portability. Results of automatic annotation and semiautomatic metadata creation are provided.


2007 - Similarity-Based Retrieval with MPEG-7 3D Descriptors: Performance Evaluation on the Princeton Shape Benchmark [Relazione in Atti di Convegno]
Grana, Costantino; M., Davolio; Cucchiara, Rita
abstract

In this work, we describe in detail the new MPEG-7 Perceptual 3D Shape Descriptor and provide a set of tests with different 3D objects databases, mainly with the Princeton Shape Benchmark. With this purpose we created a function library called Retrieval-3D and fixed some bugs of the MPEG-7 eXperimentation Model (XM). We explain how to match the Attributed Relational Graph (ARG) of every 3D model with the modified nested Earth Mover’s Distance (mnEMD). Finally we compare our results with the best found in literature, including the first MPEG-7 3D descriptor, i.e. the Shape Spectrum Descriptor.


2007 - Sports Video Annotation Using Enhanced HSV Histograms in Multimedia Ontologies [Relazione in Atti di Convegno]
M., Bertini; A., Del Bimbo; C., Torniai; Grana, Costantino; Vezzani, Roberto; Cucchiara, Rita
abstract

This paper presents multimedia ontologies, where multimedia data and traditional textual ontologies are merged. A solution for their implementation for the soccer video domain and a method to perform automatic soccer video annotation using these extended ontologies are shown. HSV is a widely adopted space in image and video retrieval, but its quantization for histogram generation can create misleading errors in classification of achromatic and low saturated colors. In this paper we propose an Enhanced HSV Histogram with achromatic point detection based on a single Hue and Saturation parameter that can correct this limitation. The more general concepts of the sport domain (e.g. play/break, crowd, etc.) are put in correspondence with the more general visual features of the video like color and texture, while the more specific concepts of the soccer domain (e.g. highlights such as attack actions) are put in correspondence with domain specific visual features like the soccer playfield and the players. Experimental results for annotation of soccer videos using generic concepts are presented.


2007 - Video Shots Comparison using the Mallows Distance [Relazione in Atti di Convegno]
Grana, Costantino; Borghesani, Daniele; Cucchiara, Rita
abstract

In this work, we focus on two aspects of the comparison of video shots. We present a new approach to extract a variable number of key frames from a shot, by the use of a hierarchical clustering with automatic level selection, in order to provide optimal allocation of features on different parts of the shot. We then employ the Mallows distance as an effective technique to compare the discrete distributions of features, independently from the features selected for the specific application. Results and comparisons on a soccer documentary video are provided.
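As an illustration of the distance named in this abstract, the following is a minimal sketch of the Mallows distance in a simplified special case: two one-dimensional samples of equal size with uniform weights, where the optimal coupling is obtained by sorting both samples. This is an assumption made for brevity; the paper compares distributions of features over key frames, which in general requires solving a transportation problem. The function name `mallows_distance_1d` is hypothetical.

```python
def mallows_distance_1d(xs, ys, p=2):
    """Mallows L_p distance between two equal-size, equally weighted 1D samples.

    In this special case the optimal pairing matches the i-th smallest value
    of xs with the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    # Sort both samples so that order statistics are paired.
    pairs = zip(sorted(xs), sorted(ys))
    total = sum(abs(a - b) ** p for a, b in pairs)
    return (total / len(xs)) ** (1.0 / p)
```

The distance depends only on the two empirical distributions, so the order in which the feature values are listed does not matter.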


2006 - A Distributed Domotic Surveillance System [Capitolo/Saggio]
Cucchiara, Rita; Grana, Costantino; Prati, Andrea; Vezzani, Roberto
abstract

Distributed video surveillance has a direct application in intelligent home automation or domotics (from the Latin word domus, that means “home”, and informatics); in particular, in-house videosurveillance can provide good support for people with some difficulties (e.g., elderly or disabled people) living alone and with a limited autonomy. New hardware technologies for surveillance are now affordable and provide high reliability. Problems related to reliable software solutions are not completely solved, especially concerning the application of general-purpose computer vision techniques in indoor environments. Indeed, assuming the objective is to detect the presence of people, track them, and recognize dangerous behaviours by means of abrupt changes in their posture, robust techniques must cope with non-trivial difficulties. In particular, luminance changes and shadows must be taken into account, frequent posture changes must be faced, and large and long-lasting occlusions are common due to the vicinity of the cameras and the presence of furniture and doors that can often hide parts of the person’s body. These problems are analyzed and solutions based on background suppression, appearance-based probabilistic tracking, and probabilistic reasoning for posture recognition are described.


2006 - A semi-automatic video annotation tool with MPEG-7 content collections [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; D., Bulgarelli; Vezzani, Roberto
abstract

In this work, we present a general purpose system for hierarchical structural segmentation and automatic annotation of video clips, by means of standardized low level features. We propose to automatically extract some prototypes for each class with a context based intra-class clustering. Clips are annotated following the MPEG-7 standard directives to provide easier portability. Results of automatic annotation and semiautomatic metadata creation are provided.


2006 - Algorithmic reproduction of asymmetry and border cut-off parameters according to the ABCD rule for dermoscopy [Articolo su rivista]
Pellacani, Giovanni; Grana, Costantino; Seidenari, Stefania
abstract

Background: Semiquantitative algorithms were applied to dermoscopic images to improve the clinical diagnosis for melanoma. Objective: The aim of the study was to develop a computerized method for automated quantification of the 'A' (asymmetry) and 'B' (border cut-off) parameters, according to the ABCD rule for dermoscopy, thus reproducing human evaluation. Methods: Three hundred and thirty-one melanocytic lesion images, referring to 113 melanomas and 218 melanocytic nevi, acquired by means of a digital videodermatoscope, were considered. Images were evaluated by two experienced observers and by using computer algorithms developed by us. Clinical evaluation of asymmetry was performed by attributing scores to shape asymmetry and asymmetry of pigment distribution and structures, whereas computer evaluation of shape and pigment distribution asymmetries was based on the assessment of differences in area and lightness in the two halves of the image, respectively. Borders were evaluated both by clinicians and by the computer, by attributing a score to each border segment ending abruptly. Differences between nevus and melanoma values were evaluated using the chi-square test, while Cohen's Kappa index for agreement was employed for the evaluation of the concordance between human and computer. Results: Pigment distribution asymmetry appears to be the most striking parameter for melanoma diagnosis, both for human and for automated diagnosis. A good concordance between clinicians and computer evaluation was achieved for all asymmetry parameters, and was excellent for border cut-off evaluation. Conclusions: These algorithms enable a good reproduction of the 'A' and 'B' parameters of the ABCD rule for dermoscopy, and appear useful for diagnostic and learning purposes.
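The agreement measure mentioned in this abstract, Cohen's Kappa, can be sketched as follows. This is a generic textbook formulation, kappa = (p_o - p_e) / (1 - p_e), and not the authors' code; the function name and the way scores are passed in (two parallel lists, one per rater) are assumptions made for illustration.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same items with categorical labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of items on which the raters give the same label.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical label throughout; kappa is undefined,
        # and returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

For example, `cohens_kappa(human_scores, computer_scores)` would compare a clinician's scores with the computed ones, where both list names are placeholders.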


2006 - Asymmetry in dermoscopic melanocytic lesion images: a computer description based on colour distribution [Articolo su rivista]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

Digital dermoscopy improves the accuracy of melanoma diagnosis. The aim of this study was to develop and validate software for assessment of asymmetry in melanocytic lesion images, based on evaluation of colour symmetry, and to compare it with assessment by human observers. An image analysis program enabling numerical assessment of asymmetry in melanocytic lesions, based on the evaluation and comparison of CIE L*a*b* colour components (CIE L*a*b* is the name of a colour space defined by the Commission Internationale de l'Eclairage) inside image colour blocks, was employed on the recorded lesion images. Clinical evaluation of asymmetry in dermoscopic images was performed on the same image set employing a 0-1 scoring system. Asymmetry judgement was expressed by the clinicians for 12.8% of benign naevi, 44.7% of atypical naevi and 64.2% of malignant melanomas, whereas the computer identified as asymmetric 6.3%, 33.3% and 82.2%, respectively. Numerical parameters referring to malignant melanomas were significantly higher, both with respect to benign naevi and atypical naevi. The numerical parameters produced could be effectively employed for computer-aided melanoma diagnosis.


2006 - Automated Assessment of Pigment Distribution and Color Areas for Melanoma Diagnosis [Capitolo/Saggio]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

In this paper an automated assessment of pigment distribution and color areas for melanoma diagnosis is described.


2006 - Comparison of color clustering algorithms for segmentation of dermatological images [Relazione in Atti di Convegno]
Melli, Rudy Mirko; Grana, Costantino; Cucchiara, Rita
abstract

Automatic segmentation of skin lesions in clinical images is a very challenging task; it is necessary for visual analysis of the edges, shape and colors of the lesions to support the melanoma diagnosis, but, at the same time, it is cumbersome since lesions (both naevi and melanomas) do not have regular shape, uniform color, or univocal structure. Most of the approaches adopt unsupervised color clustering. This work compares the most widespread color clustering algorithms, namely median cut, k-means, fuzzy c-means and mean shift, applied to a method for automatic border extraction, providing an evaluation of the upper bound in accuracy that can be reached with these approaches. Different tests have been performed to examine the influence of the choice of the parameter settings on the performance of the algorithms. Then a new supervised learning phase is proposed to select the best number of clusters and to segment the lesion automatically. Experiments have been carried out on a large database of medical images, manually segmented by dermatologists. In these experiments mean shift proved to be the best technique in terms of sensitivity and specificity. Finally, a qualitative evaluation of the quality of the segmentation has been carried out by human experts too, confirming the results of the quantitative comparison.


2006 - Distance transform for automatic dermatologic images composition [Relazione in Atti di Convegno]
Grana, Costantino; Pellacani, Giovanni; Seidenari, Stefania; Cucchiara, Rita
abstract

In this paper we focus on the problem of automatically registering dermatological images, because even if different products are available, most of them share the problem of a limited field of view on the skin. A possible solution is then the composition of multiple takes of the same lesion with digital software, such as that for panorama images creation. In this work, to perform an automatic selection of matching points the Harris Corner Detector is used, and to cope with outlier couples we employed the RANSAC method. Projective mapping is then used to match the two images. Given a set of correspondence points, Singular Value Decomposition was used to compute the transform parameters. At this point the two images need to be blended together. One initial assumption is often implicitly made: the aim is to merge two rectangular images. But when merging occurs between more than two images iteratively, this assumption will fail. To cope with differently shaped images, we employed the Distance Transform and provided a weighted merging of images. Different tests were conducted with dermatological images, both with a standard rectangular frame and with atypical shapes, for example a ring due to the objective and lens selection. The successive composition of different circular images with other blending functions, such as the Hat function, doesn’t correctly get rid of the border, and residuals of the circular mask are still visible. By applying Distance Transform blending, the result produced is insensitive to the outer shape of the image.
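The weighted merging described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the two images are assumed to be already registered on a common canvas, each has a binary validity mask of arbitrary shape, a two-pass city-block distance transform is used (the abstract does not say which metric was adopted), and the canvas edge is treated as an image border. All function and variable names are hypothetical.

```python
def distance_transform(mask):
    """City-block distance of each valid pixel to the nearest invalid pixel.

    Pixels outside the canvas count as invalid, so the canvas edge acts as a border.
    """
    h, w = len(mask), len(mask[0])
    inf = h + w  # larger than any possible city-block distance on this canvas
    dist = [[inf if mask[y][x] else 0 for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top and left neighbours.
    for y in range(h):
        for x in range(w):
            if dist[y][x]:
                up = dist[y - 1][x] if y > 0 else 0
                left = dist[y][x - 1] if x > 0 else 0
                dist[y][x] = min(dist[y][x], up + 1, left + 1)
    # Backward pass: propagate distances from the bottom and right neighbours.
    for y in reversed(range(h)):
        for x in reversed(range(w)):
            if dist[y][x]:
                down = dist[y + 1][x] if y < h - 1 else 0
                right = dist[y][x + 1] if x < w - 1 else 0
                dist[y][x] = min(dist[y][x], down + 1, right + 1)
    return dist

def blend(img_a, mask_a, img_b, mask_b):
    """Merge two registered images, weighting each pixel by its distance to its own border."""
    wa, wb = distance_transform(mask_a), distance_transform(mask_b)
    h, w = len(img_a), len(img_a[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = wa[y][x] + wb[y][x]
            if total > 0:  # pixel covered by at least one image
                out[y][x] = (wa[y][x] * img_a[y][x] + wb[y][x] * img_b[y][x]) / total
    return out

# Toy example: two constant images overlapping on the middle column of a 3x5 canvas.
demo_mask_a = [[1, 1, 1, 0, 0] for _ in range(3)]
demo_mask_b = [[0, 0, 1, 1, 1] for _ in range(3)]
demo_img_a = [[10 * m for m in row] for row in demo_mask_a]
demo_img_b = [[20 * m for m in row] for row in demo_mask_b]
demo_out = blend(demo_img_a, demo_mask_a, demo_img_b, demo_mask_b)
```

Because each weight falls to zero at the border of its own mask, the contribution of an image fades out along its outline whatever its shape, which is the property the abstract relies on for ring-shaped or circular frames.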


2006 - Line Detection and Texture Characterization of Network Patterns [Relazione in Atti di Convegno]
Grana, Costantino; Cucchiara, Rita; Pellacani, Giovanni; Seidenari, Stefania
abstract

This paper describes a complete approach to detect, localize and describe network patterns. Such texture is automatically detected with Gaussian derivative kernels and Fisher linear discriminant analysis; line closure and thinning is provided by morphological masking and line luminance profile fitting provides width estimation. Detection results on dermatological images are reported and discussed.


2006 - MOM: multimedia ontology manager. A framework for automatic annotation and semantic retrieval of video sequences [Relazione in Atti di Convegno]
M., Bertini; A., Del Bimbo; C., Torniai; Grana, Costantino; Cucchiara, Rita
abstract

Effective usage of multimedia digital libraries has to deal with the problem of building efficient content annotation and retrieval tools. MOM (Multimedia Ontology Manager) is a complete system that allows the creation of multimedia ontologies, supports automatic annotation and creation of extended text (and audio) commentaries of video sequences, and permits complex queries by reasoning on the ontology.


2006 - MPEG-7 Pictorially Enriched Ontologies for Video Annotation [Relazione in Atti di Convegno]
Grana, Costantino; Vezzani, Roberto; Bulgarelli, Daniele; Cucchiara, Rita
abstract

A system for the automatic creation of Pictorially Enriched Ontologies is presented, that is ontologies for context-based video digital libraries, enriched by pictorial concepts for video annotation, summarization and similarity-based retrieval. Extraction of pictorial concepts with video clips clustering, ontology storing with MPEG-7, and the use of the ontology for stored video annotation are described. Results on sport videos and TRECVID2005 video material are reported.


2006 - PEANO: Pictorial Enriched Annotation of Video [Relazione in Atti di Convegno]
Grana, Costantino; Vezzani, Roberto; Bulgarelli, Daniele; Gualdi, Giovanni; Cucchiara, Rita; M., Bertini; C., Torniai; A., Del Bimbo
abstract

In this demo, we present a tool set for video digital library management that allows i) structural annotation of edited videos in MPEG-7 by automatically extracting shots and clips; ii) automatic semantic annotation based on perceptual similarity against a taxonomy enriched with pictorial concepts; iii) video clip access and hierarchical summarization with stand-alone and web interface; iv) access to clips from mobile platform in GPRS-UMTS videostreaming. The tools can be applied in different domain-specific Video Digital Libraries. The main novelty is the possibility to enrich the annotation with pictorial concepts that are added to a textual taxonomy in order to make the automatic annotation process faster and often more effective. The resulting multimedia ontology is described in the MPEG-7 framework. The PEANO (Perceptual Annotation of Video) tool has been tested over video art, sport (Soccer, Olympic Games 2006, Formula 1) and news clips.


2006 - Performance of the MPEG-7 Shape Spectrum Descriptor for 3D objects retrieval [Relazione in Atti di Convegno]
Grana, Costantino; Cucchiara, Rita
abstract

In this work, we describe in detail the MPEG-7 Shape Spectrum Descriptor and provide a set of tests with different 3D objects databases. To verify whether the low performance of this descriptor reported in the literature was due to the comparison employed, we also used the Earth Mover's Distance, which allows much more detailed histogram comparisons. Finally we compare our outcomes with the best results in related work.


2006 - Practical Color Calibration for Dermatoscopic Images [Capitolo/Saggio]
Grana, Costantino; Pellacani, Giovanni; Seidenari, Stefania
abstract

In this paper a practical color calibration procedure for dermatoscopic image acquisition is illustrated, with details on the algorithms employed and results on real data.


2006 - Sub-Shot Summarization for MPEG-7 based Fast Browsing [Relazione in Atti di Convegno]
Grana, Costantino; Cucchiara, Rita
abstract

In this paper, we propose a system for automatic video summarization at sub-shot level. Our work covers two main aspects: the first is the sub-shot detection, which is performed without a priori constraints on the number or length of the shots. The algorithm is based on color histograms and motion features, and employs fuzzy c-means with variable number of clusters. The second aspect is an in depth discussion on the annotation of summaries with the MPEG-7 standard. Results on mixed genres TV material, from TRECVID videos, are reported.


2006 - University of Modena and Reggio Emilia at TRECVID 2006 [Relazione in Atti di Convegno]
Grana, Costantino; Vezzani, Roberto; Cucchiara, Rita
abstract

What approach or combination of approaches did you test in each of your submitted runs? TRECVID2005_UNIMORE_??.xml: the same linear transition detector (LTD) was tested for every run, with ten uniformly spaced thresholds for the detection. What, if any, significant differences (in terms of what measures) did you find among the runs? The system behaved as expected: the higher the threshold, the better the recall. Of course, the precision decreased correspondingly. Interestingly enough, it seems that we cannot overcome the overall limit of around 80% for recall and 88% for precision, independently of the other parameter. Based on the results, can you estimate the relative contribution of each component of your system/approach to its effectiveness? One of the main objectives of our system was to test the performance of a single algorithm for both cuts and gradual transitions. So all the merits and demerits are related to our LTD. Overall, what did you learn about runs/approaches and the research question(s) that motivated them? The use of a single algorithm allows the system to be run without training. Just a single parameter may be employed to tune the sensitivity of the system, thus allowing its use in general purpose/user friendly systems.


2006 - Video Clip Clustering for Assisted Creation of MPEG-7 Pictorially Enriched Ontologies [Relazione in Atti di Convegno]
Grana, Costantino; Bulgarelli, Daniele; Cucchiara, Rita
abstract

In this paper, we present a system for the assisted creation of Pictorially Enriched Ontologies, that is ontologies for context-based digital libraries enriched by pictorial concepts for video annotation, summarization and similarity based retrieval. Here we detail the approach for video clips clustering and pictorial concepts extraction together with the approach for storing the ontology within the MPEG-7 framework. The clustering is performed by Complete Link hierarchical clustering on color histograms and motion features. Results on Formula 1 TV material are reported.


2005 - Adaptation and Annotation of Formula 1 Sport Videos [Relazione in Atti di Convegno]
Grana, Costantino; Tardini, Giovanni; Cucchiara, Rita
abstract

In this paper, we approach the problem of detecting editing features suitable for video annotation, by paying attention to artifacts and effects introduced in video editing. In particular, a linear transition detection algorithm is presented, which can characterize the transition center and length with high precision. The technique works with sub-frame granularity and is able to include both abrupt cuts and longer dissolves in a single approach. Theoretical justification for the algorithm is provided with an optimization technique for real cases. We present results obtained exploiting the editing features on a Formula 1 video digital library, detecting replays and providing pre classification hints for automatic shot annotation.


2005 - Colors in atypical nevi: a computer description reproducing clinical assessment [Articolo su rivista]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

Background/purpose: Atypical nevi (AN) share some dermoscopic features with early melanoma (MM), and computer elaboration of digital images could represent a useful support to diagnosis to assess automatically colors in AN, and to compare the data with those referring to clearly benign nevi (BN) and MMs. Methods: An image analysis program enabling the numerical description of color areas in melanocytic lesions was used on 459 videomicroscopic images, referring to 76 AN, 288 clearly BN and 95 MMs. Results: Black, white and blue-gray were more frequently found in AN than in clearly BN, but less frequently than in MMs. Color area values significantly differed between the three groups. Conclusion: The clinical-morphological interpretation of the numerical data, based on the mathematical description of the aspect and distribution of different color areas in different lesion types may contribute to the characterization of AN and their distinction from MMs.


2005 - Computer vision system for in-house video surveillance [Articolo su rivista]
Cucchiara, Rita; Grana, Costantino; Prati, Andrea; Vezzani, Roberto
abstract

In-house video surveillance to control the safety of people living in domestic environments is considered. In this context, common problems and general purpose computer vision techniques are discussed and implemented in an integrated solution comprising a robust moving object detection module which is able to disregard shadows, a tracking module designed to handle large occlusions, and a posture detector. These factors, shadows, large occlusions and people's posture, are the key problems that are encountered with in-house surveillance systems. A distributed system with cameras installed in each room of a house can be used to provide full coverage of people's movements. Tracking is based on a probabilistic approach in which the appearance and probability of occlusions are computed for the current camera and warped in the next camera's view by positioning the cameras to disambiguate the occlusions. The application context is the emerging area of domotics (from the Latin word domus, meaning 'home', and informatics). In particular, indoor video surveillance makes it possible for elderly and disabled people to live with a sufficient degree of autonomy, via interaction with this new technology, which can be distributed in a house at affordable costs and with high reliability.


2005 - In Vivo Confocal Microscopy of Melanocytic Lesions Improves Diagnostic Accuracy for Melanoma [Abstract in Rivista]
Pellacani, Giovanni; A. M., Cesinaro; Longo, Caterina; Bassoli, Sara; Grana, Costantino; Seidenari, Stefania
abstract

In vivo reflectance-mode confocal laser microscopy enables the visualization of the skin at quasi-histopathologic resolution. The aim of our study was to describe confocal features in melanocytic lesions, to evaluate their diagnostic significance for melanoma identification, and to develop a simple algorithm useful for diagnostic purposes. A total of 102 consecutive melanocytic lesions (37 melanomas, 49 acquired nevi and 16 Spitz nevi), corresponding to lesions with equivocal aspects at clinical and dermoscopic inspection and excised in order to rule out a melanoma, were investigated by means of confocal microscopy (Vivascope 1000). In superficial layers the general pattern and the presence and aspects of pagetoid cells were evaluated. At the basal cell layer, dermal papilla features and cytological aspects suggesting the presence of cellular atypia were described. In dermal papilla, the presence and morphology of melanocytic nests and the presence and aspect of solitary cells were evaluated. Some features were more frequently observed in melanomas. In multivariate analysis 6 features appeared independently correlated with melanoma diagnosis. The presence of non edged dermal papillae, atypical cells in basal layers and isolated nucleated cells within dermal papilla were strongly correlated with melanoma diagnosis and were considered as "major" criteria (scored 2 points), whereas the presence of pagetoid cells, a widespread pagetoid infiltration in superficial layers and cerebriform nests in upper dermis were considered "minor" criteria (scored 1 point). A total score, ranging from 0 to 9, was obtained for each lesion and a ROC curve with an area under the curve of 0.951 was obtained on our dataset. In conclusion, characterization of confocal microscopy features of melanomas and nevi seems to improve diagnostic accuracy for difficult to diagnose melanocytic lesions.
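The scoring rule stated in this abstract (three major criteria worth 2 points each and three minor criteria worth 1 point each, for a total between 0 and 9) can be written down as a small sketch. The point values and criteria come from the abstract; the identifiers and the function are hypothetical, and no diagnostic threshold is implied because the abstract does not report one.

```python
# Points per criterion as stated in the abstract: major = 2, minor = 1.
# The key names are hypothetical identifiers chosen for this sketch.
CRITERIA_POINTS = {
    "non_edged_dermal_papillae": 2,
    "atypical_cells_in_basal_layers": 2,
    "isolated_nucleated_cells_in_dermal_papilla": 2,
    "pagetoid_cells": 1,
    "widespread_pagetoid_infiltration": 1,
    "cerebriform_nests_in_upper_dermis": 1,
}

def confocal_score(observed_features):
    """Sum the points of the criteria observed in a lesion (result in 0..9)."""
    observed = set(observed_features)  # each criterion counts at most once
    unknown = observed - set(CRITERIA_POINTS)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return sum(CRITERIA_POINTS[name] for name in observed)
```

With all six criteria present the sum is 3 × 2 + 3 × 1 = 9, which matches the upper end of the range given in the abstract.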


2005 - MPEG-7 Compliant Shot Detection in Sport Videos [Relazione in Atti di Convegno]
Grana, Costantino; Tardini, Giovanni; Cucchiara, Rita
abstract

In this paper we propose a system for automatic detection of shots in sport videos. Our work covers two main aspects: the first is robust shot detection in presence of fast object motion and camera operations. To this aim we propose a new algorithm, which handles both cuts and linear transitions with a single approach and only needs the tuning of two parameters. An extended comparison with four transition detection algorithms, representing the state of the art in literature, is reported. Examples with Formula 1, basketball, soccer and cycling videos are analyzed. The second aspect is an in depth discussion on the annotation of shots and transitions with the MPEG-7 standard.


2005 - Microscopic in vivo description of cellular architecture of dermoscopic pigment network in nevi and melanomas [Articolo su rivista]
Pellacani, Giovanni; Am, Cesinaro; Longo, Caterina; Grana, Costantino; Seidenari, Stefania
abstract

Objective: To characterize the microscopic aspects of the dermoscopic pigment network in vivo, by means of confocal scanning laser microscopy. Design: Confocal imaging was performed on melanocytic lesions characterized by pigment network at dermoscopy. Some confocal architectural and cytologic features, as observed at the dermoepidermal junction, were morphologically described and quantified by means of a dedicated program. Setting: University medical department. Study Population: We studied confocal images of 15 melanomas, 15 dermoscopic atypical nevi, and 15 common nevi. Main Outcome Measures: Features referring to aspect, size, regularity, homogeneity, and infiltration of dermal papillae and to cellular size, regularity, and atypia were described by 2 observers on confocal images. Mean dermal papillary diameter, mean cell area, and shape irregularity were quantified by drawing papillae and cell contours on confocal images and measured with the use of a computer program. Results: Pigment network in melanomas consisted of large basal cells that circumscribed small to medium-sized dermal papillae with marked cellular atypia, sometimes infiltrating dermal papillae. On the other hand, common acquired nevi were characterized by lack of atypical cells and edged dermal papillae. Atypical nevi presented intermediate characteristics between clearly benign and malignant lesions. Conclusion: Cellular atypia was the most sensitive feature for melanoma diagnosis, whereas the presence of nucleated cells infiltrating dermal papillae was the most specific one.


2005 - Pigment distribution in melanocytic lesion images: a digital parameter to be employed for computer-aided diagnosis [Journal article]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

Background/purpose: Since in early melanoma (MM), and especially in in situ MM, differential structures that are diagnostic for MM may be lacking, pigment distribution asymmetry represents an important diagnostic feature. Our aim was to automatically assess pigment distribution in images referring to MMs, atypical nevi (AN) and clearly benign nevi (BN), and to evaluate the diagnostic capability of numerical parameters describing a non-homogeneous distribution of pigmentation. Methods: An image analysis program enabling the numerical assessment of pigment distribution in melanocytic lesions (ML), based on evaluation and comparison of red, green, blue (RGB) colour components inside image colour blocks, was employed on 459 videomicroscopic digital images, referring to 95 MMs, 76 AN and 288 BN. Results: Significant differences in pigment distribution parameters (mean RGB distance, variance and maximum distance) between the three ML populations were observed, permitting a good discrimination of MMs. On the test set comprising 230 lesion images, the area under the receiver operating characteristic curve was 0.933. For a D score equal to 0, corresponding to the best diagnostic accuracy (86.6%), a sensitivity of 87.5% and a specificity of 85.7% were obtained. Conclusion: This original evaluation method for digital pigment distribution, based on mathematical description and comparison of colours in different image blocks, provides numerical parameters to be implemented in image analysis programs for computer-aided MM diagnosis.
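
The sensitivity, specificity and diagnostic accuracy quoted in this abstract follow the usual confusion-matrix definitions. The Python sketch below illustrates them; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts.

    tp/fn: malignant lesions classified correctly/incorrectly.
    tn/fp: benign lesions classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts, for illustration only (not data from the paper).
sens, spec, acc = diagnostic_metrics(tp=40, fn=10, tn=90, fp=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Sweeping a decision threshold (such as the D score mentioned above) and recomputing these values at each step is one common way to obtain a receiver operating characteristic curve.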


2005 - Practical color calibration for dermoscopy, applied to a digital epiluminescence microscope [Journal article]
Grana, Costantino; Pellacani, Giovanni; Seidenari, Stefania
abstract

Background/purpose: The assessment of colors is essential for melanoma (MM) diagnosis, both for pattern analysis on dermoscopic images, and when using semiquantitative methods. Our aim was to provide a simple, precise characterization and reproducible calibration of the color response for dermoscopic instruments. Methods: Three processes were used to correct the non-uniform illumination pattern of the instrument, to easily estimate the camera gamma settings and to describe the color space conversion matrices required to produce standard images in any color space. A specific color space was also developed to optimize the representation of dermatoscopic colors. The calibration technique was tested both on synthetic reference surfaces and on real images by comparing the difference between the image colors obtained with two different instruments. Results: The differences between the images acquired by means of the two instruments, calculated on the reference patterns after calibration, were up to 10 times lower than before, while comparison of histograms referring to real images provided an improvement of about seven times on average. Conclusions: A complete workflow for dermatologic image calibration, which allows users to continue using their own software and algorithms, but with a much higher informative content, is presented. The technique is simple and may improve cooperation between different research centers, in teleconsulting contexts or for result comparisons.


2005 - Probabilistic posture classification for human-behavior analysis [Journal article]
Cucchiara, Rita; Grana, Costantino; Prati, Andrea; Vezzani, Roberto
abstract

Computer vision and ubiquitous multimedia access nowadays make feasible the development of a mostly automated system for human-behavior analysis. In this context, our proposal is to analyze human behaviors by classifying the posture of the monitored person and, consequently, detecting corresponding events and alarm situations, like a fall. To this aim, our approach can be divided into two phases: for each frame, the projection histograms (Haritaoglu et al., 1998) of each person are computed and compared with the probabilistic projection maps stored for each posture during the training phase; then, the obtained posture is further validated by exploiting the information extracted by a tracking module, in order to take into account the reliability of the classification of the first phase. Moreover, the tracking algorithm is used to handle occlusions, making the system particularly robust even in indoor environments. Extensive experimental results demonstrate a promising average accuracy of more than 95% in correctly classifying human postures, even in the case of challenging conditions.
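
Projection histograms of a silhouette are commonly obtained by counting foreground pixels along each row and each column of a binary mask. The Python sketch below assumes that definition; the function name and the toy mask are illustrative, and the comparison against probabilistic projection maps described in the abstract is not reproduced.

```python
def projection_histograms(mask):
    """Return (row_counts, column_counts) for a binary silhouette mask.

    mask is a list of rows, each a list of 0/1 values (1 = foreground).
    row_counts[y] is the number of foreground pixels in row y;
    column_counts[x] is the number of foreground pixels in column x.
    """
    row_counts = [sum(row) for row in mask]
    column_counts = [sum(column) for column in zip(*mask)]
    return row_counts, column_counts


# Toy 4x3 silhouette (hypothetical data, not from the paper).
silhouette = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 0],
]
rows, cols = projection_histograms(silhouette)
print(rows, cols)
```

A standing figure tends to give a long, narrow row profile, while a lying figure gives a short, wide one, which is why such profiles can help to discriminate postures.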


2005 - Shot Detection for Formula 1 Video Digital Libraries [Conference proceedings paper]
Cucchiara, Rita; Grana, Costantino; Tardini, Giovanni
abstract

Metadata extraction is one of the first tasks to be performed for automatic Digital Library annotation, and in particular shot detection has been widely explored in the literature. While many methods have been proposed for the detection of abrupt cuts, only a small number of them have explicitly addressed the problem of gradual transitions. In this paper we propose an algorithm that exploits a precise model of linear transition. Experimental results on videos of Formula 1 car races show the robustness of this method. These test videos are characterized by extreme situations such as fast camera and object motion and very different kinds of shots. The algorithm is able to estimate the exact length of the transition, and an error score is also given as a measure of fitness to the linear model, to discriminate true transitions from false detections. The final shot segmentation is delivered as an MPEG-7-compliant output.


2005 - Shot detection and motion analysis for automatic MPEG-7 annotation of sports videos [Conference proceedings paper]
Tardini, Giovanni; Grana, Costantino; R., Marchi; Cucchiara, Rita
abstract

In this paper we describe general algorithms that are devised for MPEG-7 automatic annotation of Formula 1 videos, and in particular for camera-car shot detection. We employed a shot detection algorithm suitable for cut and linear transition detection, which is able to precisely detect both the transition's center and length. Statistical features based on MPEG motion compensation vectors are then employed to provide motion characterization, using a subset of the motion types defined in MPEG-7, and shot type classification. Results on shot detection and classification are provided.


2005 - Video understanding and content-based retrieval [Conference proceedings paper]
Y., Zhai; J., Liu; X., Cao; A., Basharat; A., Hakeem; S., Ali; M., Shah; Grana, Costantino; Cucchiara, Rita
abstract

This year, the joint team of UCF and the University of Modena has participated in the following tasks: (1) shot boundary detection, (2) low-level feature extraction, (3) high-level feature extraction, (4) topic search and (5) BBC rushes management. The shot boundary detection was contributed by the Image Lab at the University of Modena. The other tasks were performed by the Computer Vision Team at UCF.


2004 - A computer description of asymmetry in melanocytic lesion images based on color distribution [Journal abstract]
Seidenari, Stefania; Pellacani, Giovanni; A., Martella; Grana, Costantino
abstract

The assessment of asymmetry is essential for melanoma (MM) diagnosis, both when using a heuristic approach and when employing semiquantitative methods on dermoscopic images. The aim of our study was to develop and validate software for the assessment of asymmetry in melanocytic lesion images, based on evaluation of color symmetry, and to compare the automatic evaluation to the one performed by human observers. An image analysis program enabling the numerical assessment of asymmetry in melanocytic lesions, based on evaluation and comparison of RGB color components inside image color blocks, was employed on 459 videomicroscopic digital images, referring to 95 melanomas (MMs), 76 atypical nevi (AN) and 288 clearly benign nevi (BN). Clinical evaluation of asymmetry on dermoscopic images was performed on the same image set employing a 0–1 scoring system. Clinicians judged 12.8% of BN, 44.7% of AN, and 64.2% of MMs to be asymmetric, whereas the computer identified 6.9% of BN, 27.6% of AN, and 87.4% of MMs as asymmetric. Sensitivity and specificity of clinical judgement were 64.2% and 80.5%, respectively, whereas for computer evaluation, a sensitivity of 87.5% and a specificity of 85.7% were obtained. Numerical parameters (mean RGB distance, variance and maximum distance) referring to MMs were significantly higher than those for both BN and AN. This innovative method for automatic asymmetry evaluation, based on the mathematical description of color distribution in different image blocks, provides numerical parameters for employment in computer-aided melanoma diagnosis.


2004 - Automated description of colours in polarized-light surface microscopy images of melanocytic lesions [Journal article]
Pellacani, Giovanni; Grana, Costantino; Seidenari, Stefania
abstract

The aim of this study was to develop a computerized method for the identification and description of colour areas in melanocytic lesion images based on an approach mimicking the human perception of colours. A colour palette comprising six colour groups (black, dark brown, light brown, blue-grey, red and white) was created by selecting single colour components within melanocytic lesion images acquired using a digital videomicroscope, and was implemented in the image analysis program. For each colour region, the area, the distance from the lesion centroid, the spread, the colour area distribution in the internal and the external part of the lesion, and asymmetries were assessed on 604 melanocytic lesion images in our image database. Black, white and blue-grey colour areas were detected more frequently in melanomas compared with naevi. Moreover, significant differences in colour descriptors were observed for each colour group, showing that colour areas are more unevenly distributed in melanomas compared with naevi. Using a discriminant analysis approach, the extension of dark, white and blue-grey areas and some descriptors of the distribution of the colour areas were identified as the most relevant colour parameters for differentiating between benign and malignant lesions. In conclusion, our automatic procedure breaks down the image into the colour areas used in the clinical examination process, and also supplies a description of their extension and distribution, with parameters that correlate with the clinical concepts of regularity and homogeneity.


2004 - Automated extraction and description of dark areas in surface microscopy melanocytic lesion images [Journal article]
Pellacani, Giovanni; Grana, Costantino; Cucchiara, Rita; Seidenari, Stefania
abstract

Background: Identification of dark areas inside a melanocytic lesion (ML) is of great importance for melanoma diagnosis, both during clinical examination and employing programs for automated image analysis. Objective: The aim of our study was to compare two different methods for the automated identification and description of dark areas in epiluminescence microscopy images of MLs and to evaluate their diagnostic capability. Methods: Two methods for the automated extraction of 'absolute' (ADAs) and 'relative' dark areas (RDAs) and a set of parameters for their description were developed and tested on 339 images of MLs acquired by means of a polarized-light videomicroscope. Results: Significant differences in dark area distribution between melanomas and nevi were observed employing both methods, permitting a good discrimination of MLs (diagnostic accuracy = 74.6 and 71.2% for ADAs and RDAs, respectively). Conclusions: Both methods for the automated identification of dark areas are useful for melanoma diagnosis and can be implemented in programs for image analysis.


2004 - Color Calibration for a Dermatological Video Camera System [Conference proceedings paper]
Grana, Costantino; Pellacani, Giovanni; Seidenari, Stefania; Cucchiara, Rita
abstract

In this work, we describe a technique to calibrate images for skin analysis in dermatology. Using a common reference we correct non-uniform illumination effects, give an estimation of the gamma correction and produce an XYZ conversion matrix. The final result is then reverted to a non-standard RGB color space, built from the instrument images. In this way different instruments behave uniformly, allowing colorimetric characterization while improving the results of common algorithms. The proposed techniques should be the initial support for a distributed framework where dermatological images can be consistently compared.


2004 - Colors in atypical nevi: a computer description reproducing clinical assessment [Journal abstract]
Seidenari, Stefania; Pellacani, Giovanni; A., Martella; Grana, Costantino
abstract

Atypical nevi share some dermoscopic features with early melanoma, and computer elaboration of digital images could represent a useful support to diagnosis. The aim of our study was to automatically assess colors in atypical nevi, and to compare the data with those referring to clearly benign nevi and melanomas. Dermoscopic images of 459 melanocytic lesions, referring to 76 atypical nevi, 288 clearly benign nevi and 95 melanomas, were acquired by means of a digital videomicroscope (Videocap 100, DS-Medica, Italy) employing a 20-fold magnification. An image analysis program based on an approach that shares some similarities with the human perception of colors was employed. For the evaluation of colors in melanocytic lesion images, the identification of the six main color groups (black, dark brown, light brown, red, white and blue-gray) and the numerical description of color areas were obtained. Black, white and blue-gray were more frequently found in atypical nevi than in clearly benign nevi, but less frequently than in melanomas. Color area values significantly differed between the three groups, showing increasing irregularity in color distribution from benign lesions to atypical nevi and melanomas. The clinical–morphological interpretation of the numerical data, based on the mathematical description of the aspect and distribution of different color areas in different lesion types, may contribute to the characterization of atypical nevi and their distinction from melanomas.


2004 - Computer Description Of Colors In Dermoscopic Melanocytic Lesion Images Reproducing Clinical Assessment [Journal abstract]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

-


2004 - Differential diagnosis between Spitz nevi and melanomas by means of in-vivo confocal microscopy [Journal abstract]
Pellacani, Giovanni; A. M., Cesinaro; Longo, Caterina; Grana, Costantino; Seidenari, Stefania
abstract

Spitz nevi may often be confused with malignant melanoma, because of their rapid growth and alarming clinical features. In vivo confocal reflectance microscopy (CRM) is a novel technique enabling the noninvasive imaging of the skin at a cellular level resolution. Twelve Spitz nevi and 25 melanomas (MMs) were studied by means of CRM (Vivascope 1000, Lucid Inc., USA) and digital dermoscopy (Videocap 200, DS-Mediroup, Italy) for in vivo characterization of cytological and architectural features at CRM, and correlation with dermoscopy and histology. Although large cells with bright cytoplasm and dark eccentric nucleus, sometimes spreading upwards in a pagetoid fashion, were observed both in Spitz nevi and MMs, in the latter case they were more numerous and irregularly shaped. Dermoscopic globules corresponded to cell clusters at CRM and melanocytic nests at histopathology. Spitz nevi frequently presented a peripheral rim of medium-sized clusters, constituted by compact aggregates of large polygonal cells, sometimes observable also on the whole lesion area. In MMs, cell clusters were frequently constituted by sparse cells intercalated with thin fibrils giving a multi-lobate appearance or by large confluent aggregates of low reflecting polygonal or elongated cells, resulting in a cerebriform appearance. Although CRM appeared useful for distinction between melanocytic lesions, Spitz nevi presenting numerous atypical cells and dermal-epidermal architecture disarrangement cannot always be distinguished from MMs, owing to the limited penetration of the near-infrared laser light, which does not enable the evaluation of ‘cell maturation’ with increasing depth.


2004 - Improving melanoma diagnosis by means of in vivo confocal laser microscopy [Journal abstract]
Pellacani, Giovanni; A. M., Cesinaro; Longo, Caterina; Grana, Costantino; Seidenari, Stefania
abstract

Confocal reflectance microscopy (CRM) enables the in-vivo observation of the skin at a nearly histologic resolution. Since melanin represents a strong source of contrast, this technique appeared particularly indicated for the study of melanocytic lesions. Cytological and architectural features of melanocytic skin lesions were studied on 25 melanomas and 50 atypical melanocytic nevi employing CRM (Vivascope 1000, Lucid Inc., USA) and digital dermoscopy (Videocap 200, DS-Mediroup, Italy). All lesions were excised for diagnostic confirmation. Some differences in CRM features were observed between benign and malignant lesions: in melanocytic nevi, cells were usually round to oval, mainly located in the basal layers or clustered into nests within the papillary dermis. Melanomas were characterized by numerous large cells within the superficial layers of the epidermis, suggesting a pagetoid fashion, and by cells polymorphic in size and shape mainly located in the basal layer, sometimes interrupted by small dermal papillae irregularly distributed throughout the lesion, owing to disarrangement of the normal architecture of the rete ridges. Moreover, large irregular cells with refractive cytoplasm and eccentric dark nucleus infiltrating dermal papilla and cell clusters with a multilobulated feature constituted by sparse cells or with a cerebriform aspect were specifically observed in melanomas. Although preliminary and based on a limited number of cases, these findings show the potential of this technique for the noninvasive diagnosis of clinically difficult lesions.


2004 - In vivo confocal scanning laser microscopy of pigmented Spitz nevi: Comparison of in vivo confocal images with dermoscopy and routine histopathology [Journal article]
Pellacani, Giovanni; A. M., Cesinaro; Grana, Costantino; Seidenari, Stefania
abstract

Background: Spitz nevus is a benign melanocytic lesion sometimes mistakenly diagnosed clinically as melanoma. Objective: Our aim was to evaluate in vivo reflectance-mode confocal scanning laser microscopy (CSLM) aspects of globular Spitz nevi and to correlate them with those of surface microscopy and histopathology. Methods: A total of 6 Spitz nevi, with globular aspects on epiluminescence observation, were imaged with CSLM and subsequently excised for histopathologic examination. Results: A close correlation among CSLM, epiluminescence, and histopathologic aspects was observed. Individual cells, observed in high-resolution confocal images, were similar in shape and dimension to the histopathologic ones. Lesion architecture was described on reconstructed CSLM images. Melanocytic nests corresponded to globular cellular aggregates at confocal microscopy and to globules at epiluminescence observation. Melanophages were clearly identified in the papillary dermis both by confocal microscopy and histopathology. Conclusion: In vivo CSLM enabled the identification of characteristic cytologic and architectural aspects of Spitz nevi, correlated with histopathology and epiluminescence microscopy observation.


2004 - Probabilistic People Tracking for Occlusion Handling [Conference proceedings paper]
Cucchiara, Rita; Grana, Costantino; Tardini, Giovanni; Vezzani, Roberto
abstract

This work presents a novel people tracking approach, able to cope with frequent shape changes and large occlusions. In particular, the tracks are described by means of probabilistic masks and appearance models. Occlusions due to other tracks or due to background objects and false occlusions are discriminated. The tracking system is general enough to be applied with any motion segmentation module; it can track people interacting with each other, and it maintains the pixel-to-track assignment even with large occlusions. At the same time, the update model is very reactive, so as to cope with sudden body motion and changes in silhouette shape. Due to its robustness, it has been used in many experiments of people behavior control in indoor situations.


2004 - Semantic Transcoding of Videos by using Adaptive Quantization [Journal article]
Cucchiara, Rita; Grana, Costantino; Prati, Andrea
abstract

This paper proposes the use of an approach of video transcoding driven by the video content and provided with the adaptive quantization of MPEG standards. Computer vision techniques can extract semantics from videos according to the user's interests: the video semantics is exploited to adapt the video in order to meet the device's capabilities and the user's requirements and preserve the best quality possible. Well-assessed video analysis techniques are used to segment the video into objects grouped in classes of relevance, to which the user can assign a weight proportional to their relevance. This weight is used to decide the quantization values to be applied in the MPEG-2 encoding to each macroblock. A modified version of the PSNR (Peak Signal-to-Noise Ratio) is used as performance metric, and a comparative evaluation is reported with respect to other coding standards such as JPEG, JPEG 2000, (basic) MPEG-2, and MPEG-4. Experimental results are provided on different situations, one indoor and one outdoor. Keywords: video transcoding, adaptive quantization, motion detection
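
For reference, the standard (unmodified) PSNR between an original and a distorted signal can be sketched in Python as follows. The abstract does not specify how the modified PSNR is defined, so this block shows only the textbook formula; the function name and the flat list representation of pixel values are assumptions made for illustration.

```python
import math


def psnr(original, distorted, max_value=255):
    """Standard PSNR in dB between two equal-length sequences of pixel values."""
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example.
print(psnr([10, 20, 30, 40], [12, 18, 33, 40]))
```

One plausible way to make such a metric content-aware is to weight each pixel's squared error by the relevance of the class it belongs to, but whether the paper does so is not stated here.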


2004 - The A and B Parameters Of The Abcd Rule Of Dermoscopy: The Computer Point Of View [Journal abstract]
Pellacani, Giovanni; Grana, Costantino; A., Martella; Seidenari, Stefania
abstract

-


2004 - Track-based and object-based occlusion for people tracking refinement in indoor surveillance [Conference proceedings paper]
Cucchiara, Rita; Grana, Costantino; Tardini, Giovanni
abstract

People tracking deals with problems of shape changes, self-occlusions and track occlusions due to other interfering tracks and fixed objects that hide parts of a person's shape. These problems are more critical in indoor surveillance and in particular in home automation settings, in which the need to merge information obtained from different cameras distributed around the house calls for the integration of reliable data obtained over time. Therefore, tracking algorithms should be carefully tuned to cope with occlusions and shape changes, working not only at pixel level but also at region level. In this work we provide a novel technique for object tracking, based on probabilistic masks and appearance models. Occlusions due to other tracks or due to background objects and false occlusions are discriminated. The classification of occluded regions of the track is exploited in a selective model update. The tracking system is general enough to be applied with any motion segmentation module; it can track people interacting with each other, and it maintains the pixel-to-track assignment even with large occlusions. At the same time, the model update is very reactive, so as to cope with sudden body motion and changes in silhouette shape. Due to its robustness, it has been used in different experiments of people behavior control in indoor situations.


2004 - Using computer vision techniques for dangerous situation detection in domotic applications [Conference proceedings paper]
Cucchiara, Rita; Grana, Costantino; Prati, Andrea; Tardini, Giovanni; Vezzani, Roberto
abstract

We describe an integrated solution devised for in-house video surveillance, to control the safety of people living in a domestic environment. The system is composed of a robust moving object detection module able to disregard shadows, a tracking module designed to resolve large occlusions, and a posture detector. Shadows, large occlusions and deformable models of people are key features of in-house surveillance. Moreover, the requirements of high-speed reaction to dangerous situations and the need to implement a reliable and low-cost televiewing system led to the introduction of a new multimedia model of semantic transcoding, capable of supporting different users' requests and the constraints of their devices (PDAs, smart phones, ...). Our application context is the emerging area of domotics (from the Latin word domus, meaning "home", and informatics) and, in particular, indoor video surveillance of houses where people with some difficulties (elderly and disabled people) can now live with a sufficient degree of autonomy, thanks to the strong interaction with new technologies that can be distributed in the house at affordable cost and with high reliability.


2003 - A Hough transform-based method for radial lens distortion correction [Conference proceedings paper]
Cucchiara, Rita; Grana, Costantino; A., Prati; Vezzani, Roberto
abstract

The paper presents an approach for a robust (semi-)automatic correction of radial lens distortion in images and videos. This method, based on the Hough transform, is also applicable to videos from unknown cameras that, consequently, cannot be calibrated a priori. We approximated the lens distortion by considering only the lower-order term of the radial distortion. Thus, the method relies on the assumption that pure radial distortion transforms straight lines into curves. The computation of the best value of the distortion parameter is performed in a multi-resolution way. The precision of the method depends on the scale of the multi-resolution and on the resolution of the Hough space. Experiments are provided for both an outdoor, uncalibrated camera and an indoor, calibrated one. The stability of the value found in different frames of the same video demonstrates the reliability of the proposed method.
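
A single-parameter radial model of the kind mentioned in this abstract is often written as p_u = c + (p_d - c)(1 + k r^2), where c is the distortion center, r is the distance of the distorted point p_d from c, and k is the distortion parameter. The Python sketch below assumes this common form; the exact formulation, the parameter search and the Hough-based straightness measure used in the paper are not reproduced, and all names are illustrative.

```python
def undistort_point(x, y, k, cx=0.0, cy=0.0):
    """Map a distorted point to its corrected position with a one-term radial model.

    Assumed model (not necessarily the paper's): the offset from the
    distortion center (cx, cy) is scaled by (1 + k * r^2), where r is the
    distance of the distorted point from the center.
    """
    dx, dy = x - cx, y - cy
    r2 = dx * dx + dy * dy
    scale = 1.0 + k * r2
    return cx + dx * scale, cy + dy * scale


# With k = 0 the mapping is the identity; for k > 0 points move outwards,
# and the displacement grows with the distance from the center.
print(undistort_point(1.0, 0.0, k=0.1))
print(undistort_point(2.0, 0.0, k=0.1))
```

Under such a model, a candidate value of k can be scored by how straight the corrected edges appear, which is where a line detector such as the Hough transform fits in.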


2003 - A new algorithm for border description of polarized light surface microscopic images of pigmented skin lesions [Journal article]
Grana, Costantino; Pellacani, Giovanni; Cucchiara, Rita; Seidenari, Stefania
abstract

The aim of this study was to provide mathematical descriptors for the border of pigmented skin lesion images and to assess their efficacy for distinction among different lesion groups. New descriptors such as lesion slope and lesion slope regularity are introduced and mathematically defined. A new algorithm based on the Catmull-Rom spline method and the computation of the gray-level gradient of points extracted by interpolation of normal direction on spline points was employed. The efficacy of these new descriptors was tested on a data set of 510 pigmented skin lesions, composed of 85 melanomas and 425 nevi, by employing statistical methods for discrimination between the two populations.


2003 - Border cut-off in dermoscopic images of melanocytic lesions: computer evaluation and comparison with clinical assessment [Journal abstract]
Pellacani, Giovanni; Grana, Costantino; A., Martella; Seidenari, Stefania
abstract

The description of the border appears to be an important feature for clinical judgement in dermatoscopy, but it is subjective and can lead to different results depending on the examiners’ experience. In order to provide mathematical descriptors for border regularity and to increase the reproducibility of clinical judgement, a method to quantify border characteristics and to automatically reproduce the B (Border) parameters of the ABCD rule was developed. 331 images of pigmented skin lesions, 113 referring to melanomas and 218 to melanocytic naevi, acquired by a digital videomicroscope with a 20× magnification were studied. Clinical evaluation: for the evaluation of border cut-off, a score ranging from 0 to 8 was attributed to each lesion on the basis of the number of segments with an abrupt edge interruption of the pigmentation. Computer elaboration: after automatic border detection, the skin lesion gradient, defined as the change in lightness values along a 30-pixel-long segment centred on the lesion border and expressed as the slope of the curve, was calculated. Minimum and maximum values and the standard deviation were calculated for the description of border regularity. In order to compare clinical and computer evaluation, the lesion border was divided into 8 segments and a threshold for abrupt border cut-off was set on a visual basis. Melanomas presented more abrupt and inhomogeneous margins than melanocytic naevi. A good correlation between clinical evaluation and computer elaboration was found for the number of borders with an abrupt cut-off (rho = 0.834; P < 0.001). Computerized image analysis appears to be able to numerically describe pigmented skin lesions and to reproduce some aspects of the clinical evaluation. Enabling an objective and reproducible description, it could represent a useful support to clinical diagnosis.
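
One simple way to express the change in lightness along a border-crossing segment as a single slope is a least-squares line fit over the sampled values. The Python sketch below makes that assumption; the abstract does not state how the slope was actually computed, and the function name and sample profiles are hypothetical.

```python
def profile_slope(lightness):
    """Least-squares slope of lightness values sampled at unit pixel spacing."""
    n = len(lightness)
    if n < 2:
        raise ValueError("at least two samples are required")
    mean_x = (n - 1) / 2
    mean_y = sum(lightness) / n
    # Covariance of position and lightness, divided by the variance of position.
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(lightness))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical 30-sample profiles crossing a border from lesion to skin:
abrupt = [20] * 15 + [200] * 15          # sharp cut-off
gradual = [20 + 6 * i for i in range(30)]  # smooth fading
print(profile_slope(abrupt), profile_slope(gradual))
```

Computing such a slope at many points along the border and taking the minimum, maximum and standard deviation would give regularity descriptors in the spirit of those listed above.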


2003 - Camera-car Video Analysis for Steering Wheel's Tracking [Conference proceedings paper]
Cucchiara, Rita; Grana, Costantino; A., Prati; F., Vigetti; M., Piccardi
abstract

Monitoring and controlling the driver’s guidance by analyzing the rotation applied to the steering wheel can be a very important task in order to improve safety. This paper proposes a general-purpose method to track the steering wheel’s absolute angle by using a single-camera vision system mounted inside the car. The absolute angle is computed by means of the accumulation of inter-frame relative rotations, and error propagation is prevented with an alignment process. The approach is based on the modeling of the motion of the steering wheel, as it appears perspectively distorted from the point of view of the uncalibrated camera. We modified the Lucas-Kanade method for an approximately rotational motion model in order to provide the detection and tracking of significant features on the wheel. The experimental results are compared with ground-truth data obtained with different types of sensors.


2003 - Comparison between two methods for automated extraction and description of dark areas in dermoscopic images [Journal abstract]
Pellacani, Giovanni; Grana, Costantino; A., Martella; Seidenari, Stefania
abstract

In contrast with common naevi, which generally show a homogeneous and regularly distributed pigmentation, brown to black pigment areas with irregular shape or asymmetric distribution are frequently observable in melanomas. Identification of dark areas inside a melanocytic lesion is of great importance for melanoma diagnosis, both during clinical examination and employing programs for automated image analysis. The aim of our study was to compare two different methods for the automated identification and description of dark areas in epiluminescence microscopy images of melanocytic lesions and to evaluate their diagnostic capability. 339 images of melanocytic lesions, referring to 113 melanomas and 226 melanocytic naevi, acquired by means of a polarized-light videomicroscope (Videocap 200, DS-medica, Italy) with a 20-fold magnification were studied. Two different methods were employed for the identification of dark areas: the first permits the identification of ‘absolute’ dark areas, defined as areas which are darker than the skin. The second identifies the areas of the lesion that are darkest with respect to the overall brightness of the lesion (‘relative’ dark areas). A set of parameters is extracted both for ‘absolute’ and ‘relative’ dark areas, in order to numerically describe the region properties, such as extension, balance, regularity and symmetry of its distribution. Significant differences in dark area distribution between melanomas and naevi were observed employing both methods, permitting a good discrimination of melanocytic lesions (diagnostic accuracy = 74.6% and 71.2% for absolute and relative dark areas, respectively). In conclusion, both methods for automated identification of dark areas are useful for melanoma diagnosis and can be implemented in programs for image analysis.


2003 - Computer Vision Techniques for PDA Accessibility of In-House Video Surveillance [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Prati, A.; Vezzani, Roberto
abstract

In this paper we propose an approach to indoor environment surveillance and, in particular, to monitoring people's behaviour in a home automation context. The reference application is the silent and automatic monitoring of the behaviour of people living alone in the house, specially conceived for people with limited autonomy (e.g., elderly or disabled people). The aim is to detect dangerous events (such as a person falling down) and to react to these events by establishing a remote connection with low-performance clients, such as PDAs (Personal Digital Assistants). To this aim, we propose an integrated server architecture, typically connected in intranet with network cameras, able to segment and track objects of interest; in the case of objects classified as people, the system must also evaluate their posture and infer possible dangerous situations. Finally, the system is equipped with a specifically designed transcoding server to adapt the video content to PDA requirements (display area and bandwidth) and to the user's requests. The main elements of the proposal are a reliable real-time object detection and tracking module, a simple but effective posture classifier improved by a supervised learning phase, and a high-performance transcoding inspired by the MPEG-4 object-level standard, tailored to PDAs. Results on different video sequences and a performance analysis are discussed.


2003 - Computer description of colours in dermoscopic melanocytic lesion images reproducing clinical assessment [Articolo su rivista]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

Background: The assessment of colours is essential for the diagnosis of malignant melanoma (MM), both for pattern analysis on dermoscopic images, and when employing semiquantitative methods. Objectives: To develop a computer program for colour assessment in MM images mimicking the human perception of lesion colours, and to compare the automatic colour evaluation with one performed by human observers. Methods: A colour palette comprising six colour groups (black, dark brown, light brown, blue-grey, red and white) was created by selecting single colour components inside melanocytic lesion images acquired by means of a digital videomicroscope, and was implemented in the image analysis program. Subsequently, colours were assessed by the computer program on 331 melanocytic lesion images composing our image database, and the results were compared with the evaluation of lesion colours performed by the clinician. Results: The black, white and blue-grey colours were more frequently found in MMs than in naevi, both by the clinicians and by the computer. In MM images we observed 4.27 +/- 1.14 colours (mean +/- SD) per lesion, as opposed to 3.22 +/- 0.68 in naevi. The correlation between clinical and computer evaluation of the colours was very good, with a value of 0.781 for overall assessment. Conclusions: This innovative method for automatic colour evaluation, reproducing clinical assessment of melanocytic lesion colours, may provide numerical parameters to be employed for computer-aided diagnosis of MM.


2003 - Detecting moving objects, ghosts, and shadows in video streams [Articolo su rivista]
Cucchiara, Rita; Grana, Costantino; Piccardi, Massimo; Prati, Andrea
abstract

Background subtraction methods are widely exploited for moving object detection in videos in many applications, such as traffic monitoring, human motion capture, and video surveillance. How to correctly and efficiently model and update the background model and how to deal with shadows are two of the most distinguishing and challenging aspects of such approaches. This work proposes a general-purpose method that combines statistical assumptions with the object-level knowledge of moving objects, apparent objects (ghosts), and shadows acquired in the processing of the previous frames. Pixels belonging to moving objects, ghosts, and shadows are processed differently in order to supply an object-based selective update. The proposed approach exploits color information for both background subtraction and shadow detection to improve object segmentation and background update. The approach proves fast, flexible, and precise in terms of both pixel accuracy and reactivity to background changes.


2003 - Image Representation and Retrieval with Topological Trees [Relazione in Atti di Convegno]
Grana, Costantino; Pellacani, Giovanni; Seidenari, Stefania; Cucchiara, Rita
abstract

Typical processes of image representation comprise an initial region segmentation followed by a description of the single regions’ features and their relationships. A graph model can then be exploited in order to integrate the knowledge of the specific regions (the nodes of the attributed relational graph, ARG) and the regions’ relations (the ARG’s edges). In this work we use color features to guide region segmentation, geometric features to characterize regions one by one and topological features (in particular inclusion) to describe regions’ relationships. Guided by the inclusion property, we define the Topological Tree (TT) as an image representation model that, exploiting the transitive property of inclusion, uses the adjacency and inclusion topological features. We propose an approach based on a recursive version of fuzzy c-means to construct the topological tree directly from the initial image, performing both segmentation and TT construction. The TT can be exploited in many applications of image analysis and image retrieval by similarity in those contexts where inclusion is a key feature: we propose an application to the analysis of dermatological images to support melanoma diagnosis. In this paper we describe details of the TT algorithm, including the management of non-ideal cases and an approximate measure of tree similarity used to retrieve skin lesions with a similar TT-based description.


2003 - Semantic video transcoding using classes of relevance [Articolo su rivista]
Cucchiara, Rita; Grana, Costantino; Prati, Andrea
abstract

In this work we present a framework for on-the-fly video transcoding that exploits computer vision-based techniques to adapt the Web access to the user requirements. The proposed transcoding approach aims at coping with both user bandwidth and resources capabilities, and with user interests in the video's content. We propose an object-based semantic transcoding that, according to the user-defined classes of relevance, applies different transcoding techniques to the objects segmented in a scene. Object extraction is provided by on-the-fly video processing, without manual annotation. Multiple transcoding policies are reviewed and a performance evaluation metric based on the Weighted Mean Square Error (and corresponding PSNR), that takes into account the perceptual user requirements by means of classes of relevance, is defined. Results are analyzed by varying transcoding techniques, bandwidth requirements and video types (with indoor and outdoor scenes), showing that the use of semantics can dramatically improve the bandwidth to distortion ratio.
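The abstract names a Weighted Mean Square Error driven by classes of relevance, with a corresponding PSNR, but does not give the formula. The following is only a minimal sketch of one plausible form: a per-class MSE combined with normalized class weights. The function names, the class labels, the weights and the 8-bit peak value are assumptions made for illustration, not the authors' definition.

```python
import math

def weighted_mse(original, transcoded, class_map, weights):
    """Combine per-class MSE values using relevance weights (assumed form).

    original, transcoded: flat sequences of pixel intensities.
    class_map: class label of each pixel (hypothetical labels).
    weights: dict mapping class label -> relevance weight.
    """
    sq_err = {}
    count = {}
    for o, t, c in zip(original, transcoded, class_map):
        sq_err[c] = sq_err.get(c, 0.0) + (o - t) ** 2
        count[c] = count.get(c, 0) + 1
    # Normalize by the weights of the classes actually present.
    total_w = sum(weights[c] for c in count)
    return sum(weights[c] * sq_err[c] / count[c] for c in count) / total_w

def psnr(mse, peak=255.0):
    """PSNR in dB for a given (weighted) MSE; peak=255 assumes 8-bit data."""
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

# Toy example: errors on "people" pixels count far more than background errors.
original = [10, 10, 10, 10]
transcoded = [10, 12, 10, 14]
classes = ["people", "people", "background", "background"]
relevance = {"people": 0.9, "background": 0.1}
wmse = weighted_mse(original, transcoded, classes, relevance)
print(wmse, psnr(wmse))
```

With weights like these, a policy that degrades only the background is penalized much less than one that degrades the people class, which is the behaviour a class-wise distortion measure is meant to capture.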


2002 - A Framework for Semantic Video Transcoding [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Prati, A.
abstract

In this work we present a transcoding framework and an object-based technique to adapt live and stored videos to the user bandwidth and resources capabilities. Multiple transcoding policies are reviewed and a performance evaluation metric based on the Weighted Mean Square Error that allows different classes of relevance is presented. We present results for different transcoding policies and for different bandwidth requirements, showing that the use of semantics can improve the bandwidth to distortion ratio.


2002 - Building the Topological Tree by Recursive FCM Color Clustering [Abstract in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Prati, Andrea; Seidenari, Stefania; Pellacani, Giovanni
abstract

In this paper we define a Topological Tree (TT) as a knowledge representation method that aims to describe important visual and spatial features of image regions, namely the color similarity, the inclusion and the spatial adjacency. The topological tree exhibits some interesting properties that can be exploited to extract knowledge from images for information retrieval, image understanding and diagnosis purposes. Examples of applications in dermatology are described. The TT can be constructed after segmentation, by computing the spatial relationships of regions or can be generated directly during the segmentation: to this aim we present a novel recursive fuzzy c-means (FCM) clustering algorithm based on the Principal Component Analysis of the color space. The recursive FCM proves to be effective for underlining the adjacency and inclusion property of regions.


2002 - Comparison between computer elaboration and clinical assessment of asymmetry and border cut-off in melanoma images [Abstract in Rivista]
Pellacani, Giovanni; Grana, Costantino; Seidenari, Stefania
abstract

Clinical evaluation of pigmented skin lesion images is subjective and can lead to different results depending on the examiner’s experience, even when applying semiquantitative methods such as the ABCD rule for dermatoscopy. In order to increase the reproducibility of clinical judgement, a method to automatically reproduce the A (Asymmetry) and the B (Border) parameters of the ABCD rule was developed. One hundred and fourteen images of melanomas acquired by a digital videomicroscope with a 20x magnification were studied. Clinical evaluation: a clinical judgement of asymmetry of the shape and pigment distribution along 2 axes was performed using a 0–2 scoring system. For the evaluation of the border cut-off, a score ranging from 0 to 8 was attributed to each lesion on the basis of the number of segments with an abrupt edge interruption of the pigmentation. Computer elaboration: after automatic border detection, major and minor axes were obtained and ‘shape asymmetry’ on each axis was calculated considering the proportion of overlapping pixels. A correspondence lower than 90% was selected as the threshold for asymmetry. The ‘pigment distribution asymmetry’ on each axis was calculated comparing the portion of the dark area, obtained by the median cut algorithm, in the two halves of the lesion. A correspondence lower than 80% was considered as the threshold for asymmetry. In order to numerically describe the gradient at the border, the lesion border was divided into 8 segments and the change in lightness values along a 30 pixel long segment centered on the lesion border, expressed as the slope of the curve, was considered. The threshold for abrupt border cut-off was set at a slope greater than 3.609. Results: a good correlation between clinical evaluation and computer elaboration was found for shape asymmetry (rho=0.698; p<0.001), pigment distribution asymmetry (rho=0.428; p<0.001) and number of borders with an abrupt cut-off (rho=0.834; p<0.001).
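The shape asymmetry test described above (proportion of overlapping pixels when the lesion is folded along an axis, with asymmetry declared below 90% correspondence) can be illustrated with a small sketch. It assumes a binary lesion mask given as a set of integer pixel coordinates and a vertical folding axis; the actual method uses the major and minor axes found after border detection, which are not computed here. The names `overlap_ratio` and `is_asymmetric` are hypothetical.

```python
def overlap_ratio(mask, axis_x):
    """Fraction of lesion pixels that land on lesion pixels after mirroring.

    mask: set of (x, y) integer pixel coordinates inside the lesion.
    axis_x: x-coordinate of a vertical folding axis; must be a multiple
            of 0.5 so that mirrored coordinates stay on the pixel grid.
    """
    twice_axis = int(round(2 * axis_x))
    mirrored = {(twice_axis - x, y) for x, y in mask}
    return len(mask & mirrored) / len(mask)

def is_asymmetric(mask, axis_x, threshold=0.90):
    """Asymmetric when the correspondence is lower than the threshold."""
    return overlap_ratio(mask, axis_x) < threshold

# A 4x3 rectangle is perfectly symmetric about x = 1.5.
rectangle = {(x, y) for x in range(4) for y in range(3)}
# Adding a two-pixel protrusion on one side breaks the symmetry.
lopsided = rectangle | {(4, 0), (5, 0)}

print(overlap_ratio(rectangle, 1.5), is_asymmetric(rectangle, 1.5))
print(overlap_ratio(lopsided, 1.5), is_asymmetric(lopsided, 1.5))
```

The same function could be applied to the second axis by swapping the coordinates of the mask before the call.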


2002 - Detecting Moving Objects and their Shadows: An Evaluation with the PETS2002 Dataset [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Prati, A.
abstract

This work presents a general-purpose method for moving visual object segmentation in videos and discusses results attained on sequences of the PETS2002 datasets. The proposed approach, called Sakbot, exploits color and motion information to detect objects, shadows and ghosts, i.e. foreground objects with apparent motion. The method is based on background suppression in the color space. The main peculiarity of the approach is the exploitation of motion and shadow information to selectively update the background, improving the statistical background model with the knowledge of detected objects. The approach is able to detect Moving Visual Objects (MVOs), and stopped objects too, since the motion status is maintained at the level of the tracking module. The HSV color space is exploited for shadow detection in order to enhance both segmentation and background update. Time measures and a precision performance analysis in tracking and counting people are provided for surveillance and monitoring purposes.


2002 - Development of a new program for image analysis of digital videomicroscopic images of pigmented skin lesions [Abstract in Rivista]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino; Cucchiara, Rita
abstract

Although an improvement of the diagnostic accuracy of pigmented skin lesions (PSL) has been achieved by the epiluminescence technique (ELM), the interpretation of ELM criteria is often confusing, especially for inexperienced observers. To enhance the reproducibility and accuracy of clinical judgement and the training of inexperienced operators, programs for PSL image analysis and algorithms for automatic diagnosis have been developed. The aim of our study was to develop a new program for PSL image analysis, able to describe different aspects of PSLs, and to test its descriptive capability on PSLs acquired by means of a digital videomicroscope (VMS 110A, Scalar Mitsubishi, Japan) using 20-fold magnification. After automatic border identification and barycentre determination, some geometric parameters, describing shape characteristics of the lesion, were calculated. A mathematical description of the border cut-off was obtained. The texture of the lesion was calculated applying the co-occurrence matrix at different image resolutions. Dark areas and colour areas, referring to selected colour groups, were obtained and their aspect and distribution were mathematically defined and calculated. 281 common nevi and 117 melanomas were numerically described by our program and the capability of the mathematical parameters to distinguish between benign and malignant lesions was tested by means of discriminant analysis. Significant differences were observed for most parameters between different PSL populations. The automatic classification enabled the distinction between melanomas and nevi with a 100% sensitivity and an 82.9% specificity.


2002 - Exploiting color and topological features for region segmentation with recursive fuzzy c-means [Articolo su rivista]
Cucchiara, Rita; Grana, Costantino; Seidenari, Stefania; Pellacani, Giovanni
abstract

In this paper we define a novel approach for image segmentation into regions which focuses on both visual and topological cues, namely color similarity, inclusion and spatial adjacency. Many color clustering algorithms have been proposed in the past for skin lesion images but none exploits explicitly the inclusion properties between regions. Our algorithm is based on a recursive version of fuzzy c-means (FCM) clustering algorithm in the 2D color histogram constructed by Principal Component Analysis (PCA) of the color space. The distinctive feature of the proposal is that recursion is guided by the evaluation of adjacency and mutual inclusion properties of extracted regions; then, the recursive analysis addresses only included regions or regions with a not-negligible size. This approach allows a coarse-to-fine segmentation which focuses the attention on the inner parts of the images, in order to highlight the internal structure of the object depicted in the image. This could be particularly useful in many applications, especially in the biomedical image analysis. In this work we apply the technique to the segmentation of skin lesions in dermatoscopic images. It could be a suitable support for the diagnosis of skin melanoma, since dermatologists are interested in the analysis of the spatial relations, the symmetrical positions and the inclusion of regions.
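For readers unfamiliar with the clustering step, here is a minimal sketch of plain fuzzy c-means (FCM) on one-dimensional data. It is not the recursive, PCA-based variant described in the abstract: the recursion guided by inclusion and adjacency and the 2D color histogram are left out. The function name, the fuzzifier value `m = 2`, the initial centers and the toy data are assumptions chosen for illustration.

```python
def fcm_1d(points, centers, m=2.0, iterations=20):
    """Standard fuzzy c-means on scalar data.

    Alternates the membership update u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1))
    with the center update c_i = sum_k u_ik^m x_k / sum_k u_ik^m.
    Returns the final centers and the memberships computed in the last pass.
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # Point coincides with a center: give it full membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [r / total for r in row]
            else:
                row = [1.0 / sum((di / dj) ** exponent for dj in dists)
                       for di in dists]
            memberships.append(row)
        centers = [
            sum((u[i] ** m) * x for u, x in zip(memberships, points))
            / sum(u[i] ** m for u in memberships)
            for i in range(len(centers))
        ]
    return centers, memberships

# Two well-separated groups of values (e.g. dark and light intensities).
data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
final_centers, final_memberships = fcm_1d(data, centers=[1.0, 4.0])
print(final_centers)
```

Unlike hard clustering, every point keeps a graded membership in each cluster, and each membership row sums to one.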


2002 - Iterative fuzzy clustering for detecting regions of interest in skin lesions [Articolo su rivista]
Cucchiara, Rita; Grana, Costantino; Piccardi, Massimo
abstract

Image analysis tools are spreading in dermatology since the introduction of dermoscopy (epiluminescence microscopy), in the effort of algorithmically reproducing clinical evaluations. Color-based region segmentation of skin lesions is one of the key steps for correctly collecting statistics that can help clinicians in their diagnosis. Nevertheless, an efficient and accurate region segmentation algorithm has not been proposed in the literature yet. This work proposes an iterative fuzzy c-means clustering algorithm based on PCA with the Karhunen-Loève transform of the color space. A topological tree is provided to store the mutual inclusions of the regions and then used to summarize the structural properties of the skin lesion. Preliminary experimental results are presented and discussed.


2002 - Semantic Transcoding for Live Video Server [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Prati, A.
abstract

In this paper we present transcoding techniques for a video server architecture that enables the user to access live video streams by using different devices with different capabilities. For live videos, annotation methods cannot be exploited. Instead we propose methods of on-the-fly transcoding that adapt the video content with respect to the user resources and the video semantics. Thus we propose an object-based transcoding with "classes of relevance" (for instance People, Face and Background). To compare the different strategies we propose a metric based on the Weighted Mean Square Error that allows the analysis of different application scenarios by means of a class-wise distortion measure. The obtained results show that the use of semantics can improve the bandwidth to distortion ratio significantly.


2002 - Semantic transcoding for live video server [Relazione in Atti di Convegno]
Cucchiara, R.; Grana, C.; Prati, A.
abstract

In this paper we present transcoding techniques for a video server architecture that enables the user to access live video streams by using different devices with different capabilities. For live videos, annotation methods cannot be exploited. Instead we propose methods of on-the-fly transcoding that adapt the video content with respect to the user resources and the video semantics. Thus we propose an object-based transcoding with "classes of relevance" (for instance People, Face and Background). To compare the different strategies we propose a metric based on the Weighted Mean Square Error that allows the analysis of different application scenarios by means of a class-wise distortion measure. The obtained results show that the use of semantics can improve the bandwidth to distortion ratio significantly.


2002 - Using the Topological Tree for skin lesion structure description [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino
abstract

In this work we describe the Topological Tree (TT) as a knowledge representation method that relates some important visual and spatial features of image regions, namely the color similarity, the inclusion and the spatial adjacency. Starting from color-based region segmentation of an image into disjoint regions, their spatial relationships can be derived and described with graph-based methods. We are interested in the region’s property “to be included into” (in the sense of “surrounded by”) another region. This property could be very useful in biomedical imaging and in particular in the diagnosis of skin melanoma. The TT can be constructed after segmentation, by computing the spatial relationships of regions, or can be generated directly during the segmentation: to this aim we present a novel recursive fuzzy c-means (FCM) clustering algorithm based on the PCA of the color space. In the paper, in addition to the TT definition and the construction algorithm description, some results are presented and discussed.


2001 - Automatic digital image analysis of pigmented skin lesion: development of a new program for geometric feature description [Abstract in Rivista]
Seidenari, Stefania; Pellacani, Giovanni; Martella, A.; Grana, Costantino
abstract

-


2001 - Detecting objects, shadows and ghosts in video streams by exploiting color and motion information [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Piccardi, M.; Prati, A.
abstract

Many approaches to moving object detection for traffic monitoring and video surveillance proposed in the literature are based on background suppression methods. How to correctly and efficiently update the background model and how to deal with shadows are two of the most distinguishing and challenging features of such approaches. This work presents a general-purpose method for segmentation of moving visual objects (MVOs) based on an object-level classification into MVOs, ghosts and shadows. Background suppression needs a background model to be estimated and updated: we use motion and shadow information to selectively exclude MVOs and their shadows from the background model, while retaining ghosts. The color information (in the HSV color space) is exploited for shadow suppression and, consequently, to enhance both MVO segmentation and background update.


2001 - Improving shadow suppression in moving object detection with HSV color information [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Piccardi, M.; Prati, A.; Sirotti, S.
abstract

Video-surveillance and traffic analysis systems can be greatly improved using vision-based techniques able to extract, manage and track objects in the scene. However, problems arise due to shadows. In particular, moving shadows can affect the correct localization, measurement and detection of moving objects. This work presents a technique for shadow detection and suppression used in a system for moving visual object detection and tracking. The major novelty of the shadow detection technique is the analysis carried out in the HSV color space to improve the accuracy in detecting shadows. The signal processing and optical motivations of the proposed approach are described. The integration and exploitation of the shadow detection module into the system are outlined, and experimental results are shown and evaluated.
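The abstract does not spell out the shadow test, so the following is only a sketch of the general idea behind HSV-based shadow detection: a shadow pixel is darker than the background in the V channel by a bounded ratio, while its hue and saturation stay close to the background values. The function name, the exact conditions and all threshold values (`alpha`, `beta`, `tau_s`, `tau_h`) are assumptions for illustration and are not taken from the paper.

```python
def is_shadow(pixel_hsv, background_hsv,
              alpha=0.4, beta=0.9, tau_s=0.1, tau_h=0.1):
    """Classify one pixel as shadow given its background model value.

    Both arguments are (h, s, v) tuples with all components in [0, 1],
    the convention used by Python's colorsys module.
    """
    h, s, v = pixel_hsv
    hb, sb, vb = background_hsv
    if vb == 0:
        return False  # no brightness reference to compare against
    ratio = v / vb
    # Hue is circular, so 0.98 and 0.02 are only 0.04 apart.
    hue_diff = abs(h - hb)
    hue_diff = min(hue_diff, 1.0 - hue_diff)
    return (alpha <= ratio <= beta
            and abs(s - sb) <= tau_s
            and hue_diff <= tau_h)

background = (0.30, 0.50, 0.70)
print(is_shadow((0.30, 0.50, 0.40), background))  # darker, same chromaticity
print(is_shadow((0.30, 0.50, 0.70), background))  # unchanged pixel
print(is_shadow((0.80, 0.50, 0.40), background))  # darker but different hue
```

The lower bound `alpha` keeps very dark objects from being absorbed as shadows, and the upper bound `beta` keeps small brightness fluctuations of the background from being labelled as shadows.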


2001 - Iterative fuzzy clustering for detecting regions of interest in skin lesions [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Piccardi, M.
abstract

Image analysis tools are spreading in dermatology since the introduction of dermoscopy (epiluminescence microscopy), in the effort of algorithmically reproducing clinical evaluations. Color-based region segmentation of skin lesions is one of the key steps for correctly collecting statistics that can help clinicians in their diagnosis. Nevertheless, an efficient and accurate region segmentation algorithm has not been proposed in the literature yet. This work proposes an iterative fuzzy c-means clustering algorithm based on PCA with the Karhunen-Loève transform of the color space. A topological tree is provided to store the mutual inclusions of the regions and then used to summarize the structural properties of the skin lesion. Preliminary experimental results are presented and discussed.


2001 - L’analisi d’immagine: geometrie, colori e tessiture. L’esperienza di Modena [Abstract in Atti di Convegno]
Seidenari, Stefania; Pellacani, Giovanni; Grana, Costantino
abstract

-


2001 - Shadow detection algorithms for traffic flow analysis: a comparative study [Relazione in Atti di Convegno]
Prati, A.; Mikic, I.; Grana, Costantino; Trivedi, M. M.
abstract

Shadow detection is critical for robust and reliable vision-based systems for traffic flow analysis. In this paper we discuss various shadow detection approaches and critically compare two of them. The goal of these algorithms is to prevent moving shadows being misclassified as moving objects (or parts of them), thus avoiding the merging of two or more objects into one and improving the accuracy of object localization. The environment considered is an outdoor highway scene with multiple lanes observed by a single fixed camera. The important features of shadow detection algorithms and the parameter set-up are analyzed and discussed. A critical evaluation of the results, both in terms of accuracy and in terms of computational complexity, is outlined. Finally, possible integration of the two approaches into a robust shadow detector is presented as a future direction of our research.


2001 - The Sakbot system for moving object detection and tracking [Capitolo/Saggio]
Cucchiara, Rita; Grana, Costantino; Neri, Gianni; Piccardi, Massimo; Prati, Andrea
abstract

This paper presents Sakbot, a system for moving object detection in traffic monitoring and video surveillance applications. The system is endowed with robust and efficient detection techniques, whose main features are the statistical and knowledge-based background update and the use of HSV color information for shadow suppression. Tracking is provided by a symbolic reasoning module allowing flexible object tracking over a variety of different applications. This system proves effective in many different situations, both from the point of view of the scene appearance and the purpose of the application.


2001 - The Sakbot system for moving object detection and tracking [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Neri, G.; Piccardi, M.; Prati, Andrea
abstract

This paper presents Sakbot, a system for moving object detection and tracking in traffic monitoring and video surveillance applications. The system is endowed with robust and efficient detection techniques, whose main features are the statistical and knowledge-based background update and the use of HSV color information for shadow suppression. Tracking is performed by means of a flexible tracking module based on symbolic reasoning, which can be tuned to several different applications.


2000 - Analisi computerizzata di immagini digitali di lesioni pigmentate cutanee: sviluppo di un nuovo software e descrizione dei parametri geometrici [Abstract in Atti di Convegno]
Seidenari, Stefania; Martella, A.; Grana, Costantino; Pellacani, Giovanni
abstract

-


2000 - Analisi di sequenze di immagini per sorveglianza e controllo del traffico [Abstract in Atti di Convegno]
Grana, Costantino
abstract

.


2000 - Statistic and knowledge-based moving object detection in traffic scenes [Relazione in Atti di Convegno]
Cucchiara, Rita; Grana, Costantino; Piccardi, M.; Prati, A.
abstract

The most common approach used for vision-based traffic surveillance consists of a fast segmentation of moving visual objects (MVOs) in the scene together with an intelligent reasoning module capable of identifying, tracking and classifying the MVOs depending on the system goal. In this paper we describe our approach for MVO segmentation in an unstructured traffic environment. We consider complex situations with moving people, vehicles and infrastructures that have different aspect and motion models. In this case we define a specific approach based on background subtraction with statistic and knowledge-based background update. We show many results of real-time tracking of traffic MVOs in outdoor traffic scenes such as roads, parking areas, intersections, and entrances with barriers.


2000 - Sviluppo di un nuovo programma per la descrizione numerica delle lesioni pigmentate: il modulo delle geometrie [Abstract in Atti di Convegno]
Seidenari, Stefania; Martella, A.; Grana, Costantino; Pellacani, Giovanni
abstract

-