Personal page of Rita CUCCHIARA

Department of Engineering "Enzo Ferrari"

Palazzi, Andrea; Solera, Francesco; Calderara, Simone; Alletto, Stefano; Cucchiara, Rita ( 2017 ) - Learning Where to Attend Like a Human Driver ( IEEE Intelligent Vehicles Symposium - - 11-14 June 2017) ( - Proceedings of IEEE Intelligent Vehicles Symposium ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Despite the advent of autonomous cars, it is likely, at least in the near future, that human attention will still maintain a central role as a guarantee in terms of legal responsibility during the driving task. In this paper we study the dynamics of the driver's gaze and use it as a proxy to understand related attentional mechanisms. First, we build our analysis upon two questions: where is the driver looking, and at what? Second, we model the driver's gaze by training a coarse-to-fine convolutional network on short sequences extracted from the DR(eye)VE dataset. Experimental comparison against different baselines reveals that the driver's gaze can indeed be learnt to some extent, despite i) being highly subjective and ii) having only one driver's gaze available for each sequence due to the irreproducibility of the scene. Finally, we advocate for a new assisted driving paradigm which suggests to the driver, with no intervention, where she should focus her attention.

Baraldi, Lorenzo; Grana, Costantino; Cucchiara, Rita ( 2017 ) - NeuralStory: an Interactive Multimedia System for Video Indexing and Re-use ( 15th International Workshop on Content-Based Multimedia Indexing - - 19-21 June 2017) ( - Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing ) (ACM ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In recent years video has been swamping the Internet: websites, social networks, and business multimedia systems are adopting video as the most important form of communication and information. Videos are normally accessed as a whole and their visual content is not indexed. Thus, they are often uploaded as short, manually cut clips with user-provided annotations, keywords and tags for retrieval. In this paper, we propose a prototype multimedia system which addresses these two limitations: it overcomes the need for human intervention in the video setting, thanks to fully deep learning-based solutions, and decomposes the storytelling structure of the video into coherent parts. These parts can be shots, key-frames, scenes and semantically related stories, and are exploited to provide an automatic annotation of the visual content, so that parts of the video can be easily retrieved. This also allows a principled re-use of the video itself: users of the platform can produce new storytelling by means of multi-modal presentations, add text and other media, and propose a different visual organization of the content. We present the overall solution, and some experiments on the re-use capability of our platform in edutainment by conducting an extensive user evaluation.

Pini, Stefano; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita ( 2017 ) - Towards Video Captioning with Naming: a Novel Dataset and a Multi-Modal Approach ( 19th International Conference on Image Analysis and Processing - - 11-15 September 2017) ( - Proceedings of the 19th International Conference on Image Analysis and Processing ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Current approaches for movie description lack the ability to name characters with their proper names, and can only indicate people with a generic "someone" tag. In this paper we present two contributions towards the development of video description architectures with naming capabilities: firstly, we collect and release an extension of the popular Montreal Video Annotation Dataset in which the visual appearance of each character is linked both through time and to textual mentions in captions. We annotate, in a semi-automatic manner, a total of 53k face tracks and 29k textual mentions on 92 movies. Moreover, to underline and quantify the challenges of the task of generating captions with names, we present different multi-modal approaches to solve the problem on already generated captions.

Solera, Francesco; Calderara, Simone; Cucchiara, Rita ( 2015 ) - Towards the evaluation of reproducible robustness in tracking-by-detection ( 12th IEEE International Conference on Advanced Video and Signal-Based Surveillance - - 25-28 August) ( - AVSS 2015 : 12th IEEE International Conference on Advanced Video and Signal-Based Surveillance : August 25-28, 2015, Karlsruhe Institute of Technology & Fraunhofer IOSB, Karlsruhe, Germany ) (IEEE Danvers (MA) USA ) - pp. da 1 a 6 ISBN: 9781467376327 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Conventional experiments on multiple target tracking (MTT) are built upon the belief that supplying the same fixed detections to different trackers is sufficient to obtain a fair comparison. In this work we argue that the true behavior of a tracker is exposed when it is evaluated by varying the input detections rather than by fixing them. We propose a systematic and reproducible protocol and a MATLAB toolbox for generating synthetic data starting from ground truth detections, a proper set of metrics to understand and compare trackers' peculiarities, and the respective visualization solutions.

Balducci, Fabrizio; Grana, Costantino; Cucchiara, Rita ( 2015 ) - Classification of Affective Data to Evaluate the Level Design in a Role-Playing Videogame ( 7th International Conference on Games and Virtual Worlds for Serious Applications, VS-Games 2015 - - 16-18 September 2015) ( - VS-Games 2015 - 7th International Conference on Games and Virtual Worlds for Serious Applications ) (IEEE Piscataway USA ) - pp. da 1 a 8 ISBN: 9781479981021 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper presents a novel approach to evaluate game level design strategies, applied to role-playing games. Following a set of well-defined guidelines, two game levels were designed for Neverwinter Nights 2 to manipulate particular emotions like boredom or flow, and tested by 13 subjects wearing a brain-computer interface helmet. A set of features was extracted from the affective data logs and used to classify different parts of the gaming sessions, to verify the correspondence between the original level aims and the actual effects on people's emotions. The correlations observed suggest that the technique is extensible to other similar evaluation tasks.

Coppi, Dalia; Calderara, Simone; Cucchiara, Rita ( 2015 ) - Active query process for digital video surveillance forensic applications - SIGNAL, IMAGE AND VIDEO PROCESSING - n. volume 9 - pp. da 749 a 759 ISSN: 1863-1703 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Multimedia forensics is a newly emerging discipline concerning the analysis and exploitation of digital data as support for investigations, in order to extract probative elements. Among these data, visual data about people and their activities, efficiently extracted from videos, are becoming increasingly appealing for forensics, due to the availability of large amounts of video-surveillance footage. Thus, many research studies and prototypes investigate the analysis of soft biometrics data, such as people's appearance and trajectories. In this work, we propose new solutions for querying and retrieving visual data in an interactive and active fashion for soft biometrics in forensics. The proposal combines the capability of transductive learning for semi-supervised search by similarity with a typical multimedia methodology based on user-guided relevance feedback, to allow an active interaction with the visual data of people, their appearance and their trajectories in large surveillance areas. The proposed approaches are very general and can be exploited independently of the surveillance setting and the type of video analytics tools.

Solera, Francesco; Calderara, Simone; Cucchiara, Rita ( 2015 ) - Learning to Divide and Conquer for Online Multi-Target Tracking ( 2015 IEEE International Conference on Computer Vision - - 11-18 December 2015) ( - 2015 IEEE International Conference on Computer Vision ) (Institute of Electrical and Electronics Engineers Danvers (MA) USA ) - n. volume 11-18-December-2015 - pp. da 4373 a 4381 ISBN: 978-1-4673-8390-5; 978-1-4673-8391-2 | 978-1-4673-8391-2 ISSN: 1550-5499 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Online Multiple Target Tracking (MTT) is often addressed within the tracking-by-detection paradigm. Detections are first extracted independently in each frame, and object trajectories are then built by maximizing specifically designed coherence functions. Nevertheless, ambiguities arise in the presence of occlusions or detection errors. In this paper we claim that the ambiguities in tracking could be solved by a selective use of the features, by working with more reliable features if possible and exploiting a deeper representation of the target only if necessary. To this end, we propose an online divide and conquer tracker for static camera scenes, which partitions the assignment problem into local subproblems and solves them by selectively choosing and combining the best features. The complete framework is cast as a structural learning task that unifies these phases and learns tracker parameters from examples. Experiments on two different datasets highlight a significant improvement in tracking performance (MOTA +10%) over the state of the art.

Alletto, Stefano; Serra, Giuseppe; Cucchiara, Rita ( 2015 ) - Egocentric Object Tracking: An Odometry-Based Solution ( International Conference on Image Analysis and Processing - - 5-11 September 2015) ( - International Conference on Image Analysis and Processing - ICIAP 2015 ) (Springer Cham CHE ) - n. volume 9280 - pp. da 687 a 696 ISBN: 9783319232331 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Tracking objects moving around a person is one of the key steps in human visual augmentation: we could estimate their locations when they are out of our field of view, or know their position, distance or velocity, just to name a few possibilities. This is no easy task: in this paper, we show how current state-of-the-art visual tracking algorithms fail if challenged with a first-person sequence recorded from a wearable camera attached to a moving user. We propose an evaluation that highlights these algorithms' limitations and, accordingly, develop a novel approach based on visual odometry and 3D localization that overcomes many issues typical of egocentric vision. We implement our algorithm on a wearable board and evaluate its robustness, showing in our preliminary experiments an increase in tracking performance of nearly 20% compared to current state-of-the-art techniques.

Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita ( 2014 ) - Mapping Appearance Descriptors on 3D Body Models for People Re-identification ( - International Journal of Computer Vision ) - INTERNATIONAL JOURNAL OF COMPUTER VISION - n. volume 111 - pp. da 345 a 364 ISSN: 0920-5691 [Articolo in rivista (262) - Articolo su rivista]
Abstract

People Re-identification aims at associating multiple instances of a person’s appearance acquired from different points of view, different cameras, or after a spatial or a limited temporal gap to the same identifier. The basic hypothesis is that the person’s appearance is mostly constant. Many appearance descriptors have been adopted in the past, but they are often subject to severe perspective and view-point issues. In this paper, we propose a complete re-identification framework which exploits non-articulated 3D body models to spatially map appearance descriptors (color and gradient histograms) into the vertices of a regularly sampled 3D body surface. The matching and the shot integration steps are directly handled in the 3D body model, reducing the effects of occlusions, partial views or pose changes, which normally afflict 2D descriptors. A fast and effective model-to-image alignment is also proposed. It allows operation on common surveillance cameras or image collections. A comprehensive experimental evaluation is presented using the benchmark suite 3DPeS.

C. Grana; D. Borghesani; M. Manfredi; R. Cucchiara ( 2013 ) - A Fast Approach for Integrating ORB Descriptors in the Bag of Words Model ( IS&T/SPIE Electronic Imaging - - Feb 4-6) ( - Multimedia Content and Mobile Devices ) (SPIE - Society of Photo-Optical Instrumentation Bellingham, Washington USA ) - n. volume 8667 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we propose to integrate the recently introduced ORB descriptors in the currently favored approach for image classification, that is, the Bag of Words model. In particular, the problem to be solved is to provide a clustering method able to deal with the binary string nature of the ORB descriptors. We suggest using a k-means-like approach, called k-majority, which substitutes the Euclidean distance with the Hamming distance and takes the majority-selected vector as the new cluster center. Results combining this new approach with other features are provided over the ImageCLEF 2011 dataset.
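The k-majority idea described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: binary descriptors are assumed to be stored as Python integers, and the names `hamming`, `majority_center`, `k_majority` and the optional `init` parameter are hypothetical choices made for this example.

```python
import random


def hamming(a: int, b: int) -> int:
    """Hamming distance between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def majority_center(members, n_bits):
    """Bitwise majority vote: bit i is set if more than half the members set it."""
    center = 0
    for i in range(n_bits):
        ones = sum((m >> i) & 1 for m in members)
        if 2 * ones > len(members):
            center |= 1 << i
    return center


def assign(descriptors, centers):
    """Label each descriptor with the index of its nearest center (Hamming)."""
    return [min(range(len(centers)), key=lambda c: hamming(d, centers[c]))
            for d in descriptors]


def k_majority(descriptors, k, n_bits, iters=20, init=None, seed=0):
    """k-means-like clustering with Hamming distance and majority-vote centers."""
    # Assumption: random initial centers unless the caller supplies them.
    centers = list(init) if init is not None else random.Random(seed).sample(descriptors, k)
    for _ in range(iters):
        labels = assign(descriptors, centers)
        new_centers = []
        for c in range(k):
            members = [d for d, lab in zip(descriptors, labels) if lab == c]
            # An empty cluster keeps its previous center.
            new_centers.append(majority_center(members, n_bits) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(descriptors, centers)
```

Like k-means, this procedure only reaches a local optimum, so the outcome depends on the initial centers; real ORB descriptors would use `n_bits=256`.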

R. Cucchiara; C. Grana ( 2012 ) - Special Issue: Recent Achievements in Multimedia for Cultural Heritage - Guest Editorial - JOURNAL OF MULTIMEDIA - n. volume 7 (2) - pp. da 107 a 108 ISSN: 1796-2048 [Articolo in rivista (262) - Articolo su rivista]
Abstract

For quite some time, libraries, document and historical centers from opposite corners of the world have been the caretakers of our rich and assorted social legacy. They have protected and furnished access to the testimonies of knowledge, beauty and inspiration, such as sculptures, paintings, music and literature. The new information technologies have created unbelievable opportunities to make this common heritage more accessible for all. Culture is following the digital path and “memory institutions” are adapting the way in which they communicate with their public. Multimedia technologies have recently created the conditions for a true revolution in the cultural heritage area, with reference to the study, valorization, and fruition of artistic works. New multimedia technologies can be used to design unique approaches to the perception and enjoyment of this artistic legacy, for instance, through smart cultural objects and new interfaces supported by elements such as story-telling, gaming and learning. The whole plurality of masterpieces (paintings, books, manuscripts, even photos of sculptures and architecture) can be effectively embedded into a unique “paradigm” through digitization. This allows a significant reduction in costs, an enormous expansion of public accessibility (and therefore income), and at the same time a tremendous freedom for data elaboration. In brief, digitization enhances pleasure for the public and usefulness to experts on cultural heritage assets.

D. Borghesani; C. Grana; R. Cucchiara ( 2012 ) - 2D Images Map Warping for Improved User Interaction ( 21st International Conference on Pattern Recognition (ICPR 2012) - - Nov 11-15) ( - Proceedings of the 21st International Conference on Pattern Recognition (ICPR 2012) ) (IEEE Computer Society Press Los Alamitos, CA USA ) - pp. da 1096 a 1099 ISBN: 9784990644116 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we suggest an interaction model designed to fit users' expectations in front of an image retrieval system. A lightweight relevance feedback strategy, working directly on the 2D projection of image features, allows the user to spatially navigate the media collection maintaining the real-time constraint. A preliminary evaluation of this relevance feedback strategy shows good performance compared with other known approaches.

R. Cucchiara; A. Prati; R. Vezzani ( 2012 ) - Intelligent Video Surveillance ( - Critical Infrastructure Security: Assessment, Prevention, Detection, Response ) (WIT Press Southampton GBR ) - pp. da 177 a 189 ISBN: 9781845645625 ISSN: - [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Safety and security reasons are pushing the growth of surveillance systems, for both prevention and forensic tasks. Unfortunately, most of the installed systems have recording capability only, with quality so poor that it makes them completely unhelpful. This chapter introduces the concepts of modern systems for Intelligent Video Surveillance (IVS). It does not claim to provide a complete treatment or a technical description of this topic, but rather a simple and concise panorama of the motivations, components, and trends of these systems. Unlike CCTV systems, IVS should be able, for instance, to monitor people in public areas and smart homes, to control urban traffic, and to perform identity assessment for the security and safety of critical infrastructure.

G. Gualdi; A. Prati; R. Cucchiara ( 2011 ) - Contextual Information and Covariance Descriptors for People Surveillance: An Application for Safety of Construction Workers - EURASIP JOURNAL ON IMAGE AND VIDEO PROCESSING - n. volume 2011 - pp. da 1 a 16 ISSN: 1687-5176 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In computer science, contextual information can be used both to reduce computations and to increase accuracy. This paper discusses how it can be exploited for people surveillance in very cluttered environments in terms of perspective (i.e., weak scene calibration) and appearance of the objects of interest (i.e., relevance feedback on the training of a classifier). These techniques are applied to a pedestrian detector that uses a LogitBoost classifier, appropriately modified to work with covariance descriptors which lie on Riemannian manifolds. On each detected pedestrian, a similar classifier is employed to obtain a precise localization of the head. Two novelties on the algorithms are proposed in this case: polar image transformations to better exploit the circular feature of the head appearance and multispectral image derivatives that catch not only luminance but also chrominance variations. The complete approach has been tested on the surveillance of a construction site to detect workers that do not wear the hard hat: in such scenarios, the complexity and dynamics are very high, making pedestrian detection a real challenge.

S. Calderara; A. Prati; R. Cucchiara ( 2011 ) - Markerless Body Part Tracking for Action Recognition - INTERNATIONAL JOURNAL OF MULTIMEDIA INTELLIGENCE AND SECURITY - n. volume 1(1) - pp. da 76 a 89 ISSN: 2042-3462 [Articolo in rivista (262) - Articolo su rivista]
Abstract

This paper presents a method for recognising human actions by tracking body parts without using artificial markers. A sophisticated appearance-based tracking able to cope with occlusions is exploited to extract a probability map for each moving object. A segmentation technique based on mixture of Gaussians (MoG) is then employed to extract and track significant points on this map, corresponding to significant regions on the human silhouette. The evolution of the mixture in time is analysed by transforming it into a sequence of symbols (each corresponding to a MoG). The similarity between actions is computed by applying global alignment and dynamic programming techniques to the corresponding sequences and using a variational approximation of the Kullback-Leibler divergence to measure the dissimilarity between two MoGs. Experiments on publicly available datasets and comparison with existing methods are provided.
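The Kullback-Leibler divergence between two Gaussian mixtures has no closed form, which is why an approximation is needed. The sketch below shows one commonly used variational approximation, restricted to one-dimensional components for brevity; whether this is the exact variant used in the paper is an assumption, and the function names and the `(weight, mean, std)` tuple layout are hypothetical.

```python
import math


def kl_gauss(m1, s1, m2, s2):
    """Closed-form KL divergence KL(N(m1, s1^2) || N(m2, s2^2)) in one dimension."""
    return math.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5


def kl_mog_variational(f, g):
    """Variational approximation of KL(f || g) between two Gaussian mixtures.

    f and g are lists of (weight, mean, std) tuples whose weights sum to 1.
    For each component of f, the similarity to the other components of f is
    compared with the similarity to the components of g.
    """
    total = 0.0
    for w_a, m_a, s_a in f:
        num = sum(w * math.exp(-kl_gauss(m_a, s_a, m, s)) for w, m, s in f)
        den = sum(w * math.exp(-kl_gauss(m_a, s_a, m, s)) for w, m, s in g)
        total += w_a * math.log(num / den)
    return total
```

With single-component mixtures the expression reduces to the exact Gaussian KL divergence, and it is zero when both mixtures are identical. For very distant mixtures the exponentials can underflow, so a practical version would work in the log domain.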

Calderara, Simone; Prati, Andrea; Cucchiara, Rita ( 2010 ) - Moving pixels in static cameras: detecting dangerous situations due to environment or people ( - Intelligent Multimedia Analysis for Security Applications ) (Springer Eds. New York City, USA USA ) - pp. da 1 a 28 ISBN: 9783642117541 ISSN: - [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Dangerous situations arise in everyday life, and much effort has been devoted to exploiting technology to increase the level of safety in urban areas. Video analysis is one of the most important emerging technologies for security purposes. Automatic video surveillance systems commonly analyze the scene searching for moving objects. Well-known techniques exist to cope with this problem, which is commonly referred to as "change detection". Every time a difference against a reference model is sensed, it should be analyzed to allow the system to discriminate between a usual situation and a possible threat. When the sensor is a camera, motion is the key element to detect changes, and moving objects must be correctly classified according to their nature. In this context we can distinguish between two different kinds of threat that can lead to dangerous situations in a video-surveilled environment. The first one is due to environmental changes such as rain, fog or smoke present in the scene. Phenomena of this kind are sensed by the camera as moving pixels and, subsequently, as moving objects in the scene. Threats of this kind share some common characteristics such as texture, shape and color information, and can be detected by observing the features' evolution in time. The second situation arises when people are directly responsible for the dangerous situation. In this case a subject is acting in an unusual way, leading to an abnormal situation. From the sensor's point of view, moving pixels are still observed, but specific features and time-dependent statistical models should be adopted to learn and then correctly detect unusual and dangerous behaviors. With these premises, this chapter presents two different case studies. The first one describes the detection of environmental changes in the observed scene and details the problem of reliably detecting smoke in outdoor environments using both motion information and global image features, such as color information and texture energy computed by means of the Wavelet transform. The second refers to the problem of detecting suspicious or abnormal people behaviors by means of people trajectory analysis in a multiple-camera video-surveillance scenario. Specifically, a technique to infer and learn the concept of normality is proposed, jointly with a suitable statistical tool to model and robustly compare people trajectories.

C. Grana; D. Borghesani; G. Gualdi; R. Cucchiara ( 2010 ) - Bag-Of-Words Classification of Miniature Illustrations ( 11th International Workshop on Image Analysis for Multimedia Interactive Services - - Apr 12-14) ( - Proceedings of the 11th International Workshop on Image Analysis for Multimedia Interactive Services ) (IEEE Computer Society Press Los Alamitos, CA USA ) - pp. da 61 a 64 ISBN: 9781424478484 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper a system for the analysis of illuminated manuscript images is presented. In particular, the bag-of-keypoints strategy, commonly adopted for object recognition, image classification and scene recognition, is applied to the classification of automatically extracted miniatures. Pictures are characterized by SURF descriptors, and a classification procedure is performed, comparing the results of Naive Bayes and histogram intersection distance measures.
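As a small illustration of the histogram intersection measure mentioned in this abstract, the sketch below builds normalized bag-of-keypoints histograms from visual word indices and compares them. It assumes that each local descriptor has already been quantized to a visual word index; the function names are hypothetical and the code is not taken from the paper.

```python
def bow_histogram(word_ids, vocab_size):
    """Normalized histogram of visual word occurrences for one image."""
    counts = [0] * vocab_size
    for w in word_ids:
        counts[w] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * vocab_size
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))
```

For normalized histograms the value lies between 0 and 1, so `1 - histogram_intersection(h1, h2)` can serve as a distance in a nearest-neighbor classifier.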

C. Grana; D. Borghesani; P. Santinelli; R. Cucchiara ( 2010 ) - High Performance Connected Components Labeling on FPGA ( First International Workshop Interactive Multimodal Pattern Recognition in Embedded Systems - - Sep 1) ( - 2010 Workshops on Database and Expert Systems Applications ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 221 a 225 ISBN: 9780769541747 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper proposes a comparison of the two most advanced algorithms for connected components labeling, highlighting how they perform on a soft core SoC architecture based on FPGA. In particular we test our block-based connected components labeling algorithm, optimized with decision tables and decision trees. The embedded system is composed of the CMOS image sensor, FPGA, DDR SDRAM, USB controller and SPI Flash. Results highlight the importance of caching, and of instruction and data cache sizes, for high performance image processing tasks.
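For readers unfamiliar with connected components labeling, the following sketch shows the classical two-pass, pixel-based scheme with a union-find structure and 8-connectivity. It is only a baseline for illustration: it is not the block-based, decision-tree-optimized algorithm evaluated in the paper, and the function name is hypothetical.

```python
def label_components(image):
    """Two-pass connected components labeling (8-connectivity).

    image: list of rows with truthy values for foreground pixels.
    Returns (labels, count) with labels numbered 1..count in raster order.
    """
    if not image:
        return [], 0
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            neighbors = []
            for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if rr >= 0 and 0 <= cc < cols and labels[rr][cc]:
                    neighbors.append(labels[rr][cc])
            if not neighbors:
                parent.append(len(parent))
                labels[r][c] = len(parent) - 1
            else:
                smallest = min(neighbors)
                labels[r][c] = smallest
                for n in neighbors:
                    ra, rb = find(smallest), find(n)
                    if ra != rb:
                        parent[max(ra, rb)] = min(ra, rb)

    # Second pass: replace provisional labels with compact final labels.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

Block-based variants reduce the number of neighbor checks and memory accesses per pixel, which is the kind of cost that matters on cache-constrained embedded targets.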

Gualdi, Giovanni; Prati, Andrea; Cucchiara, Rita ( 2010 ) - Multi-stage Sampling with Boosting Cascades for Pedestrian Detection in Images and Videos ( 11th European Conference on Computer Vision (ECCV) - - 5-11 September 2010) ( - Lecture Notes in Computer Science ) (Springer-Verlag Berlin DEU ) - n. volume 6316 - pp. da 196 a 209 ISBN: 9783642155666 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Many works address the problem of object detection by means of machine learning with boosted classifiers. They exploit sliding window search, spanning the whole image: the patches, at all possible positions and sizes, are sent to the classifier. Several methods have been proposed to speed up the search (adding complementary features or using specialized hardware). In this paper we propose a statistical-based search approach for object detection which uses a Monte Carlo sampling approach for estimating the likelihood density function with Gaussian kernels. The estimation relies on a multi-stage strategy where the proposal distribution is progressively refined by taking into account the feedback of the classifier (i.e. its response). For videos, this approach is plugged into a Bayesian-recursive framework which exploits the temporal coherency of the pedestrians. Several tests on both still images and videos on common datasets are provided in order to demonstrate the relevant speedup and the increased localization accuracy with respect to the sliding window strategy, using a pedestrian classifier based on covariance descriptors and a cascade of LogitBoost classifiers.

Cucchiara, Rita; Fornaciari, Michele; Prati, Andrea; Santinelli, Paolo ( 2010 ) - Mutual Calibration of Camera Motes and RFIDs for People Localization and Identification ( 4th ACM/IEEE International Conference on Distributed Smart Cameras (ICDSC) - - 31 August-3 September 2010) ( - Proceedings of 4th ACM/IEEE International Conference on Distributed Smart Cameras ) (ACM New York, New York USA ) - pp. da 1 a 8 ISBN: 9781450303170 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Achieving both localization and identification of people in a wide open area using only cameras can be a challenging task, which imposes conflicting requirements: high resolution for identification, but low resolution for wide coverage in localization. Consequently, this paper proposes the joint use of cameras (only devoted to localization) and RFID sensors (devoted to identification) with the final objective of detecting and localizing intruders. To ground the observations on a common coordinate system, a calibration procedure is defined. This procedure only demands a training phase with a single person moving in the scene holding an RFID tag. Although preliminary, the results demonstrate that this calibration is sufficiently accurate to be applied in different scenarios where the area of overlap between the field of view (FoV) of a camera and the "field of sense" (FoS) of a (blind) sensor must be efficiently determined.

D. Borghesani; C. Grana; R. Cucchiara ( 2010 ) - Rerum Novarum: Interactive Exploration of Illuminated Manuscripts ( 18th International Conference on Multimedia (ACM Multimedia 2010) - - Oct 25-29) ( - Proceedings of the 18th International Conference on Multimedia (ACM Multimedia 2010) ) (ACM New York USA ) - pp. da 1621 a 1623 ISBN: 9781605589336 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper describes an interactive application for the exploration and annotation of illuminated manuscripts, which typically contain thousands of pictures, used to comment on or embellish the Gothic text of the manuscript. The system is composed of a modern user interface for browsing, surfing and querying, an automatic segmentation module to ease the initial picture extraction task, and a similarity-based retrieval engine used to provide visually assisted tagging capabilities. A relevance feedback procedure is included to further refine the results.

R. Vezzani; R. Cucchiara ( 2010 ) - Event Driven Software Architecture for Multi-camera and Distributed Surveillance Research Systems ( First IEEE Workshop on Camera Networks - CVPRW - - 13-18 June 2010) ( - Proceedings of Computer Vision and Pattern Recognition Workshops ) (IEEE Computer Society Press Washington DC USA ) - pp. da 1 a 8 ISBN: 9781424470297 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Surveillance of wide areas with several connected cameras integrated in the same automatic system is no longer a chimera, but modular, scalable and flexible architectures are mandatory to manage them. This paper points out the main issues in the development of distributed surveillance systems and proposes an integrated framework particularly suitable for research purposes. First, exploiting a computer architecture analogy, a three-layer tracking system is proposed, which copes with the integration of both overlapping and non-overlapping cameras. Then, a static service-oriented architecture is adopted to collect and manage the plethora of high-level modules, such as face detection and recognition, posture and action classification, and so on. Finally, the overall architecture is controlled by an event-driven communication infrastructure, which ensures the scalability and the flexibility of the system.

Vezzani, Roberto; Cucchiara, Rita ( 2010 ) - Video Surveillance Online Repository (ViSOR): an integrated framework - MULTIMEDIA TOOLS AND APPLICATIONS - n. volume 50 - pp. da 359 a 380 ISSN: 1380-7501 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The availability of new techniques and tools for Video Surveillance and the capability of storing huge amounts of visual data acquired by hundreds of cameras every day call for a convergence between pattern recognition, computer vision and multimedia paradigms. A clear need for this convergence is shown by new research projects which attempt to exploit both ontology-based retrieval and video analysis techniques also in the field of surveillance. This paper presents the ViSOR (Video Surveillance Online Repository) framework, designed with the aim of establishing an open platform for collecting, annotating, retrieving, and sharing surveillance videos, as well as evaluating the performance of automatic surveillance systems. Annotations are based on a reference ontology which has been defined integrating hundreds of concepts, some of them coming from the LSCOM and MediaMill ontologies. A new annotation classification schema is also provided, which is aimed at identifying the spatial, temporal and domain detail level used. The ViSOR web interface allows video browsing, querying by annotated concepts or by keywords, compressed video previewing, media downloading and uploading. Finally, ViSOR includes a performance evaluation desk which can be used to compare different annotations.

Baltieri, Davide; Vezzani, Roberto; Cucchiara, Rita ( 2010 ) - Fast Background Initialization with Recursive Hadamard Transform ( IEEE International Conference on Advanced Video and Signal Based Surveillance AVSS 2010 - - 29 August-1 September 2010) ( - Proceedings of IEEE International Conference on Advanced Video and Signal Based Surveillance AVSS 2010 ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 165 a 171 ISBN: 9780769542645 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present a new and fast technique for background estimation from cluttered image sequences. Most of the background initialization approaches developed so far collect a number of initial frames and then require a slow estimation step which introduces a delay whenever it is applied. Conversely, the proposed technique redistributes the computational load among all the frames by means of a patch by patch preprocessing, which makes the overall algorithm more suitable for real-time applications. For each patch location a prototype set is created and maintained. The background is then iteratively estimated by choosing from each set the most appropriate candidate patch, which should verify a sort of frequency coherence with its neighbors. To this aim, the Hadamard transform has been adopted which requires less computation time than the commonly used DCT. Finally, a refinement step exploits spatial continuity constraints along the patch borders to prevent erroneous patch selections. The approach has been compared with the state of the art on videos from available datasets (ViSOR and CAVIAR), showing a speed up of about 10 times and an improved accuracy.
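
As a rough illustration of why a Hadamard transform is cheap to compute, the sketch below implements the standard in-place fast Walsh-Hadamard transform, which uses only additions and subtractions. It is a generic textbook routine, not the recursive variant the paper proposes; the function name `fwht` and the one-dimensional input are assumptions made for the example.

```python
def fwht(values):
    """Fast Walsh-Hadamard transform (unnormalised, natural order).

    The input length must be a power of two. Only additions and
    subtractions are used, which is the usual reason the transform
    is cheaper than a DCT.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1) != 0:
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Combine blocks of size h into blocks of size 2h (butterfly step).
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a


coefficients = fwht([1, 0, 1, 0, 0, 1, 1, 0])
print(coefficients)
```

Applying the transform twice returns the input scaled by its length, so the same routine can serve as the inverse after dividing by `len(values)`.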

P. Piccinini; A. Prati; R. Cucchiara ( 2009 ) - A Fast Multi-model Approach for Object Duplicate Extraction ( Ninth IEEE Computer Society Workshop on Application of Computer Vision (WACV 2009) - - 7-8 December 2009) ( - Proceedings of Ninth IEEE Computer Society Workshop on Application of Computer Vision (WACV 2009) ) (IEEE Washington, DC (USA) USA ) - pp. da 106 a 111 ISBN: 9781424454969 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper presents an innovative approach for localizing and segmenting duplicate objects for industrial applications. The working conditions are challenging, with complex heavily-occluded objects, arranged at random in the scene. To account for high flexibility and processing speed, this approach exploits SIFT keypoint extraction and mean shift clustering to efficiently partition the correspondences between the object model and the duplicates onto the different object instances. The re-projection (by means of a Euclidean transform) of some delimiting points onto the current image is used to segment the object shapes. This procedure is compared in terms of accuracy with existing homography-based solutions which make use of RANSAC to eliminate outliers in the homography estimation. Moreover, in order to improve the extraction in the case of reflective or transparent objects, multiple object models are used and fused together. Experimental results on different and challenging kinds of objects are reported.

S. Calderara; C. Alaimo; A. Prati; R. Cucchiara ( 2009 ) - A Real-Time System for Abnormal Path Detection ( 3rd International Conference on Imaging for Crime Detection and Prevention (ICDP-09) - - 3 December 2009) ( - Proceedings of 3rd International Conference on Imaging for Crime Detection and Prevention (ICDP-09) ) (IET Stevenage Herts GBR ) - pp. da 1 a 6 ISBN: 9781849192071 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper proposes a real-time system capable of extracting and modeling object trajectories from a multi-camera setup with the aim of identifying abnormal paths. The trajectories are modeled as a sequence of positional distributions (2D Gaussians) and clustered in the training phase by exploiting an innovative distance measure based on a global alignment technique and Bhattacharyya distance between Gaussians. An on-line classification procedure is proposed in order to classify new trajectories on the fly as either “normal” or “abnormal” (in the sense of rarely seen before, thus unusual and potentially interesting). Experiments on a real scenario will be presented.
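
The Bhattacharyya distance between two Gaussians has a closed form, which the following sketch evaluates for the 2D case. It shows only the textbook formula; how the paper combines it with global alignment is not reproduced here, and the function name and the list-of-lists covariance layout are assumptions.

```python
import math


def bhattacharyya_2d(mean1, cov1, mean2, cov2):
    """Bhattacharyya distance between two 2D Gaussians.

    D = 1/8 * d^T S^-1 d + 1/2 * ln(det S / sqrt(det S1 * det S2)),
    where d = mean1 - mean2 and S = (S1 + S2) / 2.
    Covariances are 2x2 nested lists and must be positive definite.
    """
    def det(m):
        return m[0][0] * m[1][1] - m[0][1] * m[1][0]

    # Average covariance S.
    s = [[(cov1[i][j] + cov2[i][j]) / 2.0 for j in range(2)] for i in range(2)]
    det_s = det(s)
    # Closed-form inverse of a 2x2 matrix.
    inv = [[s[1][1] / det_s, -s[0][1] / det_s],
           [-s[1][0] / det_s, s[0][0] / det_s]]
    d = [mean1[0] - mean2[0], mean1[1] - mean2[1]]
    mahalanobis = sum(d[i] * inv[i][j] * d[j] for i in range(2) for j in range(2))
    log_term = math.log(det_s / math.sqrt(det(cov1) * det(cov2)))
    return mahalanobis / 8.0 + 0.5 * log_term


identity = [[1.0, 0.0], [0.0, 1.0]]
print(bhattacharyya_2d((0.0, 0.0), identity, (2.0, 0.0), identity))
```

The distance is zero for identical Gaussians and symmetric in its arguments, which makes it usable as a per-sample cost inside an alignment procedure.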

Calderara, Simone; Cucchiara, Rita; Prati, Andrea; Vezzani, Roberto ( 2009 ) - Statistical Pattern Recognition for Multi-Camera Detection, Tracking and Trajectory Analysis ( - Multi-Camera Networks: Concepts and Applications ) (Academic Press Burlington, MA (USA) USA ) - pp. da 389 a 414 ISBN: 978 0 12 374633 7 ISSN: - [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

This chapter will address most of the aspects of modern video surveillance with reference to the research activity conducted at the University of Modena and Reggio Emilia, Italy, within the scope of the national FREE SURF (FREE SUrveillance in a pRivacy-respectFul way) and NATO-funded BE SAFE (Behavioral lEarning in Surveilled Areas with Feature Extraction) projects. Moving object detection and tracking from a single camera, multi-camera consistent labeling and trajectory shape analysis for path classification will be the main topics of this chapter.

Calderara, Simone; Prati, Andrea; Cucchiara, Rita ( 2008 ) - HECOL: Homography and Epipolar-based Consistent Labeling for Outdoor Park Surveillance (Academic Press Incorporated:6277 Sea Harbor Drive:Orlando, FL 32887:(800)543-9534, (407)345-4100, EMAIL: ap@acad.com, INTERNET: http://www.idealibrary.com, Fax: (407)352-3445 ) - COMPUTER VISION AND IMAGE UNDERSTANDING - n. volume 111(1) - pp. da 21 a 42 ISSN: 1077-3142 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Outdoor surveillance is one of the most attractive applications of video processing and analysis. Robust algorithms must be defined and tuned to cope with the non-idealities of outdoor scenes. For instance, in a public park, an automatic video surveillance system must discriminate between shadows, reflections, waving trees, people standing still or moving, and other objects. Visual knowledge coming from multiple cameras can disambiguate cluttered and occluded targets by providing a continuous consistent labeling of tracked objects among the different views. This work proposes a new approach for coping with this problem in multi-camera systems with overlapped Fields of View (FoVs). The presence of overlapped zones allows the definition of a geometry-based approach to reconstruct correspondences between FoVs, using only homography and epipolar lines (hereinafter HECOL: Homography and Epipolar-based COnsistent Labeling) computed automatically with a training phase. We also propose a complete system that provides segmentation and tracking of people in each camera module. Segmentation is performed by means of the SAKBOT (Statistical and Knowledge Based Object Tracker) approach, suitably modified to cope with multi-modal backgrounds, reflections and other artefacts, typical of outdoor scenes. The extracted objects are tracked using a statistical appearance model robust against occlusions and segmentation errors. The main novelty of this paper is the approach to consistent labeling. A specific Camera Transition Graph is adopted to efficiently select the possible correspondence hypotheses between labels. A Bayesian MAP optimization assigns consistent labels to objects detected from several points of view: the object axis is computed from the shape tracked in each camera module and homography and epipolar lines allow a correct axis warping in other image planes. 
Both forward and backward probability contributions from the two different warping directions make the approach robust against segmentation errors, and capable of disambiguating groups of people. The system has been tested in a real setup of an urban public park, within the Italian LAICA (Laboratory of Ambient Intelligence for a friendly city) project. The experiments show how the system can correctly track and label objects in a distributed system with real-time performance. Comparisons with simpler consistent labeling methods and extensive outdoor experiments with ground truth demonstrate the accuracy and robustness of the proposed approach.
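
To make the idea of warping an object axis into another image plane more concrete, the sketch below applies a 3x3 homography to the two endpoints of an axis segment. It covers only the homography part and ignores the epipolar constraint; the function names, the matrix values and the choice of representing an axis by two endpoints are assumptions, not the HECOL implementation.

```python
def apply_homography(h, point):
    """Map an image point (x, y) through a 3x3 homography given as nested lists."""
    x, y = point
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (xh / w, yh / w)


def warp_axis(h, axis):
    """Warp a segment given by two endpoints; lines map to lines under a homography."""
    return tuple(apply_homography(h, p) for p in axis)


# Hypothetical homography: a pure translation by (5, -2) pixels.
translation = [[1.0, 0.0, 5.0], [0.0, 1.0, -2.0], [0.0, 0.0, 1.0]]
print(warp_axis(translation, ((10.0, 100.0), (10.0, 40.0))))
```

A ground-plane homography is exact only for points lying on the ground plane, which is one reason a second geometric cue is needed for the upper end of the axis.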

Calderara, S.; Cucchiara, R.; Prati, A. ( 2008 ) - Action Signature: a Novel Holistic Representation for Action Recognition ( IEEE International Conference on Advanced Video and Signal Based Surveillance - - 1-3 September 2008) ( - AVSS 2008 : IEEE Fifth International Conference on Advanced Video and Signal Based Surveillance ) (IEEE Danvers (MA) USA ) - pp. da 121 a 128 ISBN: 9780769533414 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Recognizing different actions with a unique approach can be a difficult task. This paper proposes a novel holistic representation of actions that we call "action signature". This 1D trajectory is obtained by parsing the 2D image containing the orientations of the gradient calculated on the motion feature map called motion-history image. In this way, the trajectory is a sketch representation of how the object motion varies in time. A robust statistical framework based on mixtures of von Mises distributions and dynamic programming for sequence alignment is used to compare and classify actions/trajectories. The experimental results show a rather high accuracy in distinguishing quite complicated actions, such as drinking, jumping, or abandoning an object.
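
Sequence alignment by dynamic programming can be sketched for sequences of orientations as below. This is a plain global alignment with a circular angle difference as the match cost; the paper's statistical cost based on von Mises mixtures is not reproduced, and the function names and the gap penalty are assumptions chosen for illustration.

```python
import math


def angular_diff(a, b):
    """Smallest absolute difference between two angles, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def global_alignment_cost(seq_a, seq_b, gap=1.0):
    """Minimum cost of globally aligning two angle sequences.

    Matching two samples costs their angular difference; skipping a
    sample in either sequence costs `gap` (an assumed penalty).
    """
    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = min(
                dp[i - 1][j - 1] + angular_diff(seq_a[i - 1], seq_b[j - 1]),
                dp[i - 1][j] + gap,   # skip a sample of seq_a
                dp[i][j - 1] + gap,   # skip a sample of seq_b
            )
    return dp[n][m]


print(global_alignment_cost([0.0, 1.0, 2.0], [0.0, 2.0]))
```

Because gaps are allowed, two executions of the same action performed at different speeds can still receive a low alignment cost.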

S. Calderara; A. Prati; R. Cucchiara ( 2008 ) - A Markerless Approach for Consistent Action Recognition in a Multi-camera System ( ACM/IEEE International Conference on Distributed Smart Cameras - - 7-10 September 2008) ( - Proceedings of ICDSC 2008 ) (IEEE Computer Society Washington, DC, USA USA ) - pp. da 1 a 8 ISBN: 9781424426645 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper presents a method for recognizing human actions in a multi-camera setup. The proposed method automatically extracts significant points on the human body, without the need for artificial markers. A sophisticated appearance-based tracker able to cope with occlusions is exploited to extract a probability map for each moving object. A segmentation technique based on mixture of Gaussians is then employed to extract and track significant points on this map, corresponding to significant regions on the human silhouette. The point tracking produces a set of 3D trajectories that are compared with other trajectories by means of global alignment and dynamic programming techniques. Preliminary experiments show the potential of the proposed approach.

G. Gualdi; A. Albarelli; A. Prati; A. Torsello; M. Pelillo; R. Cucchiara ( 2008 ) - Using Dominant Sets for Object Tracking with Freely Moving Camera ( Workshop on Visual Surveillance - - 17 October 2008) ( - Proceedings of VS 2008 ) (- Marseille FRA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Object tracking with freely moving cameras is an open issue, since background information cannot be exploited for foreground segmentation, and plain feature tracking is not robust enough for target tracking, due to occlusions, distractors and object deformations. In order to deal with such challenging conditions a traditional approach, based on Camshift-like color-based features, is augmented by introducing a structural model of the object to be tracked incorporating previous knowledge about the spatial relations between the parts. Hence, an attributed graph is built on top of the features extracted from each frame and a graph matching technique is used to extract the optimal match with the model. Pixel-wise and object-wise comparison with other tracking techniques with respect to manually obtained ground truth are presented.

G. Gualdi; A. Prati; R. Cucchiara; E. Ardizzone; M. La Cascia; L. Lo Presti; M. Morana ( 2008 ) - Enabling Technologies on Hybrid Camera Networks for Behavioral Analysis of Unattended Indoor Environments and Their Surroundings ( 1st ACM International Workshop on Vision Network for Behaviour Analysis - - 31 October 2008) ( - Proceedings of VNBA 2008 ) (ACM New York, NY USA ) - pp. da 101 a 108 ISBN: 9781605583136 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper presents a layered network architecture and the enabling technologies for accomplishing vision-based behavioral analysis of unattended environments. Specifically, the vision network covers both the attended environment and its surroundings by means of hybrid cameras. The layer overlooking the surroundings is deployed outdoors and tracks people, monitoring entrance/exit points. It recovers the geometry of the site under surveillance and communicates people's positions to a higher-level layer. The layer monitoring the unattended environment pursues similar goals, with the addition of maintaining a global mosaic of the observed scene for further understanding. Moreover, it merges information coming from non-vision sensors to deepen the understanding or increase the reliability of the system. The behavioral analysis is delegated to a third layer that merges the information received from the two other layers and infers knowledge about what happened, what is happening, and what is likely to happen in the environment. The paper also describes a case study that was implemented in the Engineering Campus of the University of Modena and Reggio Emilia, where our surveillance system has been deployed in a computer laboratory which was often inaccessible due to lack of attendance.

R. Cucchiara; A. Prati; R. Vezzani ( 2007 ) - A Multi-Camera Vision System for Fall Detection and Alarm Generation - EXPERT SYSTEMS - n. volume 24 (4) - pp. da 334 a 345 ISSN: 0266-4720 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In-house video surveillance can represent an excellent support for people with some difficulties (e.g. elderly or disabled people) living alone and with a limited autonomy. New hardware technologies and in particular digital cameras are now affordable and they have recently gained credit as tools for (semi-)automatically assuring people's safety. In this paper a multi-camera vision system for detecting and tracking people and recognizing dangerous behaviours and events such as a fall is presented. In such a situation a suitable alarm can be sent, e.g. by means of an SMS. A novel technique of warping people's silhouette is proposed to exchange visual information between partially overlapped cameras whenever a camera handover occurs. Finally, a multi-client and multi-threaded transcoding video server delivers live video streams to operators/remote users in order to check the validity of a received alarm. Semantic and event-based transcoding algorithms are used to optimize the bandwidth usage. A two-room setup has been created in our laboratory to test the performance of the overall system and some of the results obtained are reported.

C. Grana; R. Cucchiara ( 2007 ) - Linear Transition Detection as a Unified Shot Detection Approach - IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY - n. volume 17 (4) - pp. da 483 a 489 ISSN: 1051-8215 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In this paper, we propose an automatic system for video shot segmentation, called Linear Transition Detector (LTD), which provides a single approach for detecting both cuts and linear transitions. Comparison with publicly available shot detection systems is reported on different sports (Formula 1, basketball, soccer and cycling) and TRECVID 2005 results are also reported.

C. Grana; D. Vanini; S. Seidenari; G. Pellacani; R. Cucchiara ( 2007 ) - Network patterns recognition for automatic dermatoscopic images classification ( Medical Imaging 2007 - - Feb 17-22) ( - Proceedings of SPIE Medical Imaging ) (SPIE - The International Society for Optical Engineering Bellingham, WA USA ) - n. volume 6512 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we focus on the problem of automatic classification of melanocytic lesions, aiming at identifying the presence of reticular patterns. The recognition of reticular lesions is an important step in the description of the pigmented network, in order to obtain meaningful diagnostic information. Parameters like color, size or symmetry could benefit from the knowledge of having a reticular or non-reticular lesion. The detection of network patterns is performed with a three-step procedure. The first step is the localization of line points, by means of the line points detection algorithm first described by Steger. The second step is the linking of such points into a line considering the direction of the line at its endpoints and the number of line points connected to these. Finally, a third step discards the meshes which could not be closed at the end of the linking procedure and the ones characterized by anomalous values of area or circularity. The number of the valid meshes left and their area with respect to the whole area of the lesion are the inputs of a discriminant function which classifies the lesions into reticular and non-reticular. This approach was tested on two balanced (both sets are formed by 50 reticular and 50 non-reticular images) training and testing sets. We obtained above 86% correct classification of the reticular and non-reticular lesions on real skin images, with a specificity value never lower than 92%.
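
The third step and the two inputs of the discriminant function can be sketched as follows. Circularity is computed with the common definition 4πA/P², which the abstract does not spell out; the thresholds, the dictionary layout of a mesh and the function names are all assumptions, and the discriminant function itself is not reproduced.

```python
import math


def circularity(area, perimeter):
    """Common circularity measure 4*pi*A / P^2 (1.0 for a perfect circle)."""
    return 4.0 * math.pi * area / perimeter ** 2


def mesh_features(meshes, lesion_area, min_area, max_area, min_circularity):
    """Discard anomalous meshes and return (valid mesh count, valid area ratio).

    Each mesh is a dict with "area" and "perimeter" keys (assumed layout);
    the three thresholds are hypothetical tuning parameters.
    """
    valid = [
        m for m in meshes
        if min_area <= m["area"] <= max_area
        and circularity(m["area"], m["perimeter"]) >= min_circularity
    ]
    area_ratio = sum(m["area"] for m in valid) / lesion_area
    return len(valid), area_ratio


meshes = [
    {"area": 10.0, "perimeter": 12.0},   # plausible mesh
    {"area": 1.0, "perimeter": 40.0},    # too small and too elongated
    {"area": 500.0, "perimeter": 80.0},  # too large
]
print(mesh_features(meshes, lesion_area=100.0, min_area=5.0,
                    max_area=50.0, min_circularity=0.5))
```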

L. BERTELLI; R. CUCCHIARA; G. PATERNOSTRO; A. PRATI ( 2006 ) - A semi-automatic system for segmentation of cardiac M-mode images - PATTERN ANALYSIS AND APPLICATIONS - n. volume 9 (4) - pp. da 293 a 306 ISSN: 1433-7541 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Pixel classifiers are often adopted in pattern recognition as a suitable method for image segmentation. A common approach to the performance evaluation of classifier systems is based on the measurement of the classification errors and, at the same time, on the computational time. In general, multiclassifiers have proven to be more precise in the classification in many applications, but at the cost of a higher computational load. This paper analyzes different classifiers and proposes an evaluation of the classifiers in the case of semi-automatic processes with human interaction. Medical imaging is a typical application, where automatic or semi-automatic segmentation can be a valuable support to the diagnosis. The paper focuses on the segmentation of cardiac images of fruit flies (a genetic model for analyzing human heart diseases). Analysis is based on M-modes, which are gray-level images derived from mono-dimensional projections of the video frames on a line. Segmentation of the M-mode images is provided by classifiers and integrated in a multiclassifier. A neural network classifier, a Bayesian classifier, and a classifier based on hidden Markov chains are joined by means of a Behavior Knowledge Space fusion rule. The comparative evaluation is discussed in terms of both accuracy and required time, in which the time to correct the classifier errors by means of human intervention is also taken into account.
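
A Behavior Knowledge Space rule can be pictured as a lookup table indexed by the joint decisions of the individual classifiers. The sketch below is a minimal version under that reading: the table stores the most frequent true label seen in training for each decision tuple, and the majority-vote fallback for unseen tuples is an assumption, as are the function names and the toy labels.

```python
from collections import Counter, defaultdict


def train_bks(decisions, labels):
    """Build the BKS table: decision tuple -> most frequent true label."""
    cells = defaultdict(Counter)
    for combo, label in zip(decisions, labels):
        cells[tuple(combo)][label] += 1
    return {combo: counts.most_common(1)[0][0] for combo, counts in cells.items()}


def bks_predict(table, combo):
    """Fuse one tuple of classifier outputs using the trained table."""
    combo = tuple(combo)
    if combo in table:
        return table[combo]
    # Unseen combination: fall back to majority voting (assumed policy).
    return Counter(combo).most_common(1)[0][0]


# Toy training data: outputs of three classifiers per pixel, plus ground truth.
train_decisions = [("a", "a", "b"), ("a", "a", "b"), ("a", "a", "b"), ("b", "b", "a")]
train_labels = ["b", "b", "a", "b"]
table = train_bks(train_decisions, train_labels)
print(bks_predict(table, ("a", "a", "b")))
```

In the toy data the fused answer for `("a", "a", "b")` is `"b"` even though two of three classifiers say `"a"`, which is the kind of systematic disagreement the table is meant to capture.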

R. Cucchiara; A. Prati; R. Vezzani ( 2006 ) - A system for automatic face obscuration for privacy purposes - PATTERN RECOGNITION LETTERS - n. volume 27 (15) - pp. da 1809 a 1815 ISSN: 0167-8655 [Articolo in rivista (262) - Articolo su rivista]
Abstract

This work proposes a method for automatic face obscuration capable of protecting people's identity. Since face detection heavily benefits from the possibility to exploit tracking, multi-camera people tracking has been integrated with a face detector based on colour clustering and Hough transform. Moreover, the multiple viewpoints provided by multiple cameras are exploited in order to always obtain a good-quality image of the face. The identity of people in different views is kept consistent by means of a geometrical, uncalibrated approach based on homographies. Experimental results show the accuracy of the proposed approach. (c) 2006 Elsevier B.V. All rights reserved.

M. BERTINI; R. CUCCHIARA; A. DEL BIMBO; A. PRATI ( 2006 ) - Semantic adaptation of sport videos with user-centred performance analysis - IEEE TRANSACTIONS ON MULTIMEDIA - n. volume 8 (3) - pp. da 433 a 443 ISSN: 1520-9210 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In semantic video adaptation, measures of performance must consider the impact of the errors in the automatic annotation over the adaptation in relationship with the preferences and expectations of the user. In this paper, we define two new performance measures, Viewing Quality Loss and Bit-rate Cost Increase, that are obtained from classical peak signal-to-noise ratio (PSNR) and bit rate, and relate the results of semantic adaptation to the errors in the annotation of events and objects and the user's preferences and expectations. We present and discuss results obtained with a system that performs automatic annotation of soccer sport video highlights and applies different coding strategies to different parts of the video according to their relative importance for the end user. With reference to this framework, we analyze how highlights' statistics and the errors of the annotation engine influence the performance of semantic adaptation and reflect into the quality of the video displayed at the user's client and the increase of transmission costs.
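
For reference, the classical PSNR on which the two measures build can be computed as in the sketch below. Only the standard definition is shown; the definitions of Viewing Quality Loss and Bit-rate Cost Increase are not given in the abstract and are not attempted. The flat-list pixel layout and the 8-bit peak value are assumptions.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


original = [100, 100, 100, 100]
decoded = [110, 90, 110, 90]  # every pixel is off by 10 grey levels
print(round(psnr(original, decoded), 2))
```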

R. Cucchiara; C. Grana; A. Prati; R. Vezzani ( 2006 ) - A Distributed Domotic Surveillance System ( - Intelligent Distributed Video Surveillance Systems ) (IEE Press LONDON GBR ) - pp. da 91 a 117 ISBN: 9780863415043 ISSN: - [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Distributed video surveillance has a direct application in intelligent home automation or domotics (from the Latin word domus, that means “home”, and informatics); in particular, in-house video surveillance can provide good support for people with some difficulties (e.g., elderly or disabled people) living alone and with a limited autonomy. New hardware technologies for surveillance are now affordable and provide high reliability. Problems related to reliable software solutions are not completely solved, especially concerning the application of general-purpose computer vision techniques in indoor environments. Indeed, assuming the objective is to detect the presence of people, track them, and recognize dangerous behaviours by means of abrupt changes in their posture, robust techniques must cope with non-trivial difficulties. In particular, luminance changes and shadows must be taken into account, frequent posture changes must be faced, and large and long-lasting occlusions are common due to the vicinity of the cameras and the presence of furniture and doors that can often hide parts of the person’s body. These problems are analyzed and solutions based on background suppression, appearance-based probabilistic tracking, and probabilistic reasoning for posture recognition are described.

C. Grana; R. Vezzani; D. Bulgarelli; R. Cucchiara ( 2006 ) - MPEG-7 Pictorially Enriched Ontologies for Video Annotation ( Seconda Conferenza Italiana sui Sistemi Intelligenti - - Sep 27-29) ( - Atti della Seconda Conferenza Italiana sui Sistemi Intelligenti ) (- Ancona ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A system for the automatic creation of Pictorially Enriched Ontologies is presented, that is, ontologies for context-based video digital libraries, enriched by pictorial concepts for video annotation, summarization and similarity-based retrieval. Extraction of pictorial concepts with video clips clustering, ontology storing with MPEG-7, and the use of the ontology for stored video annotation are described. Results on sport videos and TRECVID2005 video material are reported.

A. HAKEEM; R. VEZZANI; S. SHAH; R. CUCCHIARA ( 2006 ) - Estimating Geospatial Trajectory of a Moving Camera ( ICPR 2006 - - 20-24 Aug) ( - Proc. of International Conference on Pattern Recognition (ICPR 2006) ) (IEEE Computer Society Los Alamitos, California USA ) - n. volume 2 - pp. da 82 a 87 ISBN: 9780769525211 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper proposes a novel method for estimating the geospatial trajectory of a moving camera. The proposed method uses a set of reference images with known GPS (global positioning system) locations to recover the trajectory of a moving camera using geometric constraints. The proposed method has three main steps. First, scale-invariant feature transform (SIFT) features are detected and matched between the reference images and the video frames to calculate a weighted adjacency matrix (WAM) based on the number of SIFT matches. Second, using the estimated WAM, the maximum matching reference image is selected for the current video frame, which is then used to estimate the relative position (rotation and translation) of the video frame using the fundamental matrix constraint. The relative position is recovered up to a scale factor, and a triangulation among the video frame and two reference images is performed to resolve the scale ambiguity. Third, an outlier rejection and trajectory smoothing (using b-spline) post-processing step is employed. This is because the estimated camera locations may be noisy due to bad point correspondence or degenerate estimates of fundamental matrices. Results of recovering camera trajectory are reported for real sequences.
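
The selection of the maximum matching reference image can be illustrated with a very small sketch. It assumes the WAM is stored as one row per video frame and one column per reference image, with raw SIFT match counts as weights; the abstract does not specify the weighting, so the layout, the function name and the numbers are hypothetical.

```python
def best_reference_per_frame(wam):
    """Return, for each video frame, the index of the reference image
    with the largest weight in that frame's row of the WAM."""
    best = []
    for row in wam:
        if not row:
            raise ValueError("each frame needs at least one reference image")
        # max() keeps the first index when several weights are equal.
        best.append(max(range(len(row)), key=lambda k: row[k]))
    return best


# Hypothetical match counts: 2 video frames against 3 reference images.
wam = [
    [3, 10, 2],
    [0, 1, 7],
]
print(best_reference_per_frame(wam))
```

The selected reference image is then the one against which the relative pose of the frame would be estimated in the second step.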

S. CALDERARA; R. CUCCHIARA; A. PRATI ( 2006 ) - The LAICA project: Experiments on Multicamera People Tracking and Logging ( Conferenza Italiana Sistemi Intelligenti - - 27-29 September 2006) ( - Atti di CISI 2006 ) (- - ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Logging information on moving objects is crucial in video surveillance systems. Distributed multi-camera systems can provide the appearance of objects/people from different viewpoints and at different resolutions, allowing a more complete and precise logging of the information. This is achieved through consistent labeling to correlate collected information of the same person. This paper proposes a novel approach to consistent labeling also capable of fully characterizing groups of people and managing mis-segmentations. The ground-plane homography and the epipolar geometry are automatically learned and exploited to warp objects’ principal axes between overlapped cameras. A MAP estimator that exploits two contributions (forward and backward) is used to choose the most probable label configuration to be assigned at the handoff of a new object. Extensive experiments demonstrate the accuracy of the proposed method in detecting single and simultaneous handoffs, mis-segmentations, and groups.

Calderara, S.; Melli, R.; Prati, A.; Cucchiara, R. ( 2006 ) - Reliable background suppression for complex scenes ( 4th ACM international workshop on Video surveillance and sensor networks - - 27 October 2006) ( - Proceedings of the 4th ACM international workshop on Video surveillance and sensor networks ) (ACM New York (NY) USA ) - pp. da 211 a 214 ISBN: 9781595934963; 9781604232486 | 9781604232486 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper describes a system for motion detection based on background suppression, specifically conceived for working in complex scenes with vacillating background, camouflage, illumination changes, etc. The system contains proper techniques for background bootstrapping, shadow removal, ghost suppression and selective updating of the background model. The results on the challenging videos provided in VSSN '06 Open Source Algorithm Competition dataset demonstrate that the proposed system outperforms the widely-used mixture-of-Gaussians approach.

A. PRATI; F. SEGHEDONI; R. CUCCHIARA ( 2006 ) - Fast Dynamic Mosaicing and Person Following ( Proc. of International Conference on Pattern Recognition - - 20-24 August 2006) ( - Proceedings of ICPR 2006 ) (IEEE Computer Society Los Alamitos, California USA ) - n. volume 4 - pp. da 920 a 923 ISBN: 9780769525211 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A system for video surveillance purposes in wide areas based on active cameras, also capable of following a person in the scene by keeping him framed, is presented. The proposed approach is based on the so-called direction histograms to compute the ego-motion and on frame differencing for detecting moving objects. It exploits post-processing and active contours to extract the precise shape of moving objects to be fed to a probabilistic algorithm to track moving people in the scene. Person following, instead, is based on simple heuristic rules that move the camera as soon as the selected person is close to the border of the field of view. Experimental results on a live active camera demonstrate the feasibility of real-time person following.
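
A heuristic of the kind described for person following might look like the sketch below, which requests a camera move only when the tracked person's centre enters a margin near the image border. The margin value, the sign convention and the function name are assumptions; the paper's actual rules are not given in the abstract.

```python
def follow_command(center, frame_size, margin=0.15):
    """Return (pan, tilt) steps in {-1, 0, 1} for a tracked person.

    center:     (x, y) of the person in pixels, y growing downward
    frame_size: (width, height) of the image in pixels
    margin:     fraction of the frame treated as "close to the border"
    A value of 0 means the camera stays still along that axis.
    """
    cx, cy = center
    width, height = frame_size
    pan = -1 if cx < margin * width else (1 if cx > (1 - margin) * width else 0)
    tilt = -1 if cy < margin * height else (1 if cy > (1 - margin) * height else 0)
    return pan, tilt


print(follow_command((320, 240), (640, 480)))  # person centred in the frame
print(follow_command((630, 470), (640, 480)))  # person near the bottom-right corner
```

Keeping the camera still while the person is well inside the frame limits the number of ego-motion estimates the rest of the pipeline has to absorb.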

C. Grana; R. Cucchiara; G. Pellacani; S. Seidenari ( 2006 ) - Line Detection and Texture Characterization of Network Patterns ( International Conference on Pattern Recognition - - Aug 20-24) ( - Proceedings of International Conference on Pattern Recognition ) (IEEE Computer Society Los Alamitos, CA USA ) - n. volume 2 - pp. da 275 a 278 ISBN: 9780769525211 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper describes a complete approach to detect, localize and describe network patterns. Such texture is automatically detected with Gaussian derivative kernels and Fisher linear discriminant analysis; line closure and thinning are provided by morphological masking, and line luminance profile fitting provides width estimation. Detection results on dermatological images are reported and discussed.

R. VEZZANI; R. CUCCHIARA; A. MALIZIA; L. CINQUE ( 2006 ) - 3-D Virtual Environments on Mobile Devices for Remote Surveillance ( IEEE International Conference on Advanced Video and Signal-Based Surveillance 2006 - - 22-24 November 2006) ( - Proceedings of AVSS 2006 ) (IEEE Computer Society Washington, DC USA ) - n. volume 1 - pp. da 100 a 104 ISBN: 9780769526881 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we present a distributed video surveillance framework. Our goal is the remote monitoring of the behavior of people moving in a scene by exploiting a virtual reconstruction on low-capability devices, like PDAs and cell phones. The main novelty of this system is the effective integration of the computer vision and computer graphics modules. The first, using a probabilistic framework, can detect the position, the trajectory and the posture of people moving in the scene. The second exploits the new possibilities of both standard 3D graphics libraries on mobile devices (namely JSR184 and the M3G graphic format) and the processing capability of new PDAs in order to reconstruct the remote surveillance data in real time.

C. Grana; R. Vezzani; D. Bulgarelli; G. Gualdi; R. Cucchiara; M. Bertini; C. Torniai; A. Del Bimbo ( 2006 ) - PEANO: Pictorial Enriched Annotation of Video ( 14th ACM International Conference on Multimedia (ACM Multimedia 2006) - - Oct 23-27) ( - Proceedings of the 14th ACM International Conference on Multimedia (ACM Multimedia 2006) ) (ACM New York USA ) - pp. da 793 a 794 ISBN: 1595934472 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this DEMO, we present a tool set for video digital library management that allows i) structural annotation of edited videos in MPEG-7 by automatically extracting shots and clips; ii) automatic semantic annotation based on perceptual similarity against a taxonomy enriched with pictorial concepts; iii) video clip access and hierarchical summarization with stand-alone and web interface; iv) access to clips from mobile platform in GPRS-UMTS videostreaming. The tools can be applied in different domain-specific Video Digital Libraries. The main novelty is the possibility to enrich the annotation with pictorial concepts that are added to a textual taxonomy in order to make the automatic annotation process faster and often more effective. The resulting multimedia ontology is described in the MPEG-7 framework. The PEANO (Perceptual Annotation of Video) tool has been tested over video art, sport (Soccer, Olympic Games 2006, Formula 1) and news clips.

R. Cucchiara; C. Grana; D. Bulgarelli; R. Vezzani ( 2006 ) - A semi-automatic video annotation tool with MPEG-7 content collections ( Eighth IEEE International Symposium on Multimedia - - Dec 11-13) ( - Eight IEEE International Symposium on Multimedia ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 742 a 745 ISBN: 9780769527468 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this work, we present a general purpose system for hierarchical structural segmentation and automatic annotation of video clips, by means of standardized low level features. We propose to automatically extract some prototypes for each class with a context based intra-class clustering. Clips are annotated following the MPEG-7 standard directives to provide easier portability. Results of automatic annotation and semiautomatic metadata creation are provided.

M. Bertini; A. Del Bimbo; C. Torniai; C. Grana; R. Cucchiara ( 2006 ) - MOM: multimedia ontology manager. A framework for automatic annotation and semantic retrieval of video sequences ( 14th ACM International Conference on Multimedia (ACM Multimedia 2006) - - Oct 23-27) ( - Proceedings of the 14th ACM International Conference on Multimedia (ACM Multimedia 2006) ) (ACM New York USA ) - pp. da 787 a 788 ISBN: 1595934472 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Effective usage of multimedia digital libraries has to deal with the problem of building efficient content annotation and retrieval tools. MOM (Multimedia Ontology Manager) is a complete system that allows the creation of multimedia ontologies, supports automatic annotation and creation of extended text (and audio) commentaries of video sequences, and permits complex queries by reasoning on the ontology.

R. CUCCHIARA; A. PRATI; R. VEZZANI ( 2006 ) - Advanced video surveillance with pan tilt zoom cameras ( Workshop on Visual Surveillance (VS) - - 13 May 2006) ( - Proceeding of VS2006 ) (Faculty of Computing, Information Systems and Mathematics, Kingston University Kingston upon Thames, Surrey GBR ) - n. volume 1 - pp. da 49 a 56 ISBN: 00955300304 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper an advanced video surveillance system is proposed. Our goal is the detection of people’s heads to allow their obscuration for privacy issues or to perform recognition tasks. We propose a system based on active PTZ (Pan-Tilt-Zoom) cameras that produce head images having a large enough size, and can cover an area larger than still cameras. Since conventional approaches are not suitable to PTZ cameras, the proposed approach is based on the so-called direction histograms to compute the ego-motion and on frame differencing for detecting moving objects. It exploits post-processing and active contours to extract the precise shape of moving objects to be fed to a probabilistic algorithm to track moving people in the scene. Person following, instead, is based on simple heuristic rules that move the camera as soon as the selected person is close to the border of the field of view. Finally, a color and shape based head detection that takes advantage of the people tracking is presented. Experimental results on a live active camera demonstrate the feasibility of real-time person following and of the subsequent head detection phase.

C. Grana; D. Bulgarelli; R. Cucchiara ( 2006 ) - Video Clip Clustering for Assisted Creation of MPEG-7 Pictorially Enriched Ontologies ( Second International Symposium on Communications, Control and Signal Processing - - Mar 13-15) ( - Proceedings of Second International Symposium on Communications, Control and Signal Processing ) (SuviSoft Oy Ltd. Tampere FIN ) - pp. da 904 a 907 ISBN: 9782908849172 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present a system for the assisted creation of Pictorially Enriched Ontologies, that is ontologies for context-based digital libraries enriched by pictorial concepts for video annotation, summarization and similarity based retrieval. Here we detail the approach for video clips clustering and pictorial concepts extraction together with the approach for storing the ontology within the MPEG-7 framework. The clustering is performed by Complete Link hierarchical clustering on color histograms and motion features. Results on Formula 1 TV material are reported.

E. Perini; S. Soria; A. Prati; R. Cucchiara ( 2006 ) - FaceMouse: a Human-Computer Interface for Tetraplegic People ( Intern. Workshop on Human-Computer Interaction (HCI) - - May 7) ( - Proc. of Intern. Workshop on Human-Computer Interaction (HCI) ) (Springer Washington, DC USA ) - pp. da 99 a 108 ISBN: 03029743 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper proposes a new human-machine interface particularly conceived for people with severe disabilities (specifically tetraplegic people), that allows them to interact with the computer for their everyday life by means of the mouse pointer. In this system, called FaceMouse, instead of the classical pointer paradigm that requires the user to look at the point where to move, we propose to use a paradigm called derivative paradigm, where the user does not indicate the precise position, but the direction along which the mouse pointer must be moved. The proposed system is composed of a common, low-cost webcam and a set of computer vision techniques developed to identify the parts of the user's face (the only body part that a tetraplegic person can move) and exploit them for moving the pointer. Specifically, the implemented algorithm is based on template matching to track the nose of the user and on cross-correlation to calculate the best match. Finally, several real applications of the system are described and experimental results carried out by disabled people are reported.
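The two ideas named in this abstract, template matching scored by cross-correlation and the derivative paradigm, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the function names `ncc`, `best_match` and `pointer_velocity`, the `gain` and `dead_zone` parameters, and the use of nested lists as grayscale images.

```python
from math import sqrt

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized 2D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = sqrt(sum((x - mean_a) ** 2 for x in a) *
               sum((y - mean_b) ** 2 for y in b))
    # A constant patch has zero variance; treat it as "no correlation".
    return num / den if den > 0 else 0.0

def best_match(image, template):
    """Return the (row, col) of the top-left corner with the highest NCC."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -2.0, (0, 0)
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

def pointer_velocity(nose, rest, gain=2.0, dead_zone=1):
    """Derivative paradigm (hypothetical mapping): the offset of the tracked
    nose from a rest position sets the pointer's direction and speed,
    not its absolute position. Returns (vx, vy)."""
    dy, dx = nose[0] - rest[0], nose[1] - rest[1]
    vy = 0.0 if abs(dy) <= dead_zone else gain * dy
    vx = 0.0 if abs(dx) <= dead_zone else gain * dx
    return (vx, vy)
```

In such a scheme the template would be a small image of the nose captured at start-up, `best_match` would be run on each webcam frame, and the resulting offset would drive the pointer until the user returns to the rest position.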

C. Grana; G. Pellacani; S. Seidenari; R. Cucchiara ( 2006 ) - Distance transform for automatic dermatologic images composition ( Medical Imaging 2006: Image Processing - - Feb 13-16) ( - Medical Imaging 2006: Image Processing ) (SPIE Bellingham, WA USA ) - n. volume 6144 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we focus on the problem of automatically registering dermatological images, because even if different products are available, most of them share the problem of a limited field of view on the skin. A possible solution is then the composition of multiple takes of the same lesion with digital software, such as that for panorama image creation. In this work, to perform an automatic selection of matching points the Harris Corner Detector is used, and to cope with outlier couples we employed the RANSAC method. Projective mapping is then used to match the two images. Given a set of correspondence points, Singular Value Decomposition was used to compute the transform parameters. At this point the two images need to be blended together. One initial assumption is often implicitly made: the aim is to merge two rectangular images. But when merging occurs between more than two images iteratively, this assumption will fail. To cope with differently shaped images, we employed the Distance Transform and provided a weighted merging of images. Different tests were conducted with dermatological images, both with a standard rectangular frame and with atypical shapes, for example a ring due to the objective and lens selection. The successive composition of different circular images with other blending functions, such as the Hat function, doesn’t correctly get rid of the border, and residuals of the circular mask are still visible. By applying Distance Transform blending, the result produced is insensitive to the outer shape of the image.
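The distance-transform blending idea can be sketched as follows: each image is weighted, pixel by pixel, by how far that pixel lies from the edge of the image's valid region, so the seam fades out whatever the region's shape. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses a two-pass city-block distance for brevity (the abstract does not say which metric was used), assumes both images are already registered on a common canvas, and the names `distance_transform` and `blend` are hypothetical.

```python
def distance_transform(mask):
    """City-block distance from each valid pixel (1) to the nearest
    invalid pixel (0); pixels outside the canvas count as invalid."""
    h, w = len(mask), len(mask[0])
    inf = h + w
    d = [[inf if mask[r][c] else 0 for c in range(w)] for r in range(h)]
    # Forward pass: propagate distances from the top-left.
    for r in range(h):
        for c in range(w):
            if d[r][c]:
                up = d[r - 1][c] if r > 0 else 0
                left = d[r][c - 1] if c > 0 else 0
                d[r][c] = min(d[r][c], up + 1, left + 1)
    # Backward pass: propagate distances from the bottom-right.
    for r in reversed(range(h)):
        for c in reversed(range(w)):
            if d[r][c]:
                down = d[r + 1][c] if r < h - 1 else 0
                right = d[r][c + 1] if c < w - 1 else 0
                d[r][c] = min(d[r][c], down + 1, right + 1)
    return d

def blend(img_a, mask_a, img_b, mask_b):
    """Weighted merge of two registered images; weights are the distance
    transforms of their validity masks."""
    wa, wb = distance_transform(mask_a), distance_transform(mask_b)
    out = []
    for r in range(len(img_a)):
        row = []
        for c in range(len(img_a[0])):
            total = wa[r][c] + wb[r][c]
            if total:
                row.append((wa[r][c] * img_a[r][c] +
                            wb[r][c] * img_b[r][c]) / total)
            else:
                row.append(0.0)  # neither image covers this pixel
        out.append(row)
    return out
```

Because the weight falls to zero at the boundary of each mask, a circular or ring-shaped valid region is handled the same way as a rectangular one.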

R. Cucchiara; C. Grana; A. Prati; R. Vezzani ( 2005 ) - A computer vision system for in-house video surveillance - IEE PROCEEDINGS. VISION, IMAGE AND SIGNAL PROCESSING - n. volume 152 (2) - pp. da 242 a 249 ISSN: 1350-245X [Articolo in rivista (262) - Articolo su rivista]
Abstract

In-house video surveillance to control the safety of people living in domestic environments is considered. In this context, common problems and general purpose computer vision techniques are discussed and implemented in an integrated solution comprising a robust moving object detection module which is able to disregard shadows, a tracking module designed to handle large occlusions, and a posture detector. These factors (shadows, large occlusions and people's posture) are the key problems that are encountered with in-house surveillance systems. A distributed system with cameras installed in each room of a house can be used to provide full coverage of people's movements. Tracking is based on a probabilistic approach in which the appearance and probability of occlusions are computed for the current camera and warped in the next camera's view by positioning the cameras to disambiguate the occlusions. The application context is the emerging area of domotics (from the Latin word domus, meaning 'home', and informatics), and in particular indoor video surveillance, which makes it possible for elderly and disabled people to live with a sufficient degree of autonomy via interaction with this new technology, which can be distributed in a house at affordable costs and with high reliability.

C. Grana; G. Tardini; R. Cucchiara ( 2005 ) - Adaptation and Annotation of Formula 1 Sport Videos ( First Italian Research Conference on Digital Library Management Systems - - Jan 28) ( - Post-proceedings of the First Italian Research Conference on Digital Library Management Systems ) (ISTI-CNR Pisa ITA ) - pp. da 85 a 90 ISBN: 0000000000 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we approach the problem of detecting editing features suitable for video annotation, by paying attention to artifacts and effects introduced in video editing. In particular, a linear transition detection algorithm is presented, which can characterize the transition center and length with high precision. The technique works with sub-frame granularity and is able to include both abrupt cuts and longer dissolves in a single approach. Theoretical justification for the algorithm is provided with an optimization technique for real cases. We present results obtained exploiting the editing features on a Formula 1 video digital library, detecting replays and providing pre-classification hints for automatic shot annotation.

R. CUCCHIARA; A. PRATI; C. OSTI; S. PAVANI ( 2005 ) - Ambient Intelligence in Urban Environments ( Nono Congresso della Associazione Italiana per l’Intelligenza Artificiale - - 20 September 2005) ( - Atti del Nono Congresso della Associazione Italiana per l’Intelligenza Artificiale ) (Associazione Italiana per l'Intelligenza Artificiale - ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper reports advances achieved within a project called LAICA (Laboratorio di Ambient Intelligence per una Città Amica) on Ambient Intelligence in urban environments. The overall LAICA architecture is described and the unified operative centre developed by Regulus SpA (partner of the project) to collect and correlate data from different sensors and prototypes is depicted. Moreover, the paper describes the results obtained in developing a system for video surveillance in public parks, devoted to creating a mosaic image of the scene and to extracting and tracking moving people. In addition, the system takes the privacy issues into account, proposing a method for face detection and tracking able to obscure faces in order to protect people’s identity.

R. Cucchiara; C. Grana; G. Tardini ( 2005 ) - Shot Detection for Formula 1 Video Digital Libraries ( 7th International Workshop of the EU Network of Excellence DELOS on Audio-Visual Content and Information Visualization in Digital Libraries - - May 4-6) ( - AVIVDiLib'05 Proceedings ) (Centromedia Capannori (Lucca) ITA ) - pp. da 131 a 140 ISBN: 0000000000 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Metadata extraction is one of the first tasks to be performed for automatic Digital Library annotation, and in particular shot detection has been widely explored in literature. While a lot of methods have been proposed for the detection of abrupt cuts, only a small number of them has explicitly addressed the problem of gradual transitions. In this paper we propose an algorithm that exploits a precise model of linear transition. Experimental results on Formula 1 car races videos show the robustness of this method. These test videos are characterized by extreme situations such as fast camera and objects motion and very different kinds of shots. The algorithm is able to estimate the exact length of the transition and an error score is also given as a fitness measure to the linear model, to discriminate true transitions from false detections. The final shot segmentation is delivered as an MPEG7 compliant output.
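To make the idea of a linear transition model concrete, here is a minimal sketch that fits a flat / linear ramp / flat profile to a per-frame scalar feature (for instance mean luminance) and reports the transition center, its length, and a fitting error. It is an illustration under assumptions, not the algorithm of the paper: the brute-force search, the scalar feature, the mean squared error score, and the name `fit_linear_transition` with its `max_len` parameter are all hypothetical choices.

```python
def fit_linear_transition(signal, max_len=10):
    """Fit a flat / linear ramp / flat model to a per-frame scalar feature.

    Returns (center, length, error) for the best fit, or None if the
    signal is too short. length == 1 corresponds to an abrupt cut.
    """
    n = len(signal)
    best = None
    for length in range(1, max_len + 1):
        # last_a: index of the last frame fully inside the first shot
        for last_a in range(0, n - length):
            first_b = last_a + length  # first frame fully inside the second shot
            before = signal[:last_a + 1]
            after = signal[first_b:]
            a = sum(before) / len(before)
            b = sum(after) / len(after)
            err = 0.0
            for t, value in enumerate(signal):
                # Mixing coefficient: 0 in the first shot, 1 in the second,
                # growing linearly in between.
                alpha = min(max((t - last_a) / length, 0.0), 1.0)
                predicted = a + (b - a) * alpha
                err += (value - predicted) ** 2
            err /= n
            if best is None or err < best[2]:
                best = (last_a + length / 2, length, err)
    return best
```

A low error indicates that the frames are well explained by a linear transition, while a high error suggests a false detection (for example fast motion). Centers can fall between frames, which is one simple way to express sub-frame positions.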

R. CUCCHIARA; R. VEZZANI ( 2005 ) - Assessing Temporal Coherence for Posture Classification with Large Occlusions ( IEEE Computer Society Workshop on Motion and Video Computing - - 5-7 January 2005) ( - Proceedings of Motion 2005 ) (IEEE Computer Society Washington, DC USA ) - n. volume 2 - pp. da 269 a 274 ISBN: 07695227182 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we present a people posture classification approach especially devoted to cope with occlusions. In particular, the approach aims at assessing temporal coherence of visual data over probabilistic models. A mixed predictive and probabilistic tracking is proposed: a probabilistic tracking maintains along time the actual appearance of detected people and evaluates the occlusion probability; an additional tracking with Kalman prediction improves the estimation of the people position inside the room. Probabilistic Projection Maps (PPMs) created with a learning phase are matched against the appearance mask of the track. Finally, a Hidden Markov Model formulation of the posture corrects the frame-by-frame classification uncertainties and makes the system reliable even in presence of occlusions. Results obtained over real indoor sequences are discussed.
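The role of the Hidden Markov Model in correcting frame-by-frame uncertainties can be illustrated with a small Viterbi decoder. The sketch below is a generic example under assumptions, not the formulation used in the paper: it assumes a uniform prior, a single "sticky" transition probability `stay` shared by all postures, at least two postures, and per-frame classifier scores supplied as dictionaries; the name `viterbi_smooth` is hypothetical.

```python
from math import log

def viterbi_smooth(frame_probs, stay=0.9):
    """Most likely posture sequence given per-frame classifier scores.

    frame_probs: list of dicts mapping posture name -> score in [0, 1]
    for each frame. Requires at least two postures.
    """
    states = list(frame_probs[0])
    switch = (1.0 - stay) / (len(states) - 1)
    eps = 1e-12  # avoids log(0) for zero scores

    def trans(a, b):
        return log(stay if a == b else switch)

    # Uniform prior over postures for the first frame.
    score = {s: log(1.0 / len(states)) + log(frame_probs[0][s] + eps)
             for s in states}
    back = []
    for probs in frame_probs[1:]:
        new_score, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: score[p] + trans(p, s))
            new_score[s] = score[prev] + trans(prev, s) + log(probs[s] + eps)
            pointers[s] = prev
        score = new_score
        back.append(pointers)

    # Backtrack from the best final state.
    path = [max(states, key=lambda s: score[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]
```

With a high `stay` value, a single frame in which an occlusion makes the classifier briefly prefer a different posture is overruled by its temporal context, which is the effect the abstract describes.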

C. Grana; G. Tardini; R. Cucchiara ( 2005 ) - MPEG-7 Compliant Shot Detection in Sport Videos ( Seventh IEEE International Symposium on Multimedia - - Dec 12-14) ( - Seventh IEEE International Symposium on Multimedia ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 395 a 402 ISBN: 9780769524894 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we propose a system for automatic detection of shots in sport videos. Our work covers two main aspects: the first is robust shot detection in presence of fast object motion and camera operations. To this aim we propose a new algorithm, unique for both cuts and linear transitions detection, which only needs the tuning of two parameters. An extended comparison with four transition detection algorithms, representing the state of the art in literature, is reported. Examples with Formula 1, basketball, soccer and cycling videos are analyzed. The second aspect is an in-depth discussion on the annotation of shots and transitions with the MPEG-7 standard.

R. Cucchiara; A. Prati; R. Vezzani ( 2005 ) - Posture Classification in a Multi-camera Indoor Environment ( IEEE International Conference on Image Processing - - 11-14 Sept.) ( - Proceedings of IEEE International Conference on Image Processing (ICIP 2005) ) (IEEE Computer Society - ) - n. volume 1 - pp. da 725 a 728 ISBN: 9780780391345 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Posture classification is a key process for analyzing people’s behaviour. Computer vision techniques can be helpful in automating this process, but cluttered environments and consequent occlusions often make this task difficult. Different views provided by multiple cameras can be exploited to solve occlusions by warping known object appearance into the occluded view. To this aim, this paper describes an approach to posture classification based on projection histograms, reinforced by HMM for assuring temporal coherence of the posture. The single camera posture classification is then exploited in the multi-camera system to solve the cases in which the occlusions make the classification impossible. Experimental results of the classification from both the single camera and the multi-camera system are provided.

Cucchiara, Rita; Grana, Costantino; Prati, Andrea; Vezzani, Roberto ( 2005 ) - Probabilistic posture classification for human-behavior analysis - IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS - n. volume 35 (1) - pp. da 42 a 54 ISSN: 1083-4427 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Computer vision and ubiquitous multimedia access nowadays make feasible the development of a mostly automated system for human-behavior analysis. In this context, our proposal is to analyze human behaviors by classifying the posture of the monitored person and, consequently, detecting corresponding events and alarm situations, like a fall. To this aim, our approach can be divided into two phases: for each frame, the projection histograms (Haritaoglu et al., 1998) of each person are computed and compared with the probabilistic projection maps stored for each posture during the training phase; then, the obtained posture is further validated exploiting the information extracted by a tracking module in order to take into account the reliability of the classification of the first phase. Moreover, the tracking algorithm is used to handle occlusions, making the system particularly robust even in indoor environments. Extensive experimental results demonstrate a promising average accuracy of more than 95% in correctly classifying human postures, even in the case of challenging conditions.
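The first phase described above starts from projection histograms of a person's silhouette. The sketch below shows how such histograms can be computed from a binary mask and used for a simple classification. It is a deliberately simplified stand-in: the stored references here are plain histograms compared by L1 distance, whereas the paper compares against probabilistic projection maps learned in a training phase. The names `projection_histograms` and `classify_posture` are hypothetical.

```python
def projection_histograms(mask):
    """Project a binary silhouette onto the two image axes.

    Returns (row_hist, col_hist): the number of foreground pixels in each
    row and in each column of the mask.
    """
    row_hist = [sum(row) for row in mask]
    col_hist = [sum(col) for col in zip(*mask)]
    return row_hist, col_hist

def classify_posture(mask, references):
    """Return the posture whose reference histograms are closest (L1).

    references: dict mapping posture name -> (row_hist, col_hist), assumed
    to come from masks of the same size as `mask`.
    """
    rows, cols = projection_histograms(mask)

    def distance(ref):
        ref_rows, ref_cols = ref
        return (sum(abs(a - b) for a, b in zip(rows, ref_rows)) +
                sum(abs(a - b) for a, b in zip(cols, ref_cols)))

    return min(references, key=lambda name: distance(references[name]))
```

A standing silhouette concentrates its pixels in a few columns and spreads them over many rows, while a lying one does the opposite, which is why the pair of projections is informative about posture.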

M. Bertini; R. Cucchiara; A. Del Bimbo; A. Prati ( 2005 ) - An integrated framework for semantic annotation and adaptation - MULTIMEDIA TOOLS AND APPLICATIONS - n. volume 26 (3) - pp. da 345 a 363 ISSN: 1380-7501 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Tools for the interpretation of significant events from video and video clip adaptation can effectively support automatic extraction and distribution of relevant content from video streams. In fact, adaptation can adjust meaningful content, previously detected and extracted, to the user/client capabilities and requirements. The integration of these two functions is increasingly important, due to the growing demand of multimedia data from remote clients with limited resources (PDAs, HCCs, Smart phones). In this paper we propose a unified framework for event-based and object-based semantic extraction from video and semantic on-line adaptation. Two cases of application, highlight detection and recognition from soccer videos and people behavior detection in domotic applications, are analyzed and discussed.

S. Calderara; A. Prati; R. Vezzani; R. Cucchiara ( 2005 ) - Consistent labeling for multi-camera object tracking ( 13th International Conference on Image Analysis and Processing - - Sept. 6-8) ( - Image Analysis and Processing – ICIAP 2005 ) (Springer Heidelberg DEU ) - n. volume LNCS 3617 - pp. da 1206 a 1214 ISBN: 9783540288695 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present a new approach to multi-camera object tracking based on consistent labeling. An automatic and reliable procedure makes it possible to obtain the homographic transformation between two overlapped views, without any manual calibration of the cameras. Object positions are matched by using the homography when the object is first detected in one of the two views. The approach has also been tested in the case of simultaneous transitions and in the case in which people are detected as a group during the transition. Promising results are reported over a real setup of overlapped cameras.
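Once a homography between two overlapped views is known, matching positions across views reduces to warping a point and looking for a nearby track. The sketch below illustrates that step only. It assumes the 3x3 matrix `H` is already available (how the paper estimates it is not detailed in the abstract), uses ground-plane points such as a person's feet, and picks the nearest track within a distance threshold; `apply_homography`, `consistent_label` and `max_dist` are hypothetical names and parameters.

```python
from math import hypot

def apply_homography(H, point):
    """Map an (x, y) point through a 3x3 homography given as nested lists."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)  # back from homogeneous coordinates

def consistent_label(new_point, H, tracks_other_view, max_dist=20.0):
    """Label for a newly detected object, taken from the other view.

    new_point: position of the new track in this view.
    tracks_other_view: dict mapping label -> (x, y) in the other view.
    Returns the label of the closest track to the warped point, or None
    if no track lies within max_dist (the object gets a fresh label).
    """
    wx, wy = apply_homography(H, new_point)
    best_label, best_dist = None, max_dist
    for label, (tx, ty) in tracks_other_view.items():
        dist = hypot(tx - wx, ty - wy)
        if dist <= best_dist:
            best_label, best_dist = label, dist
    return best_label
```

A nearest-neighbor rule like this one is ambiguous when several people cross the boundary together, which is the kind of case the abstract reports testing separately.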

M. BERTINI; R. CUCCHIARA; A. DEL BIMBO; A. PRATI ( 2005 ) - Real Time Semantic Adaptation of Sports Video with User-centred Performance Analysis ( International Workshop on Image Analysis for Multimedia Interactive Services - - 13-15 April 2005) ( - Proceedings of WIAMIS 2005 ) (IEE London GBR ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Semantic video adaptation improves traditional adaptation by taking into account the degree of relevance of the different portions of the content. It employs solutions to detect the significant parts of the video and applies different compression ratios to elements that have different importance. Performance of semantic adaptation heavily depends on the quality and precision of the automatic annotation, whether it operates in strict or nonstrict real time, and the codec which is used to perform adaptation at the event or object level. It should consider the effects of the errors in the automatic extraction of objects and events over the operation of the adaptation subsystem, and relate these effects to the preferences for the objects and events of the video program, that have been decided by the user. In this paper, we present strict real time annotation and adaptation of sports video and introduce two new performance measures: Viewing Quality Loss and Bit-rate Cost Increase, that are obtained from classical PSNR and Bit Ratio, but relate the results of semantic adaptation with the user’s preferences and expectations.

G. Tardini; C. Grana; R. Marchi; R. Cucchiara ( 2005 ) - Shot detection and motion analysis for automatic MPEG-7 annotation of sports videos ( 13th International Conference on Image Analysis and Processing - - Sep 6-8) ( - Image Analysis and Processing – ICIAP 2005 ) (Springer Heidelberg DEU ) - n. volume LNCS 3617 - pp. da 653 a 660 ISBN: 9783540288695 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we describe general algorithms that are devised for MPEG-7 automatic annotation of Formula 1 videos, and in particular for camera-car shots detection. We employed a shot detection algorithm suitable for cuts and linear transitions detection, which is able to precisely detect both the transition's center and length. Statistical features based on MPEG motion compensation vectors are then employed to provide motion characterization, using a subset of the motion types defined in MPEG-7, and shot type classification. Results on shot detection and classification are provided.

S. Calderara; R. Vezzani; A. Prati; R. Cucchiara ( 2005 ) - Entry Edge of Field of View for multi-camera tracking in distributed video surveillance ( IEEE International Conference on Advanced Video and Signal-Based Surveillance - - 15-16 September 2005) ( - Proceedings of IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS2005) ) (IEEE Computer Society - ) - n. volume 1 - pp. da 93 a 98 ISBN: 9780780393851 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

An efficient solution to people tracking in distributed video surveillance is required to monitor crowded and large environments. This paper proposes a novel use of the Entry Edges of Field of View (E2oFoV) to solve the consistent labeling problem between partially overlapped views. An automatic and reliable procedure makes it possible to obtain the homographic transformation between two overlapped views, without any manual calibration of the cameras. Through the homography, the consistent labeling is established each time a new track is detected in one of the cameras. A Camera Transition Graph (CTG) is defined to speed up the establishment process by reducing the search space. Experimental results prove the effectiveness of the proposed solution also in challenging conditions.

R. CUCCHIARA; A. PRATI; L. BENINI; E. FARELLA ( 2005 ) - T_PARK: Ambient Intelligence for Security in Public Parks ( IEE International Workshop on Intelligent Environments, Special session on "Ambient Intelligence" - - 28-29 June 2005) ( - Proceedings of IE 2005 ) (IEE London GBR ) - pp. da 243 a 251 ISBN: 0 86341 519 9 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present joint research activities in computer vision and sensor networks for a distributed surveillance of urban parks. Distributed visual surveillance of urban environments is one of the most interesting scenarios in Ambient Intelligence; in addition, the automated monitoring of public parks, often crowded by children and adults, is still a very difficult task due to the number of objects of interest. In this context, integrating the power of low cost sensors with the information provided by cameras can lead to a more reliable solution to people tracking in wide areas. Specifically, the deficiencies of one approach can be (at least partially) covered by the advantages of the other. The goal is to perform people tracking in parks (to achieve trackable parks - T-Parks), both in zones covered by overlapped cameras and also, thanks to sensors, in areas not covered by any camera. In this paper, we propose a new technique for multi-camera people tracking based on a learning phase to automatically calibrate pairs of cameras and to build Areas of Field Views (AoFoVs) in order to establish consistent labelling of people. In addition, sensor networks distributed at the borders of the AoFoV give an estimation of the probability of people overlapping, triggering specific algorithms of face detection or head counting to identify the single person. The research of T-Parks is part of a two-year Italian project called LAICA, intended to provide advanced services for citizens and public officers based on ambient intelligence technologies.

R. CUCCHIARA; A. PRATI; R. VEZZANI ( 2005 ) - Ambient Intelligence for Security in Public Parks: the LAICA Project ( IEE International Symposium on Imaging for Crime Detection and Prevention 2005 - - 7-8 June 2005) ( - Proceedings of ICDP 2005 ) (Institution of Electrical Engineers London GBR ) - n. volume 1 - pp. da 139 a 144 ISBN: 9780863415357 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we address the exploitation of computer vision techniques to develop multimedia services and automatic monitoring systems related to the security and the privacy in public areas. The research is part of a two-year Italian project called LAICA, intended to provide advanced services for citizens and public officers. Citizens want fast and friendly web access to public places, to see the environment in real time without violating the privacy laws. Public officers and policy centres want a fast and reactive monitoring system, capable of automatically detecting dangerous situations, given the huge amount of cameras that cannot be monitored simultaneously by human operators. In this work, we describe the project and the defined methodologies in multi-camera video mosaicing, people tracking and consistent labelling, and access to processed data with face obscuration.

R. Cucchiara; C. Grana; G. Tardini ( 2004 ) - Track-based and object-based occlusion for people tracking refinement in indoor surveillance ( 2nd International Workshop on Video Surveillance & Sensor Networks - - Oct 15-16) ( - Proceedings of the ACM 2nd International Workshop on Video Surveillance & Sensor Networks ) (ACM New York USA ) - pp. da 81 a 87 ISBN: 9781581139341 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

People tracking deals with problems of shape changes, self-occlusions and track occlusions due to other interfering tracks and fixed objects that hide parts of the people's shape. These problems are more critical in indoor surveillance and in particular in home automation settings, in which the need to merge information obtained from different cameras distributed around the house calls for the integration of reliable data obtained over time. Therefore, tracking algorithms should be carefully tuned to cope with occlusions and shape changes, working not only at pixel level but also at region level. In this work we provide a novel technique for object tracking, based on probabilistic masks and appearance models. Occlusions due to other tracks or due to background objects and false occlusions are discriminated. The classification of occluded regions of the track is exploited in a selective model update. The tracking system is general enough to be applied with any motion segmentation module, it can track people interacting with each other and it maintains the pixel to track assignment even with large occlusions. At the same time, the model update is very reactive, so as to cope with sudden body motion and silhouette's shape changes. Due to its robustness, it has been used in different experiments of people behavior control in indoor situations.

R. Cucchiara; D. Lovell; A. Prati; M.M. Trivedi ( 2004 ) - Introduction to the special section on in vehicle computer vision systems - IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY - n. volume 53 (6) - pp. da 1633 a 1635 ISSN: 0018-9545 [Articolo in rivista (262) - Articolo su rivista]
Abstract

-

R. CUCCHIARA; M. PICCARDI; A. PRATI ( 2004 ) - Neighbor cache prefetching for multimedia image and video processing - IEEE TRANSACTIONS ON MULTIMEDIA - n. volume 6 (4) - pp. da 539 a 552 ISSN: 1520-9210 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Cache performance is strongly influenced by the type of locality embodied in programs. In particular, multimedia programs handling images and videos are characterized by a bidimensional spatial locality, which is not adequately exploited by standard caches. In this paper we propose novel cache prefetching techniques for image data, called neighbor prefetching, able to improve exploitation of bidimensional spatial locality. A performance comparison is provided against other assessed prefetching techniques on a multimedia workload (with MPEG-2 and MPEG-4 decoding, image processing, and visual object segmentation), including a detailed evaluation of both the miss rate and the memory access time. Results prove that neighbor prefetching achieves a significant reduction in the time due to delayed memory cycles (more than 97% on MPEG-4 with respect to 75% of the second performing technique). This reduction leads to a substantial speedup on the overall memory access time (up to 140% for MPEG-4). Performance has been measured with the PRIMA trace-driven simulator, specifically devised to support cache prefetching.

R. Cucchiara; A. Prati; R. Vezzani ( 2004 ) - Real-time motion segmentation from moving cameras - REAL-TIME IMAGING - n. volume 10 - pp. da 127 a 143 ISSN: 1077-2014 [Articolo in rivista (262) - Articolo su rivista]
Abstract

This paper describes our approach to real-time detection of camera motion and moving object segmentation in videos acquired from moving cameras. As far as we know, none of the proposals reported in the literature are able to meet real-time requirements. In this work, we present an approach based on a color segmentation followed by a region-merging on motion through Markov Random Fields (MRFs). The technique we propose is inspired by the work of Gelgon and Bouthemy (Pattern Recognition 33 (2000) 725-40), which has been modified to reduce computational cost in order to achieve a fast segmentation (about 10 frames per second). To this aim a modified region matching algorithm (namely Partitioned Region Matching) and an innovative arc-based MRF optimization algorithm with a suitable definition of the motion reliability are proposed. Results on both synthetic and real sequences are reported to confirm validity of our solution.

C. Grana; G. Pellacani; S. Seidenari; R. Cucchiara ( 2004 ) - Color Calibration for a Dermatological Video Camera System ( 17th International Conference on Pattern Recognition - - Aug 23-26) ( - Proceedings of the 17th International Conference on Pattern Recognition ) (IEEE Computer Society Los Alamitos, CA USA ) - n. volume 3 - pp. da 798 a 801 ISBN: 9780769521282 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this work, we describe a technique to calibrate images for skin analysis in dermatology. Using a common reference, we correct non-uniform illumination effects, estimate the gamma correction and produce an XYZ conversion matrix. The final result is then reverted to a non-standard RGB color space, built from the instrument images. In this way different instruments behave uniformly, allowing colorimetric characterization while improving the results of common algorithms. The proposed techniques should be the initial support for a distributed framework where dermatological images can be consistently compared.

R. Cucchiara; C. Grana; G. Tardini; R. Vezzani ( 2004 ) - Probabilistic People Tracking for Occlusion Handling ( 17th International Conference on Pattern Recognition - - Aug 23-26) ( - Proceedings of the 17th International Conference on Pattern Recognition ) (IEEE Computer Society Los Alamitos, CA USA ) - n. volume 1 - pp. da 132 a 135 ISBN: 9780769521282 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This work presents a novel people tracking approach, able to cope with frequent shape changes and large occlusions. In particular, the tracks are described by means of probabilistic masks and appearance models. Occlusions due to other tracks or due to background objects and false occlusions are discriminated. The tracking system is general enough to be applied with any motion segmentation module; it can track people interacting with each other, and it maintains the pixel-to-track assignment even with large occlusions. At the same time, the update model is very reactive, so as to cope with sudden body motion and changes in the silhouette's shape. Due to its robustness, it has been used in many experiments of people behavior control in indoor situations.

Cucchiara, Rita; Grana, Costantino; Prati, Andrea ( 2004 ) - Semantic Transcoding of Videos by using Adaptive Quantization - WANGJÌ WANGLÙ JÌSHÙ XUÉKAN - n. volume 5 - pp. da 31 a 39 ISSN: 1607-9264 [Articolo in rivista (262) - Articolo su rivista]
Abstract

This paper proposes an approach to video transcoding driven by the video content and based on the adaptive quantization of MPEG standards. Computer vision techniques can extract semantics from videos according to the user's interests: the video semantics is exploited to adapt the video in order to meet the device's capabilities and the user's requirements while preserving the best possible quality. Well-assessed video analysis techniques are used to segment the video into objects grouped in classes of relevance, to which the user can assign a weight proportional to their relevance. This weight is used to decide the quantization values to be applied to each macroblock in the MPEG-2 encoding. A modified version of the PSNR (Peak Signal-to-Noise Ratio) is used as the performance metric, and a comparative evaluation is reported with respect to other coding standards such as JPEG, JPEG 2000, (basic) MPEG-2, and MPEG-4. Experimental results are provided on different situations, one indoor and one outdoor. Keywords: Video transcoding, adaptive quantization, motion detection

R. Cucchiara; A. Prati; R. Vezzani ( 2004 ) - An Intelligent Surveillance System for Dangerous Situation Detection in Home Environments - INTELLIGENZA ARTIFICIALE - n. volume 1 (1) - pp. da 11 a 15 ISSN: 1724-8035 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In this paper we address the problem of human posture classification, focusing in particular on an indoor surveillance application. The approach was initially inspired by previous work of Haritaoglu et al. [5] that uses histogram projections to classify people’s posture. Projection histograms are here exploited as the main feature for the posture classification but, differently from [5], we propose a supervised statistical learning phase to create probability maps adopted as posture templates. Moreover, camera calibration and homography are included to solve perspective problems and to improve the precision of the classification. Furthermore, we make use of a finite state machine to detect dangerous situations such as falls and to activate a suitable alarm generator. The system works on-line on standard workstations with network cameras.
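
As a rough illustration of the projection-histogram feature mentioned in this abstract, the sketch below computes the row and column projections of a binary silhouette mask. The function name and the toy mask are hypothetical; the learned probability maps, the calibration step and the finite state machine described in the paper are not reproduced here.

```python
def projection_histograms(mask):
    """Return (row_sums, col_sums) for a binary mask given as a list of rows.

    row_sums[i] counts the foreground pixels in row i (horizontal projection);
    col_sums[j] counts the foreground pixels in column j (vertical projection).
    """
    row_sums = [sum(row) for row in mask]
    col_sums = [sum(col) for col in zip(*mask)]
    return row_sums, col_sums


# Toy silhouette of a "standing" blob: tall and narrow (hypothetical data).
standing = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
    [0, 1, 0],
]
rows, cols = projection_histograms(standing)
print(rows, cols)
```

A tall, narrow silhouette yields a flat row projection and a peaked column projection, which is the kind of shape cue a posture classifier could compare against templates.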

M. BERTINI; A. DEL BIMBO; A. PRATI; R. CUCCHIARA ( 2004 ) - Objects and Events Recognition for Sport Videos Transcoding ( 2nd International Symposium on Image/Video Communications over fixed and mobile - - 7-9 July 2004) ( - Proceedings of ISIVC 2004 ) (École Nationale Supérieure des Télécommunications de Bretagne Brest FRA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

-

M. BERTINI; A. DEL BIMBO; R. CUCCHIARA; A. PRATI ( 2004 ) - Semantic Annotation and Transcoding of Soccer Videos ( Asian Conference on Computer Vision - - 27-30 January 2004) ( - Proceedings of ACCV 2004 ) (- Jeju KOR ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

-

M. BERTINI; A. DEL BIMBO; A. PRATI; R. CUCCHIARA ( 2004 ) - Semantic Annotation and Transcoding for Sport Videos ( International Workshop on Image Analysis for Multimedia Interactive Services - - 21-23 April 2004) ( - Proceedings of WIAMIS 2004 ) (- Lisboa PRT ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Telecommunication companies are demonstrating interest in providing mobile video services. The availability of larger bandwidth, and the improvements in the resolution of the displays of third-generation mobile phones, let telecom and content provider companies provide new services to their customers. Among these services, users can watch a certain number of sport videos, usually a selection of the best actions that occurred during play. In order to provide a timely and satisfying service to customers, there is a need for tools and systems that help to detect and recognize the interesting events and optimize the use of bandwidth, coding these events and the most interesting objects within them at the best visual quality/bandwidth ratio.

R. Cucchiara; C. Grana; A. Prati; R. Vezzani ( 2003 ) - A Hough transform-based method for radial lens distortion correction ( 12th International Conference on Image Analysis and Processing - - Sep 17-19) ( - Proceedings of the 12th International Conference on Image Analysis and Processing ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 182 a 187 ISBN: 9780769519487 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The paper presents an approach for a robust (semi-)automatic correction of radial lens distortion in images and videos. This method, based on the Hough transform, is also applicable to videos from unknown cameras that, consequently, cannot be calibrated a priori. We approximate the lens distortion by considering only the lower-order term of the radial distortion. Thus, the method relies on the assumption that pure radial distortion transforms straight lines into curves. The computation of the best value of the distortion parameter is performed in a multi-resolution way. The precision of the method depends on the scale of the multi-resolution and on the resolution of the Hough space. Experiments are provided for both an outdoor, uncalibrated camera and an indoor, calibrated one. The stability of the value found in different frames of the same video demonstrates the reliability of the proposed method.
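
The idea that a single radial term bends straight lines can be sketched as follows. This is a minimal illustration, assuming the common one-parameter polynomial model r_d = r_u (1 + k r_u^2) around a known distortion center; the abstract does not state the exact model, and the names `distort`, `undistort` and `collinearity_error` are hypothetical. The Hough-based, multi-resolution search for k is not reproduced.

```python
def distort(point, k, center=(0.0, 0.0)):
    """Apply first-order radial distortion: p_d = c + (p_u - c) * (1 + k * r_u^2)."""
    x = point[0] - center[0]
    y = point[1] - center[1]
    scale = 1.0 + k * (x * x + y * y)
    return (center[0] + x * scale, center[1] + y * scale)


def undistort(point, k, center=(0.0, 0.0), iterations=20):
    """Invert the model by fixed-point iteration (assumes mild distortion)."""
    xd = point[0] - center[0]
    yd = point[1] - center[1]
    xu, yu = xd, yd
    for _ in range(iterations):
        scale = 1.0 + k * (xu * xu + yu * yu)
        xu, yu = xd / scale, yd / scale
    return (center[0] + xu, center[1] + yu)


def collinearity_error(p, q, r):
    """Twice the area of triangle pqr; zero when the three points are collinear."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))


# Three points on the straight line y = 0.5, then bent with k = -0.1 (barrel-like).
line = [(-1.0, 0.5), (0.0, 0.5), (1.0, 0.5)]
bent = [distort(p, -0.1) for p in line]
restored = [undistort(p, -0.1) for p in bent]
print(collinearity_error(*line), collinearity_error(*bent))
```

A straightness measure of this kind is what a parameter search could minimize: the candidate k that makes detected edge points most collinear after undistortion is taken as the estimate.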

R. Cucchiara; A. Prati; M. Piccardi ( 2003 ) - Improving data prefetching efficacy in multimedia applications - MULTIMEDIA TOOLS AND APPLICATIONS - n. volume 20 (3) - pp. da 159 a 178 ISSN: 1380-7501 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The workload of multimedia applications has a strong impact on cache memory performance, since the locality of memory references embedded in multimedia programs differs from that of traditional programs. In many cases, standard cache memory organization achieves poorer performance when used for multimedia. A widely-explored approach to improve cache performance is hardware prefetching, which allows the pre-loading of data in the cache before they are referenced. However, existing hardware prefetching approaches are unable to exploit the potential improvement in performance, since they are not tailored to multimedia locality. In this paper we propose novel effective approaches to hardware prefetching to be used in image processing programs for multimedia. Experimental results are reported for a suite of multimedia image processing programs including MPEG-2 decoding and encoding, convolution, thresholding, and edge chain coding.

A. PRATI; I. MIKIC; MM TRIVEDI; R. CUCCHIARA ( 2003 ) - Detecting moving shadows: Algorithms and evaluation - IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE - n. volume 25 (7) - pp. da 918 a 923 ISSN: 0162-8828 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Moving shadows need careful consideration in the development of robust dynamic scene analysis systems. Moving shadow detection is critical for accurate object detection in video streams since shadow points are often misclassified as object points, causing errors in segmentation and tracking. Many algorithms have been proposed in the literature that deal with shadows. However, a comparative evaluation of the existing approaches is still lacking. In this paper, we present a comprehensive survey of moving shadow detection approaches. We organize contributions reported in the literature into four classes: two of them are statistical and two are deterministic. We also present a comparative empirical evaluation of representative algorithms selected from these four classes. Novel quantitative (detection and discrimination rate) and qualitative metrics (scene and object independence, flexibility to shadow situations, and robustness to noise) are proposed to evaluate these classes of algorithms on a benchmark suite of indoor and outdoor video sequences. These video sequences and associated ground-truth data are made available at http://cvrr.ucsd.edu/aton/shadow to allow others in the community to experiment with new algorithms and metrics.

R. Cucchiara; C. Grana; M. Piccardi; A. Prati ( 2003 ) - Detecting moving objects, ghosts, and shadows in video streams - IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE - n. volume 25 (10) - pp. da 1337 a 1342 ISSN: 0162-8828 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Background subtraction methods are widely exploited for moving object detection in videos in many applications, such as traffic monitoring, human motion capture, and video surveillance. How to correctly and efficiently model and update the background model and how to deal with shadows are two of the most distinguishing and challenging aspects of such approaches. This work proposes a general-purpose method that combines statistical assumptions with the object-level knowledge of moving objects, apparent objects (ghosts), and shadows acquired in the processing of the previous frames. Pixels belonging to moving objects, ghosts, and shadows are processed differently in order to supply an object-based selective update. The proposed approach exploits color information for both background subtraction and shadow detection to improve object segmentation and background update. The approach proves fast, flexible, and precise in terms of both pixel accuracy and reactivity to background changes.

R. Cucchiara; C. Grana; A. Prati; F. Vigetti; M. Piccardi ( 2003 ) - Camera-car Video Analysis for Steering Wheel's Tracking ( 1st International Workshop on In-Vehicle Cognitive Computer Vision Systems - - Apr 3) ( - Proceedings of 1st International Workshop on In-Vehicle Cognitive Computer Vision Systems ) (- - ITA ) - pp. da 36 a 43 ISBN: 0000000000 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Monitoring and controlling the driver’s guidance by analyzing the rotation applied to the steering wheel can be a very important task in order to improve safety. This paper proposes a general-purpose method to track the steering wheel’s absolute angle by using a single-camera vision system mounted inside the car. The absolute angle is computed by accumulating inter-frame relative rotations, and error propagation is prevented with an alignment process. The approach is based on modeling the motion of the steering wheel as it appears perspectively distorted from the point of view of the uncalibrated camera. We modified the Lucas-Kanade method for an approximately rotational motion model in order to provide the detection and tracking of significant features on the wheel. The experimental results are compared with ground-truth data obtained with different types of sensors.

R. Cucchiara; C. Grana; A. Prati ( 2003 ) - Semantic video transcoding using classes of relevance - INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS - n. volume 3 (1) - pp. da 145 a 169 ISSN: 0219-4678 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In this work we present a framework for on-the-fly video transcoding that exploits computer vision-based techniques to adapt the Web access to the user requirements. The proposed transcoding approach aims at coping with both user bandwidth and resource capabilities, and with user interests in the video's content. We propose an object-based semantic transcoding that, according to the user-defined classes of relevance, applies different transcoding techniques to the objects segmented in a scene. Object extraction is provided by on-the-fly video processing, without manual annotation. Multiple transcoding policies are reviewed, and a performance evaluation metric based on the Weighted Mean Square Error (and corresponding PSNR), which takes into account the perceptual user requirements by means of classes of relevance, is defined. Results are analyzed by varying transcoding techniques, bandwidth requirements and video types (with indoor and outdoor scenes), showing that the use of semantics can dramatically improve the bandwidth to distortion ratio.
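
A class-weighted distortion measure of the kind named in this abstract can be sketched as below. This is only an assumed form (per-class mean square error combined with user weights that sum to 1, then converted to PSNR with the standard formula); the paper's exact definition is not given in the abstract, and the function names, class labels and sample values are hypothetical.

```python
import math


def weighted_mse(original, transcoded, labels, weights):
    """Combine per-class MSE values using user-assigned relevance weights.

    original, transcoded: flat sequences of pixel intensities.
    labels: class of relevance of each pixel (same length).
    weights: dict mapping class -> weight (assumed to sum to 1).
    """
    sq_err = {}
    count = {}
    for o, t, c in zip(original, transcoded, labels):
        sq_err[c] = sq_err.get(c, 0.0) + (o - t) ** 2
        count[c] = count.get(c, 0) + 1
    return sum(weights[c] * sq_err[c] / count[c] for c in sq_err)


def psnr_from_mse(mse, peak=255.0):
    """Standard PSNR in dB for a given mean square error."""
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak * peak / mse)


# Toy example: errors are small on "people" pixels and large on "background".
original = [100, 100, 50, 50]
transcoded = [102, 98, 60, 40]
labels = ["people", "people", "background", "background"]
weights = {"people": 0.9, "background": 0.1}

wmse = weighted_mse(original, transcoded, labels, weights)
plain_mse = sum((o - t) ** 2 for o, t in zip(original, transcoded)) / len(original)
print(wmse, plain_mse, psnr_from_mse(wmse), psnr_from_mse(plain_mse))
```

With these weights, heavy degradation of the low-relevance background is penalized far less than it would be by the unweighted MSE, which is the behavior a relevance-driven metric is meant to capture.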

R. Cucchiara; A. Prati; R. Vezzani ( 2003 ) - Domotics for disability: smart surveillance and smart video server ( 8th Conference of the Italian Association of Artificial Intelligence - - 23-26 September) ( - Proceedings of the Workshop on Ambient Intelligence ) (- - ) - n. volume 1 - pp. da 46 a 57 ISBN: 9783540201199 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we address the problem of human posture classification, focusing in particular on an indoor surveillance application. The approach was initially inspired by previous work of Haritaoglu et al. [6] that uses histogram projections to classify people’s posture. Projection histograms are here exploited as the main feature for the posture classification but, differently from [6], we propose a supervised statistical learning phase to create probability maps adopted as posture templates. Moreover, camera calibration and homography are included to resolve perspective problems and improve the precision of classification. Furthermore, we make use of a finite state machine to detect dangerous situations such as falls and to activate a suitable alarm generator. The system works on-line on standard workstations with network cameras.

R. CUCCHIARA; A. PRATI; F. VIGETTI ( 2003 ) - Steering wheel's angle tracking from camera-car ( IEEE Intelligent Vehicle Symposium - - 9-11 June 2003) ( - Proceedings of IV 2003 ) (IEEE Piscataway, NJ, USA USA ) - pp. da 406 a 409 ISBN: 0 7803 7848 2 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper proposes a general-purpose method to track the steering wheel’s absolute angle by using a single-camera vision system mounted inside the car. The approach is based on modeling the motion of the steering wheel as it appears perspectively distorted from the point of view of the uncalibrated camera. We modified the Lucas-Kanade method for an approximately rotational motion model in order to provide the detection and tracking of significant features on the wheel. The experimental results are compared with ground-truth data obtained with different types of sensors.

R. Cucchiara; A. Prati; R. Vezzani ( 2003 ) - Object Segmentation in Videos from Moving Camera with MRFs on Color and Motion Features ( IEEE Conference on Computer Vision and Pattern Recognition - - 16-22 June) ( - Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR2003) ) (IEEE Computer Society Los Alamitos, CA USA ) - n. volume 1 - pp. da 405 a 410 ISBN: 9780769519005 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we address the problem of fast segmentation of moving objects in video acquired by a moving camera or, more generally, with a moving background. We present an approach based on color segmentation followed by region merging on motion through Markov Random Fields (MRFs). The technique we propose is inspired by the work of Gelgon and Bouthemy [6], which has been modified to reduce the computational cost in order to achieve a fast segmentation (about ten frames per second). To this end, a modified region matching algorithm (namely Partitioned Region Matching) and an innovative arc-based MRF optimization algorithm with a suitable definition of the motion reliability are proposed. Results on both synthetic and real sequences are reported to confirm the validity of our solution.

C. Grana; G. Pellacani; S. Seidenari; R. Cucchiara ( 2003 ) - Image Representation and Retrieval with Topological Trees ( Image: E-Learning, Understanding, Information Retrieval and Medical - - Jun 9-10) ( - Image: E-Learning, Understanding, Information Retrieval and Medical Proceedings of the First International Workshop ) (World Scientific Publishing Co. Pte. Ltd. Singapore SGP ) - pp. da 112 a 122 ISBN: 9789812385871 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Typical processes of image representation comprise an initial region segmentation followed by a description of the single regions’ features and their relationships. A graph model can then be exploited in order to integrate the knowledge of the specific regions (the nodes of the attributed relational graph, ARG) and the regions’ relations (the ARG’s edges). In this work we use color features to guide region segmentation, geometric features to characterize regions one by one, and topological features (in particular inclusion) to describe regions’ relationships. Guided by the inclusion property, we define the Topological Tree (TT) as an image representation model that, exploiting the transitive property of inclusion, uses the adjacency and inclusion topological features. We propose an approach based on a recursive version of fuzzy c-means to construct the topological tree directly from the initial image, performing both segmentation and TT construction. The TT can be exploited in many applications of image analysis and image retrieval by similarity in those contexts where inclusion is a key feature: we propose an application to the analysis of dermatological images to support melanoma diagnosis. In this paper we describe details of the TT algorithm, including the management of non-ideal cases and an approximate measure of tree similarity, in order to retrieve skin lesions with a similar TT-based description.

R. Cucchiara; C. Grana; S. Seidenari; G. Pellacani ( 2002 ) - Exploiting color and topological features for region segmentation with recursive fuzzy c-means - MACHINE GRAPHICS & VISION - n. volume 11 (2/3) - pp. da 169 a 182 ISSN: 1230-0535 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In this paper we define a novel approach for image segmentation into regions which focuses on both visual and topological cues, namely color similarity, inclusion and spatial adjacency. Many color clustering algorithms have been proposed in the past for skin lesion images, but none explicitly exploits the inclusion properties between regions. Our algorithm is based on a recursive version of the fuzzy c-means (FCM) clustering algorithm in the 2D color histogram constructed by Principal Component Analysis (PCA) of the color space. The distinctive feature of the proposal is that recursion is guided by the evaluation of adjacency and mutual inclusion properties of extracted regions; the recursive analysis then addresses only included regions or regions with a non-negligible size. This approach allows a coarse-to-fine segmentation which focuses the attention on the inner parts of the images, in order to highlight the internal structure of the object depicted in the image. This could be particularly useful in many applications, especially in biomedical image analysis. In this work we apply the technique to the segmentation of skin lesions in dermatoscopic images. It could be a suitable support for the diagnosis of skin melanoma, since dermatologists are interested in the analysis of the spatial relations, the symmetrical positions and the inclusion of regions.
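
For readers unfamiliar with the clustering step, the following is a minimal sketch of plain (non-recursive) fuzzy c-means on one-dimensional data, using the standard membership and center update rules. It is not the paper's algorithm: the recursion guided by adjacency and inclusion, and the PCA-based 2D color histogram, are omitted, and the function name, initial centers and sample values are hypothetical.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Plain FCM on 1-D data; returns (centers, memberships).

    memberships[i][j] is the degree to which points[i] belongs to cluster j,
    computed against the centers used in the last iteration.
    """
    centers = list(centers)
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point coincides with one or more centers: share membership there.
                row = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(row)
                row = [u / total for u in row]
            else:
                row = [1.0 / sum((dj / dk) ** exponent for dk in dists) for dj in dists]
            memberships.append(row)
        # Each center becomes the mean of the points weighted by membership^m.
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships


# Two well-separated groups of values (hypothetical data).
values = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
final_centers, final_memberships = fuzzy_c_means(values, centers=[0.0, 6.0])
print(final_centers)
```

A recursive variant would re-run a clustering step of this kind inside selected regions, which is where the inclusion and adjacency tests described in the abstract would come in.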

R. Cucchiara; C. Grana ( 2002 ) - Using the Topological Tree for skin lesion structure description ( Sixth International Conference on Knowledge-Based Intelligent Information & Engineering Systems - - Sep 16-18) ( - Knowledge-based Intelligent Information Engineering Systems & Allied Technologies ) (IOS Press/Ohmsha Amsterdam NLD ) - pp. da 166 a 170 ISBN: 9781586032807 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this work we describe the Topological Tree (TT) as a knowledge representation method that relates some important visual and spatial features of image regions, namely the color similarity, the inclusion and the spatial adjacency. Starting from color-based segmentation of an image into disjoint regions, their spatial relationships can be derived and described with graph-based methods. We are interested in the region’s property “to be included into” (in the sense of “surrounded by”) another region. This property could be very useful in biomedical imaging and in particular in the diagnosis of skin melanoma. The TT can be constructed after segmentation, by computing the spatial relationships of regions, or can be generated directly during the segmentation: to this end we present a novel recursive fuzzy c-means (FCM) clustering algorithm based on the PCA of the color space. In the paper, in addition to the TT definition and the construction algorithm description, some results are presented and discussed.

L. CINQUE; R. CUCCHIARA; S. LEVIALDI; G. PIGNALBERI ( 2002 ) - A Decision Support System for Range Image Segmentation ( 3rd International Conference on Digital Information Processing and Control in Ex - - 28-30 May 2002) ( - Proceedings of 3rd International Conference on Digital Information Processing and Control in Ex ) (- - USA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

-

R. Cucchiara; C. Grana; M. Piccardi ( 2002 ) - Iterative fuzzy clustering for detecting regions of interest in skin lesions - AIIA NOTIZIE - n. volume 15 - pp. da 36 a 39 ISSN: - [Articolo in rivista (262) - Articolo su rivista]
Abstract

Image analysis tools have been spreading in dermatology since the introduction of dermoscopy (epiluminescence microscopy), in the effort of algorithmically reproducing clinical evaluations. Color-based region segmentation of skin lesions is one of the key steps for correctly collecting statistics that can help clinicians in their diagnosis. Nevertheless, an efficient and accurate region segmentation algorithm has not been proposed in the literature yet. This work proposes an iterative fuzzy c-means clustering algorithm based on PCA with the Karhunen-Loève transform of the color space. A topological tree is provided to store the mutual inclusions of the regions and then used to summarize the structural properties of the skin lesion. Preliminary experimental results are presented and discussed.

R. Cucchiara; C. Grana; A. Prati; S. Seidenari; G. Pellacani ( 2002 ) - Building the Topological Tree by Recursive FCM Color Clustering ( 16th International Conference on Pattern Recognition - - Aug 11-15) ( - Proceedings of the 16th International Conference on Pattern Recognition ) (IEEE Computer Society Los Alamitos, CA USA ) - n. volume 1 - pp. da 759 a 762 ISBN: 9780769516967 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we define a Topological Tree (TT) as a knowledge representation method that aims to describe important visual and spatial features of image regions, namely the color similarity, the inclusion and the spatial adjacency. The topological tree exhibits some interesting properties that can be exploited to extract knowledge from images for information retrieval, image understanding and diagnosis purposes. Examples of applications in dermatology are described. The TT can be constructed after segmentation, by computing the spatial relationships of regions or can be generated directly during the segmentation: to this aim we present a novel recursive fuzzy c-means (FCM) clustering algorithm based on the Principal Component Analysis of the color space. The recursive FCM proves to be effective for underlining the adjacency and inclusion property of regions.

R. Cucchiara; C. Grana; A. Prati ( 2002 ) - Semantic Transcoding for Live Video Server ( Tenth ACM international conference on Multimedia - - Dec 1-6) ( - Proceedings of the tenth ACM international conference on Multimedia ) (ACM New York USA ) - pp. da 223 a 226 ISBN: 9781581136203 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we present transcoding techniques for a video server architecture that enables the user to access live video streams by using different devices with different capabilities. For live videos, annotation methods cannot be exploited. Instead we propose methods of on-the-fly transcoding that adapt the video content with respect to the user resources and the video semantics. Thus we propose an object-based transcoding with "classes of relevance" (for instance People, Face and Background). To compare the different strategies we propose a metric based on the Weighted Mean Square Error that allows the analysis of different application scenarios by means of a class-wise distortion measure. The obtained results show that the use of semantics can improve the bandwidth to distortion ratio significantly.

R. Cucchiara; C. Grana; A. Prati ( 2002 ) - Detecting Moving Objects and their Shadows: An Evaluation with the PETS2002 Dataset ( Third IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS’2002) - - Jun 1) ( - Proceedings of the Third IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS’2002) ) (James M. Ferryman Reading, UK GBR ) - pp. da 18 a 25 ISBN: 076951698X ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This work presents a general-purpose method for moving visual object segmentation in videos and discusses results attained on sequences of the PETS2002 datasets. The proposed approach, called Sakbot, exploits color and motion information to detect objects, shadows and ghosts, i.e. foreground objects with apparent motion. The method is based on background suppression in the color space. The main peculiarity of the approach is the exploitation of motion and shadow information to selectively update the background, improving the statistical background model with the knowledge of detected objects. The approach is able to detect Moving Visual Objects (MVOs), and stopped objects too, since the motion status is maintained at the level of the tracking module. The HSV color space is exploited for shadow detection in order to enhance both segmentation and background update. Time measures and a precision performance analysis in tracking and counting people are provided for surveillance and monitoring purposes.
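
To give a feel for how HSV information can separate shadows from objects, the sketch below tests whether a pixel looks like a darkened copy of the background: lower brightness, similar saturation, similar hue. The abstract only states that HSV is exploited; the specific rule, the function name `is_shadow` and all threshold values here are assumptions for illustration, not the settings used by Sakbot.

```python
import colorsys


def is_shadow(pixel_rgb, background_rgb, alpha=0.4, beta=0.9, tau_s=0.15, tau_h=0.1):
    """Return True if the pixel resembles a shadowed version of the background.

    alpha, beta: assumed bounds on the brightness ratio V_pixel / V_background.
    tau_s, tau_h: assumed tolerances on saturation and hue differences.
    RGB components are expected in the range 0..255.
    """
    h_p, s_p, v_p = colorsys.rgb_to_hsv(*[c / 255.0 for c in pixel_rgb])
    h_b, s_b, v_b = colorsys.rgb_to_hsv(*[c / 255.0 for c in background_rgb])
    if v_b == 0.0:
        return False  # a black background cannot be darkened further
    ratio = v_p / v_b
    hue_diff = abs(h_p - h_b)
    hue_diff = min(hue_diff, 1.0 - hue_diff)  # hue is circular
    return alpha <= ratio <= beta and abs(s_p - s_b) <= tau_s and hue_diff <= tau_h


background = (200, 180, 160)
shadowed = (120, 108, 96)   # background scaled by 0.6: darker, same chromaticity
blue_object = (30, 60, 200)  # different hue, not darker

print(is_shadow(shadowed, background), is_shadow(blue_object, background))
```

Pixels flagged this way could be excluded from the foreground mask and handled separately during background update, in line with the selective update described in the abstract.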

F. CAVALLI; R. CUCCHIARA; M. PICCARDI; A. PRATI ( 2002 ) - Performance analysis of MPEG-4 decoder and encoder ( IEEE Region 8 International Symposium on Video/Image Processing and Multimedia Communications - - 16-19 June 2002) ( - Proceedings VIPromCom-2002 ) (Croatian Society Electronics in Marine - Elmar Zadar HRV ) - pp. da 227 a 231 ISBN: 9789537044015 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, a performance analysis of MPEG-4 encoder and decoder programs on a standard personal computer is presented. The paper first describes the MPEG-4 computational load and discusses related works, then outlines the performance analysis. Experimental results show that, while the decoder program can easily be executed in real time, the encoder requires execution times on the order of seconds per frame, which calls for substantial optimisation to satisfy real-time constraints.

R. Cucchiara; C. Grana; M. Piccardi ( 2001 ) - Iterative fuzzy clustering for detecting regions of interest in skin lesions ( Workshop su "Intelligenza Artificiale, Visione e Pattern Recognition" - - Sep 24) ( - Atti del Workshop su "Intelligenza Artificiale, Visione e Pattern Recognition" ) (- - ITA ) - pp. da 31 a 38 ISBN: 0000000000 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Image analysis tools have been spreading in dermatology since the introduction of dermoscopy (epiluminescence microscopy), in the effort of algorithmically reproducing clinical evaluations. Color-based region segmentation of skin lesions is one of the key steps for correctly collecting statistics that can help clinicians in their diagnosis. Nevertheless, an efficient and accurate region segmentation algorithm has not been proposed in the literature yet. This work proposes an iterative fuzzy c-means clustering algorithm based on PCA with the Karhunen-Loève transform of the color space. A topological tree is provided to store the mutual inclusions of the regions and then used to summarize the structural properties of the skin lesion. Preliminary experimental results are presented and discussed.

R. Cucchiara; C. Grana; G. Neri; M. Piccardi; A. Prati ( 2001 ) - The Sakbot system for moving object detection and tracking ( - Video-Based Surveillance Systems: Computer Vision and Distributed Processing ) (Springer Heidelberg DEU ) - pp. da 145 a 158 ISBN: 9780792376323 ISSN: - [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

This paper presents Sakbot, a system for moving object detection in traffic monitoring and video surveillance applications. The system is endowed with robust and efficient detection techniques, whose main features are the statistical and knowledge-based background update and the use of HSV color information for shadow suppression. Tracking is provided by a symbolic reasoning module allowing flexible object tracking over a variety of different applications. This system proves effective in many different situations, both from the point of view of the scene appearance and the purpose of the application.

A. PRATI; R. CUCCHIARA; I. MIKIC; MM TRIVEDI ( 2001 ) - Analysis and detection of shadows in video streams: a comparative evaluation ( IEEE-CS Computer Vision and Pattern Recognition conference - - 8-14 December 2001) ( - Proceedings of IEEE-CS Computer Vision and Pattern Recognition conference ) (IEEE Computer Society Los Alamitos, California USA ) - n. volume 2 - pp. da 571 a 576 ISBN: 9780769512723 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Robustness to changes in illumination conditions as well as viewing perspectives is an important requirement for many computer vision applications. One of the key factors in enhancing the robustness of dynamic scene analysis is an accurate and reliable means of shadow detection. Shadow detection is critical for correct object detection in image sequences. Many algorithms that deal with shadows have been proposed in the literature. However, a comparative evaluation of the existing approaches is still lacking. In this paper, the full range of problems underlying shadow detection is identified and discussed. We classify the proposed solutions to this problem using a taxonomy of four main classes: deterministic model-based, deterministic non-model-based, statistical parametric and statistical nonparametric. Novel quantitative (detection and discrimination accuracy) and qualitative metrics (scene and object independence, flexibility to shadow situations and robustness to noise) are proposed to evaluate these classes of algorithms on a benchmark suite of indoor and outdoor video sequences.

R. Cucchiara; C. Grana; M. Piccardi; A. Prati ( 2001 ) - Detecting objects, shadows and ghosts in video streams by exploiting color and motion information ( 11th International Conference on Image Analysis and Processing (ICIAP 2001) - - Sep 26-28) ( - Proceedings of the 11th International Conference on Image Analysis and Processing (ICIAP 2001) ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 360 a 365 ISBN: 9780769511832 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Many approaches to moving object detection for traffic monitoring and video surveillance proposed in the literature are based on background suppression methods. How to correctly and efficiently update the background model and how to deal with shadows are two of the most distinguishing and challenging features of such approaches. This work presents a general-purpose method for segmentation of moving visual objects (MVOs) based on an object-level classification into MVOs, ghosts and shadows. Background suppression needs a background model to be estimated and updated: we use motion and shadow information to selectively exclude from the background model MVOs and their shadows, while retaining ghosts. The color information (in the HSV color space) is exploited for shadow suppression and, consequently, to enhance both MVO segmentation and background update.
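
As an illustration of how HSV information can separate shadows from moving objects, the following is a minimal sketch and not the method of the paper. It assumes that a cast shadow darkens the background by a bounded factor while leaving hue and saturation almost unchanged. The function name is_shadow and the thresholds alpha, beta, tau_s and tau_h are hypothetical values chosen for the example.

```python
import colorsys


def is_shadow(pixel_rgb, background_rgb, alpha=0.4, beta=0.9, tau_s=0.1, tau_h=0.1):
    """Return True if a pixel looks like a shadow cast on the background.

    Both colours are (r, g, b) tuples with components in [0, 1].
    All thresholds are illustrative assumptions, not values from the paper.
    """
    h_i, s_i, v_i = colorsys.rgb_to_hsv(*pixel_rgb)
    h_b, s_b, v_b = colorsys.rgb_to_hsv(*background_rgb)
    if v_b == 0:
        return False  # a black background cannot be darkened any further
    ratio = v_i / v_b
    # Hue is an angle mapped to [0, 1), so measure the distance around the circle.
    dh = abs(h_i - h_b)
    dh = min(dh, 1.0 - dh)
    return alpha <= ratio <= beta and abs(s_i - s_b) <= tau_s and dh <= tau_h


background = (0.8, 0.6, 0.4)
darkened = (0.48, 0.36, 0.24)   # same colour at 60% brightness
vehicle = (0.1, 0.2, 0.9)       # different hue, brighter than the background

shadow_flag = is_shadow(darkened, background)
vehicle_flag = is_shadow(vehicle, background)
unchanged_flag = is_shadow(background, background)
```

With these thresholds the darkened pixel is flagged as shadow, while the differently coloured pixel and the unchanged background pixel are not.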

R. Cucchiara; C. Grana; G. Neri; M. Piccardi; A. Prati ( 2001 ) - The Sakbot system for moving object detection and tracking ( 2nd European Workshop on Advanced Video-Based Surveillance Systems - - Sep 4) ( - Proceedings of 2nd European Workshop on Advanced Video-Based Surveillance Systems ) (- - GBR ) - pp. da 159 a 171 ISBN: 0000000000 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper presents Sakbot, a system for moving object detection and tracking in traffic monitoring and video surveillance applications. The system is endowed with robust and efficient detection techniques, whose main features are the statistical and knowledge-based background update and the use of HSV color information for shadow suppression. Tracking is performed by means of a flexible tracking module based on symbolic reasoning, which can be tuned to several different applications.

R. CUCCHIARA; M. PICCARDI; P. MELLO ( 2000 ) - Image Analysis and Rule-Based Reasoning for a Traffic Monitoring (IEEE / Institute of Electrical and Electronics Engineers Incorporated:445 Hoes Lane:Piscataway, NJ 08854:(800)701-4333, (732)981-0060, EMAIL: subscription-service@ieee.org, INTERNET: http://www.ieee.org, Fax: (732)981-9667 ) - IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS - n. volume 1(2) - pp. da 119 a 130 ISSN: 1524-9050 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The paper presents an approach for detecting vehicles in urban traffic scenes by means of rule-based reasoning on visual data. The strength of the approach is its formal separation between the low-level image processing modules (used for extracting visual data under various illumination conditions) and the high-level module, which provides a general-purpose knowledge-based framework for tracking vehicles in the scene. The image-processing modules extract visual data from the scene by spatio-temporal analysis during daytime, and by morphological analysis of headlights at night. The high-level module is designed as a forward chaining production rule system, working on symbolic data, i.e., vehicles and their attributes (area, pattern, direction, and others) and exploiting a set of heuristic rules tuned to urban traffic conditions. The synergy between the artificial intelligence techniques of the high-level module and the low-level image analysis techniques provides the system with flexibility and robustness.

R. CUCCHIARA; M. PICCARDI; A. PRATI ( 2000 ) - Hardware prefetching techniques for cache memories in multimedia applications ( International Workshop on Computer Architectures for Machine Perception - - 11-13 September 2000) ( - Proceedings of International Workshop on Computer Architectures for Machine Perception ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 311 a 319 ISBN: 0 7695 0740 9 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The workload of multimedia applications has a strong impact on cache memory performance, since the locality of memory references embedded in multimedia programs differs from that of traditional programs. In many cases, standard cache memory organization achieves poorer performance when used for multimedia. A widely explored approach to improve cache performance is hardware prefetching, which allows the pre-loading of data in the cache before they are referenced. However, existing hardware prefetching approaches partially miss the potential performance improvement, since they are not tailored to multimedia locality. In this paper we propose novel effective approaches to hardware prefetching to be used in image processing programs for multimedia. Experimental results are reported for a suite of multimedia image processing programs including convolutions with kernels, MPEG-2 decoding, and edge chain coding.

R. CUCCHIARA; M. PICCARDI; A. PRATI ( 2000 ) - Focus based Feature Extraction for Pallets Recognition ( British Machine Vision Conference - - 11-14 September 2000) ( - Proceedings of British Machine Vision Conference ) (IEE London GBR ) - n. volume 2 - pp. da 695 a 704 ISBN: 1 901725 13 8 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Visual recognition for object grasping is a well-known challenge for robot automation in industrial applications. A typical example is pallet recognition in an industrial environment for automated pick-and-place processes. The aim of vision and reasoning algorithms is to help robots choose the best location of the pallet holes. This work proposes an application-based approach, which fulfils all requirements, dealing with every possible kind of occlusion and lighting situation. Even some "meaning noise" (or "meaning misunderstanding") is considered. A pallet model, with limited degrees of freedom, is described and, starting from it, a complete approach to pallet recognition is outlined. In the model we define both virtual and real corners, which are geometrical object properties computed by different image analysis operators. Real corners are perceived by processing brightness information directly from the image, while virtual corners are inferred at a higher level of abstraction. A final reasoning stage selects the best solution fitting the model. Experimental results and performance are reported in order to demonstrate the suitability of the proposed approach.

R. Cucchiara; C. Grana; M. Piccardi; A. Prati ( 2000 ) - Statistic and knowledge-based moving object detection in traffic scenes ( 3rd IEEE Conference on Intelligent Transportation Systems - - Oct 1-3) ( - Proceedings of the 3rd IEEE Conference on Intelligent Transportation Systems ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 27 a 32 ISBN: 9780780359710 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The most common approach used for vision-based traffic surveillance consists of a fast segmentation of moving visual objects (MVOs) in the scene together with an intelligent reasoning module capable of identifying, tracking and classifying the MVOs depending on the system goal. In this paper we describe our approach for MVO segmentation in an unstructured traffic environment. We consider complex situations with moving people, vehicles and infrastructures that have different aspect and motion models. In this case we define a specific approach based on background subtraction with statistic and knowledge-based background update. We show many results of real-time tracking of traffic MVOs in outdoor traffic scenes such as roads, parking area intersections, and entrances with barriers.

R. Cucchiara; M. Gavanelli; E. Lamma; P. Mello; M. Milano; M. Piccardi ( 1999 ) - Extending CLP(FD) with Interactive Data Acquisition for 3D Visual Object Recognition ( First International Conference on the Practical Application of Constraint Technologies and Logic Programming - - Apr 19-21) ( - Proceedings of the First International Conference on the Practical Application of Constraint Technologies and Logic Programming ) (The Practical Application Company Blackpool, UK GBR ) - pp. da 137 a 155 ISBN: 978 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

-

R. CUCCHIARA; M. PICCARDI ( 1999 ) - Vehicle Detection under Day and Night Illumination ( International Symposia on Intelligent Industrial Automation - - June 1-4) ( - Proceedings of the International Symposia on Intelligent Industrial Automation ) (Academic Press Rochester, NY, USA USA ) - pp. da 789 a 784 ISBN: 9783906454160 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Effective detection of vehicles in urban traffic scenes can be achieved by exploiting image analysis techniques. Nevertheless, vehicle detection in daytime and at night cannot be approached with the same image analysis algorithms, due to the strongly different illumination conditions. This paper describes the two different sets of image analysis algorithms that have been used in the VTTS system (Vehicular Traffic Tracking System) for extracting vehicles from image sequences acquired in daytime and at night. In the system, a supervising level selects the set of algorithms to apply and performs vehicle tracking under control of a rule-based decision module. The paper describes the tracking module, and reports experimental results for both vehicle detection and tracking.

R. CUCCHIARA; M. PICCARDI; P. MELLO ( 1999 ) - Image Analysis and Rule-Based Reasoning for a Traffic Monitoring ( IEEE/IEEJ/JSAI International Conference on Intelligent Transportation Systems - - Oct. 5-8) ( - Proceedings of the IEEE/IEEJ/JSAI International Conference on Intelligent Transportation Systems ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 758 a 763 ISBN: 9780780349759 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The paper describes a system for detecting vehicles in urban traffic scenes in daytime and at night by means of image analysis and rule-based reasoning. The strength of the proposed approach is its formal separation between the low-level image processing modules (detecting moving vehicles under day and night light) and the high-level module, which provides a single framework for tracking vehicles in the scene. The image processing modules perform spatio-temporal analysis on moving templates in daytime images, and morphological analysis of headlight pairs in night images. The high-level module is designed as a forward chaining production rule system, working on symbolic data, i.e. vehicles and their attributes, and exploiting a set of heuristic rules tuned to urban traffic conditions. The synergy between the artificial intelligence techniques of the high-level module and the low-level image analysis techniques provides the system with flexibility and robustness.

E. LAMMA; P. MELLO; M. MILANO; R. CUCCHIARA; G. GAVANELLI; M. PICCARDI ( 1999 ) - Constraint Propagation and Value Acquisition: why we should do it Interactively ( Sixteenth International Joint Conference on Artificial Intelligence (IJCAI99) - - July 31 - Aug. 6) ( - Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence ) (Morgan Kaufmann Publishers Inc. San Francisco, CA USA ) - pp. da 468 a 477 ISBN: 9781558606135 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In Constraint Satisfaction Problems (CSPs) values belonging to variable domains should be completely known before the constraint propagation process starts. In many applications, however, the acquisition of domain values is a computationally expensive process, or some domain values may not be available at the beginning of the computation. For this purpose, we introduce an Interactive Constraint Satisfaction Problem (ICSP) model as an extension of the widely used CSP model. The variable domain values can be acquired when needed during the resolution process by means of Interactive Constraints, which retrieve (possibly consistent) information. Experimental results on randomly generated CSPs and for 3D object recognition show the effectiveness of the proposed approach.

R. CUCCHIARA; M. PICCARDI; A. PRATI ( 1999 ) - Exploiting Cache in Multimedia ( IEEE International Conference on Multimedia Computing and Systems (ICMCS) - - 7-11 June 1999) ( - International Conference on Multimedia Computing and Systems ) (IEEE Computer Society Los Alamitos, California USA ) - n. volume 1 - pp. da 345 a 350 ISBN: 9780769502533 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The paper explores cache strategies for multimedia. Although many architectural improvements have been designed for multimedia, the cache structure and the standard caching policies of general-purpose processors exhibit poor performance in exploiting the 2D spatial locality typical of programs handling and processing images. In this paper we propose a novel caching approach tailored to the requirements of multimedia programs. Our proposal exploits hardware pre-fetching for allocating in cache blocks of data that satisfy the 2D spatial locality requirements. Results refer to a benchmark suite of multimedia programs including MPEG decoding and image processing programs with different data dependencies and access schemes to image data.

R. CUCCHIARA; M. PICCARDI ( 1999 ) - Eliciting Visual Primitives for Detecting Elongated Shapes - IMAGE AND VISION COMPUTING - n. volume 17(5) - pp. da 347 a 355 ISSN: 0262-8856 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Elsevier eds

R. CUCCHIARA; P. ONFIANI; A. PRATI; N. SCARABOTTOLO ( 1999 ) - Segmentation of Moving Objects at Frame Rate: A Dedicated Hardware Solution ( IEE Conf. on Image Processing and its Applications (IPA) - - 13-15 July 1999) ( - Proceedings of IEE Conf. on Image Processing and its Applications ) (IEE London GBR ) - n. volume 1 - pp. da 138 a 142 ISBN: 0 85296 717 9 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Many works in image processing concern segmentation of moving objects in sequences of images. This problem is particularly critical, since it represents the first step of many complex processes of computer vision, for applications like object tracking, video-surveillance, monitoring, and autonomous navigation. In such applications, both real-time and low-cost requirements should be satisfied. To this aim we propose a dedicated hardware solution, based on reconfigurable logic, that provides motion detection and moving object segmentation at frame rate.

R. CUCCHIARA; M. PICCARDI; A. PRATI; N. SCARABOTTOLO ( 1999 ) - Real-time Detection of Moving Vehicles ( International Conference on Image Analysis and Processing (ICIAP) - - 27-29 September 1999) ( - Proceedings of International Conference on Image Analysis and Processing ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 618 a 623 ISBN: 0 7695 0040 4 ISSN: - [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Computer vision-based traffic flow monitoring is of major importance for enforcing traffic management policies. Information such as the number of vehicles passing on a road per time unit, or vehicles' turning rates at intersections, is exploited by traffic management policies to supervise traffic-light timings. Computer vision-based traffic flow monitoring requires extraction of moving vehicles from traffic scenes in real time. To accomplish this task, efficient algorithms must be used and effective, low-cost hardware implementation must be pursued. This paper first describes the algorithms used in VTTS (Vehicular Traffic Tracking System) to achieve segmentation of moving vehicles. Then, hardware implementation on a re-programmable FPGA-based board is described in detail.

R. CUCCHIARA; G. NERI; M. PICCARDI ( 1998 ) - A real-time hardware implementation of the hough transform (Elsevier BV:PO Box 211, 1000 AE Amsterdam Netherlands:011 31 20 4853757, 011 31 20 4853642, 011 31 20 4853641, EMAIL: nlinfo-f@elsevier.nl, INTERNET: http://www.elsevier.nl, Fax: 011 31 20 4853598 ) - JOURNAL OF SYSTEMS ARCHITECTURE - n. volume 45 (1) - pp. da 31 a 45 ISSN: 1383-7621 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The paper presents a hardware implementation of algorithms based on the Hough transform (HT) for real-time straight line detection. In particular, the basic HT on the edge points (EHT) and the Gradient-Weighted Hough transform (GWHT) for gray-level images are analyzed in detail and implemented on a pipelined architecture using Field Programmable Gate Arrays (FPGA). Algorithm execution times are compared with other hardware and software based systems in order to assess the efficiency of the presented approach. The paper shows how the achievable performance can meet the real-time requirements of an industrial inspection application.
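
For readers unfamiliar with the basic Hough transform on edge points, the following is a minimal software sketch of the standard voting scheme in the (rho, theta) parameter space. It is not the pipelined FPGA architecture of the paper and does not cover the gradient-weighted variant. The function name hough_lines and the parameters n_theta and rho_step are hypothetical choices made for the example.

```python
import math
from collections import Counter


def hough_lines(edge_points, n_theta=180, rho_step=1.0):
    """Accumulate votes for lines rho = x*cos(theta) + y*sin(theta).

    Returns a Counter mapping (rho_bin, theta_bin) to the number of votes.
    theta_bin t corresponds to the angle pi * t / n_theta.
    """
    acc = Counter()
    for x, y in edge_points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = x * math.cos(theta) + y * math.sin(theta)
            # Each edge point votes once for every quantised line through it.
            acc[(round(rho / rho_step), t)] += 1
    return acc


# Synthetic edge map: 100 points on the horizontal line y = 5.
points = [(x, 5) for x in range(100)]
accumulator = hough_lines(points)
peak, votes = accumulator.most_common(1)[0]
```

The accumulator cell with the most votes identifies the dominant line. A real-time implementation such as the one described in the abstract would replace the nested loops with parallel or pipelined hardware.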

R. CUCCHIARA ( 1998 ) - Genetic algorithms for clustering in machine vision - MACHINE VISION AND APPLICATIONS - n. volume 11(1) - pp. da 1 a 6 ISSN: 0932-8092 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The paper presents a genetic algorithm for clustering objects in images based on their visual features. In particular, a novel solution code (named Boolean Matching Code) and a corresponding reproduction operator (the Single Gene Crossover) are defined specifically for clustering and are compared with other standard genetic approaches. The paper describes the clustering algorithm in detail, in order to show the suitability of the genetic paradigm and underline the importance of effective tuning of algorithm parameters to the application. The algorithm is evaluated on some test sets and an example of its application in automated visual inspection is presented.

R. CUCCHIARA; F. FILICORI ( 1998 ) - The Vector-Gradient Hough Transform (IEEE / Institute of Electrical and Electronics Engineers Incorporated:445 Hoes Lane:Piscataway, NJ 08854:(800)701-4333, (732)981-0060, EMAIL: subscription-service@ieee.org, INTERNET: http://www.ieee.org, Fax: (732)981-9667 ) - IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE - n. volume 20 (7) - pp. da 746 a 751 ISSN: 0162-8828 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The paper presents a new transform, called vector-gradient Hough transform, for identifying elongated shapes in gray-scale images. This goal is achieved not only by collecting information on the edges of the objects, but also by reconstructing their transversal profile of luminosity. The main features of the new approach are related to its vector space formulation and the associated capability of exploiting all the vector information of the luminosity gradient