
Personal page of Sonia BERGAMASCHI

Department of Engineering "Enzo Ferrari"

Bergamaschi, Sonia; Gagliardelli, Luca; Simonini, Giovanni; Zhu, Song (2017) - BigBench workload executed by using Apache Flink (27th International Conference on Flexible Automation and Intelligent Manufacturing, FAIM2017 - Modena - June 27-30, 2017) (Procedia Manufacturing) (Elsevier) - vol. 11 - pp. 695-702 [Conference paper (273)]
Abstract

Many of the challenges that have to be faced in Industry 4.0 involve the management and analysis of huge amounts of data (e.g. sensor data management and machine-fault prediction in industrial manufacturing, web-log analysis in e-commerce). To handle the so-called Big Data management and analysis, a plethora of frameworks has been proposed in the last decade. Many of them focus on the parallel processing paradigm, such as MapReduce, Apache Hive, and Apache Flink. However, in this jungle of frameworks, the performance evaluation of these technologies is not a trivial task, and strictly depends on the application requirements. The scope of this paper is to compare two of the most employed and promising frameworks for managing big data: Apache Flink and Apache Hive, which are general-purpose distributed platforms under the umbrella of the Apache Software Foundation. To evaluate these two frameworks we use the BigBench benchmark, developed for Apache Hive. We re-implemented the most significant queries of the Apache Hive BigBench to make them work on Apache Flink, in order to be able to compare the results of the same queries executed on both frameworks. Our results show that Apache Flink, when properly configured, is able to outperform Apache Hive.
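The parallel processing paradigm the abstract refers to can be illustrated with a minimal map/reduce sketch. This is a toy single-process illustration, not Flink or Hive code: frameworks like these distribute the two phases across a cluster, while here both run sequentially; the sensor-reading data is invented for the example.

```python
from collections import defaultdict

def map_phase(records):
    # Emit (key, value) pairs, e.g. (sensor_id, reading).
    return [(sensor, reading) for sensor, reading in records]

def reduce_phase(pairs):
    # Group by key and aggregate; here: average reading per sensor.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

readings = [("s1", 2.0), ("s2", 4.0), ("s1", 6.0)]
averages = reduce_phase(map_phase(readings))
print(averages)  # {'s1': 4.0, 's2': 4.0}
```

In a distributed engine the map phase runs in parallel over data partitions and the grouping step becomes a network shuffle; the benchmark queries compared in the paper are far richer, but follow this same structure.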

Beneventano, Domenico; Bergamaschi, Sonia; Gagliardelli, Luca; Po, Laura (2017) - Driving Innovation in Youth Policies With Open Data (Communications in Computer and Information Science) (Springer Verlag) - vol. 631 - pp. 324-344 ISBN: 9783319527574 ISSN: 1865-0929 [Book chapter (268)]
Abstract

In December 2007, thirty activists held a meeting in California to define the concept of open public data. For the first time, eight Open Government Data (OGD) principles were established: OGD should be Complete, Primary (reporting data at a high level of granularity), Timely, Accessible, Machine-processable, Non-discriminatory, Non-proprietary, and License-free. Since the inception of the Open Data philosophy there has been a constant increase in the information released, improving the communication channel between public administrations and their citizens. Open data offers governments, companies, and citizens information to make better decisions. We claim that Public Administrations, which are the main producers and among the consumers of Open Data, can effectively extract important information by integrating their own data with open data sources. This paper reports the activities carried out during a research project on Open Data for Youth Policies. The project was devoted to exploring the youth situation in the municipalities and provinces of the Emilia Romagna region (Italy), in particular examining data on population, education, and work. We identified interesting data sources related to Youth Policies, both from the open data community and from the private repositories of local governments. The selected sources were integrated, and the result of the integration was presented by means of a navigator tool. In the end, we published new information on the web as Linked Open Data. Since the process applied and the tools used are generic, we trust this paper can serve as an example and a guide for new projects that aim to create new knowledge through Open Data.

Bergamaschi, Sonia; Beneventano, Domenico; Mandreoli, Federica; Martoglia, Riccardo; Guerra, Francesco; Orsini, Mirko; Po, Laura; Vincini, Maurizio; Simonini, Giovanni; Zhu, Song; Gagliardelli, Luca; Magnotta, Luca (2017) - From Data Integration to Big Data Integration (A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years) (Springer International Publishing) - vol. 31 - pp. 43-59 ISBN: 9783319618920 ISSN: 2197-6511 [Book chapter (268)]
Abstract

The Database Group (DBGroup, www.dbgroup.unimore.it) and Information System Group (ISGroup, www.isgroup.unimore.it) research activities have been mainly devoted to the Data Integration research area. The DBGroup designed and developed the MOMIS data integration system, giving rise to a successful innovative enterprise, DataRiver (www.datariver.it), which distributes MOMIS as open source. MOMIS provides integrated access to structured and semi-structured data sources and allows a user to pose a single query and to receive a single unified answer. Description Logics, automatic annotation of schemata, and clustering techniques constitute the theoretical framework. In the context of data integration, the ISGroup addressed problems related to the management and querying of heterogeneous data sources in large-scale and dynamic scenarios. The reference architectures are Peer Data Management Systems and their evolutions toward dataspaces. In these contexts, the ISGroup proposed and evaluated effective and efficient mechanisms for network creation with limited information loss, and solutions for mapping management, query reformulation and processing, and query routing. The main issues of data integration have been faced: automatic annotation, mapping discovery, global query processing, provenance, multidimensional information integration, and keyword search, within European and national projects. With the incoming new requirements of integrating open linked data, textual and multimedia data in a big data scenario, the research has been devoted to the Big Data Integration research area. In particular, the most relevant research results achieved are: a scalable entity resolution method, a scalable join operator, and a tool, LODeX, for automatically extracting metadata from Linked Open Data (LOD) resources and for visual query formulation on LOD resources. Moreover, in collaboration with DataRiver, data integration was successfully applied to smart e-health.

Bergamaschi, Sonia; Carlini, Emanuele; Ceci, Michelangelo; Furletti, Barbara; Giannotti, Fosca; Malerba, Donato; Mezzanzanica, Mario; Monreale, Anna; Pasi, Gabriella; Pedreschi, Dino; Perego, Raffaele; Ruggieri, Salvatore (2016) - Big Data Research in Italy: A Perspective - ENGINEERING - vol. 2(2016) - pp. 163-170 ISSN: 2095-8099 [Journal article (262)]
Abstract

The aim of this article is to synthetically describe the research projects that a selection of Italian universities is undertaking in the context of big data. Far from being exhaustive, this article has the objective of offering a sample of distinct applications that address the issue of managing huge amounts of data in Italy, collected in relation to diverse domains.

Simonini, Giovanni; Bergamaschi, Sonia; Jagadish, H. V. (2016) - BLAST: a Loosely Schema-aware Meta-blocking Approach for Entity Resolution - PROCEEDINGS OF THE VLDB ENDOWMENT - vol. 9 - pp. 1173-1184 ISSN: 2150-8097 [Journal article (262)]
Abstract

Identifying records that refer to the same entity is a fundamental step for data integration. Since it is prohibitively expensive to compare every pair of records, blocking techniques are typically employed to reduce the complexity of this task. These techniques partition records into blocks and limit the comparison to records co-occurring in a block. Generally, to deal with highly heterogeneous and noisy data (e.g. semi-structured data of the Web), these techniques rely on redundancy to reduce the chance of missing matches. Meta-blocking is the task of restructuring blocks generated by redundancy-based blocking techniques, removing superfluous comparisons. Existing meta-blocking approaches rely exclusively on schema-agnostic features. In this paper, we demonstrate how “loose” schema information (i.e., statistics collected directly from the data) can be exploited to enhance the quality of the blocks in a holistic loosely schema-aware (meta-)blocking approach that can be used to speed up your favorite Entity Resolution algorithm. We call it Blast (Blocking with Loosely-Aware Schema Techniques). We show how Blast can automatically extract this loose information by adopting an LSH-based step for efficiently scaling to large datasets. We experimentally demonstrate, on real-world datasets, how Blast outperforms the state-of-the-art unsupervised meta-blocking approaches, and, in many cases, also the supervised one.
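The redundancy-based blocking that Blast builds upon can be sketched concretely. The following is a hedged illustration of plain token blocking (each token becomes a block key, and records sharing a token co-occur in a block), not the Blast algorithm itself; the product records are invented for the example. Meta-blocking would then prune the superfluous candidate pairs this step produces.

```python
from collections import defaultdict
from itertools import combinations

records = {
    1: "apple iphone 13 smartphone",
    2: "iphone 13 by apple",
    3: "samsung galaxy smartphone",
}

# Blocking: every token is a block key; records sharing a token co-occur.
blocks = defaultdict(set)
for rid, text in records.items():
    for token in text.split():
        blocks[token].add(rid)

# Candidate pairs = record pairs co-occurring in at least one block;
# without blocking, all 3 pairs would have to be compared.
candidates = set()
for ids in blocks.values():
    candidates.update(combinations(sorted(ids), 2))

print(sorted(candidates))  # [(1, 2), (1, 3)]
```

Note the redundancy: records 1 and 2 co-occur in several blocks ("apple", "iphone", "13"), which is exactly the kind of superfluous comparison meta-blocking removes by weighting and pruning the co-occurrence graph.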

BERGAMASCHI, Sonia; INTERLANDI, Matteo; GUERRA, Francesco; TRILLO LADO, Raquel; VELEGRAKIS, Yannis (2016) - Combining User and Database Perspective for Solving Keyword Queries over Relational Databases - INFORMATION SYSTEMS - vol. 55 - pp. 1-19 ISSN: 0306-4379 [Journal article (262)]
Abstract

Over the last decade, keyword search over relational data has attracted considerable attention. A possible approach to face this issue is to transform keyword queries into one or more SQL queries to be executed by the relational DBMS. Finding these queries is a challenging task, since the information they represent may be modeled across different tables and attributes. This means that it is necessary not only to identify the schema elements where the data of interest is stored, but also to find out how these elements are interconnected. All the approaches that have been proposed so far provide a monolithic solution. In this work, we instead divide the problem into three steps: the first, driven by the user's point of view, takes into account what the user has in mind when formulating keyword queries; the second, driven by the database perspective, considers how the data is represented in the database schema; the third combines these two processes. We present the theory behind our approach, and its implementation in a system called QUEST (QUEry generator for STructured sources), which has been thoroughly tested to show the efficiency and effectiveness of our approach. Furthermore, we report on the outcomes of a number of experiments that we have conducted.

BERGAMASCHI, Sonia; BENEVENTANO, Domenico; BENEDETTI, FABIO (2016) - Context Semantic Analysis: A Knowledge-Based Technique for Computing Inter-document Similarity (9th International Conference on Similarity Search and Applications (SISAP) - Tokyo, Japan - October 24-26, 2016) (Similarity Search and Applications) (Springer Verlag) - vol. 9939 - pp. 164-178 ISBN: 9783319467597 ISSN: 1611-3349 [Conference paper (273)]
Abstract

We propose a novel knowledge-based technique for inter-document similarity, called Context Semantic Analysis (CSA). Several specialized approaches built on top of specific knowledge bases (e.g. Wikipedia) exist in the literature, but CSA differs from them because it is designed to be portable to any RDF knowledge base. Our technique relies on a generic RDF knowledge base (e.g. DBpedia or Wikidata) to extract a vector able to represent the context of a document. We show how such a Semantic Context Vector can be effectively exploited to compute inter-document similarity. Experimental results show that our general technique outperforms baselines built on top of traditional methods, and achieves a performance similar to that of specialized methods.

Beneventano, Domenico; Bergamaschi, Sonia; Martoglia, Riccardo (2016) - Exploiting Semantics for Searching Agricultural Bibliographic Data - JOURNAL OF INFORMATION SCIENCE - vol. 42 - pp. 748-762 ISSN: 0165-5515 [Journal article (262)]
Abstract

Filtering and search mechanisms that make it possible to identify key bibliographic references are fundamental for researchers. In this paper we propose a fully automatic and semantic method for filtering/searching bibliographic data, which allows users to look for information by specifying simple keyword queries or document queries, i.e. by simply submitting existing documents to the system. The limitations of standard techniques, based either on syntactic text search or on manually assigned descriptors, are overcome by considering the semantics intrinsically associated with the document/query terms; to this aim, we exploit different kinds of external knowledge sources (both general and specific domain dictionaries or thesauri). The proposed techniques have been developed and successfully tested on agricultural bibliographic data, where the AGROVOC thesaurus plays a central role in enabling researchers and policy makers to retrieve related agricultural and scientific information.
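The role a thesaurus like AGROVOC plays here can be sketched with a minimal query-expansion example. This is a hedged illustration under invented data: the tiny `thesaurus` dictionary is a toy stand-in, not AGROVOC content, and real systems would use concept identifiers and word-sense disambiguation rather than raw substring matching.

```python
# Toy thesaurus: a term mapped to its synonyms / related terms.
thesaurus = {
    "maize": {"corn", "zea mays"},
    "drought": {"water scarcity"},
}

def expand(query_terms):
    # Add every thesaurus entry for each query term.
    expanded = set(query_terms)
    for term in query_terms:
        expanded |= thesaurus.get(term, set())
    return expanded

documents = {
    "doc1": "effects of water scarcity on zea mays yield",
    "doc2": "wheat harvesting machinery",
}

query = expand({"maize", "drought"})
hits = [doc for doc, text in documents.items()
        if any(term in text for term in query)]
print(hits)  # ['doc1']
```

Without expansion, a syntactic search for "maize" and "drought" would miss doc1 entirely, which is the limitation of standard techniques the abstract points out.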

Bergamaschi, Sonia; Ferro, Nicola; Guerra, Francesco; Silvello, Gianmaria (2016) - Keyword-Based Search Over Databases: A Roadmap for a Reference Architecture Paired with an Evaluation Framework - TRANSACTIONS ON COMPUTATIONAL COLLECTIVE INTELLIGENCE - vol. 9630 - pp. 1-20 ISSN: 2190-9288 [Journal article (262)]
Abstract

Structured data sources promise to be the next driver of a significant socio-economic impact for both people and companies. Nevertheless, accessing them through formal languages, such as SQL or SPARQL, can become cumbersome and frustrating for end-users. To overcome this issue, keyword search in databases is becoming the technology of choice, even if it suffers from efficiency and effectiveness problems that prevent it from being adopted at Web scale. In this paper, we motivate the need for a reference architecture for keyword search in databases to favor the development of scalable and effective components, also borrowing methods from neighboring fields, such as information retrieval and natural language processing. Moreover, we point out the need for a companion evaluation framework, able to assess the efficiency and the effectiveness of such new systems in the light of real and compelling use cases.

Bergamaschi, Sonia; Ferrari, Davide; Guerra, Francesco; Simonini, Giovanni; Velegrakis, Yannis (2016) - Providing Insight into Data Source Topics - JOURNAL ON DATA SEMANTICS - vol. 5 - pp. 211-228 ISSN: 1861-2032 [Journal article (262)]
Abstract

A fundamental service for the exploitation of the modern large data sources that are available online is the ability to identify the topics of the data that they contain. Unfortunately, the heterogeneity and lack of centralized control make it difficult to identify the topics directly from the actual values used in the sources. We present an approach that generates signatures of sources, which are matched against a reference vocabulary of concepts to generate a description of the topics of the source in terms of this reference vocabulary. The reference vocabulary may be provided ready-made, may be created manually, or may be created by applying our signature-generation algorithm over a well-curated data source with a clear identification of topics. In our particular case, we have used DBpedia for the creation of the vocabulary, since it is one of the largest known collections of entities and concepts. The signatures are generated by exploiting the entropy and the mutual information of the attributes of the sources to generate semantic identifiers of the various attributes, which together form a unique signature of the concepts (i.e. the topics) of the source. The generation of the identifiers is based on the entropy of the values of the attributes; thus, they are independent of the naming heterogeneity of attributes or tables. Although the use of traditional information-theoretical quantities such as entropy and mutual information is not new, they may become untrustworthy due to their sensitivity to overfitting and to the number of samples used to construct the reference vocabulary. To overcome these limitations, we normalize and use pseudo-additive entropy measures, which automatically downweight the role of vocabulary items and property values with very low frequencies, resulting in a more stable solution than the traditional counterparts.
We have materialized our theory in a system called WHATSIT and we experimentally demonstrate its effectiveness.
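The core idea of entropy-based attribute signatures can be shown in a few lines. This sketch uses plain Shannon entropy over invented attribute values; the pseudo-additive, normalized measures the paper introduces to handle low-frequency values are not reproduced here.

```python
import math
from collections import Counter

def shannon_entropy(values):
    # H = -sum p(v) * log2 p(v) over the distinct values of an attribute.
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Two toy attributes: entropy depends only on the value distribution,
# not on the attribute's name, so it survives naming heterogeneity.
city_a = ["Rome", "Milan", "Rome", "Turin"]
status = ["active", "active", "active", "active"]

print(round(shannon_entropy(city_a), 3))  # 1.5
# A constant attribute carries no information: entropy is zero.
print(shannon_entropy(status) == 0.0)  # True
```

An attribute named "city" in one source and "location" in another would still receive similar signatures if their value distributions are similar, which is exactly why the approach is robust to schema naming differences.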

Bergamaschi, Sonia; Po, Laura (2015) - Comparing LDA and LSA Topic Models for Content-Based Movie Recommendation Systems (Web Information Systems and Technologies - 10th International Conference, WEBIST 2014, Barcelona, Spain, April 3-5, 2014, Revised Selected Papers) (Springer Berlin DEU) - vol. 226 - pp. 247-263 ISBN: 9783319270296 ISSN: 1865-1348 [Book chapter (268)]
Abstract

We propose a plot-based recommendation system, which is based upon an evaluation of the similarity between the plot of a video that was watched by a user and a large number of plots stored in a movie database. Our system is independent of the number of user ratings; thus it is able to propose famous and beloved movies as well as old or unheard-of movies/programs that are still strongly related to the content of the video the user has watched. The system implements and compares two topic models, Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA), on a movie database of two hundred thousand plots that was constructed by integrating different movie databases in a local NoSQL (MongoDB) DBMS. The behaviour of the topic models has been examined on the basis of standard metrics and user evaluations, and performance assessments with 30 users have been conducted to compare our tool with a commercial system.
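The plot-similarity idea behind the recommender can be illustrated with a bare-bones sketch. This hedged example compares raw term-frequency vectors with cosine similarity over invented plots; the paper's system instead projects plots into LSA/LDA topic spaces before measuring similarity, which this sketch does not do.

```python
import math
from collections import Counter

def cosine(a, b):
    # Cosine similarity between term-frequency vectors of two plots.
    va, vb = Counter(a.split()), Counter(b.split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

watched = "a detective investigates a murder in a small town"
plots = {
    "A": "a detective investigates a murder in the city",
    "B": "two robots explore a distant planet",
}

# Recommend the plot most similar to what the user just watched.
best = max(plots, key=lambda k: cosine(watched, plots[k]))
print(best)  # A
```

Because the ranking depends only on plot text, not on user ratings, an obscure movie with a closely matching plot scores just as well as a blockbuster, which is the property the abstract emphasizes.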

Bergamaschi, Sonia; Martoglia, Riccardo; Sorrentino, Serena (2015) - Exploiting Semantics for Filtering and Searching Knowledge in a Software Development Context - KNOWLEDGE AND INFORMATION SYSTEMS - vol. 45 - pp. 295-318 ISSN: 0219-1377 [Journal article (262)]
Abstract

Software development is still considered a bottleneck for SMEs (Small and Medium Enterprises) in the advance of the Information Society. SMEs usually store and collect a large amount of textual software documentation; these documents might profitably be used to help them use (and re-use) Software Engineering methods for systematically designing their applications, thus reducing software development costs. Specific semantic textual filtering/search mechanisms, supporting the identification of adequate processes and practices for the enterprise needs, are fundamental in this context. To this aim, we present an automatic document retrieval method based on semantic similarity and Word Sense Disambiguation (WSD) techniques. The proposal leverages the strengths of both classic information retrieval and knowledge-based techniques, exploiting the syntactic and semantic information provided by general and specific domain knowledge sources. For any SME, it is as easily and generally applicable as the search techniques offered by common enterprise Content Management Systems (CMSs). Our method was developed within the FACIT-SME European FP7 project, whose aim is to facilitate the diffusion of Software Engineering methods and best practices among SMEs. As shown by a detailed experimental evaluation, the achieved effectiveness goes well beyond typical retrieval solutions.

Benedetti, Fabio; Bergamaschi, Sonia; Po, Laura (2015) - Exposing the Underlying Schema of LOD Sources (International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT), 2015 IEEE / WIC / ACM - Singapore - 6-9 December 2015) (Proceedings of the 2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT 2015)) (IEEE Piscataway USA) - pp. 301-304 ISBN: 9781467396189 [Conference paper (273)]
Abstract

The Linked Data Principles defined by Tim Berners-Lee promise that a large portion of Web data will be usable as one big interlinked RDF database. Today, with more than one thousand Linked Open Data (LOD) sources available on the Web, we are witnessing an emerging trend in the publication and consumption of LOD datasets. However, the pervasive use of external resources, together with a deficiency in the definition of the internal structure of a dataset, makes many LOD sources extremely complex to understand. In this paper, we describe a formal method to unveil the implicit structure of a LOD dataset by building a (Clustered) Schema Summary. The Schema Summary contains all the main classes and properties used within the dataset, whether they are taken from external vocabularies or not, and is conceivable as an RDFS ontology. The Clustered Schema Summary, suitable for large LOD datasets, provides a higher-level view of the classes and properties used, by gathering together classes that are the object of multiple instantiations.
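The first step of a schema-level summary, collecting which classes and properties a dataset actually uses, can be sketched over a handful of instance triples. This is a toy illustration under invented data: real LOD sources would be queried via SPARQL, and the clustering step of the Clustered Schema Summary is omitted.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# A tiny invented set of (subject, predicate, object) triples.
triples = [
    ("ex:alice", RDF_TYPE, "foaf:Person"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", RDF_TYPE, "foaf:Person"),
    ("ex:paper1", RDF_TYPE, "ex:Article"),
    ("ex:paper1", "dc:creator", "ex:alice"),
]

# Count class instantiations and property usage across the dataset,
# regardless of whether terms come from external vocabularies (foaf:, dc:).
classes = Counter(o for s, p, o in triples if p == RDF_TYPE)
properties = Counter(p for s, p, o in triples if p != RDF_TYPE)

print(classes.most_common())
print(properties.most_common())
```

The resulting class and property counts are the raw material of a Schema Summary: they reveal the dataset's implicit structure even when no explicit ontology is published alongside it.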

Benedetti, Fabio; Bergamaschi, Sonia; Orsini, Mirko; Magnotta, Luca (2015) - Integrazione di dati clinici con il sistema MOMIS [Integration of clinical data with the MOMIS system] (Strumenti, diritti, regole e nuova relazione di cura. Il paziente europeo protagonista nell'eHealth) (Giappichelli Torino ITA) - vol. 2 - pp. 69-82 ISBN: 9788892100671 [Book chapter (268)]
Abstract

Over the last decade, the need to access distributed information has become increasingly relevant, and with it the problem of integrating information coming from heterogeneous sources. In the medical field, research institutes and hospitals have access to an ever-growing number of information sources, which may contain data that are related but often redundant, heterogeneous, and not always consistent. The need, especially on the part of research organizations, is to be able to access in a simple way all the information distributed across the various information systems, and to build applications that use this information in real time, in order to obtain as quickly as possible results that will benefit patients. This article presents the data integration project for the experimental clinical trials conducted by FIL (Fondazione Italiana Linfomi), carried out by the DBGroup research group and by the university spin-off DataRiver. The project involved the integration of data coming from 3 different information systems, in order to obtain a unified view of the progress of all the trials and to perform dynamic statistical analyses in real time. The clinical trial monitoring tool ("Trial Monitoring tool"), developed using the MOMIS data integration system and the MOMIS Dashboard component, makes it possible to search and monitor aggregated data and to visualize the progress of the trials on maps, charts, and dynamic tables.

Benedetti, Fabio; Bergamaschi, Sonia; Po, Laura (2015) - LODeX: A tool for Visual Querying Linked Open Data (The 14th International Semantic Web Conference (ISWC-2015) - Bethlehem, USA - 11-15 October 2015) (Proceedings of the ISWC 2015 Posters & Demonstrations Track co-located with the 14th International Semantic Web Conference) (ceur-ws.org DEU) - vol. 1486 [Conference paper (273)]
Abstract

Formulating a query on a Linked Open Data (LOD) source is not an easy task: technical knowledge of the query language and awareness of the structure of the dataset are essential to create a query. We present a revised version of LODeX that provides users with an easy way to build queries in a fast and interactive manner. When users decide to explore a LOD source, they can take advantage of the Schema Summary produced by LODeX (i.e. a synthetic view of the dataset's structure) and pick graphical elements from it to create a visual query. The tool also supports users in browsing the results and, eventually, in refining the query. The prototype has been evaluated on hundreds of public SPARQL endpoints (listed in Data Hub) and is available online at http://dbgroup.unimo.it/lodex2. A survey conducted on 27 users demonstrated that our tool can effectively support both unskilled and skilled users in exploring and querying LOD datasets.

Bartolini, Ilaria; Beneventano, Domenico; Bergamaschi, Sonia; Ciaccia, Paolo; Corni, Alberto; Orsini, Mirko; Patella, Marco; Santese, Marco Maria (2015) - MOMIS Goes Multimedia: WINDSURF and the Case of Top-K Queries (Italian Symposium on Advanced Database Systems (SEBD - Sistemi Evoluti per Basi di Dati) - Gaeta - 14-17 June 2015) (23rd Italian Symposium on Advanced Database Systems (SEBD 2015)) (Curran Associates Red Hook USA) - pp. 200-207 ISBN: 9781510810877 [Conference paper (273)]
Abstract

In a scenario with “traditional” and “multimedia” data sources, this position paper discusses the following question: “How can a multimedia local source (e.g., Windsurf) supporting ranking queries be integrated into a mediator system without such capabilities (e.g., MOMIS)?” More precisely, “How can ranking queries coming from a multimedia local source be supported within a mediator system with a ‘traditional’ query processor based on an SQL engine?” We first describe a naïve approach for the execution of range and Top-K global queries, in which the MOMIS query processing method remains substantially unchanged but which, in the case of Top-K queries, does not guarantee to obtain K results. We then discuss two alternative modalities for allowing MOMIS to return the Top-K best results of a global query.

Albano, Lorenzo; Beneventano, Domenico; Bergamaschi, Sonia (2015) - Multilingual Word Sense Induction to Improve Web Search Result Clustering (23rd Italian Symposium on Advanced Database Systems, SEBD 2015 - Gaeta, Italy - 14-17 June 2015) (23rd Italian Symposium on Advanced Database Systems (SEBD 2015)) (Curran Associates New York USA) - pp. 272-279 ISBN: 9781510810877 [Conference paper (273)]
Abstract

In [13] a novel approach to Web search result clustering based on Word Sense Induction, i.e. the automatic discovery of word senses from raw text, was presented; key to the proposed approach is the idea of, first, automatically inducing senses for the target query and, second, clustering the search results based on their semantic similarity to the induced word senses. In [1] we proposed an innovative Word Sense Induction method based on multilingual data; key to our approach was the idea that a multilingual context representation, where the context of the words is expanded by considering its translations in different languages, may improve the WSI results; the experiments showed a clear performance gain. In this paper we give some preliminary ideas on exploiting our multilingual Word Sense Induction method for Web search result clustering.

Albano, Lorenzo; Beneventano, Domenico; Bergamaschi, Sonia (2015) - Multilingual Word Sense Induction to Improve Web Search Result Clustering (24th International Conference on World Wide Web - Florence - 18-22 May 2015) (WWW '15 Companion Proceedings of the 24th International Conference on World Wide Web) (International World Wide Web Conferences Steering Committee Geneva CHE) - pp. 835-839 ISBN: 9781450334730 [Conference paper (273)]
Abstract

In [12] a novel approach to Web search result clustering based on Word Sense Induction, i.e. the automatic discovery of word senses from raw text, was presented; key to the proposed approach is the idea of, first, automatically inducing senses for the target query and, second, clustering the search results based on their semantic similarity to the induced word senses. In [1] we proposed an innovative Word Sense Induction method based on multilingual data; key to our approach was the idea that a multilingual context representation, where the context of the words is expanded by considering its translations in different languages, may improve the WSI results; the experiments showed a clear performance gain. In this paper we give some preliminary ideas on exploiting our multilingual Word Sense Induction method for Web search result clustering.

Beneventano, Domenico; Bergamaschi, Sonia; Gagliardelli, Luca; Po, Laura (2015) - Open Data for Improving Youth Policies (International Conference on Knowledge Engineering and Ontology Development, part of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K) - Lisbon - 12-14 November 2015) (Proceedings of the International Conference on Knowledge Engineering and Ontology Development (KEOD 2015), part of the 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2015)) (SciTePress Setubal PRT) - vol. 2 - pp. 118-129 ISBN: 9789897581588 [Conference paper (273)]
Abstract

The Open Data philosophy is based on the idea that certain data should be made available to all citizens, in an open form, without any copyright restrictions, patents, or other mechanisms of control. Various governments have started to publish open data, first of all the USA and the UK in 2009; in 2015, the Open Data Barometer project (www.opendatabarometer.org) reported that, of 77 diverse states across the world, over 55 percent had developed some form of Open Government Data initiative. We claim that Public Administrations, which are the main producers and among the consumers of Open Data, can effectively extract important information by integrating their own data with open data sources. This paper reports the activities carried out during a one-year research project on Open Data for Youth Policies. The project was mainly devoted to exploring the youth situation in the municipalities and provinces of the Emilia Romagna region (Italy), in particular examining data on population, education, and work. The project goals were: to identify interesting data sources related to Youth Policies, both from the open data community and from the private repositories of local governments of the Emilia Romagna region; to integrate them and present the result of the integration by means of a navigator tool; and, in the end, to publish new information on the web as Linked Open Data. This paper also reports the main issues encountered, which may seriously affect the entire process, from the consumption and integration of open data to its publication.

Bergamaschi, Sonia; Ferro, Nicola; Guerra, Francesco; Silvello, Gianmaria (2015) - Perspective Look at Keyword-based Search Over Relational Data and its Evaluation (Extended Abstract) (23rd Italian Symposium on Advanced Database Systems (SEBD 2015) - Gaeta - 14-17 June 2015) (23rd Italian Symposium on Advanced Database Systems (SEBD 2015)) (Curran Associates New York USA) - pp. 168-175 ISBN: 9781510810877 [Conference paper (273)]
Abstract

This position paper discusses the need to consider keyword search over relational databases in the light of broader systems, where keyword search is just one of the components and which are aimed at better supporting users in their search tasks. These more complex systems call for appropriate evaluation methodologies that go beyond what is typically done today, i.e. measuring the performance of components mostly in isolation or without relation to actual user needs, and that are instead able to consider the system as a whole, its constituent components, and their inter-relations, with the ultimate goal of supporting actual user search tasks.

Bulgarelli, Andrea; Fioretti, Valentina; Zoli, Andrea; Aboudan, Alessio; Rodríguez-Vázquez, Juan José; De Cesare, Giovanni; De Rosa, Adriano; Maier, Gernot; Lyard, Etienne; Bastieri, Denis; Lombardi, Saverio; Tosti, Gino; Bergamaschi, Sonia; Beneventano, Domenico; Lamanna, Giovanni; Jacquemier, Jean; Kosack, Karl; Angelo Antonelli, Lucio; Boisson, Catherine; Borkowski, Jerzy; Buson, Sara; Carosi, Alessandro; Conforti, Vito; Colomé, Pep; De Los Reyes, Raquel; Dumm, Jon; Evans, Phil; Fortson, Lucy; Fuessling, Matthias; Gotz, Diego; Graciani, Ricardo; Gianotti, Fulvio; Grandi, Paola; Hinton, Jim; Humensky, Brian; Inoue, Susumu; Knödlseder, Jürgen; Le Flour, Thierry; Lindemann, Rico; Malaguti, Giuseppe; Markoff, Sera; Marisaldi, Martino; Neyroud, Nadine; Nicastro, Luciano; Ohm, Stefan; Osborne, Julian; Oya, Igor; Rodriguez, Jerome; Rosen, Simon; Ribo, Marc; Tacchini, Alessandro; Schüssle, Fabian; Stolarczyk, Thierry; Torresi, Eleonora; Testa, Vincenzo; Wegner, Peter; Weinstein, Amanda ( 2015 ) - The on-site analysis of the cherenkov telescope array ( 34th International Cosmic Ray Conference, ICRC 2015 - nld - 2015) ( - Proceedings of Science ) (Proceedings of Science (PoS) ) - POS PROCEEDINGS OF SCIENCE - n. volume 30- [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Cherenkov Telescope Array (CTA) observatory will be one of the largest ground-based very-high-energy gamma-ray observatories. The On-Site Analysis will be the first CTA scientific analysis of data acquired from the array of telescopes, at both the northern and southern sites. The On-Site Analysis will have two pipelines: the Level-A pipeline (also known as Real-Time Analysis, RTA) and the Level-B one. The RTA performs data quality monitoring and must be able to issue automated alerts on variable and transient astrophysical sources within 30 seconds from the last acquired Cherenkov event that contributes to the alert, with a sensitivity not worse than the one achieved by the final pipeline by more than a factor of 3. The Level-B Analysis has a better sensitivity (not worse than the final one by more than a factor of 2) and its results should be available within 10 hours from the acquisition of the data: for this reason this analysis could be performed at the end of an observation or the next morning. The latency (in particular for the RTA) and the sensitivity requirements are challenging because of the large data rate, a few GByte/s. The remote connection to the CTA candidate site with a rather limited network bandwidth makes the issue of the exported data size extremely critical and prevents any kind of real-time processing of the data outside the site of the telescopes. For these reasons the analysis will be performed on-site with infrastructures co-located with the telescopes, with limited electrical power availability and with a reduced possibility of human intervention. This means, for example, that the on-site hardware infrastructure should have low power consumption. A substantial effort towards the optimization of a high-throughput computing service is envisioned to provide hardware and software solutions with high throughput and low power consumption at low cost.
This contribution provides a summary of the design of the on-site analysis and reports some prototyping activities.

Benedetti, Fabio; Bergamaschi, Sonia; Po, Laura ( 2015 ) - Visual Querying LOD sources with LODeX ( 8th International Conference on Knowledge Capture (K-CAP 2015) - Palisades, NY, USA - 7-10 October 2015) ( - Proceedings of the 8th International Conference on Knowledge Capture ) (ACM New York USA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Linked Open Data (LOD) Cloud has more than tripled its sources in just three years (from 295 sources in 2011 to 1014 in 2014). While LOD data are being produced at an increasing rate, LOD tools fall short in producing a high-level representation of datasets and in supporting users in the exploration and querying of a source. To overcome the above problems and significantly increase the number of consumers of LOD data, we devised a new method and a tool, called LODeX, that promotes the understanding, navigation and querying of LOD sources both for experts and for beginners. It also provides a standardized and homogeneous summary of LOD sources and supports users in the creation of visual queries on previously unknown datasets. We have extensively evaluated the portability and usability of the tool. LODeX has been tested on the entire set of datasets available at Data Hub, i.e. 302 sources. In this paper, we showcase the usability evaluation of the different features of the tool (the Schema Summary representation and the visual query building) obtained with 27 users (comprising both Semantic Web experts and beginners).

Andrea Bulgarelli ; Valentina Fioretti ; Andrea Zoli ; Alessio Aboudan ; Juan José Rodríguez-Vázquez ; Gernot Maier ; Etienne Lyard ; Denis Bastieri ; Saverio Lombardi ; Gino Tosti ; Adriano De Rosa ; Sonia Bergamaschi ; Matteo Interlandi ; Domenico Beneventano ; Giovanni Lamanna ; Jean Jacquemier ; Karl Kosack ; Lucio Angelo Antonelli ; Catherine Boisson ; Jerzy Burkowski ; Sara Buson ; Alessandro Carosi ; Vito Conforti ; Jose Luis Contreras ; Giovanni De Cesare ; Raquel de los Reyes ; Jon Dumm ; Phil Evans ; Lucy Fortson ; Matthias Fuessling ; Ricardo Graciani ; Fulvio Gianotti ; Paola Grandi ; Jim Hinton ; Brian Humensky ; Jürgen Knödlseder ; Giuseppe Malaguti ; Martino Marisaldi ; Nadine Neyroud ; Luciano Nicastro ; Stefan Ohm ; Julian Osborne ; Simon Rosen ; Alessandro Tacchini ; Eleonora Torresi ; Vincenzo Testa ; Massimo Trifoglio ; Amanda Weinstein ( 2014 ) - A prototype for the real-time analysis of the Cherenkov Telescope Array ( Ground-based and Airborne Telescopes V - Montréal, Quebec, Canada - June 22, 2014) ( - Proc. SPIE 9145, Ground-based and Airborne Telescopes V ) (Helen J. Hall Montréal, Quebec, CAN ) - pp. da 108 a 108 ISBN: 91452X [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Cherenkov Telescope Array (CTA) observatory will be one of the biggest ground-based very-high-energy (VHE) γ-ray observatories. CTA will achieve a factor of 10 improvement in sensitivity, from some tens of GeV to beyond 100 TeV, with respect to existing telescopes. The CTA observatory will be capable of issuing alerts on variable and transient sources to maximize the scientific return. To capture these phenomena during their evolution and for effective communication to the astrophysical community, speed is crucial. This requires a system with a reliable automated trigger that can issue alerts immediately upon detection of γ-ray flares. This will be accomplished by means of a Real-Time Analysis (RTA) pipeline, a key system of the CTA observatory. The latency and sensitivity requirements of the alarm system impose a challenge because of the anticipated large data rate, between 0.5 and 8 GB/s. As a consequence, substantial efforts toward the optimization of a high-throughput computing service are envisioned. For these reasons our working group has started the development of a prototype of the Real-Time Analysis pipeline. The main goals of this prototype are to test: (i) a set of frameworks and design patterns useful for the inter-process communication between software processes running in memory; (ii) the sustainability of the foreseen CTA data rate in terms of data throughput with different hardware (e.g. accelerators) and software configurations; (iii) the reuse of non-real-time algorithms, or how much we need to simplify algorithms to be compliant with CTA requirements; (iv) interface issues between the different CTA systems. In this work we focus on goals (i) and (ii). © (2014) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.

Fabio Benedetti; Sonia Bergamaschi; Laura Po ( 2014 ) - A Visual Summary for Linked Open Data sources ( ISWC 2014 Posters & Demonstrations Track - Riva del Garda, Italy - October 21, 2014) ( - Proceedings of the ISWC 2014 Posters & Demonstrations Track a track within the 13th International Semantic Web Conference, ISWC 2014, Riva del Garda, Italy, October 21, 2014 ) (CEUR-WS ) - n. volume 1272 - pp. da 173 a 176 ISSN: 1613-0073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we propose LODeX, a tool that produces a representative summary of a Linked Open Data (LOD) source starting from scratch, thus supporting users in exploring and understanding the contents of a dataset. The tool takes as input the URL of a SPARQL endpoint and launches a set of predefined SPARQL queries; from the results of the queries it generates a visual summary of the source. The summary reports statistical and structural information about the LOD dataset and can be browsed to focus on particular classes or to explore their properties and their use. LODeX was tested on the 137 public SPARQL endpoints contained in Data Hub (formerly CKAN), one of the main Open Data catalogues. The statistical and structural information of the 107 successful extractions is collected and available in the online version of LODeX (http://dbgroup.unimo.it/lodex).

Sonia Bergamaschi; Laura Po; Serena Sorrentino ( 2014 ) - Comparing Topic Models for a Movie Recommendation System ( The 10th International Conference on Web Information Systems and Technologies - Barcelona, Spain - 3-5 April 2014) ( - Proceedings of the 10th International Conference on Web Information Systems and Technologies ) (SciTePress – Science and Technology Publications Barcelona ESP ) - n. volume 2 - pp. da 172 a 183 ISBN: 978-989758024-6 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Recommendation systems have become successful at suggesting content that is likely to be of interest to the user; however, their performance greatly suffers when little information about the user's preferences is given. In this paper we propose an automated movie recommendation system based on the similarity of movies: given a target movie selected by the user, the goal of the system is to provide a list of those movies that are most similar to the target one, without knowing any user preferences. The topic models Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) have been applied and extensively compared on a movie database of two hundred thousand plots. Experiments are an important part of the paper; we examined the topic models' behaviour based on standard metrics and on user evaluations, and we conducted performance assessments with 30 users to compare our approach with a commercial system. The outcome was that the performance of LSA was superior to that of LDA in supporting the selection of similar plots. Even if our system does not outperform commercial systems, it does not rely on human effort, so it can be ported to any domain where natural language descriptions exist. Since it is independent of the number of user ratings, it is able to suggest famous movies as well as old or little-known movies that are still strongly related to the content of the video the user has watched.

Bergamaschi, Sonia; Ferrari, Davide; Guerra, Francesco; Simonini, Giovanni ( 2014 ) - Discovering the topics of a data source: A statistical approach? ( Workshop on Surfacing the Deep and the Social Web, SDSW 2014, Co-located with the 13th International Semantic Web Conference, ISWC 2014 - ita - 2014) ( - CEUR Workshop Proceedings ) (CEUR-WS ) - CEUR WORKSHOP PROCEEDINGS - n. volume 1310 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present a preliminary approach for automatically discovering the topics of a structured data source with respect to a reference ontology. Our technique relies on a signature, i.e., a weighted graph that summarizes the content of a source. Graph-based approaches have already been used in the literature for similar purposes. In these proposals, the weights are typically assigned using traditional information-theoretical quantities such as entropy and mutual information. Here, we propose a novel data-driven technique based on composite likelihood to estimate the weights and other main features of the graphs, making the resulting approach less sensitive to overfitting. By means of a comparison of signatures, we can easily discover the topic of a target data source with respect to a reference ontology. This task is carried out by a matching algorithm that retrieves the elements common to both graphs. To illustrate our approach, we discuss a preliminary evaluation in the form of a running example.

Sonia Bergamaschi; Francesco Guerra; Giovanni Simonini ( 2014 ) - Keyword Search over Relational Databases: Issues, Approaches and Open Challenges ( - Bridging Between Information Retrieval and Databases ) (Springer-Verlag Berlin Heidelberg Berlin DEU ) - n. volume LNCS 8173 - pp. da 54 a 73 ISBN: 9783642547973 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

In this paper, we overview the main research approaches developed in the area of Keyword Search over Relational Databases. In particular, we model the process for solving keyword queries in three phases: the management of the user's input, the search algorithms, and the results returned to the user. For each phase we analyze the main problems, the solutions adopted by the most important systems developed by researchers, and the open challenges. Finally, we introduce two open issues related to multi-source scenarios and to database sources whose instances are not fully accessible.

Fabio Benedetti; Sonia Bergamaschi; Laura Po ( 2014 ) - LODeX: A visualization tool for Linked Open Data navigation and querying. [Software (296) - Software]
Abstract

We present LODeX, a tool for visualizing, browsing and querying a LOD source starting from the URL of its SPARQL endpoint. LODeX creates a visual summary for a LOD dataset and allows users to perform queries on it. Users can select the classes of interest for discovering which instances are stored in the LOD source without any knowledge of the underlying vocabulary used for describing data. The tool couples the overall view of the LOD source with a preview of the instances so that the user can easily build and refine his/her query. The tool has been evaluated on hundreds of public SPARQL endpoints (listed in Data Hub). The schema summaries of 40 LOD sources are stored and available for online querying at http://dbgroup.unimo.it/lodex2.

Fabio Benedetti; Sonia Bergamaschi; Laura Po ( 2014 ) - Online Index Extraction from Linked Open Data Sources ( Second International Workshop on Linked Data for Information Extraction (LD4IE 2014) - Riva del Garda, Italy - October 20, 2014) ( - Proceedings of the Second International Workshop on Linked Data for Information Extraction (LD4IE 2014) co-located with the 13th International Semantic Web Conference (ISWC 2014), Riva del Garda, Italy, October 20, 2014 ) - n. volume 1267 - pp. da 9 a 20 ISSN: 1613-0073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The production of machine-readable data in the form of RDF datasets belonging to the Linked Open Data (LOD) Cloud is growing very fast. However, selecting relevant knowledge sources from the Cloud, assessing their quality and extracting synthetic information from a LOD source are all tasks that require a strong human effort. This paper proposes an approach for the automatic extraction of the most representative information from a LOD source and the creation of a set of indexes that enhance the description of the dataset. These indexes collect statistical information regarding the size and the complexity of the dataset (e.g. the number of instances), but also depict all the instantiated classes and the properties among them, supplying users with a synthetic view of the LOD source. The technique is fully implemented in LODeX, a tool able to deal with the performance issues of systems that expose SPARQL endpoints and to cope with the heterogeneity in the knowledge representation of RDF data. An evaluation of LODeX on a large number of endpoints (244) belonging to the LOD Cloud has been performed, and the effectiveness of the index extraction process is presented.

Domenico Beneventano; Sonia Bergamaschi ( 2014 ) - Provenance-Aware Semantic Search Engines Based on Data Integration Systems - INTERNATIONAL JOURNAL OF ORGANIZATIONAL AND COLLECTIVE INTELLIGENCE - n. volume 4(2) - pp. da 1 a 30 ISSN: 1947-9344 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Search engines are common tools for virtually every user of the Internet, and companies such as Google and Yahoo! have become household names. Semantic Search Engines try to augment and improve traditional Web Search Engines by using not just words, but concepts and logical relationships. Given the openness of the Web and the different sources involved, a Web Search Engine must evaluate the quality and trustworthiness of the data; a common approach for such assessments is the analysis of the provenance of information. In this paper a relevant class of Provenance-aware Semantic Search Engines, based on a peer-to-peer, data integration mediator-based architecture, is described. The architectural and functional features are an enhancement with provenance of the SEWASIE semantic search engine developed within the IST EU SEWASIE project, coordinated by the authors. The methodology to create a two-level ontology and the query processing engine developed within the SEWASIE project, together with the provenance extension, are fully described.

Domenico Beneventano; Sonia Bergamaschi ; Serena Sorrentino; Maurizio Vincini; Fabio Benedetti ( 2014 ) - Semantic Annotation of the CEREALAB database by the AGROVOC Linked Dataset - ECOLOGICAL INFORMATICS - n. volume 26 - pp. da 119 a 126 ISSN: 1574-9541 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Nowadays, there has been an increase in open government data initiatives promoting the idea that particular data should be freely published. However, the great majority of these resources are published in an unstructured format and are typically accessed only by closed communities. Starting from these considerations, in a previous work related to a youth precariousness dataset, we proposed an experimental and preliminary methodology for facilitating resource providers in publishing public data into the Linked Open Data (LOD) cloud, and for helping consumers (companies and citizens) in efficiently accessing and querying them. Linked Open Data plays a central role in accessing and analyzing the rapidly growing pool of life science data and, as discussed in recent meetings, it is important for data source providers themselves to make their resources available as Linked Open Data. In this paper we extend and apply our methodology to the agricultural domain, i.e. to the CEREALAB database, created to store both genotypic and phenotypic data and specifically designed for plant breeding, in order to publish it into the LOD cloud.

Lorenzo Albano; Domenico Beneventano; Sonia Bergamaschi ( 2014 ) - Word Sense Induction with Multilingual Features Representation ( International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM - Warsaw, Poland - 11–14 August 2014) ( - Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences ) (IEEE VARSAVIA POL ) - n. volume 2 - pp. da 343 a 349 ISBN: 978-147994143-8 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The use of word senses in place of surface word forms has been shown to improve performance on many computational tasks, including intelligent web search. In this paper we propose a novel approach to the automatic discovery of word senses from raw text, a task referred to as Word Sense Induction (WSI). Almost all the WSI approaches proposed in the literature deal with monolingual data, and only very few proposals incorporate bilingual data. The WSI method we propose is innovative in that it uses multilingual data to perform WSI of words in a given language. The experiments show a clear overall improvement of the performance: the single-language setting is outperformed by the multi-language settings on almost all the considered target words. The performance gain, in terms of F-Measure, has an average value of 5% and in some cases it reaches 40%.

I. Baroni; S. Bergamaschi; L. Po ( 2013 ) - An iPad Order Management System for Fashion Trade ( WEBIST 2013 - Proceedings of the 9th International Conference on Web Information Systems and Technologies - Aachen, Germany - 8-10 May, 2013) ( - WEBIST ) (Karl-Heinz Krempels, Alexander Stocker Aachen DEU ) - pp. da 519 a 526 ISBN: 9789898565549 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The fashion industry loves the new tablets. In 2011 we noted a 38% growth of e-commerce in the Italian fashion industry. A large number of brands have understood the value of mobile devices as a key channel for consumer communication. The interest of brands in mobile marketing applications and services has made a big step forward, with an increase of 129% in 2011 (osservatori.net, 2012). This paper presents a mobile version of the Fashion OMS (Order Management System) web application. Fashion Touch is a mobile application that allows clients and the company's sales network to process commercial orders, consult the product catalog and manage customers as the OMS web version does, with the added functionality of an offline order-entry mode. To develop an effective mobile App, we started by analyzing the new web technologies for mobile applications (HTML5, CSS3, Ajax) and their related development frameworks, comparing them with Apple's native programming language. We selected Titanium, a multi-platform framework for developing native mobile and desktop applications via web technologies, as the best framework for our purpose. We faced issues concerning network synchronization and studied different database solutions depending on the device hardware characteristics and performance. This paper reports every aspect of the App development up to the publication on the App Store.

S. Bergamaschi; N.Ferro; F.Guerra ; G. Silvello ( 2013 ) - Keyword Search and Evaluation over Relational Databases: an Outlook to the Future ( DBRank 2013 - Riva del Garda (TN) - 30/08/2013) ( - 7th International Workshop on Ranking in Databases ) (ACM New York, NY, USA New York USA ) - pp. da 1 a 3 ISBN: 9781450324977 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This position paper discusses the need for considering keyword search over relational databases in the light of broader systems, where keyword search is just one of the components, and which are aimed at better supporting users in their search tasks. These more complex systems call for appropriate evaluation methodologies which go beyond what is typically done today, i.e., measuring the performance of components mostly in isolation or without reference to actual user needs, and which are instead able to consider the system as a whole, its constituent components, and their inter-relations, with the ultimate goal of supporting actual user search tasks.

Sonia Bergamaschi; Maciej Gawinecki; Serena Sorrentino ( 2013 ) - NORMS: an automatic tool to perform schema label normalization [Software (296) - Software]
Abstract

Schema matching is the problem of finding relationships among concepts across heterogeneous data sources (heterogeneous in format and structure). Schema matching systems usually exploit lexical and semantic information provided by lexical databases/thesauri to discover intra/inter semantic relationships among schema elements. However, most of them obtain poor performance on real scenarios due to the significant presence of “non-dictionary words” in real-world schemata. Non-dictionary words include compound nouns, abbreviations and acronyms. In this paper, we present NORMS (NORmalizer of Schemata), a tool performing schema label normalization to increase the number of comparable labels extracted from schemata.

Bergamaschi, S.; Guerra, F.; Interlandi, M.; Trillo Lado, R.; Velegrakis, Y. ( 2013 ) - QUEST: A Keyword Search System for Relational Data based on Semantic and Machine Learning Techniques - PROCEEDINGS OF THE VLDB ENDOWMENT - n. volume 6(12) - pp. da 1222 a 1225 ISSN: 2150-8097 [Articolo in rivista (262) - Articolo su rivista]
Abstract

We showcase QUEST (QUEry generator for STructured sources), a search engine for relational databases that combines semantic and machine learning techniques for transforming keyword queries into meaningful SQL queries. The search engine relies on two approaches: the forward approach, providing mappings of keywords into database terms (names of tables and attributes, and domains of attributes), and the backward approach, computing the paths joining the data structures identified in the forward step. The results provided by the two approaches are combined within a probabilistic framework based on the Dempster-Shafer Theory. We demonstrate QUEST's capabilities, and we show how, thanks to the flexibility obtained by the probabilistic combination of different techniques, QUEST is able to compute high quality results even with little training data and/or with hidden data sources such as those found in the Deep Web.

Serena Sorrentino; Sonia Bergamaschi; Elisa Fusari; Domenico Beneventano ( 2013 ) - Semantic Annotation and Publication of Linked Open Data ( Computational Science and Its Applications - ICCSA 2013 - Ho Chi Minh City, Vietnam - June 24-27, 2013) ( - Computational Science and Its Applications - ICCSA 2013 - 13th International Conference, Ho Chi Minh City, Vietnam, June 24-27, 2013, Proceedings, Part V ) (Springer - Lecture Notes in Computer Science Heidelberg DEU ) - n. volume 7975 - pp. da 462 a 474 ISBN: 9783642396397 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Nowadays, there has been an increase in open government data initiatives promoting the idea that particular data produced by public administrations (such as public spending, health care, education, etc.) should be freely published. However, the great majority of these resources are published in an unstructured format (such as spreadsheets or CSV) and are typically accessed only by closed communities. Starting from these considerations, we propose a semi-automatic experimental methodology for facilitating resource providers in publishing public data into the Linked Open Data (LOD) cloud, and for helping consumers (companies and citizens) in efficiently accessing and querying them. We present a preliminary method for publishing, linking and semantically enriching open data by performing automatic semantic annotation of schema elements. The methodology has been applied to a set of data provided by the Research Project on Youth Precariousness of the Modena municipality, Italy.

Domenico Beneventano; Sonia Bergamaschi; Serena Sorrentino ( 2013 ) - Semantic Annotation of the CEREALAB Database by the AGROVOC Linked Dataset ( Computational Science and Its Applications - ICCSA 2013 - Ho Chi Minh City, Vietnam - June 24-27, 2013) ( - Computational Science and Its Applications - ICCSA 2013 - 13th International Conference, Ho Chi Minh City, Vietnam, June 24-27, 2013, Proceedings, Part I ) (Springer - Lecture Notes in Computer Science Heidelberg DEU ) - n. volume 7971 - pp. da 194 a 203 ISBN: 9783642396366 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The objective of the CEREALAB database is to help breeders in choosing molecular markers associated with the most important traits. Phenotypic and genotypic data obtained from the integration of open source databases with the data produced by the CEREALAB project are made available to the users. The CEREALAB database has been, and is currently, extensively used within the frame of the CEREALAB project. This paper presents the main achievements and the ongoing research to annotate the CEREALAB database and to publish it in the Linking Open Data network, in order to facilitate breeders and geneticists in searching and exploiting linked agricultural resources. One of the main focuses of this paper is to discuss the use of the AGROVOC Linked Dataset both to annotate the CEREALAB schema and to discover schema-level mappings between the CEREALAB Dataset and other resources of the Linking Open Data network, such as NALT, the National Agricultural Library Thesaurus, and DBpedia.

Vincini M.; Bergamaschi S.; Beneventano D. ( 2013 ) - Semantic Integration of heterogeneous data sources in the MOMIS Data Transformation System - JOURNAL OF UNIVERSAL COMPUTER SCIENCE - n. volume 19 - pp. da 1986 a 2012 ISSN: 0948-695X [Articolo in rivista (262) - Articolo su rivista]
Abstract

In the last twenty years, many data integration systems following a classical wrapper/mediator architecture and providing a Global Virtual Schema (a.k.a. Global Virtual View - GVV) have been proposed by the research community. The main issues faced by these approaches range from system-level heterogeneities, through structural and syntactic heterogeneities, up to the semantic level. Despite the research effort, all the approaches proposed require a lot of user intervention for customizing and managing the data integration and reconciliation tasks. In some cases, the effort and the complexity of the task are huge, since they require the development of specific programming code. Unfortunately, due to the specificity to be addressed, application code and solutions are not frequently reusable in other domains. For this reason, the Lowell Report 2005 provided the guidelines for the definition of a public benchmark for the information integration problem. The proposal, called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches), focuses on how data integration systems manage syntactic and semantic heterogeneities, which definitely are the greatest technical challenges in the field. We developed a Data Transformation System (DTS) that supports data transformation functions and produces query translations in order to push query execution down to the sources. Our DTS is based on MOMIS, a mediator-based data integration system that our research group has been developing and supporting since 1999. In this paper, we show how the DTS is able to solve all twelve queries of the THALIA benchmark by using a simple combination of declarative translation functions already available in standard SQL. We think that this is a remarkable result, mainly for two reasons: firstly, to the best of our knowledge there is no system that has provided a complete answer to the benchmark; secondly, our queries do not require any overhead of new code.

S. Bergamaschi; F. Guerra; M. Interlandi; S. Rota; R. Trillo; Y. Velegrakis ( 2013 ) - Using a HMM based approach for mapping keyword queries into database terms ( Symposium on Advanced Database Systems SEBD - Roccella Jonica - June 30th - July 4th, 2013) ( - Proceedings of 21st Italian Symposium on Advanced Database Systems ) (informal proceedings Roccella Jonica ITA ) - pp. da 239 a 246 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Systems translating keyword queries into SQL queries over relational databases are usually referred to in the literature as schema-based approaches. These techniques exploit the information contained in the database schema to build SQL queries that express the intended meaning of the user query. Typically, they first perform a preliminary step that associates keywords in the user query with database elements (names of tables, attributes and attribute domains). In this paper, we present a probabilistic approach based on a Hidden Markov Model to provide such mappings. In contrast to most existing techniques, our proposal does not require any a priori knowledge of the database extension.

Sonia Bergamaschi; Matteo Interlandi; Mario Longo; Laura Po; Maurizio Vincini ( 2012 ) - A meta-language for MDX queries in eLog Business Solution ( IEEE 28th International Conference on Data Engineering (ICDE 2012) - Washington, USA (Arlington, Virginia) - 1-5 April, 2012) ( - IEEE 28th International Conference on Data Engineering (ICDE 2012) ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 1417 a 1428 ISBN: 9781467300421 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The adoption of business intelligence technology in industries is growing rapidly. Business managers are not satisfied with ad hoc and static reports and they ask for more flexible and easy to use data analysis tools. Recently, application interfaces that expand the range of operations available to the user, hiding the underlying complexity, have been developed. The paper presents eLog, a business intelligence solution designed and developed in collaboration between the database group of the University of Modena and Reggio Emilia and eBilling, an Italian SME supplier of solutions for the design, production and automation of documentary processes for top Italian companies. eLog enables business managers to define OLAP reports by means of a web interface and to customize analysis indicators adopting a simple meta-language. The framework translates the user's reports into MDX queries and is able to automatically select the data cube suitable for each query. Over 140 medium and large companies have exploited the technological services of eBilling S.p.A. to manage their document flows. In particular, eLog services have been used by the major Italian media and telecommunications companies and their foreign affiliates, such as Sky, Mediaset, H3G, Tim Brazil, etc. The largest customer can produce up to 30 million mail pieces within 6 months (about 200 GB of data in the relational DBMS). In a period of 18 months, eLog could reach 150 million mail pieces (1 TB of data) to handle.

Farinella, Tania; Bergamaschi, Sonia; Po, Laura ( 2012 ) - A non-intrusive movie recommendation system ( Confederated International Conferences on On the Move to Meaningful Internet Systems, OTM 2012: CoopIS, DOA-SVI, and ODBASE 2012 - Rome, ita - 2012) ( - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ) - n. volume 7566 - pp. da 735 a 751 ISBN: 9783642336140 ISSN: 1611-3349 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Several recommendation systems have been developed to support the user in choosing an interesting movie from multimedia repositories. The widely used collaborative-filtering systems focus on the analysis of user profiles or user ratings of the items. However, the performance of these systems degrades in the start-up phase and, due to privacy issues, when a user hides most of his personal data. On the other hand, content-based recommendation systems compare movie features to suggest similar multimedia contents; these systems rely on less invasive observations, but they have difficulty supplying tailored suggestions. In this paper, we propose a plot-based recommendation system, which is based on an evaluation of the similarity between the plot of a video the user has watched and a large number of plots stored in a movie database. Since it is independent of the number of user ratings, it is able to propose famous and beloved movies as well as old or little-known movies/programs that are still strongly related to the content of the video the user has watched. We experimented with different methodologies to compare natural language descriptions of movies (plots) and found Latent Semantic Analysis (LSA) to be the best at supporting the selection of similar plots. In order to increase the efficiency of LSA, different models were tested and, in the end, a recommendation system able to compare about two hundred thousand movie plots in less than a minute was developed.
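As a rough illustration of plot comparison, the sketch below ranks candidate plots by plain bag-of-words cosine similarity. This is only the naive baseline, not the LSA model the paper found superior (LSA additionally projects the term-document matrix onto latent topics via SVD); the plots and tokenizer are illustrative.

```python
import math
from collections import Counter

def tf_vector(text):
    """Term-frequency vector of a plot description (very naive whitespace tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def most_similar(watched_plot, candidate_plots):
    """Return the candidate plot most similar to the plot the user watched."""
    w = tf_vector(watched_plot)
    return max(candidate_plots, key=lambda p: cosine(w, tf_vector(p)))

plots = [
    "a detective investigates a murder in a small town",
    "two robots fall in love on a spaceship",
]
print(most_similar("a detective solves a murder case", plots))
# → 'a detective investigates a murder in a small town'
```

A content-based recommender like the one described would replace the raw term vectors with their low-rank LSA projections, so that plots sharing topics but few literal words still score as similar.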

S. Bergamaschi; R. Martoglia; S. Sorrentino ( 2012 ) - A Semantic Method for Searching Knowledge in a Software Development Context ( 20th Italian Symposium on Advanced Database Systems - Venice, Italy - June 2012) ( - Proceedings of the 20th Italian Symposium on Advanced Database Systems ) (Edizioni Libreria Progetto Padova ITA ) - pp. da 115 a 122 ISBN: 9788896477236 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The FACIT-SME European FP-7 project aims to facilitate the use and sharing of Software Engineering (SE) methods and best practices among software-developing SMEs. In this context, we present an automatic semantic document searching method based on Word Sense Disambiguation which exploits both syntactic and semantic information provided by external dictionaries and is easily applicable to any SME.

Serena Sorrentino; Sonia Bergamaschi; Elena Parmiggiani ( 2012 ) - A Supervised Method for Lexical Annotation of Schema Labels based on Wikipedia ( ER International Conference on Conceptual Modeling (ER 2012) - Firenze - 15-18/10/2012) ( - Conceptual Modelling, 31th International Conference, ER2012, Florence, Italy. October. ) (Springer Berlin DEU ) - n. volume LNCS 7532 - pp. da 359 a 368 ISBN: 9783642340017 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Lexical annotation is the process of explicitly assigning one or more meanings to a term w.r.t. a sense inventory (e.g., a thesaurus or an ontology). We propose an automatic supervised lexical annotation method, called ALATK (Automatic Lexical Annotation - Topic Kernel), based on the Topic Kernel function, for the annotation of schema labels extracted from structured and semi-structured data sources. It exploits Wikipedia as a sense inventory and as a source of training data.

Sonia Bergamaschi; Marius Octavian Olaru; Serena Sorrentino; Maurizio Vincini ( 2012 ) - Dimension matching in Peer-to-Peer Data Warehousing ( Fusing Decision Support Systems into the Fabric of the Context - Anávissos, Greece - 28-30 june, 2012) ( - Frontiers in Artificial Intelligence and Applications ) (IOS Press Amsterdam NLD ) - n. volume 238 - pp. da 149 a 160 ISBN: 9781614990727 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

During the last decades, the Data Warehouse has been one of the main components of Decision Support Systems (DSS) inside companies. Given the great diffusion of Data Warehouses nowadays, managers have realized that there is great potential in combining information coming from multiple information sources, such as heterogeneous Data Warehouses from companies operating in the same sector. Existing solutions rely mostly on the Extract-Transform-Load (ETL) approach, a costly and complex process. The process of Data Warehouse integration can be greatly simplified by a method able to semi-automatically discover semantic relationships among the attributes of two or more different, heterogeneous Data Warehouse schemas. In this paper, we propose a method for the semi-automatic discovery of mappings between dimension hierarchies of heterogeneous Data Warehouses. Our approach exploits techniques from the Data Integration research area, combining topological properties of dimensions with semantic techniques.

Sonia Bergamaschi; Domenico Beneventano; Riccardo Martoglia ( 2012 ) - FACIT-SME - Facilitate IT-providing SMEs by Operation-related Models and Methods [Software (296) - Software]
Abstract

The FACIT SME project addresses SMEs operating in the ICT domain. The goals are (a) to facilitate the use of Software Engineering (SE) methods and to systematize their application integrated with the business processes, (b) to provide efficient and affordable certification of these processes according to internationally accepted standards, and (c) to securely share best practices, tools and experiences with development partners and customers. The project targets (1) to develop a novel Open Reference Model (ORM) for ICT SMEs, serving as a knowledge backbone in terms of procedures, documents, tools and deployment methods; (2) to develop a customisable Open Source Enactment System (OSES) that provides IT support for the project-specific application of the ORM; and (3) to evaluate these developments with 5 ICT SMEs by establishing the ORM, the OSES and preparing the certifications. The approach combines and amends achievements from Model Generated Workplaces, certification of SE for SMEs, and model-based document management. The consortium is shaped by 4 significant SME associations as well as a European association exclusively focused on the SME community in the ICT sector. Five R&D partners provide the required competences. Five SMEs operating in the ICT domain will evaluate the results in daily-life application. The major impact is expected for ICT SMEs by (a) optimising their processes based on best practice; (b) achieving internationally accepted certification; and (c) the provision of structured reference knowledge. They will improve implementation projects and make their solutions more appealing to SMEs. ICT SME communities (organized by associations) will experience significant benefit through the exchange of recent knowledge and best practices. By providing clear assets (ORM and OSES), the associations shape the service offering to their members and strengthen their community. The use of Open Source will further facilitate the spread of the results across European SMEs.

Domenico Beneventano; Sonia Bergamaschi; Abdul Rahman Dannaoui; Nicola Pecchioni ( 2012 ) - Integration and Provenance of Cereals Genotypic and Phenotypic Data ( - Eighth International Conference on Data Integration in the Life Sciences (DILS 2012) ) - pp. da 3 a 3 [Poster (275) - Poster]
Abstract

This paper presents ongoing research on the design and development of a Provenance Management component, PM_MOMIS, for the MOMIS Data Integration System. MOMIS has been developed by the DBGROUP of the University of Modena and Reggio Emilia (www.dbgroup.unimore.it). An open source version of the MOMIS system is delivered and maintained by the academic spin-off DataRiver (www.datariver.it). PM_MOMIS aims to provide the provenance management techniques supported by two of the most relevant data provenance systems, the "Perm" and "Trio" systems, and extends them by including the data fusion and conflict resolution techniques provided by MOMIS. PM_MOMIS functionalities have been studied and partially developed in the domain of genotypic and phenotypic cereal-data management within the CEREALAB project. The CEREALAB Data Integration Application integrates data coming from different databases with MOMIS, with the aim of creating a powerful tool for plant breeders and geneticists. Users of CEREALAB played a major role in making the real needs of provenance management in their domain emerge. We defined the provenance for the "full outerjoin-merge" operator, used in MOMIS to solve conflicts among values; this definition is based on the concept of "PI-CS-provenance" of the "Perm" system; we are using the "Perm" system as the SQL engine of MOMIS, so as to obtain the provenance in our CEREALAB Application. The main drawback of this solution is that conflicting values often represent alternatives; our proposal is therefore to consider the output of the "full outerjoin-merge" operator as an uncertain relation and manage it with a system that supports uncertain data and data lineage, the "Trio" system.

Domenico Beneventano; Sonia Bergamaschi; Abdul Rahman Dannaoui ( 2012 ) - Integration and Provenance of Cereals Genotypic and Phenotypic Data ( 20th Italian Symposium on Advanced Database Systems (SEBD2012) - Venice, Italy - June 24-27, 2012) ( - Proceedings of the 20th Italian Symposium on Advanced Database Systems ) (EDIZIONI LIBRERIA PROGETTO Padova ITA ) - pp. da 91 a 98 ISBN: 9788896477236 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper presents ongoing research on the design and development of a Provenance Management component, PM_MOMIS, for the MOMIS Data Integration System. PM_MOMIS aims to provide the provenance management techniques supported by two of the most relevant data provenance systems, the Perm and Trio systems, and extends them by including the data fusion and conflict resolution techniques provided by MOMIS. PM_MOMIS functionalities have been studied and partially developed in the domain of genotypic and phenotypic cereal-data management within the CEREALAB project. The CEREALAB Data Integration Application integrates data coming from different databases with MOMIS, with the aim of creating a powerful tool for plant breeders and geneticists. Users of CEREALAB played a major role in the emergence of real needs of provenance management in their domain.

S. Bergamaschi ( 2012 ) - On the Move to Meaningful Internet Systems: OTM 2012 (Springer Heidelberg DEU ) - pp. da 1 a 485 ISBN: 9783642336058 [Curatela (284) - Curatela]
Abstract

The two-volume set LNCS 7565 and 7566 constitutes the refereed proceedings of three confederated international conferences: Cooperative Information Systems (CoopIS 2012), Distributed Objects and Applications - Secure Virtual Infrastructures (DOA-SVI 2012), and Ontologies, DataBases and Applications of SEmantics (ODBASE 2012) held as part of OTM 2012 in September 2012 in Rome, Italy. The 53 revised full papers presented were carefully reviewed and selected from a total of 169 submissions. The 22 full papers included in the first volume constitute the proceedings of CoopIS 2012 and are organized in topical sections on business process design; process verification and analysis; service-oriented architectures and cloud; security, risk, and prediction; discovery and detection; collaboration; and 5 short papers.

Beneventano D.; Bergamaschi S.; Dannaoui A.R.; Milc J.; Pecchioni N.; Sorrentino S. ( 2012 ) - The CEREALAB Database: Ongoing Research and Future Challenges ( 6th Research Conference on Metadata and Semantics Research, MTSR 2012 - Cadiz; Spain - NOV 28-30, 2012) ( - METADATA AND SEMANTICS RESEARCH ) - COMMUNICATIONS IN COMPUTER AND INFORMATION SCIENCE - n. volume 343 - pp. da 336 a 341 ISSN: 1865-0929 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The objective of the CEREALAB database is to help breeders in choosing molecular markers associated with the most important traits. Phenotypic and genotypic data obtained from the integration of open source databases with the data produced by the CEREALAB project are made available to the users. The first version of the CEREALAB database has been extensively used within the frame of the CEREALAB project. This paper presents the main achievements and the ongoing research related to the CEREALAB database. First, as a result of the extensive use of the CEREALAB database, several extensions and improvements to the web application user interface were introduced. Second, again driven by end-user needs, the notion of provenance was introduced and partially implemented in the context of the CEREALAB database. Third, we describe some preliminary ideas to annotate the CEREALAB database and to publish it in the Linking Open Data network.

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Silvia Rota; Raquel Trillo Lado; Yannis Velegrakis ( 2012 ) - Understanding the Semantics of Keyword Queries on Relational Data Without Accessing the Instance ( - Semantic Search over the Web ) (Springer Heidelberg DEU ) - pp. da 131 a 158 ISBN: 9783642250071 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

This chapter deals with the problem of answering a keyword query over a relational database. To do so, one needs to understand the meaning of the keywords in the query, "guess" its possible semantics, and materialize them as SQL queries that can be executed directly on the relational database. The focus of the chapter is on techniques that do not require any prior access to the instance data, making them suitable for sources behind wrappers or Web interfaces or, in general, for sources that disallow prior access to their data to construct an index. The chapter describes two techniques that use semantic information and metadata from the sources, alongside the query itself, to achieve this. Apart from understanding the semantics of the keywords themselves, the techniques also exploit the order and the proximity of the keywords in the query to make a more educated guess. The first approach is based on an extension of the Hungarian algorithm for identifying the data structures having the maximum likelihood to contain the user keywords. In the second approach, the problem of associating keywords with data structures of the relational source is modeled by means of a hidden Markov model, and the Viterbi algorithm is exploited for computing the mappings. Both techniques have been implemented in two systems, called KEYMANTIC and KEYRY, respectively.

Milc J; Sala A; Bergamaschi S; Pecchioni N. ( 2011 ) - A genotypic and phenotypic information source for marker-assisted selection of cereals: the CEREALAB database - DATABASE - n. volume 2011 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The CEREALAB database aims to store genotypic and phenotypic data obtained by the CEREALAB project and to integrate them with already existing data sources in order to create a tool for plant breeders and geneticists. The database can help them in unravelling the genetics of economically important phenotypic traits; in identifying and choosing molecular markers associated to key traits; and in choosing the desired parentals for breeding programs. The database is divided into three sub-schemas corresponding to the species of interest: wheat, barley and rice; each sub-schema is then divided into two sub-ontologies, regarding genotypic and phenotypic data, respectively.

Sonia Bergamaschi; Francesco Guerra; Silvia Rota; Yannis Velegrakis ( 2011 ) - A Hidden Markov Model Approach to Keyword-Based Search over Relational Databases ( Conceptual Modeling (ER2011) - Brussels - 30/10/2011 - 03/11/2011) ( - Conceptual Modeling - ER 2011, 30th International Conference, ER 2011, Brussels, Belgium, October 31 - November 3, 2011. Proceedings ) (Springer Heidelberg DEU ) - pp. da 411 a 420 ISBN: 9783642246050 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We present a novel method for translating keyword queries over relational databases into SQL queries with the same intended semantic meaning. In contrast to the majority of the existing keyword-based techniques, our approach does not require any a-priori knowledge of the data instance. It follows a probabilistic approach based on a Hidden Markov Model for computing the top-K best mappings of the query keywords into the database terms, i.e., tables, attributes and values. The mappings are then used to generate the SQL queries that are executed to produce the answer to the keyword query. The method has been implemented in a system called KEYRY (from KEYword to queRY).

D. Beneventano; S. Bergamaschi; C. Gennaro; F. Rabitti ( 2011 ) - A mediator-based approach for integrating heterogeneous multimedia sources - MULTIMEDIA TOOLS AND APPLICATIONS - n. volume 55 - pp. da 1 a 24 ISSN: 1380-7501 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In many applications, the information required by the user cannot be found in just one source, but has to be retrieved from many varying sources. This is true not only of formatted data in database management systems, but also of textual documents and multimedia data, such as images and videos. We propose a mediator system that provides the end-user with a single query interface to an integrated view of multiple heterogeneous data sources. We exploit the capabilities of the MOMIS integration system and the MILOS multimedia data management system. Each multimedia source is managed by an instance of MILOS, in which a collection of multimedia records is made accessible by means of similarity searches employing the query-by-example paradigm. MOMIS provides an integrated virtual view of the underlying multimedia sources, thus offering unified multimedia access services. Two notable features are that MILOS is flexible (it is not tied to any particular similarity function) and that MOMIS's mediator query processor exploits only the ranks of the local answers.

Sonia Bergamaschi; Francesco Guerra; Mirko Orsini; Claudio Sartori; Maurizio Vincini ( 2011 ) - A Semantic Approach to ETL Technologies - DATA & KNOWLEDGE ENGINEERING - n. volume 70(8) - pp. da 717 a 731 ISSN: 0169-023X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Data warehouse architectures rely on extraction, transformation and loading (ETL) processes for the creation of an updated, consistent and materialized view of a set of data sources. In this paper, we aim to support these processes by proposing a tool for the semi-automatic definition of inter-attribute semantic mappings and transformation functions. The tool is based on semantic analysis of the schemas for the mapping definitions amongst the data sources and the data warehouse, and on a set of clustering techniques for defining transformation functions homogenizing data coming from multiple sources. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.

Sonia Bergamaschi; Matteo Interlandi; Maurizio Vincini ( 2011 ) - A Web Platform for Collaborative Multimedia Content Authoring Exploiting Keyword Search Engine and Data Cloud - JOURNAL OF COMPUTER SCIENCE AND ENGINEERING - n. volume 8, issue 2 - pp. da 1 a 8 ISSN: 2043-9091 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The composition of multimedia presentations is a time- and resource-consuming task if not approached in a well-defined manner. This is particularly true when people having different roles and following different high-level directives collaborate in the authoring and assembling of a final product. For this reason we adopt the Select, Assemble, Transform and Present (SATP) approach to coordinate the presentation authoring and a tag cloud-based search engine to help users efficiently retrieve useful assets. In this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created.

S. Bergamaschi; F. Ferrari; M. Interlandi; M. Vincini ( 2011 ) - A web-based platform for multimedia content authoring exploiting keyword search engine and data cloud ( International Conference on Information Society. i-Society 2011 - London - June, 2011) ( - International Conference on Information Society. i-Society 2011 ) (IEEE UK/RI Computer Chapter London GBR ) - pp. da 1 a 5 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The composition of multimedia presentations is a time- and resource-consuming task if not approached in a well-defined manner. This is particularly true when people having different roles and following different high-level directives collaborate in the authoring and assembling of a final product. For this reason we adopt the Select, Assemble, Transform and Present (SATP) approach to coordinate the presentation authoring and a tag cloud-based search engine to help users efficiently retrieve useful assets. In this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created.

Sonia Bergamaschi; Domenico Beneventano; Laura Po; Serena Sorrentino ( 2011 ) - Automatic Normalization and Annotation for Discovering Semantic Mappings ( SeCO Workshop on Search Computing - Como, Italy - May 25-31, 2010) ( - Search Computing: Trends and Developments ) (Springer Heidelberg DEU ) - n. volume 6585 - pp. da 85 a 100 ISBN: 9783642196676 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Schema matching is the problem of finding relationships among concepts across heterogeneous data sources (heterogeneous in format and in structure). Starting from the "hidden meaning" associated with schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a "meaning" with schema labels. However, the accuracy of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and word abbreviations. In this work, we address this problem by proposing a method to perform schema label normalization which increases the number of comparable labels. Unlike other solutions, the method semi-automatically expands abbreviations and annotates compound terms with minimal manual effort. We empirically prove that our normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching accuracy.

Sonia Bergamaschi; Domenico Beneventano; Francesco Guerra; Mirko Orsini ( 2011 ) - Data Integration ( - Handbook of Conceptual Modeling ) (Springer Berlin DEU ) - pp. da 441 a 476 ISBN: 9783642158643 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Given the many data integration approaches, a complete and exhaustive comparison of all the research activities is not possible. In this chapter, we will present an overview of the most relevant research activities and ideas in the field investigated in the last 20 years. We will also introduce the MOMIS system, a framework to perform information extraction and integration from both structured and semistructured data sources, that is one of the most interesting results of our research activity. An open source version of the MOMIS system was delivered by the academic startup DataRiver (www.datariver.it).

Sonia Bergamaschi; Francesco Guerra; Silvia Rota; Yannis Velegrakis ( 2011 ) - KEYRY: A Keyword-Based Search Engine over Relational Databases Based on a Hidden Markov Model ( Conceptual Modeling (ER2011) - Demo - Brussels - 30/10/2011 - 03/11/2011) ( - Advances in Conceptual Modeling. Recent Developments and New Directions - ER 2011 Workshops FP-UML, MoRE-BI, Onto-CoM, SeCoGIS, Variability@ER, WISM, Brussels, Belgium, October 31 - November 3, 2011. Proceedings ) (Springer Heidelberg DEU ) - pp. da 328 a 331 ISBN: 9783642245732 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We propose the demonstration of KEYRY, a tool for translating keyword queries over structured data sources into queries in the native language of the data source. KEYRY does not assume any prior knowledge of the source contents. This allows it to be used in situations where traditional keyword search techniques over structured data that require such knowledge cannot be applied, i.e., sources on the hidden web or those behind wrappers in integration systems. In KEYRY the search process is modeled as a Hidden Markov Model and the List Viterbi algorithm is applied to compute the top-k queries that best represent the intended meaning of a user keyword query. We demonstrate the tool's capabilities, and we show how the tool is able to improve its behavior over time by exploiting implicit user feedback provided through the selection among the top-k solutions generated.

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Raquel Trillo Lado; Yannis Velegrakis ( 2011 ) - Keyword search over relational databases: a metadata approach ( ACM SIGMOD International Conference on Management of Data - Athens - June 12-16, 2011) ( - Proceedings of the ACM SIGMOD International Conference on Management of Data ) (ACM New York USA ) - pp. da 565 a 576 ISBN: 9781450306614 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Keyword queries offer a convenient alternative to traditional SQL in querying relational databases with large, often unknown, schemas and instances. The challenge in answering such queries is to discover their intended semantics, construct the SQL queries that describe them and use them to retrieve the respective tuples. Existing approaches typically rely on indices built a-priori on the database content. This seriously limits their applicability if a-priori access to the database content is not possible. Examples include on-line databases accessed through web interfaces, or the sources in information integration systems that operate behind wrappers with specific query capabilities. Furthermore, the existing literature has not studied to its full extent the inter-dependencies across the ways the different keywords are mapped into the database values and schema elements. In this work, we describe a novel technique for translating keyword queries into SQL based on the Munkres (a.k.a. Hungarian) algorithm. Our approach not only tackles the above two limitations, but it offers significant improvements in the identification of the semantically meaningful SQL queries that describe the intended keyword query semantics. We provide details of the technique implementation and an extensive experimental evaluation.
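To illustrate the assignment problem at the core of this line of work: given keyword-to-schema-element match scores, the Hungarian/Munkres algorithm finds the one-to-one assignment maximizing the total score in O(n^3). The brute-force sketch below computes the same optimum on a toy instance; the keywords, schema elements and scores are all hypothetical, not taken from the paper.

```python
from itertools import permutations

# Hypothetical match scores: rows are keywords, columns are schema elements.
keywords = ["person", "rome", "phone"]
elements = ["Person (table)", "City.name (attr)", "Person.phone (attr)"]
score = [
    [0.9, 0.1, 0.2],  # "person"
    [0.1, 0.8, 0.1],  # "rome"
    [0.2, 0.1, 0.9],  # "phone"
]

def best_assignment():
    """Exhaustively find the keyword-to-element assignment with maximum total score.
    The Hungarian (Munkres) algorithm computes the same optimum without
    enumerating all n! permutations."""
    best, best_perm = -1.0, None
    for perm in permutations(range(len(elements))):
        total = sum(score[i][perm[i]] for i in range(len(keywords)))
        if total > best:
            best, best_perm = total, perm
    return {keywords[i]: elements[best_perm[i]] for i in range(len(keywords))}

print(best_assignment())
# → {'person': 'Person (table)', 'rome': 'City.name (attr)', 'phone': 'Person.phone (attr)'}
```

The key property the abstract points at is that the scores are not independent: choosing "Person (table)" for "person" removes it from the pool for the other keywords, so the best mapping must be found globally rather than keyword by keyword.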

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Raquel Trillo Lado; Yannis Velegrakis ( 2011 ) - Keyword-based Search in Data Integration Systems ( Italian Symposium on Advanced Database Systems - Maratea - 26-29/06/2011) ( - Proceedings of the 19th Italian Symposium on Advanced Database Systems ) (Università della Basilicata Potenza ITA ) - pp. da 103 a 110 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we describe Keymantic, a framework for translating keyword queries into SQL queries by assuming that the only available information is the source metadata, i.e., the schema and some external auxiliary information. Such a framework finds application when only intensional knowledge about the data source is available, as in Data Integration Systems.

Sonia Bergamaschi; Matteo Interlandi; Maurizio Vincini ( 2011 ) - MediaBank: Keyword Search and Tag Cloud Functionalities for a Multimedia Content Authoring Web Platform - INTERNATIONAL JOURNAL OF MULTIMEDIA AND IMAGE PROCESSING - n. volume N. 1, Issue 3/4 - pp. da 79 a 96 ISSN: 2042-4647 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The composition of multimedia presentations is a time- and resource-consuming task if not approached in a well-defined manner. This is particularly true when people having different roles and following different high-level directives collaborate in the authoring and assembling of a final product. For this reason we adopt the Select, Assemble, Transform and Present (SATP) approach to coordinate the presentation authoring and a tag cloud-based search engine to help users efficiently retrieve useful assets. In the first part of this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created. In the second part we describe how we adopt keyword search techniques coupled with a Tag Cloud in order to summarize the results over the stored data.

S. Bergamaschi; F. Ferrari; M. Interlandi; M. Vincini ( 2011 ) - MediaPresenter, a web platform for multimedia content management ( Nineteenth Italian Symposium on Advanced Database Systems - Maratea - June 26-29, 2011) ( - Proceedings of the Nineteenth Italian Symposium on Advanced Database Systems, SEBD2011 ) (Università della Basilicata POTENZA ITA ) - n. volume 1 - pp. da 435 a 442 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The composition of multimedia presentations is a time- and resource-consuming task if not afforded in a well-defined manner. This is particularly true for medium/big companies, where people having different roles and following different high-level directives collaborate in the authoring and assembling of a final product. In this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created.

Carlo Batini; Domenico Beneventano; Sonia Bergamaschi; Tiziana Catarci ( 2011 ) - Semantic Integration of Data, Multimedia, Services - Editorial - INFORMATION SYSTEMS - n. volume 36(2) - pp. da 115 a 116 ISSN: 0306-4379 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Research efforts on structured data, multimedia, and services have involved non-overlapping communities. However, from a user perspective, the three kinds of information should behave and be accessed similarly. Instead, a user has to deal with different tools in order to gain complete knowledge about a domain: no automatically computed, integrated view comprises the data, multimedia, and services retrieved by the specific tools. A unified approach for dealing with different kinds of information may allow searches across different domains and different starting points/results in the searching processes. Multiple and challenging research issues have to be addressed to achieve this goal, including: mediating among different models for representing information, developing new techniques for extracting and mapping relevant information from heterogeneous kinds of data, devising innovative paradigms for formulating and processing queries ranging over both (multimedia) data and services, and investigating new models for visualizing the results and allowing the user to easily manipulate them. This special issue "Semantic Integration of Data, Multimedia, and Services" presents advances in data, multimedia, and services interoperability.

Sonia Bergamaschi; Marius Octavian Olaru; Serena Sorrentino; Maurizio Vincini ( 2011 ) - Semi-automatic Discovery of Mappings Between Heterogeneous Data Warehouse Dimensions ( CIT 2011 - Amsterdam - December 2011) ( - International Conference on Advances in Communication and Information Technology - CIT 2011 ) (IDES Conference Publishing System Amsterdam NLD ) - pp. da 1 a 10 ISBN: 9789081906708 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Data Warehousing is the main Business Intelligence instrument for the analysis of large amounts of data. It permits the extraction of relevant information for decision making processes inside organizations. Given the great diffusion of Data Warehouses, there is an increasing need to integrate information coming from independent Data Warehouses or from independently developed data marts in the same Data Warehouse. In this paper, we provide a method for the semi-automatic discovery of common topological properties of dimensions that can be used to automatically map elements of different dimensions in heterogeneous Data Warehouses. The method uses techniques from the Data Integration research area and combines topological properties of dimensions in a multidimensional model.

Sonia Bergamaschi; Marius Octavian Olaru; Serena Sorrentino; Maurizio Vincini ( 2011 ) - Semi-automatic Discovery of Mappings Between Heterogeneous Data Warehouse Dimensions - ACEEE INTERNATIONAL JOURNAL ON INFORMATION TECHNOLOGY - n. volume Vol. 1 Issue 3 - pp. da 38 a 46 ISSN: 2158-012X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Data Warehousing is the main Business Intelligence instrument for the analysis of large amounts of data. It permits the extraction of relevant information for decision making processes inside organizations. Given the great diffusion of Data Warehouses, there is an increasing need to integrate information coming from independent Data Warehouses or from independently developed data marts in the same Data Warehouse. In this paper, we provide a method for the semi-automatic discovery of common topological properties of dimensions that can be used to automatically map elements of different dimensions in heterogeneous Data Warehouses. The method uses techniques from the Data Integration research area and combines topological properties of dimensions in a multidimensional model.

Silvia Rota; Sonia Bergamaschi; Francesco Guerra ( 2011 ) - The List Viterbi Training Algorithm and Its Application to Keyword Search over Databases ( CIKM’11 - Glasgow - October 24–28, 2011) ( - Proceedings of the 20th ACM Conference on Information and Knowledge Management ) (ACM New York USA ) - pp. da 1601 a 1606 ISBN: 9781450307178 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Hidden Markov Models (HMMs) are today employed in a variety of applications, ranging from speech recognition to bioinformatics. In this paper, we present the List Viterbi training algorithm, a version of the Expectation-Maximization (EM) algorithm based on the List Viterbi algorithm instead of the commonly used forward-backward algorithm. We developed the batch and online versions of the algorithm, and we also describe an interesting application in the context of keyword search over databases, where we exploit an HMM for matching keywords into database terms. In our experiments we tested the online version of the training algorithm in a semi-supervised setting that allows us to take into account the feedback provided by the users.
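
For readers unfamiliar with Viterbi decoding on HMMs, the sketch below shows the standard (single-best) Viterbi algorithm on a toy discrete HMM. It is purely illustrative and is not the paper's method: the List Viterbi variant the paper builds on returns the top-k state sequences rather than only the best one, and the toy weather model below is a common textbook example, not data from the paper.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state sequence for the observations `obs`."""
    # V[t][s] = (probability of the best path ending in state s at time t, backpointer)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            best_prev = max(states, key=lambda p: V[t - 1][p][0] * trans_p[p][s])
            V[t][s] = (V[t - 1][best_prev][0] * trans_p[best_prev][s] * emit_p[s][obs[t]],
                       best_prev)
    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

# Toy model: hidden weather states, observed activities.
states = ["Rainy", "Sunny"]
start_p = {"Rainy": 0.6, "Sunny": 0.4}
trans_p = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
           "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit_p = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
          "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}
best = viterbi(["walk", "shop", "clean"], states, start_p, trans_p, emit_p)
print(best)  # ['Sunny', 'Rainy', 'Rainy']
```

A List Viterbi algorithm generalizes the backtracking step to keep the k best partial paths per state instead of one, which is what makes it usable inside an EM-style training loop as the paper describes.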

Sonia Bergamaschi; Domenico Beneventano; Alberto Corni; Entela Kazazi; Mirko Orsini; Laura Po; Serena Sorrentino ( 2011 ) - The Open Source release of the MOMIS Data Integration System ( the 19th Italian Symposium on Advanced Database Systems (SEBD2011) - Maratea, Italy - June 26-29, 2011) ( - Proceedings of the Nineteenth Italian Symposium on Advanced Database Systems ) (Università della Basilicata Potenza ITA ) - pp. da 175 a 186 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

MOMIS (Mediator EnvirOnment for Multiple Information Sources) is an Open Source Data Integration system able to aggregate data coming from heterogeneous data sources (structured and semi-structured) in a semi-automatic way. DataRiver is a spin-off of the University of Modena and Reggio Emilia that has re-engineered the MOMIS system and released its Open Source version for both commercial and academic use. The MOMIS system has been extended with a set of features to minimize the integration process costs, exploiting the semantics of the data sources and optimizing each integration phase. The Open Source MOMIS system has been successfully applied in several industrial sectors: medical, agro-food, tourism, textile, mechanical, and logistics. This paper describes the features of the Open Source MOMIS system and how it is able to address real data integration challenges.

Sonia Bergamaschi; Francesco Guerra; Silvia Rota; Yannis Velegrakis ( 2011 ) - Understanding linked open data through keyword searching: the KEYRY approach ( Linked Web Data Management - Uppsala, Sweden - 25 March 2011) ( - Proceedings of the 1st International Workshop on Linked Web Data Management ) (ACM New York USA ) - pp. da 34 a 35 ISBN: 9781450306089 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We introduce KEYRY, a tool for translating keyword queries over structured data sources into queries formulated in their native query language. Since it is not based on an analysis of the data source contents, KEYRY finds application in scenarios where sources hold complex and huge schemas subject to frequent changes, such as sources belonging to the linked open data cloud. KEYRY is based on a probabilistic approach that provides the top-k results that best approximate the intended meaning of the user query.

R. Brunetti; S. Bergamaschi; P. Bandieri ( 2011 ) - Uomini e Computer: La nuova alleanza [Altro (298) - Altro]
Abstract

Most of us have some familiarity with computers, with the Internet and the Web, and with image and sound processing tools. Computer science, however (that is, the set of processes and technologies that make possible the creation, collection, processing, storage, and transmission of information by automatic means), the youngest of the exact sciences, remains largely unknown to most people. The computer, the protagonist par excellence of the cultural revolution introduced by the development of information technologies, has changed in size and capability over roughly fifty years, becoming an indispensable tool for broadening our technological and cultural horizons and for participating in the "global culture" created by computer networks. There is much debate over whether the enormous amount of information that can be manipulated and accessed through the computer actually leads to cultural "progress", or instead to a loss of control over one's identity and true character, in an unavoidable confusion between the doctored, distorted image of the "virtual" world and the real world that generates it. An exploration of the most advanced technological and cultural applications of computer science offers the possibility of recovering a "new alliance" between humans and machines, between the irreproducible individuality of the human mind and the multiplication of the self created by the Web, between the creative act of the artist and the exact laws that govern the behavior of computers. The aim of the project is to bring students of primary and secondary schools and the citizens of Modena closer to the most advanced developments of computer science and technology.
The discussion will focus mainly on the technological progress connected with robotics, computational models for artificial intelligence, the revolution in information management introduced by the Web, and the impact of computer science on the worlds of entertainment and art. The following were realized: 3 laboratories for primary schools, 3 meetings for secondary schools, 3 evening lectures, and a DVD produced at the end of the experience.

Raquel Trillo; Laura Po; Sergio Ilarri; Sonia Bergamaschi; Eduardo Mena ( 2011 ) - Using semantic techniques to access web data - INFORMATION SYSTEMS - n. volume 36(2) - pp. da 117 a 133 ISSN: 0306-4379 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Nowadays, people frequently use different keyword-based web search engines to find the information they need on the web. However, many words are polysemous and, when these words are used to query a search engine, its output usually includes links to web pages referring to their different meanings. Besides, results with different meanings are mixed up, which makes the task of finding the relevant information difficult for the users, especially if the user-intended meanings behind the input keywords are not among the most popular on the web. In this paper, we propose a set of semantic techniques to group the results provided by a traditional search engine into categories defined by the different meanings of the input keywords. Differently from other proposals, our method considers the knowledge provided by ontologies available on the web in order to dynamically define the possible categories. Thus, it is independent of the sources providing the results that must be grouped. Our experimental results show the benefits of the proposal.

Francesco Guerra; Sonia Bergamaschi ( 2011 ) - 2nd International Workshop on Data Engineering meets the Semantic Web [Esposizione (290) - Esposizione]
Abstract

The goal of DESWeb is to bring together researchers and practitioners from both fields of Data Management and Semantic Web. It aims at investigating the new challenges that Semantic Web technologies have introduced and new ways through which these technologies can improve existing data management solutions. Furthermore, it intends to study what data management systems and technologies can offer in order to improve the scalability and performance of Semantic Web applications.

S. Bergamaschi; L. Sgaravato; M. Vincini ( 2010 ) - A COMPLETE LCA DATA INTEGRATION SOLUTION BUILT UPON MOMIS SYSTEM ( 18th Italian Symposium on Advanced Database Systems (SEBD 2010) - Rimini - June, 20-23, 2010) ( - Proceedings of the 18th Italian Symposium on Advanced Database Systems (SEBD 2010) ) (Soc. Editrice ESCULAPIO Bologna ITA ) - pp. da 342 a 349 ISBN: 9788874883691 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Life Cycle Thinking is day by day spreading outside scientific circles to assume a key role in the modern production system. Rewarded by consumers or regulated by governments, an increasing number of firms is focusing on the assessment of their industrial processes. ENEA supports the adoption of such practices in small companies by supplying them with simplified LCA tools; extending their database with valuable and up-to-date data published by the European Commission is of primary importance in order to provide effective assistance. This paper presents and demonstrates how the MOMIS (and RELEVANT) systems may be coupled and extended to provide time- and effort-effective support in developing and deploying such an integration solution. The paper describes all the stages involved in the Extract, Transform and Load process, with strong emphasis on the benefits the integration designer can achieve by means of the semi-automatic definition of inter-attribute mappings and transformation functions [1,2] on a large number of records.

S.R.H. Joseph; Z. Despotovic; G. Moro; S. Bergamaschi ( 2010 ) - Agents and Peer-to-Peer Computing, 6th International Workshop, AP2PC 2007, Honolulu, Hawaii, USA, May 14-18, 2007, Revised and Selected Papers (Springer Heidelberg DEU ) - pp. da 1 a 123 ISBN: 9783642113673 [Curatela (284) - Curatela]
Abstract

Peer-to-peer (P2P) computing has attracted significant media attention, initially spurred by the popularity of file-sharing systems such as Napster, Gnutella, and Morpheus. More recently systems like BitTorrent and eDonkey have continued to sustain that attention. New techniques such as distributed hash-tables (DHTs), semantic routing, and Plaxton Meshes are being combined with traditional concepts such as Hypercubes, Trust Metrics, and caching techniques to pool together the untapped computing power at the “edges” of the Internet. These new techniques and possibilities have generated a lot of interest in many industrial organizations, and resulted in the creation of a P2P working group on standardization in this area (http://www.irtf.org/charter?gtype=rg&group=p2prg). In P2P computing, peers and services forego central coordination and dynamically organize themselves to support knowledge sharing and collaboration, in both cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. First, the ability to ensure equitable distribution of content and services. Economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the “free riding” issue. Second, the ability to enforce provision of trusted services. Reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. The trust models must balance both constraints imposed by the environment (e.g., scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing in an attempt to achieve even higher ubiquitousness.
The possibility of services related to physical location and the relation with agents in physical proximity could introduce new opportunities and also new technical challenges. Although researchers working on distributed computing, multi-agent systems, databases, and networks have been using similar concepts for a long time, it is only fairly recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, multi-agent systems have always been thought of as collections of peers. The multi-agent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, dynamic growth, and other advantages of P2P also leads to significant potential problems. Most prominent among these problems are coordination: the ability of an agent to make decisions on its own actions in the context of activities of other agents, and scalability: the value of the P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so forth. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen potential application domains. These two problems are common to many large-scale applications. Without coordination, agents may be wasting their efforts, squandering resources, and failing to achieve their objectives in situations requiring collective effort. This workshop series brings together researchers working on agent systems and P2P computing with the intention of strengthening this connection.
Researchers from other related areas such as distributed systems, networks, and database systems are also welcome (and, in our opinion, have a lot to contribute). We sought high-quality and original contributions on the general theme of “Agents and P2P Computing.”

Laura Po; Sonia Bergamaschi ( 2010 ) - Automatic Lexical Annotation Applied to the SCARLET Ontology Matcher ( The Second Asian Conference on Intelligent Information and Database Systems - Hue City, Vietnam - 24-26 Marzo 2010) ( - Intelligent Information and Database Systems ) (Springer Berlin Heidelberg DEU ) - n. volume 5991 - pp. da 144 a 153 ISBN: 978-364212100-5 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper proposes lexical annotation as an effective method to solve the ambiguity problems that affect ontology matchers. Lexical annotation associates to each ontology element a set of meanings belonging to a semantic resource. Performing lexical annotation on the ontologies involved in the matching process makes it possible to detect false positive mappings and to enrich matching results by adding new mappings (i.e. lexical relationships between elements on the basis of the semantic relationships holding among meanings). The paper explains how to apply lexical annotation to the results obtained by a matcher. In particular, the paper shows an application on the SCARLET matcher. We adopt an experimental approach on two test cases, where SCARLET was previously tested, to investigate the potential of lexical annotation. Experiments yielded promising results, showing that lexical annotation improves the precision of the matcher.

Sonia Bergamaschi; Domenico Beneventano; Riccardo Martoglia ( 2010 ) - Facilitate IT-providing SMEs by Operation-related Models and Methods [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

The starting point of this project was a specific need of the SME associations (AGs) involved: providing the new, innovative services demanded by their members. With globalization, SMEs operating in ICT began acquiring customers abroad and receiving requests for more complex, distributed applications. Some changed their development technology, but almost none changed the organizational approach to the whole process, being too busy tracking customers. Now, however, a deep organizational change is mandatory for SMEs in this sector as well, because the risk is losing market share due to high development costs. This request for support raised by ICT SMEs perfectly matched the need for innovation of SME AGs like those involved in the FACIT-SME project. The natural evolution path for SME AGs was, and is, to mature the capability of providing members with services that really affect competitiveness, and from this consideration the FACIT-SME project was started.

Sonia Bergamaschi; Francesco Guerra; Barry Leiba ( 2010 ) - Guest editors' introduction: Information overload - IEEE INTERNET COMPUTING - n. volume 14(6) - pp. da 10 a 13 ISSN: 1089-7801 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Search the Internet for the phrase “information overload definition,” and Google will return some 7,310,000 results (at the time of this writing). Bing gets 9,760,000 results for the same query. How is it possible for us to process that much data, to select the most interesting information sources, to summarize and combine different facets highlighted in the results, and to answer the questions we set out to ask? Information overload is present in everything we do on the Internet. Despite the number of occurrences of the term on the Internet, peer-reviewed literature offers only a few accurate definitions of information overload. Among them, we prefer the one that defines it as the situation that “occurs for an individual when the information processing demands on time (Information Load, IL) to perform interactions and internal calculations exceed the supply or capacity of time available (Information Processing Capacity, IPC) for such processing.”1 In other words, when the information available exceeds the user’s ability to process it. This formal definition provides a measure that we can express algebraically as IL > IPC, offering a way for classifying and comparing the different situations in which the phenomenon occurs. But measuring IL and IPC is a complex task because they strictly depend on a set of factors involving both the individual and the information (such as the individual’s skill), as well as the motivations and goals behind the information request. Clay Shirky, who teaches at New York University, takes a different view, focusing on how we sift through the information that’s available to us.
We’ve long had access to “more reading material than you could finish in a lifetime,” he says, and “there is no such thing as information overload, there’s only filter failure.”2 But however we look at it, whether it’s too much production or failure in filtering, it’s a general and common problem, and information overload management requires the study and adoption of special, user- and context-dependent solutions. Due to the amount of information available that comes with no guarantee of importance, trust, or accuracy, the Internet’s growth has inevitably amplified preexisting information overload issues. Newspapers, TV networks, and press agencies form an interesting example of overload producers: they collectively make available hundreds of thousands of partially overlapping news articles each day. This large quantity gives rise to information overload in a “spatial” dimension (news articles about the same subject are published in different newspapers) and in a “temporal” dimension (news articles about the same topic are published and updated many times in a short time period). The effects of information overload include difficulty in making decisions due to time spent searching and processing information,3 inability to select among multiple information sources providing information about the same topic,4 and psychological issues concerning excessive interruptions generated by too many information sources.5 To put it colloquially, this excess of information stresses Internet users out.
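
The algebraic condition quoted in the editorial (overload occurs when IL > IPC) can be stated as a trivial predicate. The sketch below is purely illustrative: the function name, the use of minutes as the unit of time, and the example values are assumptions for demonstration, not from the editorial.

```python
def is_overloaded(il_minutes: float, ipc_minutes: float) -> bool:
    """True when the time demanded by information processing (IL)
    exceeds the time available for such processing (IPC), i.e. IL > IPC."""
    return il_minutes > ipc_minutes

# Hypothetical example: 90 minutes of processing demand vs. 60 available.
print(is_overloaded(90.0, 60.0))  # True: demand exceeds capacity
print(is_overloaded(30.0, 60.0))  # False
```

As the editorial notes, the hard part is not the comparison itself but estimating IL and IPC, which depend on the individual's skill and on the motivations behind the information request.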

Sonia Bergamaschi; Francesco Guerra; Barry Leiba ( 2010 ) - IEEE Internet Computing Special Issue on Information Overload (IEEE Computer Society Los Alamitos USA ) - IEEE INTERNET COMPUTING - pp. da 10 a 13 ISSN: 1089-7801 [Curatela (284) - Curatela]
Abstract

Search the Internet for the phrase “information overload definition,” and Google will return some 7,310,000 results (at the time of this writing). Bing gets 9,760,000 results for the same query. How is it possible for us to process that much data, to select the most interesting information sources, to summarize and combine different facets highlighted in the results, and to answer the questions we set out to ask? Information overload is present in everything we do on the Internet.Despite the number of occurrences of the term on the Internet, peer-reviewed literature offers only a few accurate definitions of information overload.Among them, we prefer the one that defines it as the situation that “occurs for an individual when the information processing demands on time (Information Load, IL) to perform interactions and internal calculations exceed the supply or capacity of time available (Information Processing Capacity, IPC) for such processing.” In other words, when the information available exceeds the user’s ability to process it. This formal definition provides a measure that we can express algebraically as IL > IPC, offering a way for classifying and comparing the different situations in which the phenomenon occurs. But measuring IL and IPC is a complex task because they strictly depend on a set of factors involving both the individual and the information (such as the individual’s skill), as well as the motivations and goals behind the information request. Clay Shirky, who teaches at New York University, takes a different view, focusing on how we sift through the information that’s available to us.
We’ve long had access to “more reading material than you could finish in a lifetime,” he says, and “there is no such thing as information overload, there’s only filter failure.” But however we look at it, whether it’s too much production or failure in filtering, it’s a general and common problem, and information overload management requires the study and adoption of special, user- and context-dependent solutions. Due to the amount of information available that comes with no guarantee of importance, trust, or accuracy, the Internet’s growth has inevitably amplified preexisting information overload issues. Newspapers, TV networks, and press agencies form an interesting example of overload producers: they collectively make available hundreds of thousands of partially overlapping news articles each day. This large quantity gives rise to information overload in a “spatial” dimension (news articles about the same subject are published in different newspapers) and in a “temporal” dimension (news articles about the same topic are published and updated many times in a short time period). The effects of information overload include difficulty in making decisions due to time spent searching and processing information, inability to select among multiple information sources providing information about the same topic, and psychological issues concerning excessive interruptions generated by too many information sources. To put it colloquially, this excess of information stresses Internet users out.

S. Bergamaschi; E. Domnori; F. Guerra; M. Orsini; R. Trillo Lado; Y. Velegrakis ( 2010 ) - Keymantic: Semantic Keyword-based Searching in Data Integration Systems - PROCEEDINGS OF THE VLDB ENDOWMENT - n. volume 3(2) - pp. da 1637 a 1640 ISSN: 2150-8097 [Articolo in rivista (262) - Articolo su rivista]
Abstract

We propose the demonstration of Keymantic, a system for keyword-based searching in relational databases that does not require a-priori knowledge of instances held in a database. It finds numerous applications in situations where traditional keyword-based searching techniques are inapplicable due to the unavailability of the database contents for the construction of the required indexes.

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Mirko Orsini; Raquel Trillo Lado; Yannis Velegrakis ( 2010 ) - Keymantic: Semantic Keyword-based Searching in Data Integration Systems [Software (296) - Software]
Abstract

Keymantic is a system for keyword-based searching in relational databases that does not require a-priori knowledge of instances held in a database. It finds numerous applications in situations where traditional keyword-based searching techniques are inapplicable due to the unavailability of the database contents for the construction of the required indexes.

D. Beneventano; S. Bergamaschi; M. Orsini; M. Vincini ( 2010 ) - MOMIS: Getting through the THALIA benchmark ( 18th Italian Symposium on Advanced Database Systems (SEBD 2010) - Rimini - June, 20-23, 2010) ( - Proceedings of the 18th Italian Symposium on Advanced Database Systems (SEBD 2010) ) (Soc. Editrice ESCULAPIO Bologna ITA ) - pp. da 354 a 357 ISBN: 9788874883691 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

During the last decade many data integration systems characterized by a classical wrapper/mediator architecture based on a Global Virtual Schema (Global Virtual View - GVV) have been proposed. The data sources store data, while the GVV provides a reconciled, integrated, and virtual view of the underlying sources. Each proposed system contributes to the advancement of the state of the art by focusing on different aspects to provide an answer to one or more challenges of the data integration problem, ranging from system-level heterogeneities to structural and syntax-level heterogeneities, up to the semantic level. The approaches are still in part manual, requiring a great amount of customization for data reconciliation and for writing specific non-reusable programming code. The specialization of mediator systems makes comparisons among the various systems difficult. Therefore, the last Lowell Report [1] has provided the guideline for the definition of a public benchmark for data integration problems. The proposal is called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches) [2], and it provides researchers with a collection of downloadable data sources representing University course catalogues, a set of twelve benchmark queries, as well as a scoring function for ranking the performance of an integration system. In this paper we show how the MOMIS mediator system we developed [3,4] can deal with all twelve queries of the THALIA benchmark by simply extending and combining the declarative translation functions available in MOMIS, without any overhead of new code. This is a remarkable result; in fact, as far as we know, no other system has provided a complete answer to the benchmark.

Serena Sorrentino; Sonia Bergamaschi; Maciej Gawinecki; Laura Po ( 2010 ) - Schema Label Normalization for Improving Schema Matching - DATA & KNOWLEDGE ENGINEERING - n. volume 69(12) - pp. da 1254 a 1273 ISSN: 0169-023X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and in structure. Starting from the “hidden meaning” associated with schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a “meaning” to schema labels. However, the performance of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns, abbreviations, and acronyms. We address this problem by proposing a method to perform schema label normalization which increases the number of comparable labels. The method semi-automatically expands abbreviations/acronyms and annotates compound nouns, with minimal manual effort. We empirically prove that our normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching results.
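
To make the normalization idea concrete, here is a minimal, hypothetical sketch of the kind of label-normalization step the abstract describes: splitting compound labels and expanding abbreviations so that labels from different schemata become comparable. The abbreviation dictionary, the camel-case splitting rule, and the function name are illustrative assumptions, not the paper's actual method (which works semi-automatically against a lexical resource).

```python
import re

# Illustrative abbreviation dictionary (an assumption; the paper expands
# abbreviations semi-automatically rather than from a fixed table).
ABBREVIATIONS = {"qty": "quantity", "addr": "address", "cust": "customer"}

def normalize_label(label: str) -> str:
    """Split a schema label into words and expand known abbreviations."""
    # Split camelCase/PascalCase boundaries and underscores into separate words.
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", label).replace("_", " ")
    # Expand known abbreviations; lowercase everything for comparability.
    return " ".join(ABBREVIATIONS.get(w.lower(), w.lower()) for w in spaced.split())

print(normalize_label("custAddr"))   # -> customer address
print(normalize_label("order_qty"))  # -> order quantity
```

After such a step, labels like `custAddr` and `Customer_Address` map to comparable word sequences, which is what lets a lexical annotator find a dictionary meaning for them.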

S. Bergamaschi; S. Lodi; R. Martoglia; C. Sartori ( 2010 ) - SEBD 2010 Proceedings of the 18th Italian Symposium on Advanced Database Systems (ESCULAPIO - Bologna Bologna ITA ) - pp. da 1 a 476 ISBN: 9788874883691 [Curatela (284) - Curatela]
Abstract

Preface

This volume collects the papers selected for presentation at the Eighteenth Italian Symposium on Advanced Database Systems (SEBD 2010), held in Rimini, Italy, from the 20th to the 23rd of June 2010. SEBD is the major annual event of the Italian database research community. The symposium is conceived as a gathering forum for the discussion and exchange of ideas and experiences among researchers and experts from academia and industry, about all aspects of database systems and their applications.

SEBD is back in Rimini after sixteen years, and it is interesting to observe how the landscape of the Italian database research community has changed. In 1994 twenty-one papers were accepted; now the number has more than doubled, meaning that the community has been steadily growing. Most of the topics considered in 1994 are still around, even if the language, the formalisms and the reference applications have changed. The Web was e-mail, FTP, Usenet, a small amount of HTML pages here and there, and little more; now it is the pervasive engine of information dissemination and search. The Web is so powerful that a series of brand new ideas and applications have arisen from it, due to a mix of possibility and necessity. Social systems across the Web, mobility, and heterogeneity were not conceivable in the early 1990s. Semantic web, data mining and warehousing, streaming techniques, and large-scale integration are necessary to deal with the growing amount of data and information.

The SEBD 2010 program reflects the current interests of Italian database researchers and covers most of the topics considered by the international research community. Sixty papers were submitted to SEBD 2010, of which twenty-two were research papers, two were software demonstrations, and thirty-four were extended abstracts, i.e., papers containing descriptions of on-going projects or presenting results already published. Fifty-one papers were accepted for presentation, of which seventeen were research papers, two were software demonstrations, and thirty-two were extended abstracts.

Besides paper presentations, the program includes a tutorial by Divesh Srivastava (AT&T Labs-Research) and two invited talks, the first by Hector Garcia-Molina (Stanford University, CA) and the second by Amr El Abbadi (University of California, CA).

We would like to thank all the authors who submitted papers and all symposium participants. We are grateful to the members of the Program Committee and the external referees for their thorough work in reviewing submissions with expertise and patience, and to the members of the SEBD Steering Committee for their support in the organization of SEBD 2010. Special thanks are due to the members of the Organizing Committee and to the University of Bologna, Polo di Rimini, which made this event possible. Finally, we gratefully thank all cooperating institutions.

Rimini, June 2010
Sonia Bergamaschi, Stefano Lodi, Riccardo Martoglia, Claudio Sartori

Sonia Bergamaschi; Laura Po; Serena Sorrentino; Alberto Corni ( 2010 ) - Uncertainty in Data Integration Systems: Automatic Generation of Probabilistic Relationships ( - Management of the Interconnected World ) (Springer Berlin Heidelberg DEU ) - pp. da 221 a 228 ISBN: 9783790824032 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

This paper proposes a method for the automatic discovery of probabilistic relationships in the environment of data integration systems. Dynamic data integration systems extend the architecture of current data integration systems by modeling uncertainty at their core. Our method is based on probabilistic word sense disambiguation (PWSD), which automatically performs lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) of the schemata of a given set of data sources to be integrated. From the annotated schemata and the relationships defined in the thesaurus, we derive the probabilistic lexical relationships among schema elements. Lexical relationships, as well as structural relationships, are collected in the Probabilistic Common Thesaurus (PCT).

Francesco Guerra; Yannis Velegrakis; Sonia Bergamaschi ( 2010 ) - 1st International Workshop on Data Engineering meets the Semantic Web (DESWeb 2010) [Esposizione (290) - Esposizione]
Abstract

Modern web applications like Wikis, social networking sites and mashups are radically changing the nature of the modern Web from a publishing-only environment into a vibrant place for information exchange. The successful exploitation of this information largely depends on the ability to successfully communicate the data semantics, which is exactly the vision of the Semantic Web. In this context, new challenges emerge for semantic-aware data management systems.

The contribution of the data management community to the Semantic Web effort is fundamental. RDF has already been adopted as the representation model and exchange format for the semantics of the data on the Web. Although until recently RDF had not received considerable attention, the recent publication in RDF format of large ontologies with millions of entities from sites like Yahoo! and Wikipedia, the huge amounts of microformats in RDF from life science organizations, and the gigantic RDF bibliographic annotations from publishers have made clear the need for advanced management techniques for RDF data.

On the other hand, traditional data management techniques have a lot to gain by incorporating semantic information into their frameworks. Existing data integration, exchange and query solutions are typically based on the actual data values stored in the repositories, and not on the semantics of these values. Incorporation of semantics in the data management process improves query accuracy, and permits more efficient and effective sharing and distribution services. Integration of new content, on-the-fly generation of mappings, queries on loosely structured data, keyword searching on structured data repositories, and entity identification are some of the areas that can benefit from the presence of semantic knowledge alongside the data.

The goal of DESWeb is to bring together researchers and practitioners from both fields of Data Management and Semantic Web. It aims at investigating the new challenges that Semantic Web technologies have introduced and new ways through which these technologies can improve existing data management solutions. Furthermore, it intends to study what data management systems and technologies can offer in order to improve the scalability and performance of Semantic Web applications.

SALA ANTONIO; S. BERGAMASCHI ( 2009 ) - A Mediator Based Approach to Ontology Generation and Querying of Molecular and Phenotypic Cereals Data ([Olney] : Inderscience Enterprises, 2006-. ) - INTERNATIONAL JOURNAL OF METADATA, SEMANTICS AND ONTOLOGIES - n. volume 4 - pp. da 85 a 92 ISSN: 1744-2621 [Articolo in rivista (262) - Articolo su rivista]
Abstract

We describe the development of the CEREALAB ontology, an ontology of molecular and phenotypic cereals data that allows identifying the correlation between the phenotype of a plant and its molecular data. It is realised by integrating public web databases with the database developed by the research group of the CEREALAB laboratory. Integration is obtained semi-automatically by using the Mediator envirOnment for Multiple Information Sources (MOMIS) system, a data integration system developed by the Database Group of the University of Modena and Reggio Emilia, and allows querying the integrated data sources regardless of the specific languages of the source databases.

Sonia Bergamaschi; Laura Po; Serena Sorrentino; Alberto Corni ( 2009 ) - ALA: Dealing with Uncertainty in Lexical Annotation [Software (296) - Software]
Abstract

We present ALA, a tool for the automatic lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) of structured and semi-structured data sources and for the discovery of probabilistic lexical relationships in a data integration environment. ALA performs automatic lexical annotation through the use of probabilistic annotations, i.e. each annotation is associated with a probability value. By performing probabilistic lexical annotation, we discover probabilistic inter-source lexical relationships among schema elements. ALA extends the lexical annotation module of the MOMIS data integration system. However, it may be applied more generally in the context of schema mapping discovery, ontology merging and data integration systems, and it is particularly suitable for performing “on-the-fly” data integration or probabilistic ontology matching.

S. Bergamaschi; F. Guerra; M. Orsini; C. Sartori; M. Vincini ( 2009 ) - An ETL tool based on semantic analysis of schemata and instances ( International Conference on Knowledge-based and Intelligent Information & Engineering Systems (KES 2009) - Santiago, Chile - September 28-30, 2009) ( - Knowledge-Based and Intelligent Information and Engineering Systems ) (Springer Heidelberg DEU ) - n. volume 5712 - pp. da 58 a 65 ISBN: 9783642045912 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we propose a system supporting the semi-automatic definition of inter-attribute mappings and transformation functions used as an ETL tool in a data warehouse project. The tool supports both schema-level analysis, exploited for the mapping definitions amongst the data sources and the data warehouse, and instance-level operations, exploited for defining transformation functions that integrate data coming from multiple sources in a common representation. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.

S. Bergamaschi ( 2009 ) - Cercare un ago in un grande quantitativo di dati! Un motore di ricerca basato su keyword per sorgenti dati strutturate. [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

Data integration is one of the challenges being addressed by the research communities working on databases and on artificial intelligence. A commonly used approach to integrating data sources aims to build a “mediated” schema of the sources (Global Schema) that allows the user to query multiple sources of heterogeneous structure (relational databases, documents in HTML, text, or XML format) with a single query posed over the global schema. Often this schema, while solving the heterogeneity problem of the integrated sources, consists of a large number of correlated tables and is therefore difficult for a user to read. On the other hand, traditional query languages rely on knowledge of the source schemata. In this project we aim to develop a query language, and the corresponding execution engine, that allows the user to easily formulate complex keyword-based queries over a Global Schema. The engine has advanced features: 1) use of intensional knowledge to improve searches; 2) summarization of extensional knowledge to help the user formulate a query; 3) development of a language for expressing complex queries (conditions over keywords) with the ease of the keyword-based languages of traditional search engines.

Sonia Bergamaschi; Mirko Orsini; Domenico Beneventano; Antonio Sala; Alberto Corni; Laura Po; Serena Sorrentino; QUIX Srl ( 2009 ) - DataRiver [Altro (298) - Spin Off]
Abstract

Sonia Bergamaschi; Laura Po; Serena Sorrentino; Alberto Corni ( 2009 ) - Dealing with Uncertainty in Lexical Annotation - REVISTA DE INFORMÁTICA TEÓRICA E APLICADA - n. volume 16(2) - pp. da 93 a 96 ISSN: 2175-2745 [Articolo in rivista (262) - Articolo su rivista]
Abstract

We present ALA, a tool for the automatic lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) of structured and semi-structured data sources and the discovery of probabilistic lexical relationships in a data integration environment. ALA performs automatic lexical annotation through the use of probabilistic annotations, i.e. an annotation is associated to a probability value. By performing probabilistic lexical annotation, we discover probabilistic inter-sources lexical relationships among schema elements. ALA extends the lexical annotation module of the MOMIS data integration system. However, it may be applied in general in the context of schema mapping discovery, ontology merging and data integration system and it is particularly suitable for performing “on-the-fly” data integration or probabilistic ontology matching.

Domenico Beneventano; Sonia Bergamaschi; Serena Sorrentino ( 2009 ) - Extending WordNet with compound nouns for semi-automatic annotation ( IEEE International Conference on Natural Language Processing and Knowledge Engineering (IEEE NLP-KE'09) - Dalian , China - 24-27 September 2009) ( - Proceedings of IEEE NLP-KE 2009 ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 1 a 8 ISBN: 9781424445387 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The focus of data integration systems is on producing a comprehensive global schema successfully integrating data from heterogeneous data sources (heterogeneous in format and in structure). Starting from the “meanings” associated with schema elements (i.e. class/attribute labels) and exploiting the structural knowledge of sources, it is possible to discover relationships among the elements of different schemata. Lexical annotation is the explicit inclusion of the “meaning” of a data source element according to a lexical resource. The accuracy of semi-automatic lexical annotator tools is poor on real-world schemata due to the abundance of non-dictionary compound nouns. It follows that a large set of relationships among different schemata is discovered, including a great amount of false positive relationships. In this paper we propose a new method for the annotation of non-dictionary compound nouns, which draws its inspiration from work in the natural language disambiguation area. The method extends the lexical annotation module of the MOMIS data integration system.

E. Corradini; A. Zanasi; S. Bergamaschi; F. Guerra ( 2009 ) - FRONTEX - Open Source Intelligence, Data and Text Mining for Analysts in the Border Guard Community [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

Provide elements on the new threats and on the required capabilities to react to them:
- Detection and identification
- Information management
- Risk assessment, modelling and simulation
- Situation awareness & assessment

Topics to address:
- Security: a concept in evolution
- What are the threats?
- Critical Infrastructure Protection
- Border Security
- Protection against Organized Criminality and Terrorism
- Restoring security in case of crisis
- Functions, capabilities, technologies
- From National Security to Competitive Intelligence
- Intelligence and security systems planning
- Security in Europe: European Commission funding
- Business cases with examples of utilization

Francesco Guerra; Sonia Bergamaschi; Mirko Orsini; Claudio Sartori; Maurizio Vincini ( 2009 ) - Improving Extraction and Transformation in ETL by Semantic Analysis ( European Conference on Knowledge Management - Vicenza, Italy - 3-4 September 2009) ( - Proceedings of the 10th European Conference on Knowledge Management ) (Academic Publishing Limited non disponibile GBR ) - pp. da 347 a 355 ISBN: 9781906638399 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Extraction, Transformation and Loading (ETL) processes are crucial for data warehouse consistency and are typically based on constraints and requirements expressed in natural language in the form of comments and documentation. This task is poorly supported by automatic software applications, thus making these activities a huge amount of work for data warehouse designers. In a traditional business scenario, this fact does not represent a big issue, since the sources populating a data warehouse are fixed and directly known by the data administrator. Nowadays, actual business needs require enterprise information systems to have great flexibility concerning the allowed business analyses and the treated data. Temporary alliances of enterprises, market analysis processes, and the availability of data on the Internet push enterprises to quickly integrate unexpected data sources for their activities. Therefore, the reference scenario for data warehouse systems changes radically, since the data sources populating the data warehouse may not be directly known and managed by the designers, thus creating new requirements for ETL tools related to improving the automation of the extraction and transformation process, the need to manage heterogeneous attribute values, and the ability to manage different kinds of data sources, ranging from DBMSs to flat files, XML documents and spreadsheets. In this paper we propose a semantic-driven tool that couples and extends the functionalities of two systems: the MOMIS integration system and the RELEVANT data analysis system. The tool aims at supporting the semi-automatic definition of ETL inter-attribute mappings and transformations in a data warehouse project. By means of a semantic analysis, two tasks are performed: 1) identification of the parts of the schemata of the data sources which are related to the data warehouse; 2) support for the definition of transformation rules for populating the data warehouse. We experimented with the approach in a real scenario: preliminary qualitative results show that our tool may really support the data warehouse administrator’s work, by considerably reducing the data warehouse design time.

F. GUERRA; BERGAMASCHI S; ORSINI M; SALA A; SARTORI C ( 2009 ) - Keymantic: A keyword Based Search Engine using Structural Knowledge ( International Conference on Enterprise Information Systems - Milano, Italia - 6-10 Maggio 2009) ( - International Conference on Enterprise Information Systems ) (INSTICC Setubal PRT ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Traditional techniques for query formulation require knowledge of the database contents, i.e. which data are stored in the data source and how they are represented. In this paper, we discuss the development of a keyword-based search engine for structured data sources. The idea is to couple the ease of use and flexibility of keyword-based search with metadata extracted from data schemata and extensional knowledge, which together constitute a semantic network of knowledge. By translating keywords into SQL statements, we develop a search engine that is effective, semantic-based, and applicable also when instances are not continuously available, such as in integrated data sources or in data sources extracted from the deep web.

Laura Po; Serena Sorrentino; Sonia Bergamaschi; Domenico Beneventano ( 2009 ) - Lexical Knowledge Extraction: an Effective Approach to Schema and Ontology Matching ( 10th European Conference on Knowledge Management - Università Degli Studi Di Padova, Vicenza, Italy - 3-4 Settembre 2009) ( - Proceedings of the 10th European Conference on Knowledge Management ) (Academic Publishing Limited non disponibile GBR ) - n. volume 1 - pp. da 617 a 626 ISBN: 9781906638399 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper’s aim is to examine what role Lexical Knowledge Extraction plays in data integration as well as in ontology engineering. Data integration is the problem of combining data residing at distributed heterogeneous sources and providing the user with a unified view of these data; a common and important scenario in data integration involves structured or semi-structured data sources described by a schema. Ontology engineering is a subfield of knowledge engineering that studies the methodologies for building and maintaining ontologies. Ontology engineering offers a direction towards solving the interoperability problems brought about by semantic obstacles, such as those related to the definitions of business terms and software classes. In these contexts, where users are confronted with heterogeneous information, the support of matching techniques is crucial. Matching techniques aim at finding correspondences between semantically related entities of different schemata/ontologies. Several matching techniques have been proposed in the literature, based on different approaches often derived from other fields, such as text similarity, graph comparison and machine learning. This paper proposes a matching technique based on Lexical Knowledge Extraction: first, an Automatic Lexical Annotation of schemata/ontologies is performed; then lexical relationships are extracted based on such annotations. A lexical annotation is a piece of information added to a document (book, online record, video, or other data) that refers to a semantic resource such as WordNet. Each annotation has the property of owning one or more lexical descriptions. Lexical annotation is performed by the Probabilistic Word Sense Disambiguation (PWSD) method, which combines several disambiguation algorithms. Our hypothesis is that performing lexical annotation of elements (e.g. classes and properties/attributes) of schemata/ontologies makes the system able to automatically extract the lexical knowledge that is implicit in a schema/ontology, and then to derive lexical relationships between the elements of a schema/ontology or among elements of different schemata/ontologies. The effectiveness of the method presented in this paper has been proven within the data integration system MOMIS.

Serena Sorrentino; Sonia Bergamaschi; Maciej Gawinecki; Laura Po ( 2009 ) - Schema Normalization for Improving Schema Matching ( International Conference on Conceptual Modeling (ER 2009) - Gramado, Brasile - 9-12 Novembre 2009) ( - Conceptual Modeling - ER2009 ) (Springer Heidelberg DEU ) - n. volume 5829 - pp. da 280 a 293 ISBN: 978-3-642-04839-5 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Schema matching is the problem of finding relationships among concepts across heterogeneous data sources (heterogeneous in format and in structure). Starting from the “hidden meaning” associated with schema labels (i.e. class/attribute names) it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e. annotation w.r.t. a thesaurus/lexical resource) helps in associating a “meaning” to schema labels. However, the accuracy of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and word abbreviations. In this work, we address this problem by proposing a method to perform schema label normalization which increases the number of comparable labels. Unlike other solutions, the method semi-automatically expands abbreviations and annotates compound terms with a minimal manual effort. We empirically prove that our normalization method helps in the identification of similarities among schema elements of different data sources, thus improving schema matching accuracy.

Raquel Trillo; Laura Po; Sergio Ilarri; Sonia Bergamaschi; Eduardo Mena ( 2009 ) - Semantic Access to Data from the Web ( 1st International Workshop on Interoperability through Semantic Data and Service Integration - Camogli, Genova, Italy - 25 Giugno 2009) ( - 1st International Workshop on Interoperability through Semantic Data and Service Integration ) (Domenico Beneventano, proceedings informali del workshop Camogli ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

There is a great amount of information available on the web, so users typically use keyword-based web search engines to find the information they need. However, many words are polysemous, and therefore the output of the search engine will include links to web pages referring to different meanings of the keywords. Besides, results with different meanings are mixed up, which makes the task of finding the relevant information difficult for the user, especially if the meanings behind the input keywords are not among the most popular on the web. In this paper, we propose a semantics-based approach to group the results returned to the user into clusters defined by the different meanings of the input keywords. Differently from other proposals, our method considers the knowledge provided by a pool of ontologies available on the Web in order to dynamically define the different categories (or clusters). Thus, it is independent of the sources providing the results that must be grouped.

S. Bergamaschi; F. Guerra; M. Orsini; C. Sartori; M. Vincini ( 2009 ) - Semantic Analysis for an Advanced ETL framework ( International Workshop on Interoperability through Semantic Data and Service Integration - Camogli (Genova) - June 25th, 2009) ( - Proceedings of the 1st International Workshop on Interoperability through Semantic Data and Service Integration ) (atti informali Camogli (Genova) ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we propose a system supporting the semi-automatic definition of inter-attribute mappings and transformation functions used as an ETL tool in a data warehouse project. The tool supports both schema-level analysis, exploited for the mapping definitions amongst the data sources and the data warehouse, and instance-level operations, exploited for defining transformation functions that integrate data coming from multiple sources in a common representation. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.

Sonia Bergamaschi; Serena Sorrentino ( 2009 ) - Semi-automatic compound nouns annotation for data integration systems ( Sistemi evoluti per Basi di dati (SEBD 2009) - Camogli, Genova, Italy - 21-24 Giugno 2009) ( - The 17th Italian Symposium on Advanced Database Systems ) (SENECA EDIZIONI Torino ITA ) - pp. da 221 a 228 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Lexical annotation is the explicit inclusion of the “meaning” of a data source element according to a lexical resource. The accuracy of semi-automatic lexical annotator tools is poor on real-world schemata due to the abundance of non-dictionary compound nouns. It follows that a large set of relationships among different schemata is discovered, including a great amount of false positive relationships. In this paper we propose a new method for the annotation of non-dictionary compound nouns, which draws its inspiration from work in the natural language disambiguation area. The method extends the lexical annotation module of the MOMIS data integration system.

Sonia Bergamaschi; Andrea Maurino ( 2009 ) - Toward a Unified View of Data and Services ( Web Information Systems Engineering - WISE 2009 - Poznan, Poland - October 5-7, 2009) ( - Web Information Systems Engineering - WISE 2009 ) (Springer Heidelberg DEU ) - n. volume Lecture Notes in Computer Science 5802 - pp. da 11 a 12 ISBN: 9783642044083 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The research on data integration and service discovery has involved from the beginning different (not always overlapping) communities. Therefore, data and services are described with different models, and different techniques to retrieve data and services have been developed. Nevertheless, from a user perspective, the border between data and services is often not so definite, since data and services provide a complementary vision of the available resources. In NeP4B (Networked Peers for Business), a project funded by the Italian Ministry of University and Research, we developed a semantic approach for providing a uniform representation of data and services, thus allowing users to obtain sets of data and lists of web services as query results. The NeP4B idea relies on the creation of a Peer Virtual View (PVV) representing sets of data sources and web services, i.e. an ontological representation of data sources which is mapped to an ontological representation of web services. The PVV is exploited for solving user queries: 1) data results are selected by adopting a GAV approach; 2) services are retrieved by an information retrieval approach applied to service descriptions and by exploiting the mappings on the PVV. In the tutorial, we introduce: 1) the state of the art of semantic-based data integration and web service discovery systems; 2) the NeP4B architecture.

Sonia Bergamaschi; Francesco Guerra; Federica Mandreoli; Maurizio Vincini ( 2009 ) - Working in a dynamic environment: the NeP4B approach as a MAS ( Agents and Peer-to-Peer Computing (AP2PC 2009) - Budapest, Hungary - May 11, 2009) ( - Proceedings of the eighth International Workshop on Agents and Peer-to-Peer Computing ) (Springer, Verlag Berlin DEU ) - pp. da 117 a 130 ISBN: 9783642318085 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Integration of heterogeneous information in the context of the Internet is becoming a key activity to enable a more organized and semantically meaningful access to several kinds of information in the form of data sources, multimedia documents and web services. In NeP4B (Networked Peers for Business), a project funded by the Italian Ministry of University and Research, we developed an approach for providing a uniform representation of data, multimedia and services, thus allowing users to obtain sets of data, multimedia documents and lists of web services as query results. NeP4B is based on a P2P network of semantic peers, connected with each other by means of automatically generated mappings. In this paper we present a new architecture for NeP4B, based on a Multi-Agent System. We claim that such a solution may be more efficient and effective, thanks to the agents’ autonomy and intelligence, in a dynamic environment where sources are frequently added to (or deleted from) the network.

D. BENEVENTANO; BERGAMASCHI S; CLAUDIO GENNARO; FRANCESCO GUERRA; MATTEO MORDACCHINI; ANTONIO SALA ( 2008 ) - A Mediator System for Data and Multimedia Sources ( Data Integration through Semantic Technology - Bangkok - 08 december 2008) ( - DATA INTEGRATION THROUGH SEMANTIC TECHNOLOGY ) (Data Integration through Semantic Technology - Workshop at the 3rd Asian Semantic Web Conference BANGKOK THA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Managing data and multimedia sources with a single tool is a challenging issue. In this paper, the capabilities of the MOMIS integration system and the MILOS multimedia content management system are coupled, thus providing a methodology and a tool for building and querying an integrated virtual view of data and multimedia sources.

S. Joseph; Z. Despotovic; G. Moro; S. Bergamaschi ( 2008 ) - Agents and Peer-to-Peer Computing: 5th International Workshop, AP2PC 2006, Hakodate, Japan, May 9, 2006, Revised and Invited Papers (Springer Heidelberg DEU ) - pp. da 1 a 187 ISBN: 9783540797043 [Curatela (284) - Curatela]
Abstract

Peer-to-peer (P2P) computing has attracted significant media attention, initially spurred by the popularity of file-sharing systems such as Napster, Gnutella, and Morpheus. More recently systems like BitTorrent and eDonkey have continued to sustain that attention. New techniques such as distributed hash-tables (DHTs), semantic routing, and Plaxton Meshes are being combined with traditional concepts such as Hypercubes, Trust Metrics, and caching techniques to pool together the untapped computing power at the “edges” of the Internet. These new techniques and possibilities have generated a lot of interest in many industrial organizations, and have resulted in the creation of a P2P working group on standardization in this area (http://www.irtf.org/charter?gtype=rg&group=p2prg). In P2P computing, peers and services forego central coordination and dynamically organize themselves to support knowledge sharing and collaboration, in both cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. First, the ability to ensure equitable distribution of content and services. Economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the “free riding” issue. Second, the ability to enforce provision of trusted services. Reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. The trust models must balance both constraints imposed by the environment (e.g., scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing in an attempt to achieve even higher ubiquitousness.
The possibility of services related to physical location and the relation with agents in physical proximity could introduce new opportunities and also new technical challenges. Although researchers working on distributed computing, multi-agent systems, databases, and networks have been using similar concepts for a long time, it is only fairly recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, multiagent systems have always been thought of as collections of peers. The multiagent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, dynamic growth, and other advantages of P2P also leads to significant potential problems. Most prominent among these problems are coordination: the ability of an agent to make decisions on its own actions in the context of activities of other agents; and scalability: the value of the P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so forth. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen potential application domains. These two problems are common to many large-scale applications. Without coordination, agents may be wasting their efforts, squandering resources, and failing to achieve their objectives in situations requiring collective effort. This workshop series brings together researchers working on agent systems and P2P computing with the intention of strengthening this connection.
Researchers from other related areas such as distributed systems, networks, and database systems are also welcome (and, in our opinion, have a lot to contribute). We seek high-quality and original contributions on the general theme of “Agents and P2P Computing.”

S. BERGAMASCHI; PO L; SORRENTINO S ( 2008 ) - Automatic annotation for mapping discovery in data integration systems ( Sistemi Evoluti per Basi di Dati (SEBD 2008) - Mondello, Palermo, Italy - 22-25 Giugno 2008) ( - Proceedings of the 16th Italian Symposium on Advanced Database Systems ) (Salvatore Gaglio, Ignazio Infantino, Domenico Sacca' Mondello, Palermo, Italy ITA ) - pp. da 334 a 341 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this article we present CWSD (Combined Word Sense Disambiguation), a method and a software tool for enabling automatic lexical annotation of local (structured and semi-structured) data sources in a data integration system. CWSD is based on the exploitation of WordNet Domains and of the lexical and structural knowledge of the data sources. The method extends the semi-automatic lexical annotation module of the MOMIS data integration system. The distinguishing feature of the method is its independence from, or low dependence on, human intervention. CWSD is a valid method to satisfy two important tasks: (1) the source lexical annotation process, i.e. the operation of associating an element of a lexical reference database (WordNet) to all source elements, and (2) the discovery of mappings among concepts of distributed data sources/ontologies.
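As a loose illustration of the idea behind combined disambiguation (this is not the authors' algorithm; the sense inventory, domain tags and voting scheme below are invented stand-ins for WordNet and WordNet Domains), a schema element can be disambiguated by letting the candidate domains of the other schema elements vote:

```python
# Toy sketch: combine lexical knowledge (candidate senses with domain tags)
# with structural knowledge (the other elements of the same schema).
# The inventory below is hypothetical; CWSD uses WordNet / WordNet Domains.
SENSES = {
    "mouse": [("mouse#animal", "zoology"), ("mouse#device", "computing")],
    "keyboard": [("keyboard#device", "computing"), ("keyboard#instrument", "music")],
    "screen": [("screen#display", "computing"), ("screen#partition", "furniture")],
}

def annotate(schema_elements):
    """Pick, for each element, the sense whose domain is most shared
    with the candidate domains of the other elements in the schema."""
    # Count how often each domain appears among all candidate senses.
    domain_votes = {}
    for el in schema_elements:
        for _, dom in SENSES.get(el, []):
            domain_votes[dom] = domain_votes.get(dom, 0) + 1
    annotation = {}
    for el in schema_elements:
        candidates = SENSES.get(el)
        if candidates:
            # Prefer the sense belonging to the most voted domain.
            annotation[el] = max(candidates, key=lambda s: domain_votes[s[1]])[0]
    return annotation

print(annotate(["mouse", "keyboard", "screen"]))
```

In this toy schema the shared "computing" domain wins, so "mouse" is annotated with its device sense rather than the animal sense.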

Sonia Bergamaschi; Francesco Nigro; Laura Po; Maurizio Vincini ( 2008 ) - Open Source come modello di business per le PMI: analisi critica e casi di studio ( - Open source e proprietà intellettuale. Fondamenti filosofici, tecnologie informatiche e gestione dei diritti ) (Gedit Edizioni Bologna ITA ) - pp. da 41 a 59 ISBN: 9788860270771 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Open Source software is attracting attention at all levels, both in the economic and in the manufacturing world, because it proposes a strongly innovative new model of technological and economic development that breaks with the past. This work analyses the reasons behind the success of this model and presents several cases in which Open Source proves advantageous, highlighting the most interesting aspects for both users and producers of software.

Roberto Rizzo; Rosangela Marchelli; Ilaria De Munari; Sonia Bergamaschi; Mariella Careri; Maria Elisabetta Guerzoni; Roberto Tuberosa; Giampiero Valè; Andrea Mozzarelli; Giuliano Ezio Sansebastiano; Nelson Marmiroli; Paola Vecchia; Erasmo Neviani; Nicola Pecchioni; Carlo Chezzi; Pierluigi Reschiglian; Corrado Fogher; Predieri Stefano; Sebastiano Porretta; Franca Castagnoli; Stefano Ravaglia; Andrea Demontis ( 2008 ) - SITEIA - Sicurezza Tecnologie Innovazione Agroalimentare [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

Chemical, molecular-genetic, microbiological, sensory and physico-chemical analyses, for:
• identification of contaminants, allergens and undesired newly formed compounds caused by industrial treatments and cooking methods, and quantification and monitoring of compounds of nutritional interest (assessing their relationship with consumer health)
• characterization of the quality and authenticity of food products and detection of fraud
• collection and valorization of microorganisms of industrial interest
• genotypic and phenotypic characterization of raw materials, in particular cereals and derived products, in order to optimize varietal selection and to increase resistance to pathogens in agriculture without resorting to genetic manipulation
• identification and molecular traceability of species and varieties in standing crops, raw materials, grains and finished food products
- Development of food sterilization technologies, packaging disinfection and sterilization treatments, and methodologies for the hygienic design of food machinery and plants
- Numerical simulation for optimizing the design of machines and plants
- Use of advanced electronic systems (RFID) to automate food traceability
- Development of active and intelligent packaging
- Development of innovative solutions for the recovery and valorization of by-products of the agri-food industry
- Development of databases and software for information management in agri-food enterprises

Sonia Bergamaschi; Boon-Chong Seet; Noria Foukia; Nigel Stanger; Gianluca Moro ( 2008 ) - Sixth International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2008) [Esposizione (290) - Esposizione]
Abstract

The aim of this sixth workshop is to explore the promise of P2P to offer exciting new possibilities in distributed information processing and database technologies. Nowadays network technologies allow the deployment of systems composed of a large number of devices and calculators whose complexity may lead to high management costs. Besides, the investments required to guarantee robustness and reliability may become unsustainable. Examples of this kind of system are grid networks and enterprises' or governmental server clusters spread all over the world and used for business, social or scientific purposes. Other application scenarios are emerging where the system cannot be configured by specific adjustments on its single components, for instance in sensor networks with thousands or millions of micro-devices. Viable solutions require the system to be able to self-configure, self-manage and self-recover; more generally speaking, it must be able to self-organize without human interaction, adapting its working and optimization strategies to resource usage and to overall efficiency. The P2P paradigm lends itself to constructing large-scale complex, adaptive, autonomous and heterogeneous database and information systems, endowed with clearly specified and differential capabilities to negotiate, bargain, coordinate and self-organize the information exchanges in large-scale networks. Peer-to-peer systems have concretely shown how to aggregate huge computation and information resources from small autonomous and heterogeneous calculators. Although no centralized coordination exists, these systems are able to organize themselves and can offer basic services for information discovery. The literature on peer-to-peer systems, which has grown rapidly in recent years, has highlighted the potential of this new paradigm in offering more efficient and reliable solutions for the self-organization of large distributed systems.
The realization of these promises lies fundamentally in the availability of enhanced services such as structured ways for representing, classifying, querying and registering shared information, verification and certification of information, content-distribution schemes and quality of content, security features, information discovery and accessibility, interoperation and composition of active information services, and finally market-based mechanisms to allow cooperative and non-cooperative information exchanges. The exploitation of the knowledge extracted from the peers' network is definitely a further potential of such systems. For example, the possibility of performing distributed data mining on the large amount of data that peer-to-peer systems are able to put together, also exploiting their huge parallel computing potential, may lead to the extraction of knowledge potentially useful for scientific, social and commercial purposes, depending on the network domain. Moreover, in-network data mining algorithms would supply the capability of generating and transmitting high-level models instead of raw data, significantly reducing network traffic, and the network could forecast events and return only relevant information to user requests. The use of semantics for the descriptions of peers and services could introduce new approaches for querying, sharing, distributing and organizing knowledge. Such an approach generates several challenges related to the association of services/contents to ontologies, the interoperability/integration of ontologies, the exploitation of the emergent semantics required for understanding different contents, and the automation of such processes. For example, in mobile computing the possibility of data and services related to physical location and the relation with peers and sensors in physical proximity could introduce new opportunities and also new technical challenges.
Such dynamic environments, which are inherently characterized by mobility and heterogeneity of resources like devices, participants, services, information and data representation, pose several issues on how to search and localize resources and how to efficiently route traffic, up to higher-level problems related to semantic interoperability and information relevance.

Sonia Bergamaschi; Francesco Guerra; Yannis Velegrakis ( 2008 ) - 2nd International Workshop on Semantic Web Architectures For Enterprises [Esposizione (290) - Esposizione]
Abstract

The Semantic Web vision aims at building a "web of data", where applications may share their data on the Internet and relate them to real-world objects for interoperability and exchange. Similar ideas have been applied to web services, where different modeling architectures have been proposed for adding semantics to web service descriptions, making services on the web widely available. The potential impact envisaged by these approaches on real business applications is also important in areas such as:
Semantic-based business integration: business integration allows enterprises to share their data and services with other enterprises for business purposes. Making data and services available satisfies both "structural" requirements of enterprises (e.g. the possibility of sharing data about products or about available services) and "dynamic" requirements (e.g. business-to-business partnerships to execute an order). Information systems implementing semantic web architectures can enable and strongly support this process.
Semantic interoperability: metadata and ontologies support the dynamic and flexible exchange of data and services across information systems of different organizations. Adding semantics to representations of data and services allows accurate data querying and service discovery.
Semantic-based lifecycle management: metadata, ontologies and rules are becoming an effective way of modeling corporate processes and business domains, effectively supporting the maintenance and evolution of business processes, corporate data, and knowledge.
Knowledge management: ontologies and automated reasoning tools seem to provide innovative support to the elicitation, representation and sharing of corporate knowledge.
SWAE (Semantic Web Architectures for Enterprises) aims at evaluating how, and how much, the Semantic Web vision has met its promises with respect to business and market needs.
Papers and demonstrations of interest for the workshop will show and highlight the interactions between Semantic Web technologies and business applications. The workshop aims at collecting models, tools, use cases and practical experience in which Semantic Web techniques have been developed and applied to support any relevant business processes. It aims at assessing their degree of success, the challenges that have been addressed, the solutions that have been provided and the new tools that have been implemented. Special attention will be paid to proposals of “complete architecture”, i.e. applications that can effectively support the maintenance and evolution of business processes as a whole and applications that are able to combine representations of data and services in order to realize a common business knowledge management system.

S. BERGAMASCHI; F. GUERRA; M. ORSINI; C. SARTORI ( 2007 ) - A new type of metadata for querying data integration systems ( Convegno Nazionale Sistemi di Basi di Dati Evolute - Torre Canne (Fasano, BR) - 17-20 June 2007) ( - SEBD2007 ) (Michelangelo Ceci, Donato Malerba, Letizia Tanca Bari ITA ) - pp. da 266 a 273 ISBN: 9788890298103 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Research on data integration has provided languages and systems able to guarantee an integrated intensional representation of a given set of data sources. A significant limitation common to most proposals is that only intensional knowledge is considered, with little or no consideration for extensional knowledge. In this paper we propose a technique to enrich the intension of an attribute with a new sort of metadata: the "relevant values", extracted from the attribute values. Relevant values enrich schemata with domain knowledge; moreover, they can be exploited by a user in the interactive process of creating/refining a query. The technique, fully implemented in a prototype, is automatic, independent of the attribute domain, and based on data mining clustering techniques and on the semantics emerging from data values. It is parametrized with various metrics for similarity measures and is a viable tool for dealing with frequently changing sources.

S. BERGAMASCHI; L. PO; S. SORRENTINO ( 2007 ) - Automatic annotation in data integration systems ( The 6th International Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE 2007) - Vilamoura, Algarve, Portugal - November 27-29, 2007) ( - On the Move to Meaningful Internet Systems 2007: OTM 2007 Workshops, OTM Confederated International Workshops and Posters, AWeSOMe, CAMS, OTM Academy Doctoral Consortium, MONET, OnToContent, ORM, PerSys, PPN, RDDS, SSWS, and SWWS 2007, Vilamoura, Portugal, November 25-30, 2007, Proceedings, Part I ) (Springer Heidelberg DEU ) - n. volume 4805 - pp. da 27 a 28 ISBN: 978-3-540-76887-6 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We propose a CWSD (Combined Word Sense Disambiguation) algorithm for the automatic annotation of structured and semi-structured data sources. Rather than being targeted to textual data sources like most of the traditional WSD algorithms found in the literature, our algorithm can exploit information coming from the structure of the sources together with the lexical knowledge associated with the terms (elements of the schemata).

S. BERGAMASCHI; L. PO; A. SALA; S. SORRENTINO ( 2007 ) - Automatic annotation of local data sources for data integration systems ( Workshop on Databases, Information Systems and Peer-to-Peer Computing - University of Vienna, Austria - September 24, 2007) ( - Fifth International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P) ) (non disponibile non disponibile AUT ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this article we present CWSD (Combined Word Sense Disambiguation), a method and a software tool for enabling automatic annotation of local structured and semi-structured data sources, with lexical information, in a data integration system. CWSD is based on the exploitation of WordNet Domains and structural knowledge, and on the extension of the lexical annotation module of the MOMIS data integration system. The distinguishing feature of the algorithm is its low dependence on human intervention. Our approach is a valid method to satisfy two important tasks: (1) the source annotation process, i.e. the operation of associating an element of a lexical reference database (WordNet) to all source elements, and (2) the discovery of mappings among concepts of distributed data sources/ontologies.

S. BERGAMASCHI; A. SALA ( 2007 ) - CEREALAB DATABASE: Data Integration with the MOMIS System ( The 2007 annual meeting of the Italian Bioinformatics Society BITS - Napoli, Italy - April 26 - 28, 2007) ( - The 2007 annual meeting of the Italian Bioinformatics Society (BITS) ) (non disponibile non disponibile ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Biological information is frequently spread across the Web, and retrieving knowledge in this domain often requires navigating through several websites. Data sources are usually heterogeneous and present different structures and interfaces. Mediator systems can be used to integrate such databases in order to have an integrated view of multiple information sources and to query them. The MOMIS system (Mediator envirOnment for Multiple Information Sources) is a framework developed by the Database Group of the University of Modena and Reggio Emilia (www.dbgroup.unimo.it) to perform intelligent information integration from both structured and unstructured data sources. The result of the integration process is a Global Virtual View (GVV) of the underlying sources, which is a conceptualization of the underlying domain and may therefore be thought of as an ontology describing the involved sources. Queries can be posed over the GVV regardless of the structure of the local sources, in a way that is transparent for the user. The MOMIS system has been employed for the realization of the CEREALAB database. CEREALAB is a technology-transfer research project for applying Marker Assisted Selection (MAS) techniques to cereal breeding in Italian seed companies.

S. BERGAMASCHI; A. SALA ( 2007 ) - Creating and Querying an Integrated Ontology for Molecular and Phenotypic Cereals Data ( Conference on Metadata and Semantics Research (MTSR 2007) - Corfù, Greece - October 11-12 ,2007) ( - Metadata and Semantics, Post-proceedings of the 2nd International Conference on Metadata and Semantics Research, MTSR 2007, Corfu Island in Greece, 1-2 October 2007 ) (Springer Berlino DEU ) - pp. da 445 a 454 ISBN: 9780387777443 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we describe the development of an ontology of molecular and phenotypic cereals data, realized by integrating existing public web databases with the database developed by the research group of the CEREALAB project (www.cerealab.org). This integration is obtained using the MOMIS system (Mediator envirOnment for Multiple Information Sources), a mediator-based data integration system developed by the Database Group of the University of Modena and Reggio Emilia (www.dbgroup.unimo.it). MOMIS performs information extraction and integration from both structured and semi-structured data sources in a semi-automatic way, by exploiting the knowledge in a Common Thesaurus (defined by the framework) and the descriptions of source schemas with a combination of clustering and Description Logics techniques. The result of the integration process is a Global Virtual View (GVV) of the underlying data sources, for which mapping rules and integrity constraints are specified to handle heterogeneity. Each GVV element is annotated w.r.t. the WordNet lexical database (wordnet.princeton.edu). The GVV can be queried transparently with regard to the integrated data sources using an easy-to-use graphical interface, regardless of the specific languages of the source databases.
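To give a flavour of mediator-style query answering over a global virtual view (purely a hypothetical sketch: the mapping table, source names and records below are invented, and MOMIS' actual mapping rules and query unfolding are far richer), a selection on a global attribute can be rewritten against each mapped local source and the results unioned:

```python
# Illustrative sketch of querying a global virtual view (GVV):
# a global attribute maps to different local attributes in each source,
# and a selection over the GVV is fanned out to all mapped sources.
MAPPINGS = {  # global attribute -> [(source, local attribute), ...]
    "trait_name": [("gramene", "trait"), ("cerealab", "phenotype")],
}

SOURCES = {  # toy in-memory stand-ins for the real web databases
    "gramene":  [{"trait": "plant height"}, {"trait": "grain yield"}],
    "cerealab": [{"phenotype": "grain yield"}, {"phenotype": "frost tolerance"}],
}

def query_gvv(attribute, value):
    """Answer a selection on the GVV by rewriting it per source."""
    results = []
    for source, local_attr in MAPPINGS[attribute]:
        for record in SOURCES[source]:
            if record.get(local_attr) == value:
                results.append((source, record))
    return results

print(query_gvv("trait_name", "grain yield"))
```

The caller only sees the global attribute name; the per-source attribute names and record layouts stay hidden behind the mapping, which is the transparency property the abstract describes.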

Gianluca Moro; Sonia Bergamaschi; Sam Joseph; Jean-Henry Morin; Aris M. Ouksel ( 2007 ) - Databases, Information Systems, and Peer-to-Peer Computing, International Workshops, DBISP2P 2005/2006, Trondheim, Norway, August 28-29, 2005, Seoul, Korea, September 11, 2006, Revised Selected Papers (Springer Heidelberg DEU ) - pp. da 1 a 416 ISBN: 9783540716600 [Curatela (284) - Curatela]
Abstract

Proceedings of the 2005 and 2006 editions of the workshops on Databases, Information Systems, and Peer-to-Peer Computing.

N. Pecchioni; J. Milc; A. Sala; S. Bergamaschi ( 2007 ) - dBase CEREALAB [Software (296) - Software]
Abstract

The CEREALAB database, an information system for breeders, is a source of molecular and phenotypic data, realized by integrating two already existing web databases, Gramene and GrainGenes, together with the source storing the information produced by the research groups of the CEREALAB project. The new data derives from systematic genotyping work using already known markers and some brand-new protocols developed by the discovery workpackage of the project. This integration is obtained using the MOMIS system (Mediator envirOnment for Multiple Information Sources). The result obtained is a queryable virtual view that integrates the three sources and allows the selection of cultivars of barley, wheat and rice based on molecular data and phenotypic traits, regardless of the specific languages of the three source databases. The phenotypic characters to be included in the database have been chosen among those of major interest for breeders and divided into five categories: Abiotic Stress, Biotic Stress, Growth and Development, Quality, and Yield. As far as molecular data is concerned, the major categories for the query are: Trait, QTL, Gene and Marker.

Milc J.; Albertazzi G.; Caffagni A.; Sala A.; Francia E.; Barbieri M.; Bergamaschi S.; N. Pecchioni ( 2007 ) - Development of an On-Line Database of Molecular and Phenotypic Data for Marker Assisted Selection of Cereals. ( LI Congresso della Società Italiana di Genetica Agraria - Riva del Garda (TN) - 23-26 September 2007) ( - Proceedings of the 51st Italian Society of Agricultural Genetics Annual Congress Riva del Garda, Italy – 23/26 September, 2007 ) (Società Italiana di Genetica Agraria Napoli ITA ) - n. volume - [Abstract in Atti di convegno (274) - Abstract in Atti di Convegno]
Abstract

-

S. BERGAMASCHI; F. GUERRA; M. ORSINI; C. SARTORI ( 2007 ) - Extracting Relevant Attribute Values for Improved Search - IEEE INTERNET COMPUTING - n. volume 11 (5) - pp. da 26 a 35 ISSN: 1089-7801 [Articolo in rivista (262) - Articolo su rivista]
Abstract

A new kind of metadata offers a synthesized view of an attribute's values for a user to exploit when creating or refining a search query in data-integration systems. The extraction technique that obtains these values is automatic and independent of an attribute domain but parameterized with various metrics for similarity measures. The authors describe a fully implemented prototype and some experimental results to show the effectiveness of "relevant values" when searching a knowledge base.
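Purely to illustrate the flavour of such an extraction (the paper's actual technique, similarity metrics and clustering are richer and domain-independent in a more principled way; everything below, including the threshold and sample values, is a hypothetical sketch), one can cluster an attribute's values by string similarity and keep one representative per cluster as a "relevant value":

```python
import difflib

def relevant_values(values, threshold=0.6):
    """Greedy single-pass clustering of attribute values by string
    similarity; the first value of each cluster acts as its
    representative (a "relevant value")."""
    clusters = []  # list of (representative, members) pairs
    for v in values:
        for rep, members in clusters:
            # Join the first cluster whose representative is similar enough.
            if difflib.SequenceMatcher(None, v.lower(), rep.lower()).ratio() >= threshold:
                members.append(v)
                break
        else:
            clusters.append((v, [v]))
    return [rep for rep, _ in clusters]

vals = ["manager", "sales manager", "engineer", "senior engineer", "nurse"]
print(relevant_values(vals))
```

A user refining a query over a job-title attribute would then be shown the three representatives instead of every distinct value.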

Sonia Bergamaschi; Zoran Despotovic; Sam Joseph; Gianluca Moro ( 2007 ) - Fifth International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2007) [Esposizione (290) - Esposizione]
Abstract

The aim of the workshop is to explore the promise of P2P to offer exciting new possibilities in distributed information processing and database technologies. The realization of these promises lies fundamentally in the availability of enhanced services such as structured ways for classifying and registering shared information, verification and certification of information, content distributed schemes and quality of content, security features, information discovery and accessibility, interoperation and composition of active information services, and finally market-based mechanisms to allow cooperative and non cooperative information exchanges. The P2P paradigm lends itself to constructing large scale complex, adaptive, autonomous and heterogeneous database and information systems, endowed with clearly specified and differential capabilities to negotiate, bargain, coordinate and self-organize the information exchanges in large scale networks. This vision will have a radical impact on the structure of complex organizations (business, scientific or otherwise) and on the emergence and the formation of social communities, and on how the information is organized and processed. Recently, the P2P paradigm is embracing mobile computing and ad-hoc networks in an attempt to achieve even higher ubiquitousness. The possibility of data and services related to physical location and the relation with peers and sensors in physical proximity could introduce new opportunities and also new technical challenges. Such dynamic environments, which are inherently characterized by high mobility and heterogeneity of resources like devices, participants, services, information and data representation, pose several issues on how to search and localize resources, how to efficiently route traffic, up to higher level problems related to semantic interoperability and information relevance. 
The use of ontologies for the descriptions of peers and services could introduce new approaches for querying, sharing, distributing and organizing knowledge. Nevertheless, several challenges arise related to the association of services/contents to ontologies, the interoperability/integration of ontologies required for understanding different contents, and the automation of such processes. The workshop builds on the success of the four preceding editions since VLDB 2003, whose proceedings have always been published by Springer in the Lecture Notes in Computer Science series. It concentrates on exploring the synergies between current database research and P2P computing; in fact, it is our belief that database research has much to contribute to the P2P grand challenge through its wealth of techniques for sophisticated semantics-based data models, new indexing algorithms and efficient data placement, query processing techniques and transaction processing. Database technologies in the new information age will form the crucial components of the first generation of complex adaptive P2P information systems, which will be characterized by their ability to continuously self-organize, adapt to new circumstances, promote emergence as an inherent property, optimize locally but not necessarily globally, and deal with approximation and incompleteness. This workshop also concentrates on the impact of complex adaptive information systems on current database technologies and their relation to emerging industrial technologies. The workshop's co-location with VLDB (http://www.vldb2007.org/), the major international database and information systems conference, is important in order to actually bring together key researchers from all over the world working on databases and P2P computing with the intention of strengthening this connection.
Researchers from other related areas such as distributed systems, networks, multi-agent systems and complex systems are also invited, in fact we believe that mostly in the P2P paradigm, as an interdisciplinary theme, different approaches and point of views may generate c

Sonia Bergamaschi; Paolo Bouquet; Daniel Giacomuzzi; Francesco Guerra; Laura Po; Maurizio Vincini ( 2007 ) - MELIS: a tool for the incremental annotation of domain ontologies [Software (296) - Software]
Abstract

MELIS is a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is its incrementality: the higher the number of schemas which are processed, the more background/domain knowledge is cumulated in the system (a portion of domain ontology is learned at every step), the better the performance of the system on annotating new schemas.

S. BERGAMASCHI; P. BOUQUET; D. GIACOMUZZI; F. GUERRA; L. PO; M. VINCINI ( 2007 ) - MELIS: An Incremental Method For The Lexical Annotation Of Domain Ontologies ( Web Information Systems and Technologies (WEBIST 2007) - Barcelona, Spain - March 3-6, 2007) ( - Proceedings of the Third International Conference on Web Information Systems and Technologies ) (for Systems and Technologies of Information, Control and Communication Setubal PRT ) - pp. da 240 a 247 ISBN: 978-972-8865-78-8 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is its incrementality: the higher the number of schemas which are processed, the more background/domain knowledge is cumulated in the system (a portion of domain ontology is learned at every step), the better the performance of the system on annotating new schemas. MELIS has been tested as a component of the MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database. We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results for MELIS as a standalone tool and as a component integrated in MOMIS.
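The incrementality property can be sketched as follows. This is only an illustrative toy (the lexicon, the context-based inference rule and all names are invented; MELIS' actual annotation logic is far more sophisticated): knowledge learned while annotating one schema is retained, so a later schema containing the same element is annotated directly, without new lexical evidence.

```python
# Hypothetical sketch of MELIS-style incrementality: each processed schema
# enlarges the cumulated domain knowledge used for the next schemas.
class IncrementalAnnotator:
    def __init__(self, lexicon):
        self.lexicon = lexicon      # static lexical resource (stands in for WordNet)
        self.learned = {}           # portion of domain knowledge learned so far

    def annotate_schema(self, schema):
        """schema: dict mapping each element to a list of context terms."""
        annotations = {}
        for element, context in schema.items():
            known = self.lexicon.get(element) or self.learned.get(element)
            if known is None:
                # Unknown element: inherit the annotation of a context term
                # that is already understood, and remember it for later.
                for c in context:
                    known = self.lexicon.get(c) or self.learned.get(c)
                    if known:
                        self.learned[element] = known
                        break
            if known:
                annotations[element] = known
        return annotations

ann = IncrementalAnnotator({"quantity": "quantity#amount"})
first = ann.annotate_schema({"qty": ["quantity"]})   # learned from context
second = ann.annotate_schema({"qty": []})            # resolved from cumulated knowledge
```

After the first schema, "qty" is part of the cumulated knowledge, so the second schema is annotated even though it offers no context at all.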

S. Bergamaschi; P. Bouquet; D. Giacomuzzi; F. Guerra; L. Po; M. Vincini ( 2007 ) - Melis: an incremental method for the lexical annotation of domain ontologies - INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS - n. volume 3 - pp. da 57 a 80 ISSN: 1552-6283 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is its incremental process: the higher the number of schemas which are processed, the more background/domain knowledge is cumulated in the system (a portion of domain ontology is learned at every step), the better the performance of the system on annotating new schemas. MELIS has been tested as a component of the MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database. We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results for MELIS as a standalone tool and as a component integrated in MOMIS.

D. Beneventano; S. Bergamaschi; F. Guerra; M. Vincini ( 2007 ) - Progetto di Basi di Dati Relazionali (Pitagora Bologna ITA ) - pp. da 1 a 345 ISBN: 9788837116804 [Monografia o trattato scientifico (276) - Monografia/Trattato scientifico]
Abstract

The aim of this volume is to provide the reader with the fundamental notions for designing and implementing relational database applications. Regarding design, the conceptual and logical design phases are covered, and the Entity-Relationship and Relational data models, which constitute the basic tools for conceptual and logical design respectively, are presented. The student is also introduced to the theory of normalization of relational databases. Regarding implementation, elements and examples of SQL, the standard language for RDBMS (Relational Database Management Systems), are presented. Ample space is devoted to worked exercises on the topics covered. The volume stems from the authors' long teaching experience in the Databases and Information Systems courses for students of the bachelor's and master's degree programmes of the Faculty of Engineering of Modena, the Faculty of Engineering of Reggio Emilia and the "Marco Biagi" Faculty of Economics of the University of Modena and Reggio Emilia. The current volume considerably extends the previous editions, enriching the logical design and SQL sections. The exercises section is completely new; furthermore, additional exercises are available on this web page. Like the previous editions, it is more a collection of lecture notes than a proper book, in the sense that it treats the concepts covered rigorously but essentially. Moreover, it does not exhaust all the topics of a Databases course, whose other fundamental component is database technology. This component is, in the authors' opinion, covered excellently by another Databases textbook, written by our colleagues and friends Paolo Ciaccia and Dario Maio of the University of Bologna.
The volume, despite its essential character, is rich in worked exercises and can therefore be an excellent tool for working groups that, within software houses, deal with the design of relational database applications.

BENEVENTANO D; VINCINI M; ORSINI M; S. BERGAMASCHI; NANA C ( 2007 ) - Query Translation on heterogeneous sources in MOMIS Data Transformation Systems ( International Workshop on Database Interoperability (InterDB) - Vienna, Austria - 24 September 2007) ( - International Workshop on Database Interoperability (InterDB) ) (Christoph Koch and Johannes Gehrke and Minos N. Garofalakis and Divesh Srivastava and Karl Aberer and Anand Deshpande and Daniela Florescu and Chee Yong Chan and Venkatesh Ganti and Carl-Christian Kanne and Wolfgang Klas and Erich J. Neuhold Vienna AUT ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

S. BERGAMASCHI; GUERRA F; ORSINI M.; SARTORI C; VINCINI M ( 2007 ) - Relevant News: a semantic news feed aggregator ( Semantic Web Applications and Perspectives - Bari - 18 - 20 Dicembre 2007) ( - Semantic Web Applications and Perspectives - Proceedings of the 4th Italian Semantic Web Workshop ) (Giovanni Semeraro, Eugenio Di Sciascio, Christian Morbidoni, Heiko Stoemer BARI ITA ) - n. volume 314 - pp. da 150 a 159 ISSN: 1613-0073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we present RELEVANTNews, a web feed reader that automatically groups news related to the same topic published in different newspapers on different days. The tool is based on RELEVANT, a previously developed tool which computes the "relevant values", i.e. a subset of the values of a string attribute. By clustering the titles of the news feeds selected by the user, it is possible to identify sets of related news on the basis of syntactic and lexical similarity. RELEVANTNews may be used in its default configuration or in a personalized way: the user may tune some parameters in order to improve the grouping results. We tested the tool with more than 700 news items published in 30 newspapers over four days, and some preliminary results are discussed.
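
The grouping idea can be sketched in a few lines: a greedy pass that puts each title into the first group whose representative it resembles. This is only an illustrative approximation — the similarity measure, threshold, and sample headlines below are ours, not RELEVANTNews':

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.6):
    # Character-based similarity ratio as a stand-in for the paper's
    # syntactic/lexical measures (illustrative choice).
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def group_titles(titles, threshold=0.6):
    """Greedy single-pass grouping: each title joins the first group
    whose representative (its first member) it resembles."""
    groups = []
    for title in titles:
        for group in groups:
            if similar(title, group[0], threshold):
                group.append(title)
                break
        else:
            groups.append([title])
    return groups

news = [
    "Stock markets rally after rate cut",
    "Markets rally following surprise rate cut",
    "Local team wins championship final",
]
groups = group_titles(news)  # the two market headlines form one group
```

A threshold around 0.6 merges near-duplicate headlines while keeping unrelated ones apart; in the actual tool, this kind of parameter is what the user can tune to improve the grouping results.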

Sonia Bergamaschi; Claudio Sartori; Francesco Guerra; Mirko Orsini ( 2007 ) - RELEvant VAlues geNeraTor [Software (296) - Software]
Abstract

A new kind of metadata offers a synthesized view of an attribute's values for a user to exploit when creating or refining a search query in data-integration systems. The extraction technique that obtains these values is automatic and independent of an attribute domain but parameterized with various metrics for similarity measures.

S. BERGAMASCHI; F. GUERRA; M. ORSINI; C. SARTORI ( 2007 ) - Relevant values: new metadata to provide insight on attribute values at schema level ( International Conference on Enterprise Information Systems - Funchal, Madeira - 12-16, June 2007) ( - Proceedings of the 9th International Conference on Enterprise Information Systems ) (INSTICC - Institute for Systems and Technologies of Information, Control and Communication Lisbona PRT ) - pp. da 274 a 279 ISBN: 9789728865887 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Research on data integration has provided languages and systems able to guarantee an integrated intensional representation of a given set of data sources. A significant limitation common to most proposals is that only intensional knowledge is considered, with little or no consideration for extensional knowledge. In this paper we propose a technique to enrich the intension of an attribute with a new sort of metadata: the "relevant values", extracted from the attribute's values. Relevant values enrich schemata with domain knowledge; moreover, they can be exploited by a user in the interactive process of creating/refining a query. The technique, fully implemented in a prototype, is automatic, independent of the attribute domain, and based on data mining clustering techniques and the semantics emerging from data values. It is parameterized with various metrics for similarity measures and is a viable tool for dealing with frequently changing sources, as in the Semantic Web context.
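
As a rough illustration of the idea (not the authors' algorithm — the clustering rule, similarity metric and sample data below are ours), relevant values can be thought of as one representative per cluster of similar attribute values:

```python
from collections import Counter
from difflib import SequenceMatcher

def relevant_values(values, threshold=0.7):
    """Cluster an attribute's string values by similarity and return one
    representative ("relevant value") per cluster."""
    clusters = []
    for value in values:
        for cluster in clusters:
            # Compare against the cluster's first member (greedy choice).
            if SequenceMatcher(None, value.lower(),
                               cluster[0].lower()).ratio() >= threshold:
                cluster.append(value)
                break
        else:
            clusters.append([value])
    # Pick the most frequent (ties broken by shortest) member per cluster.
    reps = []
    for cluster in clusters:
        counts = Counter(cluster)
        reps.append(min(cluster, key=lambda m: (-counts[m], len(m))))
    return reps

jobs = ["software engineer", "sw engineer", "software engineer",
        "accountant", "senior accountant"]
relevant = relevant_values(jobs)  # ['software engineer', 'accountant']
```

The point of the metadata is the compression: five raw strings collapse to two values a user can browse when refining a query, independently of the attribute's domain.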

D. BENEVENTANO; S. BERGAMASCHI ( 2007 ) - Semantic search engines based on data integration systems ( - In Semantic Web: Theory, Tools and Applications ) (Information Science Reference Hershey USA ) - pp. da 317 a 342 ISBN: 9781599040455 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

As the use of the World Wide Web has become increasingly widespread, the business of commercial search engines has become a vital and lucrative part of the Web. Search engines are commonplace tools for virtually every user of the Internet, and companies such as Google and Yahoo! have become household names. Semantic Search Engines try to augment and improve traditional Web Search Engines by using not just words, but concepts and logical relationships. In this chapter a relevant class of Semantic Search Engines, based on a peer-to-peer, data integration mediator-based architecture, is described. The architectural and functional features are presented with respect to two projects involving the authors, SEWASIE and WISDOM. The methodology to create a two-level ontology and query processing in the SEWASIE project are fully described.

Sonia Bergamaschi; Zoran Despotovic; Sam Joseph; Gianluca Moro ( 2007 ) - Sixth International Workshop on Agents and Peer-to-Peer Computing (AP2PC 2007) [Esposizione (290) - Esposizione]
Abstract

Peer-to-peer (P2P) computing has attracted enormous media attention, initially spurred by the popularity of file sharing systems such as Napster, Gnutella, and Morpheus. More recently, systems like BitTorrent and eDonkey have continued to sustain that attention. New techniques such as distributed hash tables (DHTs), semantic routing, and Plaxton Meshes are being combined with traditional concepts such as Hypercubes, Trust Metrics and caching techniques to pool together the untapped computing resources at the "edges" of the internet. These new techniques and possibilities have generated a lot of interest in many industrial organizations, and have resulted in the creation of a P2P working group on standardization in this area (http://www.irtf.org/charter?gtype=rg&group=p2prg). In P2P computing, peers and services forego central coordination and dynamically organise themselves to support knowledge sharing and collaboration, in both cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. Firstly, the ability to ensure an equitable distribution of content and services. Economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the "free riding" issue. Secondly, the ability to enforce the provision of trusted services. Reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. The trust models must balance both constraints imposed by the environment (e.g. scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing in an attempt to achieve even higher ubiquity.
The possibility of services related to physical location, and the relation with agents in physical proximity, could introduce new opportunities and also new technical challenges. Although researchers working on distributed computing, MultiAgent Systems, databases and networks have been using similar concepts for a long time, it is only fairly recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, MultiAgent Systems have always been thought of as collections of peers. The MultiAgent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, dynamic growth and other advantages of P2P also leads to significant potential problems. Most prominent among these problems are coordination: the ability of an agent to make decisions on its own actions in the context of the activities of other agents; and scalability: the value of P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so forth. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen potential application domains. These two problems are common to many large-scale applications. Without coordination, agents may waste their efforts, squander resources and fail to achieve their objectives in situations requiring collective effort. This workshop will bring together researchers working on agent systems and P2P computing with the intention of strengthening this connection.
The increasing interest in this research area is evident in that the four previous editions of AP2PC have been among the most popular AAMAS workshops in terms of participation. Research in Agents and Peer-to-Peer is by its nature interdisciplinary and offers a challenge

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2007 ) - The SEWASIE MAS for Semantic Search ( First International Workshop on Agent supported Cooperative Work (ACW 2007) - Lyon, France - 29 October) ( - Proceedings of the Second IEEE International Conference on Digital Information Management ) (IEEE Engineering Management Society Los Alamitos, California USA ) - n. volume 2 - pp. da 793 a 798 ISBN: 9781424414765 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The capillary diffusion of the Internet has made accessible an overwhelming amount of data, allowing users to benefit from vast information. However, information is not really directly available: internet data are heterogeneous and spread over different places, with several duplications and inconsistencies. The integration of such heterogeneous, inconsistent data, with data reconciliation and data fusion techniques, may therefore represent a key activity enabling a more organized and semantically meaningful access to data sources. Some issues remain to be solved, concerning in particular the discovery and the explicit specification of the relationships between abstract data concepts, and the need for data reliability in dynamic, constantly changing networks. Ontologies provide a key mechanism for solving these challenges, but the web's dynamic nature leaves open the question of how to manage them. Many solutions based on ontology creation by a mediator system have been proposed: a unified virtual view (the ontology) of the underlying data sources is obtained, giving users transparent access to the integrated data sources. The centralized architecture of a mediator system presents several limitations, emphasized in the hidden web: firstly, web data sources hold information according to their particular view of the matter, i.e. each of them uses a specific ontology to represent its data. Also, data sources are usually isolated, i.e. they do not share any topological information concerning the content or structure of other sources. Our proposal is to develop a network of ontology-based mediator systems, where mediators are not isolated from each other and include tools for sharing and mapping their ontologies. In this paper, we describe the use of a multi-agent architecture to achieve and manage the mediator network.
The functional architecture is composed of single peers (implemented as mediator agents) independently carrying out their own integration activities. Such agents may then exchange data and knowledge with other peers by means of specialized agents (called brokering agents) which provide a coherent access plan to the peer network. In this way, two layers are defined in the architecture: at the local level, peers maintain an integrated view of local sources; at the network level, agents maintain mappings among the different peers. The result is the definition of a new type of mediator system network intended to operate in web economies, which we realized within SEWASIE (SEmantic Webs and AgentS in Integrated Economies), an RDT project supported by the 5th Framework IST program of the European Community, successfully ended in September 2005.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2007 ) - The SEWASIE Network of Mediator Agents for Semantic Search - JOURNAL OF UNIVERSAL COMPUTER SCIENCE - n. volume 13 (12) - pp. da 1936 a 1969 ISSN: 0948-695X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Integration of heterogeneous information in the context of the Internet becomes a key activity to enable a more organized and semantically meaningful access to data sources. As the Internet can be viewed as a data-sharing network where sites are data sources, the challenge is twofold. Firstly, sources present information according to their particular view of the matter, i.e. each of them assumes a specific ontology. Secondly, data sources are usually isolated, i.e. they do not share any topological information concerning the content or the structure of other sources. The classical approach to solve these issues is provided by mediator systems, which aim at creating a unified virtual view of the underlying data sources in order to hide the heterogeneity of data and give users transparent access to the integrated information. In this paper we propose to use a multi-agent architecture to build and manage a mediator network. While a single peer (i.e. a mediator agent) independently carries out data integration activities, it exchanges knowledge with other peers by means of specialized agents (i.e. brokers) which provide a coherent plan to access information in the peer network. This defines two layers in the system: at the local level, peers maintain an integrated view of local sources, while at the network level agents maintain mappings among the different peers. The result is the definition of a new networked mediator system intended to operate in web economies, which we realized in the SEWASIE (SEmantic Webs and AgentS in Integrated Economies) project. SEWASIE is an RDT project supported by the 5th Framework IST program of the European Community, successfully ended in September 2005.

Sonia Bergamaschi; Paolo Bouquet; Francesco Guerra ( 2007 ) - 1st International Workshop on Semantic Web Architectures For Enterprises [Esposizione (290) - Esposizione]
Abstract

SWAE aims at evaluating how and how much the Semantic Web vision has met its promises with respect to business and market needs. Even though the Semantic Web is a relatively new branch of scientific and technological research, its relevance has already been envisaged for some crucial business processes: Semantic-based business data integration: data integration satisfies both "structural" requirements of enterprises (e.g. the possibility of consulting its data in a unified manner), and "dynamic" requirement (e.g. business-to-business partnerships to execute an order). Information systems implementing semantic web architectures can strongly support this process, or simply enable it. Semantic interoperability: metadata and ontologies support the dynamic and flexible exchange of data and services across information systems of different organizations. The development of applications for the automatic classification of services and goods on the basis of standard hierarchies, and the translation of such classifications into the different standards used by companies is a clear example of the potential for semantic interoperability methods and tools. Knowledge management: ontologies and automated reasoning tools seem to provide an innovative support to the elicitation, representation and sharing of corporate knowledge. In particular, for the shift from document-centric KM to an entity-centric KM approach. Enterprise and process modeling: ontologies and rules are becoming an effective way for modeling corporate processes and business domains (for example, in cost reduction). The goal of the workshop is to evaluate and assess how deep the permeation of Semantic Web models, languages, technologies and applications has been in effective enterprise business applications. 
It will also identify how Semantic Web based systems, methods and theories sustain business applications such as decision processes, workflow management processes, accountability, and production chain management. Particular attention will be dedicated to metrics and criteria that evaluate the cost-effectiveness of system design processes, knowledge encoding and management, system maintenance, etc.

S. BERGAMASCHI; P. BOUQUET; D. GIACOMUZZI; F. GUERRA; L. PO; M. VINCINI ( 2006 ) - An incremental method for meaning elicitation of a domain ontology ( Semantic Web Applications and Perspectives (SWAP 2006) - Scuola Normale Superiore, PISA - 18-20 December, 2006) ( - Proceedings of the 3rd Italian Semantic Web Workshop ) (CEUR-WS.org ) - n. volume 201 - pp. da 1 a 8 ISSN: 1613-0073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Internet has opened access to an overwhelming amount of data, requiring the development of new applications to automatically recognize, process and manage information available in web sites or web-based applications. The standard Semantic Web architecture exploits ontologies to give a shared (and known) meaning to each web source element. In this context, we developed MELIS (Meaning Elicitation and Lexical Integration System). MELIS couples the lexical annotation module of the MOMIS system with some components from CTXMATCH2.0, a tool for eliciting meaning from several types of schemas and matching them. MELIS uses the MOMIS WNEditor and CTXMATCH2.0 to support two main tasks in the MOMIS ontology generation methodology: the source annotation process, i.e. the operation of associating an element of a lexical database to each source element, and the extraction of lexical relationships among elements of different data sources.

S. Bergamaschi; G. Gelati; F. Guerra; M. Vincini ( 2006 ) - An intelligent data integration approach for collaborative project management in virtual enterprises - WORLD WIDE WEB - n. volume 9(1) - pp. da 35 a 61 ISSN: 1386-145X [Articolo in rivista (262) - Articolo su rivista]
Abstract

The increasing globalization and flexibility required of companies have generated, in the last decade, new issues related to the management of large-scale projects and to the cooperation of enterprises within geographically distributed networks. ICT support systems are required to help enterprises share information, guarantee data consistency and establish synchronized and collaborative processes. In this paper we present a collaborative project management system that integrates data coming from aerospace industries with one main goal: to facilitate the assembly, integration and verification activities of a multi-enterprise project. The main achievement of the system, from a data management perspective, is that it avoids the inconsistencies generated by updates at the sources' level and minimizes data replication. The developed system is composed of a collaborative project management component supported by a web interface, a multi-agent data integration system, which supports information sharing and querying, and web services that ensure the interoperability of the software components. The system was developed by the University of Modena and Reggio Emilia and Gruppo Formula S.p.A., and tested by Alenia Spazio S.p.A. within the EU WINK Project (Web-linked Integration of Network based Knowledge, IST-2000-28221).

Sonia Bergamaschi; Zoran Despotovic; Sam Joseph; Gianluca Moro ( 2006 ) - Fifth International Workshop on Agents and Peer-to-Peer Computing (AP2PC 2006) [Esposizione (290) - Esposizione]
Abstract

Peer-to-peer (P2P) computing has attracted enormous media attention, initially spurred by the popularity of file sharing systems such as Napster, Gnutella, and Morpheus. More recently, systems like BitTorrent and eDonkey have continued to sustain that attention. New techniques such as distributed hash tables (DHTs), semantic routing, and Plaxton Meshes are being combined with traditional concepts such as Hypercubes, Trust Metrics and caching techniques to pool together the untapped computing resources at the "edges" of the internet. These new techniques and possibilities have generated a lot of interest in many industrial organizations, and have resulted in the creation of a P2P working group on standardization in this area (http://www.irtf.org/charter?gtype=rg&group=p2prg). In P2P computing, peers and services forego central coordination and dynamically organise themselves to support knowledge sharing and collaboration, in both cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. Firstly, the ability to ensure an equitable distribution of content and services. Economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the "free riding" issue. Secondly, the ability to enforce the provision of trusted services. Reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. The trust models must balance both constraints imposed by the environment (e.g. scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing in an attempt to achieve even higher ubiquity.
The possibility of services related to physical location, and the relation with agents in physical proximity, could introduce new opportunities and also new technical challenges. Although researchers working on distributed computing, MultiAgent Systems, databases and networks have been using similar concepts for a long time, it is only fairly recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. Research in agent systems in particular appears to be most relevant because, since their inception, MultiAgent Systems have always been thought of as collections of peers. The MultiAgent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, dynamic growth and other advantages of P2P also leads to significant potential problems. Most prominent among these problems are coordination: the ability of an agent to make decisions on its own actions in the context of the activities of other agents; and scalability: the value of P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so forth. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen potential application domains. These two problems are common to many large-scale applications. Without coordination, agents may waste their efforts, squander resources and fail to achieve their objectives in situations requiring collective effort. This workshop will bring together researchers working on agent systems and P2P computing with the intention of strengthening this connection.
The increasing interest in this research area is evident in that the four previous editions of AP2PC have been among the most popular AAMAS workshops in terms of participation. Research in Agents and Peer-to-Peer is by its nature interdisciplinary and offers a challenge

Sonia Bergamaschi; Sam Joseph; Jean-Henry Morin; Gianluca Moro ( 2006 ) - Fourth International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2006) [Esposizione (290) - Esposizione]
Abstract

The aim of this fourth workshop is to explore the promise of P2P to offer exciting new possibilities in distributed information processing and database technologies. The realization of these promises lies fundamentally in the availability of enhanced services such as structured ways for classifying and registering shared information, verification and certification of information, content distribution schemes and quality of content, security features, information discovery and accessibility, interoperation and composition of active information services, and finally market-based mechanisms to allow cooperative and non-cooperative information exchanges. The P2P paradigm lends itself to constructing large-scale complex, adaptive, autonomous and heterogeneous database and information systems, endowed with clearly specified and differential capabilities to negotiate, bargain, coordinate and self-organize the information exchanges in large-scale networks. This vision will have a radical impact on the structure of complex organizations (business, scientific or otherwise), on the emergence and formation of social communities, and on how information is organized and processed. Recently, the P2P paradigm has been embracing mobile computing and ad-hoc networks in an attempt to achieve even higher ubiquity. The possibility of data and services related to physical location, and the relation with peers and sensors in physical proximity, could introduce new opportunities and also new technical challenges. Such dynamic environments, which are inherently characterized by high mobility and heterogeneity of resources like devices, participants, services, information and data representation, pose several issues on how to search for and localize resources and how to efficiently route traffic, up to higher-level problems related to semantic interoperability and information relevance.
The use of ontologies for the description of peers and services could introduce new approaches for querying, sharing, distributing and organizing knowledge. Nevertheless, several challenges arise, related to the association of services/contents with ontologies, the interoperability/integration of the ontologies required for understanding different contents, and the automation of such processes. A sample application scenario is the offer of new services for business trades on the basis of client requirements, both established by means of (different) ontologies. On the basis of the physical location, the client ontology contacts other ontologies, executing automatic integration/interoperation/reconciliation processes where information is expressed according to different ontologies. Analogous issues and similar scenarios may be depicted for static and wireless connectivity, and static and mobile architectures. The proposed workshop will build on the success of the three preceding editions at VLDB 2003, 2004 and 2005. It will concentrate on exploring the synergies between current database research and P2P computing. It is our belief that database research has much to contribute to the P2P grand challenge through its wealth of techniques for sophisticated semantics-based data models, new indexing algorithms and efficient data placement, query processing techniques and transaction processing. Database technologies in the new information age will form the crucial components of the first generation of complex adaptive P2P information systems, which will be characterized by their ability to continuously self-organize, adapt to new circumstances, promote emergence as an inherent property, optimize locally but not necessarily globally, and deal with approximation and incompleteness.
This workshop will also concentrate on the impact of complex adaptive information systems on current database technologies and their relation to emerging industrial technologies such as IBM's autonomic computing initiative. The workshop will be co-located with VLDB, the major international database

Domenico Beneventano; Sonia Bergamaschi; Stefania Bruschi; Francesco Guerra; Mirko Orsini; Maurizio Vincini ( 2006 ) - Instances Navigation for Querying Integrated Data from Web-Sites ( - Web Information Systems and Technologies, International Conferences, WEBIST 2005 and WEBIST 2006. Revised Selected Papers ) (Springer Heidelberg DEU ) - pp. da 125 a 137 ISBN: 9783540323013; 9783540740629 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Research on data integration has provided a set of rich and well-understood schema mediation languages and systems that provide a meta-data representation of the modeled real world, while, in general, they do not deal with data instances. Such meta-data are necessary for querying the classes resulting from an integration process: the end user typically does not know the contents of such classes, and simply defines queries on the basis of the names of classes and attributes. In this paper we introduce an approach that enriches the description of selected attributes by specifying as meta-data a list of the "relevant values" for such attributes. Furthermore, relevant values may be hierarchically collected in a taxonomy. In this way, the user may exploit the new meta-data in the interactive process of creating/refining a query. The same meta-data are also exploited by the system in the query rewriting/unfolding process in order to filter the results shown to the user. We conducted an evaluation of the strategy in an e-business context within the EU-IST SEWASIE project. The evaluation proved the practicability of the approach for large value instances.

D. BENEVENTANO; S. BERGAMASCHI; S. BRUSCHI; F. GUERRA; M. ORSINI; M. VINCINI ( 2006 ) - Instances navigation for querying integrated data from web-sites ( International Conference on Web Information Systems - Setubal, Portugal - April 11-13, 2006) ( - International Conference on Web Information Systems and Technologies ) (INSTICC Setubal PRT ) - pp. da 46 a 53 ISBN: 9789728865467 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Research on data integration has provided a set of rich and well understood schema mediation languages and systems that provide a meta-data representation of the modeled real world, while, in general, they do not deal with data instances. Such meta-data are necessary for querying the classes resulting from an integration process: the end user typically does not know the contents of such classes, and simply defines his queries on the basis of the names of classes and attributes. In this paper we introduce an approach that enriches the description of selected attributes by specifying, as meta-data, a list of the “relevant values” for such attributes. Furthermore, relevant values may be hierarchically collected in a taxonomy. In this way, the user may exploit the new meta-data in the interactive process of creating/refining a query. The same meta-data are also exploited by the system in the query rewriting/unfolding process in order to filter the results shown to the user. We conducted an evaluation of the strategy in an e-business context within the EU-IST SEWASIE project. The evaluation proved the practicability of the approach for large value instances.

Sonia Bergamaschi; Fausto Rabitti; Maurizio Brioschi; Carlo Batini ( 2006 ) - Networked Peers for Business [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

The specific economic and organizational objectives are: - To model the economic and organizational relationships between ICT, business processes and performance, with a focus on networked SMEs in traditional manufacturing sectors. - To analyze the collaboration problems that emerge from the use of ICT in people-centred business processes within multi-party operations management systems, with the aim of transforming supply networks. Tools and services supporting collaboration already exist, but the full exploitation of cooperative working environments can be achieved only with a deep understanding of the economic-organizational attributes of ICT-supported processes. The results will give a clearer view of how business processes could improve productivity by providing continuous access, anytime and anywhere, to a variety of resources (people, knowledge, services, devices), in a trusted environment where dynamic virtual organizations can be created across company boundaries, integrating easily with the administrative, business and design processes of the enterprises. - To develop a service for NeP4B clients, called the "value chain model", which helps companies select the services in P2P systems that maximize the value delivered to business processes, based on the correlation between economic-organizational models, ICT and the value chain, and on the opinions of trusted partners. - To develop a trust model consistent with a scenario in which different actors can meet in a virtual environment and must decide whether a participant is trustworthy. The main feature of the trust model is to derive trust relationships in an environment where no "super partes" partner exists. Trust relationships can be built using reputation techniques, for example using the opinions of other peers based on previous experiences. 
The objective is to design and develop an architecture that allows peers to express their opinions on other peers (and on the resources/services they offer), and to distribute and collect this information. The opinion distribution mechanism must be designed taking into account security aspects, such as the authenticity/integrity of opinions, and privacy (anonymity/pseudonymity). The specific technological and enterprise knowledge representation objectives are: - To build semantic peers with advanced and possibly automated functions for the effective acquisition, classification, organization, sharing and use of enterprise knowledge (expressed in natural language) embedded in textual and multimedia documents. - To represent semantic web services made available by other peers or external providers (e.g. the Public Administration), together with information on their effectiveness in the value chain. - To establish the conditions for secure and trusted dynamic relationships, negotiations and information exchange between companies, based on shared collaboration protocols. - To offer web applications for managing distributed processes in specific sectors and the related general needs, hiding the complexity of the technological infrastructure. - To provide advanced semantics-based search services to discover candidate partners on the basis of the knowledge that each participant individually decides to make accessible to other users. - To solve the problem of knowledge extraction from textual documents describing the business logic of the company, by building automatic generators that extract natural-language descriptions of the company categories directly from the documents belonging to each category. - To extend the description of semantic web services, enriching them with non-functional characteristics in order to: a) characterize

Domenico Beneventano; Sonia Bergamaschi ( 2006 ) - Semantic search engines based on data integration systems ( Distributed Agent-based Retrieval Tools - Pula - Cagliari, Italy - June 26, 2006) ( - Distributed agent-based retrieval tools. Proceedings of the 1st International workshop ) (POLIMERICA S.a.s MILANO ITA ) - pp. da 11 a 38 ISBN: 8876990437 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

As the use of the World Wide Web has become increasingly widespread, the business of commercial search engines has become a vital and lucrative part of the Web. Search engines are commonplace tools for virtually every user of the Internet, and companies such as Google and Yahoo! have become household names. Semantic Search Engines try to augment and improve traditional Web Search Engines by using not just words, but concepts and logical relationships. In this chapter a relevant class of Semantic Search Engines, based on a peer-to-peer, data integration mediator-based architecture, is described. The architectural and functional features are presented with respect to two projects involving the authors, SEWASIE and WISDOM. The methodology to create a two-level ontology and query processing in the SEWASIE project are fully described.

Domenico Beneventano; Sonia Bergamaschi ( 2006 ) - SofTware for Ambient Semantic Interoperable Services [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

The goal of the STASIS project (SofTware for Ambient Semantic Interoperable Services) is the research, development and experimentation of methods and techniques enabling semantic interoperability among companies. The approach is centred on the use of ontologies, which provide an exhaustive, unambiguous, machine-understandable and shared representation of the knowledge associated with a particular area or activity.

Sonia Bergamaschi; Antonio Sala ( 2006 ) - Virtual Integration of Existing Web Databases for the Genotypic Selection of Cereal Cultivars ( Conference on Ontologies, DataBases, and Applications of Semantics (ODBASE 2006) - Montpellier, France - 29 ottobre – 3 novembre 2006) ( - On the Move to Meaningful Internet Systems 2006: CoopIS,DOA, GADA, and ODBASE, OTM Confederated International Conferences, CoopIS, DOA, GADA, and ODBASE 2006, Montpellier, France, October 29 - November 3, 2006. Proceedings, Part I ) (Springer Berlino DEU ) - n. volume Part I - pp. da 909 a 926 ISBN: 9783540482871 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The paper presents the development of a virtual database for the genotypic selection of cereal cultivars starting from phenotypic traits. The database is realized by integrating two existing web databases, Gramene and Graingenes, and a pre-existing data source developed by the Agrarian Faculty of the University of Modena and Reggio Emilia. The integration process gives rise to a virtual integrated view of the underlying sources. This integration is obtained using the MOMIS system (Mediator envirOnment for Multiple Information Sources), a framework developed by the Database Group of the University of Modena and Reggio Emilia. MOMIS performs information extraction and integration from both structured and semistructured data sources. Information integration is performed in a semi-automatic way, by exploiting the knowledge in a Common Thesaurus (defined by the framework) and the descriptions of source schemas with a combination of clustering and Description Logics techniques. MOMIS allows querying information in a transparent mode for the user regardless of the specific languages of the sources. The result obtained by applying MOMIS to the Gramene and Graingenes web databases is a queryable virtual view that integrates the two sources and allows performing genotypic selection of cultivars of barley, wheat and rice based on phenotypic traits, regardless of the specific languages of the web databases. The project is conducted in collaboration with the Agrarian Faculty of the University of Modena and Reggio Emilia and funded by the Regional Government of Emilia Romagna.

Gianluca Moro; Sonia Bergamaschi; Karl Aberer ( 2005 ) - Agents and Peer-to-Peer Computing, Third International Workshop, AP2PC 2004, New York, NY, USA, July 19, 2004, Revised and Invited Papers (Springer Heidelberg DEU ) - pp. da 1 a 244 ISBN: 9783540297550 [Curatela (284) - Curatela]
Abstract

Proceedings of the Agents and Peer-to-Peer Computing workshop held in 2004.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2005 ) - Building a tourism information provider with the MOMIS system - INFORMATION TECHNOLOGY & TOURISM - n. volume 7(3-4) - pp. da 221 a 238 ISSN: 1098-3058 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The tourism industry is a good candidate for taking up Semantic Web technology. In fact, there are many portals and websites belonging to the tourism domain that promote tourist products (places to visit, food to eat, museums, etc.) and tourist services (hotels, events, etc.), published by several operators (tourist promoter associations, public agencies, etc.). This article presents how the MOMIS system may be used for building a tourism information provider by exploiting the tourism information that is available in Internet websites. MOMIS (Mediator envirOnment for Multiple Information Sources) is a mediator framework that performs information extraction and integration from heterogeneous distributed data sources and includes query management facilities to transparently support queries posed to the integrated data sources.

Sonia Bergamaschi ( 2005 ) - Cross - Centro per l’innovazione ed il trasferimento tecnologico per l’interoperabilità e le reti di imprese [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

CROSS is an Innovation Centre created for the dissemination and transfer of information and communication technologies (ICT), with the specific objective of strengthening enterprise networks. Its priority themes therefore concern the dissemination of technologies and standards for interoperability, and organizational innovation.

DOMENICO BENEVENTANO; SONIA BERGAMASCHI ( 2005 ) - OQL Query Engine - Progettazione e realizzazione di un motore per la risoluzione di query espresse nel linguaggio Object Query Language (Standard ODMG) [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

The project aims to build a software component for solving queries over object-oriented databases. Queries are expressed in the Object Query Language (OQL) of the Object Data Management Group (ODMG) standard.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2005 ) - Querying a super-peer in a schema-based super-peer network ( Databases, Information Systems, and Peer-to-Peer Computing - Trondheim, Norway - August 28-29, 2005) ( - International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2005) ) (Springer, Lecture Notes in Computer Science Berlino DEU ) - pp. da 13 a 25 ISBN: 9783540716600 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We propose a novel approach for defining and querying a super-peer within a schema-based super-peer network organized into a two-level architecture: the lower level, called the peer level, contains mediator nodes; the upper one, called the super-peer level, integrates mediator peers with similar content. We focus on a single super-peer and propose a method to define and solve a query, fully implemented in the SEWASIE project prototype. The problem we face is relevant: since a super-peer is a two-level data integration system, we go beyond the traditional setting in data integration. We have two different levels of Global-as-View mappings: the first mapping is at the super-peer level and maps the Global Virtual Views (GVVs) of several peers into the GVV of the super-peer; the second mapping is within a peer and maps the data sources into the GVV of the peer. Moreover, we propose an approach where the integration designer, supported by a graphical interface, can implicitly define mappings by using Resolution Functions to solve data conflicts, and the Full Disjunction operator, which has been recognized as providing a natural semantics for data merging queries.

Sonia Bergamaschi; Domenico Beneventano; Maurizio Vincini; Francesco Guerra ( 2005 ) - SEWASIE - SEmantic Webs and AgentS in Integrated Economies. [Software (296) - Software]
Abstract

SEWASIE (SEmantic Webs and AgentS in Integrated Economies) aimed to design and implement an advanced search engine enabling intelligent access to heterogeneous data sources on the web via semantic enrichment, providing the basis of structured, secure web-based communication. SEWASIE provides users with a search client that has an easy-to-use query interface, and which can extract the required information from the Internet and show it in a useful and user-friendly format. From an architectural point of view, the prototype provides a search engine client, indexing servers and ontologies.

S. BERGAMASCHI; P. BOUQUET; P. CIACCIA; P. MERIALDO ( 2005 ) - Speaking Words of WISDOM: Web Intelligent Search based on DOMain ontologies ( Semantic Web Applications and Perspectives (SWAP 2005) - Trento - 14-15-16 December 2005) ( - Semantic Web Applications and Perspectives, 2nd Italian Semantic Web Workshop (SWAP 2005) ) (CEUR Workshop Proceedings Trento ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we present the architecture of a system for searching and querying information sources available on the web, which was developed as part of a project called WISDOM. The key feature of our proposal is a distributed architecture based on (i) the peer-to-peer paradigm and (ii) the adoption of domain ontologies. At the lower level, we support a strong, ontology-based integration of the information content of a group of source peers, which form a so-called semantic peer. At the upper level, we provide a loose, mapping-based integration of a set of semantic peers. We then show how queries can be efficiently managed and distributed in such a two-layer scenario.

Bergamaschi S; Fillottrani PR; Gelati G ( 2005 ) - The SEWASIE multi-agent system ( Agents and Peer-to-Peer Computing - New York, NY, USA - July 19, 2004) ( - Agents and Peer-to-Peer Computing, Third International Workshop, AP2PC 2004, New York, NY, USA, July 19, 2004, Revised and Invited Papers ) (Springer Heidelberg DEU ) - n. volume 3601 - pp. da 120 a 131 ISBN: 9783540297550 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Data integration, in the context of the web, faces new problems, due in particular to the heterogeneity of sources, to the fragmentation of the information and to the absence of a unique way to structure, and view information. In such areas, the traditional paradigms on which database foundations are based (i.e. client/server architecture, few sources containing large information) have to be overcome by new architectures. In this paper we propose a layered P2P architecture for mediator systems. Peers are information nodes which are coordinated by a multi-agent system in order to allow distributed query processing.

Sonia Bergamaschi; Gianluca Moro; Aris M. Ouksel ( 2005 ) - Third International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2005) [Esposizione (290) - Esposizione]
Abstract

The aim of this third workshop is to explore the promise of P2P to offer exciting new possibilities in distributed information processing and database technologies. The realization of this promise lies fundamentally in the availability of enhanced services such as structured ways for classifying and registering shared information, verification and certification of information, content-distribution schemes and quality of content, security features, information discovery and accessibility, interoperation and composition of active information services, and finally market-based mechanisms to allow cooperative and non-cooperative information exchanges. The P2P paradigm lends itself to constructing large-scale complex, adaptive, autonomous and heterogeneous database and information systems, endowed with clearly specified and differential capabilities to negotiate, bargain, coordinate and self-organize the information exchanges in large-scale networks. This vision will have a radical impact on the structure of complex organizations (business, scientific or otherwise), on the emergence and formation of social communities, and on how information is organized and processed. The P2P information paradigm naturally encompasses static and wireless connectivity, and static and mobile architectures. Wireless connectivity, combined with increasingly small and powerful mobile devices and sensors, poses new challenges as well as opportunities to the database community. Information becomes ubiquitous, highly distributed and accessible anywhere and at any time over highly dynamic, unstable networks with very severe constraints on information management and processing capabilities. What techniques and data models may be appropriate for this environment, and yet guarantee or approach the performance, versatility and capability that users and developers have come to enjoy in traditional static, centralized and distributed database environments? 
Is there a need to define new notions of consistency, durability and completeness, for example? The proposed workshop will build on the success of the two preceding editions at VLDB 2003 and 2004. It will concentrate on exploring the synergies between current database research and P2P computing. It is our belief that database research has much to contribute to the P2P grand challenge through its wealth of techniques for sophisticated semantics-based data models, new indexing algorithms and efficient data placement, query processing techniques and transaction processing. Database technologies in the new information age will form the crucial components of the first generation of complex adaptive P2P information systems, which will be characterized by their ability to continuously self-organize, adapt to new circumstances, promote emergence as an inherent property, optimize locally but not necessarily globally, and deal with approximation and incompleteness. This workshop will also concentrate on the impact of complex adaptive information systems on current database technologies and their relation to emerging industrial technologies such as IBM's autonomic computing initiative. The workshop will be co-located with VLDB, the major international database and information systems conference, and will bring together key researchers from all over the world working on databases and P2P computing with the intention of strengthening this connection. Researchers from other related areas such as distributed systems, networks, multi-agent systems and complex systems will also be invited.

Bergamaschi S; Guerra F; Vincini M ( 2004 ) - A peer-to-peer information system for the semantic web ( Agents and Peer-to-Peer Computing - Melbourne, Australia - July 14, 2003) ( - Agents and Peer-to-Peer Computing, Second International Workshop, AP2PC 2003 ) (Springer Heidelberg DEU ) - n. volume 2872 - pp. da 113 a 122 ISBN: 9783540240532 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Data integration, in the context of the web, faces new problems, due in particular to the heterogeneity of sources, to the fragmentation of the information and to the absence of a unique way to structure and view information. In such areas, the traditional paradigms on which database foundations are based (i.e. client/server architecture, few sources containing large amounts of information) have to be overcome by new architectures. The peer-to-peer (P2P) architecture seems to be the best way to deal with these new kinds of data sources, offering an alternative to the traditional client/server architecture. In this paper we present the SEWASIE system, which aims at providing access to heterogeneous web information sources. An enhancement of the system architecture in the direction of a P2P architecture, where connections among SEWASIE peers rely on the exchange of XML metadata, is described.

R. BENASSI; S. BERGAMASCHI; ALAIN FERGNANI; DANIELE MISELLI ( 2004 ) - Extending a Lexicon Ontology for Intelligent Information Integration ( European Conference on Artificial Intelligence (ECAI2004) - Valencia, Spain - 22-27 August 2004) ( - European Conference on Artificial Intelligence (ECAI2004) ) (IOS PRESS Amsterdam NLD ) - pp. da 278 a 282 ISBN: 9781586034528 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

One of the current research areas on the Semantic Web is the semantic annotation of information sources. On-line lexical ontologies can be exploited as a-priori common knowledge to provide easily understandable, machine-readable metadata. Nevertheless, the absence of terms related to specific domains causes a loss of semantics. In this paper we present WNEditor, a tool that aims at guiding the annotation designer during the creation of a domain lexicon ontology, extending the pre-existing WordNet ontology. New terms, meanings and relations between terms are virtually added and managed while preserving WordNet’s internal organization.

S. Bergamaschi; D. Beneventano; F. Guerra; M. Orsini; M. Vincini ( 2004 ) - MOMIS: an Ontology-based Information Integration System(software) [Software (296) - Software]
Abstract

The Mediator Environment for Multiple Information Sources (MOMIS), developed by the database research group at the University of Modena and Reggio Emilia, aims to construct synthesized, integrated descriptions of information coming from multiple heterogeneous sources. Our goal is to provide users with a global virtual view (GVV) of the information sources, independent of their location or their data’s heterogeneity. Such a view conceptualizes the underlying domain; you can think of it as an ontology describing the sources involved. The Semantic Web exploits semantic markup to provide Web pages with machine-readable definitions. It thus relies on the a priori existence of ontologies that represent the domains associated with the given information sources. This approach relies on the selected reference ontology’s accuracy, but we find that most ontologies in common use are generic and that the annotation phase (in which semantic annotations connect Web page parts to ontology items) causes a loss of semantics. By involving the sources themselves, our approach builds an ontology that more precisely represents the domain. Moreover, the GVV is annotated according to a lexical ontology, which provides an easily understandable meaning to content. An open source version of the MOMIS system was released in April 2010 by the spin-off DATARIVER (www.datariver.it).

I. BENETTI; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2004 ) - SOAP-ENABLED WEB SERVICES FOR KNOWLEDGE MANAGEMENT - INTERNATIONAL JOURNAL OF WEB ENGINEERING AND TECHNOLOGY - n. volume 1(2) - pp. da 218 a 235 ISSN: 1476-1289 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The widespread diffusion of the World Wide Web among medium/small companies makes a huge amount of business information available online. Nevertheless, the heterogeneity of that information forces even trading partners involved in the same business process to face daily interoperability issues. The challenge is the integration of distributed business processes, which, in turn, means integration of heterogeneous data coming from distributed sources. This paper presents the new web services-based architecture of the MOMIS (Mediator envirOnment for Multiple Information Sources) framework, which enhances the semantic integration features of MOMIS by leveraging new technologies such as XML web services and the SOAP protocol. The new architecture decouples the different MOMIS modules, publishing them as XML web services. Since the SOAP protocol used to access XML web services requires the same network security settings as a normal internet browser, companies are enabled to share knowledge without weakening their protection strategies.

R. BENASSI; D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2004 ) - Synthesizing an Integrated Ontology with MOMIS ( International Conference on Knowledge Engineering and Decision Support (ICKEDS 2004) - Porto, Portugal - Portugal, 21-23 July) ( - International Conference on Knowledge Engineering and Decision Support (ICKEDS) ) (Proceedings su cd Porto PRT ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Mediator EnvirOnment for Multiple Information Sources (MOMIS) aims at constructing synthesized, integrated descriptions of the information coming from multiple heterogeneous sources, in order to provide the user with a global virtual view of the sources, independent of their location and the level of heterogeneity of their data. Such a global virtual view is a conceptualization of the underlying domain and may therefore be thought of as an ontology describing the involved sources. In this article we explore the framework’s main elements and discuss how the output of the integration process can be exploited to create a conceptualization of the underlying domain.

D. BENEVENTANO; S. BERGAMASCHI ( 2004 ) - The MOMIS methodology for integrating heterogeneous data sources ( IFIP WORLD COMPUTER CONGRESS - TOULOUSE, FRANCE - 22-27 AUGUST 2004) ( - IFIP World Computer Congress ) (IFIP Tolosa FRA ) - pp. da 19 a 24 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Mediator EnvirOnment for Multiple Information Sources (MOMIS) aims at constructing synthesized, integrated descriptions of the information coming from multiple heterogeneous sources, in order to provide the user with a global virtual view of the sources, independent of their location and the level of heterogeneity of their data. Such a global virtual view is a conceptualization of the underlying domain and may therefore be thought of as an ontology describing the involved sources. In this article we explore the framework’s main elements and discuss how the output of the integration process can be exploited to create a conceptualization of the underlying domain.

Gianluca Moro; Sonia Bergamaschi; Karl Aberer; Munindar P. Singh ( 2004 ) - Third International Workshop on Agents and Peer-to-Peer Computing(AP2PC 2004) [Esposizione (290) - Esposizione]
Abstract

Peer-to-peer (P2P) computing is attracting enormous media attention, spurred by the popularity of file sharing systems such as Napster, Gnutella, and Morpheus. The peers are autonomous, or as some call them, first-class citizens. P2P networks are emerging as a new distributed computing paradigm for their potential to harness the computing power of the hosts composing the network and make their under-utilized resources available to others. This possibility has generated a lot of interest in many industrial organizations, which have already launched important projects. In P2P systems, peers and web services in the role of resources become shared and combined to enable new capabilities greater than the sum of the parts. This means that services can be developed and treated as pools of methods that can be composed dynamically. The decentralized nature of P2P computing also makes it ideal for economic environments that foster knowledge sharing and collaboration as well as cooperative and non-cooperative behaviors in sharing resources. Business models are being developed which rely on incentive mechanisms to supply contributions to the system and on methods for controlling free riding. Clearly, the growth and the management of P2P networks must be regulated to ensure adequate compensation of content and/or service providers. At the same time, there is also a need to ensure an equitable distribution of content and services. Although researchers working on distributed computing, MultiAgent Systems, databases and networks have been using similar concepts for a long time, it is only recently that papers motivated by the current P2P paradigm have started appearing in high-quality conferences and workshops. 
Research in agent systems in particular appears to be most relevant because, since their inception, MultiAgent Systems have always been thought of as networks of peers. The MultiAgent paradigm can thus be superimposed on the P2P architecture, where agents embody the description of the task environments, the decision-support capabilities, the collective behavior, and the interaction protocols of each peer. The emphasis in this context on decentralization, user autonomy, ease and speed of growth, which gives P2P its advantages, also leads to significant potential problems. Most prominent among these problems are coordination, the ability of an agent to make decisions on its own actions in the context of the activities of other agents, and scalability: the value of P2P systems lies in how well they scale along several dimensions, including complexity, heterogeneity of peers, robustness, traffic redistribution, and so on. It is important to scale up coordination strategies along multiple dimensions to enhance their tractability and viability, and thereby to widen the application domains. These two problems are common to many large-scale applications. Without coordination, agents may waste their efforts, squander resources and fail to achieve their objectives in situations requiring collective effort. This workshop will bring together researchers working on agent systems and P2P computing with the intention of strengthening this connection. Researchers from other related areas such as distributed systems, networks and database systems will also be welcome (and, in our opinion, have a lot to contribute).

R. BENASSI; S. BERGAMASCHI; M. VINCINI ( 2004 ) - TUCUXI: the Intelligent Hunter Agent for Concept Understanding and Lexical Chaining ( IEEE/WIC/ACM International Conference on Web Intelligence - Beijing, China - 20-24 September 2004) ( - Web Intelligence ) (IEEE Computer Society New York USA ) - pp. da 249 a 255 ISBN: 9780769521008 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we present TUCUXI, an intelligent hunter agent that replaces traditional keyword-based queries on the Web with a user-provided domain ontology, in which the meanings to be searched are not ambiguous.

R. BENASSI; S. BERGAMASCHI; M. VINCINI ( 2004 ) - Web Semantic Search with TUCUXI ( Convegno Nazionale su Sistemi Evoluti per Basi di Dati - S. Margherita di Pula, Cagliari, Italy - June 21-23, 2004) ( - Convegno Nazionale su Sistemi Evoluti per Basi di Dati ) (M. Agosti, P. Dessi, F. Schreiber CAGLIARI ITA ) - pp. da 426 a 423 ISBN: 9788890140914 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

S. Margherita di Pula (Cagliari), Italy, 21-23 June.

S. Bergamaschi; P. Bouquet; P. Ciaccia; M. Paolo ( 2004 ) - WISDOM: Ricerca Intelligente su Web basata su Ontologie di Dominio [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

The enormous amount of data and the growing availability of services on the Web make it increasingly important to develop infrastructures and software systems that, by providing tools for integrating information resources, locating them, and delivering them in a personalized way, allow clients (both human and artificial) connected to the network to "recharge" themselves with the information of interest, avoiding the "information overloading" problems encountered with common search engines. The main objective of the WISDOM project is the development of techniques and tools, based on domain ontologies, for effective and efficient information search on the Web; it therefore falls within the research area of the Semantic Web. The project is organized into three synergistic and complementary themes, and will define a methodological and functional reference architecture in order to guarantee coherence among the solutions developed within the three themes. THEME 1: Creation and Extension of a Domain Ontology. THEME 2: Emergent Semantics: Discovery of Semantic Mappings between Domain Ontologies. THEME 3: Query Processing. The objective of the first theme is the study and development of solutions for the semantic representation of the contents of Web information sources, with particular reference to data-intensive sites and to sites/pages with poorly structured content. The representation and integration of such information sources will lead to the creation of domain ontologies and to their possible modification as a result of the discovery/integration of new information sources. The objective of the second theme is the development of solutions for semantic mapping between domain ontologies on the Web, with particular reference to techniques and tools supporting the identification, discovery, validation and storage of semantic relationships.
The objective of the third theme is the development of techniques for information search on the Web that can exploit the semantic infrastructure developed in themes 1 and 2. Considering the heterogeneity of the data/sites involved and the constraints imposed by the distributed environment, effective and efficient query processing mechanisms will be studied and developed, which use the characterization of the sources to select the useful ones and which solve the problems of query rewriting and of integrating the results over the sources. The figure shows a reference scenario for the project: two different ontologies referring to the same domain, which also represent extensional knowledge, related by simple semantic mappings. The two queries, Query1 and Query2, are each posed with reference to one ontology: the techniques developed in the project will make it possible to answer each query by interrogating, when relevant, all the sites referring to the ontologies present in the network. With regard to THEME 1, a first objective is the definition of an ontology language for the structural and semantic description of the contents of the sources, in terms of metadata, compatible with W3C standards (XML, RDF, RDFS, XML Schema, OWL). In particular, to cope with specific queries, this language must allow a synthetic characterization of the content (instances) of the information sources. A domain ontology is represented as a Global Virtual View (GVV) of a set of information sources belonging to the same domain. For data-intensive sites, the first problem to address is schema extraction by means of suitable, automatically generated wrappers. A second problem is giving a semantics to the data extracted by automatically generated wrappers.
For this problem, extensions to the techniques for annotating the data extracted by wrappers will be evaluated, using approaches based on the semantics of the ontology.

S. BERGAMASCHI; G. GELATI; F. GUERRA; M. VINCINI ( 2003 ) - A Experiencing AUML for the WINK Multi-Agent System ( WOA 2003: dagli Oggetti agli Agenti - Villasimius (Cagliari), Italy - 10 - 11 Settembre 2003) ( - WOA 2003: Dagli Oggetti agli Agenti. 4th AI*IA/TABOO Joint Workshop "From Objects to Agents": Intelligent Systems and Pervasive Computing ) (Pitagora Editrice Bologna Bologna ITA ) - pp. da 148 a 148 ISBN: 9788837114138 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In the last few years, efforts have been made towards bridging the gap between agent technology and de facto standard technologies, aiming at introducing multi-agent systems in industrial applications. This paper presents an experience gained by using one such proposal, Agent UML. Agent UML is a graphical modelling language based on UML. The practical use of this notation has led us to suggest some refinements of the Agent UML features.

D. BENEVENTANO; S. BERGAMASCHI; A. FERGNANI; F. GUERRA; M. VINCINI; D. MONTANARI ( 2003 ) - A Peer-to-Peer Agent-Based Semantic Search Engine ( Sistemi Evoluti per Basi di Dati (SEBD 2003) - Cetraro (CS) - June 24-27, 2003) ( - Proceedings of the Eleventh Italian Symposium on Advanced Database Systems ) (Rubbettino Editore Cosenza ITA ) - pp. da 367 a 378 ISBN: 9788849806298 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Several architectures, protocols, languages, and candidate standards have been proposed to let the "semantic web" idea take off. In particular, searching for information requires cooperation of the information providers and seekers. Past experience and history show that a successful architecture must support ease of adoption and deployment by a wide and heterogeneous population, a flexible policy to establish an acceptable cost-benefit ratio for using the system, and the growth of a cooperative distributed infrastructure with no central control. In this paper an agent-based peer-to-peer architecture is defined to support search through a flexible integration of semantic information. Two levels of integration are foreseen: strong integration of sources related to the same domain into a single information node by means of a mediator-based system, and weak integration of information nodes on the basis of semantic relationships existing among concepts of different nodes. The EU IST SEWASIE project is described as an instantiation of this architecture. SEWASIE aims at implementing an advanced search engine which will provide SMEs with intelligent access to heterogeneous information on the Internet.

D. Beneventano; S. Bergamaschi; D. Miselli; A. Fergnani; M. Vincini ( 2003 ) - Building an Integrated Ontology within the SEWASIE Project: The Ontology Builder Tool (International Semantic Web Conference Sanibel Island, Florida USA ) - pp. da 1 a 1 ISBN: 9780000000002 [Monografia o trattato scientifico (276) - Monografia/Trattato scientifico]
Abstract

See http://www.sewasie.org/

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2003 ) - Building an integrated Ontology within the SEWASIE system ( Workshop on Semantic Web and Databases - Berlin, Germany - September 7-8, 2003) ( - First International Workshop on Semantic Web and Databases (SWDB) ) (Isabel F. Cruz, Vipul Kashyap, Stefan Decker, Rainer Eckstein Berlin DEU ) - pp. da 91 a 107 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

MOMIS (Mediator envirOnment for Multiple Information Sources) is a framework for information extraction and integration of heterogeneous structured and semi-structured information sources. The result of the integration process is a Global Virtual View (in short GVV), which is a set of (global) classes that represent the information contained in the sources being used. In this paper, we present the application of our integration methodology to a specific type of source (i.e. web documents), and show how the result of the integration approach can be exploited to create a conceptualization of the domain to which the sources belong, i.e. an ontology. Two new achievements of the MOMIS system are presented: the semi-automatic annotation of the GVV and the extension of a built-up ontology by the addition of another source.
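The core idea above, a GVV whose global classes unify the classes of several local source schemas, can be illustrated with a small sketch. Everything below (the schemas, attributes, and the `build_gvv` helper) is a hypothetical toy for illustration, not MOMIS code:

```python
# Minimal illustration of a Global Virtual View (GVV): global classes
# whose attributes are the union of the local classes mapped to them.
# All class and attribute names here are invented examples.

def build_gvv(local_schemas, mappings):
    """Merge local classes into global classes via a name mapping.

    local_schemas: {source: {local_class: [attributes]}}
    mappings: {(source, local_class): global_class}
    """
    gvv = {}
    for source, classes in local_schemas.items():
        for local_class, attrs in classes.items():
            global_class = mappings[(source, local_class)]
            # A global class accumulates attributes from every local class
            # mapped onto it.
            gvv.setdefault(global_class, set()).update(attrs)
    return gvv

# Two hypothetical web sources describing the same domain.
schemas = {
    "siteA": {"Hotel": ["name", "city", "stars"]},
    "siteB": {"Accommodation": ["name", "address", "rating"]},
}
mappings = {
    ("siteA", "Hotel"): "Lodging",
    ("siteB", "Accommodation"): "Lodging",
}

gvv = build_gvv(schemas, mappings)
```

Running the sketch merges the two local classes into a single global class `Lodging` whose attribute set is the union of both sources' attributes, which is the intuition behind a GVV.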

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA ( 2003 ) - Building an Ontology with MOMIS ( Semantic Integration Workshop - Sanibel Island, Florida, USA - October 20, 2003) ( - Proceedings of the Semantic Integration Workshop Collocated with the Second International Semantic Web Conference (ISWC-03) ) - CEUR WORKSHOP PROCEEDINGS - n. volume 82 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Nowadays the Web is a huge collection of data and its expansion rate is very high. Web users need new ways to exploit all this available information and these possibilities. A new vision of the Web is arising: the Semantic Web, where resources are annotated with machine-processable metadata providing them with background knowledge and meaning. A fundamental component of the Semantic Web is the ontology; this "explicit specification of a conceptualization" allows information providers to give an understandable meaning to their documents. MOMIS (Mediator envirOnment for Multiple Information Sources) is a framework for information extraction and integration of heterogeneous information sources. The system implements a semi-automatic methodology for data integration that follows the Global as View (GAV) approach. The result of the integration process is a global schema, the GVV (Global Virtual View), which provides a reconciled, integrated and virtual view of the underlying sources. The GVV is composed of a set of (global) classes that represent the information contained in the sources. In this paper, we focus on the application of MOMIS to a particular kind of source (i.e. web documents), and show how the result of the integration process can be exploited to create a conceptualization of the underlying domain, i.e. a domain ontology for the integrated sources. The GVV is then semi-automatically annotated according to a lexical ontology. With reference to the Semantic Web area, where the annotation process generally consists of providing a web page with semantic markups according to an ontology, we first mark up the local metadata descriptions, and then the MOMIS system generates an annotated conceptualization of the sources. Moreover, our approach "builds" the domain ontology as the synthesis of the integration process, while the usual approach in the Semantic Web is based on the "a priori" existence of an ontology.

D. Beneventano; S. Bergamaschi; C. Sartori ( 2003 ) - Description logics for semantic query optimization in object-oriented database systems - ACM TRANSACTIONS ON DATABASE SYSTEMS - n. volume 28 - pp. da 1 a 50 ISSN: 0362-5915 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Semantic query optimization uses semantic knowledge (i.e., integrity constraints) to transform a query into an equivalent one that may be answered more efficiently. This article proposes a general method for semantic query optimization in the framework of Object-Oriented Database Systems. The method is effective for a large class of queries, including conjunctive recursive queries expressed with regular path expressions and is based on three ingredients. The first is a Description Logic, ODLRE, providing a type system capable of expressing: class descriptions, queries, views, integrity constraint rules and inference techniques, such as incoherence detection and subsumption computation. The second is a semantic expansion function for queries, which incorporates restrictions logically implied by the query and the schema (classes + rules) in one query. The third is an optimal rewriting method of a query with respect to the schema classes that rewrites a query into an equivalent one, by determining more specialized classes to be accessed and by reducing the number of factors. We implemented the method in a tool providing an ODMG-compliant interface that allows a full interaction with OQL queries, wrapping underlying Description Logic representation and techniques to the user.
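The semantic expansion step described above, incorporating restrictions logically implied by the query and the schema rules, can be caricatured as a fixpoint over propositional constraints. The rule format and all predicate names below are invented for illustration; the actual method operates on ODLRE descriptions and OQL queries, not Python sets:

```python
# Toy illustration of semantic expansion: conjoin with the query every
# restriction that the schema's integrity constraints logically imply.
# Rules are (antecedent_set, consequent) pairs: if all antecedents hold,
# the consequent also holds. All names are hypothetical.

def semantic_expansion(query_preds, rules):
    expanded = set(query_preds)
    changed = True
    while changed:                       # fixpoint: apply rules until stable
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= expanded and consequent not in expanded:
                expanded.add(consequent)
                changed = True
    return expanded

# Hypothetical constraints: every manager is an employee; every employee
# has a badge.
rules = [({"manager"}, "employee"), ({"employee"}, "badge")]
expanded = semantic_expansion({"manager"}, rules)
```

The expanded query can then be checked against the class hierarchy to find more specialized classes to access, in the spirit of the optimal rewriting the article describes.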

MATTHIAS KLUSCH; S. BERGAMASCHI; PAOLO PETTA ( 2003 ) - European Research and Development of Intelligent Information Agents: The AgentLink Perspective ( - INTELLIGENT INFORMATION AGENTS: The AgentLink Perspective ) (Springer HEIDELBERG DEU ) - n. volume 2586 - pp. da 1 a 21 ISBN: 9783540007593 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

The vast amount of heterogeneous information sources available in the Internet demands advanced solutions for acquiring, mediating, and maintaining relevant information for the common user. The impacts of data, system, and semantic heterogeneity on the information overload of the user are manifold and especially due to potentially significant differences in data modeling, data structures, content representations using ontologies and vocabularies, query languages and operations to retrieve, extract, and analyse information in the appropriate context. The impacts of the increasing globalisation on the information overload encompass the tedious tasks of the user to determine and keep track of relevant information sources, to efficiently deal with different levels of abstraction of information modeling at sources, and to combine partially relevant information from potentially billions of sources. A special type of intelligent software agents, so-called information agents, is supposed to cope with these difficulties associated with the information overload of the user. This implies their ability to semantically broker information by providing pro-active resource discovery, resolving the information impedance of information consumers and providers in the Internet, and offering value-added information services and products to the user or other agents. In subsequent sections we briefly introduce the reader to the notion of such agents as well as to one of the prominent European forums for research on and development of these agents, the AgentLink special interest group on intelligent information agents. This book includes presentations of advanced systems of information agents and solution approaches to different problems in the domain that have been developed jointly by members of this special interest group in respective working groups.

Matthias Klusch; Sonia Bergamaschi; Peter Edwards; Paolo Petta ( 2003 ) - Intelligent Information Agents - The AgentLink Perspective (Springer Heidelberg DEU ) - pp. da 1 a 273 ISBN: 9783540007593 [Curatela (284) - Curatela]
Abstract

State of the art of the research on intelligent information agents

I. BENETTI; S. BERGAMASCHI; E.SCARSO ( 2003 ) - Managing knowledge through electronic commerce applications: a framework for integrating information coming from heterogeneous web sources (Inderscience Enterprises Limited:29 route de Pre-Bois, CP 896, CH-1215 Geneva Switzerland:011 44 1234 713365, EMAIL: subs@inderscience.com, INTERNET: http://www.inderscience.com, Fax: 011 41 22 7910885 ) - INTERNATIONAL JOURNAL OF ELECTRONIC BUSINESS - n. volume 1 No. 3 - pp. da 237 a 257 ISSN: 1470-6067 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The paper aims at investigating the interplay existing between Electronic Commerce (EC) technologies, knowledge and Knowledge Management (KM), an issue that has attracted attention in the academic and professional literature in recent times. To this end, a careful examination of the logic and working mechanisms of MOMIS is conducted, a semi-automatic framework for integrating information coming from heterogeneous sources that is under development at the Department of Information Engineering of the University of Modena e Reggio Emilia. In particular, the use of MOMIS to create virtual catalogues (i.e. EC instruments that dynamically retrieve information from multiple heterogeneous sources) is discussed in depth from a knowledge-based point of view. The analysis seems to confirm that EC and KM are not unrelated managerial issues, but rather that they can (and must) be beneficially integrated. More specifically, it seems possible to state that a better and more accurate understanding of knowledge management processes is essential to design and realise more effective EC applications.

D. BENEVENTANO; S. BERGAMASCHI; J. GELATI; F. GUERRA; M. VINCINI ( 2003 ) - MIKS: an agent framework supporting information access and integration ( - Intelligent Information Agents Research and Development in Europe: An AgentLink Perspective ) (Springer Heidelberg DEU ) - n. volume 2586 - pp. da 22 a 49 ISBN: 9783540007593 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Providing integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In the past, companies have equipped themselves with data storing systems, building up informative systems containing data that are related to one another, but which are often redundant, not homogeneous and not always semantically consistent. Moreover, to meet the requirements of global, Internet-based information systems, it is important that the tools developed for supporting these activities are semi-automatic and as scalable as possible. To face the issues related to scalability in the large scale, in this paper we propose the exploitation of mobile agents in the information integration area and, in particular, their integration in the MOMIS infrastructure. MOMIS (Mediator envirOnment for Multiple Information Sources) is a system that has been conceived as a pool of tools to provide integrated access to heterogeneous information stored in traditional databases (for example relational or object-oriented databases) or in file systems, as well as in semi-structured data sources (XML files). This proposal has been implemented within the MIKS (Mediator agent for Integration of Knowledge Sources) system and is completely described in this paper.

D. BENEVENTANO; S. BERGAMASCHI; D. MONTANARI; L. OTTAVIANI ( 2003 ) - Semantic Web Search Engines: the SEWASIE approach ( - SECOND INTERNATIONAL SEMANTIC WEB CONFERENCE (ISWC 2003) ) - pp. da 1 a 1 [Poster (275) - Poster]
Abstract

SEWASIE is a research project funded by the European Commission that aims to design and implement an advanced search engine enabling intelligent access to heterogeneous data sources on the web via semantic enrichment to provide the basis of structured secure web-based communication.

A. ALBERIGI QUARANTA; I. BENETTI; S. BERGAMASCHI; E. SCARSO ( 2003 ) - Stato e prospettive di sviluppo delle tecnologie informatiche per l'economia digitale ( - Crisi e trasformazione dell'economia digitale ) (FrancoAngeli Milano ITA ) - pp. da 271 a 296 ISBN: 9788846450128 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

The chapter describes the state and the development prospects of information technologies for the digital economy.

D. Beneventano; S. Bergamaschi; F. Guerra; M. Vincini ( 2003 ) - Synthesizing an integrated ontology - IEEE INTERNET COMPUTING - n. volume 7 - pp. da 42 a 51 ISSN: 1089-7801 [Articolo in rivista (262) - Articolo su rivista]
Abstract

To exploit the Internet’s expanding data collection, current Semantic Web approaches employ annotation techniques to link individual information resources with machine-comprehensible metadata. Before we can realize the potential this new vision presents, however, several issues must be solved. One of these is the need for data reliability in dynamic, constantly changing networks. Another issue is how to explicitly specify relationships between abstract data concepts. Ontologies provide a key mechanism for solving these challenges, but the Web’s dynamic nature leaves open the question of how to manage them. The Mediator Environment for Multiple Information Sources (Momis), developed by the database research group at the University of Modena and Reggio Emilia, aims to construct synthesized, integrated descriptions of information coming from multiple heterogeneous sources. Our goal is to provide users with a global virtual view (GVV) of information sources, independent of their location or their data’s heterogeneity. Such a view conceptualizes the underlying domain; you can think of it as an ontology describing the sources involved. The Semantic Web exploits semantic markups to provide Web pages with machine-readable definitions. It thus relies on the a priori existence of ontologies that represent the domains associated with the given information sources. This approach relies on the selected reference ontology’s accuracy, but we find that most ontologies in common use are generic and that the annotation phase (in which semantic annotations connect Web page parts to ontology items) causes a loss of semantics. By involving the sources themselves, our approach builds an ontology that more precisely represents the domain. Moreover, the GVV is annotated according to a lexical ontology, which provides an easily understandable meaning to content. 
In this article, we use Web documents as a representative information source to describe the Momis methodology’s general application. We explore the framework’s main elements and discuss how the output of the integration process can be exploited to create a conceptualization of the underlying domain. In particular, our method provides a way to extend previously created conceptualizations, rather than starting from scratch, by inserting a new source.

S. BERGAMASCHI; G.GELATI; F. GUERRA; M. VINCINI ( 2003 ) - WINK: a Web-based Enterprise System for Collaborative Project Management in Virtual Enterprises ( Web Information Systems Engineering - Roma, Italy - 10-12 December 2003) ( - 4th International Conference on Web Information Systems Engineering ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 176 a 185 ISBN: 9780769519999 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The increasing globalization and flexibility required of companies have generated, in the last decade, new issues related to the management of large-scale projects within geographically distributed networks and to the cooperation of enterprises. ICT support systems are required to allow enterprises to share information, guarantee data consistency and establish synchronized and collaborative processes. In this paper we present a collaborative project management system that integrates data coming from aerospace industries with two main goals: avoiding inconsistencies generated by updates at the sources' level and minimizing data replication. The proposed system is composed of a collaborative project management component supported by a web interface, a multi-agent data integration component, which supports information sharing and querying, and SOAP-enabled web services which ensure the overall interoperability of the software components. The system was developed by the University of Modena and Reggio Emilia, Gruppo Formula S.p.A. and Alenia Spazio S.p.A. within the EU WINK Project (Web-linked Integration of Network based Knowledge - IST-2000-28221).

Bergamaschi S; Guerra F; Vincini M ( 2002 ) - A data integration framework for e-commerce product classification ( International Semantic Web Conference (ISWC 2001) - Cagliari, Italy - 9-12 June 2002) ( - The Semantic Web - ISWC 2002, First International Semantic Web Conference ) (Springer Heidelberg DEU ) - n. volume 2342 - pp. da 379 a 393 ISBN: 9783540437604 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A marketplace is the place in which the demand and supply of buyers and vendors participating in a business process may meet. Therefore, electronic marketplaces are virtual communities in which buyers may meet the proposals of several suppliers and make the best choice. In the electronic commerce world, the comparison of different products is hindered by the lack of a shared standard (or rather, by the proliferation of standards) for describing and classifying them. Therefore, the need for B2B and B2C marketplaces is to reclassify products and goods according to different standardization models. This paper aims to face this problem by suggesting the use of a semi-automatic methodology, supported by a tool (SI-Designer), to define the mapping among different e-commerce product classification standards. This methodology was developed for the MOMIS system within the Intelligent Integration of Information research area. We describe our extension of the methodology that makes it applicable to product classification standards in general, by selecting a fragment of the ECCMA/UNSPSC and ecl@ss standards.
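The reclassification task the paper addresses, moving products from one classification standard to another, reduces at its simplest to applying a code-level mapping table (which a methodology like the one above helps construct semi-automatically). The codes and helper below are invented placeholders for illustration, not real UNSPSC or ecl@ss identifiers:

```python
# Sketch of reclassifying a product catalog from one hypothetical
# classification standard to another via a mapping table. Codes are
# invented placeholders, not real UNSPSC or ecl@ss identifiers.

SOURCE_TO_TARGET = {
    "A-100": "X-9",   # hypothetical source-standard code -> target code
    "A-200": "X-7",
}

def reclassify(products, mapping):
    """products: list of (name, source_code) pairs.

    Returns (name, target_code) pairs; unmapped products get code None,
    flagging them for manual review."""
    return [(name, mapping.get(code)) for name, code in products]

catalog = [("drill bit", "A-100"), ("lathe", "A-200"), ("widget", "A-999")]
result = reclassify(catalog, SOURCE_TO_TARGET)
```

The hard part in practice is of course building `SOURCE_TO_TARGET` between thousands of categories, which is exactly where the semi-automatic mapping discovery comes in.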

S. Bergamaschi; M. Vincini ( 2002 ) - A semantic approach to access heterogeneous data sources: the SEWASIE Project ( International Conference on Teleworking for Business, Education, Research and e-Commerce - Vilnius - 19-23 October 2002) ( - Telebalt 2002 ) (IST Programme, European Commission Vilnius LTU ) - pp. da 102 a 112 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

SEWASIE is implementing an advanced search engine that provides intelligent access to heterogeneous data sources on the web via semantic enrichment. This can be thought of as the basis of structured, secure web-based communication. SEWASIE provides users with a search client that has an easy-to-use query interface, and which can extract the required information from the Internet and show it in a useful and user-friendly format. From an architectural point of view, the prototype will provide a search engine client, indexing servers and ontologies. There are many benefits to be had from such a system. Transaction costs will be reduced through efficient search and communication facilities. Within the business context, the system will support integrated searching and negotiating, which will promote the take-up of key technologies for SMEs and give them a competitive edge.

D. BENEVENTANO; S. BERGAMASCHI; M.FELICE; D. GAZZOTTI; G.GELATI; F. GUERRA; M. VINCINI ( 2002 ) - An Agent framework for Supporting the MIKS Integration Process ( WOA 2002: Dagli Oggetti agli Agenti - Milano, Italia - 18-19 November 2002) ( - WOA 2002: Dagli Oggetti agli Agenti. 3rd AI*IA/TABOO Joint Workshop "From Objects to Agents": From Information to Knowledge ) (Pitagora Editrice Bologna ITA ) - pp. da 35 a 41 ISBN: 9788837113636 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Providing integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In the past, companies have equipped themselves with data storing systems, building up informative systems containing data that are related to one another, but which are often redundant, not homogeneous and not always semantically consistent. Moreover, to meet the requirements of global, Internet-based information systems, it is important that the tools developed for supporting these activities are semi-automatic and as scalable as possible. To face the issues related to scalability in the large scale, in this paper we propose the exploitation of mobile agents in the information integration area and, in particular, the roles they play in enhancing the features of the MOMIS infrastructure. MOMIS (Mediator envirOnment for Multiple Information Sources) is a system that has been conceived as a pool of tools to provide integrated access to heterogeneous information stored in traditional databases (for example relational or object-oriented databases) or in file systems, as well as in semi-structured data sources (XML files). In this paper we describe the new agent-based framework for the integration process as implemented in the MIKS (Mediator agent for Integration of Knowledge Sources) system.

I. BENETTI; D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2002 ) - An information integration framework for E-commerce - IEEE INTELLIGENT SYSTEMS - n. volume 17 - pp. da 18 a 25 ISSN: 1541-1672 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The Web has transformed electronic information systems from single, isolated nodes into a worldwide network of information exchange and business transactions. In this context, companies have equipped themselves with high-capacity storage systems that contain data in several formats. The problems faced by these companies often emerge because the storage systems lack structural and application homogeneity in addition to a common ontology. The semantic differences generated by a lack of a consistent ontology can lead to conflicts that range from simple name contradictions (when companies use different names to indicate the same data concept) to structural incompatibilities (when companies use different models to represent the same information types). One of the main challenges for e-commerce infrastructure designers is information sharing and retrieving data from different sources to obtain an integrated view that can overcome any contradictions or redundancies. Virtual catalogs can help overcome this challenge because they act as instruments to retrieve information dynamically from multiple catalogs and present unified product data to customers. Instead of having to interact with multiple heterogeneous catalogs, customers can interact with a virtual catalog in a straightforward, uniform manner. This article presents a virtual catalog project called Momis (mediator environment for multiple information sources). Momis is a mediator-based system for information extraction and integration that works with structured and semistructured data sources. Momis includes a component called the SI-Designer for semiautomatically integrating the schemas of heterogeneous data sources, such as relational, object, XML, or semistructured sources. Starting from local source descriptions, the Global Schema Builder generates an integrated view of all data sources and expresses those views using XML.
Momis lets you use the infrastructure with other open integration information systems by simply interchanging XML data files. Momis creates the XML global schema in different stages, first by creating a common thesaurus of intra- and interschema relationships. Momis extracts the intraschema relationships by using inference techniques, then shares these relationships in the common thesaurus. After this initial phase, Momis enriches the common thesaurus with interschema relationships obtained using the lexical WordNet system (www.cogsci.princeton.edu/wn), which identifies the affinities between interschema concepts on the basis of their lexical meaning. Momis also enriches the common thesaurus using the Artemis system, which evaluates structural affinities among interschema concepts.
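The common-thesaurus stage, deriving interschema relationships from lexical affinity between schema terms, can be toy-modeled as follows. A tiny hand-coded synonym table stands in here for WordNet, and all schema and attribute names are hypothetical:

```python
# Toy model of the common-thesaurus step: derive SYN (synonym)
# relationships between the attribute names of two schemas from a
# lexical resource. A hand-coded table stands in for WordNet; every
# name below is a hypothetical example.

SYNONYMS = {                      # stand-in for WordNet synonym pairs
    frozenset({"price", "cost"}),
    frozenset({"client", "customer"}),
}

def lexical_relationships(schema_a, schema_b):
    """Return (term_a, 'SYN', term_b) triples for identical or
    lexically related attribute names across the two schemas."""
    rels = []
    for a in schema_a:
        for b in schema_b:
            if a == b or frozenset({a, b}) in SYNONYMS:
                rels.append((a, "SYN", b))
    return rels

rels = lexical_relationships(["name", "price", "client"],
                             ["name", "cost", "customer"])
```

The real system goes further, also mining intraschema relationships by inference and structural affinities (via Artemis), and then clusters affine classes into the global schema; this sketch only captures the lexical ingredient.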

G. Cabri; F. Guerra; M. Vincini; S. Bergamaschi; L. Leonardi; F. Zambonelli ( 2002 ) - MOMIS: Exploiting agents to support information integration - INTERNATIONAL JOURNAL OF COOPERATIVE INFORMATION SYSTEMS - n. volume 11 - pp. da 293 a 313 ISSN: 0218-8430 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The information overload introduced by the large amount of data spread over the Internet must be addressed appropriately. The dynamism and the uncertainty of the Internet, along with the heterogeneity of the sources of information, are the two main challenges for today's technologies related to information management. In the area of information integration, this paper proposes an approach based on mobile software agents integrated in the MOMIS (Mediator envirOnment for Multiple Information Sources) infrastructure, which enables semi-automatic information integration to deal with the integration and query of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The exploitation of mobile agents in MOMIS can significantly increase the flexibility of the system; their autonomy and adaptability are well suited to distributed, open environments such as the Internet. The aim of this paper is to show the advantages of introducing into the MOMIS infrastructure intelligent and mobile software agents for the autonomous management and coordination of integration and query processing over heterogeneous data sources.

S. BERGAMASCHI; F. GUERRA ( 2002 ) - Peer to Peer Paradigm for a Semantic Search Engine ( Agents and Peer-to-Peer Computing (AP2PC 2002) - Bologna, Italy - 15 July 2002) ( - Agents and Peer-to-Peer Computing, First International Workshop ) (Springer Heidelberg DEU ) - n. volume 2530 - pp. da 81 a 86 ISBN: 9783540405382 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper provides, firstly, a general description of the research project SEWASIE and, secondly, a proposal for an architectural evolution of the SEWASIE system toward the peer-to-peer paradigm. The SEWASIE project aims to design and implement an advanced search engine enabling intelligent access to heterogeneous data sources on the web using community-specific multilingual ontologies. After a presentation of the main features of the system, a preliminary proposal for this architectural evolution is given.

S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2002 ) - Product Classification Integration for E-Commerce ( Second International Workshop on Electronicy Business Hubs - WEBH - Aix En Provence, France - 2-6 September 2002) ( - DEXA Workshops ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 861 a 867 ISBN: 9780769516684 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A marketplace is the place where the demand and supply of buyers and vendors participating in a business process may meet. Electronic marketplaces are therefore virtual communities in which buyers may meet proposals from several suppliers and make the best choice. In the electronic commerce world, the comparison of different products is hindered by the lack of a single standard (or rather, by the proliferation of standards) for describing and classifying them. B2B and B2C marketplaces therefore need to reclassify products and goods according to different standardization models. This paper addresses this problem by proposing a semi-automatic methodology to define a mapping among different e-commerce product classification standards. This methodology is an extension of the MOMIS system, a mediator system developed within the Intelligent Integration of Information research area.

BERGAMASCHI S; BENEVENTANO D; CASTANO S; DE ANTONELLIS V; FERRARA A; GUERRA F; F. MANDREOLI; ORNETTI G. C; VINCINI M ( 2002 ) - Semantic Integration and Query Optimization of Heterogeneous Data Sources ( 1st OOIS Workshop on Efficient Web-based Information Systems (EWIS 2002) - Montpellier, France - September 2, 2002) ( - Advances in Object-Oriented Information Systems ) (Springer Heidelberg DEU ) - n. volume 2426 - pp. da 154 a 165 ISBN: 9783540440888 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In modern Internet/Intranet-based architectures, an increasing number of applications requires an integrated and uniform access to a multitude of heterogeneous and distributed data sources. In this paper, we describe the ARTEMIS/MOMIS system for the semantic integration and query optimization of heterogeneous structured and semistructured data sources.

S. Bergamaschi; A. Tavernari; M. Lenzerini; M. Jarke; E.Franconi; T. Burwick; G. Vetere; A. Becks ( 2002 ) - SEmantic Webs and AgentS in Integrated Economies [Altro (298) - Partecipazione a progetti di ricerca]
Abstract

- Develop an agent-based, secure, scalable and distributed system architecture for semantic search (ontology-based) and for structured web-based communication (for electronic negotiation).
- Develop a general framework responsible for the implementation of the semantic enrichment processes leading to semantically enriched virtual data stores that constitute the information nodes accessible by the users. The created ontology must have a multilingual interface, based on a logical layer and coded using widespread W3C standards.
- Develop a general framework for query management and information reconciliation taking into account the semantically enriched data stores. First, commonalities among queries have to be detected; then the relevant virtual data stores responsible for answering parts of the queries are determined and the queries split accordingly. Finally, the sub-answers have to be combined in order to provide the user with an overall answer to the original query.
- Develop an information-brokering component that includes methods for collecting, contextualising and visualising semantically rich data. To obtain these results, intelligent information filtering and knowledge guidance services have to be developed on the basis of semantic web technologies. Structured data has to be linked to semi- or unstructured data via ontologies. The collected data has to be visualised to show related documents and search result contexts for the purpose of financial control.
- Develop structured communication processes that enable the use of ontologies. The communication tool enables structured negotiation support for human negotiators engaging in business-to-business electronic commerce and employing intelligent software agents for some routine communication tasks.
- Develop end-user interfaces for both the semantic design and the query management. The former is a tool supporting the design, management, and storage of the semantic information associated with virtual data stores, together with a conceptual modelling methodology associated with the devised data model. The latter is a tool for end-user query management and intelligent navigation exploiting the semantic information associated with virtual data stores and with the global virtual view.

D. BENEVENTANO; S. BERGAMASCHI; D. BIANCO; F. GUERRA; M. VINCINI ( 2002 ) - SI-Web: a Web based interface for the MOMIS project ( Sistemi Evoluti per Basi di Dati (SEBD 2002) - Portoferraio, Italy - 19-21 June 2002) ( - Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD 2002) ) (Paolo Ciaccia, Fausto Rabitti, Giovanni Soda Portoferraio ITA ) - pp. da 407 a 411 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The MOMIS project (Mediator envirOnment for Multiple Information Sources), developed over the past years, allows the integration of data from structured and semi-structured data sources. SI-Designer (Source Integrator Designer) is a designer support tool implemented within the MOMIS project for the semi-automatic integration of heterogeneous source schemata. It is a Java application whose modules are available as CORBA Objects and interact using established IDL interfaces. The goal of this demonstration is to present a new tool, SI-Web (Source Integrator on Web), which offers the same features as SI-Designer but with the great advantage of being usable on the Internet through a web browser.

D. BENEVENTANO; S. BERGAMASCHI; D. GAZZOTTI; G.GELATI; F. GUERRA; M. VINCINI ( 2002 ) - The WINK Project for Virtual Enterprise Networking and Integration ( Sistemi Evoluti per Basi di Dati (SEBD 2002) - Portoferraio, Italy - 19-21 June 2002) ( - Convegno Nazionale Sistemi di Basi di Dati Evolute (SEBD2002) ) (Paolo Ciaccia, FAusto Rabitti, Giovanni Soda Portoferraio ITA ) - pp. da 283 a 290 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

To stay competitive (or sometimes simply to stay) on the market, companies and manufacturers more and more often have to join forces to survive and possibly flourish. Among other solutions, the last decade has seen the growth and spread of an original business model called the Virtual Enterprise. To manage a Virtual Enterprise, modern information systems have to tackle technological issues such as networking, integration and cooperation. The WINK project, born from the partnership between the University of Modena and Reggio Emilia and Gruppo Formula, addresses these problems. The ultimate goal is to design, implement and finally test on a pilot case (provided by Alenia) the WINK system, a combination of two existing and promising software systems (the WHALES and MIKS systems), to meet the Virtual Enterprise requirements for data integration, cooperation and management planning.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2001 ) - Exploiting extensional knowledge for query reformulation and object fusion in a data integration system ( Sistemi Evoluti per Basi di Dati (SEBD 2001) - Venezia, Italy - 27-29 Giugno 2001) ( - Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD 2001) ) (Augusto Celentano, Letizia Tanca, Paolo Tiberio Venezia ITA ) - pp. da 257 a 272 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Query processing in global information systems integrating multiple heterogeneous sources is a challenging issue in relation to the effective extraction of information available on-line. In this paper we propose intelligent, tool-supported techniques for querying global information systems integrating both structured and semistructured data sources. The techniques have been developed in the environment of a data integration, wrapper/mediator based system, MOMIS, and try to achieve two main goals: optimized query reformulation w.r.t. local sources and object fusion, i.e. grouping together information (from the same or different sources) about the same real-world entity. The developed techniques rely on the availability of integration knowledge, i.e. local source schemata, a virtual mediated schema and its mapping descriptions, that is, semantic mappings w.r.t. the underlying sources both at the intensional and extensional level. Mapping descriptions, obtained as a result of the semi-automatic integration process of multiple heterogeneous sources developed for the MOMIS system, include, unlike previous data integration proposals, extensional intra/interschema knowledge. Extensional knowledge is exploited to detect extensionally overlapping classes and to discover implicit join criteria among classes, which enables the goals of optimized query reformulation and object fusion to be achieved. The techniques have been implemented in the MOMIS system but can be applied, in general, to data integration systems including extensional intra/interschema knowledge in mapping descriptions.

D. Beneventano; S. Bergamaschi; F. Mandreoli ( 2001 ) - Extensional Knowledge for semantic query optimization in a mediator based system ( International Workshop on Foundations of Models for Information Integration - Viterbo - 16-18 Semptember) ( - International Workshop on Foundations of Models for Information Integration ) (10th workshop in the series Foundations of Models and Languages for Data and Objects (FMLDO) VITERBO ITA ) - pp. da 1 a 15 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Query processing in global information systems integrating multiple heterogeneous sources is a challenging issue in relation to the effective extraction of information available on-line. In this paper we propose intelligent, tool-supported techniques for querying global information systems integrating both structured and semistructured data sources. The techniques have been developed in the environment of a data integration, wrapper/mediator based system, MOMIS, and try to achieve the goal of optimized query reformulation w.r.t. local sources. The developed techniques rely on the availability of integration knowledge whose semantics is expressed in terms of description logics. Integration knowledge includes local source schemata, a virtual mediated schema and its mapping descriptions, that is, semantic mappings w.r.t. the underlying sources both at the intensional and extensional level. Mapping descriptions, obtained as a result of the semi-automatic integration process of multiple heterogeneous sources developed for the MOMIS system, include, unlike previous data integration proposals, extensional intra/interschema knowledge. Extensional knowledge is exploited to perform semantic query optimization in a mediator based system, as it makes it possible to devise an optimized query reformulation method. The techniques are under development in the MOMIS system but can be applied, in general, to data integration systems including extensional intra/interschema knowledge in mapping descriptions.

BERGAMASCHI S.; CASTANO S.; VINCINI M.; BENEVENTANO D. ( 2001 ) - Semantic Integration of Heterogeneous Information Sources - DATA & KNOWLEDGE ENGINEERING - n. volume 36 - pp. da 215 a 249 ISSN: 0169-023X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Developing intelligent tools for the integration of information extracted from multiple heterogeneous sources is a challenging issue to effectively exploit the numerous sources available on-line in global information systems. In this paper, we propose intelligent, tool-supported techniques for information extraction and integration from both structured and semistructured data sources. An object-oriented language, with an underlying Description Logic, called ODLI3, derived from the standard ODMG, is introduced for information extraction. ODLI3 descriptions of the source schemas are exploited first to set a Common Thesaurus for the sources. Information integration is then performed in a semiautomatic way by exploiting the knowledge in the Common Thesaurus and ODLI3 descriptions of source schemas with a combination of clustering techniques and Description Logics. This integration process gives rise to a virtual integrated view of the underlying sources for which mapping rules and integrity constraints are specified to handle heterogeneity. Integration techniques described in the paper are provided in the framework of the MOMIS system based on a conventional wrapper/mediator architecture.
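The clustering step described in this abstract can be illustrated with a minimal sketch. This is not the MOMIS/ARTEMIS code: the single-link, union-find style grouping, the class names, and the affinity values are all assumptions made up for the example; the real system derives affinities from the Common Thesaurus and Description Logics reasoning.

```python
# Minimal single-link clustering of schema classes by pairwise affinity
# (illustrative sketch only; not the MOMIS implementation).
def cluster(classes, affinities, threshold=0.5):
    """Merge classes whose pairwise affinity meets the threshold (union-find)."""
    parent = {c: c for c in classes}

    def find(x):
        # Follow parent links to the root, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in affinities.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for c in classes:
        groups.setdefault(find(c), []).append(c)
    return list(groups.values())

classes = ["S1.Person", "S2.Client", "S1.Order"]
affinities = {("S1.Person", "S2.Client"): 0.8, ("S1.Person", "S1.Order"): 0.1}
print(cluster(classes, affinities))
# [['S1.Person', 'S2.Client'], ['S1.Order']]
```

Each resulting group would correspond to one candidate global class in the virtual integrated view, for which mapping rules toward the local sources are then specified.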

D. BENEVENTANO; S. BERGAMASCHI; I. BENETTI; A. CORNI; F. GUERRA; G. MALVEZZI ( 2001 ) - SI-Designer: a tool for intelligent integration of information ( Hawaii International Conference on System Sciences - Hawaii - 3-6 January 2001) ( - Hawaii International Conference on System Sciences (HICSS-34) ) (IEEE Computer Society Los Alamitos, California USA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

SI-Designer (Source Integrator Designer) is a designer support tool for semi-automatic integration of heterogeneous source schemata (relational, object and semistructured sources); it has been implemented within the MOMIS project and it carries out integration following a semantic approach which uses intelligent Description Logics-based techniques, clustering techniques and an extended ODMG-ODL language, ODL-I3, to represent the extracted and integrated schemata. Starting from the sources' ODL-I3 descriptions (local schemata), SI-Designer supports the designer in the creation of an integrated view of all the sources (global schema), which is expressed in the same ODL-I3 language. We propose SI-Designer as a tool to build virtual catalogs in the E-Commerce environment.

I. BENETTI; D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2001 ) - SI-Designer: an Integration Framework for E-Commerce ( E-Business & the Intelligent Web - Seattle, USA - August 5 2001) ( - IJCAI*01 Workshop on E-Business & the Intelligent Web ) (Proceedings informali pubblicati in rete http://www.csd.abdn.ac.uk/~apreece/ebiweb/programme.html Seattle USA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Electronic commerce lets people purchase goods and exchange information on business transactions on-line. Therefore, one of the main challenges for the designers of e-commerce infrastructures is information sharing: retrieving data located in different sources to obtain an integrated view that overcomes any contradiction or redundancy. Virtual Catalogs synthesize this approach, as they are conceived as instruments to dynamically retrieve information from multiple catalogs and present product data in a unified manner, without directly storing product data from the catalogs. In this paper we propose SI-Designer, a support tool for the integration of data from structured and semi-structured data sources, developed within the MOMIS (Mediator environment for Multiple Information Sources) project.

S. BERGAMASCHI; G. CABRI; F. GUERRA; L. LEONARDI; M. VINCINI; F. ZAMBONELLI ( 2001 ) - Supporting information integration with autonomous agents ( Cooperative Information Agents (CIA 2001) - Modena, Italy - 6-8 Settembre 2001) ( - Cooperative Information Agents V, 5th International Workshop ) - LECTURE NOTES IN COMPUTER SCIENCE - n. volume 2182 - pp. da 88 a 99 ISBN: 9783540425458 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The large amount of information that is spread over the Internet is an important resource for all people but also introduces some issues that must be faced. The dynamism and the uncertainty of the Internet, along with the heterogeneity of the sources of information, are the two main challenges for today's technologies. This paper proposes an approach based on mobile agents integrated in an information integration infrastructure. Mobile agents can significantly improve the design and the development of Internet applications thanks to their characteristics of autonomy and adaptability to open and distributed environments, such as the Internet. MOMIS (Mediator envirOnment for Multiple Information Sources) is an infrastructure for semi-automatic information integration that deals with the integration and query of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The aim of this paper is to show the advantage of introducing into the MOMIS infrastructure intelligent and mobile software agents for the autonomous management and coordination of the integration and query processes over heterogeneous sources.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2001 ) - The Momis approach to Information Integration ( International Conference on Enterprise Information Systems (ICEI 01) - Setubal, Portugal - 7-10 July 2001) ( - Third International Conference on Enterprise Information Systems ) (ICEIS Press Setubal PRT ) - n. volume 1 - pp. da 194 a 198 ISBN: 9789729805028 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The web explosion, both at Internet and intranet level, has transformed electronic information systems from single isolated nodes into entry points to a worldwide network of information exchange and business transactions. Business and commerce have taken the opportunity of the new technologies to define the e-commerce activity. Therefore, one of the main challenges for the designers of e-commerce infrastructures is information sharing: retrieving data located in different sources to obtain an integrated view that overcomes any contradiction or redundancy. Virtual Catalogs synthesize this approach, as they are conceived as instruments to dynamically retrieve information from multiple catalogs and present product data in a unified manner, without directly storing product data from the catalogs. Customers, instead of having to interact with multiple heterogeneous catalogs, can interact in a uniform way with a virtual catalog. In this paper we propose a designer support tool, called SI-Designer, for information integration, developed within the MOMIS project. The MOMIS project (Mediator environment for Multiple Information Sources) aims to integrate data from structured and semi-structured data sources.

S. BERGAMASCHI ( 2000 ) - Bologna, European City of Culture [Esposizione (290) - Esposizione]
Abstract

MOMIS Poster for Bologna, European City of Culture

D. BENEVENTANO; BERGAMASCHI S.; S. CASTANO; A. CORNI; R. GUIDETTI; G. MALVEZZI; M. MELCHIORI; M. VINCINI ( 2000 ) - Creazione di una vista globale d'impresa con il sistema MOMIS basato su Description Logics ( VII Convegno Associazione Italiana per l'Intelligenza Artificiale - Milano - 13-15 Settembre) ( - VII Convegno Associazione Italiana per l'Intelligenza Artificiale ) (AI*IA, Associazione Italiana Intelligenza Artificiale MILANO ITA ) - pp. da 20 a 30 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Developing intelligent tools for the integration of information coming from heterogeneous sources within an enterprise is a topic of strong research interest. In this article we propose techniques, based on intelligent tools, for the extraction and integration of information from structured and semistructured sources, provided by the MOMIS system. To describe the sources we present and use the object-oriented language ODLI3, derived from the ODMG standard. The sources described in ODLI3 are processed to create a thesaurus of the information shared among the sources. Source integration is then performed semi-automatically, processing the information describing the sources with techniques based on Description Logics and clustering techniques, generating a Global Schema that provides a virtual integrated view of the sources.

D. Beneventano; S. Bergamaschi; A. Corni; M. Vincini ( 2000 ) - Creazione di una vista globale d'impresa con il sistema MOMIS basato su Description Logics - AIIA NOTIZIE - n. volume 2 - pp. da 10 a 23 [Articolo in rivista (262) - Articolo su rivista]
Abstract

-

Domenico Beneventano; Sonia Bergamaschi; Claudio Sartori ( 2000 ) - Fondamenti di Informatica (Progetto Leonardo Bologna ITA ) - pp. da 1 a 305 ISBN: non disponibile [Monografia o trattato scientifico (276) - Monografia/Trattato scientifico]
Abstract

A textbook on the fundamentals of computer programming, with the particular objective of developing a rigorous method for solving different classes of problems. Particular emphasis is placed on the fundamental constructs and on the possibility of building solutions based on software reuse.

D. Beneventano; S. Bergamaschi; S. Castano; A. Corni; G. Guidetti; M. Malvezzi; M. Melchiori; Vincini ( 2000 ) - Information integration - the MOMIS project demonstration ( VLDB 2000 - Cairo, Egypt - September 10-14, 2000) ( - VLDB 2000, Proceedings of 26th International Conference on Very Large Data Bases, September 10-14, 2000, Cairo, Egypt ) (Amr El Abbadi and Michael L. Brodie and Sharma Chakravarthy and Umeshwar Dayal and Nabil Kamel and Gunter Schlageter and Kyu-Young Whang San Francisco USA ) - n. volume 1 - pp. da 611 a 614 ISBN: 9781558607156 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The goal of this demonstration is to present the main features of a Mediator component, the Global Schema Builder, of an I3 system called MOMIS (Mediator envirOnment for Multiple Information Sources). MOMIS has been conceived to provide integrated access to heterogeneous information stored in traditional databases (e.g., relational, object-oriented) or file systems, as well as in semistructured sources. The demonstration is based on the integration of two simple sources of different kinds, structured and semi-structured.

D. BENEVENTANO; S. BERGAMASCHI; A. CORNI; M. VINCINI ( 2000 ) - MOMIS: un sistema di Description Logics per l'integrazione del sistema informativo d'impresa ( XXXVIII Congresso Annuale AICA2000 - Taormina - 27-30 Settembre) ( - XXXVIII Congresso Annuale AICA2000 ) (AICA Taormina ITA ) - pp. da 1 a 12 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Taormina

S. BERGAMASCHI; FAUSTO RABITTI ( 2000 ) - Ontology based access to digital libraries ( - Workshop on Semantic Web Technologies ) [Esposizione (290) - Esposizione]
Abstract

Luxembourg. (Invited Talk)

D. BENEVENTANO; S. BERGAMASCHI; A. CORNI; R. GUIDETTI; G. MALVEZZI ( 2000 ) - SI-DESIGNER: un tool di ausilio all'integrazione intelligente di sorgenti di informazione ( Ottavo Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD 2000) - L'Aquila, Palazzo dell'Emiciclo - 26-28 Giugno 2000) ( - Convegno Nazionale Sistemi di Basi di Dati Evolute (SEBD2000) ) (Convegno Nazionale Sistemi di Basi di Dati Evolute. L'Aquila ITA ) - pp. da 123 a 137 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

SI-Designer (Source Integrator Designer) is a tool that supports the designer in the semi-automatic integration of heterogeneous source schemata (relational, object and semistructured). Developed within the MOMIS project, SI-Designer carries out the integration following a semantic approach that uses intelligent techniques based on the OLCD Description Logics, clustering techniques and an object-oriented language, ODLI3, derived from the ODMG standard, to represent the extracted and integrated information. Starting from the ODLI3 descriptions of the sources (the local schemata), SI-Designer assists the designer in creating an integrated view of all the sources (the global schema), also expressed in the ODLI3 language.

S. BERGAMASCHI ( 2000 ) - Tecnologie database ed Integrazione di Dati nel Commercio Elettronico ( - Giornata di Studio su Economia Virtuale e Opportunita' Reali ) [Esposizione (290) - Esposizione]
Abstract

Vicenza. (Invited Talk)

S. BERGAMASCHI; S. CASTANO; C. SARTORI; P. TIBERIO; M. VINCINI ( 1999 ) - Distributed Database Support for Data-Intensive Workflow Application ( Workshop on Enterprise Management and Resource Planning: Methods,Tools and Arch. - Venice, Italy - 25-27 Novembre) ( - Workshop on Enterprise Management and Resource Planning: Methods,Tools and Arch. ) (Telecom Italia VENEZIA ITA ) - pp. da 271 a 282 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Venice, Italy

D. BENEVENTANO; S. BERGAMASCHI ( 1999 ) - Integration of information from multiple sources of textual data. ( - Intelligent Information Agents ) (SPRINGER Heidelberg DEU ) - pp. da 53 a 77 ISBN: 9783540651123 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

The chapter presents two ongoing projects towards the intelligent integration of information: TSIMMIS (The Stanford-IBM Manager of Multiple Information Sources), which adopts a structural approach, and MOMIS (Mediator environment for Multiple Information Sources), which adopts a semantic approach. Both projects focus on mediator-based information systems. The chapter describes the architecture of a wrapper and how to generate a mediator agent in TSIMMIS. Wrapper agents in TSIMMIS extract information from a textual source and convert local data into a common data model; the mediator is an integration and refinement tool for the data provided by the wrapper agents. In the second project, MOMIS, a conceptual schema for each source is provided, adopting a common standard model and language. The MOMIS approach uses a description logic, or concept language, for knowledge representation to obtain a semiautomatic generation of a common thesaurus. Clustering techniques are used to build the unified schema, i.e. the unified view of the data to be used for query processing in distributed, heterogeneous and autonomous databases by a mediator.

S. BERGAMASCHI; S. CASTANO; M. VINCINI; D. BENEVENTANO ( 1999 ) - Intelligent Techniques for the Extraction and Integration of Heterogeneous Information ( IJCAI 1999 Workshop: Intelligent Information Integration - Stockholm - July 1999) ( - IJCAI 1999 Workshop: Intelligent Information Integration ) (CEUR Workshop Proceedings Aachen DEU ) - pp. da 109 a 129 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Developing intelligent tools for the integration of information extracted from multiple heterogeneous sources is a challenging issue to effectively exploit the numerous sources available on-line in global information systems. In this paper, we propose intelligent, tool-supported techniques for information extraction and integration which take into account both structured and semistructured data sources. An object-oriented language called ODLI3, derived from the standard ODMG, with an underlying Description Logics, is introduced for information extraction. ODLI3 descriptions of the information sources are exploited first to set a shared vocabulary for the sources. Information integration is performed in a semi-automatic way, by exploiting ODLI3 descriptions of source schemas with a combination of Description Logics and clustering techniques. Techniques described in the paper have been implemented in the MOMIS system, based on a conventional mediator architecture.

S. BERGAMASCHI; D. BENEVENTANO; F. SGARBI; M. VINCINI ( 1999 ) - ODL-Designer UNISQL: Un'Interfaccia per la Specifica Dichiarativa di Vincoli di Integrità in OODBMS ( Convegno Nazionale Sistemi di Basi di Dati Evolute (SEBD99) - COMO - Giugno 1999) ( - Convegno Nazionale Sistemi di Basi di Dati Evolute (SEBD99) ) (Silava Castano VERONA ITA ) - pp. da 241 a 255 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The specification and handling of integrity constraints is a fundamental research topic in databases; indeed, constraints often constitute the most burdensome part of developing real DBMS-based applications. The main objective of the ODL-Designer UNISQL software component presented in this work is to allow the database designer to express integrity constraints through a declarative language, thus overcoming the approach of current OODBMSs, which allow them to be expressed only through procedures (methods and triggers). ODL-Designer UNISQL acquires declarative constraints and automatically generates, transparently to the designer, the "procedures" that implement those constraints. The language supported by ODL-Designer UNISQL is the ODL-ODMG standard, suitably extended to express integrity constraints, while the commercial OODBMS used is UNISQL.

S. BERGAMASCHI; S. CASTANO; M. VINCINI ( 1999 ) - Semantic Integration of Semistructured and Structured Data Sources - SIGMOD RECORD - n. volume 28 (1) - pp. da 54 a 59 ISSN: 0163-5808 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Providing integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In this context, two fundamental problems arise. First, how to determine if the sources contain semantically related information, that is, information related to the same or similar real-world concept(s). Second, how to handle semantic heterogeneity to support integration and uniform query interfaces. Complicating factors with respect to conventional view integration techniques are related to the fact that the sources to be integrated already exist and that semantic heterogeneity occurs on a large scale, involving terminology, structure, and context of the involved sources, with respect to geographical, organizational, and functional aspects related to information use. Moreover, to meet the requirements of global, Internet-based information systems, it is important that tools developed for supporting these activities are semi-automatic and scalable as much as possible. The goal of this paper is to describe the MOMIS [4, 5] (Mediator envirOnment for Multiple Information Sources) approach to the integration and query of multiple, heterogeneous information sources, containing structured and semistructured data. MOMIS has been conceived as a joint collaboration between the Universities of Milano and Modena in the framework of the INTERDATA national research project, aiming at providing methods and tools for data management in Internet-based information systems. Like other integration projects [1, 10, 14], MOMIS follows a "semantic approach" to information integration based on the conceptual schema, or metadata, of the information sources, and on the following architectural elements: i) a common object-oriented data model, defined according to the ODLI3 language, to describe source schemas for integration purposes.
The data model and ODLI3 have been defined in MOMIS as a subset of the ODMG-93 ones, following the proposal for a standard mediator language developed by the I3/POB working group [7]. In addition, ODLI3 introduces new constructors to support the semantic integration process [4, 5]; ii) one or more wrappers, to translate schema descriptions into the common ODLI3 representation; iii) a mediator and a query-processing component, based on two pre-existing tools, namely ARTEMIS [8] and ODB-Tools [3] (available on Internet at http://sparc20.dsi.unimo.it/), to provide an I3 architecture for integration and query optimization. In this paper, we focus on capturing and reasoning about semantic aspects of schema descriptions of heterogeneous information sources for supporting integration and query optimization. Both semistructured and structured data sources are taken into account [5]. A Common Thesaurus is constructed, which has the role of a shared ontology for the information sources. The Common Thesaurus is built by analyzing ODLI3 descriptions of the sources, by exploiting the Description Logics OLCD (Object Language with Complements allowing Descriptive cycles) [2, 6], derived from the KL-ONE family [17]. The knowledge in the Common Thesaurus is then exploited for the identification of semantically related information in ODLI3 descriptions of different sources and for their integration at the global level. Mapping rules and integrity constraints are defined at the global level to express the relationships holding between the integrated description and the source descriptions. ODB-Tools, supporting OLCD and description logic inference techniques, allows the analysis of source descriptions for generating a consistent Common Thesaurus and provides support for semantic optimization of queries at the global level, based on defined mapping rules and integrity constraints.

S. BERGAMASCHI; S. CASTANO; S. DE CAPITANI DE VIMERCATI; S. MONTANARI; M. VINCINI ( 1998 ) - An Intelligent Approach to Information Integration ( International Conference on Formal Ontology in Information Systems (FOIS98) - Trento - June 1998) ( - 1st Conference on Formal Ontology in Information Systems (FOIS '98) / 6th International Conference on Principles of Knowledge Representation and Reasoning (KR '98) ) (IOS-Press (Amsterdam) Amsterdam ITA ) - n. volume 46 - pp. da 253 a 268 ISBN: 9789051993998 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

FORMAL ONTOLOGY IN INFORMATION SYSTEMS Book Series: FRONTIERS IN ARTIFICIAL INTELLIGENCE AND APPLICATIONS Volume: 46 Pages: 253-268

S. BERGAMASCHI; C. SARTORI ( 1998 ) - Chrono: a conceptual design framework for temporal entities ( Conceptual Modeling (ER98) - Singapore - 16-19November 1998) ( - IEEE 17th International Conference on Conceptual Modeling (ER'98) ) (Springer Berlin DEU ) - pp. da 35 a 50 ISBN: 3540651896 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Database applications are frequently faced with the necessity of representing time varying information and, particularly in the management of information systems, a few kinds of behavior in time can characterize a wide class of applications. A great amount of work in the area of temporal databases, aiming at the definition of standard representation and manipulation of time, mainly in relational database environments, has been presented in recent years. Nevertheless, conceptual design of databases with temporal aspects has not yet received sufficient attention. The purpose of this paper is twofold: to propose a simple temporal treatment of information at the initial conceptual phase of database design; and to show how the chosen temporal treatment can be exploited in time integrity enforcement by using standard DBMS tools, such as referential integrity and triggers. Furthermore, we present a design tool implementing our data model and constraint generation technique, obtained by extending a commercial design tool.

D. Beneventano; S. Bergamaschi; S. Lodi; C. Sartori ( 1998 ) - Consistency checking in complex object database schemata with integrity constraints - IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING - n. volume 10 (4) - pp. da 576 a 598 ISSN: 1041-4347 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Integrity constraints are rules that should guarantee the integrity of a database. Provided an adequate mechanism to express them is available, the following question arises: Is there any way to populate a database which satisfies the constraints supplied by a database designer? That is, does the database schema, including constraints, admit at least a nonempty model? This work answers the above question in a complex object database environment, providing a theoretical framework including the following ingredients: 1) two alternative formalisms, able to express a relevant set of state integrity constraints with a declarative style; 2) two specialized reasoners, based on the tableaux calculus, able to check the consistency of complex object database schemata expressed with the two formalisms. The proposed formalisms share a common kernel, which supports complex objects and object identifiers, and which allows the expression of acyclic descriptions of classes, nested relations and views, built up by means of the recursive use of record, quantified set, and object type constructors and by the intersection, union, and complement operators. Furthermore, the kernel formalism allows the declarative formulation of typing constraints and integrity rules. In order to improve the expressiveness and maintain the decidability of the reasoning activities, we extend the kernel formalism in two alternative directions. The first formalism, OLCP, introduces the capability of expressing path relations. Because cyclic schemas are extremely useful, we introduce a second formalism, OLCD, with the capability of expressing cyclic descriptions but disallowing the expression of path relations. In fact, we show that the reasoning activity in OLCDP (i.e., OLCP with cycles) is undecidable.
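The schema-consistency question posed in this abstract (does the schema, with its constraints, admit a nonempty model?) can be illustrated on a drastically simplified setting. The sketch below only checks whether interval constraints on attributes are jointly satisfiable; it is a toy illustration, not the tableaux-based reasoners the paper develops, and all names are hypothetical.

```python
# Toy consistency check: a class is "coherent" only if some value can
# satisfy every range constraint placed on each of its attributes.
# (Illustrative only; the paper's formalisms and reasoners are far richer.)

def consistent(constraints):
    """constraints: dict mapping attribute -> list of (low, high) closed
    intervals, all of which must hold. The conjunction is satisfiable iff,
    per attribute, the intervals have a nonempty intersection."""
    for attr, intervals in constraints.items():
        low = max(l for l, _ in intervals)
        high = min(h for _, h in intervals)
        if low > high:
            return False  # incoherent: this attribute admits no value
    return True

# A class constrained to salary <= 40 and salary >= 50 has an empty extension:
print(consistent({"salary": [(0, 40), (50, 100)]}))  # False
print(consistent({"age": [(18, 65), (30, 99)]}))     # True
```

A reasoner of this kind lets a design tool reject an inconsistent schema before any data is loaded, which is the practical motivation the abstract describes.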

S. BERGAMASCHI; S. DE CAPITANI DE VIMERCATI; S. MONTANARI; M. VINCINI ( 1998 ) - Exploiting Schema Knowledge for the Integration of Heterogeneous Sources ( Convegno Nazionale Sistemi di Basi di Dati Evolute (SEBD98) - Ancona - June 1999) ( - Convegno Nazionale Sistemi di Basi di Dati Evolute (SEBD98) ) (Maurizio Panti Ancona ITA ) - pp. da 103 a 120 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Ancona, Italy

S. BERGAMASCHI; M.Vincini; D.Beneventano ( 1997 ) - A semantics-driven query optimizer for OODBs ( 13th International Conference on Data Engineering (ICDE'97) - Birmingham, UK - Aprile) ( - 13th International Conference on Data Engineering (ICDE'97) ) (IEEE los alamitos, CA USA ) - pp. da 578 a 582 ISBN: 0818678070 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

ODB-QOptimizer is an ODMG-93 compliant tool for schema validation and semantic query optimization. The approach is based on two fundamental ingredients. The first is the OCDL description logics (DLs), proposed as a common formalism to express class descriptions, a relevant set of integrity constraint rules (IC rules), and queries. The second is DL inference techniques, exploited to evaluate the logical implications expressed by IC rules and thus to produce the semantic expansion of a given query.

S. BERGAMASCHI; C. SARTORI ( 1997 ) - An Approach for the Extraction of Information from Heterogeneous Sources of Textual Data ( CIA 97 - kiel, germany - February) ( - Cooperative Information Agents, First International Workshop, CIA' 97 ) (Springer heildelberg DEU ) - pp. da 42 a 63 ISBN: 3540625917 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

CEUR Workshop Proceedings. Atene

S. Bergamaschi; D. Beneventano ( 1997 ) - Incoherence and Subsumption for recursive views and queries in Object-Oriented Data Models - DATA & KNOWLEDGE ENGINEERING - n. volume 21 (3) - pp. da 217 a 252 ISSN: 0169-023X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Elsevier Science B.V. (North- Holland)

Sonia Bergamaschi; Alessandra Garuti; Claudio Sartori; Alberto Venuta ( 1997 ) - Object Wrapper: An Object-Oriented Interface for Relational Databases ( EUROMICRO Conference '97 - Budapest - 1-4 September 1997) ( - 23rd EUROMICRO Conference '97, New Frontiers of Information Technology ) (IEEE Computer Society Los Alamitos, CA USA ) - pp. da 41 a 46 ISBN: 0818681292 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Most commercial applications have to cope with a large number of stored object instances and have data shared among many users and applications. For object-oriented as well as conventional application development, RDBMS technology is currently used in most cases. We describe a software module called Object Wrapper for storing and retrieving objects in an RDBMS. Having these capabilities in a separate component helps to isolate data management system dependencies and hence contributes to portable applications.

S. BERGAMASCHI; A. GARUTI; C. SARTORI; A. VENUTA ( 1997 ) - Object Wrapper: an Object-Oriented Interface for Relational Databases ( 3rd EUROMICRO Conference '97 New Frontiers of Information Technology, 1997 - Budapest - Settembre) ( - IEEE International Conference Euromicro'97 ) (IEEE los alamitos, CA USA ) - pp. da 41 a 45 ISBN: 0818681292 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Most commercial applications have to cope with a large number of stored object instances and have data shared among many users and applications. For object-oriented as well as conventional application development, RDBMS technology is currently used in most cases. We describe a software module called Object Wrapper for storing and retrieving objects in an RDBMS. Having these capabilities in a separate component helps to isolate data management system dependencies and hence contributes to portable applications.

D. Beneventano; S. Bergamaschi; C. Sartori; M. Vincini ( 1997 ) - ODB-QOptimizer: a tool for semantic query optimization in OODB [Software (296) - Software]
Abstract

ODB-QOptimizer is an ODMG-93 compliant tool for schema validation and semantic query optimization. The approach is based on two fundamental ingredients. The first is the OCDL description logics (DLs), proposed as a common formalism to express class descriptions, a relevant set of integrity constraint rules (IC rules), and queries. The second is DL inference techniques, exploited to evaluate the logical implications expressed by IC rules and thus to produce the semantic expansion of a given query.

D.BENEVENTANO; S. BERGAMASCHI; C. SARTORI; M. VINCINI ( 1997 ) - ODB-QOptimizer: a tool for semantic query optimization in OODB ( IEEE Int. Conference on Data Engineering ICDE97 - Birmingham, UK - Aprile) ( - IEEE Int. Conference on Data Engineering ICDE97 ) (W. A. Gray and Per Ake Larson Birmingham GBR ) - pp. da 578 a 579 ISBN: 9780818678073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Birmingham, UK

D. BENEVENTANO; S. BERGAMASCHI; C. SARTORI; M. VINCINI ( 1997 ) - ODB-Tools: a description logics based tool for schema validation and semantic query optimization in Object Oriented Databases ( Fifth Conference of the Italian Association for Artificial Intelligence(AI*IA97) - Roma - September, 1997) ( - Fifth Conference of the Italian Association for Artificial Intelligence(AI*IA97) ) (AI*IA, Associazione Intelligenza Artificiale ROMA ITA ) - pp. da 128 a 145 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

LNAI 1321. Roma

D. BENEVENTANO; S. BERGAMASCHI; A. GARUTI; M. VINCINI; C.SARTORI ( 1996 ) - ODB- Reasoner: un ambiente per la verifica di schemi e l’ottimizzazione di interrogazioni in OODB ( Convegno su Sistemi Evoluti per Basi di Dati (SEBD96) - San Miniato - Luglio 1996) ( - Convegno su Sistemi Evoluti per Basi di Dati (SEBD96) ) (Fausto Rabitti et al. PISA ITA ) - pp. da 181 a 200 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

S. Miniato. Proceedings edited by Fausto Rabitti et al.

D. BENEVENTANO; S. BERGAMASCHI; C.SARTORI ( 1996 ) - Scoperta di regole per l’ottimizzazione semantica delle interrogazioni ( Cibernetica e Machine Learning - Napoli - 26-28 Settembre 1996) ( - Giornate di Lavoro AI*IA su Accesso, Estrazione ed Integrazione di conoscenza ) (Associazione Italiana per l'Intelligenza Artificiale NAPOLI ITA ) - pp. da 59 a 63 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Semantic Query Optimization in a relational setting, with reference to integrity constraints that represent restrictions and simple rules over attributes. The Explora system is used for the automatic derivation of the rules to be employed in the Semantic Query Optimization process.

D. BENEVENTANO; S. BERGAMASCHI; C. SARTORI ( 1996 ) - Semantic Query Optimization by Subsumption in OODB ( Flexible Query-Answering Systems - Roskilde (Denmark) - May 22-24, 1996) ( - Flexible Query-Answering Systems, Proc. of the 1996 Workshop (FQAS'96) ) (Datalogiske Skrifter (Writings on Computer Science) 62 Roskilde DNK ) - pp. da 167 a 187 ISBN: 8837110901 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The purpose of semantic query optimization is to use semantic knowledge (e.g. integrity constraints) for transforming a query into a form that may be answered more efficiently than the original version. This paper proposes a general method for semantic query optimization in the framework of Object Oriented Database Systems. The method is applicable to the class of conjunctive queries and is based on two ingredients: a formalism able to express both class descriptions and integrity constraints rules as types; subsumption computation between types to evaluate the logical implications expressed by integrity constraints rules.
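The core idea described above, using semantic knowledge such as integrity constraints to transform a query into a cheaper equivalent, can be sketched in a few lines. This is an illustrative simplification with hypothetical names, not the paper's type-subsumption method: here a constraint that logically implies a query predicate makes that predicate redundant, so it can be dropped.

```python
# Hedged sketch of semantic query optimization on one-attribute bound
# predicates. Predicates and constraints are (attr, op, value) triples
# with op in {'>=', '<='}; all names are hypothetical.

def implies(constraint, predicate):
    """True if the integrity constraint logically implies the predicate,
    e.g. salary >= 50000 implies salary >= 40000."""
    (ca, cop, cv), (pa, pop, pv) = constraint, predicate
    if ca != pa or cop != pop:
        return False
    return cv >= pv if cop == '>=' else cv <= pv

def optimize(query_preds, constraints):
    """Drop every query predicate already guaranteed by a constraint."""
    return [p for p in query_preds
            if not any(implies(c, p) for c in constraints)]

# IC: every stored employee has salary >= 50000, so the query's
# salary test can never filter anything and is removed.
ics = [('salary', '>=', 50000)]
query = [('dept', '==', 'R&D'), ('salary', '>=', 40000)]
print(optimize(query, ics))  # [('dept', '==', 'R&D')]
```

The paper's method works in the opposite-but-related direction as well (semantic expansion can add implied predicates that enable cheaper access paths); this sketch shows only the redundancy-elimination half.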

D. BENEVENTANO; S. BERGAMASCHI; C. SARTORI; J.P BALLERINI; M. VINCINI ( 1995 ) - A semantics-driven query optimizer for OODBs ( International Workshop on Description Logics - Roma - June 2-3, 1995.) ( - Proceedings of the 1995 International Workshop on Description Logics ) (Universita degli Studi di Roma "La Sapienza", Dipartimento di Informatica e Sistemistica. Roma ITA ) - n. volume Rapporto 07.95 - pp. da 59 a 64 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Semantic query optimization uses problem-specific knowledge (e.g. integrity constraints) for transforming a query into an equivalent one (i.e., with the same answer set) that may be answered more efficiently. The optimizer is applicable to the class of conjunctive queries and is based on two fundamental ingredients. The first is the ODL description logics, proposed as a common formalism to express class descriptions, a relevant set of integrity constraint rules (IC rules), and queries as ODL types. The second is DL (Description Logics) inference techniques, exploited to evaluate the logical implications expressed by IC rules and thus to produce the semantic expansion of a given query. The optimizer tentatively applies all the possible transformations and delays the choice of beneficial transformations until the end. Some preliminary ideas on filtering activities on the semantically expanded query are reported. A prototype semantic query optimizer (ODB-QOptimizer) for object-oriented database systems (OODBs) is described.

D. BENEVENTANO; S. BERGAMASCHI; S. LODI; C. SARTORI ( 1995 ) - Consistency checking in Complex Objects Database schemata with integrity constraints ( International Workshop on Database Programming Languages (DBPL) - Gubbio, Umbria, Italy - 6-8 September 1995) ( - Proceedings of the Fifth International Workshop on Database Programming Languages ) (Springer-Verlag Heidelberg DEU ) - pp. da 48 a 57 ISBN: 9783540760863 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Integrity constraints are rules which should guarantee the integrity of a database. Provided that an adequate mechanism to express them is available, the following question arises: is there any way to populate a database which satisfies the constraints supplied by a designer? That is, does the database schema, including constraints, admit at least one model in which all classes are non-empty? This work gives an answer to the above question in an OODB environment, providing a Data Definition Language (DDL) able to express the semantics of a relevant set of state constraints and a specialized reasoner able to check the consistency of a schema with such constraints. The choice of the set of constraints expressed in the DDL is motivated by decidability issues.

S. BERGAMASCHI; C. SARTORI; M. VINCINI ( 1995 ) - DL techniques for intensional query answering in OODBs ( Int. Workshop on Reasoning about Structured Objects: Knowledge Representation meets Databases - Bielefeld, Germany - Settembre) ( - Int. Workshop on Reasoning about Structured Objects: Knowledge Representation meets Databases ) (Franz Baader, Martin Buchheit, Manfred A. Jeusfeld, Werner Nutt (Eds.) Bielefeld DEU ) - pp. da 108 a 112 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Int. Workshop on Reasoning about Structured Objects: Knowledge Representation meets Databases

Sonia Bergamaschi; Claudio Sartori; Maria Rita Scalas ( 1995 ) - Lezioni di Fondamenti di Informatica (Progetto Leonardo Bologna ITA ) - pp. da 1 a 214 ISBN: 0000000000 [Monografia o trattato scientifico (276) - Monografia/Trattato scientifico]
Abstract

Lezioni di Fondamenti di Informatica (Lectures on Fundamentals of Computer Science), second edition.

J. P. BALLERINI; D. BENEVENTANO; S. BERGAMASCHI; M. VINCINI ( 1995 ) - ODBQOptimizer: un ottimizzatore semantico per interrogazioni in OODB ( Convegno su Sistemi Evoluti per Basi di Dati (SEBD95) - Ravello (Costiera Amalfitana), Italia - Giugno 1995) ( - Convegno su Sistemi Evoluti per Basi di Dati (SEBD95) ) (Antonio Albano Salerno ITA ) - pp. da 311 a 330 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Proceedings edited by Antonio Albano et al.

D. BENEVENTANO; S. BERGAMASCHI; S. LODI; C. SARTORI ( 1995 ) - Terminological logics for schema design and query processing in OODBs ( Reasoning about Structured Objects: Knowledge Representation Meets Databases - Saarbrucken, Germany - September 20-22, 1994) ( - Proceedings of the KI'94 Workshop KRDB'94 ) - CEUR WORKSHOP PROCEEDINGS - n. volume 1 - pp. da 10 a 14 ISSN: 1613-0073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The paper introduces ideas which make feasible and effective the application of Terminological Logic (TL) techniques for schema design and query optimization in Object Oriented Databases (OODBs).

BERGAMASCHI S; NEBEL B ( 1994 ) - ACQUISITION AND VALIDATION OF COMPLEX OBJECT DATABASE SCHEMATA SUPPORTING MULTIPLE INHERITANCE - APPLIED INTELLIGENCE - n. volume 4 - pp. da 185 a 203 ISSN: 0924-669X [Articolo in rivista (262) - Articolo su rivista]
Abstract

We present an intelligent tool for the acquisition of object-oriented schemata supporting multiple inheritance, which preserves taxonomy coherence and performs taxonomic inferences. Its theoretical framework is based on terminological logics, which have been developed in the area of artificial intelligence. The framework includes a rigorous formalization of complex objects, which is able to express cyclic references on the schema and instance level; a subsumption algorithm, which computes all implied specialization relationships between types; and an algorithm to detect incoherent types, i.e., necessarily empty types. Using results from formal analyses of knowledge representation languages, we show that subsumption and incoherence detection are computationally intractable from a theoretical point of view. However, the problems appear to be feasible in almost all practical cases.
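Subsumption, the implied specialization relationship between types mentioned in this abstract, can be illustrated on a drastically simplified model in which a record type maps attributes to sets of admissible values. This is a toy illustration with hypothetical names, far weaker than the terminological-logic algorithm the paper describes (no cycles, no structured attribute types).

```python
# Toy subsumption check over record types: a type is a dict mapping
# attribute names to sets of admissible values. T1 subsumes T2 when every
# possible instance of T2 is also an instance of T1, i.e. T2 constrains
# each attribute of T1 at least as tightly. (Illustrative sketch only.)

def subsumes(t1, t2):
    """True if t2 mentions every attribute of t1 with a subset of its values."""
    return all(attr in t2 and t2[attr] <= t1[attr] for attr in t1)

vehicle = {"wheels": {2, 3, 4}}
car     = {"wheels": {4}, "fuel": {"petrol", "diesel"}}

print(subsumes(vehicle, car))  # True: every car is a vehicle
print(subsumes(car, vehicle))  # False
```

Computing this relation pairwise over all class descriptions is what lets a tool of the kind described place a newly acquired class automatically in the specialization hierarchy.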

D. BENEVENTANO; S. BERGAMASCHI; S. LODI; C. SARTORI ( 1994 ) - Constraints in Complex Object Database Models ( International Workshop on Description Logics - Bonn, Germany - May 28-29, 1994.) ( - Proceedings of the 1994 International Workshop on Description Logics ) (Deutsche Forschungszentrum für Künstliche Intelligenz (DFKI) Saarbrücken DEU ) - n. volume D-94-10 - pp. da 101 a 105 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Database design almost invariably includes a specification of a set of rules (the integrity constraints) which should guarantee its consistency. Constraints are expressed in various fashions, depending on the data model, e.g. subsets of first order logic, or inclusion dependencies and predicates on row values, or methods in OO environments. Provided that an adequate formalism to express them is available, the following question arises: Is there any way to populate a database which satisfies the constraints supplied by a designer? Means of answering this question should be embedded in automatic design tools, whose use is recommendable or often required in the difficult task of designing complex database schemas. The contribution of this research is to propose a computational solution to the problem of schema consistency in Complex Object Data Models.

S. BERGAMASCHI; S. LODI; C. SARTORI ( 1994 ) - Description Logics as a core of a tutoring system ( Int. Workshop on Description Logics - Bonn, Germany - Maggio) ( - Int. Workshop on Description Logics 8(DFKI report n. D-94-10) ) (DFKI saarbrucken DEU ) - pp. da 20 a 28 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A tutoring system based on description logics (DFKI report n. D-94-10).

D. BENEVENTANO; S. BERGAMASCHI; S. LODI; C. SARTORI ( 1994 ) - Reasoning with constraints in Database Models ( Sistemi Evoluti per Basi di Dati - Rimini, Italy - 6-8 June 1994) ( - Atti del Secondo Convegno su Sistemi Evoluti per Basi di Dati (SEBD94) ) (Editrice Esculapio Progetto Leonardo Bologna ITA ) - pp. da 23 a 38 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Database design almost invariably includes a specification of a set of rules (the integrity constraints) which should guarantee its consistency. Provided that an adequate mechanism to express them is available, the following question arises: is there any way to populate a database which satisfies the constraints supplied by a designer? That is, does the database schema, including constraints, admit at least one model in which all classes are non-empty? This work gives an answer to the above question in an OODB environment, providing a Data Definition Language (DDL) able to express the semantics of a relevant set of state constraints and a specialized reasoner able to check the consistency of a schema with such constraints.

S. BERGAMASCHI; S. LODI; C. SARTORI ( 1994 ) - The Entity/Situation Knowledge Representation System (Elsevier ) - DATA & KNOWLEDGE ENGINEERING - n. volume 14, N.2 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Elsevier Science B.V. (North- Holland)

BERGAMASCHI S; LODI S; SARTORI C ( 1994 ) - THE E/S KNOWLEDGE REPRESENTATION SYSTEM - DATA & KNOWLEDGE ENGINEERING - n. volume 14 - pp. da 81 a 115 ISSN: 0169-023X [Articolo in rivista (262) - Articolo su rivista]
Abstract

This paper introduces the E/S knowledge representation model and describes a system based on that model. The model takes ideas from KL-ONE and ER, and its main strength is the direct representation of n-ary relationships. The system is classification-based, and therefore organizes its knowledge in hierarchies of structured intensional objects and offers a set of services to reason about intensional objects, to store extensional objects and to make inferences on the stored knowledge.

D. BENEVENTANO; S. BERGAMASCHI; S. LODI; C. SARTORI ( 1994 ) - Using subsumption for semantic query optimization ( International Workshop on Description Logics - Bonn, Germany - May 28-29, 1994) ( - Proceedings of the 1994 International Workshop on Description Logics ) (Deutsche Forschungszentrum für Künstliche Intelligenz (DFKI) Saarbrücken DEU ) - n. volume D-94-10 - pp. da 97 a 100 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The purpose of semantic query optimization is to use semantic knowledge (e.g. integrity constraints) for transforming a query into an equivalent one that may be answered more efficiently than the original version. This paper proposes a general method for semantic query optimization in the framework of OODBs (Object Oriented Database Systems). The method is applicable to the class of conjunctive queries and is based on two ingredients: a description logic able to express both class descriptions and integrity constraint rules (IC rules) as types; subsumption computation between types to evaluate the logical implications expressed by IC rules.

A. Artale; J. P. Ballerini; S. Bergamaschi; F. Cacace; S. Ceri; F. Cesarini; A. Formica; H. Lam; S. Greco; G. Marrella; M. Missikoff; L. Palopoli; L. Pichetti; D. Saccà; S. Salza; C. Sartori; G. Soda; L. Tanca; M. Toiati ( 1993 ) - Prototypes in the LOGIDATA+ Project ( - LOGIDATA+: Deductive Databases with complex objects ) (Springer-Verlag Heidelberg DEU ) - pp. da 252 a 267 ISBN: 9783540569749 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

The paper introduces the prototypes developed in the LOGIDATA+ project

D. Beneventano; S. Bergamaschi; C. Sartori; A. Artale; F. Cesarini; G. Soda ( 1993 ) - Taxonomic Reasoning in LOGIDATA+ ( - LOGIDATA+: Deductive Databases with Complex Objects ) (SPRINGER Heidelberg DEU ) - n. volume 701 - pp. da 79 a 84 ISBN: 9783540569749 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

This chapter introduces the subsumption computation techniques for a LOGIDATA+ schema.

D. BENEVENTANO; BERGAMASCHI S; SARTORI C ( 1993 ) - Taxonomic Reasoning with Cycles in LOGIDATA+ ( - LOGIDATA+: Deductive Databases with Complex Objects ) (SPRINGER Heidelberg DEU ) - n. volume 701 - pp. da 105 a 128 ISBN: 9783540569749 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

This chapter shows the subsumption computation techniques for a LOGIDATA+ schema allowing cyclic definitions for classes. The formal framework LOGIDATA_CYC*, which extends LOGIDATA* to perform taxonomic reasoning in the presence of cyclic class definitions, is introduced. It includes the notions of possible instances of a schema; of legal instance of a schema, defined as the greatest fixed-point of possible instances; and of subsumption relation. On the basis of this framework, the definitions of coherent type and consistent class are introduced, and the necessary algorithms to detect incoherence and compute subsumption in a LOGIDATA+ schema are given. Some examples of subsumption computation show its feasibility for schema design and validation.

D. BENEVENTANO; S. BERGAMASCHI; S. LODI; C. SARTORI ( 1993 ) - Using Subsumption in Semantic Query Optmization ( IJCAI Workshop Object-Based Representation Systems - Chambery (FR) - August 1993) ( - Workshop on Object-Based Representation Systems, IJCAI 93 ) (Centre de Recherche en Informatique de Nancy (CRIN-CNRS & INRIA Lorraine) Nancy FRA ) - n. volume technical report CRIN 93-R-156 - pp. da 19 a 31 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The purpose of semantic query optimization is to use semantic knowledge (e.g. integrity constraints) for transforming a query into an equivalent one that may be answered more efficiently than the original version. This paper proposes a general method for semantic query optimization in the framework of OODBs (Object Oriented Database Systems). The method is applicable to the class of conjunctive queries and is based on two ingredients: a description logic able to express both class descriptions and integrity constraints rules (IC rules) as types; subsumption computation between types to evaluate the logical implications expressed by IC rules.

D. BENEVENTANO; S. BERGAMASCHI; S. LODI; C. SARTORI ( 1993 ) - Uso della Subsumption per l'Ottimizzazione Semantica delle Queries ( Sistemi Evoluti per Basi di Dati (SEBD'93) - Gizzeria, Italy - 14-16 June, 1993) ( - Convegno su Sistemi Evoluti per Basi di Dati (SEBD93) ) (Mediterranean Press RENDE (CS) ITA ) - pp. da 75 a 89 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This work analyzes the possibility of performing Semantic Query Optimization by exploiting the subsumption relation. It includes a formalization of complex object data models, enriched with the notion of subsumption, which identifies all the specialization relationships between object classes on the basis of their descriptions.

S. BERGAMASCHI; SARTORI C. ( 1992 ) - On taxonomic reasoning in conceptual design (ACM ) - ACM TRANSACTIONS ON DATABASE SYSTEMS - n. volume 17 (3) - pp. da 385 a 422 ISSN: 0362-5915 [Articolo in rivista (262) - Articolo su rivista]
Abstract

This paper introduces for the first time the coupling of description logics and conceptual database design.

S. BERGAMASCHI; S. LODI; C. SARTORI ( 1992 ) - Representation Extensions of DLs ( AAAI Fall Symposium Series - Boston, USA - October 23-25) ( - Issues in Description Logics: Users Meet Developers, Working Notes AAAI Fall Symposium Series ) (AAAI Press Cambridge, MA USA ) - pp. da 40 a 47 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Representational extensions of description logics.

D. BENEVENTANO; S. BERGAMASCHI ( 1992 ) - Subsumption for Complex Object Data Models ( Database Theory - ICDT'92, 4th International Conference - Berlin, Germany - October) ( - Lecture Notes in Computer Science ) (Springer Heidelberg DEU ) - n. volume 646/1992 - pp. da 357 a 375 ISBN: 9783540560395 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We adopt a formalism, similar to the terminological logic languages developed in AI knowledge representation systems, to express the semantics of complex object data models. Two main extensions are proposed with respect to previously proposed models: the conjunction operator, which permits the expression of multiple inheritance between types (classes) as a semantic property, and the introduction in the schema of derived classes, similar to views. These extensions, together with the adoption of suitable semantics for dealing with cyclic descriptions, allow for the automatic placement of classes in a specialization hierarchy. By mapping schemata to nondeterministic finite automata, we face and solve interesting problems such as detecting the emptiness of a class extension and computing a specialization ordering for the greatest, least and descriptive semantics. As queries can be expressed as derived classes, these results also apply to intensional query answering and query validation.

S. BERGAMASCHI; S. LODI; C. SARTORI ( 1991 ) - E-S: a three-sorted terminological language ( Terminological Logic User Workshop - Berlin - December) ( - Terminological Logic User Workshop ) (Technische Universitaet Berlin, Berlin DEU ) - pp. da 35 a 42 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A three-sorted terminological language: Entity-Situation.

S. BERGAMASCHI; S. LODI; C. SARTORI ( 1991 ) - Research Interests and Accomplishments for the Terminological Users Workshop ( Terminological Logic Users Workshop - Berlin - October) ( - Terminological Logic Users Workshop ) (Technische Universitaet Berlin, Berlin DEU ) - pp. da 27 a 35 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

New requirements for coupling databases and terminological logics

S. BERGAMASCHI; C. SARTORI ( 1991 ) - Subsumption for Database Schema Design ( International Workshop on Terminological Logics - Dagstuhl, Germany - May) ( - International Workshop on Terminological Logics ) (DFKI Saarbruecken DEU ) - n. volume DFKI-D-91-13 - pp. da 23 a 27 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Subsumption is a useful technique for consistency checking in databases.

D. BENEVENTANO; S. BERGAMASCHI; C. SARTORI ( 1991 ) - Taxonomic Reasoning in Complex Object Data Models ( Advances in Data Management, Proceedings of the Third International Conference on Information Systems and Management - Bombay, India - December 12-14, 1991) ( - International Conference on Management of Data COMAD '91 ) (McGraw-Hill New Dehli IND ) - pp. da 64 a 74 ISBN: Non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We adopt a formalism, similar to the terminological logic languages developed in AI knowledge representation systems, to express the semantics of complex object data models. Two main extensions are proposed with respect to previously proposed models: the conjunction operator, which permits the expression of multiple inheritance between types (classes) as a semantic property, and the introduction in the schema of derived classes, similar to views. We then introduce the notion of subsumption between classes.

D. BENEVENTANO; S. BERGAMASCHI; C. SARTORI ( 1991 ) - Taxonomic Reasoning in LOGIDATA+ ( COMP-EURO 91 - Bologna, Italy - May 1991) ( - Proceedings of the 5th Annual European Computer Conference, Bologna, Italy, May 1991 (COMP-EURO 91) ) (IEEE Computer Society Press Bologna ITA ) - pp. da 894 a 899 ISBN: 9780818621413 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper introduces the subsumption computation techniques for a LOGIDATA+ schema.

S. BERGAMASCHI; G. BOMBARDA; L. PIANCASTELLI; C. SARTORI ( 1989 ) - An expert system for the selection of a composite material ( Data and Knowledge Systems for Manufacturing and Engineering, 1989, Second International Conference on - Gaithersburg, MD - October) ( - 2nd Int. Conf. on Data and Knowledge Systems for Manufacturing and Engineering ) (IEEE PRESS Los Alamitos, CA USA ) - pp. da 140 a 141 ISBN: 081861983X [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

An expert system for composite material selection starting from user specifications was developed. Since a database management system (DBMS) was necessary to manage the amount of information needed for material and application characterization, a logical interface between the Expert 2 system and the relational database was developed. This configuration allows complete separation between the database problem of managing material characteristics and the rule-oriented material selection problem handled by the expert system.

S. BERGAMASCHI; S. LODI; C. SARTORI ( 1989 ) - Entita'-Situazione: un modello di rappresentazione della conoscenza ( Congresso Annuale AICA - Trieste - 4-8 October) ( - Congresso Annuale AICA ) (AICA Milano ITA ) - pp. da 285 a 299 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]

S. Bergamaschi; C. Sartori ( 1989 ) - Ingegneria della Conoscenza - INFORMATICA OGGI - n. volume 51 - pp. da 73 a 86 ISSN: 0392-8888 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The need to develop knowledge representation systems that combine expressiveness and efficiency in managing large amounts of information is by now widely felt. This article discusses the evolution from DBMSs to systems capable of adding expressiveness and greater intelligence.

S. BERGAMASCHI; S. LODI; C. SARTORI ( 1989 ) - Un editor intelligente per la costruzione di una base di conoscenza secondo il modello entita' situazione ( Quarto Convegno Nazionale sulla Programmazione Logica - GULP - Bologna - 6 September) ( - Quarto Convegno Nazionale sulla Programmazione Logica - GULP ) (Esculapio Bologna ITA ) - pp. da 389 a 402 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Proceedings edited by P. Mello. Bologna.

S. BERGAMASCHI; P. CIACCIA; C. SARTORI ( 1988 ) - Basi di conoscenza e basi di dati: memorizzazione degli aspetti estensionali ( Congresso Annuale AICA - Cagliari - September) ( - Congresso Annuale AICA ) (Esculapio Bologna ITA ) - pp. da 45 a 53 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper describes the different representations of extensional knowledge in databases and in knowledge bases.

S. BERGAMASCHI; F. BONFATTI; C. SARTORI ( 1988 ) - Entity-Situation: a Model for the Knowledge Representation Module of a KBMS ( International Conference on Extending Database Technology - Venice - March 14-18, 1988) ( - Advances in Database Technology - EDBT 88, Springer-Verlag ) (Springer Verlag Berlin DEU ) - n. volume lncs 303 - pp. da 578 a 582 ISBN: 3540190740 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

ADKS, an advanced data and knowledge management system whose main objective is to couple expressiveness and efficiency in the management of large knowledge bases, is described. The architecture of the system and the new semantic model which is the basis of its knowledge representation module are presented.

S. BERGAMASCHI; L. CAVEDONI; C. SARTORI; P. TIBERIO ( 1988 ) - On taxonomic reasoning in E/R environment ( E/R conference - Roma - November 16-18) ( - Entity Relationship Conference 1988 ) (North - Holland amsterdam NLD ) - pp. da 443 a 453 ISBN: 0444874534 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In C. Batini (ed.), Entity-Relationship Approach, Elsevier Science Publishers B.V.

S. BERGAMASCHI; F. BONFATTI; L. CAVAZZA; P. TIBERIO ( 1988 ) - Relational data base design for the intensional aspects of a knowledge base (Elsevier Science Limited ) - INFORMATION SYSTEMS - n. volume 13, n. 3 - pp. da 245 a 256 ISSN: 0306-4379 [Articolo in rivista (262) - Articolo su rivista]

S. BERGAMASCHI; M. R. SCALAS ( 1986 ) - Choice of the optimal number of blocks for data access by an index (Elsevier Science Limited ) - INFORMATION SYSTEMS - n. volume 11 - pp. da 3 a 3 ISSN: 0306-4379 [Articolo in rivista (262) - Articolo su rivista]