
Personal page of Francesco GUERRA

Dipartimento di Ingegneria "Enzo Ferrari"

Azzopardi, Joel; Benedetti, Fabio; Guerra, Francesco; Lupu, Mihai ( 2017 ) - Back to the sketch-board: Integrating keyword search, semantics, and information retrieval ( 2nd COST Action IC1302 International KEYSTONE Conference on Semantic Keyword-Based Search on Structured Data Sources, IKC 2016 - Cluj-Napoca (Romania) - 2016) ( - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) ) (Springer Verlag ) - LECTURE NOTES IN COMPUTER SCIENCE - n. volume 10151 - pp. da 49 a 61 ISBN: 9783319536392 ISSN: 1611-3349 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We reproduce recent research results combining semantic and information retrieval methods. Additionally, we expand the existing state of the art by combining the semantic representations with IR methods from the probabilistic relevance framework. We demonstrate a significant increase in performance, as measured by standard evaluation metrics.

Cadegnani, Sara; Guerra, Francesco; Ilarri, Sergio; del Carmen Rodríguez-Hernández, María; Trillo-Lado, Raquel; Velegrakis, Yannis; Amaro, Raquel ( 2017 ) - Exploiting Linguistic Analysis on URLs for Recommending Web Pages: A Comparative Study - TRANSACTIONS ON COMPUTATIONAL COLLECTIVE INTELLIGENCE - n. volume 10190 - pp. da 26 a 45 ISSN: 2190-9288 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Nowadays, citizens require high-quality information from public institutions in order to guarantee their transparency. Institutional websites of governmental and public bodies must publish and keep updated a large amount of information stored in thousands of web pages in order to satisfy the demands of their users. Due to the amount of information, the “search form”, which is typically available in most such websites, has proven of limited help to users, since it requires them to explicitly express their information needs through keywords. The sites are also affected by the so-called “long tail” phenomenon, which is typically observed in e-commerce portals: not all the pages are considered highly important and, as a consequence, users searching for information located in pages that are not considered important have a hard time locating these pages. The development of a recommender system that can guess the next best page that a user would like to see in the web site has gained a lot of attention. Complex models and approaches have been proposed for recommending web pages to individual users. These approaches typically require personal preferences and other kinds of user information in order to make successful predictions. In this paper, we analyze and compare three different approaches to leverage information embedded in the structure of web sites and the logs of their web servers to improve the effectiveness of web page recommendation. Our proposals exploit the context of the users’ navigations, i.e., their current sessions when surfing a specific web site. These approaches do not require either information about the personal preferences of the users to be stored and processed, or complex structures to be created and maintained. They can be easily incorporated into current large websites to facilitate the users’ navigation experience.
Last but not least, the paper reports some comparative experiments using a real-world website to analyze the performance of the proposed approaches.

Bergamaschi, Sonia; Beneventano, Domenico; Mandreoli, Federica; Martoglia, Riccardo; Guerra, Francesco; Orsini, Mirko; Po, Laura; Vincini, Maurizio; Simonini, Giovanni; Zhu, Song; Gagliardelli, Luca; Magnotta, Luca ( 2017 ) - From Data Integration to Big Data Integration ( - A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years ) (Springer International Publishing ) - n. volume 31 - pp. da 43 a 59 ISBN: 9783319618920 ISSN: 2197-6511 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

The Database Group (DBGroup, www.dbgroup.unimore.it) and Information System Group (ISGroup, www.isgroup.unimore.it) research activities have been mainly devoted to the Data Integration Research Area. The DBGroup designed and developed the MOMIS data integration system, giving rise to a successful innovative enterprise, DataRiver (www.datariver.it), which distributes MOMIS as open source. MOMIS provides integrated access to structured and semistructured data sources and allows a user to pose a single query and to receive a single unified answer. Description Logics, automatic annotation of schemata, and clustering techniques constitute the theoretical framework. In the context of data integration, the ISGroup addressed problems related to the management and querying of heterogeneous data sources in large-scale and dynamic scenarios. The reference architectures are the Peer Data Management Systems and their evolution toward dataspaces. In these contexts, the ISGroup proposed and evaluated effective and efficient mechanisms for network creation with limited information loss, and solutions for mapping management, query reformulation and processing, and query routing. The main issues of data integration have been addressed within European and national projects: automatic annotation, mapping discovery, global query processing, provenance, multidimensional information integration, and keyword search. With the new requirements of integrating open linked data, textual and multimedia data in a big data scenario, the research has been devoted to the Big Data Integration Research Area. In particular, the most relevant research results achieved are: a scalable entity resolution method, a scalable join operator, and a tool, LODEX, for automatically extracting metadata from Linked Open Data (LOD) resources and for visual query formulation on LOD resources. Moreover, in collaboration with DATARIVER, Data Integration was successfully applied to smart e-health.

BERGAMASCHI, Sonia; INTERLANDI, Matteo; GUERRA, Francesco; TRILLO LADO, Raquel; VELEGRAKIS, Yannis ( 2016 ) - Combining User and Database Perspective for Solving Keyword Queries over Relational Databases - INFORMATION SYSTEMS - n. volume 55 - pp. da 1 a 19 ISSN: 0306-4379 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Over the last decade, keyword search over relational data has attracted considerable attention. A possible approach to this issue is to transform keyword queries into one or more SQL queries to be executed by the relational DBMS. Finding these queries is a challenging task since the information they represent may be modeled across different tables and attributes. This means that it is necessary not only to identify the schema elements where the data of interest is stored, but also to find out how these elements are interconnected. All the approaches that have been proposed so far provide a monolithic solution. In this work, we instead divide the problem into three steps: the first one, driven by the user's point of view, takes into account what the user has in mind when formulating keyword queries; the second one, driven by the database perspective, considers how the data is represented in the database schema; finally, the third step combines these two processes. We present the theory behind our approach and its implementation in a system called QUEST (QUEry generator for STructured sources), which has been extensively tested to show the efficiency and effectiveness of our approach. Furthermore, we report on the outcomes of a number of experiments that we have conducted.

Sartori, Enrico; Velegrakis, Yannis; Guerra, Francesco ( 2016 ) - Entity-Based Keyword Search in Web Documents - TRANSACTIONS ON COMPUTATIONAL COLLECTIVE INTELLIGENCE - n. volume 9630 - pp. da 21 a 49 ISSN: 2190-9288 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The set of algorithms that compose a search engine rely on the information contained in the document representation to perform their task. Most of the traditional approaches represent a document as a flat list of words, but this model encounters difficulties in linking information regarding the same object when it is referred to using different words. Moreover, this approach to document modeling cannot give information about the relationships existing among the objects appearing in a document. What we propose in this work is a novel approach to document representation and query answering, which addresses the aforementioned problems through: i) an entity-based representation of the objects referenced in documents, ii) a representation of the document that is aware of the relationships existing among the objects appearing in the text. We provide a test implementation of the approach presented in the paper and present the results of the tests performed to measure its performance.

Bergamaschi, Sonia; Ferro, Nicola; Guerra, Francesco; Silvello, Gianmaria ( 2016 ) - Keyword-Based Search Over Databases: A Roadmap for a Reference Architecture Paired with an Evaluation Framework - TRANSACTIONS ON COMPUTATIONAL COLLECTIVE INTELLIGENCE - n. volume 9630 - pp. da 1 a 20 ISSN: 2190-9288 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Structured data sources promise to be the next driver of a significant socio-economic impact for both people and companies. Nevertheless, accessing them through formal languages, such as SQL or SPARQL, can become cumbersome and frustrating for end-users. To overcome this issue, keyword search in databases is becoming the technology of choice, even if it suffers from efficiency and effectiveness problems that prevent it from being adopted at Web scale. In this paper, we motivate the need for a reference architecture for keyword search in databases to favor the development of scalable and effective components, also borrowing methods from neighboring fields, such as information retrieval and natural language processing. Moreover, we point out the need for a companion evaluation framework, able to assess the efficiency and the effectiveness of such new systems in the light of real and compelling use cases.

Bergamaschi, Sonia; Ferrari, Davide; Guerra, Francesco; Simonini, Giovanni; Velegrakis, Yannis ( 2016 ) - Providing Insight into Data Source Topics - JOURNAL ON DATA SEMANTICS - n. volume 5 - pp. da 211 a 228 ISSN: 1861-2032 [Articolo in rivista (262) - Articolo su rivista]
Abstract

A fundamental service for the exploitation of the modern large data sources that are available online is the ability to identify the topics of the data that they contain. Unfortunately, the heterogeneity and lack of centralized control make it difficult to identify the topics directly from the actual values used in the sources. We present an approach that generates signatures of sources that are matched against a reference vocabulary of concepts through the respective signature to generate a description of the topics of the source in terms of this reference vocabulary. The reference vocabulary may be provided ready, may be created manually, or may be created by applying our signature-generation algorithm over a well-curated data source with a clear identification of topics. In our particular case, we have used DBpedia for the creation of the vocabulary, since it is one of the largest known collections of entities and concepts. The signatures are generated by exploiting the entropy and the mutual information of the attributes of the sources to generate semantic identifiers of the various attributes, which combined together form a unique signature of the concepts (i.e. the topics) of the source. The generation of the identifiers is based on the entropy of the values of the attributes; thus, they are independent of naming heterogeneity of attributes or tables. Although the use of traditional information-theoretical quantities such as entropy and mutual information is not new, they may become untrustworthy due to their sensitivity to overfitting, and require an equal number of samples used to construct the reference vocabulary. To overcome these limitations, we normalize and use pseudo-additive entropy measures, which automatically downweight the role of vocabulary items and property values with very low frequencies, resulting in a more stable solution than the traditional counterparts.
We have materialized our theory in a system called WHATSIT and we experimentally demonstrate its effectiveness.
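
As a rough illustration of the traditional quantities mentioned in this abstract, the following minimal sketch computes the Shannon entropy of an attribute and the mutual information of two attributes from their values alone, ignoring attribute names. It is not the WHATSIT implementation and does not include the normalized pseudo-additive measures the abstract describes; the function names and the toy columns are assumptions made for this example.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned attribute columns."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))

# Hypothetical toy table with two attributes (not taken from the paper).
country = ["IT", "IT", "ES", "ES"]
city = ["Modena", "Rome", "Madrid", "Madrid"]

h_country = entropy(country)
h_city = entropy(city)
mi = mutual_information(country, city)
```

Because only the value distributions enter the computation, renaming the columns leaves the results unchanged, which is the property the abstract relies on to cope with naming heterogeneity.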

Guerra, Francesco; Trillo-Lado, Raquel; Ilarri, Sergio; Rodríguez-Hernández, María del Carmen ( 2016 ) - Towards Keyword-based Pull Recommendation Systems ( 18th International Conference on Enterprise Information Systems (ICEIS 2016) - Roma (IT) - 25-28 april 2016) ( - Proceedings of the 18th International Conference on Enterprise Information Systems ) (SCITEPRESS Setúbal PRT ) - n. volume 1 - pp. da 207 a 214 ISBN: 9789897581878 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Due to the high availability of data, users are frequently overloaded with a huge number of alternatives when they need to choose a particular item. This has motivated an increased interest in research on recommendation systems, which filter the options and provide users with suggestions about specific elements (e.g., movies, restaurants, hotels, books, etc.) that are estimated to be potentially relevant for the user. In this paper, we describe and evaluate two possible solutions to the problem of identifying the type of item (e.g., music, movie, book, etc.) that the user specifies in a pull-based recommendation (i.e., a recommendation about certain types of items that are explicitly requested by the user). We evaluate two alternative solutions: one based on the use of the Hidden Markov Model and another one exploiting Information Retrieval techniques. Comparing both proposals experimentally, we observe that the Hidden Markov Model generally performs better than the Information Retrieval technique in our preliminary experimental setup.

Rodríguez-Hernández, María del Carmen; Guerra, Francesco; Ilarri, Sergio; Trillo Lado, Raquel ( 2015 ) - A First Step Towards Keyword-Based Searching for Recommendation Systems ( XX Jornadas de Ingeniería del Software y Bases de Datos - Santander - 15-17 Septiembre 2015) ( - Proceedings of the XX Jornadas de Ingeniería del Software y Bases de Datos ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Due to the high availability of data, users are frequently overloaded with a huge number of alternatives when they need to choose a particular item. This has motivated an increased interest in research on recommendation systems, which filter the options and provide users with suggestions about specific elements (e.g., movies, restaurants, hotels, news, etc.) that are estimated to be potentially relevant for the user. Recommendation systems are still an active area of research, and particularly in recent years the concept of context-aware recommendation systems has started to become popular, due to the interest in considering the context of the user in the recommendation process. In this paper, we describe our work-in-progress concerning pull-based recommendations (i.e., recommendations about certain types of items that are explicitly requested by the user). In particular, we focus on the problem of detecting the type of item the user is interested in. Due to its popularity, we consider a keyword-based user interface: the user types a few keywords and the system must determine what the user is searching for. Whereas there is extensive work in the field of keyword-based search, which is still a very active research area, keyword searching has not been applied so far in most recommendation contexts.

Draszawka, Karol; Szymański, Julian; Guerra, Francesco ( 2015 ) - Improving css-KNN classification performance by shifts in training data ( International KEYSTONE Conference, IKC 2015 - Coimbra - 8-9 September 2015) ( - Semantic Keyword-based Search on Structured Data Sources First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015 ) (Springer DEU ) - n. volume 9398 - pp. da 51 a 63 ISBN: 9783319279312 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper presents a new approach to improve the performance of a css-k-NN classifier for categorization of text documents. The css-k-NN classifier (i.e., a threshold-based variation of a standard k-NN classifier we proposed in [1]) is a lazy-learning instance-based classifier. It does not have parameters associated with features and/or classes of objects that would be optimized during off-line learning. In this paper we propose a training data preprocessing phase that tries to alleviate the lack of learning. The idea is to compute training data modifications, such that class representative instances are optimized before the actual k-NN algorithm is employed. The empirical text classification experiments using mid-size Wikipedia data sets show that carefully cross-validated settings of such preprocessing yield significant improvements in k-NN performance compared to classification without this step. The proposed approach can be useful for improving the effectiveness of other classifiers, and it can also find applications in the domain of recommendation systems and keyword-based search.

Bernabei, Chiara; Guerra, Francesco; Trillo Lado, Raquel ( 2015 ) - Keyword Search in structured data and Network Analysis: a preliminary experiment over DBLP ( 2015 10th International Workshop on Semantic and Social Media Adaptation and Personalization (SMAP) - Trento - 5-6 November 2015) ( - Proceedings of the 10th International Workshop on Semantic and Social Media Adaptation and Personalization ) (IEEE ) - pp. da 40 a 45 ISBN: 9781509002429 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Identifying items similar to the ones provided as input to a search system is a challenging task. The main issues concern not only the management of large collections of data, but also the profiling of the users, who usually have different opinions, tastes and expertise. In this paper we propose a preliminary investigation of the improvements in the accuracy of a search system provided by network analysis techniques supporting the discovery of relations among the items stored in the repository. For this reason, we have developed the SEEN prototype, a keyword search tool exploiting network analysis. SEEN has been evaluated against a relational version of the DBLP repository. The results of the preliminary experiments show that the information provided by networks can improve the effectiveness of the results.

Bergamaschi, Sonia; Ferro, Nicola; Guerra, Francesco; Silvello, Gianmaria ( 2015 ) - Perspective Look at Keyword-based Search Over Relation Data and its Evaluation (Extended Abstract) ( 23rd Italian Symposium on Advanced Database Systems (SEBD 2015) - Gaeta - 14-17 June 2015) ( - 23rd Italian Symposium on Advanced Database Systems (SEBD 2015) ) (Curran Associates New York USA ) - pp. da 168 a 175 ISBN: 9781510810877 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This position paper discusses the need for considering keyword search over relational databases in the light of broader systems, where keyword search is just one of the components and which are aimed at better supporting users in their search tasks. These more complex systems call for appropriate evaluation methodologies which go beyond what is typically done today, i.e. measuring performances of components mostly in isolation or not related to the actual user needs, and, instead, able to consider the system as a whole, its constituent components, and their inter-relations with the ultimate goal of supporting actual user search tasks.

Cadegnani, Sara; Guerra, Francesco; Ilarri, Sergio; Rodríguez-Hernández, María del Carmen; Trillo-Lado, Raquel; Velegrakis, Yannis ( 2015 ) - Recommending Web Pages Using Item-Based Collaborative Filtering Approaches ( International KEYSTONE Conference, IKC 2015 - Coimbra - 8-9 September 2015) ( - Semantic Keyword-based Search on Structured Data Sources - First COST Action IC1302 International KEYSTONE Conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015 ) (Springer DEU ) - n. volume 9398 - pp. da 17 a 29 ISBN: 9783319279312 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Predicting the next page a user wants to see in a large website has gained importance over the last decade because the Web has become the main communication medium between a wide set of entities and users. This is true in particular for institutional government and public organization websites, where for transparency reasons a lot of information has to be provided. The “long tail” phenomenon also affects this kind of website, and users need support to improve the effectiveness of their navigation. For this reason, complex models and approaches for recommending web pages, which usually require processing personal user preferences, have been proposed. In this paper, we propose three different approaches to leverage information embedded in the structure of web sites and their logs to improve the effectiveness of web page recommendation by considering the context of the users, i.e., their current sessions when surfing a specific web site. This proposal does not require either information about the personal preferences of the users to be stored and processed or complex structures to be created and maintained. So, it can be easily incorporated into current large websites to facilitate the users’ navigation experience. Experiments using a real-world website are described and analyzed to show the performance of the three approaches.

Cardoso, Jorge; Guerra, Francesco; Houben, Geert-Jan; Pinto, Alexandre Miguel; Velegrakis, Yannis ( 2015 ) - Semantic keyword-based search on structured data sources: First COST action IC1302 international KEYSTONE conference, IKC 2015, Coimbra, Portugal, September 8-9, 2015, revised selected papers (Springer Verlag DEU ) [Curatela (284) - Curatela]
Abstract

Proceedings of the First KEYSTONE Conference

Kozikowski, Piotr; Ioannou, Ekaterini; Velegrakis, Yannis; Guerra, Francesco ( 2015 ) - Support of part-whole relations in query answering ( International KEYSTONE Conference, IKC 2015 - Coimbra - 8-9 September 2015) ( - Semantic Keyword-based Search on Structured Data Sources. First COST Action IC1302 International KEYSTONE Conference, IKC 2015 ) (Springer CHE ) - n. volume 9398 - pp. da 94 a 107 ISBN: 9783319279312 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Part-whole relations are ubiquitous in our world, yet they do not get “first-class” treatment in the data management systems most commonly used today. One aspect of part-whole relations that is particularly important is that of attribute transitivity. Some attributes of a whole are also attributes of its parts, and vice versa. We propose an extension to a generic entity-centric data model to support part-whole relations and attribute transitivity, and thereby provide more meaningful results to certain types of queries. We describe how this model can be implemented using an RDF repository and three approaches to infer the implicit information necessary for query answering that adheres to the semantics of the model. The first approach is a naive implementation and the other two use indexing to improve performance. We evaluate several aspects of our implementations in a series of experiments that show that the two approaches that use indexing are far superior to the naive approach and exhibit some advantages and disadvantages when compared to each other.

Guerra, Francesco; Simonini, Giovanni; Vincini, Maurizio ( 2015 ) - Supporting Image Search with Tag Clouds: A Preliminary Approach - ADVANCES IN MULTIMEDIA - n. volume 2015 - pp. da 1 a 10 ISSN: 1687-5680 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Algorithms and techniques for searching in collections of data address a challenging task, since they have to bridge the gap between the ways in which users express their interests, through natural language expressions or keywords, and the ways in which data is represented and indexed. When the collections of data include images, the task becomes harder, mainly for two reasons. On the one hand, the user expresses his needs through one medium (text) and will obtain results via another medium (some images). On the other hand, it can be difficult for a user to understand the results retrieved, that is, why a particular image is part of the result set. In this case, some techniques for analyzing the query results and giving the users some insight into the content retrieved are needed. In this paper, we propose to address this problem by coupling the image result set with a tag cloud of words describing it. Some techniques for building the tag cloud are introduced and two application scenarios are discussed.

Bergamaschi, Sonia; Ferrari, Davide; Guerra, Francesco; Simonini, Giovanni ( 2014 ) - Discovering the topics of a data source: A statistical approach? ( Workshop on Surfacing the Deep and the Social Web, SDSW 2014, Co-located with the 13th International Semantic Web Conference, ISWC 2014 - ita - 2014) ( - CEUR Workshop Proceedings ) (CEUR-WS ) - CEUR WORKSHOP PROCEEDINGS - n. volume 1310 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present a preliminary approach for automatically discovering the topics of a structured data source with respect to a reference ontology. Our technique relies on a signature, i.e., a weighted graph that summarizes the content of a source. Graph-based approaches have already been used in the literature for similar purposes. In these proposals, the weights are typically assigned using traditional information-theoretical quantities such as entropy and mutual information. Here, we propose a novel data-driven technique based on composite likelihood to estimate the weights and other main features of the graphs, making the resulting approach less sensitive to overfitting. By means of a comparison of signatures, we can easily discover the topic of a target data source with respect to a reference ontology. This task is performed by a matching algorithm that retrieves the elements common to both graphs. To illustrate our approach, we discuss a preliminary evaluation in the form of a running example.

Sonia Bergamaschi; Francesco Guerra; Giovanni Simonini ( 2014 ) - Keyword Search over Relational Databases: Issues, Approaches and Open Challenges ( - Bridging Between Information Retrieval and Databases ) (Springer-Verlag Berlin Heidelberg Berlin DEU ) - n. volume LNCS 8173 - pp. da 54 a 73 ISBN: 9783642547973 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

In this paper, we overview the main research approaches developed in the area of Keyword Search over Relational Databases. In particular, we model the process for solving keyword queries in three phases: the management of the user’s input, the search algorithms, and the results returned to the user. For each phase we analyze the main problems, the solutions adopted by the most important systems developed by researchers, and the open challenges. Finally, we introduce two open issues related to multi-source scenarios and to database sources whose instances are not fully accessible.

Francesco Guerra; Giovanni Simonini ( 2014 ) - Using Big Data to Support Automatic Word Sense Disambiguation ( Conference on High Performance Computing & Simulation - Bologna - 21-25 July 2014) ( - Proceedings of the International Conference on the 2014 High Performance Computing & Simulation ) (IEEE - Institute of Electrical and Electronics Engineers New Jersey USA ) - pp. da 311 a 314 ISBN: 9781479953134 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Word Sense Induction (WSI) usually relies on data structures built upon the words to be disambiguated. This is a time-consuming process that requires a huge computational effort. In this paper, we propose an approach to automatically build a generic sense inventory (called iSC) to be used as a reference for disambiguation. The sense inventory is built by extracting insight from Big Data using a community detection algorithm. Since it is generated taking into account large corpora of data, the iSC is independent of the domain of application and of predefined target words.

S. Bergamaschi; N. Ferro; F. Guerra; G. Silvello ( 2013 ) - Keyword Search and Evaluation over Relational Databases: an Outlook to the Future ( DBRank 2013 - Riva del Garda (TN) - 30/08/2013) ( - 7th International Workshop on Ranking in Databases ) (ACM New York, NY, USA New York USA ) - pp. da 1 a 3 ISBN: 9781450324977 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This position paper discusses the need for considering keyword search over relational databases in the light of broader systems, where keyword search is just one of the components and which are aimed at better supporting users in their search tasks. These more complex systems call for appropriate evaluation methodologies which go beyond what is typically done today, i.e. measuring performances of components mostly in isolation or not related to the actual user needs, and, instead, able to consider the system as a whole, its constituent components, and their inter-relations with the ultimate goal of supporting actual user search tasks.

Bergamaschi, S.; Guerra, F.; Interlandi, M.; Trillo Lado, R.; Velegrakis, Y. ( 2013 ) - QUEST: A Keyword Search System for Relational Data based on Semantic and Machine Learning Techniques - PROCEEDINGS OF THE VLDB ENDOWMENT - n. volume 6(12) - pp. da 1222 a 1225 ISSN: 2150-8097 [Articolo in rivista (262) - Articolo su rivista]
Abstract

We showcase QUEST (QUEry generator for STructured sources), a search engine for relational databases that combines semantic and machine learning techniques for transforming keyword queries into meaningful SQL queries. The search engine relies on two approaches: the forward, providing mappings of keywords into database terms (names of tables and attributes, and domains of attributes), and the backward, computing the paths joining the data structures identified in the forward step. The results provided by the two approaches are combined within a probabilistic framework based on the Dempster-Shafer Theory. We demonstrate QUEST capabilities, and we show how, thanks to the flexibility obtained by the probabilistic combination of different techniques, QUEST is able to compute high quality results even with few training data and/or with hidden data sources such as those found in the Deep Web.
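
To make the combination step more concrete, the following minimal sketch applies Dempster's rule of combination, the core operation of the Dempster-Shafer Theory, to two mass functions. It is not QUEST's code: the function name, the hypothesis labels q1 and q2 (standing for candidate SQL interpretations), and the mass values are all assumptions made for this example.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence accumulates on the intersection.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Mass assigned to disjoint hypotheses counts as conflict.
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # Renormalize so the remaining masses sum to one.
    return {h: w / (1.0 - conflict) for h, w in combined.items()}

# Hypothetical evidence from a forward and a backward step.
both = frozenset({"q1", "q2"})
forward = {frozenset({"q1"}): 0.6, both: 0.4}
backward = {frozenset({"q2"}): 0.5, both: 0.5}

result = combine(forward, backward)
```

Mass left on the set containing both candidates expresses ignorance rather than a vote for either one, which is what lets sources with little evidence take part in the combination without distorting it.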

Francesco Guerra; Maurizio Vincini ( 2013 ) - The Prosumer Paradigm for Life Cycle Assessment Services ( - Frameworks of IT Prosumption for Business Development ) (IGI GLOBAL Hershey USA ) - pp. da 234 a 246 ISBN: 9781466643130; 9781466643147 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Enterprises, governments, and government agencies have started to publish their data on the Internet, especially in the form of open structured data sources. The real exploitation of these free, large open data sources is more and more becoming a crucial activity for obtaining information and knowledge (i.e. competitive elements) in several business sectors. In addition, with the proliferation of Web 2.0 techniques and applications such as blogs, wikis, tagging systems, and mashups, the notion of user-centricity has gained a significant momentum to put ordinary users in the leading role of delivering exciting and personalized content and services. The term "prosumer," coined by the futurist Alvin Toffler in 1980, has been often referenced in business-related contexts to identify this situation. The chapter describes the application of the "prosumer paradigm" to a real data integration system of Life Cycle Assessment (LCA). ENEA, the Italian National Agency for new Technologies, Energy, and Sustainable Economic Development, promoted the adoption of such practice in small companies belonging to the industrial and agricultural sector supplying them with a simplified LCA system. In this chapter, the authors show how a domain expert user (the prosumer) can use the framework to easily map the classification of data flows and processes provided by the simplified LCA system into the ELCD database, containing a standard classification provided by the EU. This makes the proposal completely shareable with the whole thematic classification and vision promoted by the European Commission.

S. Bergamaschi; F. Guerra; M. Interlandi; S. Rota; R. Trillo; Y. Velegrakis ( 2013 ) - Using a HMM based approach for mapping keyword queries into database terms ( Symposium on Advanced Database Systems SEBD - Roccella Jonica - June 30th - July 4th, 2013) ( - Proceedings of 21st Italian Symposium on Advanced Database Systems ) (proceedings informali Roccella Jonica ITA ) - pp. da 239 a 246 ISBN: 0000000000 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Systems translating keyword queries into SQL queries over relational databases are usually referred to in the literature as schema-based approaches. These techniques exploit the information contained in the database schema to build SQL queries that express the intended meaning of the user query. Besides, typically, they perform a preliminary step that associates keywords in the user query with database elements (names of tables, attributes and domain attributes). In this paper, we present a probabilistic approach based on a Hidden Markov Model to provide such mappings. In contrast to most existing techniques, our proposal does not require any a-priori knowledge of the database extension.

Domenico Beneventano; Zoran Despotovic; Francesco Guerra; Sam Joseph; Gianluca Moro; Adrián Perreau de Pinninck ( 2012 ) - Agents and Peer-to-Peer Computing: 7th International Workshop, AP2PC 2008, Estoril, Portugal, May 2008 and 8th International Workshop, AP2PC 2009, Budapest, Hungary, May 2009, Revised Selected Papers (Springer-Verlag Berlin DEU ) - pp. da 1 a 300 ISBN: 9783642318085 [Curatela (284) - Curatela]
Abstract

7th International Workshop, AP2PC 2008, Estoril, Portugal, May 13, 2008 and 8th International Workshop, AP2PC 2009, Budapest, Hungary, May 11, 2009, Revised Selected Papers

Roberto De Virgilio; Fausto Giunchiglia; Francesco Guerra; Letizia Tanca; Yannis Velegrakis ( 2012 ) - Introduction to the Special Issue on Semantic Web Data Management - INFORMATION SYSTEMS - n. volume 37(4) - pp. da 291 a 293 ISSN: 0306-4379 [Articolo in rivista (262) - Articolo su rivista]
Abstract

During the last decade we have witnessed a tremendous increase in the amount of data that is available on the Web in almost every field of human activity. Financial information, weather reports, news feeds, product information, and geographical maps are only a few examples of such data, all intended to be consumed by the millions of users surfing the Web. The advent of Web 2.0 applications, such as Wikis, social networking sites and mashups, have brought new forms of data and have radically changed the nature of modern Web. They have transformed the Web from a publishing-only environment into a vibrant place for information exchange. Web users are no longer plain data consumers but have become active data producers and data dissemination agents, contributing further to the increase of the information plethora on the Web.

Francesco Guerra; Marius-Octavian Olaru; Maurizio Vincini ( 2012 ) - Mapping and Integration of Dimensional Attributes Using Clustering Techniques. ( E-Commerce and Web Technologies - 13th International Conference, EC-Web 2012 - Vienna, Austria - September 4-5, 2012) ( - Lecture Notes in Business Information Processing ) (Springer New York USA ) - n. volume 123 - pp. da 38 a 49 ISBN: 9783642322723 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Following recent trends in Data Warehousing, companies realized that there is a great potential in combining their information repositories to obtain a broader view of the economical market. Unfortunately, even though Data Warehouse (DW) integration has been defined from a theoretical point of view, until now no complete, widely used methodology has been proposed to support the integration of the information coming from heterogeneous DWs. This paper deals with the automatic integration of dimensional attributes from heterogeneous DWs. A method relying on topological properties that similar dimensions maintain is proposed for discovering mappings of dimensions, and a technique based on clustering algorithms is introduced for integrating the data associated to the dimensions.

Roberto De Virgilio; Francesco Guerra; Yannis Velegrakis ( 2012 ) - Semantic Search over the Web (Springer Heidelberg DEU ) - pp. da 1 a 423 ISBN: 9783642250071 [Curatela (284) - Curatela]
Abstract

The Web has become the world’s largest database, with search being the main tool that allows organizations and individuals to exploit its huge amount of information. Search on the Web has been traditionally based on textual and structural similarities, ignoring to a large degree the semantic dimension, i.e., understanding the meaning of the query and of the document content. Combining search and semantics gives birth to the idea of semantic search. Traditional search engines have already advertised some semantic dimensions. Some of them, for instance, can enhance their generated result sets with documents that are semantically related to the query terms even though they may not include these terms. Nevertheless, the exploitation of the semantic search has not yet reached its full potential. In this book, Roberto De Virgilio, Francesco Guerra and Yannis Velegrakis present an extensive overview of the work done in Semantic Search and other related areas. They explore different technologies and solutions in depth, making their collection valuable and stimulating reading for both academic and industrial researchers. The book is divided into three parts. The first introduces the readers to the basic notions of the Web of Data. It describes the different kinds of data that exist, their topology, and their storing and indexing techniques. The second part is dedicated to Web Search. It presents different types of search, like the exploratory or the path-oriented, alongside methods for their efficient and effective implementation. Other related topics included in this part are the use of uncertainty in query answering, the exploitation of ontologies, and the use of semantics in mashup design and operation. The focus of the third part is on linked data, and more specifically, on applying ideas originating in recommender systems on linked data management, and on techniques for efficient query answering on linked data.

Roberto De Virgilio; Fausto Giunchiglia; Francesco Guerra; Letizia Tanca; Yannis Velegrakis ( 2012 ) - Special Issue on Semantic Web Data Management (Elsevier Amsterdam NLD ) - INFORMATION SYSTEMS - pp. da 291 a 390 ISSN: 0306-4379 [Curatela (284) - Curatela]
Abstract

Special issue on Semantic Web Data Management

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Silvia Rota; Raquel Trillo Lado; Yannis Velegrakis ( 2012 ) - Understanding the Semantics of Keyword Queries on Relational Data Without Accessing the Instance ( - Semantic Search over the Web ) (Springer Heidelberg DEU ) - pp. da 131 a 158 ISBN: 9783642250071 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

This chapter deals with the problem of answering a keyword query over a relational database. To do so, one needs to understand the meaning of the keywords in the query, “guess” its possible semantics, and materialize them as SQL queries that can be executed directly on the relational database. The focus of the chapter is on techniques that do not require any prior access to the instance data, making them suitable for sources behind wrappers or Web interfaces or, in general, for sources that disallow prior access to their data in order to construct an index. The chapter describes two techniques that use semantic information and metadata from the sources, alongside the query itself, in order to achieve that. Apart from understanding the semantics of the keywords themselves, the techniques are also exploiting the order and the proximity of the keywords in the query to make a more educated guess. The first approach is based on an extension of the Hungarian algorithm for identifying the data structures having the maximum likelihood to contain the user keywords. In the second approach, the problem of associating keywords into data structures of the relational source is modeled by means of a hidden Markov model, and the Viterbi algorithm is exploited for computing the mappings. Both techniques have been implemented in two systems called KEYMANTIC and KEYRY, respectively.

Sonia Bergamaschi; Francesco Guerra; Silvia Rota; Yannis Velegrakis ( 2011 ) - A Hidden Markov Model Approach to Keyword-Based Search over Relational Databases ( Conceptual Modeling (ER2011) - Brussels - 30/10/2011 - 03/11/2011) ( - Conceptual Modeling - ER 2011, 30th International Conference, ER 2011, Brussels, Belgium, October 31 - November 3, 2011. Proceedings ) (Springer Heidelberg DEU ) - pp. da 411 a 420 ISBN: 9783642246050 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We present a novel method for translating keyword queries over relational databases into SQL queries with the same intended semantic meaning. In contrast to the majority of the existing keyword-based techniques, our approach does not require any a-priori knowledge of the data instance. It follows a probabilistic approach based on a Hidden Markov Model for computing the top-K best mappings of the query keywords into the database terms, i.e., tables, attributes and values. The mappings are then used to generate the SQL queries that are executed to produce the answer to the keyword query. The method has been implemented into a system called KEYRY (from KEYword to queRY).
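To make the idea of decoding keyword-to-term mappings with a Hidden Markov Model more concrete, here is a minimal sketch of the standard Viterbi algorithm, with keywords as observations and database terms as hidden states. It returns only the single best mapping, whereas the abstract speaks of the top-K mappings. The state names, keywords, and all probabilities are hypothetical values chosen for illustration, not parameters taken from KEYRY.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely state sequence and its probability."""
    # best[t][s] = (probability of the best path ending in s at step t,
    #               predecessor state on that path)
    best = [{s: (start_p[s] * emit_p[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        prev = best[-1]
        layer = {}
        for s in states:
            prob, back = max(
                ((prev[r][0] * trans_p[r][s], r) for r in states),
                key=lambda pair: pair[0],
            )
            layer[s] = (prob * emit_p[s][obs], back)
        best.append(layer)

    # Backtrack from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for layer in reversed(best[1:]):
        path.append(layer[path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]


# Hypothetical model: two database terms, two query keywords.
states = ["person.name", "movie.title"]
start_p = {"person.name": 0.5, "movie.title": 0.5}
trans_p = {
    "person.name": {"person.name": 0.3, "movie.title": 0.7},
    "movie.title": {"person.name": 0.6, "movie.title": 0.4},
}
emit_p = {
    "person.name": {"spielberg": 0.8, "jaws": 0.2},
    "movie.title": {"spielberg": 0.1, "jaws": 0.9},
}
path, prob = viterbi(["spielberg", "jaws"], states, start_p, trans_p, emit_p)
```

In this toy model the transition probabilities stand in for how likely two database terms are to appear next to each other in a query, and the emission probabilities for how well a keyword matches a term.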

Sonia Bergamaschi; Francesco Guerra; Mirko Orsini; Claudio Sartori; Maurizio Vincini ( 2011 ) - A Semantic Approach to ETL Technologies - DATA & KNOWLEDGE ENGINEERING - n. volume 70(8) - pp. da 717 a 731 ISSN: 0169-023X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Data warehouse architectures rely on extraction, transformation and loading (ETL) processes for the creation of an updated, consistent and materialized view of a set of data sources. In this paper, we aim to support these processes by proposing a tool for the semi-automatic definition of inter-attribute semantic mappings and transformation functions. The tool is based on semantic analysis of the schemas for the mapping definitions amongst the data sources and the data warehouse, and on a set of clustering techniques for defining transformation functions homogenizing data coming from multiple sources. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.

Matteo Palmonari; Antonio Sala; Andrea Maurino; Francesco Guerra; Gabriella Pasi; Giuseppe Frisoni ( 2011 ) - Aggregated search of data and services - INFORMATION SYSTEMS - n. volume 36(2) - pp. da 134 a 150 ISSN: 0306-4379 [Articolo in rivista (262) - Articolo su rivista]
Abstract

From a user perspective, data and services provide a complementary view of an information source: data provide detailed information about specific needs, while services execute processes involving data and returning an informative result as well. For this reason, users need to perform aggregated searches to identify not only relevant data, but also services able to operate on them. At the current state of the art such aggregated search can be only manually performed by expert users, who first identify relevant data, and then identify existing relevant services. In this paper we propose a semantic approach to perform aggregated search of data and services. In particular, we define a technique that, on the basis of an ontological representation of both data and services related to a domain, supports the translation of a data query into a service discovery process. In order to evaluate our approach, we developed a prototype that combines a data integration system with a novel information retrieval-based Web Service discovery engine (XIRE). The results produced by a wide set of experiments show the effectiveness of our approach with respect to IR approaches, especially when Web Service descriptions are expressed by means of a heterogeneous terminology.

Sonia Bergamaschi; Domenico Beneventano; Francesco Guerra; Mirko Orsini ( 2011 ) - Data Integration ( - Handbook of Conceptual Modeling ) (Springer Berlin DEU ) - pp. da 441 a 476 ISBN: 9783642158643 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Given the many data integration approaches, a complete and exhaustive comparison of all the research activities is not possible. In this chapter, we will present an overview of the most relevant research activities and ideas in the field investigated in the last 20 years. We will also introduce the MOMIS system, a framework to perform information extraction and integration from both structured and semistructured data sources, that is one of the most interesting results of our research activity. An open source version of the MOMIS system was delivered by the academic startup DataRiver (www.datariver.it).

Sonia Bergamaschi; Francesco Guerra; Silvia Rota; Yannis Velegrakis ( 2011 ) - KEYRY: A Keyword-Based Search Engine over Relational Databases Based on a Hidden Markov Model ( Conceptual Modeling (ER2011) - Demo - Brussels - 30/10/2011 - 03/11/2011) ( - Advances in Conceptual Modeling. Recent Developments and New Directions - ER 2011 Workshops FP-UML, MoRE-BI, Onto-CoM, SeCoGIS, Variability@ER, WISM, Brussels, Belgium, October 31 - November 3, 2011. Proceedings ) (Springer Heidelberg DEU ) - pp. da 328 a 331 ISBN: 9783642245732 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We propose the demonstration of KEYRY, a tool for translating keyword queries over structured data sources into queries in the native language of the data source. KEYRY does not assume any prior knowledge of the source contents. This allows it to be used in situations where traditional keyword search techniques over structured data that require such a knowledge cannot be applied, i.e., sources on the hidden web or those behind wrappers in integration systems. In KEYRY the search process is modeled as a Hidden Markov Model and the List Viterbi algorithm is applied to computing the top-k queries that better represent the intended meaning of a user keyword query. We demonstrate the tool’s capabilities, and we show how the tool is able to improve its behavior over time by exploiting implicit user feedback provided through the selection among the top-k solutions generated.

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Raquel Trillo Lado; Yannis Velegrakis ( 2011 ) - Keyword search over relational databases: a metadata approach ( ACM SIGMOD International Conference on Management of Data - Athens - June 12-16, 2011) ( - Proceedings of the ACM SIGMOD International Conference on Management of Data ) (ACM New York USA ) - pp. da 565 a 576 ISBN: 9781450306614 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Keyword queries offer a convenient alternative to traditional SQL in querying relational databases with large, often unknown, schemas and instances. The challenge in answering such queries is to discover their intended semantics, construct the SQL queries that describe them and use them to retrieve the respective tuples. Existing approaches typically rely on indices built a-priori on the database content. This seriously limits their applicability if a-priori access to the database content is not possible. Examples include the on-line databases accessed through web interface, or the sources in information integration systems that operate behind wrappers with specific query capabilities. Furthermore, existing literature has not studied to its full extent the inter-dependencies across the ways the different keywords are mapped into the database values and schema elements. In this work, we describe a novel technique for translating keyword queries into SQL based on the Munkres (a.k.a. Hungarian) algorithm. Our approach not only tackles the above two limitations, but it offers significant improvements in the identification of the semantically meaningful SQL queries that describe the intended keyword query semantics. We provide details of the technique implementation and an extensive experimental evaluation.
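The Munkres algorithm solves the assignment problem: pick one distinct column per row of a weight matrix so that the total weight is optimal. The sketch below illustrates that problem for keywords (rows) and database terms (columns) using a plain brute-force search over permutations, which is a stand-in for Munkres and only practical for very small matrices. It does not model the inter-dependencies between keyword mappings that the abstract mentions. The function name `best_assignment`, the keywords, the terms, and the weights are all hypothetical.

```python
from itertools import permutations


def best_assignment(weights):
    """Exhaustively find the maximum-weight assignment of rows to distinct columns.

    Brute force is exponential; the Munkres (Hungarian) algorithm solves
    the same problem in polynomial time.
    """
    n_rows, n_cols = len(weights), len(weights[0])
    best_cols, best_total = None, float("-inf")
    for cols in permutations(range(n_cols), n_rows):
        total = sum(weights[row][col] for row, col in enumerate(cols))
        if total > best_total:
            best_cols, best_total = cols, total
    return best_cols, best_total


# Hypothetical similarity weights between keywords and database terms.
keywords = ["spielberg", "movie", "1975"]
terms = ["Movie (table)", "Person.name", "Movie.year", "Movie.title"]
weights = [
    [0.1, 0.9, 0.0, 0.4],  # spielberg
    [0.9, 0.1, 0.1, 0.5],  # movie
    [0.0, 0.0, 0.8, 0.3],  # 1975
]
cols, total = best_assignment(weights)
mapping = {kw: terms[c] for kw, c in zip(keywords, cols)}
```

Because the weights here would come from metadata only (for example, name similarity or data-type compatibility), no index over the database content is needed to build the matrix.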

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Raquel Trillo Lado; Yannis Velegrakis ( 2011 ) - Keyword-based Search in Data Integration Systems ( Italian Symposium on Advanced Database Systems - Maratea - 26-29/06/2011) ( - Proceedings of the 19th Italian Symposium on Advanced Database Systems ) (Università della Basilicata Potenza ITA ) - pp. da 103 a 110 ISBN: 9780000000002 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we describe Keymantic, a framework for translating keyword queries into SQL queries by assuming that the only available information is the source metadata, i.e., schema and some external auxiliary information. Such a framework finds application when only intensional knowledge about the data source is available like in Data Integration Systems.

Silvia Rota; Sonia Bergamaschi; Francesco Guerra ( 2011 ) - The List Viterbi Training Algorithm and Its Application to Keyword Search over Databases ( CIKM’11 - Glasgow - October 24–28, 2011) ( - Proceedings of the 20th ACM Conference on Information and Knowledge Management ) (ACM New York USA ) - pp. da 1601 a 1606 ISBN: 9781450307178 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Hidden Markov Models (HMMs) are today employed in a variety of applications, ranging from speech recognition to bioinformatics. In this paper, we present the List Viterbi training algorithm, a version of the Expectation-Maximization (EM) algorithm based on the List Viterbi algorithm instead of the commonly used forward-backward algorithm. We developed the batch and online versions of the algorithm, and we also describe an interesting application in the context of keyword search over databases, where we exploit an HMM for matching keywords into database terms. In our experiments we tested the online version of the training algorithm in a semi-supervised setting that allows us to take into account the feedback provided by the users.

Sonia Bergamaschi; Francesco Guerra; Silvia Rota; Yannis Velegrakis ( 2011 ) - Understanding linked open data through keyword searching: the KEYRY approach ( Linked Web Data Management - Uppsala, Sweden - 25 March 2011) ( - Proceedings of the 1st International Workshop on Linked Web Data Management ) (ACM New York USA ) - pp. da 34 a 35 ISBN: 9781450306089 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We introduce KEYRY, a tool for translating keyword queries over structured data sources into queries formulated in their native query language. Since it is not based on analysis of the data source contents, KEYRY finds application in scenarios where sources hold complex and huge schemas, apt to frequent changes, such as sources belonging to the linked open data cloud. KEYRY is based on a probabilistic approach that provides the top-k results that better approximate the intended meaning of the user query.

Francesco Guerra; Sonia Bergamaschi ( 2011 ) - 2nd International Workshop on Data Engineering meets the Semantic [Esposizione (290) - Esposizione]
Abstract

The goal of DESWeb is to bring together researchers and practitioners from both fields of Data Management and Semantic Web. It aims at investigating the new challenges that Semantic Web technologies have introduced and new ways through which these technologies can improve existing data management solutions. Furthermore, it intends to study what data management systems and technologies can offer in order to improve the scalability and performance of Semantic Web applications.

Sonia Bergamaschi; Francesco Guerra; Barry Leiba ( 2010 ) - Guest editors' introduction: Information overload - IEEE INTERNET COMPUTING - n. volume 14(6) - pp. da 10 a 13 ISSN: 1089-7801 [Articolo in rivista (262) - Articolo su rivista]
Abstract

Search the Internet for the phrase “information overload definition,” and Google will return some 7,310,000 results (at the time of this writing). Bing gets 9,760,000 results for the same query. How is it possible for us to process that much data, to select the most interesting information sources, to summarize and combine different facets highlighted in the results, and to answer the questions we set out to ask? Information overload is present in everything we do on the Internet. Despite the number of occurrences of the term on the Internet, peer-reviewed literature offers only a few accurate definitions of information overload. Among them, we prefer the one that defines it as the situation that “occurs for an individual when the information processing demands on time (Information Load, IL) to perform interactions and internal calculations exceed the supply or capacity of time available (Information Processing Capacity, IPC) for such processing.” In other words, when the information available exceeds the user’s ability to process it. This formal definition provides a measure that we can express algebraically as IL > IPC, offering a way for classifying and comparing the different situations in which the phenomenon occurs. But measuring IL and IPC is a complex task because they strictly depend on a set of factors involving both the individual and the information (such as the individual’s skill), as well as the motivations and goals behind the information request. Clay Shirky, who teaches at New York University, takes a different view, focusing on how we sift through the information that’s available to us.
We’ve long had access to “more reading material than you could finish in a lifetime,” he says, and “there is no such thing as information overload, there’s only filter failure.” But however we look at it, whether it’s too much production or failure in filtering, it’s a general and common problem, and information overload management requires the study and adoption of special, user- and context-dependent solutions. Due to the amount of information available that comes with no guarantee of importance, trust, or accuracy, the Internet’s growth has inevitably amplified preexisting information overload issues. Newspapers, TV networks, and press agencies form an interesting example of overload producers: they collectively make available hundreds of thousands of partially overlapping news articles each day. This large quantity gives rise to information overload in a “spatial” dimension (news articles about the same subject are published in different newspapers) and in a “temporal” dimension (news articles about the same topic are published and updated many times in a short time period). The effects of information overload include difficulty in making decisions due to time spent searching and processing information, inability to select among multiple information sources providing information about the same topic, and psychological issues concerning excessive interruptions generated by too many information sources. To put it colloquially, this excess of information stresses Internet users out.

Sonia Bergamaschi; Francesco Guerra; Barry Leiba ( 2010 ) - IEEE Internet Computing Special Issue on Information Overload (IEEE Computer Society Los Alamitos USA ) - IEEE INTERNET COMPUTING - pp. da 10 a 13 ISSN: 1089-7801 [Curatela (284) - Curatela]
Abstract

Search the Internet for the phrase “information overload definition,” and Google will return some 7,310,000 results (at the time of this writing). Bing gets 9,760,000 results for the same query. How is it possible for us to process that much data, to select the most interesting information sources, to summarize and combine different facets highlighted in the results, and to answer the questions we set out to ask? Information overload is present in everything we do on the Internet. Despite the number of occurrences of the term on the Internet, peer-reviewed literature offers only a few accurate definitions of information overload. Among them, we prefer the one that defines it as the situation that “occurs for an individual when the information processing demands on time (Information Load, IL) to perform interactions and internal calculations exceed the supply or capacity of time available (Information Processing Capacity, IPC) for such processing.” In other words, when the information available exceeds the user’s ability to process it. This formal definition provides a measure that we can express algebraically as IL > IPC, offering a way for classifying and comparing the different situations in which the phenomenon occurs. But measuring IL and IPC is a complex task because they strictly depend on a set of factors involving both the individual and the information (such as the individual’s skill), as well as the motivations and goals behind the information request. Clay Shirky, who teaches at New York University, takes a different view, focusing on how we sift through the information that’s available to us.
We’ve long had access to “more reading material than you could finish in a lifetime,” he says, and “there is no such thing as information overload, there’s only filter failure.” But however we look at it, whether it’s too much production or failure in filtering, it’s a general and common problem, and information overload management requires the study and adoption of special, user- and context-dependent solutions. Due to the amount of information available that comes with no guarantee of importance, trust, or accuracy, the Internet’s growth has inevitably amplified preexisting information overload issues. Newspapers, TV networks, and press agencies form an interesting example of overload producers: they collectively make available hundreds of thousands of partially overlapping news articles each day. This large quantity gives rise to information overload in a “spatial” dimension (news articles about the same subject are published in different newspapers) and in a “temporal” dimension (news articles about the same topic are published and updated many times in a short time period). The effects of information overload include difficulty in making decisions due to time spent searching and processing information, inability to select among multiple information sources providing information about the same topic, and psychological issues concerning excessive interruptions generated by too many information sources. To put it colloquially, this excess of information stresses Internet users out.

Sonia Bergamaschi; Elton Domnori; Francesco Guerra; Mirko Orsini; Raquel Trillo Lado; Yannis Velegrakis ( 2010 ) - Keymantic: Semantic Keyword-based Searching in Data Integration Systems [Software (296) - Software]
Abstract

Keymantic is a system for keyword-based searching in relational databases that does not require a-priori knowledge of instances held in a database. It finds numerous applications in situations where traditional keyword-based searching techniques are inapplicable due to the unavailability of the database contents for the construction of the required indexes.

S. Bergamaschi; E. Domnori; F. Guerra; M. Orsini; R. Trillo Lado; Y. Velegrakis ( 2010 ) - Keymantic: Semantic Keyword-based Searching in Data Integration Systems - PROCEEDINGS OF THE VLDB ENDOWMENT - n. volume 3(2) - pp. da 1637 a 1640 ISSN: 2150-8097 [Articolo in rivista (262) - Articolo su rivista]
Abstract

We propose the demonstration of Keymantic, a system for keyword-based searching in relational databases that does not require a-priori knowledge of instances held in a database. It finds numerous applications in situations where traditional keyword-based searching techniques are inapplicable due to the unavailability of the database contents for the construction of the required indexes.

Francesco Guerra; Yannis Velegrakis; Sonia Bergamaschi ( 2010 ) - 1st International Workshop on Data Engineering meets the Semantic Web (DESWeb 2010) [Esposizione (290) - Esposizione]
Abstract

Modern web applications like wikis, social networking sites and mashups are radically changing the nature of the modern Web from a publishing-only environment into a vibrant place for information exchange. The successful exploitation of this information largely depends on the ability to successfully communicate the data semantics, which is exactly the vision of the Semantic Web. In this context, new challenges emerge for semantic-aware data management systems. The contribution of the data management community in the Semantic Web effort is fundamental. RDF has already been adopted as the representation model and exchange format for the semantics of the data on the Web. Although, until recently, RDF had not received considerable attention, the recent publication in RDF format of large ontologies with millions of entities from sites like Yahoo! and Wikipedia, the huge amounts of microformats in RDF from life science organizations, and the gigantic RDF bibliographic annotations from publishers, have made clear the need for advanced management techniques for RDF data. On the other hand, traditional data management techniques have a lot to gain by incorporating semantic information into their frameworks. Existing data integration, exchange and query solutions are typically based on the actual data values stored in the repositories, and not on the semantics of these values. Incorporation of semantics in the data management process improves query accuracy, and permits more efficient and effective sharing and distribution services. Integration of new content, on-the-fly generation of mappings, queries on loosely structured data, keyword searching on structured data repositories, and entity identification, are some of the areas that can benefit from the presence of semantic knowledge alongside the data. The goal of DESWeb is to bring together researchers and practitioners from both fields of Data Management and Semantic Web.
It aims at investigating the new challenges that Semantic Web technologies have introduced and new ways through which these technologies can improve existing data management solutions. Furthermore, it intends to study what data management systems and technologies can offer in order to improve the scalability and performance of Semantic Web applications.

S. Bergamaschi; F. Guerra; M. Orsini; C. Sartori; M. Vincini ( 2009 ) - An ETL tool based on semantic analysis of schemata and instances ( International Conference on Knowledge-based and Intelligent Information & Engineering Systems (KES 2009) - Santiago, Chile - September 28-30, 2009) ( - Knowledge-Based and Intelligent Information and Engineering Systems ) (Springer Heidelberg DEU ) - n. volume 5712 - pp. da 58 a 65 ISBN: 9783642045912 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we propose a system supporting the semi-automatic definition of inter-attribute mappings and transformation functions used as an ETL tool in a data warehouse project. The tool supports both schema level analysis, exploited for the mapping definitions amongst the data sources and the data warehouse, and instance level operations, exploited for defining transformation functions that integrate data coming from multiple sources in a common representation. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.

Francesco Guerra; Sonia Bergamaschi; Mirko Orsini; Claudio Sartori; Maurizio Vincini ( 2009 ) - Improving Extraction and Transformation in ETL by Semantic Analysis ( European Conference on Knowledge Management - Vicenza, Italy - 3-4 September 2009) ( - Proceedings of the 10th European Conference on Knowledge Management ) (Academic Publishing Limited non disponibile GBR ) - pp. da 347 a 355 ISBN: 9781906638399 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Extraction, Transformation and Loading (ETL) processes are crucial for data warehouse consistency and are typically based on constraints and requirements expressed in natural language in the form of comments and documentation. This task is poorly supported by automatic software applications, which makes these activities a heavy workload for data warehouse administrators. In a traditional business scenario, this is not a major issue, since the sources populating a data warehouse are fixed and directly known by the data administrator. Nowadays, actual business needs require enterprise information systems to be highly flexible with respect to the business analyses they allow and the data they handle. Temporary alliances of enterprises, market analysis processes, and the availability of data on the Internet push enterprises to quickly integrate unexpected data sources for their activities. Therefore, the reference scenario for data warehouse systems changes radically, since the data sources populating the data warehouse may not be directly known and managed by the designers. This creates new requirements for ETL tools: improved automation of the extraction and transformation process, the need to manage heterogeneous attribute values, and the ability to manage different kinds of data sources, ranging from DBMSs to flat files, XML documents and spreadsheets. In this paper we propose a semantic-driven tool that couples and extends the functionalities of two systems: the MOMIS integration system and the RELEVANT data analysis system. The tool aims at supporting the semi-automatic definition of ETL inter-attribute mappings and transformations in a data warehouse project. By means of a semantic analysis, two tasks are performed: 1) identification of the parts of the schemata of the data sources which are related to the data warehouse; 2) support for the definition of transformation rules for populating the data warehouse. We tested the approach in a real scenario: preliminary qualitative results show that our tool can support the data warehouse administrator’s work by considerably reducing the data warehouse design time.

F. GUERRA; BERGAMASCHI S; ORSINI M; SALA A; SARTORI C ( 2009 ) - Keymantic: A keyword Based Search Engine using Structural Knowledge ( International Conference on Enterprise Information Systems - Milano, Italia - 6-10 Maggio 2009) ( - International Conference on Enterprise Information Systems ) (INSTICC Setubal PRT ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Traditional techniques for query formulation require knowledge of the database contents, i.e. which data are stored in the data source and how they are represented. In this paper, we discuss the development of a keyword-based search engine for structured data sources. The idea is to couple the ease of use and flexibility of keyword-based search with metadata extracted from data schemata and extensional knowledge, which together constitute a semantic network of knowledge. By translating keywords into SQL statements, we will develop a search engine that is effective, semantic-based, and applicable also when instances are not continuously available, such as in integrated data sources or in data sources extracted from the deep web.
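
As a rough illustration of the idea of translating keywords into SQL by relying on schema metadata only, the following Python sketch matches each keyword against table and column names and treats unmatched keywords as values. It is not the Keymantic algorithm: the schema, the string-similarity measure, the threshold, and the function names `best_schema_term` and `keywords_to_sql` are all assumptions made for this example.

```python
from difflib import SequenceMatcher

# Hypothetical schema metadata (table name -> column names); not from the paper.
SCHEMA = {
    "person": ["name", "city", "phone"],
    "department": ["title", "director", "address"],
}


def similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_schema_term(keyword, schema, threshold=0.8):
    """Return (table, column) for the best-matching schema term, or None.

    The column is None when the keyword matches a table name.
    """
    best, best_score = None, threshold
    for table, columns in schema.items():
        candidates = [(table, (table, None))]
        candidates += [(col, (table, col)) for col in columns]
        for term, target in candidates:
            score = similarity(keyword, term)
            if score >= best_score:
                best, best_score = target, score
    return best


def keywords_to_sql(keywords, schema=SCHEMA):
    """Build a parameterized single-table query from a keyword list.

    Keywords matching schema terms select the table or a column; a keyword
    that matches nothing is bound as the value of the last matched column.
    """
    table, pending_column = None, None
    conditions, params = [], []
    for kw in keywords:
        match = best_schema_term(kw, schema)
        if match is not None:
            if table is None:
                table = match[0]
            if match[0] == table and match[1] is not None:
                pending_column = match[1]
        elif pending_column is not None:
            conditions.append(f"{pending_column} = ?")
            params.append(kw)
            pending_column = None
    if table is None:
        raise ValueError("no keyword matches a table or column name")
    sql = f"SELECT * FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


sql, params = keywords_to_sql(["person", "city", "Modena"])
print(sql, params)
```

Because the sketch never looks at the stored instances, it can run when only the schema is available, which is the situation the abstract highlights; a realistic system would also need to rank alternative interpretations and handle joins.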

F. Guerra; A. Maurino; M. Palmonari; G. Pasi; A. Sala ( 2009 ) - Searching for Data and Services ( International Workshop on Interoperability through Semantic Data and Service Integration - Camogli - June 25th, 2009) ( - Proceedings of the 1st International Workshop on Interoperability through Semantic Data and Service Integration ) (Atti informali Camogli (Genova) ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The increasing availability of data and eServices on the Web allows users to search for relevant information and to perform operations through eServices. Current technologies do not support users in the execution of such activities as a single task; thus users first have to find interesting information, and then, as a separate activity, find and use eServices. In this paper we present a framework able to query an integrated view of heterogeneous data and to search for eServices related to retrieved data.

S. Bergamaschi; F. Guerra; M. Orsini; C. Sartori; M. Vincini ( 2009 ) - Semantic Analysis for an Advanced ETL framework ( International Workshop on Interoperability through Semantic Data and Service Integration - Camogli (Genova) - June 25th, 2009) ( - Proceedings of the 1st International Workshop on Interoperability through Semantic Data and Service Integration ) (atti informali Camogli (Genova) ITA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we propose a system supporting the semi-automatic definition of inter-attribute mappings and transformation functions used as an ETL tool in a data warehouse project. The tool supports both schema level analysis, exploited for the mapping definitions amongst the data sources and the data warehouse, and instance level operations, exploited for defining transformation functions that integrate data coming from multiple sources in a common representation. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.

D. Beneventano; F. Guerra; A. Maurino; M. Palmonari; G. Pasi; A. Sala ( 2009 ) - Unified Semantic Search of Data and Services ( International Conference on Metadata and Semantics Research (MTSR 2009) - Milano, Italia - September 30 - October 2) ( - Proceedings of the 3rd International Conference on Metadata and Semantics Research 2009 ) (Springer-Verlag Heidelberg DEU ) - pp. da 95 a 107 ISBN: 9783642045899 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The increasing availability of data and eServices on the Web allows users to search for relevant information and to perform operations through eServices. Current technologies do not support users in the execution of such activities as a single task; thus users first have to find interesting information, and then, as a separate activity, find and use eServices. In this paper we present a framework able to query an integrated view of heterogeneous data and to search for eServices related to retrieved data. A unified view of data and semantically described eServices is what makes it possible to bring the data and service perspectives together.

Sonia Bergamaschi; Francesco Guerra; Federica Mandreoli; Maurizio Vincini ( 2009 ) - Working in a dynamic environment: the NeP4B approach as a MAS ( Agents and Peer-to-Peer Computing (AP2PC 2009) - Budapest, Hungary - May 11, 2009) ( - Proceedings of the eighth International Workshop on Agents and Peer-to-Peer Computing ) (Springer, Verlag Berlin DEU ) - pp. da 117 a 130 ISBN: 9783642318085 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Integration of heterogeneous information in the context of the Internet is becoming a key activity to enable a more organized and semantically meaningful access to several kinds of information in the form of data sources, multimedia documents and web services. In NeP4B (Networked Peers for Business), a project funded by the Italian Ministry of University and Research, we developed an approach for providing a uniform representation of data, multimedia and services, thus allowing users to obtain sets of data, multimedia documents and lists of web services as query results. NeP4B is based on a P2P network of semantic peers, connected with one another by means of automatically generated mappings. In this paper we present a new architecture for NeP4B, based on a Multi-Agent System. We claim that such a solution may be more efficient and effective, thanks to the agents’ autonomy and intelligence, in a dynamic environment, where sources are frequently added (or deleted) to (from) the network.

D. BENEVENTANO; BERGAMASCHI S; CLAUDIO GENNARO; FRANCESCO GUERRA; MATTEO MORDACCHINI; ANTONIO SALA ( 2008 ) - A Mediator System for Data and Multimedia Sources ( Data Integration through Semantic Technology - Bangkok - 08 december 2008) ( - DATA INTEGRATION THROUGH SEMANTIC TECHNOLOGY ) (Data Integration through Semantic Technology - Workshop at the 3rd Asian Semantic Web Conference BANGKOK THA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Managing data and multimedia sources with a unique tool is a challenging issue. In this paper, the capabilities of the MOMIS integration system and the MILOS multimedia content management system are coupled, thus providing a methodology and a tool for building and querying an integrated virtual view of data and multimedia sources.

BENEVENTANO D; F. GUERRA; C. GENNARO ( 2008 ) - A Methodology for Building and Querying an Ontology representing Data and Multimedia Sources ( ODBIS Workshop on Ontologies-based Techniques for DataBases in Information and Knowledge Systems - Auckland (New Zealand) - 23 August) ( - Proceedings of 4th ODBIS Workshop on Ontologies-based Techniques for DataBases in Information Systems and Knowledge Systems ) (VLDB Endowment (workshop co-located with the VLDB conference) AUCKLAND NZL ) - pp. da 37 a 41 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Managing data and multimedia sources with a unique tool is a challenging issue. In this paper, the capabilities of the MOMIS integration system and the MILOS multimedia content management system are coupled, thus providing a methodology and a tool for building and querying a populated ontology representing data and multimedia sources.

Adrián Perreau de Pinninck; Francesco Guerra; Gianluca Moro ( 2008 ) - Eighth International Workshop on Agents and Peer-to-Peer Computing (AP2PC09) [Esposizione (290) - Esposizione]
Abstract

P2P networking is the term being used to describe a new crop of decentralized approaches to self-organize large overlay networks where participants can share and exploit enormous autonomous resources. At their heart P2P systems embody the earliest principles of the internet, decentralised systems of similarly enabled 'peers'. What makes P2P networking different is that the times have changed; the number of peers involved has multiplied, their rate of turn-over has increased, and they now operate as an overlay within the network application layer. New techniques such as distributed hash-tables (DHTs), semantic routing, and Plaxton Meshes are being combined with traditional concepts such as Hypercubes, Trust Metrics and caching techniques to pool together the untapped computing power at the "edges" of the internet. The possibilities of this paradigm have generated a lot of interest in research, industrial and social networks. P2P network collaboration is redefining the way of communicating, publishing, doing business and building collective knowledge, thanks mainly to the advent of free or affordable technologies. For instance, the major film studios and the music corporations, after realizing the economic potential of P2P networks, have started selling their products online. Citizen journalism is an example based on P2P interactions, in which the idea is that people without professional journalism training can use the tools of modern technology and the global distribution of the Internet to create, augment or fact-check media on their own or in collaboration with others; P2P reputation-based mechanisms are used to validate facts/news. P2P lending allows a person to skip the bank and borrow from individuals; people can borrow from complete strangers or just use P2P lending services to structure loans between friends and family (e.g. Booper, Zopa, Kiva). Recently, projects based on P2P architectures for exchanging and sharing knowledge among companies (e.g. NeP4B) have been funded; companies of any nature, size and geographic location will be able to search for partners, exchange data, negotiate and collaborate without limitations and constraints. For these and other similar phenomena, the term commons-based peer production has been coined at Harvard Law School to describe a new model of economic production in which the creative energy of large numbers of people is coordinated into large, meaningful projects, mostly without traditional hierarchical organization or financial compensation. The Internet is going to be revolutionized by applications able to harness the power of P2P networking to bring together communities of people and organizations with similar interests or goals, and agent technology offers the potential for developing such systems. In P2P computing, peers and services organise themselves dynamically without central coordination in order to foster knowledge sharing and collaboration, both in cooperative and non-cooperative environments. The success of P2P systems strongly depends on a number of factors. First, the ability to ensure equitable distribution of content and services. Economic and business models which rely on incentive mechanisms to supply contributions to the system are being developed, along with methods for controlling the "free riding" issue. Second, the ability to enforce provision of trusted services. Reputation-based P2P trust management models are becoming a focus of the research community as a viable solution. The trust models must balance both constraints imposed by the environment (e.g. scalability) and the unique properties of trust as a social and psychological phenomenon. Recently, we are also witnessing a move of the P2P paradigm to embrace mobile computing and sensor networks in an attempt to achieve even higher ubiquity. The possibility of services related to physical location and the relation with agents in physical proximity introduces new opportunities and also new technical

Jorge Cardoso; Christoph Bussler; Francesco Guerra ( 2008 ) - Search using Metadata, Semantic, and Ontologies (Inderscience publisher Olney GBR ) - pp. da 1 a 2 ISBN: 17442621 [Curatela (284) - Curatela]
Abstract

Traditional search techniques establish a direct connection between the information provided by users and the search engine. Users are only allowed to specify a set of keywords that will be syntactically matched against a database of keywords and references. This simple approach has several drawbacks, since it gives rise to low precision (the ratio of positive results retrieved to the total number of results retrieved, whether false or positive) and low recall (the ratio of positive results retrieved to the total number of positive results in the reference base). Many factors influence this low precision and recall, notably polysemy and synonymy. In the first case, one word specified in a query might have several meanings and, in the second case, distinct words may designate the same concept. If appropriate strategies are used and included in a new generation of search engines, the number of false results can be drastically reduced. As a result, the impact of these two degrading factors can be reduced and even eliminated. As the interconnection of research areas such as artificial intelligence, semantic web, and linguistics becomes stronger and more mature, it is reasonable to explore how better search engines can be developed to respond more adequately to users’ needs. A new kind of search engine that has been explored for a few years now has been termed “semantic-based search engines” by many researchers. The underlying paradigm of these engines is to find resources based on similar concepts and logical relationships and not just similar words. These engines typically rely on the use of metadata, controlled vocabularies, thesauri, taxonomies, and ontologies to describe the searchable resources to ensure that the most relevant items of information are returned. The intent of this special issue is to bring together a compilation of recent research and developments toward the creation of a new paradigm for search engines that relies on metadata, semantics and ontologies, by providing readers with a “broad spectrum vision” of the most important issues on semantic search engines. One of the main problems concerns the recognition of items of interest in web documents.
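
The two measures defined in this abstract can be computed directly from the set of retrieved results and the set of relevant (positive) results. The Python sketch below is only an illustration; the function name `precision_recall` and the document identifiers are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for one query.

    precision = relevant results retrieved / all results retrieved
    recall    = relevant results retrieved / all relevant results
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four documents returned, three relevant in the reference base.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5"],
)
print(precision, recall)
```

Here two of the four retrieved documents are relevant, so precision is 0.5, while two of the three relevant documents were found, so recall is about 0.67.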

Sonia Bergamaschi; Francesco Guerra; Yannis Velegrakis ( 2008 ) - 2nd International Workshop on Semantic Web Architectures For Enterprises [Esposizione (290) - Esposizione]
Abstract

The Semantic Web vision aims at building a "web of data", where applications may share their data on the Internet and relate them to real world objects for interoperability and exchange. Similar ideas have been applied to web services, where different modeling architectures have been proposed for adding semantics to web service descriptions, making services on the web widely available. The potential impact envisaged by these approaches on real business applications is also important in areas such as the following. Semantic-based business integration: business integration allows enterprises to share their data and services with other enterprises for business purposes. Making data and services available satisfies both "structural" requirements of enterprises (e.g. the possibility of sharing data about products or about available services), and "dynamic" requirements (e.g. business-to-business partnerships to execute an order). Information systems implementing semantic web architectures can enable and strongly support this process. Semantic interoperability: metadata and ontologies support the dynamic and flexible exchange of data and services across information systems of different organizations. Adding semantics to representations of data and services allows accurate data querying and service discovery. Semantic-based lifecycle management: metadata, ontologies and rules are becoming an effective way for modeling corporate processes and business domains, effectively supporting the maintenance and evolution of business processes, corporate data, and knowledge. Knowledge management: ontologies and automated reasoning tools seem to provide an innovative support to the elicitation, representation and sharing of corporate knowledge. SWAE (Semantic Web Architectures for Enterprises) aims at evaluating how and how much the Semantic Web vision has met its promises with respect to business and market needs. 
Papers and demonstrations of interest for the workshop will show and highlight the interactions between Semantic Web technologies and business applications. The workshop aims at collecting models, tools, use cases and practical experience in which Semantic Web techniques have been developed and applied to support any relevant business processes. It aims at assessing their degree of success, the challenges that have been addressed, the solutions that have been provided and the new tools that have been implemented. Special attention will be paid to proposals of “complete architecture”, i.e. applications that can effectively support the maintenance and evolution of business processes as a whole and applications that are able to combine representations of data and services in order to realize a common business knowledge management system.

S. BERGAMASCHI; F. GUERRA; M. ORSINI; C. SARTORI ( 2007 ) - A new type of metadata for querying data integration systems ( Convegno Nazionale Sistemi di Basi di Dati Evolute - Torre Canne (Fasano, BR) - 17-20 June 2007) ( - SEBD2007 ) (Michelangelo Ceci, Donato Malerba, Letizia Tanca Bari ITA ) - pp. da 266 a 273 ISBN: 9788890298103 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Research on data integration has provided languages and systems able to guarantee an integrated intensional representation of a given set of data sources. A significant limitation common to most proposals is that only intensional knowledge is considered, with little or no consideration for extensional knowledge. In this paper we propose a technique to enrich the intension of an attribute with a new sort of metadata: the “relevant values”, extracted from the attribute values. Relevant values enrich schemata with domain knowledge; moreover they can be exploited by a user in the interactive process of creating/refining a query. The technique, fully implemented in a prototype, is automatic, independent of the attribute domain and it is based on data mining clustering techniques and emerging semantics from data values. It is parametrized with various metrics for similarity measures and is a viable tool for dealing with frequently changing sources.

S. BERGAMASCHI; F. GUERRA; M. ORSINI; C. SARTORI ( 2007 ) - Extracting Relevant Attribute Values for Improved Search - IEEE INTERNET COMPUTING - n. volume 11 (5) - pp. da 26 a 35 ISSN: 1089-7801 [Articolo in rivista (262) - Articolo su rivista]
Abstract

A new kind of metadata offers a synthesized view of an attribute's values for a user to exploit when creating or refining a search query in data-integration systems. The extraction technique that obtains these values is automatic and independent of an attribute domain but parameterized with various metrics for similarity measures. The authors describe a fully implemented prototype and some experimental results to show the effectiveness of "relevant values" when searching a knowledge base.

Sonia Bergamaschi; Paolo Bouquet; Daniel Giacomuzzi; Francesco Guerra; Laura Po; Maurizio Vincini ( 2007 ) - MELIS: a tool for the incremental annotation of domain ontologies [Software (296) - Software]
Abstract

MELIS is a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is its incrementality: the higher the number of schemas which are processed, the more background/domain knowledge is accumulated in the system (a portion of domain ontology is learned at every step), and the better the performance of the system on annotating new schemas.

S. BERGAMASCHI; P. BOUQUET; D. GIACOMUZZI; F. GUERRA; L. PO; M. VINCINI ( 2007 ) - MELIS: An Incremental Method For The Lexical Annotation Of Domain Ontologies ( Web Information Systems and Technologies (WEBIST 2007) - Barcelona, Spain - March 3-6, 2007) ( - Proceedings of the Third International Conference on Web Information Systems and Technologies ) (for Systems and Technologies of Information, Control and Communication Setubal PRT ) - pp. da 240 a 247 ISBN: 978-972-8865-78-8 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is its incrementality: the higher the number of schemas which are processed, the more background/domain knowledge is accumulated in the system (a portion of domain ontology is learned at every step), and the better the performance of the system on annotating new schemas. MELIS has been tested as a component of MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database. We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results of MELIS as a standalone tool and as a component integrated in MOMIS.

S. Bergamaschi; P. Bouquet; D. Giacomuzzi; F. Guerra; L. Po; M. Vincini ( 2007 ) - Melis: an incremental method for the lexical annotation of domain ontologies - INTERNATIONAL JOURNAL ON SEMANTIC WEB AND INFORMATION SYSTEMS - n. volume 3 - pp. da 57 a 80 ISSN: 1552-6283 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is the incremental process: the higher the number of schemas which are processed, the more background/domain knowledge is accumulated in the system (a portion of domain ontology is learned at every step), and the better the performance of the system on annotating new schemas. MELIS has been tested as a component of MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database. We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results of MELIS as a standalone tool and as a component integrated in MOMIS.

D. Beneventano; S. Bergamaschi; F. Guerra; M. Vincini ( 2007 ) - Progetto di Basi di Dati Relazionali (Pitagora Bologna ITA ) - pp. da 1 a 345 ISBN: 9788837116804 [Monografia o trattato scientifico (276) - Monografia/Trattato scientifico]
Abstract

The aim of this volume is to provide the reader with the fundamental notions of designing and implementing relational database applications. With regard to design, the conceptual and logical design phases are covered, and the Entity-Relationship and Relational data models are presented, which are the basic tools for conceptual design and logical design, respectively. The student is also introduced to the theory of normalization of relational databases. With regard to implementation, elements and examples of SQL, the standard language for RDBMSs (Relational Database Management Systems), are presented. Ample space is devoted to worked exercises on the topics covered. The volume stems from the authors' many years of teaching experience in the Databases and Information Systems courses for students of the bachelor's and master's degree programmes of the Faculty of Engineering of Modena, the Faculty of Engineering of Reggio Emilia and the "Marco Biagi" Faculty of Economics of the University of Modena and Reggio Emilia. The current volume considerably extends the previous editions, enriching the sections on logical design and SQL. The exercise section is completely new; in addition, further exercises are available on this web page. Like the previous editions, it is more a collection of notes than a true book, in the sense that it treats the concepts in a rigorous but essential way. Moreover, it does not cover all the topics of a Databases course, whose other fundamental component is database technology. This component is, in the authors' opinion, treated excellently in another Databases textbook, written by our colleagues and friends Paolo Ciaccia and Dario Maio of the University of Bologna. The volume, despite its conciseness, is rich in worked exercises and can therefore be an excellent tool for working groups in software houses that deal with the design of relational database applications.

S. BERGAMASCHI; GUERRA F; ORSINI M.; SARTORI C; VINCINI M ( 2007 ) - Relevant News: a semantic news feed aggregator ( Semantic Web Applications and Perspectives - Bari - 18 - 20 Dicembre 2007) ( - Semantic Web Applications and Perspectives - Proceedings of the 4th Italian Semantic Web Workshop ) (Giovanni Semeraro, Eugenio Di Sciascio, Christian Morbidoni, Heiko Stoemer BARI ITA ) - n. volume 314 - pp. da 150 a 159 ISBN: 16130073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In this paper we present RELEVANTNews, a web feed reader that automatically groups news items related to the same topic published in different newspapers on different days. The tool is based on RELEVANT, a previously developed tool, which computes the “relevant values”, i.e. a subset of the values of a string attribute. By clustering the titles of the news feeds selected by the user, it is possible to identify sets of related news items on the basis of syntactic and lexical similarity. RELEVANTNews may be used in its default configuration or in a personalized way: the user may tune some parameters in order to improve the grouping results. We tested the tool with more than 700 news items published in 30 newspapers over four days, and some preliminary results are discussed.
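
To give a feel for grouping news titles by syntactic similarity, the Python sketch below uses Jaccard similarity over word sets and a simple greedy grouping pass. This is a stand-in technique chosen for brevity, not the clustering or similarity measures actually used by RELEVANT; the threshold, the function names, and the sample titles are assumptions for illustration.

```python
import re


def tokens(title):
    """Lower-cased set of alphanumeric words in a title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_titles(titles, threshold=0.4):
    """Greedy grouping: a title joins the first group that contains a
    sufficiently similar title, otherwise it starts a new group."""
    groups = []
    for title in titles:
        title_tokens = tokens(title)
        for group in groups:
            if any(jaccard(title_tokens, tokens(other)) >= threshold
                   for other in group):
                group.append(title)
                break
        else:
            groups.append([title])
    return groups


# Hypothetical feed titles from different newspapers.
titles = [
    "Earthquake hits central Italy",
    "Strong earthquake hits central Italy overnight",
    "Stock markets close higher",
    "Markets close higher after rate cut",
]
for group in group_titles(titles):
    print(group)
```

The threshold plays the role of a tunable parameter of the kind the abstract mentions: lowering it merges more titles into the same group, raising it splits them apart.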

Sonia Bergamaschi; Claudio Sartori; Francesco Guerra; Mirko Orsini ( 2007 ) - RELEvant VAlues geNeraTor [Software (296) - Software]
Abstract

A new kind of metadata offers a synthesized view of an attribute's values for a user to exploit when creating or refining a search query in data-integration systems. The extraction technique that obtains these values is automatic and independent of an attribute domain but parameterized with various metrics for similarity measures.

S. BERGAMASCHI; F. GUERRA; M. ORSINI; C. SARTORI ( 2007 ) - Relevant values: new metadata to provide insight on attribute values at schema level ( International Conference on Enterprise Information Systems - Funchal, Madeira - 12-16, June 2007) ( - Proceedings of the 9th International Conference on Enterprise Information Systems ) (INSTICC - Institute for Systems and Technologies of Information, Control and Communication Lisbona PRT ) - pp. da 274 a 279 ISBN: 9789728865887 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Research on data integration has provided languages and systems able to guarantee an integrated intensional representation of a given set of data sources. A significant limitation common to most proposals is that only intensional knowledge is considered, with little or no consideration for extensional knowledge. In this paper we propose a technique to enrich the intension of an attribute with a new sort of metadata: the “relevant values”, extracted from the attribute values. Relevant values enrich schemata with domain knowledge; moreover they can be exploited by a user in the interactive process of creating/refining a query. The technique, fully implemented in a prototype, is automatic, independent of the attribute domain and it is based on data mining clustering techniques and emerging semantics from data values. It is parametrized with various metrics for similarity measures and is a viable tool for dealing with frequently changing sources, as in the Semantic Web context.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2007 ) - The SEWASIE MAS for Semantic Search ( First International Workshop on Agent supported Cooperative Work (ACW 2007) - Lyon -France - 29 October) ( - Proceedings of the Second IEEE International Conference on Digital Information Management ) (IEEE Engineering Management Society Los Alamitos, California USA ) - n. volume 2 - pp. da 793 a 798 ISBN: 9781424414765 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The capillary diffusion of the Internet has made an overwhelming amount of data accessible, allowing users to benefit from vast information. However, information is not really directly available: internet data are heterogeneous and spread over different places, with several duplications and inconsistencies. The integration of such heterogeneous, inconsistent data, with data reconciliation and data fusion techniques, may therefore represent a key activity enabling a more organized and semantically meaningful access to data sources. Some issues remain to be solved, concerning in particular the discovery and the explicit specification of the relationships between abstract data concepts and the need for data reliability in a dynamic, constantly changing network. Ontologies provide a key mechanism for solving these challenges, but the web’s dynamic nature leaves open the question of how to manage them. Many solutions based on ontology creation by a mediator system have been proposed: a unified virtual view (the ontology) of the underlying data sources is obtained, giving users transparent access to the integrated data sources. The centralized architecture of a mediator system presents several limitations, emphasized in the hidden web: firstly, web data sources hold information according to their particular view of the matter, i.e. each of them uses a specific ontology to represent its data. Also, data sources are usually isolated, i.e. they do not share any topological information concerning the content or structure of other sources. Our proposal is to develop a network of ontology-based mediator systems, where mediators are not isolated from each other and include tools for sharing and mapping their ontologies. In this paper, we describe the use of a multi-agent architecture to achieve and manage the mediators network. The functional architecture is composed of single peers (implemented as mediator agents) independently carrying out their own integration activities. Such agents may then exchange data and knowledge with other peers by means of specialized agents (called brokering agents) which provide a coherent access plan to the peer network. In this way, two layers are defined in the architecture: at the local level, peers maintain an integrated view of local sources; at the network level, agents maintain mappings among the different peers. The result is the definition of a new type of mediator system network intended to operate in web economies, which we realized within SEWASIE (SEmantic Webs and AgentS in Integrated Economies), an RDT project supported by the 5th Framework IST program of the European Community, which ended successfully in September 2005.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2007 ) - The SEWASIE Network of Mediator Agents for Semantic Search - JOURNAL OF UNIVERSAL COMPUTER SCIENCE - n. volume 13 (12) - pp. da 1936 a 1969 ISSN: 0948-695X [Articolo in rivista (262) - Articolo su rivista]
Abstract

Integration of heterogeneous information in the context of the Internet becomes a key activity to enable a more organized and semantically meaningful access to data sources. As the Internet can be viewed as a data-sharing network where sites are data sources, the challenge is twofold. Firstly, sources present information according to their particular view of the matter, i.e. each of them assumes a specific ontology. Secondly, data sources are usually isolated, i.e. they do not share any topological information concerning the content or the structure of other sources. The classical approach to solve these issues is provided by mediator systems, which aim at creating a unified virtual view of the underlying data sources in order to hide the heterogeneity of data and give users transparent access to the integrated information. In this paper we propose to use a multi-agent architecture to build and manage a mediator network. While a single peer (i.e. a mediator agent) independently carries out data integration activities, it exchanges knowledge with other peers by means of specialized agents (i.e. brokers) which provide a coherent plan for accessing information in the peer network. This defines two layers in the system: at the local level, peers maintain an integrated view of local sources, while at the network level agents maintain mappings among the different peers. The result is the definition of a new networked mediator system intended to operate in web economies, which we realized in the SEWASIE (SEmantic Webs and AgentS in Integrated Economies) project. SEWASIE is an RDT project supported by the 5th Framework IST program of the European Community, which ended successfully in September 2005.

M. PALMONARI; F. GUERRA; A. TURATI; A. MAURINO; D. BENEVENTANO; E. DELLA VALLE; A. SALA; D. CERIZZA ( 2007 ) - Toward a Unified View of Data and Services ( First International Workshop on Semantic Data and Service Integration - Vienna, Austria. - September 23, 2007) ( - In proceedings of the First International Workshop on Semantic Data and Service Integration, Co-located with VLDB 2007, Vienna, Austria. September 23, 2007 ) (Serge Abiteboul, Monica Scannapieco Vienna AUS ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We propose an approach for describing a unified view of data and services in a peer-to-peer environment. The research areas of data and services are usually represented with different models and queried by different tools with different requirements. Our approach aims at providing the user with a “complete” knowledge (in terms of data and services) of a domain. Our proposal is not an alternative to the techniques developed for representing and querying integrated data and discovering services, but works in conjunction with them by improving the user's knowledge. We are experimenting with the approach within the Italian FIRB project NeP4B (Networked Peers for Business), which aims at developing an advanced technological infrastructure to enable companies to search for partners, exchange data, negotiate and collaborate without limitations and constraints.

Sonia Bergamaschi; Paolo Bouquet; Francesco Guerra ( 2007 ) - 1st International Workshop on Semantic Web Architectures For Enterprises [Esposizione (290) - Esposizione]
Abstract

SWAE aims at evaluating how and how much the Semantic Web vision has met its promises with respect to business and market needs. Even though the Semantic Web is a relatively new branch of scientific and technological research, its relevance has already been envisaged for some crucial business processes:
Semantic-based business data integration: data integration satisfies both "structural" requirements of enterprises (e.g. the possibility of consulting their data in a unified manner) and "dynamic" requirements (e.g. business-to-business partnerships to execute an order). Information systems implementing semantic web architectures can strongly support this process, or simply enable it.
Semantic interoperability: metadata and ontologies support the dynamic and flexible exchange of data and services across information systems of different organizations. The development of applications for the automatic classification of services and goods on the basis of standard hierarchies, and the translation of such classifications into the different standards used by companies, is a clear example of the potential for semantic interoperability methods and tools.
Knowledge management: ontologies and automated reasoning tools seem to provide an innovative support to the elicitation, representation and sharing of corporate knowledge, in particular for the shift from a document-centric KM approach to an entity-centric one.
Enterprise and process modeling: ontologies and rules are becoming an effective way for modeling corporate processes and business domains (for example, in cost reduction).
The goal of the workshop is to evaluate and assess how deeply Semantic Web models, languages, technologies and applications have permeated effective enterprise business applications.
It also aims to identify how semantic web based systems, methods and theories sustain business applications such as decision processes, workflow management processes, accountability, and production chain management. Particular attention will be dedicated to metrics and criteria that evaluate the cost-effectiveness of system design processes, knowledge encoding and management, system maintenance, etc.

S. BERGAMASCHI; P. BOUQUET; D. GIACOMUZZI; F. GUERRA; L. PO; M. VINCINI ( 2006 ) - An incremental method for meaning elicitation of a domain ontology ( Semantic Web Applications and Perspectives (SWAP 2006) - Scuola Normale Superiore, PISA - 18-20 December, 2006) ( - Proceedings of the 3rd Italian Semantic Web Workshop ) (CEUR-WS.org ) - n. volume 201 - pp. da 1 a 8 ISBN: 16130073 ISSN: 1613-0073 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Internet has opened access to an overwhelming amount of data, requiring the development of new applications to automatically recognize, process and manage information available in web sites or web-based applications. The standard Semantic Web architecture exploits ontologies to give a shared (and known) meaning to each web source element. In this context, we developed MELIS (Meaning Elicitation and Lexical Integration System). MELIS couples the lexical annotation module of the MOMIS system with some components from CTXMATCH2.0, a tool for eliciting meaning from several types of schemas and matching them. MELIS uses the MOMIS WNEditor and CTXMATCH2.0 to support two main tasks in the MOMIS ontology generation methodology: the source annotation process, i.e. the operation of associating an element of a lexical database to each source element, and the extraction of lexical relationships among elements of different data sources.

S. Bergamaschi; G. Gelati; F. Guerra; M. Vincini ( 2006 ) - An intelligent data integration approach for collaborative project management in virtual enterprises - WORLD WIDE WEB - n. volume 9(1) - pp. da 35 a 61 ISSN: 1386-145X [Articolo in rivista (262) - Articolo su rivista]
Abstract

The increasing globalization and flexibility required by companies have generated new issues in the last decade related to the management of large scale projects and to the cooperation of enterprises within geographically distributed networks. ICT support systems are required to help enterprises share information, guarantee data consistency and establish synchronized and collaborative processes. In this paper we present a collaborative project management system that integrates data coming from aerospace industries with one main goal: to facilitate the assembly, integration and verification of a multi-enterprise project. The main achievement of the system from a data management perspective is that it avoids inconsistencies generated by updates at the source level and minimizes data replication. The developed system is composed of a collaborative project management component supported by a web interface, a multi-agent data integration system, which supports information sharing and querying, and web services that ensure the interoperability of the software components. The system was developed by the University of Modena and Reggio Emilia and Gruppo Formula S.p.A., and tested by Alenia Spazio S.p.A. within the EU WINK Project (Web-linked Integration of Network based Knowledge-IST-2000-28221).

Domenico Beneventano; Sonia Bergamaschi; Stefania Bruschi; Francesco Guerra; Mirko Orsini; Maurizio Vincini ( 2006 ) - Instances Navigation for Querying Integrated Data from Web-Sites ( - Web Information Systems and Technologies, International Conferences, WEBIST 2005 and WEBIST 2006. Revised Selected Papers ) (Springer Heidelberg DEU ) - pp. da 125 a 137 ISBN: 9783540323013; 9783540740629 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Research on data integration has provided a set of rich and well understood schema mediation languages and systems that provide a meta-data representation of the modeled real world, while, in general, they do not deal with data instances. Such meta-data are necessary for querying the classes resulting from an integration process: the end user typically does not know the contents of such classes and simply defines queries on the basis of the names of classes and attributes. In this paper we introduce an approach that enriches the description of selected attributes by specifying as meta-data a list of the “relevant values” for such attributes. Furthermore, relevant values may be hierarchically collected in a taxonomy. In this way, the user may exploit new meta-data in the interactive process of creating/refining a query. The same meta-data are also exploited by the system in the query rewriting/unfolding process in order to filter the results shown to the user. We conducted an evaluation of the strategy in an e-business context within the EU-IST SEWASIE project. The evaluation proved the practicability of the approach for large value instances.

D. BENEVENTANO; S. BERGAMASCHI; S. BRUSCHI; F. GUERRA; M. ORSINI; M. VINCINI ( 2006 ) - Instances navigation for querying integrated data from web-sites ( International Conference on Web Information Systems - Setubal, Portugal, - April 11-13, 2006) ( - International Conference on Web Information Systems and Technologies ) (INSTICC Setubal PRT ) - pp. da 46 a 53 ISBN: 9789728865467 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Research on data integration has provided a set of rich and well understood schema mediation languages and systems that provide a meta-data representation of the modeled real world, while, in general, they do not deal with data instances. Such meta-data are necessary for querying the classes resulting from an integration process: the end user typically does not know the contents of such classes and simply defines queries on the basis of the names of classes and attributes. In this paper we introduce an approach that enriches the description of selected attributes by specifying as meta-data a list of the “relevant values” for such attributes. Furthermore, relevant values may be hierarchically collected in a taxonomy. In this way, the user may exploit new meta-data in the interactive process of creating/refining a query. The same meta-data are also exploited by the system in the query rewriting/unfolding process in order to filter the results shown to the user. We conducted an evaluation of the strategy in an e-business context within the EU-IST SEWASIE project. The evaluation proved the practicability of the approach for large value instances.

F. GUERRA ( 2006 ) - Using Balanced Scorecards for supporting participations in Public Administrations ( IADIS International Conference e-Society 2006 - Dublin, Ireland - 13-16 July 2006) ( - e-society 2006 Proceedings, Volume I ) (IADIS Press Lisbon PRT ) - pp. da 265 a 272 ISBN: 9789728924164 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In recent years, several socio-economic changes have generated new challenges for Public Administration. In particular, the diffusion of ICTs has increased the demand and need for new e-government models. In this paper, we propose to apply a general modification of the Balanced Scorecard model, a framework developed for spreading knowledge about strategic actions and monitoring the activity in business companies, for e-government purposes. We claim that scorecards may encourage citizens’ participation, since they allow a complete evaluation of the activities of a local government.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2005 ) - Building a tourism information provider with the MOMIS system - INFORMATION TECHNOLOGY & TOURISM - n. volume 7(3-4) - pp. da 221 a 238 ISSN: 1098-3058 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The tourism industry is a good candidate for taking up Semantic Web technology. In fact, there are many portals and websites belonging to the tourism domain that promote tourist products (places to visit, food to eat, museums, etc.) and tourist services (hotels, events, etc.), published by several operators (tourist promoter associations, public agencies, etc.). This article presents how the MOMIS system may be used for building a tourism information provider by exploiting the tourism information that is available on Internet websites. MOMIS (Mediator envirOnment for Multiple Information Sources) is a mediator framework that performs information extraction and integration from heterogeneous distributed data sources and includes query management facilities to transparently support queries posed to the integrated data sources.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2005 ) - Querying a super-peer in a schema-based super-peer network ( Databases, Information Systems, and Peer-to-Peer Computing - Trondheim, Norway - August 28-29, 2005) ( - International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2005) ) (Springer, Lecture Notes in Computer Science Berlino DEU ) - pp. da 13 a 25 ISBN: 9783540716600 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

We propose a novel approach for defining and querying a super-peer within a schema-based super-peer network organized into a two-level architecture: the lower level, called the peer level (which contains a mediator node), and the second one, called the super-peer level (which integrates mediator peers with similar content). We focus on a single super-peer and propose a method to define and solve a query, fully implemented in the SEWASIE project prototype. The problem is relevant because a super-peer is a two-level data integration system, so we go beyond the traditional setting in data integration. We have two different levels of Global as View mappings: the first mapping is at the super-peer level and maps several Global Virtual Views (GVVs) of peers into the GVV of the super-peer; the second mapping is within a peer and maps the data sources into the GVV of the peer. Moreover, we propose an approach where the integration designer, supported by a graphical interface, can implicitly define mappings by using Resolution Functions to solve data conflicts, and the Full Disjunction operator, which has been recognized as providing a natural semantics for data merging queries.

Sonia Bergamaschi; Domenico Beneventano; Maurizio Vincini; Francesco Guerra ( 2005 ) - SEWASIE - SEmantic Webs and AgentS in Integrated Economies. [Software (296) - Software]
Abstract

SEWASIE (SEmantic Webs and AgentS in Integrated Economies) designed and implemented an advanced search engine that enables intelligent access to heterogeneous data sources on the web via semantic enrichment, providing the basis for structured, secure web-based communication. SEWASIE provides users with a search client that has an easy-to-use query interface, and which can extract the required information from the Internet and show it in a useful and user-friendly format. From an architectural point of view, the prototype provides a search engine client, indexing servers and ontologies.

Bergamaschi S; Guerra F; Vincini M ( 2004 ) - A peer-to-peer information system for the semantic web ( Agents and Peer-to-Peer Computing - Melbourne, Australia - July 14, 2003) ( - Agents and Peer-to-Peer Computing, Second International Workshop, AP2PC 2003 ) (Springer Heidelberg DEU ) - n. volume 2872 - pp. da 113 a 122 ISBN: 9783540240532 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Data integration, in the context of the web, faces new problems, due in particular to the heterogeneity of sources, to the fragmentation of the information and to the absence of a unique way to structure and view information. In such areas, the traditional paradigms on which database foundations are based (i.e. client/server architecture, few sources containing large amounts of information) have to be superseded by new architectures. The peer-to-peer (P2P) architecture seems to be the best way to accommodate these new kinds of data sources, offering an alternative to the traditional client/server architecture. In this paper we present the SEWASIE system, which aims at providing access to heterogeneous web information sources. An enhancement of the system architecture in the direction of a P2P architecture, where connections among SEWASIE peers rely on the exchange of XML metadata, is described.

D. BENEVENTANO; F. GUERRA; S. MAGNANI; M. VINCINI ( 2004 ) - A Web Service based framework for the semantic mapping between product classification schemas (Long Beach, CA : R. Chi, [2000]- ) - JOURNAL OF ELECTRONIC COMMERCE RESEARCH - n. volume 5(2) - pp. da 114 a 127 ISSN: 1526-6133 [Articolo in rivista (262) - Articolo su rivista]
Abstract

A marketplace is the place where the demands and offers of buyers and sellers participating in a business transaction may meet. Therefore, electronic marketplaces are virtual communities in which buyers may receive proposals from several suppliers and make the best choice. In the electronic commerce world, comparing different products is not possible because the community lacks common standards for describing and classifying them. Therefore, B2B and B2C marketplaces have to reclassify products and goods according to different standardization models. In this paper, we propose a semi-automatic methodology, supported by a web service based framework, to define semantic mappings amongst different product classification schemas (ecommerce standards and catalogues), and we provide the ability to search and navigate these mappings. The proposed methodology is demonstrated on fragments of the UNSPSC and ecl@ss standards and on a fragment of the eBay online catalogue.

S. Bergamaschi; D. Beneventano; F. Guerra; M. Orsini; M. Vincini ( 2004 ) - MOMIS: an Ontology-based Information Integration System(software) [Software (296) - Software]
Abstract

The Mediator Environment for Multiple Information Sources (Momis), developed by the database research group at the University of Modena and Reggio Emilia, aims to construct synthesized, integrated descriptions of information coming from multiple heterogeneous sources. Our goal is to provide users with a global virtual view (GVV) of information sources, independent of their location or their data’s heterogeneity. Such a view conceptualizes the underlying domain; you can think of it as an ontology describing the sources involved. The Semantic Web exploits semantic markups to provide Web pages with machine-readable definitions. It thus relies on the a priori existence of ontologies that represent the domains associated with the given information sources. This approach relies on the selected reference ontology’s accuracy, but we find that most ontologies in common use are generic and that the annotation phase (in which semantic annotations connect Web page parts to ontology items) causes a loss of semantics. By involving the sources themselves, our approach builds an ontology that more precisely represents the domain. Moreover, the GVV is annotated according to a lexical ontology, which provides an easily understandable meaning to content. An open source version of the MOMIS system was released in April 2010 by the spin-off DATARIVER (www.datariver.it).

I. BENETTI; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2004 ) - SOAP-ENABLED WEB SERVICES FOR KNOWLEDGE MANAGEMENT - INTERNATIONAL JOURNAL OF WEB ENGINEERING AND TECHNOLOGY - n. volume 1(2) - pp. da 218 a 235 ISSN: 1476-1289 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The widespread diffusion of the World Wide Web among medium/small companies makes a huge amount of business information available online. Nevertheless, the heterogeneity of that information forces even trading partners involved in the same business process to face daily interoperability issues. The challenge is the integration of distributed business processes, which, in turn, means integration of heterogeneous data coming from distributed sources. This paper presents the new web services-based architecture of the MOMIS (Mediator envirOnment for Multiple Information Sources) framework that enhances the semantic integration features of MOMIS, leveraging new technologies such as XML web services and the SOAP protocol. The new architecture decouples the different MOMIS modules, publishing them as XML web services. Since the SOAP protocol used to access XML web services requires the same network security settings as a normal internet browser, companies are enabled to share knowledge without softening their protection strategies.

R. BENASSI; D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2004 ) - Synthesizing an Integrated Ontology with MOMIS ( International Conference on Knowledge Engineering and Decision Support (ICKEDS 2004) - Porto, Portugal - Portugal, 21-23 July) ( - International Conference on Knowledge Engineering and Decision Support (ICKEDS) ) (Proceedings su cd Porto PRT ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The Mediator EnvirOnment for Multiple Information Sources (MOMIS) aims at constructing synthesized, integrated descriptions of the information coming from multiple heterogeneous sources, in order to provide the user with a global virtual view of the sources independent from their location and the level of heterogeneity of their data. Such a global virtual view is a conceptualization of the underlying domain and then may be thought of as an ontology describing the involved sources. In this article we explore the framework’s main elements and discuss how the output of the integration process can be exploited to create a conceptualization of the underlying domain.

S. BERGAMASCHI; G. GELATI; F. GUERRA; M. VINCINI ( 2003 ) - A Experiencing AUML for the WINK Multi-Agent System ( WOA 2003: dagli Oggetti agli Agenti - Villasimius (Cagliari), Italy - 10 - 11 Settembre 2003) ( - WOA 2003: Dagli Oggetti agli Agenti. 4th AI*IA/TABOO Joint Workshop "From Objects to Agents": Intelligent Systems and Pervasive Computing ) (Pitagora Editrice Bologna Bologna ITA ) - pp. da 148 a 148 ISBN: 9788837114138 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In the last few years, efforts have been made towards bridging the gap between agent technology and de facto standard technologies, aiming at introducing multi-agent systems in industrial applications. This paper presents an experience with one of these proposals, Agent UML. Agent UML is a graphical modelling language based on UML. The practical use of this notation led us to suggest some refinements of the Agent UML features.

D. BENEVENTANO; S. BERGAMASCHI; A. FERGNANI; F. GUERRA; M. VINCINI; D. MONTANARI ( 2003 ) - A Peer-to-Peer Agent-Based Semantic Search Engine ( Sistemi Evoluti per Basi di Dati (SEBD 2003) - Cetraro (CS) - June 24-27, 2003) ( - Proceedings of the Eleventh Italian Symposium on Advanced Database Systems ) (Rubbettino Editore Cosenza ITA ) - pp. da 367 a 378 ISBN: 9788849806298 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Several architectures, protocols, languages, and candidate standards have been proposed to let the "semantic web" idea take off. In particular, searching for information requires cooperation of the information providers and seekers. Past experience and history show that a successful architecture must support ease of adoption and deployment by a wide and heterogeneous population, a flexible policy to establish an acceptable cost-benefit ratio for using the system, and the growth of a cooperative distributed infrastructure with no central control. In this paper an agent-based peer-to-peer architecture is defined to support search through a flexible integration of semantic information. Two levels of integration are foreseen: strong integration of sources related to the same domain into a single information node by means of a mediator-based system; weak integration of information nodes on the basis of semantic relationships existing among concepts of different nodes. The EU IST SEWASIE project is described as an instantiation of this architecture. SEWASIE aims at implementing an advanced search engine, which will provide SMEs with intelligent access to heterogeneous information on the Internet.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2003 ) - Building an integrated Ontology within the SEWASIE system ( Workshop on Semantic Web and Databases - Berlin, Germany - September 7-8, 2003) ( - First International Workshop on Semantic Web and Databases (SWDB) ) (Isabel F. Cruz, Vipul Kashyap, Stefan Decker, Rainer Eckstein Berlin DEU ) - pp. da 91 a 107 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

MOMIS (Mediator envirOnment for Multiple Information Sources) is a framework for information extraction and integration of heterogeneous structured and semi-structured information sources. The result of the integration process is a Global Virtual View (GVV for short), which is a set of (global) classes that represent the information contained in the sources being used. In this paper, we present the application of our integration approach to a specific type of source (i.e. web documents), and show how the result of the integration approach can be exploited to create a conceptualization of the domain the sources belong to, i.e. an ontology. Two new achievements of the MOMIS system are presented: the semi-automatic annotation of the GVV and the extension of a built-up ontology by the addition of another source.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA ( 2003 ) - Building an Ontology with MOMIS ( Semantic Integration Workshop - Sanibel Island, Florida, USA - October 20, 2003) ( - Proceedings of the Semantic Integration Workshop Collocated with the Second International Semantic Web Conference (ISWC-03) ) - CEUR WORKSHOP PROCEEDINGS - n. volume 82 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Nowadays the Web is a huge collection of data and its expansion rate is very high. Web users need new ways to exploit all this available information and possibilities. A new vision of the Web, the Semantic Web, where resources are annotated with machine-processable metadata providing them with background knowledge and meaning, is emerging. A fundamental component of the Semantic Web is the ontology; this “explicit specification of a conceptualization” allows information providers to give an understandable meaning to their documents. MOMIS (Mediator envirOnment for Multiple Information Sources) is a framework for information extraction and integration of heterogeneous information sources. The system implements a semi-automatic methodology for data integration that follows the Global as View (GAV) approach. The result of the integration process is a global schema, the GVV (Global Virtual View), which provides a reconciled, integrated and virtual view of the underlying sources. The GVV is composed of a set of (global) classes that represent the information contained in the sources. In this paper, we focus on the application of MOMIS to a particular kind of source (i.e. web documents), and show how the result of the integration process can be exploited to create a conceptualization of the underlying domain, i.e. a domain ontology for the integrated sources. The GVV is then semi-automatically annotated according to a lexical ontology. With reference to the Semantic Web area, where generally the annotation process consists of providing a web page with semantic markups according to an ontology, we first mark up the local metadata descriptions and then the MOMIS system generates an annotated conceptualization of the sources. Moreover, our approach “builds” the domain ontology as the synthesis of the integration process, while the usual approach in the Semantic Web is based on the “a priori” existence of an ontology.

D. BENEVENTANO; S. BERGAMASCHI; J. GELATI; F. GUERRA; M. VINCINI ( 2003 ) - MIKS: an agent framework supporting information access and integration ( - Intelligent Information Agents Research and Development in Europe: An AgentLink Perspective ) (Springer Heidelberg DEU ) - n. volume 2586 - pp. da 22 a 49 ISBN: 9783540007593 [Contributo in volume (Capitolo o Saggio) (268) - Capitolo/Saggio]
Abstract

Providing an integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In the past, companies have equipped themselves with data storing systems, building up information systems containing data that are related to one another, but which are often redundant, not homogeneous and not always semantically consistent. Moreover, to meet the requirements of global, Internet-based information systems, it is important that the tools developed for supporting these activities are as semi-automatic and scalable as possible. To address the issues related to large-scale scalability, in this paper we propose the exploitation of mobile agents in the information integration area, and, in particular, their integration in the MOMIS infrastructure. MOMIS (Mediator EnvirOnment for Multiple Information Sources) is a system that has been conceived as a pool of tools to provide an integrated access to heterogeneous information stored in traditional databases (for example relational, object oriented databases) or in file systems, as well as in semi-structured data sources (XML files). This proposal has been implemented within the MIKS (Mediator agent for Integration of Knowledge Sources) system and is fully described in this paper.

D. Beneventano; S. Bergamaschi; F. Guerra; M. Vincini ( 2003 ) - Synthesizing an integrated ontology - IEEE INTERNET COMPUTING - n. volume 7 - pp. da 42 a 51 ISSN: 1089-7801 [Articolo in rivista (262) - Articolo su rivista]
Abstract

To exploit the Internet’s expanding data collection, current Semantic Web approaches employ annotation techniques to link individual information resources with machine-comprehensible metadata. Before we can realize the potential this new vision presents, however, several issues must be solved. One of these is the need for data reliability in dynamic, constantly changing networks. Another issue is how to explicitly specify relationships between abstract data concepts. Ontologies provide a key mechanism for solving these challenges, but the Web’s dynamic nature leaves open the question of how to manage them. The Mediator Environment for Multiple Information Sources (Momis), developed by the database research group at the University of Modena and Reggio Emilia, aims to construct synthesized, integrated descriptions of information coming from multiple heterogeneous sources. Our goal is to provide users with a global virtual view (GVV) of information sources, independent of their location or their data’s heterogeneity. Such a view conceptualizes the underlying domain; you can think of it as an ontology describing the sources involved. The Semantic Web exploits semantic markups to provide Web pages with machine-readable definitions. It thus relies on the a priori existence of ontologies that represent the domains associated with the given information sources. This approach relies on the selected reference ontology’s accuracy, but we find that most ontologies in common use are generic and that the annotation phase (in which semantic annotations connect Web page parts to ontology items) causes a loss of semantics. By involving the sources themselves, our approach builds an ontology that more precisely represents the domain. Moreover, the GVV is annotated according to a lexical ontology, which provides an easily understandable meaning to content. 
In this article, we use Web documents as a representative information source to describe the Momis methodology’s general application. We explore the framework’s main elements and discuss how the output of the integration process can be exploited to create a conceptualization of the underlying domain. In particular, our method provides a way to extend previously created conceptualizations, rather than starting from scratch, by inserting a new source.

S. BERGAMASCHI; G.GELATI; F. GUERRA; M. VINCINI ( 2003 ) - WINK: a Web-based Enterprise System for Collaborative Project Management in Virtual Enterprises ( Web Information Systems Engineering - Roma, Italy - 10-12 December 2003) ( - 4th International Conference on Web Information Systems Engineering ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 176 a 185 ISBN: 9780769519999 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Increasing globalization and the flexibility demanded of companies have generated, in the last decade, new issues related to the management of large-scale projects within geographically distributed networks and to cooperation among enterprises. ICT support systems are required to allow enterprises to share information, guarantee data consistency and establish synchronized and collaborative processes. In this paper we present a collaborative project management system that integrates data coming from aerospace industries with two main goals: avoiding inconsistencies generated by updates at the source level and minimizing data replication. The proposed system is composed of a collaborative project management component supported by a web interface, a multi-agent data integration component, which supports information sharing and querying, and SOAP-enabled web services, which ensure the overall interoperability of the software components. The system was developed by the University of Modena and Reggio Emilia, Gruppo Formula S.p.A. and Alenia Spazio S.p.A. within the EU WINK Project (Web-linked Integration of Network based Knowledge - IST-2000-28221).

Bergamaschi S; Guerra F; Vincini M ( 2002 ) - A data integration framework for e-commerce product classification ( International Semantic Web Conference (ISWC 2001) - Cagliari, Italy - 9-12 June 2002) ( - The Semantic Web - ISWC 2002, First International Semantic Web Conference ) (Springer Heidelberg DEU ) - n. volume 2342 - pp. da 379 a 393 ISBN: 9783540437604 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A marketplace is the place in which the demand and supply of buyers and vendors participating in a business process may meet. Therefore, electronic marketplaces are virtual communities in which buyers may compare the proposals of several suppliers and make the best choice. In the electronic commerce world, the comparison between different products is hindered by the lack of standards (or, rather, by the proliferation of standards) describing and classifying them. Therefore, B2B and B2C marketplaces need to reclassify products and goods according to different standardization models. This paper aims to address this problem by suggesting the use of a semi-automatic methodology, supported by a tool (SI-Designer), to define the mapping among different e-commerce product classification standards. This methodology was developed for the MOMIS system within the Intelligent Integration of Information research area. We describe our extension to the methodology that makes it applicable in general to product classification standards, using a selected fragment of the ECCMA/UNSPSC and eCl@ss standards.

D. BENEVENTANO; S. BERGAMASCHI; M.FELICE; D. GAZZOTTI; G.GELATI; F. GUERRA; M. VINCINI ( 2002 ) - An Agent framework for Supporting the MIKS Integration Process ( WOA 2002: Dagli Oggetti agli Agenti - Milano, Italia - 18-19 November 2002) ( - WOA 2002: Dagli Oggetti agli Agenti. 3rd AI*IA/TABOO Joint Workshop "From Objects to Agents": From Information to Knowledge ) (Pitagora Editrice Bologna ITA ) - pp. da 35 a 41 ISBN: 9788837113636 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Providing integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In the past, companies have equipped themselves with data storage systems, building up information systems containing data that are related to one another but are often redundant, not homogeneous and not always semantically consistent. Moreover, to meet the requirements of global, Internet-based information systems, it is important that the tools developed for supporting these activities are semi-automatic and as scalable as possible. To address large-scale scalability issues, in this paper we propose the exploitation of mobile agents in the information integration area and, in particular, the roles they play in enhancing the features of the Momis infrastructure. Momis (Mediator envirOnment for Multiple Information Sources) is a system that has been conceived as a pool of tools to provide integrated access to heterogeneous information stored in traditional databases (for example relational and object-oriented databases) or in file systems, as well as in semi-structured data sources (XML files). In this paper we describe the new agent-based framework concerning the integration process as implemented in the Miks (Mediator agent for Integration of Knowledge Sources) system.

I. BENETTI; D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2002 ) - An information integration framework for E-commerce - IEEE INTELLIGENT SYSTEMS - n. volume 17 - pp. da 18 a 25 ISSN: 1541-1672 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The Web has transformed electronic information systems from single, isolated nodes into a worldwide network of information exchange and business transactions. In this context, companies have equipped themselves with high-capacity storage systems that contain data in several formats. The problems faced by these companies often emerge because the storage systems lack structural and application homogeneity in addition to a common ontology. The semantic differences generated by a lack of consistent ontology can lead to conflicts that range from simple name contradictions (when companies use different names to indicate the same data concept) to structural incompatibilities (when companies use different models to represent the same information types). One of the main challenges for e-commerce infrastructure designers is information sharing and retrieving data from different sources to obtain an integrated view that can overcome any contradictions or redundancies. Virtual catalogs can help overcome this challenge because they act as instruments to retrieve information dynamically from multiple catalogs and present unified product data to customers. Instead of having to interact with multiple heterogeneous catalogs, customers can interact with a virtual catalog in a straightforward, uniform manner. This article presents a virtual catalog project called Momis (mediator environment for multiple information sources). Momis is a mediator-based system for information extraction and integration that works with structured and semistructured data sources. Momis includes a component called the SI-Designer for semiautomatically integrating the schemas of heterogeneous data sources, such as relational, object, XML, or semistructured sources. Starting from local source descriptions, the Global Schema Builder generates an integrated view of all data sources and expresses those views using XML.
Momis lets you use the infrastructure with other open integration information systems by simply interchanging XML data files. Momis creates the XML global schema in several stages, first by creating a common thesaurus of intra- and interschema relationships. Momis extracts the intraschema relationships by using inference techniques, then shares these relationships in the common thesaurus. After this initial phase, Momis enriches the common thesaurus with interschema relationships obtained using the lexical WordNet system (www.cogsci.princeton.edu/wn), which identifies the affinities between interschema concepts on the basis of their lexical meaning. Momis also enriches the common thesaurus using the Artemis system, which evaluates structural affinities among interschema concepts.

G. Cabri; F. Guerra; M. Vincini; S. Bergamaschi; L. Leonardi; F. Zambonelli ( 2002 ) - MOMIS: Exploiting agents to support information integration - INTERNATIONAL JOURNAL OF COOPERATIVE INFORMATION SYSTEMS - n. volume 11 - pp. da 293 a 313 ISSN: 0218-8430 [Articolo in rivista (262) - Articolo su rivista]
Abstract

The information overload caused by the large amount of data spread over the Internet must be addressed in an appropriate way. The dynamism and uncertainty of the Internet, along with the heterogeneity of the sources of information, are the two main challenges for today's technologies related to information management. In the area of information integration, this paper proposes an approach based on mobile software agents integrated in the MOMIS (Mediator envirOnment for Multiple Information Sources) infrastructure, which enables semi-automatic information integration to deal with the integration and query of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The exploitation of mobile agents in MOMIS can significantly increase the flexibility of the system. In fact, their characteristics of autonomy and adaptability are well suited to distributed and open environments such as the Internet. The aim of this paper is to show the advantages of introducing intelligent and mobile software agents into the MOMIS infrastructure for the autonomous management and coordination of integration and query processing over heterogeneous data sources.

S. BERGAMASCHI; F. GUERRA ( 2002 ) - Peer to Peer Paradigm for a Semantic Search Engine ( Agents and Peer-to-Peer Computing (AP2PC 2002) - Bologna, Italy - 15 July 2002) ( - Agents and Peer-to-Peer Computing, First International Workshop ) (Springer Heidelberg DEU ) - n. volume 2530 - pp. da 81 a 86 ISBN: 9783540405382 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

This paper provides, first, a general description of the SEWASIE research project and, second, a proposal for an architectural evolution of the SEWASIE system towards the peer-to-peer paradigm. The SEWASIE project aims to design and implement an advanced search engine enabling intelligent access to heterogeneous data sources on the web using community-specific multilingual ontologies. After a presentation of the main features of the system, a preliminary proposal for the architectural evolution of the SEWASIE system towards the peer-to-peer paradigm is put forward.

S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2002 ) - Product Classification Integration for E-Commerce ( Second International Workshop on Electronic Business Hubs - WEBH - Aix En Provence, France - 2-6 September 2002) ( - DEXA Workshops ) (IEEE Computer Society Los Alamitos, California USA ) - pp. da 861 a 867 ISBN: 9780769516684 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

A marketplace is the place where the demand and supply of buyers and vendors participating in a business process may meet. Therefore, electronic marketplaces are virtual communities in which buyers may compare the proposals of several suppliers and make the best choice. In the electronic commerce world, the comparison between different products is hindered by the lack of standards (or, rather, by the proliferation of standards) describing and classifying them. Therefore, B2B and B2C marketplaces need to reclassify products and goods according to different standardization models. This paper aims to address this problem by suggesting the use of a semi-automatic methodology to define a mapping among different e-commerce product classification standards. This methodology is an extension of the MOMIS system, a mediator system developed within the Intelligent Integration of Information research area.

BERGAMASCHI S; BENEVENTANO D; CASTANO S; DE ANTONELLIS V; FERRARA A; GUERRA F; F. MANDREOLI; ORNETTI G. C; VINCINI M ( 2002 ) - Semantic Integration and Query Optimization of Heterogeneous Data Sources ( 1st OOIS Workshop on Efficient Web-based Information Systems (EWIS 2002) - Montpellier, France - September 2, 2002) ( - Advances in Object-Oriented Information Systems ) (Springer Heidelberg DEU ) - n. volume 2426 - pp. da 154 a 165 ISBN: 9783540440888 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In modern Internet/Intranet-based architectures, an increasing number of applications require integrated and uniform access to a multitude of heterogeneous and distributed data sources. In this paper, we describe the ARTEMIS/MOMIS system for the semantic integration and query optimization of heterogeneous structured and semistructured data sources.

D. BENEVENTANO; S. BERGAMASCHI; D. BIANCO; F. GUERRA; M. VINCINI ( 2002 ) - SI-Web: a Web based interface for the MOMIS project ( Sistemi Evoluti per Basi di Dati (SEBD 2002) - Portoferraio, Italy - 19-21 June 2002) ( - Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD 2002) ) (Paolo Ciaccia, Fausto Rabitti, Giovanni Soda Portoferraio ITA ) - pp. da 407 a 411 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The MOMIS project (Mediator envirOnment for Multiple Information Sources), developed in the past years, allows the integration of data from structured and semi-structured data sources. SI-Designer (Source Integrator Designer) is a designer support tool implemented within the MOMIS project for semi-automatic integration of heterogeneous source schemata. It is a Java application in which all the modules involved are available as CORBA objects and interact through established IDL interfaces. The goal of this demonstration is to present a new tool, SI-Web (Source Integrator on Web), which offers the same features as SI-Designer but has the great advantage of being usable over the Internet through a web browser.

D. BENEVENTANO; S. BERGAMASCHI; D. GAZZOTTI; G.GELATI; F. GUERRA; M. VINCINI ( 2002 ) - The WINK Project for Virtual Enterprise Networking and Integration ( Sistemi Evoluti per Basi di Dati (SEBD 2002) - Portoferraio, Italy - 19-21 June 2002) ( - Convegno Nazionale Sistemi di Basi di Dati Evolute (SEBD2002) ) (Paolo Ciaccia, Fausto Rabitti, Giovanni Soda Portoferraio ITA ) - pp. da 283 a 290 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

To stay competitive on the market (or sometimes simply to stay on it), companies and manufacturers increasingly have to join forces to survive and possibly flourish. Among other solutions, the last decade has seen the growth and spread of an original business model called the Virtual Enterprise. To manage a Virtual Enterprise, modern information systems have to tackle technological issues such as networking, integration and cooperation. The WINK project, born from the partnership between the University of Modena and Reggio Emilia and Gruppo Formula, addresses these problems. The ultimate goal is to design, implement and finally test on a pilot case (provided by Alenia) the WINK system, a combination of two existing and promising software systems (the WHALES and MIKS systems), to meet the Virtual Enterprise requirements for data integration, cooperation and management planning.

G. GELATI; F. GUERRA; M. VINCINI ( 2001 ) - Agents Supporting Information Integration: the MIKS Framework ( WOA 2001: Dagli Oggetti agli Agenti - Modena, Italy - 4-5 September 2001) ( - WOA 2001: Dagli Oggetti agli Agenti. 2nd AI*IA/TABOO Joint Workshop "From Objects to Agents": Evolutive Trends of Software Systems ) (Pitagora Editrice Bologna ITA ) - pp. da 109 a 112 ISBN: 9788837112721 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

In past years we have developed the MOMIS (Mediator envirOnment for Multiple Information Sources) system for the integration of data from structured and semi-structured data sources. In this paper we propose some preliminary considerations about one feasible extension of the system, intended to improve some of its functionalities by exploiting intelligent and mobile agents. The new framework is named MIKS (Mediator agent for Integration of Knowledge Sources).

G. GELATI; F. GUERRA; M. VINCINI ( 2001 ) - Agents Supporting Information Integration: the MIKS Framework - AIIA NOTIZIE - n. volume 4 - pp. da 50 a 51 [Articolo in rivista (262) - Articolo su rivista]
Abstract

In past years we have developed the MOMIS (Mediator envirOnment for Multiple Information Sources) system for the integration of data from structured and semi-structured data sources. In this paper we propose some preliminary considerations about one feasible extension of the system, intended to improve some of its functionalities by exploiting intelligent and mobile agents. The new framework is named MIKS (Mediator agent for Integration of Knowledge Sources).

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2001 ) - Exploiting extensional knowledge for query reformulation and object fusion in a data integration system ( Sistemi Evoluti per Basi di Dati (SEBD 2001) - Venezia, Italy - 27-29 June 2001) ( - Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD 2001) ) (Augusto Celentano, Letizia Tanca, Paolo Tiberio Venezia ITA ) - pp. da 257 a 272 ISBN: non disponibile [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Query processing in global information systems integrating multiple heterogeneous sources is a challenging issue in relation to the effective extraction of information available on-line. In this paper we propose intelligent, tool-supported techniques for querying global information systems integrating both structured and semistructured data sources. The techniques have been developed in the environment of a data integration, wrapper/mediator based system, MOMIS, and try to achieve two main goals: optimized query reformulation w.r.t. local sources and object fusion, i.e. grouping together information (from the same or different sources) about the same real-world entity. The developed techniques rely on the availability of integration knowledge, i.e. local source schemata, a virtual mediated schema and its mapping descriptions, that is, semantic mappings w.r.t. the underlying sources both at the intensional and extensional level. Mapping descriptions, obtained as a result of the semi-automatic integration process of multiple heterogeneous sources developed for the MOMIS system, include, unlike previous data integration proposals, extensional intra/interschema knowledge. Extensional knowledge is exploited to detect extensionally overlapping classes and to discover implicit join criteria among classes, which enables the goals of optimized query reformulation and object fusion to be achieved. The techniques have been implemented in the MOMIS system but can be applied, in general, to data integration systems including extensional intra/interschema knowledge in mapping descriptions.

D. BENEVENTANO; S. BERGAMASCHI; I. BENETTI; A. CORNI; F. GUERRA; G. MALVEZZI ( 2001 ) - SI-Designer: a tool for intelligent integration of information ( Hawaii International Conference on System Sciences - Hawaii - 3-6 January 2001) ( - Hawaii International Conference on System Sciences (HICSS-34) ) (IEEE Computer Society Los Alamitos, California USA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

SI-Designer (Source Integrator Designer) is a designer support tool for semi-automatic integration of heterogeneous source schemata (relational, object and semi-structured sources); it has been implemented within the MOMIS project and it carries out integration following a semantic approach which uses intelligent Description Logics-based techniques, clustering techniques and an extended ODMG-ODL language, ODL-I3, to represent schemata, extracted, integrated information. Starting from the sources’ ODL-I3 descriptions (local schemata), SI-Designer supports the designer in the creation of an integrated view of all the sources (global schema), which is expressed in the same ODL-I3 language. We propose SI-Designer as a tool to build virtual catalogs in the E-Commerce environment.

I. BENETTI; D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2001 ) - SI-Designer: an Integration Framework for E-Commerce ( E-Business & the Intelligent Web - Seattle, USA - August 5 2001) ( - IJCAI*01 Workshop on E-Business & the Intelligent Web ) (Informal proceedings published online http://www.csd.abdn.ac.uk/~apreece/ebiweb/programme.html Seattle USA ) [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Electronic commerce lets people purchase goods and exchange information on business transactions on-line. Therefore, one of the main challenges for the designers of e-commerce infrastructures is information sharing: retrieving data located in different sources to obtain an integrated view that overcomes any contradiction or redundancy. Virtual Catalogs synthesize this approach, as they are conceived as instruments to dynamically retrieve information from multiple catalogs and present product data in a unified manner, without directly storing product data from catalogs. In this paper we propose SI-Designer, a support tool for the integration of data from structured and semi-structured data sources, developed within the MOMIS (Mediator environment for Multiple Information Sources) project.

S. BERGAMASCHI; G. CABRI; F. GUERRA; L. LEONARDI; M. VINCINI; F. ZAMBONELLI ( 2001 ) - Supporting information integration with autonomous agents ( Cooperative Information Agents (CIA 2001) - Modena, Italy - 6-8 September 2001) ( - Cooperative Information Agents V, 5th International Workshop ) - LECTURE NOTES IN COMPUTER SCIENCE - n. volume 2182 - pp. da 88 a 99 ISBN: 9783540425458 ISSN: 0302-9743 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The large amount of information that is spread over the Internet is an important resource for all people but also introduces some issues that must be addressed. The dynamism and uncertainty of the Internet, along with the heterogeneity of the sources of information, are the two main challenges for today’s technologies. This paper proposes an approach based on mobile agents integrated in an information integration infrastructure. Mobile agents can significantly improve the design and the development of Internet applications thanks to their characteristics of autonomy and adaptability to open and distributed environments, such as the Internet. MOMIS (Mediator envirOnment for Multiple Information Sources) is an infrastructure for semi-automatic information integration that deals with the integration and query of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The aim of this paper is to show the advantage of introducing intelligent and mobile software agents into the MOMIS infrastructure for the autonomous management and coordination of the integration and query processes over heterogeneous sources.

D. BENEVENTANO; S. BERGAMASCHI; F. GUERRA; M. VINCINI ( 2001 ) - The Momis approach to Information Integration ( International Conference on Enterprise Information Systems (ICEI 01) - Setubal, Portugal - 7-10 July 2001) ( - Third International Conference on Enterprise Information Systems ) (ICEIS Press Setubal PRT ) - n. volume 1 - pp. da 194 a 198 ISBN: 9789729805028 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

The web explosion, both at internet and intranet level, has transformed the electronic information system from a single isolated node into an entry point to a worldwide network of information exchange and business transactions. Business and commerce have taken the opportunity offered by the new technologies to define the e-commerce activity. Therefore, one of the main challenges for the designers of e-commerce infrastructures is information sharing: retrieving data located in different sources to obtain an integrated view that overcomes any contradiction or redundancy. Virtual Catalogs synthesize this approach, as they are conceived as instruments to dynamically retrieve information from multiple catalogs and present product data in a unified manner, without directly storing product data from catalogs. Customers, instead of having to interact with multiple heterogeneous catalogs, can interact in a uniform way with a virtual catalog. In this paper we propose a designer support tool, called SI-Designer, for information integration developed within the MOMIS project. The MOMIS project (Mediator environment for Multiple Information Sources) aims to integrate data from structured and semi-structured data sources.

D. CALVANESE; S. CASTANO; F. GUERRA; D. LEMBO; M. MELCHIORI; G. TERRACINA; D. URSINO; M. VINCINI ( 2001 ) - Towards a comprehensive methodological framework for integration ( Knowledge Representation meets Databases (KRDB2001) - Roma, Italy - 15 September 2001) ( - Proceedings of the 8th International Workshop on Knowledge Representation meets Databases (KRDB2001) ) - CEUR WORKSHOP PROCEEDINGS - n. volume 45 [Contributo in Atti di convegno (273) - Relazione in Atti di Convegno]
Abstract

Nowadays, data can be represented and stored by using different formats ranging from unstructured data, typical of file systems, to semi-structured data, typical of Web sources, to highly structured data, typical of relational database systems. Therefore, the necessity arises to define new models and approaches for uniformly handling all these heterogeneous information sources. In this paper we propose a framework which aims at uniformly managing information sources having different formats and structures in order to obtain a global, integrated and uniform representation.