Maurizio VINCINI
Associate Professor, Department of Engineering "Enzo Ferrari"
Publications
2023
- An Intrinsically Interpretable Entity Matching System
[Conference Proceedings Paper]
Baraldi, A.; Del Buono, F.; Guerra, F.; Paganelli, M.; Vincini, M.
abstract
Explainable classification systems generate predictions along with a weight for each term in the input record measuring its contribution to the prediction. In the entity matching (EM) scenario, inputs are pairs of entity descriptions and the resulting explanations can be difficult to understand for the users. They can be very long and assign different impacts to similar terms located in different descriptions. To address these issues, we introduce the concept of decision units, i.e., basic information units formed either by pairs of (similar) terms, each one belonging to a different entity description, or unique terms, existing in one of the descriptions only. Decision units form a new feature space, able to represent, in a compact and meaningful way, pairs of entity descriptions. An explainable model trained on such features generates effective explanations customized for EM datasets. In this paper, we propose this idea via a three-component architecture template, which consists of a decision unit generator, a decision unit scorer, and an explainable matcher. Then, we introduce WYM (Why do You Match?), an implementation of the architecture oriented to textual EM databases. The experiments show that our approach has accuracy comparable to other state-of-the-art Deep Learning based EM models, but, differently from them, its predictions are highly interpretable.
2023
- Interpretable Entity Matching with WYM
[Conference Proceedings Paper]
Baraldi, A.; Del Buono, F.; Guerra, F.; Guiduzzi, G.; Paganelli, M.; Vincini, M.
2023
- Progetto di Basi di Dati Relazionali
[Monograph/Scientific Treatise]
Beneventano, Domenico; Bergamaschi, Sonia; Gagliardelli, Luca; Guerra, Francesco; Vincini, Maurizio
abstract
The aim of this volume is to provide the reader with the fundamental notions of designing and building relational database applications.
Regarding design, the conceptual and logical design phases are covered, and the Entity-Relationship and Relational data models are presented; these are the basic tools for conceptual design and logical design, respectively.
The student is also introduced to the theory of normalization of relational databases.
Regarding implementation, elements and examples of SQL, the standard language for RDBMSs (Relational Database Management Systems), are presented. Ample space is devoted to worked exercises on the topics covered.
2021
- Automated Machine Learning for Entity Matching Tasks
[Conference Proceedings Paper]
Paganelli, Matteo; Del Buono, Francesco; Pevarello, Marco; Guerra, Francesco; Vincini, Maurizio
abstract
The paper studies the application of automated machine learning approaches (AutoML) for addressing the problem of Entity Matching (EM). This would make the existing, highly effective, Machine Learning (ML) and Deep Learning based approaches for EM usable also by non-expert users, who do not have the expertise to train and tune such complex systems. Our experiments show that the direct application of AutoML systems to this scenario does not provide high quality results. To address this issue, we introduce a new component, the EM adapter, to be pipelined with standard AutoML systems, that preprocesses the EM datasets to make them usable by automated approaches. The experimental evaluation shows that our proposal obtains the same effectiveness as the state-of-the-art EM systems, but it does not require any skill on ML to tune it.
2019
- Big Data Integration of Heterogeneous Data Sources: The Re-Search Alps Case Study
[Conference Proceedings Paper]
Guerra, Francesco; Sottovia, Paolo; Paganelli, Matteo; Vincini, Maurizio
abstract
The application of big data integration techniques in real scenarios needs to address practical issues related to the scalability of the process and the heterogeneity of data sources. In this paper, we describe the pipeline that has been developed in the context of the Re-search Alps project, a project funded by the EU Commission through the INEA Agency in the CEF Telecom framework, that aims at creating an open dataset describing research centers located in the Alpine area.
2019
- Foreword to the Special Issue: "Semantics for Big Data Integration"
[Journal Article]
Beneventano, Domenico; Vincini, Maurizio
abstract
In recent years, a great deal of interest has been shown toward big data. Much of the work on big data has focused on volume and velocity in order to consider dataset size. Indeed, the problems of variety, velocity, and veracity are equally important in dealing with the heterogeneity, diversity, and complexity of data, where semantic technologies can be explored to deal with these issues. This Special Issue aims at discussing emerging approaches from academic and industrial stakeholders for disseminating innovative solutions that explore how big data can leverage semantics, for example, by examining the challenges and opportunities arising from adapting and transferring semantic technologies to the big data context.
2017
- Analyzing mappings and properties in Data Warehouse integration
[Journal Article]
Beneventano, Domenico; Olaru, Marius Octavian; Vincini, Maurizio
abstract
The information inside the Data Warehouse (DW) is used to make strategic decisions inside the organization, which is why data quality plays a crucial role in guaranteeing the correctness of the decisions. Data quality also becomes a major issue when integrating information from two or more heterogeneous DWs. In the present paper, we perform an extensive analysis of a mapping-based DW integration methodology and of its properties. In particular, we prove that the proposed methodology guarantees coherency, while in certain cases it is able to maintain soundness and consistency. Moreover, intra-schema homogeneity is discussed and analysed as a necessary condition for summarizability and for optimization by materializing views of dependent queries.
2017
- From Data Integration to Big Data Integration
[Book Chapter/Essay]
Bergamaschi, Sonia; Beneventano, Domenico; Mandreoli, Federica; Martoglia, Riccardo; Guerra, Francesco; Orsini, Mirko; Po, Laura; Vincini, Maurizio; Simonini, Giovanni; Zhu, Song; Gagliardelli, Luca; Magnotta, Luca
abstract
The Database Group (DBGroup, www.dbgroup.unimore.it) and Information System Group (ISGroup, www.isgroup.unimore.it) research activities have been mainly devoted to the Data Integration Research Area. The DBGroup designed and developed the MOMIS data integration system, giving rise to a successful innovative enterprise, DataRiver (www.datariver.it), which distributes MOMIS as open source. MOMIS provides integrated access to structured and semistructured data sources and allows a user to pose a single query and to receive a single unified answer. Description Logics, Automatic Annotation of schemata plus clustering techniques constitute the theoretical framework. In the context of data integration, the ISGroup addressed problems related to the management and querying of heterogeneous data sources in large-scale and dynamic scenarios. The reference architectures are the Peer Data Management Systems and their evolutions toward dataspaces. In these contexts, the ISGroup proposed and evaluated effective and efficient mechanisms for network creation with limited information loss and solutions for mapping management, query reformulation and processing, and query routing. The main issues of data integration have been addressed within European and national projects: automatic annotation, mapping discovery, global query processing, provenance, multidimensional information integration, and keyword search. With the new requirements of integrating open linked data, textual and multimedia data in a big data scenario, the research has been devoted to the Big Data Integration Research Area. In particular, the most relevant research results achieved are: a scalable entity resolution method, a scalable join operator and a tool, LODEX, for automatically extracting metadata from Linked Open Data (LOD) resources and for visual query formulation on LOD resources. Moreover, in collaboration with DATARIVER, Data Integration was successfully applied to smart e-health.
2015
- Semantic Annotation of the CEREALAB database by the AGROVOC Linked Dataset
[Journal Article]
Beneventano, Domenico; Bergamaschi, Sonia; Sorrentino, Serena; Vincini, Maurizio; Benedetti, Fabio
abstract
Recently, there has been an increase in open data government initiatives promoting the idea that particular data should be freely published. However, the great majority of these resources is published in an unstructured format and is typically accessed only by closed communities. Starting from these considerations, in a previous work related to a youth precariousness dataset, we proposed an experimental and preliminary methodology for facilitating resource providers in publishing public data into the Linked Open Data (LOD) cloud, and for helping consumers (companies and citizens) in efficiently accessing and querying them. Linked Open Data play a central role in accessing and analyzing the rapidly growing pool of life science data and, as discussed in recent meetings, it is important for data source providers themselves to make their resources available as Linked Open Data. In this paper we extend and apply our methodology to the agricultural domain, i.e. to the CEREALAB database, created to store both genotypic and phenotypic data and specifically designed for plant breeding, in order to provide its publication into the LOD cloud.
2015
- Supporting Image Search with Tag Clouds: A Preliminary Approach
[Journal Article]
Guerra, Francesco; Simonini, Giovanni; Vincini, Maurizio
abstract
Algorithms and techniques for searching in collections of data address a challenging task, since they have to bridge the gap between the ways in which users express their interests, through natural language expressions or keywords, and the ways in which data is represented and indexed. When the collections of data include images, the task becomes harder, mainly for two reasons. On the one hand, users express their needs through one medium (text) and obtain results via another medium (images). On the other hand, it can be difficult for a user to understand the results retrieved, that is, why a particular image is part of the result set. In this case, techniques for analyzing the query results and giving users some insight into the content retrieved are needed. In this paper, we propose to address this problem by coupling the image result set with a tag cloud of words describing it. Some techniques for building the tag cloud are introduced and two application scenarios are discussed.
2014
- A Data Warehouse Integration Methodology in Support of Collaborating SMEs
[Book Chapter/Essay]
Olaru, Marius Octavian; Vincini, Maurizio
abstract
to be defined
2014
- Integrating Multidimensional Information for the Benefit of Collaborative Enterprises
[Journal Article]
Olaru, Marius Octavian; Vincini, Maurizio
abstract
Collaborative business making is emerging as a possible solution for the difficulties that Small and Medium Enterprises (SMEs) are having in the current difficult economic scenarios. Collaboration, as opposed to competition, provides a competitive advantage to companies and organizations that operate in a joint business structure. When dealing with multiple organizations, managers must access unified strategic information obtained from the knowledge repositories of each individual organization; unfortunately, traditional Business Intelligence (BI) tools are not designed with the aim of collaboration, so the task becomes difficult from a managerial, organizational and technological point of view. To deal with this shortcoming, we provide a mapping-based integration methodology for heterogeneous Data Warehouses that aims at facilitating business stakeholders’ access to unified strategic information obtained from a network of heterogeneous collaborating SMEs. A complete formalization, based on graph theory and the RELEVANT clustering approach, is provided together with an experimental evaluation of the proposed methodology over real DW instances.
2013
- Analyzing Dimension Mappings and Properties in Data Warehouse Integration. In: On the Move to Meaningful Internet Systems: OTM 2013 Conferences
[Conference Proceedings Paper]
Beneventano, Domenico; Olaru, Marius Octavian; Vincini, Maurizio
abstract
ud, and ODBASE 2013
2013
- Semantic Integration of heterogeneous data sources in the MOMIS Data Transformation System
[Journal Article]
Vincini, Maurizio; Bergamaschi, Sonia; Beneventano, Domenico
abstract
In the last twenty years, many data integration systems following a classical wrapper/mediator architecture and providing a Global Virtual Schema (a.k.a. Global Virtual View - GVV) have been proposed by the research community. The main issues faced by these approaches range from system-level heterogeneities, to structural syntax level heterogeneities at the semantic level. Despite the research effort, all the approaches proposed require a lot of user intervention for customizing and managing the data integration and reconciliation tasks. In some cases, the effort and the complexity of the task are huge, since it requires the development of specific programming code. Unfortunately, due to the specificity to be addressed, application code and solutions are not frequently reusable in other domains. For this reason, the Lowell Report 2005 has provided the guideline for the definition of a public benchmark for the information integration problem. The proposal, called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches), focuses on how data integration systems manage syntactic and semantic heterogeneities, which are the greatest technical challenges in the field. We developed a Data Transformation System (DTS) that supports data transformation functions and produces query translations in order to push execution down to the sources. Our DTS is based on MOMIS, a mediator-based data integration system that our research group has been developing and supporting since 1999. In this paper, we show how the DTS is able to solve all the twelve queries of the THALIA benchmark by using a simple combination of declarative translation functions already available in the standard SQL language. We think that this is a remarkable result, mainly for two reasons: firstly, to the best of our knowledge, there is no system that has provided a complete answer to the benchmark; secondly, our queries do not require any overhead of new code.
2013
- The Prosumer Paradigm for Life Cycle Assessment Services. In: Frameworks of IT Prosumption for Business Development
[Book Chapter/Essay]
Guerra, Francesco; Vincini, Maurizio
abstract
Enterprises, governments, and government agencies have started to publish their data on the Internet, especially in the form of open structured data sources. The real exploitation of these free, large open data sources is more and more becoming a crucial activity for obtaining information and knowledge (i.e. competitive elements) in several business sectors. In addition, with the proliferation of Web 2.0 techniques and applications such as blogs, wikis, tagging systems, and mashups, the notion of user-centricity has gained a significant momentum to put ordinary users in the leading role of delivering exciting and personalized content and services. The term "prosumer," coined by the futurist Alvin Toffler in 1980, has been often referenced in business-related contexts to identify this situation. The chapter describes the application of the "prosumer paradigm" to a real data integration system of Life Cycle Assessment (LCA). ENEA, the Italian National Agency for new Technologies, Energy, and Sustainable Economic Development, promoted the adoption of such practice in small companies belonging to the industrial and agricultural sector supplying them with a simplified LCA system. In this chapter, the authors show how a domain expert user (the prosumer) can use the framework to easily map the classification of data flows and processes provided by the simplified LCA system into the ELCD database, containing a standard classification provided by the EU. This makes the proposal completely shareable with the whole thematic classification and vision promoted by the European Commission.
2012
- A Dimension Integration Method for a Heterogeneous Data Warehouse Environment.
[Conference Proceedings Paper]
Olaru, Marius Octavian; Vincini, Maurizio
abstract
Data Warehousing is the main Business Intelligence instrument that allows the extraction of relevant, aggregated information from the operational data, in order to support the decision making process inside complex organizations. Following recent trends in Data Warehousing, companies realized that there is a great potential in combining their information repositories in order to offer all participants a broader view of the economic market. Unfortunately, even though Data Warehouse integration has been defined from a theoretical point of view, until now no complete, widely used methodology has been proposed to support Data Warehouse integration. This paper proposes a method that is able to achieve both schema and instance level integration of heterogeneous Data Warehouse dimension attributes by exploiting the topology of dimensions and the dimension-chase procedure.
2012
- A meta-language for MDX queries in eLog Business Solution
[Conference Proceedings Paper]
Bergamaschi, Sonia; Interlandi, Matteo; Longo, Mario; Po, Laura; Vincini, Maurizio
abstract
The adoption of business intelligence technology in industries is growing rapidly. Business managers are not satisfied with ad hoc and static reports and they ask for more flexible and easy to use data analysis tools. Recently, application interfaces that expand the range of operations available to the user, hiding the underlying complexity, have been developed. The paper presents eLog, a business intelligence solution designed and developed in collaboration between the database group of the University of Modena and Reggio Emilia and eBilling, an Italian SME supplier of solutions for the design, production and automation of documentary processes for top Italian companies. eLog enables business managers to define OLAP reports by means of a web interface and to customize analysis indicators adopting a simple meta-language. The framework translates the user’s reports into MDX queries and is able to automatically select the data cube suitable for each query. Over 140 medium and large companies have exploited the technological services of eBilling S.p.A. to manage their document flows. In particular, eLog services have been used by the major Italian media and telecommunications companies and their foreign branches, such as Sky, Mediaset, H3G, Tim Brazil etc. The largest customer can provide up to 30 million mail pieces within 6 months (about 200 GB of data in the relational DBMS). In a period of 18 months, eLog could reach 150 million mail pieces (1 TB of data) to handle.
2012
- Dimension matching in Peer-to-Peer Data Warehousing
[Conference Proceedings Paper]
Bergamaschi, Sonia; Olaru, Marius Octavian; Sorrentino, Serena; Vincini, Maurizio
abstract
During the last decades, the Data Warehouse has been one of the main components of a Decision Support System (DSS) inside a company. Given the great diffusion of Data Warehouses nowadays, managers have realized that there is a great potential in combining information coming from multiple information sources, like heterogeneous Data Warehouses from companies operating in the same sector. Existing solutions rely mostly on the Extract-Transform-Load (ETL) approach, a costly and complex process. The process of Data Warehouse integration can be greatly simplified by developing a method that is able to semi-automatically discover semantic relationships among attributes of two or more different, heterogeneous Data Warehouse schemas. In this paper, we propose a method for the semi-automatic discovery of mappings between dimension hierarchies of heterogeneous Data Warehouses. Our approach exploits techniques from the Data Integration research area by combining topological properties of dimensions and semantic techniques.
2012
- Mapping and Integration of Dimensional Attributes Using Clustering Techniques.
[Conference Proceedings Paper]
Guerra, Francesco; Olaru, Marius Octavian; Vincini, Maurizio
abstract
Following recent trends in Data Warehousing, companies realized that there is a great potential in combining their information repositories to obtain a broader view of the economical market. Unfortunately, even though Data Warehouse (DW) integration has been defined from a theoretical point of view, until now no complete, widely used methodology has been proposed to support the integration of the information coming from heterogeneous DWs. This paper deals with the automatic integration of dimensional attributes from heterogeneous DWs. A method relying on topological properties that similar dimensions maintain is proposed for discovering mappings of dimensions, and a technique based on clustering algorithms is introduced for integrating the data associated to the dimensions.
2012
- On the Use of Dimension Properties in Heterogeneous Data Warehouse Integration
[Conference Proceedings Paper]
Olaru, Marius Octavian; Vincini, Maurizio
abstract
A new trend in Business Intelligence is the process of combining information from two or more different and heterogeneous Data Warehouses. Existing solutions rely mostly on the Extract-Transform-Load (ETL) approach, a costly and laborious process. The process of Data Warehouse integration can be greatly simplified by developing methods to semi-automatically discover semantic mappings among attributes of two or more different, heterogeneous Data Warehouse schemas, like the one proposed in this paper.
2012
- Working in a dynamic environment: the NeP4B approach as a MAS
[Conference Proceedings Paper]
Bergamaschi, Sonia; Guerra, Francesco; Mandreoli, Federica; Vincini, Maurizio
abstract
Integration of heterogeneous information in the context of the Internet is becoming a key activity to enable a more organized and semantically meaningful access to several kinds of information in the form of data sources, multimedia documents and web services. In NeP4B (Networked Peers for Business), a project funded by the Italian Ministry of University and Research, we developed an approach for providing a uniform representation of data, multimedia and services, thus allowing users to obtain sets of data, multimedia documents and lists of web services as query results. NeP4B is based on a P2P network of semantic peers, connected to each other by means of automatically generated mappings. In this paper we present a new architecture for NeP4B, based on a Multi-Agent System. We claim that such a solution may be more efficient and effective, thanks to the agents’ autonomy and intelligence, in a dynamic environment, where sources are frequently added to (or deleted from) the network.
2011
- A Semantic Approach to ETL Technologies
[Journal Article]
Bergamaschi, Sonia; Guerra, Francesco; Orsini, Mirko; Sartori, Claudio; Vincini, Maurizio
abstract
Data warehouse architectures rely on extraction, transformation and loading (ETL) processes for the creation of an updated, consistent and materialized view of a set of data sources. In this paper, we aim to support these processes by proposing a tool for the semi-automatic definition of inter-attribute semantic mappings and transformation functions. The tool is based on semantic analysis of the schemas for the mapping definitions amongst the data sources and the data warehouse, and on a set of clustering techniques for defining transformation functions homogenizing data coming from multiple sources. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.
2011
- A Web Platform for Collaborative Multimedia Content Authoring Exploiting Keyword Search Engine and Data Cloud
[Journal Article]
Bergamaschi, Sonia; Interlandi, Matteo; Vincini, Maurizio
abstract
The composition of multimedia presentations is a time- and resource-consuming task if not afforded in a well-defined manner. This is particularly true when people having different roles and following different high-level directives, collaborate in the authoring and assembling of a final product. For this reason we adopt the Select, Assemble, Transform and Present (SATP) approach to coordinate the presentation authoring and a tag cloud-based search engine in order to help users in efficiently retrieving useful assets. In this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created.
2011
- A web-based platform for multimedia content authoring exploiting keyword search engine and data cloud
[Conference Proceedings Paper]
Bergamaschi, Sonia; Ferrari, F.; Interlandi, Matteo; Vincini, Maurizio
abstract
The composition of multimedia presentations is a time and resource consuming task if not afforded in a well defined manner. This is particularly true when people having different roles and following different high-level directives, collaborate in the authoring and assembling of a final product. For this reason we adopt the Select, Assemble, Transform and Present (SATP) approach to coordinate the presentation authoring and a tag cloud-based search engine in order to help users in efficiently retrieving useful assets. In this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created.
2011
- MediaBank: Keyword Search and Tag Cloud Functionalities for a Multimedia Content Authoring Web Platform
[Journal Article]
Bergamaschi, Sonia; Interlandi, Matteo; Vincini, Maurizio
abstract
The composition of multimedia presentations is a time- and resource-consuming task if not afforded in a well-defined manner. This is particularly true when people having different roles and following different high-level directives, collaborate in the authoring and assembling of a final product. For this reason we adopt the Select, Assemble, Transform and Present (SATP) approach to coordinate the presentation authoring and a tag cloud-based search engine in order to help users in efficiently retrieving useful assets. In the first part of this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created. In the second part we describe how we adopt keyword search techniques coupled with Tag Cloud in order to summarize the results over the stored data.
2011
- MediaPresenter, a web platform for multimedia content management
[Conference Proceedings Paper]
Bergamaschi, Sonia; Ferrari, F.; Interlandi, Matteo; Vincini, Maurizio
abstract
The composition of multimedia presentations is a time and resource consuming task if not afforded in a well defined manner. This is particularly true for medium/big companies, where people having different roles and following different high-level directives, collaborate in the authoring and assembling of a final product. In this paper we present MediaPresenter, the framework we developed to support companies in the creation of multimedia communication means, providing an instrument that users can exploit every time new communication channels have to be created.
2011
- Semi-automatic Discovery of Mappings Between Heterogeneous Data Warehouse Dimensions
[Journal Article]
Bergamaschi, Sonia; Olaru, Marius Octavian; Sorrentino, Serena; Vincini, Maurizio
abstract
Data Warehousing is the main Business Intelligence instrument for the analysis of large amounts of data. It permits the extraction of relevant information for decision making processes inside organizations. Given the great diffusion of Data Warehouses, there is an increasing need to integrate information coming from independent Data Warehouses or from independently developed data marts in the same Data Warehouse. In this paper, we provide a method for the semi-automatic discovery of common topological properties of dimensions that can be used to automatically map elements of different dimensions in heterogeneous Data Warehouses. The method uses techniques from the Data Integration research area and combines topological properties of dimensions in a multidimensional model.
2011
- Semi-automatic Discovery of Mappings Between Heterogeneous Data Warehouse Dimensions
[Conference Proceedings Paper]
Bergamaschi, Sonia; Olaru, Marius Octavian; Sorrentino, Serena; Vincini, Maurizio
abstract
Data Warehousing is the main Business Intelligence instrument for the analysis of large amounts of data. It permits the extraction of relevant information for decision making processes inside organizations. Given the great diffusion of Data Warehouses, there is an increasing need to integrate information coming from independent Data Warehouses or from independently developed data marts in the same Data Warehouse. In this paper, we provide a method for the semi-automatic discovery of common topological properties of dimensions that can be used to automatically map elements of different dimensions in heterogeneous Data Warehouses. The method uses techniques from the Data Integration research area and combines topological properties of dimensions in a multidimensional model.
2010
- A COMPLETE LCA DATA INTEGRATION SOLUTION BUILT UPON MOMIS SYSTEM
[Conference Proceedings Paper]
Bergamaschi, Sonia; Sgaravato, Luca; Vincini, Maurizio
abstract
Life Cycle Thinking is day by day spreading outside scientific circles to assume a key role in the modern production system. Rewarded by consumers or ruled by governments an increasing number of firms is focusing on the assessment of their industrial processes. ENEA supports the adoption of such practice in small companies supplying them with simplified LCA tools; extending their database with value and up-to-date data published by the European Commission is of primary importance in order to provide effective assistance. This paper presents and demonstrates how the MOMIS (and RELEVANT) systems may be coupled and extended to actually provide a time and effort effective support in developing and deploying such an integration solution. The paper describes all the stages involved in the Extract Transform and Load process, with strong emphasis on the benefits the integration designer can achieve by the means of the semi-automatic definition of inter-attribute mappings and transformation functions [1,2] on a large number of records.
2010
- MOMIS: Getting through the THALIA benchmark
[Conference Proceedings Paper]
Beneventano, Domenico; Bergamaschi, Sonia; Orsini, Mirko; Vincini, Maurizio
abstract
During the last decade many data integration systems characterized by a classical wrapper/mediator architecture based on a Global Virtual Schema (Global Virtual View - GVV) have been proposed. The data sources store data, while the GVV provides a reconciled, integrated, and virtual view of the underlying sources. Each proposed system contributes to the advancement of the state of the art by focusing on different aspects to provide an answer to one or more challenges of the data integration problem, ranging from system-level heterogeneities, to structural syntax level heterogeneities at the semantic level. The approaches are still in part manual, requiring a great amount of customization for data reconciliation and for writing specific non-reusable programming code. The specialization of mediator systems makes comparisons among the various systems difficult. Therefore, the last Lowell Report [1] has provided the guideline for the definition of a public benchmark for data integration problems. The proposal is called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches) [2], and it provides researchers with a collection of downloadable data sources representing University course catalogues, a set of twelve benchmark queries, as well as a scoring function for ranking the performance of an integration system. In this paper we show how the MOMIS mediator system we developed [3,4] can deal with all the twelve queries of the THALIA benchmark by simply extending and combining the declarative translation functions available in MOMIS and without any overhead of new code. This is a remarkable result; in fact, as far as we know, no system has provided a complete answer to the benchmark.
2009
- An ETL tool based on semantic analysis of schemata and instances
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Guerra, Francesco; Orsini, Mirko; Sartori, C.; Vincini, Maurizio
abstract
In this paper we propose a system supporting the semi-automatic definition of inter-attribute mappings and transformation functions used as an ETL tool in a data warehouse project. The tool supports both schema level analysis, exploited for the mapping definitions amongst the data sources and the data warehouse, and instance level operations, exploited for defining transformation functions that integrate data coming from multiple sources in a common representation. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.
2009
- Improving Extraction and Transformation in ETL by Semantic Analysis
[Relazione in Atti di Convegno]
Guerra, Francesco; Bergamaschi, Sonia; Orsini, Mirko; Sartori, Claudio; Vincini, Maurizio
abstract
Extraction, Transformation and Loading (ETL) processes are crucial for data warehouse consistency and are typically based on constraints and requirements expressed in natural language in the form of comments and documentation. This task is poorly supported by automatic software applications, which makes these activities very labor-intensive in data warehousing. In a traditional business scenario, this is not a major issue, since the sources populating a data warehouse are fixed and directly known by the data administrator. Nowadays, business needs require enterprise information systems to be highly flexible with respect to the business analyses allowed and the data treated. Temporary alliances of enterprises, market analysis processes and the availability of data on the Internet push enterprises to quickly integrate unexpected data sources for their activities. Therefore, the reference scenario for data warehouse systems changes radically, since data sources populating the data warehouse may not be directly known and managed by the designers, thus creating new requirements for ETL tools: improved automation of the extraction and transformation process, the management of heterogeneous attribute values, and the ability to handle different kinds of data sources, ranging from DBMSs to flat files, XML documents and spreadsheets. In this paper we propose a semantic-driven tool that couples and extends the functionalities of two systems: the MOMIS integration system and the RELEVANT data analysis system. The tool aims at supporting the semi-automatic definition of ETL inter-attribute mappings and transformations in a data warehouse project. By means of a semantic analysis, two tasks are performed: 1) identification of the parts of the schemata of the data sources which are related to the data warehouse; 2) support for the definition of transformation rules for populating the data warehouse.
We tested the approach in a real scenario: preliminary qualitative results show that our tool can support the data warehouse administrator's work by considerably reducing the data warehouse design time.
2009
- Semantic Analysis for an Advanced ETL framework
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Guerra, Francesco; Orsini, Mirko; Sartori, C.; Vincini, Maurizio
abstract
In this paper we propose a system supporting the semi-automatic definition of inter-attribute mappings and transformation functions used as an ETL tool in a data warehouse project. The tool supports both schema level analysis, exploited for the mapping definitions amongst the data sources and the data warehouse, and instance level operations, exploited for defining transformation functions that integrate data coming from multiple sources in a common representation. Our proposal couples and extends the functionalities of two previously developed systems: the MOMIS integration system and the RELEVANT data analysis system.
2008
- Issues in peer-to-peer electronic services (Extended abstract)
[Relazione in Atti di Convegno]
Comerio, M.; Maurino, A.; Vincini, M.; Viscusi, G.
abstract
The wide diffusion of ICTs has unleashed successive waves of expectations about how the new ways of communicating and exchanging information and knowledge would affect inter-firm relationships, hopefully improving their performance. However, facts have often downsized those expectations. Even the so-called Web revolution did not solve the above-mentioned issues. In fact, the adoption of Internet-based tools by many SMEs is still limited to Internet access, e-mail and static company Web sites. Today the new frontier is the adoption of the peer-to-peer approach and semantic-based technologies such as semantic Web services and ontologies. However, a crucial point for the future development of peer-to-peer semantic networks is understanding whether, and if so how, software architectures based on those concepts can create economic value for all network members. This paper presents relevant issues in the provisioning of electronic services in peer-to-peer environments and proposes preliminary solutions under development in the context of the NeP4B (Network Peers for Business) project.
2008
- Open Source come modello di business per le PMI: analisi critica e casi di studio
[Capitolo/Saggio]
Bergamaschi, Sonia; Nigro, Francesco; Po, Laura; Vincini, Maurizio
abstract
Open Source software is attracting attention at every level, in both the economic and the manufacturing world, because it proposes a new model of technological and economic development that is strongly innovative and breaks with the past. This work analyzes the reasons behind the success of this model and presents a number of cases in which Open Source proves advantageous, highlighting the aspects of greatest interest both to users and to producers of software.
2007
- MELIS: An Incremental Method For The Lexical Annotation Of Domain Ontologies
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Bouquet, P.; Giacomuzzi, D.; Guerra, Francesco; Po, Laura; Vincini, Maurizio
abstract
In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is its incrementality: the higher the number of schemas which are processed, the more background/domain knowledge is accumulated in the system (a portion of domain ontology is learned at every step), and the better the performance of the system on annotating new schemas. MELIS has been tested as a component of MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database. We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results of MELIS as a standalone tool and as a component integrated in MOMIS.
2007
- MELIS: a tool for the incremental annotation of domain ontologies
[Software]
Bergamaschi, Sonia; Bouquet, Paolo; Giacomuzzi, Daniel; Guerra, Francesco; Po, Laura; Vincini, Maurizio
abstract
MELIS is a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is its incrementality: the higher the number of schemas which are processed, the more background/domain knowledge is accumulated in the system (a portion of domain ontology is learned at every step), and the better the performance of the system on annotating new schemas.
2007
- Melis: an incremental method for the lexical annotation of domain ontologies
[Articolo su rivista]
Bergamaschi, Sonia; Bouquet, P.; Giacomuzzi, D.; Guerra, Francesco; Po, Laura; Vincini, Maurizio
abstract
In this paper, we present MELIS (Meaning Elicitation and Lexical Integration System), a method and a software tool for enabling an incremental process of automatic annotation of local schemas (e.g. relational database schemas, directory trees) with lexical information. The distinguishing and original feature of MELIS is the incremental process: the higher the number of schemas which are processed, the more background/domain knowledge is accumulated in the system (a portion of domain ontology is learned at every step), and the better the performance of the system on annotating new schemas. MELIS has been tested as a component of MOMIS-Ontology Builder, a framework able to create a domain ontology representing a set of selected data sources, described with a standard W3C language wherein concepts and attributes are annotated according to the lexical reference database. We describe the MELIS component within the MOMIS-Ontology Builder framework and provide some experimental results of MELIS as a standalone tool and as a component integrated in MOMIS.
2007
- Progetto di Basi di Dati Relazionali
[Monografia/Trattato scientifico]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
The aim of the volume is to provide the reader with the fundamental notions of the design and implementation of relational database applications. With regard to design, the conceptual and logical design phases are covered, and the Entity-Relationship and Relational data models are presented, which are the basic tools for conceptual design and logical design, respectively. The student is also introduced to the theory of normalization of relational databases. With regard to implementation, elements and examples of SQL, the standard language for RDBMSs (Relational Database Management Systems), are presented. Ample space is devoted to worked exercises on the topics covered. The volume stems from the authors' many years of teaching experience in the Database and Information Systems courses for students of the bachelor's and master's degree programmes of the Faculty of Engineering of Modena, the Faculty of Engineering of Reggio Emilia and the "Marco Biagi" Faculty of Economics of the Università degli Studi di Modena e Reggio Emilia. The current volume considerably extends the previous editions, enriching the sections on logical design and SQL. The exercise section is completely new; in addition, further exercises are available on this web page. Like the previous editions, it is more a collection of notes than a true book, in the sense that it treats the concepts in a rigorous but essential way. Moreover, it does not cover all the topics of a Database course, whose other fundamental component is database technology. This component is, in the authors' opinion, treated excellently in another Database textbook, written by our colleagues and friends Paolo Ciaccia and Dario Maio of the University of Bologna.
The volume, despite its conciseness, is rich in worked exercises and can therefore be an excellent tool for working groups that, within software houses, deal with the design of relational database applications.
2007
- Query Translation on heterogeneous sources in MOMIS Data Transformation Systems
[Relazione in Atti di Convegno]
Beneventano, Domenico; Vincini, Maurizio; Orsini, Mirko; Bergamaschi, Sonia; Nana, C.
abstract
2007
- Querying a super-peer in a schema-based super-peer network
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
We propose a novel approach for defining and querying a super-peer within a schema-based super-peer network organized into a two-level architecture: the lower level, called the peer level (which contains a mediator node), and the upper level, called the super-peer level (which integrates mediator peers with similar content). We focus on a single super-peer and propose a method to define and solve a query, fully implemented in the SEWASIE project prototype. The problem we faced is relevant because a super-peer is a two-level data integration system, so we are going beyond the traditional setting in data integration. We have two different levels of Global as View mappings: the first mapping is at the super-peer level and maps several Global Virtual Views (GVVs) of peers into the GVV of the super-peer; the second mapping is within a peer and maps the data sources into the GVV of the peer. Moreover, we propose an approach where the integration designer, supported by a graphical interface, can implicitly define mappings by using Resolution Functions to solve data conflicts, and the Full Disjunction operator, which has been recognized as providing a natural semantics for data merging queries.
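As a rough illustration of the idea of resolution functions mentioned in this abstract, the following minimal Python sketch merges records from two sources and resolves conflicting attribute values. It is not the SEWASIE or MOMIS implementation: the function name merge_sources, the sample hotel records and the averaging resolver are all hypothetical, and a full outer join on a key is used only because, for two sources, it is the simplest stand-in for the Full Disjunction operator.

```python
def merge_sources(source_a, source_b, key, resolvers):
    """Full outer join of two record lists on `key` (hypothetical helper).

    When both sources give different values for an attribute, the
    resolution function registered for that attribute picks the result;
    with no registered function, the first source's value is kept.
    """
    by_key_a = {r[key]: r for r in source_a}
    by_key_b = {r[key]: r for r in source_b}
    merged = []
    for k in sorted(by_key_a.keys() | by_key_b.keys()):
        a, b = by_key_a.get(k, {}), by_key_b.get(k, {})
        record = {key: k}
        for attr in sorted((a.keys() | b.keys()) - {key}):
            values = [v for v in (a.get(attr), b.get(attr)) if v is not None]
            if len(set(values)) <= 1:
                # No conflict: zero or one distinct value.
                record[attr] = values[0] if values else None
            else:
                resolve = resolvers.get(attr, lambda vs: vs[0])
                record[attr] = resolve(values)
        merged.append(record)
    return merged


# Illustrative data only.
source_a = [{"id": 1, "name": "Hotel Roma", "price": 90},
            {"id": 2, "name": "Hotel Po", "price": 70}]
source_b = [{"id": 1, "name": "Hotel Roma", "price": 110},
            {"id": 3, "name": "Hotel Arno", "price": 80}]
resolvers = {"price": lambda vs: sum(vs) / len(vs)}  # average conflicting prices
merged = merge_sources(source_a, source_b, "id", resolvers)
```

Records found in only one source are kept unchanged, while the conflicting price for id 1 is replaced by the value chosen by the resolver.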
2007
- Relevant News: a semantic news feed aggregator
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Guerra, Francesco; Orsini, Mirko; Sartori, C; Vincini, Maurizio
abstract
In this paper we present RELEVANTNews, a web feed reader that automatically groups news items related to the same topic published in different newspapers on different days. The tool is based on RELEVANT, a previously developed tool, which computes the “relevant values”, i.e. a subset of the values of a string attribute. By clustering the titles of the news feeds selected by the user, it is possible to identify sets of related news items on the basis of syntactic and lexical similarity. RELEVANTNews may be used in its default configuration or in a personalized way: the user may tune some parameters in order to improve the grouping results. We tested the tool with more than 700 news items published in 30 newspapers over four days, and some preliminary results are discussed.
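To give a concrete feel for grouping news titles by syntactic similarity, here is a minimal Python sketch. It does not reproduce the RELEVANT algorithm: the Jaccard measure over word tokens, the greedy grouping strategy, the 0.4 threshold and the sample titles are all assumptions chosen for illustration.

```python
import re


def tokens(title):
    """Lowercase a title and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_titles(titles, threshold=0.4):
    """Greedily put each title in the first group holding a similar title."""
    groups = []
    for title in titles:
        toks = tokens(title)
        for group in groups:
            if any(jaccard(toks, tokens(other)) >= threshold for other in group):
                group.append(title)
                break
        else:
            # No existing group is similar enough: start a new one.
            groups.append([title])
    return groups


# Illustrative titles only.
titles = [
    "Storm hits northern Italy",
    "Northern Italy hit by storm",
    "Central bank raises interest rates",
]
groups = group_titles(titles)
```

The threshold plays the role of a user-tunable parameter: raising it yields smaller, tighter groups, while lowering it merges more loosely related titles.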
2007
- The SEWASIE MAS for Semantic Search
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
The capillary diffusion of the Internet has made an overwhelming amount of data accessible, allowing users to benefit from a vast amount of information. However, information is not really directly available: Internet data are heterogeneous and spread over different places, with several duplications and inconsistencies. The integration of such heterogeneous, inconsistent data, with data reconciliation and data fusion techniques, may therefore represent a key activity enabling a more organized and semantically meaningful access to data sources. Some issues remain to be solved, concerning in particular the discovery and the explicit specification of the relationships between abstract data concepts and the need for data reliability in a dynamic, constantly changing network. Ontologies provide a key mechanism for solving these challenges, but the web’s dynamic nature leaves open the question of how to manage them. Many solutions based on ontology creation by a mediator system have been proposed: a unified virtual view (the ontology) of the underlying data sources is obtained, giving users transparent access to the integrated data sources. The centralized architecture of a mediator system presents several limitations, emphasized in the hidden web: firstly, web data sources hold information according to their particular view of the matter, i.e. each of them uses a specific ontology to represent its data. Also, data sources are usually isolated, i.e. they do not share any topological information concerning the content or structure of other sources. Our proposal is to develop a network of ontology-based mediator systems, where mediators are not isolated from each other and include tools for sharing and mapping their ontologies. In this paper, we describe the use of a multi-agent architecture to achieve and manage the mediators network.
The functional architecture is composed of single peers (implemented as mediator agents) independently carrying out their own integration activities. Such agents may then exchange data and knowledge with other peers by means of specialized agents (called brokering agents) which provide a coherent access plan to the peer network. In this way, two layers are defined in the architecture: at the local level, peers maintain an integrated view of local sources; at the network level, agents maintain mappings among the different peers. The result is the definition of a new type of mediator system network intended to operate in web economies, which we realized within SEWASIE (SEmantic Webs and AgentS in Integrated Economies), an RDT project supported by the 5th Framework IST program of the European Community, which ended successfully in September 2005.
2007
- The SEWASIE Network of Mediator Agents for Semantic Search
[Articolo su rivista]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
Integration of heterogeneous information in the context of the Internet becomes a key activity to enable a more organized and semantically meaningful access to data sources. As the Internet can be viewed as a data-sharing network where sites are data sources, the challenge is twofold. Firstly, sources present information according to their particular view of the matter, i.e. each of them assumes a specific ontology. Secondly, data sources are usually isolated, i.e. they do not share any topological information concerning the content or the structure of other sources. The classical approach to solving these issues is provided by mediator systems, which aim at creating a unified virtual view of the underlying data sources in order to hide the heterogeneity of data and give users transparent access to the integrated information. In this paper we propose to use a multi-agent architecture to build and manage a mediators network. While a single peer (i.e. a mediator agent) independently carries out data integration activities, it exchanges knowledge with other peers by means of specialized agents (i.e. brokers) which provide a coherent access plan to access information in the peer network. This defines two layers in the system: at the local level, peers maintain an integrated view of local sources, while at the network level agents maintain mappings among the different peers. The result is the definition of a new networked mediator system intended to operate in web economies, which we realized in the SEWASIE (SEmantic Webs and AgentS in Integrated Economies) project. SEWASIE is an RDT project supported by the 5th Framework IST program of the European Community, which ended successfully in September 2005.
2006
- An incremental method for meaning elicitation of a domain ontology
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Bouquet, P.; Giacomuzzi, D.; Guerra, Francesco; Po, Laura; Vincini, Maurizio
abstract
The Internet has opened access to an overwhelming amount of data, requiring the development of new applications to automatically recognize, process and manage information available in web sites or web-based applications. The standard Semantic Web architecture exploits ontologies to give a shared (and known) meaning to each web source element. In this context, we developed MELIS (Meaning Elicitation and Lexical Integration System). MELIS couples the lexical annotation module of the MOMIS system with some components from CTXMATCH2.0, a tool for eliciting meaning from several types of schemas and matching them. MELIS uses the MOMIS WNEditor and CTXMATCH2.0 to support two main tasks in the MOMIS ontology generation methodology: the source annotation process, i.e. the operation of associating an element of a lexical database to each source element, and the extraction of lexical relationships among elements of different data sources.
2006
- An intelligent data integration approach for collaborative project management in virtual enterprises
[Articolo su rivista]
Bergamaschi, Sonia; Gelati, Gionata; Guerra, Francesco; Vincini, Maurizio
abstract
The increasing globalization and flexibility required of companies have generated new issues in the last decade related to the management of large-scale projects and to the cooperation of enterprises within geographically distributed networks. ICT support systems are required to help enterprises share information, guarantee data consistency and establish synchronized and collaborative processes. In this paper we present a collaborative project management system that integrates data coming from aerospace industries with one main goal: to facilitate the assembly, integration and verification activities of a multi-enterprise project. The main achievement of the system from a data management perspective is that it avoids inconsistencies generated by updates at the source level and minimizes data replication. The developed system is composed of a collaborative project management component supported by a web interface, a multi-agent data integration system, which supports information sharing and querying, and web services that ensure the interoperability of the software components. The system was developed by the University of Modena and Reggio Emilia and Gruppo Formula S.p.A., and tested by Alenia Spazio S.p.A., within the EU WINK Project (Web-linked Integration of Network based Knowledge-IST-2000-28221).
2006
- Instances Navigation for Querying Integrated Data from Web-Sites
[Capitolo/Saggio]
Beneventano, Domenico; Bergamaschi, Sonia; Bruschi, Stefania; Guerra, Francesco; Orsini, Mirko; Vincini, Maurizio
abstract
Research on data integration has provided a set of rich and well understood schema mediation languages and systems that provide a meta-data representation of the modeled real world, while, in general, they do not deal with data instances. Such meta-data are necessary for querying the classes resulting from an integration process: the end user typically does not know the contents of such classes and simply defines queries on the basis of the names of classes and attributes. In this paper we introduce an approach that enriches the description of selected attributes by specifying as meta-data a list of the “relevant values” for such attributes. Furthermore, relevant values may be hierarchically collected in a taxonomy. In this way, the user may exploit new meta-data in the interactive process of creating/refining a query. The same meta-data are also exploited by the system in the query rewriting/unfolding process in order to filter the results shown to the user. We conducted an evaluation of the strategy in an e-business context within the EU-IST SEWASIE project. The evaluation proved the practicability of the approach for large value instances.
2006
- Instances navigation for querying integrated data from web-sites
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Bruschi, Stefania; Guerra, Francesco; Orsini, Mirko; Vincini, Maurizio
abstract
Research on data integration has provided a set of rich and well understood schema mediation languages and systems that provide a meta-data representation of the modeled real world, while, in general, they do not deal with data instances. Such meta-data are necessary for querying the classes resulting from an integration process: the end user typically does not know the contents of such classes and simply defines queries on the basis of the names of classes and attributes. In this paper we introduce an approach that enriches the description of selected attributes by specifying as meta-data a list of the “relevant values” for such attributes. Furthermore, relevant values may be hierarchically collected in a taxonomy. In this way, the user may exploit new meta-data in the interactive process of creating/refining a query. The same meta-data are also exploited by the system in the query rewriting/unfolding process in order to filter the results shown to the user. We conducted an evaluation of the strategy in an e-business context within the EU-IST SEWASIE project. The evaluation proved the practicability of the approach for large value instances.
2005
- Building a tourism information provider with the MOMIS system
[Articolo su rivista]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
The tourism industry is a good candidate for taking up Semantic Web technology. In fact, there are many portals and websites belonging to the tourism domain that promote tourist products (places to visit, food to eat, museums, etc.) and tourist services (hotels, events, etc.), published by several operators (tourist promoter associations, public agencies, etc.). This article presents how the MOMIS system may be used for building a tourism information provider by exploiting the tourism information that is available in Internet websites. MOMIS (Mediator envirOnment for Multiple Information Sources) is a mediator framework that performs information extraction and integration from heterogeneous distributed data sources and includes query management facilities to transparently support queries posed to the integrated data sources.
2005
- SEWASIE - SEmantic Webs and AgentS in Integrated Economies.
[Software]
Bergamaschi, Sonia; Beneventano, Domenico; Vincini, Maurizio; Guerra, Francesco
abstract
SEWASIE (SEmantic Webs and AgentS in Integrated Economies) aims to design and implement an advanced search engine enabling intelligent access to heterogeneous data sources on the web via semantic enrichment to provide the basis of structured secure web-based communication. SEWASIE implemented an advanced search engine that provides intelligent access to heterogeneous data sources on the web via semantic enrichment to provide the basis of structured secure web-based communication. SEWASIE provides users with a search client that has an easy-to-use query interface, and which can extract the required information from the Internet and can show it in a useful and user-friendly format. From an architectural point of view, the prototype provides a search engine client and indexing servers and ontologies.
2004
- A Web Service based framework for the semantic mapping between product classification schemas
[Articolo su rivista]
Beneventano, Domenico; Guerra, Francesco; Magnani, Stefania; Vincini, Maurizio
abstract
A marketplace is the place where the demands and offers of buyers and sellers participating in a business transaction may meet. Therefore, electronic marketplaces are virtual communities in which buyers may receive proposals from several suppliers and make the best choice. In the electronic commerce world, the comparison between different products is not possible due to the lack of common standards, used by the community, for describing and classifying them. Therefore, B2B and B2C marketplaces have to reclassify products and goods according to different standardization models. In this paper, we propose a semi-automatic methodology, supported by a web service based framework, to define semantic mappings amongst different product classification schemas (e-commerce standards and catalogues), and we provide the ability to search and navigate these mappings. The proposed methodology is shown over fragments of the UNSPSC and ecl@ss standards and over a fragment of the eBay online catalogue.
2004
- A peer-to-peer information system for the semantic web
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
Data integration, in the context of the web, faces new problems, due in particular to the heterogeneity of sources, to the fragmentation of the information and to the absence of a unique way to structure and view information. In such areas, the traditional paradigms on which database foundations are based (i.e. client-server architecture, few sources containing large amounts of information) have to be superseded by new architectures. The peer-to-peer (P2P) architecture seems to be the best way to accommodate these new kinds of data sources, offering an alternative to traditional client/server architecture. In this paper we present the SEWASIE system, which aims at providing access to heterogeneous web information sources. An enhancement of the system architecture in the direction of P2P architecture, where connections among SEWASIE peers rely on the exchange of XML metadata, is described.
2004
- MOMIS: an Ontology-based Information Integration System (software)
[Software]
Bergamaschi, Sonia; Beneventano, Domenico; Guerra, Francesco; Orsini, Mirko; Vincini, Maurizio
abstract
The Mediator Environment for Multiple Information Sources (MOMIS), developed by the database research group at the University of Modena and Reggio Emilia, aims to construct synthesized, integrated descriptions of information coming from multiple heterogeneous sources. Our goal is to provide users with a global virtual view (GVV) of information sources, independent of their location or their data’s heterogeneity. Such a view conceptualizes the underlying domain; you can think of it as an ontology describing the sources involved. The Semantic Web exploits semantic markups to provide Web pages with machine-readable definitions. It thus relies on the a priori existence of ontologies that represent the domains associated with the given information sources. This approach relies on the selected reference ontology’s accuracy, but we find that most ontologies in common use are generic and that the annotation phase (in which semantic annotations connect Web page parts to ontology items) causes a loss of semantics. By involving the sources themselves, our approach builds an ontology that more precisely represents the domain. Moreover, the GVV is annotated according to a lexical ontology, which provides an easily understandable meaning to content. An open source version of the MOMIS system was released in April 2010 by the spin-off DATARIVER (www.datariver.it).
2004
- SOAP-ENABLED WEB SERVICES FOR KNOWLEDGE MANAGEMENT
[Articolo su rivista]
Benetti, I.; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
The widespread diffusion of the World Wide Web among medium/small companies makes a huge amount of business information available online. Nevertheless, the heterogeneity of that information forces even trading partners involved in the same business process to face daily interoperability issues. The challenge is the integration of distributed business processes, which, in turn, means integration of heterogeneous data coming from distributed sources. This paper presents the new web services-based architecture of the MOMIS (Mediator envirOnment for Multiple Information Sources) framework, which enhances the semantic integration features of MOMIS, leveraging new technologies such as XML web services and the SOAP protocol. The new architecture decouples the different MOMIS modules, publishing them as XML web services. Since the SOAP protocol used to access XML web services requires the same network security settings as a normal internet browser, companies are enabled to share knowledge without softening their protection strategies.
2004
- Synthesizing an Integrated Ontology with MOMIS
[Relazione in Atti di Convegno]
Benassi, Roberta; Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
The Mediator EnvirOnment for Multiple Information Sources (MOMIS) aims at constructing synthesized, integrated descriptions of the information coming from multiple heterogeneous sources, in order to provide the user with a global virtual view of the sources independent from their location and the level of heterogeneity of their data. Such a global virtual view is a conceptualization of the underlying domain and then may be thought of as an ontology describing the involved sources. In this article we explore the framework’s main elements and discuss how the output of the integration process can be exploited to create a conceptualization of the underlying domain.
2004
- TUCUXI: the Intelligent Hunter Agent for Concept Understanding and Lexical Chaining
[Relazione in Atti di Convegno]
Benassi, Roberta; Bergamaschi, Sonia; Vincini, Maurizio
abstract
In this paper we present Tucuxi, an intelligent hunter agent that replaces traditional keyword-based queries on the Web with a user-provided domain ontology, where meanings to be searched are not ambiguous.
2004
- Web Semantic Search with TUCUXI
[Relazione in Atti di Convegno]
R., Benassi; Bergamaschi, Sonia; Vincini, Maurizio
abstract
S. Margherita di Pula (Cagliari), Italy, 21-23 June.
2003
- Experiencing AUML for the WINK Multi-Agent System
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Gelati, Gionata; Guerra, Francesco; Vincini, Maurizio
abstract
In the last few years, efforts have been made towards bridging the gap between agent technology and de facto standard technologies, aiming at introducing multi-agent systems in industrial applications. This paper presents an experience gained by using one of such proposals, Agent UML. Agent UML is a graphical modelling language based on UML. The practical usage of this notation has led us to suggest some refinements of the Agent UML features.
2003
- A Peer-to-Peer Agent-Based Semantic Search Engine
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Fergnani, Alain; Guerra, Francesco; Vincini, Maurizio; D., Montanari
abstract
Several architectures, protocols, languages, and candidate standards have been proposed to let the "semantic web" idea take off. In particular, searching for information requires cooperation of the information providers and seekers. Past experience and history show that a successful architecture must support ease of adoption and deployment by a wide and heterogeneous population, a flexible policy to establish an acceptable cost-benefit ratio for using the system, and the growth of a cooperative distributed infrastructure with no central control. In this paper an agent-based peer-to-peer architecture is defined to support search through a flexible integration of semantic information. Two levels of integration are foreseen: strong integration of sources related to the same domain into a single information node by means of a mediator-based system; weak integration of information nodes on the basis of semantic relationships existing among concepts of different nodes. The EU IST SEWASIE project is described as an instantiation of this architecture. SEWASIE aims at implementing an advanced search engine, which will provide SMEs with intelligent access to heterogeneous information on the Internet.
2003
- Building an Integrated Ontology within the SEWASIE Project: The Ontology Builder Tool
[Abstract in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; D., Miselli; A., Fergnani; Vincini, Maurizio
abstract
See http://www.sewasie.org/
2003
- Building an integrated Ontology within SEWASIE system
[Relazione in Atti di Convegno]
Beneventano, D.; Bergamaschi, S.; Guerra, F.; Vincini, M.
abstract
The SEWASIE (SEmantic Webs and AgentS in Integrated Economies) project (IST-2001-34825) is a European research project that aims at designing and implementing an advanced search engine enabling intelligent access to heterogeneous data sources on the web. In this paper we focus on the Ontology Builder component of the SEWASIE system, which is a framework for information extraction and integration of heterogeneous structured and semi-structured information sources, built upon the MOMIS (Mediator envirOnment for Multiple Information Sources) system. The result of the integration process is a Global Virtual View (in short GVV) which is a set of (global) classes that represent the information contained in the sources being used. In particular, we present the application of our integration concerning a specific type of source (i.e. web documents), and show the extension of a built-up GVV by the addition of another source.
2003
- Building an integrated Ontology within the SEWASIE system
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
MOMIS (Mediator envirOnment for Multiple Information Sources) is a framework for information extraction and integration of heterogeneous structured and semi-structured information sources. The result of the integration process is a Global Virtual View (in short GVV) which is a set of (global) classes that represent the information contained in the sources being used. In this paper, we present the application of our integration concerning a specific type of source (i.e. web documents), and show how the result of the integration approach can be exploited to create a conceptualization of the domain to which the sources belong, i.e. an ontology. Two new achievements of the MOMIS system are presented: the semi-automatic annotation of the GVV and the extension of a built-up ontology by the addition of another source.
2003
- MIKS: an agent framework supporting information access and integration
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; J., Gelati; Guerra, Francesco; Vincini, Maurizio
abstract
Providing an integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In the past, companies have equipped themselves with data storing systems building up informative systems containing data that are related one another, but which are often redundant, not homogeneous and not always semantically consistent. Moreover, to meet the requirements of global, Internet-based information systems, it is important that the tools developed for supporting these activities are semi-automatic and scalable as much as possible. To face the issues related to scalability in the large-scale, in this paper we propose the exploitation of mobile agents in the information integration area, and, in particular, their integration in the MOMIS infrastructure. MOMIS (Mediator EnvirOnment for Multiple Information Sources) is a system that has been conceived as a pool of tools to provide an integrated access to heterogeneous information stored in traditional databases (for example relational, object oriented databases) or in file systems, as well as in semi-structured data sources (XML-file). This proposal has been implemented within the MIKS (Mediator agent for Integration of Knowledge Sources) system and it is completely described in this paper.
2003
- Synthesizing an integrated ontology
[Articolo su rivista]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
To exploit the Internet’s expanding data collection, current Semantic Web approaches employ annotation techniques to link individual information resources with machine-comprehensible metadata. Before we can realize the potential this new vision presents, however, several issues must be solved. One of these is the need for data reliability in dynamic, constantly changing networks. Another issue is how to explicitly specify relationships between abstract data concepts. Ontologies provide a key mechanism for solving these challenges, but the Web’s dynamic nature leaves open the question of how to manage them. The Mediator Environment for Multiple Information Sources (Momis), developed by the database research group at the University of Modena and Reggio Emilia, aims to construct synthesized, integrated descriptions of information coming from multiple heterogeneous sources. Our goal is to provide users with a global virtual view (GVV) of information sources, independent of their location or their data’s heterogeneity. Such a view conceptualizes the underlying domain; you can think of it as an ontology describing the sources involved. The Semantic Web exploits semantic markups to provide Web pages with machine-readable definitions. It thus relies on the a priori existence of ontologies that represent the domains associated with the given information sources. This approach relies on the selected reference ontology’s accuracy, but we find that most ontologies in common use are generic and that the annotation phase (in which semantic annotations connect Web page parts to ontology items) causes a loss of semantics. By involving the sources themselves, our approach builds an ontology that more precisely represents the domain. Moreover, the GVV is annotated according to a lexical ontology, which provides an easily understandable meaning to content. 
In this article, we use Web documents as a representative information source to describe the Momis methodology’s general application. We explore the framework’s main elements and discuss how the output of the integration process can be exploited to create a conceptualization of the underlying domain. In particular, our method provides a way to extend previously created conceptualizations, rather than starting from scratch, by inserting a new source.
2003
- WINK: A web-based system for collaborative project management in virtual enterprises
[Relazione in Atti di Convegno]
Bergamaschi, S.; Gelati, G.; Guerra, F.; Vincini, M.
abstract
The increasing globalization and the flexibility required of companies have generated, in the last decade, new issues related to the management of large-scale projects within geographically distributed networks and to the cooperation of enterprises. ICT support systems are required to allow enterprises to share information, guarantee data consistency and establish synchronized and collaborative processes. In this paper we present a collaborative project management system that integrates data coming from aerospace industries with two main goals: avoiding inconsistencies generated by updates at the sources’ level and minimizing data replications. The proposed system is composed of a collaborative project management component supported by a web interface, a multi-agent data integration component, which supports information sharing and querying, and SOAP-enabled web services which ensure the whole interoperability of the software components. The system was developed by the University of Modena and Reggio Emilia, Gruppo Formula S.p.A. and Alenia Spazio S.p.A. within the EU WINK Project (Web-linked Integration of Network based Knowledge - IST-2000-28221).
2002
- A data integration framework for e-commerce product classification
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
A marketplace is the place in which the demand and supply of buyers and vendors participating in a business process may meet. Therefore, electronic marketplaces are virtual communities in which buyers may meet proposals of several suppliers and make the best choice. In the electronic commerce world, the comparison between different products is blocked due to the lack of standards (on the contrary, the proliferation of standards) describing and classifying them. Therefore, the need for B2B and B2C marketplaces is to reclassify products and goods according to different standardization models. This paper aims to face this problem by suggesting the use of a semi-automatic methodology, supported by a tool (SI-Designer), to define the mapping among different e-commerce product classification standards. This methodology was developed for the MOMIS system within the Intelligent Integration of Information research area. We describe our extension to the methodology that makes it applicable in general to product classification standards, by selecting a fragment of the ECCMA/UNSPSC and ecl@ss standards.
2002
- A semantic approach to access heterogeneous data sources: the SEWASIE Project
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Vincini, Maurizio
abstract
SEWASIE is implementing an advanced search engine that provides intelligent access to heterogeneous data sources on the web via semantic enrichment. This can be thought of as the basis of structured secure web-based communication. SEWASIE provides users with a search client that has an easy-to-use query interface, and which can extract the required information from the Internet and show it in a useful and user-friendly format. From an architectural point of view, the prototype will provide a search engine client and indexing servers and ontologies. There are many benefits to be had from such a system. There will be a reduction of transaction costs by efficient search and communication facilities. Within the business context, the system will support integrated searching and negotiating, which will promote the take-up of key technologies for SMEs and give them a competitive edge.
2002
- An Agent framework for Supporting the MIKS Integration Process
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; M., Felice; D., Gazzotti; Gelati, Gionata; Guerra, Francesco; Vincini, Maurizio
abstract
Providing an integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In the past, companies have equipped themselves with data storing systems building up informative systems containing data that are related one another, but which are often redundant, not homogeneous and not always semantically consistent. Moreover, to meet the requirements of global, Internet-based information systems, it is important that the tools developed for supporting these activities are semi-automatic and scalable as much as possible. To face the issues related to scalability in the large-scale, in this paper we propose the exploitation of mobile agents in the information integration area, and, in particular, the roles they play in enhancing the features of the MOMIS infrastructure. MOMIS (Mediator envirOnment for Multiple Information Sources) is a system that has been conceived as a pool of tools to provide an integrated access to heterogeneous information stored in traditional databases (for example relational, object oriented databases) or in file systems, as well as in semi-structured data sources (XML-file). In this paper we describe the new agent-based framework concerning the integration process as implemented in the MIKS (Mediator agent for Integration of Knowledge Sources) system.
2002
- An information integration framework for E-commerce
[Articolo su rivista]
I., Benetti; Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
The Web has transformed electronic information systems from single, isolated nodes into a worldwide network of information exchange and business transactions. In this context, companies have equipped themselves with high-capacity storage systems that contain data in several formats. The problems faced by these companies often emerge because the storage systems lack structural and application homogeneity in addition to a common ontology. The semantic differences generated by a lack of consistent ontology can lead to conflicts that range from simple name contradictions (when companies use different names to indicate the same data concept) to structural incompatibilities (when companies use different models to represent the same information types). One of the main challenges for e-commerce infrastructure designers is information sharing and retrieving data from different sources to obtain an integrated view that can overcome any contradictions or redundancies. Virtual catalogs can help overcome this challenge because they act as instruments to retrieve information dynamically from multiple catalogs and present unified product data to customers. Instead of having to interact with multiple heterogeneous catalogs, customers can instead interact with a virtual catalog in a straightforward, uniform manner. This article presents a virtual catalog project called Momis (mediator environment for multiple information sources). Momis is a mediator-based system for information extraction and integration that works with structured and semistructured data sources. Momis includes a component called the SI-Designer for semiautomatically integrating the schemas of heterogeneous data sources, such as relational, object, XML, or semistructured sources. Starting from local source descriptions, the Global Schema Builder generates an integrated view of all data sources and expresses those views using XML.
Momis lets you use the infrastructure with other open integration information systems by simply interchanging XML data files. Momis creates XML global schema using different stages, first by creating a common thesaurus of intra and interschema relationships. Momis extracts the intraschema relationships by using inference techniques, then shares these relationships in the common thesaurus. After this initial phase, Momis enriches the common thesaurus with interschema relationships obtained using the lexical WordNet system (www.cogsci.princeton.edu/wn), which identifies the affinities between interschema concepts on the basis of their lexicon meaning. Momis also enriches the common thesaurus using the Artemis system, which evaluates structural affinities among interschema concepts.
2002
- MOMIS: Exploiting agents to support information integration
[Articolo su rivista]
Cabri, Giacomo; Guerra, Francesco; Vincini, Maurizio; Bergamaschi, Sonia; Leonardi, Letizia; Zambonelli, Franco
abstract
Information overloading introduced by the large amount of data that is spread over the Internet must be faced in an appropriate way. The dynamism and the uncertainty of the Internet, along with the heterogeneity of the sources of information are the two main challenges for today's technologies related to information management. In the area of information integration, this paper proposes an approach based on mobile software agents integrated in the MOMIS (Mediator envirOnment for Multiple Information Sources) infrastructure, which enables semi-automatic information integration to deal with the integration and query of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The exploitation of mobile agents in MOMIS can significantly increase the flexibility of the system. In fact, their characteristics of autonomy and adaptability well suit the distributed and open environments, such as the Internet. The aim of this paper is to show the advantages of the introduction in the MOMIS infrastructure of intelligent and mobile software agents for the autonomous management and coordination of integration and query processing over heterogeneous data sources.
2002
- Product Classification Integration for E-Commerce
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
A marketplace is the place where the demand and supply of buyers and vendors participating in a business process may meet. Therefore, electronic marketplaces are virtual communities in which buyers may meet proposals of several suppliers and make the best choice. In the electronic commerce world, the comparison between different products is blocked due to the lack of standards (on the contrary, the proliferation of standards) describing and classifying them. Therefore, the need for B2B and B2C marketplaces is to reclassify products and goods according to different standardization models. This paper aims to face this problem by suggesting the use of a semi-automatic methodology to define a mapping among different e-commerce product classification standards. This methodology is an extension of the MOMIS-system, a mediator system developed within the Intelligent Integration of Information research area.
2002
- SI-Web: a Web based interface for the MOMIS project
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; D., Bianco; Guerra, Francesco; Vincini, Maurizio
abstract
The MOMIS project (Mediator envirOnment for Multiple Information Sources) developed in the past years allows the integration of data from structured and semi-structured data sources. SI-Designer (Source Integrator Designer) is a designer support tool implemented within the MOMIS project for semi-automatic integration of heterogeneous sources schemata. It is a Java application where all modules involved are available as CORBA objects and interact using established IDL interfaces. The goal of this demonstration is to present a new tool, SI-Web (Source Integrator on Web), which offers the same features as SI-Designer but has the great advantage of being usable on the Internet through a web browser.
2002
- Semantic Integration and Query Optimization of Heterogeneous Data Sources
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Beneventano, Domenico; Castano, S; DE ANTONELLIS, V; Ferrara, A; Guerra, Francesco; Mandreoli, Federica; ORNETTI G., C; Vincini, Maurizio
abstract
In modern Internet/Intranet-based architectures, an increasing number of applications requires an integrated and uniform access to a multitude of heterogeneous and distributed data sources. In this paper, we describe the ARTEMIS/MOMIS system for the semantic integration and query optimization of heterogeneous structured and semistructured data sources.
2002
- The WINK Project for Virtual Enterprise Networking and Integration
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Gazzotti, Davide; Gelati, Gionata; Guerra, Francesco; Vincini, Maurizio
abstract
To stay competitive (or sometimes simply to stay) on the market, companies and manufacturers more and more often have to join their forces to survive and possibly flourish. Among other solutions, the last decade has experienced the growth and spreading of an original business model called Virtual Enterprise. To manage a Virtual Enterprise, modern information systems have to tackle technological issues such as networking, integration and cooperation. The WINK project, born from the partnership between the University of Modena and Reggio Emilia and Gruppo Formula, addresses these problems. The ultimate goal is to design, implement and finally test on a pilot case (provided by Alenia) the WINK system, as a combination of two existing and promising software systems (the WHALES and MIKS systems), to provide the Virtual Enterprise requirement for data integration and cooperation and management planning.
2001
- Agents Supporting Information Integration: the MIKS Framework
[Articolo su rivista]
Gelati, Gionata; Guerra, Francesco; Vincini, Maurizio
abstract
During past years we have developed the MOMIS (Mediator envirOnment for Multiple Information Sources) system for the integration of data from structured and semi-structured data sources. In this paper we propose some preliminary considerations about one feasible extension of the system, intended to improve some of the functionalities by exploiting intelligent and mobile agents. The new framework is named MIKS (Mediator agent for Integration of Knowledge Sources).
2001
- Agents Supporting Information Integration: the MIKS Framework
[Relazione in Atti di Convegno]
Gelati, Gionata; Guerra, Francesco; Vincini, Maurizio
abstract
During past years we have developed the MOMIS (Mediator envirOnment for Multiple Information Sources) system for the integration of data from structured and semi-structured data sources. In this paper we propose some preliminary considerations about one feasible extension of the system, intended to improve some of the functionalities by exploiting intelligent and mobile agents. The new framework is named MIKS (Mediator agent for Integration of Knowledge Sources).
2001
- Exploiting extensional knowledge for query reformulation and object fusion in a data integration system
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
Query processing in global information systems integrating multiple heterogeneous sources is a challenging issue in relation to the effective extraction of information available on-line. In this paper we propose intelligent, tool-supported techniques for querying global information systems integrating both structured and semistructured data sources. The techniques have been developed in the environment of a data integration, wrapper/mediator based system, MOMIS, and try to achieve two main goals: optimized query reformulation w.r.t. local sources and object fusion, i.e. grouping together information (from the same or different sources) about the same real-world entity. The developed techniques rely on the availability of integration knowledge, i.e. local source schemata, a virtual mediated schema and its mapping descriptions, that is semantic mappings w.r.t. the underlying sources both at the intensional and extensional level. Mapping descriptions, obtained as a result of the semi-automatic integration process of multiple heterogeneous sources developed for the MOMIS system, include, unlike previous data integration proposals, extensional intra/interschema knowledge. Extensional knowledge is exploited to detect extensionally overlapping classes and to discover implicit join criteria among classes, which enables the goals of optimized query reformulation and object fusion to be achieved. The techniques have been implemented in the MOMIS system but can be applied, in general, to data integration systems including extensional intra/interschema knowledge in mapping descriptions.
2001
- SI-Designer: an Integration Framework for E-Commerce
[Relazione in Atti di Convegno]
I., Benetti; Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
Electronic commerce lets people purchase goods and exchange information on business transactions on-line. Therefore one of the main challenges for the designers of the e-commerce infrastructures is the information sharing, retrieving data located in different sources thus obtaining an integrated view to overcome any contradiction or redundancy. Virtual Catalogs synthesize this approach as they are conceived as instruments to dynamically retrieve information from multiple catalogs and present product data in a unified manner, without directly storing product data from catalogs. In this paper we propose SI-Designer, a support tool for the integration of data from structured and semi-structured data sources, developed within the MOMIS (Mediator environment for Multiple Information Sources) project.
2001
- Semantic Integration of Heterogeneous Information Sources
[Articolo su rivista]
Bergamaschi, Sonia; Castano, S.; Vincini, Maurizio; Beneventano, Domenico
abstract
Developing intelligent tools for the integration of information extracted from multiple heterogeneous sources is a challenging issue to effectively exploit the numerous sources available on-line in global information systems. In this paper, we propose intelligent, tool-supported techniques for information extraction and integration from both structured and semistructured data sources. An object-oriented language, with an underlying Description Logic, called ODLI3, derived from the standard ODMG, is introduced for information extraction. ODLI3 descriptions of the source schemas are exploited first to set a Common Thesaurus for the sources. Information integration is then performed in a semiautomatic way by exploiting the knowledge in the Common Thesaurus and ODLI3 descriptions of source schemas with a combination of clustering techniques and Description Logics. This integration process gives rise to a virtual integrated view of the underlying sources for which mapping rules and integrity constraints are specified to handle heterogeneity. Integration techniques described in the paper are provided in the framework of the MOMIS system based on a conventional wrapper/mediator architecture.
2001
- Supporting information integration with autonomous agents
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Cabri, Giacomo; Guerra, Francesco; Leonardi, Letizia; Vincini, Maurizio; Zambonelli, Franco
abstract
The large amount of information that is spread over the Internet is an important resource for all people but also introduces some issues that must be faced. The dynamism and the uncertainty of the Internet, along with the heterogeneity of the sources of information, are the two main challenges for today’s technologies. This paper proposes an approach based on mobile agents integrated in an information integration infrastructure. Mobile agents can significantly improve the design and the development of Internet applications thanks to their characteristics of autonomy and adaptability to open and distributed environments, such as the Internet. MOMIS (Mediator envirOnment for Multiple Information Sources) is an infrastructure for semi-automatic information integration that deals with the integration and query of multiple, heterogeneous information sources (relational, object, XML and semi-structured sources). The aim of this paper is to show the advantage of the introduction in the MOMIS infrastructure of intelligent and mobile software agents for the autonomous management and coordination of the integration and query processes over heterogeneous sources.
2001
- The MOMIS approach to information integration
[Relazione in Atti di Convegno]
Beneventano, D.; Bergamaschi, S.; Guerra, F.; Vincini, M.
abstract
The web explosion, both at internet and intranet level, has transformed the electronic information system from a single isolated node to an entry point into a worldwide network of information exchange and business transactions. Business and commerce have taken the opportunity of the new technologies to define the e-commerce activity. Therefore one of the main challenges for the designers of the e-commerce infrastructures is the information sharing, retrieving data located in different sources thus obtaining an integrated view to overcome any contradiction or redundancy. Virtual Catalogs synthesize this approach as they are conceived as instruments to dynamically retrieve information from multiple catalogs and present product data in a unified manner, without directly storing product data from catalogs. Customers, instead of having to interact with multiple heterogeneous catalogs, can interact in a uniform way with a virtual catalog. In this paper we propose a designer support tool, called SI-Designer, for information integration developed within the MOMIS project. The MOMIS project (Mediator environment for Multiple Information Sources) aims to integrate data from structured and semi-structured data sources.
2001
- The Momis approach to Information Integration
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Guerra, Francesco; Vincini, Maurizio
abstract
The web explosion, both at internet and intranet level, has transformed the electronic information system from a single isolated node to an entry point into a worldwide network of information exchange and business transactions. Business and commerce have taken the opportunity of the new technologies to define the e-commerce activity. Therefore one of the main challenges for the designers of the e-commerce infrastructures is the information sharing, retrieving data located in different sources thus obtaining an integrated view to overcome any contradiction or redundancy. Virtual Catalogs synthesize this approach as they are conceived as instruments to dynamically retrieve information from multiple catalogs and present product data in a unified manner, without directly storing product data from catalogs. Customers, instead of having to interact with multiple heterogeneous catalogs, can interact in a uniform way with a virtual catalog. In this paper we propose a designer support tool, called SI-Designer, for information integration developed within the MOMIS project. The MOMIS project (Mediator environment for Multiple Information Sources) aims to integrate data from structured and semi-structured data sources.
2001
- Towards a comprehensive methodological framework for integration
[Relazione in Atti di Convegno]
D., Calvanese; S., Castano; Guerra, Francesco; D., Lembo; M., Melchiori; G., Terracina; D., Ursino; Vincini, Maurizio
abstract
Nowadays, data can be represented and stored in different formats, ranging from unstructured data, typical of file systems, to semi-structured data, typical of Web sources, to highly structured data, typical of relational database systems. The need therefore arises to define new models and approaches for uniformly handling all these heterogeneous information sources. In this paper we propose a framework for uniformly managing information sources with different formats and structures, in order to obtain a global, integrated and uniform representation.
2000
- Creazione di una vista globale d'impresa con il sistema MOMIS basato su Description Logics
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; S., Castano; A., Corni; R., Guidetti; G., Malvezzi; M., Melchiori; Vincini, Maurizio
abstract
Developing intelligent tools for integrating information from heterogeneous sources within an enterprise is a topic of strong research interest. In this paper we propose techniques, provided by the MOMIS system and based on intelligent tools, for the extraction and integration of information from structured and semistructured sources. To describe the sources we present and use the object-oriented language ODLI3, derived from the ODMG standard. The sources described in ODLI3 are processed to create a thesaurus of the information shared among the sources. Integration of the sources is then performed semi-automatically, by processing the information describing the sources with techniques based on Description Logics and clustering techniques, generating a global schema that provides a virtual integrated view of the sources.
2000
- Creazione di una vista globale d'impresa con il sistema MOMIS basato su Description Logics
[Articolo su rivista]
Beneventano, Domenico; Bergamaschi, Sonia; A., Corni; Vincini, Maurizio
abstract
-
2000
- Information integration - the MOMIS project demonstration
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; S., Castano; Corni, Alberto; R., Guidetti; G., Malvezzi; M., Melchiori; Vincini, Maurizio
abstract
The goal of this demonstration is to present the main features of a Mediator component, the Global Schema Builder, of an I3 system called MOMIS (Mediator envirOnment for Multiple Information Sources). MOMIS has been conceived to provide integrated access to heterogeneous information stored in traditional databases (e.g., relational, object-oriented) or file systems, as well as in semistructured sources. The demonstration is based on the integration of two simple sources of different kinds, one structured and one semi-structured.
2000
- Information integration: The momis project demonstration
[Relazione in Atti di Convegno]
Beneventano, D.; Bergamaschi, S.; Castano, S.; Corni, A.; Guidetti, R.; Malvezzi, G.; Melchiori, M.; Vincini, M.
abstract
2000
- MOMIS: un sistema di Description Logics per l'integrazione del sistema informativo d'impresa
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; Corni, Alberto; Vincini, Maurizio
abstract
Taormina
1999
- Distributed Database Support for Data-Intensive Workflow Application
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; S., Castano; C., Sartori; Tiberio, Paolo; Vincini, Maurizio
abstract
Venice, Italy
1999
- Intelligent Techniques for the Extraction and Integration of Heterogeneous Information
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; S., Castano; Vincini, Maurizio; Beneventano, Domenico
abstract
Developing intelligent tools for the integration of information extracted from multiple heterogeneous sources is a challenging issue to effectively exploit the numerous sources available on-line in global information systems. In this paper, we propose intelligent, tool-supported techniques for information extraction and integration which take into account both structured and semistructured data sources. An object-oriented language called odli3, derived from the standard ODMG, with an underlying Description Logics, is introduced for information extraction. Odli3 descriptions of the information sources are exploited first to set a shared vocabulary for the sources. Information integration is performed in a semi-automatic way, by exploiting odli3 descriptions of source schemas with a combination of Description Logics and clustering techniques. The techniques described in the paper have been implemented in the MOMIS system, based on a conventional mediator architecture.
1999
- ODL-Designer UNISQL: Un'Interfaccia per la Specifica Dichiarativa di Vincoli di Integrità in OODBMS
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Beneventano, Domenico; F., Sgarbi; Vincini, Maurizio
abstract
The specification and handling of integrity constraints is a fundamental research topic in the database field; indeed, constraints are often the most burdensome part of developing real applications based on DBMSs. The main goal of the ODL-Designer UNISQL software component presented in this work is to allow the database designer to express integrity constraints in a declarative language, thus overcoming the approach of current OODBMSs, which allow them to be expressed only through procedures (methods and triggers). ODL-Designer UNISQL acquires declarative constraints and automatically generates, transparently to the designer, the "procedures" that implement them. The language supported by ODL-Designer UNISQL is the ODL-ODMG standard, suitably extended to express integrity constraints, while the commercial OODBMS used is UNISQL.
1999
- Semantic Integration of Semistructured and Structured Data Sources
[Articolo su rivista]
Bergamaschi, Sonia; S., Castano; Vincini, Maurizio
abstract
Providing integrated access to multiple heterogeneous sources is a challenging issue in global information systems for cooperation and interoperability. In this context, two fundamental problems arise. First, how to determine if the sources contain semantically related information, that is, information related to the same or similar real-world concept(s). Second, how to handle semantic heterogeneity to support integration and uniform query interfaces. Complicating factors with respect to conventional view integration techniques are that the sources to be integrated already exist and that semantic heterogeneity occurs on a large scale, involving terminology, structure, and context of the involved sources, with respect to geographical, organizational, and functional aspects related to information use. Moreover, to meet the requirements of global, Internet-based information systems, it is important that tools developed for supporting these activities are as semi-automatic and scalable as possible. The goal of this paper is to describe the MOMIS [4, 5] (Mediator envirOnment for Multiple Information Sources) approach to the integration and query of multiple, heterogeneous information sources, containing structured and semistructured data. MOMIS has been conceived as a joint collaboration between the Universities of Milano and Modena in the framework of the INTERDATA national research project, aiming at providing methods and tools for data management in Internet-based information systems. Like other integration projects [1, 10, 14], MOMIS follows a “semantic approach” to information integration based on the conceptual schema, or metadata, of the information sources, and on the following architectural elements: i) a common object-oriented data model, defined according to the ODLI3 language, to describe source schemas for integration purposes. The data model and ODLI3 have been defined in MOMIS as a subset of the ODMG-93 ones, following the proposal for a standard mediator language developed by the I3/POB working group [7]. In addition, ODLI3 introduces new constructors to support the semantic integration process [4, 5]; ii) one or more wrappers, to translate schema descriptions into the common ODLI3 representation; iii) a mediator and a query-processing component, based on two pre-existing tools, namely ARTEMIS [8] and ODB-Tools [3] (available on the Internet at http://sparc20.dsi.unimo.it/), to provide an I3 architecture for integration and query optimization. In this paper, we focus on capturing and reasoning about semantic aspects of schema descriptions of heterogeneous information sources for supporting integration and query optimization. Both semistructured and structured data sources are taken into account [5]. A Common Thesaurus is constructed, which has the role of a shared ontology for the information sources. The Common Thesaurus is built by analyzing ODLI3 descriptions of the sources, by exploiting the Description Logics OLCD (Object Language with Complements allowing Descriptive cycles) [2, 6], derived from the KL-ONE family [17]. The knowledge in the Common Thesaurus is then exploited for the identification of semantically related information in ODLI3 descriptions of different sources and for their integration at the global level. Mapping rules and integrity constraints are defined at the global level to express the relationships holding between the integrated description and the source descriptions. ODB-Tools, supporting OLCD and description logic inference techniques, allows the analysis of source descriptions for generating a consistent Common Thesaurus and provides support for semantic optimization of queries at the global level, based on defined mapping rules and integrity constraints.
1998
- An Intelligent Approach to Information Integration
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; S., Castano; S., DE CAPITANI DE VIMERCATI; S., Montanari; Vincini, Maurizio
abstract
FORMAL ONTOLOGY IN INFORMATION SYSTEMS Book Series: FRONTIERS IN ARTIFICIAL INTELLIGENCE AND APPLICATIONS Volume: 46 Pages: 253-268
1998
- Exploiting Schema Knowledge for the Integration of Heterogeneous Sources
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; S., DE CAPITANI DE VIMERCATI; S., Montanari; Vincini, Maurizio
abstract
Ancona, Italy
1997
- A semantics-driven query optimizer for OODBs
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; Vincini, Maurizio; Beneventano, Domenico
abstract
ODB-QOptimizer is an ODMG 93 compliant tool for schema validation and semantic query optimization. The approach is based on two fundamental ingredients. The first is the OCDL description logics (DLs), proposed as a common formalism to express class descriptions, a relevant set of integrity constraint rules (IC rules) and queries. The second is DLs inference techniques, exploited to evaluate the logical implications expressed by IC rules and thus to produce the semantic expansion of a given query.
1997
- ODB-QOptimizer: a tool for semantic query optimization in OODB
[Software]
Beneventano, Domenico; Bergamaschi, Sonia; Sartori, Claudio; Vincini, Maurizio
abstract
ODB-QOPTIMIZER is an ODMG 93 compliant tool for schema validation and semantic query optimization. The approach is based on two fundamental ingredients. The first is the OCDL description logics (DLs), proposed as a common formalism to express class descriptions, a relevant set of integrity constraint rules (IC rules) and queries. The second is DLs inference techniques, exploited to evaluate the logical implications expressed by IC rules and thus to produce the semantic expansion of a given query.
1997
- ODB-QOptimizer: a tool for semantic query optimization in OODB
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; C., Sartori; Vincini, Maurizio
abstract
Birmingham, UK
1997
- ODB-Tools: a description logics based tool for schema validation and semantic query optimization in Object Oriented Databases
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; C., Sartori; Vincini, Maurizio
abstract
LNAI 1321. Roma
1996
- ODB-Reasoner: un ambiente per la verifica di schemi e l’ottimizzazione di interrogazioni in OODB
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; A., Garuti; Vincini, Maurizio; C., Sartori
abstract
S.Miniato. Atti a cura di Fausto Rabitti et al.
1995
- A semantics-driven query optimizer for OODBs
[Relazione in Atti di Convegno]
Beneventano, Domenico; Bergamaschi, Sonia; C., Sartori; J. P., Ballerini; Vincini, Maurizio
abstract
Semantic query optimization uses problem-specific knowledge (e.g. integrity constraints) for transforming a query into an equivalent one (i.e., with the same answer set) that may be answered more efficiently. The optimizer is applicable to the class of conjunctive queries and is based on two fundamental ingredients. The first is the ODL description logics, proposed as a common formalism to express class descriptions, a relevant set of integrity constraint rules (IC rules), and queries as ODL types. The second is DLs (Description Logics) inference techniques, exploited to evaluate the logical implications expressed by IC rules and thus to produce the semantic expansion of a given query. The optimizer tentatively applies all the possible transformations and delays the choice of beneficial transformations till the end. Some preliminary ideas on filtering activities on the semantically expanded query are reported. A prototype semantic query optimizer (ODB-QOptimizer) for object-oriented database systems (OODBs) is described.
1995
- DL techniques for intensional query answering in OODBs
[Relazione in Atti di Convegno]
Bergamaschi, Sonia; C., Sartori; Vincini, Maurizio
abstract
Int. Workshop on Reasoning about Structured Objects: Knowledge Representation meets Databases
1995
- ODBQOptimizer: un ottimizzatore semantico per interrogazioni in OODB
[Relazione in Atti di Convegno]
J. P., Ballerini; Beneventano, Domenico; Bergamaschi, Sonia; Vincini, Maurizio
abstract
Atti a cura di Antonio Albano et al.