General Value Conversion and Aggregation Operations: Definition and Integration with Relational, Entity-Based and Deductive Data Retrieval Techniques.

Kalervo Järvelin# and Timo Niemi+

#Department of Information Studies
+Department of Computer Science
University of Tampere
P.O. Box 607
FIN-33101 TAMPERE, Finland

University of Tampere, Department of Computer Science, Report A-1992-3, July 1992. 206 p.


Abstract

Existing DBMSs do not sufficiently support advanced information retrieval in heterogeneous fact database environments. From the user's viewpoint, this means that many information needs cannot be satisfied by traditional fact database operations alone. Transitive computation, multi-level aggregation and value conversion are frequently needed together with traditional operations. This paper considers how these capabilities can be provided in fact database management systems based on the relational data model.

It is important that the extensions for advanced data retrieval are made in a way that is uniform with other relational processing. This means that, on the one hand, new relational operations are developed and, on the other hand, non-relational operations are integrated with relational processing via predicates expressed within relational operations. In this paper, relational algebra is extended by two generalized relational operations: one for multi-level aggregation and the other for value conversion. A set of non-relational operations (called deductive operations) for performing transitive computation is also introduced. Query formulation can further be facilitated by providing the user with an entity-based data retrieval operation at a high level of abstraction; such an operation is also introduced.
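As a rough illustration of how a transitive (deductive) computation can feed a predicate inside a relational-style selection, consider the following minimal Python sketch. It is not the report's formal definition or its Prolog prototype. The function name `transitive_closure`, the `part_of` relation and the sample rows are all hypothetical and chosen only for this example.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]


def transitive_closure(relation: Set[Pair]) -> Set[Pair]:
    """Return the transitive closure of a binary relation given as a set of pairs."""
    closure = set(relation)
    while True:
        # Join the closure with itself: (a, b) and (b, d) yield (a, d).
        derived = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if derived <= closure:
            # Fixpoint reached: no new pairs can be derived.
            return closure
        closure |= derived


# Hypothetical binary relation: a bolt is part of a wheel, a wheel is part of a car.
part_of: Set[Pair] = {("bolt", "wheel"), ("wheel", "car")}
closure = transitive_closure(part_of)

# Hypothetical relation instance, represented as a list of rows.
parts: List[Dict[str, object]] = [
    {"part": "bolt", "mass_kg": 0.05},
    {"part": "seat", "mass_kg": 12.0},
]

# Selection whose predicate consults the deductive result.
in_car = [row for row in parts if (row["part"], "car") in closure]
```

Here the closure is computed outside the relational operation, and the selection only asks whether a pair belongs to it, which is one simple reading of integration "via predicates".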

The value conversion operation provides transparency with respect to units of measurement. It supports very versatile conversion (including conversion of compound attributes) and automatically checks the derivability of conversion requests. The conversion expressions require minimal information from the user. The data aggregation operation provides aggregation level transparency. It supports, among other things, multiple layered aggregation levels and hierarchical reclassification of the classification attributes that determine the aggregation levels. In the data aggregation operation, the functional dependencies between the source and result relations are connected in a complex way. Both operations are defined exactly in this paper, so that they construct both the instances and the schemas, including functional dependencies, of the result relations.
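The two ideas in this paragraph can be sketched in a few lines of Python, under strong simplifying assumptions. The sketch treats a conversion request as derivable when a chain of known multiplicative factors connects the source and target units, and treats multi-level aggregation as repeated roll-up through a classification hierarchy. The names `FACTORS`, `conversion_factor`, `convert` and `roll_up`, and all the data, are hypothetical. The sketch ignores compound attributes, schemas and functional dependencies, all of which the report's definitions cover.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

# Hypothetical table of known conversions: value_in_b = value_in_a * factor.
FACTORS: Dict[Tuple[str, str], float] = {
    ("km", "m"): 1000.0,
    ("m", "cm"): 100.0,
    ("kg", "g"): 1000.0,
}


def conversion_factor(src: str, dst: str) -> Optional[float]:
    """Return the factor from src to dst, or None if the conversion is not derivable."""
    edges: Dict[str, List[Tuple[str, float]]] = {}
    for (a, b), f in FACTORS.items():
        edges.setdefault(a, []).append((b, f))
        edges.setdefault(b, []).append((a, 1.0 / f))  # inverse direction
    queue = deque([(src, 1.0)])
    seen = {src}
    while queue:  # breadth-first search over the unit graph
        unit, factor = queue.popleft()
        if unit == dst:
            return factor
        for nxt, f in edges.get(unit, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, factor * f))
    return None


def convert(rows: List[dict], attribute: str, src: str, dst: str) -> List[dict]:
    """Convert one attribute of every row, rejecting non-derivable requests."""
    factor = conversion_factor(src, dst)
    if factor is None:
        raise ValueError(f"cannot derive a conversion from {src} to {dst}")
    return [{**row, attribute: row[attribute] * factor} for row in rows]


def roll_up(totals: Dict[str, float], parent_of: Dict[str, str]) -> Dict[str, float]:
    """Sum values one level up a classification hierarchy."""
    result: Dict[str, float] = {}
    for key, value in totals.items():
        parent = parent_of[key]
        result[parent] = result.get(parent, 0) + value
    return result


# Hypothetical two-level hierarchy: city -> region -> country.
sales = {"Tampere": 10, "Nokia": 5, "Helsinki": 20}
city_to_region = {"Tampere": "Pirkanmaa", "Nokia": "Pirkanmaa", "Helsinki": "Uusimaa"}
region_to_country = {"Pirkanmaa": "Finland", "Uusimaa": "Finland"}

by_region = roll_up(sales, city_to_region)
by_country = roll_up(by_region, region_to_country)
```

In this sketch the user names only the source and target units, or the target classification level. The chain of factors and the intermediate levels are found from stored metadata, which is the kind of transparency the paragraph describes.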

Complex data retrieval requires that value conversion, aggregation, deductive operations and the entity-based data retrieval operation are integrated with traditional relational operations. In this paper, a query language for advanced information retrieval consisting of these operations is developed. The language allows these operations to be intermixed without limits on nesting depth. Special attention is paid to the structures, primitives and principles in terms of which the operations and the query language can be implemented. All these aspects are defined formally, in a functional way; in other words, the definitions are independent of any programming language. In addition, the concretization of these aspects in a prototype system based on the Prolog language and a workstation environment is considered.

Keywords: relational databases, entity types, heterogeneous databases, data inconsistency, value conversion, data aggregation, deduction.

