I. Some Examples/Applications in Databases (and Data Mining)

A. Some simple granulation

1. Simplest granulation is a partition

A partition P of a classical set U is a collection of subsets that are mutually disjoint and their union is U. Each subset is called an equivalence class.

Observe that a partition defines and is defined by an equivalence relation R.

2. Binary granulation: Let B be a binary relation on U. For each p Î U, we associate a subset B(p) ={u |  (p, u) Î B }. The subset B(p) is called a granule; and

the collection of B(p) for all p is called a granulation or a granular structure. .

 

B.Relational Table (relation instance) has a very natural granular structure;

 

2.1 (Pawlak Theorem). n-column relational table  « (U, E1, E2, . . .),  where each E1, E2, . . . are equivalence relations induced by attributes A1, A2, . . .. 

 

      2.2 Example:  Let U={h1, h2, h3, h4, h5, h6} be a set of 6 persons, where each hi represents a person.

              Let us consider one column table (exhibit in horizontally)

 

U (set of entities)

h1

 h2

h3

h4

H5

h6.

Hair color

Black

Black

Brown

Red

Red

Brown

         

             Then this attribute defines a partition on U:

Equivalence class #1 ={h1,  h2}, 

Equivalence class #2 ={h3,  h6}

                      Equivalence class #3 ={h4,  h5}, 

 

One column table (single attribute relation) « (U, R)

            Each equivalence class is a subset of U. If the set U is ordered, then subsets

        can be represented by bitmaps

Equivalence class #1 =(110000)

Equivalence class #2 =(001001)

Equivalence class #3 =(000110), 

       This is the same as the bitmap representations in relational table (see, Garcia-Monolina and etc (Stanford U) database book, pp 702.)

 

            Applications Tsau Young Lin DBLP #47, #32 and #69; 

Result 1: Given a relation instance (relational table), there are at most finite 

                number (non-isomorphic) attributes(features) can be constructed.

Result 2: Association rules can be computed by GrC(bitmaps) very quickly

Result 3:  All generalized association rules can be found by solving linear inequalities

 

3.        The set of attribute values (called active domain in David Meyer’s Book)  in a Column of a Relational Table (relation instance) may have additional granular structure to represent some semantic or additional knowledge about active domain;

   

3.1.  David Hsiao (well known for database machine) had imposed an equivalence relation on each active attribute domain: An equivalence class consists of semantically near data and they should be stored in physical vicinity

 

Such a data model is called Attribute Based data Model,

Eugene Wong  DBLP #2: Eugene Wong, T. C. Chiang: Canonical Structure in Attribute Based File Organization. Commun. ACM 14(9): 593-597 (1971)

 

         3.2 Seymour Ginsburg and Richard Hull had  imposed order

                relation on   each attribute domain (for example,

                numerical attribute domain has the ordered binary

                 relation, £,  of numbers)

Seymour Ginsburg #70 and #76: 

1. Seymour Ginsburg, Richard Hull: Ordered Attribute Domains in the Relational Model. XP2 Workshop on Relational Database Theory 1981

2. Seymour Ginsburg, Richard Hull: Order Dependency in the Relational Model. Theor. Comput. Sci. 26: 149-195 (1983)

 

      3.2.1.  A Binary Relation B is a simpler kind of granulation

B(p)={q | q Î U such that  (p, q) Î B}

            B(p) is a granule.

 

C. Relational Tables with a binary relation on attribute domains that carries some semantic or additional knowledge

 

(Extended Pawlak Theorem)  Such a relation  « (U, B1, B2,     ),  where B1, B2, are binary relations that carries some semantic or additional knowledge.

 

         3.3. W. Chu and T. Y. Lin had imposed neighborhood system on attribute Domain

4.1. W. Chu DBLP #31 (1992) Wesley W. Chu, Qiming Chen: Neighborhood and Associative Query Answering. J. Intell. Inf. Syst. 1(3/4): 355-382 (1992)

4.2. T. Y. Lin DBLP #1 (1988) T. Y. Lin: T. Y. Lin: Neighborhood systems and relational databases. ACM Conference on Computer Science 1988: 725

 

(Lin’s website in All Publication list #24 #25: T. Y. Lin: “Neighborhood Systems and Approximation in Database and Knowledge Base Systems”, Proceedings of the Fourth International Symposium on Methodologies of Intelligent Systems, Poster Session, October 12-15, 1989, pp. 75-86.

 

D. Relational Tables with a neighborhood system (a pre-topological space) on each attribute domain that carries some semantic or additional knowledge

 

(Extended Pawlak Theorem)  Such a relation  « (U, N1, N2,     ),  where N1, N2, are neighborhood systems(pre-topological spaces) that carries some semantic or additional knowledge

 

E Relational Tables with a n-ary relation on each attribute domain that carries some semantic or additional knowledge

 

(Extended Pawlak Theorem)  Such a relation  « (U, G1, G2,     ),  where G1, G2, are general  n-ary relations (different n for different G) that carries some semantic or additional knowledge

This is unexplored area