I. Some Examples/Applications in
Databases (and Data Mining)
A. Some simple granulation
1. Simplest granulation is a partition
A partition P of a classical set U is a collection of subsets that are mutually disjoint and their union is U. Each subset is called an equivalence class.
Observe that a partition defines and is defined by an equivalence relation R.
2. Binary granulation: Let B be a binary relation on U. For each p Î U, we associate a subset B(p) ={u | (p, u) Î B }. The subset B(p) is called a granule; and
the collection of B(p) for all p is called a granulation or a granular structure. .
B.Relational Table (relation instance) has a very natural granular
structure;
2.1 (Pawlak Theorem). n-column relational table « (U, E1, E2, . . .), where each E1, E2, . . . are equivalence relations induced by attributes A1, A2, . . ..
2.2 Example: Let U={h1, h2, h3, h4, h5, h6} be a set of 6 persons, where each hi represents a person.
Let us consider one column table (exhibit in horizontally)
|
U (set of entities) |
h1 |
h2 |
h3 |
h4 |
H5 |
h6. |
|
Hair color |
Black |
Black |
Brown |
Red |
Red |
Brown |
Then this attribute defines a partition on U:
Equivalence class #1 ={h1, h2},
Equivalence class #2 ={h3, h6}
Equivalence class #3 ={h4, h5},
One column table (single attribute relation) « (U, R)
Each equivalence class is a subset of U. If the set U is ordered, then subsets
can be represented by bitmaps
Equivalence class #1 =(110000)
Equivalence class #2 =(001001)
Equivalence class #3 =(000110),
This is the same as the bitmap representations in relational table (see, Garcia-Monolina and etc (Stanford U) database book, pp 702.)
Applications Tsau Young Lin DBLP #47, #32 and #69;
Result 1: Given a relation instance (relational table), there are at most finite
number (non-isomorphic) attributes(features) can be constructed.
Result 2: Association rules can be computed by GrC(bitmaps) very quickly
Result 3: All generalized association rules can be found by solving linear inequalities
3. The set of attribute values (called active domain in David Meyer’s Book) in a Column of a Relational Table (relation instance) may have additional granular structure to represent some semantic or additional knowledge about active domain;
3.1. David Hsiao (well known for database machine) had imposed an equivalence relation on each active attribute domain: An equivalence class consists of semantically near data and they should be stored in physical vicinity
Such a data model is called Attribute Based data Model,
Eugene Wong DBLP #2: Eugene Wong, T. C. Chiang: Canonical Structure in Attribute Based File Organization. Commun. ACM 14(9): 593-597 (1971)
3.2 Seymour Ginsburg and Richard Hull had imposed order
relation on each attribute domain (for example,
numerical attribute domain has the ordered binary
relation, £, of numbers)
1.
2.
3.2.1. A Binary Relation B is a simpler kind of granulation
B(p)={q | q Î U such that (p, q) Î B}
B(p) is a granule.
C. Relational Tables with a binary
relation on attribute domains that carries some semantic
or additional knowledge
(Extended Pawlak Theorem) Such a relation « (U, B1, B2, ), where B1, B2, are binary relations that carries some semantic or additional knowledge.
3.3. W. Chu and T. Y. Lin had imposed neighborhood system on attribute Domain
4.1. W. Chu DBLP #31 (1992) Wesley W. Chu, Qiming Chen: Neighborhood and Associative Query Answering. J. Intell. Inf. Syst. 1(3/4): 355-382 (1992)
4.2. T. Y. Lin DBLP #1 (1988) T.
Y. Lin: T. Y. Lin: Neighborhood systems and relational databases. ACM Conference
on Computer Science 1988: 725
(Lin’s website in All Publication list #24 #25: T. Y. Lin: “Neighborhood Systems and Approximation in Database and Knowledge Base Systems”, Proceedings of the Fourth International Symposium on Methodologies of Intelligent Systems, Poster Session, October 12-15, 1989, pp. 75-86.
D. Relational Tables with a
neighborhood system (a pre-topological space) on each attribute domain that
carries some semantic or additional knowledge
(Extended Pawlak Theorem) Such a relation « (U, N1, N2, ), where N1, N2, are neighborhood systems(pre-topological spaces) that carries some semantic or additional knowledge
E Relational Tables with a n-ary relation on each attribute
domain that carries some semantic or additional knowledge
(Extended Pawlak Theorem) Such a relation « (U, G1, G2, ), where G1, G2, are general n-ary relations (different n for different G) that carries some semantic or additional knowledge
This is unexplored area