Statistical Relational Databases: Normal Forms
March 1991 (vol. 3 no. 1)
pp. 55-64

Problems associated with defining normal forms of relational tables relevant to statistical processing are discussed. The concepts of derived identifier, class identifier, derived class-counts, count domains, compact domains, and uniform domains for statistical relational tables are introduced. The structures of the first and second statistical normal forms, and the relational decompositions needed to achieve them, are also discussed. It is shown that the statistical normal form can be an important tool for determining whether the usual statistical analysis techniques are valid. Some suggestions are presented for extending structured query language (SQL) statements to perform these operations on statistical relational tables. Some results linking Codd's normal forms with statistical normal forms are discussed. Relational statistical abnormalities, called outlyers, are also discussed.

Index Terms:
statistical relational databases; normal forms; relational tables; derived identifier; class identifier; derived class-counts; count domains; compact domains; uniform domains; relational decompositions; statistical analysis; structured query language; SQL; statistical abnormalities; outlyers; query languages; relational databases

Citation:
S.P. Ghosh, "Statistical Relational Databases: Normal Forms," IEEE Transactions on Knowledge and Data Engineering, vol. 3, no. 1, pp. 55-64, March 1991, doi:10.1109/69.75889
