In this paper, we measure and analyze the graph features of Semantic Web (SW) schemas, with a focus on power-law degree distributions. Our main finding is that the majority of SW schemas with a significant number of properties (resp. classes) approximate a power law in their total-degree (resp. number of subsumed classes) distribution. Moreover, our analysis revealed some emerging conceptual modeling practices of SW schema developers, namely: a) each schema has a few focal classes that have been analyzed in detail (i.e., they have numerous properties and subclasses) and that are further connected with focal classes defined in other schemas, b) the class subsumption hierarchies are mostly unbalanced (i.e., some branches are deep and heavy, while others are shallow and light), c) most properties have as domain/range classes that are located high in the class subsumption hierarchies, and d) the number of recursive/multiple properties is significant. Knowledge of these features is essential for guiding synthetic SW schema generation, which is an important step towards benchmarking SW repositories and query language implementations.
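As an illustration of what measuring a total-degree distribution can look like, the following minimal Python sketch counts how often each class appears as the domain or range of a property and then estimates an exponent from a least-squares fit on the log-log rank plot. The function names, the representation of a schema as (domain, range) pairs, and the choice of a rank-based fit are all assumptions made for this example; the abstract does not state which fitting procedure the paper uses.

```python
import math
from collections import Counter


def total_degrees(properties):
    """Count, per class, its appearances as a property's domain or range.

    `properties` is a hypothetical list of (domain, range) class-name pairs.
    A recursive property (domain == range) contributes 2 to that class.
    """
    degree = Counter()
    for domain, range_ in properties:
        degree[domain] += 1
        degree[range_] += 1
    return degree


def fit_rank_exponent(degrees):
    """Return the least-squares slope of log(degree) against log(rank).

    Degrees are sorted in descending order and ranked from 1. A slope near
    a negative constant suggests, but does not prove, power-law behavior.
    """
    values = sorted((d for d in degrees if d > 0), reverse=True)
    if len(values) < 2:
        raise ValueError("need at least two positive degrees")
    xs = [math.log(rank) for rank in range(1, len(values) + 1)]
    ys = [math.log(v) for v in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

For a schema, one would call `fit_rank_exponent(total_degrees(properties).values())`. A straight-line fit on a log-log plot is only a rough diagnostic, so the resulting slope is best read as descriptive rather than as a statistical test.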
Semantic Web, power-laws, conceptual schemas morphology
Vassilis Christophides, Yannis Theoharis, Yannis Tzitzikas, Dimitris Kotzinos, "On Graph Features of Semantic Web Schemas", IEEE Transactions on Knowledge & Data Engineering, vol. 20, no. , pp. 692-702, May 2008, doi:10.1109/TKDE.2007.190735
