Issue No.05 - September/October (2004 vol.19)
Published by the IEEE Computer Society
Nigel Shadbolt , University of Southampton
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/MIS.2004.51
Social networks are affecting the development of computer science and might be the key to the Semantic Web's success.
Networks are everywhere—they operate between genes and proteins, neurons and liver cells, and people and computers. Networks dominate our lives. Whether it's the transportation network that takes us from points A to B or the communication network that routes this message from my computer to the IEEE Intelligent Systems staff in Los Alamitos, networks are fundamental. Perhaps the most pervasive of all are among the most intangible—social networks. Humans can't seem to help but network—it's the natural condition of Homo sapiens. Social networks are perhaps some of the most important structures in our lives.
They're also receiving a lot of attention from social scientists, mathematicians, and a whole range of researchers interested in trying to understand the relationships that exist in the world around us. The psychologist Stanley Milgram was one of the first to demonstrate an intriguing property of these networks—the "small world phenomenon" ( Psychology Today, May 1967, pp. 60–67). In one experiment he arranged for random people in Kansas to send a letter to random individuals in Massachusetts via a sequence of first-name acquaintances. Most of the letters were lost, but some reached the target, on average passing through the hands of only six or seven people.
This small-world phenomenon sparked interest among mathematicians, who have always been intrigued by the character and nature of graphs. With a graph we can capture a potential planet full of relations. If the vertices represent people, we can model the fact that two people know each other on first-name terms by an edge. In terms of this relation, the six billion or so people on the planet form a fairly sparse graph, because any one person knows only a tiny fraction of everyone else. But people tend to have friends who know each other. This means that social networks exhibit clustering: regions where edges occur with high density, forming tightly connected subgraphs. Following Milgram's observations, a model of social networks must also keep the average path length between vertices short.
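The two properties just described, clustering and short average path length, can be measured directly on a graph. The following is a minimal sketch, assuming a made-up five-person acquaintance graph (the names and edges are hypothetical) stored as a dictionary of neighbor sets. It illustrates the definitions; it is not a tool for planet-sized networks.

```python
from collections import deque
from itertools import combinations

# Hypothetical toy graph: each person maps to the set of people they
# know on first-name terms. Edges are undirected, so every link is
# listed in both directions.
knows = {
    "ann": {"bob", "cat", "dan"},
    "bob": {"ann", "cat"},
    "cat": {"ann", "bob"},
    "dan": {"ann", "eve"},
    "eve": {"dan"},
}

def local_clustering(graph, person):
    """Fraction of a person's pairs of acquaintances who know each other."""
    friends = graph[person]
    if len(friends) < 2:
        return 0.0
    friend_pairs = list(combinations(friends, 2))
    linked = sum(1 for a, b in friend_pairs if b in graph[a])
    return linked / len(friend_pairs)

def average_clustering(graph):
    """Mean of the local clustering values over all people."""
    return sum(local_clustering(graph, p) for p in graph) / len(graph)

def distances_from(graph, source):
    """Breadth-first search: hop count from source to each reachable person."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return dist

def average_path_length(graph):
    """Mean hop count over all ordered pairs of distinct, connected people."""
    total = 0
    count = 0
    for source in graph:
        for target, hops in distances_from(graph, source).items():
            if target != source:
                total += hops
                count += 1
    return total / count

print(average_clustering(knows))
print(average_path_length(knows))
```

In this toy graph the triangle ann, bob, cat supplies the clustering, while the chain through dan to eve stretches the path length.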
In the late '90s, work by Duncan Watts and others (popularized in Watts's book Small Worlds: The Dynamics of Networks between Order and Randomness [Princeton Univ. Press, 1999]) showed how to set up graphs that combine short path lengths with a high degree of clustering, so-called small-world graphs. Researchers have used these to model not just human social relationships but also the structure of energy distribution networks and invertebrate neural networks.
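The text doesn't spell out how such graphs are built. One well-known construction is the rewiring procedure of Watts and Strogatz: start from a highly clustered ring lattice, then move a small fraction of edges to random endpoints to create shortcuts. The sketch below assumes that construction. The function names, parameters, and seed handling are illustrative choices, not taken from the book.

```python
import random

def ring_lattice(n, k):
    """n vertices on a ring, each joined to its k nearest neighbors (k even)."""
    graph = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            graph[v].add(w)
            graph[w].add(v)
    return graph

def small_world(n, k, p, seed=0):
    """Ring lattice in which each edge is rewired with probability p."""
    rng = random.Random(seed)
    graph = ring_lattice(n, k)
    # Snapshot the lattice edges once, so edges created by rewiring
    # are never themselves rewired.
    edges = [(v, w) for v in graph for w in sorted(graph[v]) if v < w]
    for v, w in edges:
        if rng.random() < p:
            # New endpoint must avoid self-loops and duplicate edges.
            candidates = [u for u in graph if u != v and u not in graph[v]]
            if candidates:
                u = rng.choice(candidates)
                graph[v].remove(w)
                graph[w].remove(v)
                graph[v].add(u)
                graph[u].add(v)
    return graph

lattice = small_world(20, 4, p=0.0)           # pure lattice: clustered, long paths
rewired = small_world(20, 4, p=0.1, seed=42)  # a few random shortcuts added
```

Rewiring moves edges rather than adding them, so the graph stays as sparse as the lattice it started from. With a small p, most of the local clustering survives while the shortcuts shorten typical paths.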
WHAT'S YOUR ERDÖS NUMBER?
Curiously, mathematicians have become interested in networks not just in themselves but for themselves. The Erdös project ( www.oakland.edu/enp), begun in the '90s by Jerry Grossman and others, looked at the collaborative networks that you could discover through coauthorship on research papers. They took as their start the famous Hungarian mathematician Paul Erdös. He published an extraordinary 1,000+ papers with a wide variety of colleagues across the world. Within this collaboration network, they began by assigning Paul Erdös an Erdös number of 0. Erdös's coauthors have Erdös number 1; people other than Erdös who have coauthored a paper with someone with Erdös number 1 but not with Erdös have Erdös number 2, and so on. If you have no chain of coauthorships connecting you to Erdös, you have an infinite Erdös number.
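The definition above amounts to a breadth-first search over the coauthorship graph, starting from Erdös. Here is a minimal sketch, assuming a tiny invented list of papers (the single-letter authors are hypothetical). It is not the Erdös project's own software or data format.

```python
from collections import deque
from itertools import combinations

ROOT = "Erdös"

# Hypothetical papers, each given as its list of authors.
papers = [
    [ROOT, "A"],
    ["A", "B", "C"],
    ["C", "D"],
    ["E", "F"],  # no chain of coauthorships leads back to the root
]

def coauthor_graph(papers):
    """Link every pair of authors who share at least one paper."""
    graph = {}
    for authors in papers:
        for author in authors:
            graph.setdefault(author, set())
        for a, b in combinations(authors, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph

def erdos_numbers(papers, root=ROOT):
    """Shortest coauthorship distance from root; infinite if unconnected."""
    graph = coauthor_graph(papers)
    numbers = {author: float("inf") for author in graph}
    numbers[root] = 0
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for coauthor in graph[current]:
            if numbers[coauthor] == float("inf"):
                numbers[coauthor] = numbers[current] + 1
                queue.append(coauthor)
    return numbers

numbers = erdos_numbers(papers)
print(numbers)
```

Because breadth-first search reaches authors in order of distance, the first number assigned to each author is already the smallest possible one.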
The project maintains comprehensive data sets capturing these coauthorship details. Within this data you can quickly discern interesting patterns. One striking feature is that Paul Erdös's collaborators tend to be prolific collaborators themselves and do much joint research with each other as well. The quality of the Erdös small world is especially noteworthy. Up to 1998, all the winners of the Fields Medal, one of the top mathematics prizes, had Erdös numbers lower than 6.
Milgram's early work also identified a funneling effect: a very small number of individuals with significantly higher-than-average connectivity did most of the forwarding. These people with large networks of contacts and friends acted as hubs. They were at the center of connections between the vast majority of otherwise weakly connected individuals. Networks in which a few nodes have very many connections while the great majority have only a few are known as scale free. A scale-free network is also a small world, since its hubs provide shortcuts between otherwise distant nodes.
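Hubs of this kind show up when you count each node's connections. The sketch below uses an invented edge list and flags nodes whose degree is at least twice the mean. That threshold is an arbitrary assumption for illustration. Scale-free structure is usually characterized more formally by a power-law degree distribution, which this sketch doesn't test.

```python
from collections import Counter

# Hypothetical undirected edges; "hub" is connected to most of the others.
edges = [
    ("hub", "a"), ("hub", "b"), ("hub", "c"),
    ("hub", "d"), ("hub", "e"), ("hub", "f"),
    ("a", "b"), ("e", "g"),
]

def degree_counts(edges):
    """Number of connections per node."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree

def hubs(edges, factor=2.0):
    """Nodes whose degree is at least factor times the mean degree."""
    degree = degree_counts(edges)
    mean = sum(degree.values()) / len(degree)
    return sorted(node for node, d in degree.items() if d >= factor * mean)

print(degree_counts(edges).most_common(3))
print(hubs(edges))
```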
NETWORKS MEET AI
The relevance of this to any discipline and ours in particular is that it offers an opportunity to look for patterns in our own scientific collaborations. We need ways to characterize what work is being carried out in an area, the impact it's having, who's being influenced by it, who are the sources of expertise on a particular topic, where the hubs are, and so on. These broad questions of scientific-knowledge management and of the structure and nature of scientific domains are also amenable to our methods and technologies.
A prime example is citation analysis. We all cite others' previous work. These citations form networks in which articles are the vertices and a directed edge from A to B indicates that A cites B. Not only have AI and IS researchers built influential citation resources (such as CiteSeer, http://citeseer.ist.psu.edu), but our colleagues are busy providing the means to trawl these citations and the articles themselves for evidence of emerging new areas of research, areas in relative decline, structural relations between topics, and so on. Work is also underway to intelligibly visualize the results of dynamic trends in content popularity and impact. An excellent set of articles, many of which use the analysis of collaboration networks or citation graphs to understand the content and character of scientific disciplines, appears in Richard Shiffrin and Katy Börner's collection Mapping Knowledge Domains ( Proc. Nat'l Academy of Sciences, 6 Apr. 2004, vol. 101, Suppl. 1).
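The citation network described here is a directed graph, and the simplest measure of an article's impact is its in-degree: the number of edges pointing at it. Here is a minimal sketch, assuming a handful of invented articles labeled A to D. It doesn't reflect how CiteSeer or any other real citation index stores or ranks its data.

```python
from collections import defaultdict

# Hypothetical directed edges: (citing article, cited article).
citations = [
    ("A", "B"),
    ("C", "B"),
    ("C", "A"),
    ("D", "B"),
    ("D", "C"),
]

def citation_counts(citations):
    """In-degree of each article: how many other articles cite it."""
    counts = defaultdict(int)
    for citing, cited in citations:
        counts[cited] += 1
        counts[citing] += 0  # make sure uncited articles appear with a zero
    return dict(counts)

def ranked_by_citations(citations):
    """Articles ordered from most to least cited, ties broken by label."""
    counts = citation_counts(citations)
    return sorted(counts, key=lambda article: (-counts[article], article))

counts = citation_counts(citations)
ranking = ranked_by_citations(citations)
print(counts)
print(ranking)
```

Tracking how these counts change from year to year is one simple way to look for the emerging and declining areas mentioned above.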
The impact of social networks on our discipline has another interesting angle. Social networking might turn out to be one of the major ways in which the Semantic Web eventually becomes a reality rather than an aspiration. In my own group, we recently built semantic indexing software, part of which crawled the Web and harvested structured metadata. A striking feature of our inventory of such metadata was that a substantial amount of it was FOAF, or Friend of a Friend (see www.foaf-project.org). FOAF is essentially a vocabulary for expressing personal information and relationships in a machine-readable form. Moreover, FOAF documents are relatively easy for machines to process, merge, and aggregate. FOAF's vocabulary lets you assert things such as "My name is …," "I work for …," "I'm interested in …," "I live near …," "I'm pictured in these photos …," and "I write in this weblog …".
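To give a feel for why FOAF data merges so readily, here is a minimal sketch. Real FOAF documents are RDF and would normally be read with an RDF parser. The sketch instead assumes two invented documents already flattened into (subject, property, value) triples. The property names (foaf:name, foaf:mbox, foaf:knows, foaf:weblog, foaf:interest) are FOAF vocabulary terms; the people, addresses, and the merge function are hypothetical. It relies on the FOAF convention that a mailbox identifies exactly one person.

```python
# Two hypothetical FOAF-style documents as (subject, property, value) triples.
# Local identifiers such as "_:me" mean something only inside one document.
doc_a = [
    ("_:me", "foaf:name", "Alice Example"),
    ("_:me", "foaf:mbox", "mailto:alice@example.org"),
    ("_:me", "foaf:weblog", "http://example.org/alice/blog"),
    ("_:me", "foaf:knows", "_:b"),
    ("_:b", "foaf:mbox", "mailto:bob@example.org"),
]
doc_b = [
    ("_:x", "foaf:mbox", "mailto:bob@example.org"),
    ("_:x", "foaf:name", "Bob Example"),
    ("_:x", "foaf:interest", "http://example.org/topics/semantic-web"),
]

def merge(documents):
    """Aggregate statements about each person, keyed by their mailbox."""
    people = {}
    for doc in documents:
        # Map this document's local identifiers to mailboxes.
        mbox = {s: o for s, p, o in doc if p == "foaf:mbox"}
        for s, p, o in doc:
            if s not in mbox:
                continue  # no mailbox, so no way to identify the subject
            # Values that are local identifiers (as in foaf:knows)
            # are replaced by the mailbox they stand for.
            value = mbox.get(o, o)
            people.setdefault(mbox[s], {}).setdefault(p, set()).add(value)
    return people

merged = merge([doc_a, doc_b])
for person, facts in merged.items():
    print(person, facts)
```

After merging, Alice's statement that she knows someone with Bob's mailbox is joined to Bob's own description of himself, even though the two documents never refer to each other.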
Scientists and engineers aren't yet providing the amount of annotated content that's likely to produce the runaway-network effect and bootstrap the Semantic Web. The original Web was kick-started partly by people self-expressing. People built their home pages, search engines let them connect with like-minded souls, and before you knew it communities were starting to condense out of the Web. Perhaps success in the Semantic Web will be about recognizing that people always prove more interesting to each other than technology.
Of course, the move to a computational infrastructure that supports the expression of social interactions and networks raises issues, many of which are ethical and legal. How can we ensure that information of a social nature isn't abused when placed in cyberspace? How do you know the provenance of any of the information appearing? What are the privacy issues about my friends declaring facts about me? What obligations are the aggregators of FOAF information under?
Our social networks—our relationships—evolve and change over time. They most certainly go a long way toward defining us whether at work or play, in research or business. Our technologies and research will likely be an important element in supporting social networks' future development and exploitation.
Speaking of evolution and change, I'm delighted to welcome Fei-Yue Wang to our editorial board (see the sidebar). It's also a real pleasure to announce that Jim Hendler of the University of Maryland has been appointed to succeed me as editor in chief in January 2005. I know that Jim has really exciting plans for the magazine—a title that to my mind is a great example of a collaborative endeavor. I was also pleased to hear that according to the ISI Web of Science, the impact factor of IEEE Intelligent Systems is now 3.725, placing it fourth of the 77 AI journals that ISI tracks. Exploiting the network effect, let me ask you to spread the word. Articles published in IEEE IS are being cited with increasing frequency and impact.