Odds not hopeless for new Web sites
By Eric Smalley and Kimberly Patch, Technology Research News
There is at least a little room at the
top, according to a team of NEC researchers who found that the structure
of groups of related sites on the World Wide Web is different from that
of the Web as a whole.
Recent research has shown that the overall distribution of links on the
Web follows a power-law structure, meaning that a small number of large
Web sites gain most new links, making them larger still.
"This means that an extremely small number of
Web pages have the vast majority of inlinks," said David Pennock, a research
scientist at NEC Research Institute.
The NEC researchers found that the distribution of inbound links within
specific Web communities isn't quite as highly concentrated, however.
They used the finding to build a model that accounts for the differing
structures of various segments of the Web versus the Web as a whole.
The model can be used as a tool to measure the degree of competitiveness
in a network, said Pennock. "This may be important, for example, to e-commerce
companies looking to enter a new market niche," he said. The model also
applies to natural and social networks like metabolic groups of cells
and actor collaborations, he said.
In a rich-get-richer network, nodes that already have many inbound links
tend to receive more over time. "Mathematically, the probability that
a node receives another inlink is proportional to the number of inlinks
it already has," said Pennock. If a network grows via this preferential
attachment process alone, "the rich nodes keep getting richer and the
poor nodes can never catch up," he said.
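
The proportionality Pennock describes can be illustrated with a minimal Python sketch. The function name `preferential_probabilities` and the sample inlink counts are hypothetical, chosen only for illustration; they are not taken from the researchers' paper.

```python
def preferential_probabilities(inlinks):
    """Return each page's chance of receiving the next inbound link
    under pure preferential attachment (illustrative helper, not the
    researchers' code)."""
    total = sum(inlinks)
    if total == 0:
        # With no links anywhere, "proportional to inlinks" is undefined.
        raise ValueError("at least one page must already have an inlink")
    # Probability is proportional to the inlinks a page already has.
    return [count / total for count in inlinks]


# Hypothetical community of three pages with 90, 9 and 1 inbound links.
probs = preferential_probabilities([90, 9, 1])
print(probs)
```

Under these assumed counts the best-linked page is 90 times as likely as the least-linked page to gain the next link, and a page with zero inlinks would have no chance at all.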
Although it is well established that the rich-get-richer phenomenon applies
to the Web as a whole, the research shows that the distribution of inbound
links within Web communities is sometimes different from that of the entire
Web. Pages within a Web community that contain different numbers of links
can have the same probability of receiving a given new link. The researchers
refer to this as uniform attachment, in contrast to the preferential attachment
of the Web as a whole.
In a uniform-attachment community "more Web pages fare better than would
be the case under a pure power-law distribution," said Pennock.
Web communities have a mix of uniform and preferential attachments. For
some communities, the percentage of preferential attachment is high and
link-poor nodes will have difficulty ever catching up, said Pennock. If
the percentage of uniform attachment is high, however, "poor nodes
can often -- with some luck -- get rich, too," he said.
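
The mix of the two attachment types can be sketched as a small simulation. This is a simplified illustration under stated assumptions, not the researchers' published model: the function `grow_network`, the parameter `alpha` (the fraction of links placed preferentially) and the one-link-per-new-page growth rule are all hypothetical choices made for the example.

```python
import random


def grow_network(n_pages, alpha, seed=0):
    """Grow a toy community one page at a time (illustrative sketch).

    Each new page adds one link to an existing page. With probability
    `alpha` the target is chosen preferentially (proportional to current
    inlinks); otherwise it is chosen uniformly among existing pages.
    Returns the list of inlink counts, one entry per page.
    """
    rng = random.Random(seed)
    inlinks = [0]   # start with a single page that has no inbound links
    targets = []    # one entry per existing link, naming the page it points to
    for _ in range(n_pages - 1):
        if targets and rng.random() < alpha:
            # Picking a random existing link and reusing its target selects
            # a page with probability proportional to its inlink count.
            target = rng.choice(targets)
        else:
            # Uniform attachment: every existing page is equally likely.
            target = rng.randrange(len(inlinks))
        inlinks[target] += 1
        targets.append(target)
        inlinks.append(0)  # the new page joins with no inbound links
    return inlinks


def top_share(inlinks):
    """Fraction of all links held by the single best-linked page."""
    return max(inlinks) / sum(inlinks)


for alpha in (1.0, 0.5, 0.0):
    counts = grow_network(500, alpha, seed=1)
    print(alpha, round(top_share(counts), 3))
```

In this toy version, `alpha = 1.0` is the extreme case: once the first page holds a link, no page with zero inlinks can ever receive one, so the first page collects every link. Lowering `alpha` gives later pages a chance to pick up links by luck.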
Using the model, the researchers found that the community of e-commerce
Web sites selling publications, a category dominated by Amazon.com, is
highly competitive and is structured similarly to the Web as a whole.
In contrast, the community of e-commerce Web sites selling professional
photographers' services is much less competitive, meaning smaller sites
have a better chance of gaining links in order to grow.
The research also provides insights into the vulnerability of networks
to both accidental failures and malicious attacks, Pennock said. "With
more accurate models of different network types, we can begin to understand
which are more robust and which are more delicate and prone to disruption,"
he said.
The researchers' next steps are to measure competition within different
Web communities and to apply their model to better understand network
fault-tolerance and robustness, said Pennock. They also plan to incorporate
some sense of Web page topics because Web pages about the same topic tend
to link to one another, Pennock said.
Anyone who has access to a search engine and can program
the researchers' model can now measure the degree of competitiveness within
Web communities, said Pennock. Applications of the model in other areas
including network fault tolerance and mobile phone networks "are probably
about two years off," he said.
Pennock's research colleagues were Gary W. Flake, Steve Lawrence and Eric
J. Glover of NEC Research Institute and C. Lee Giles of NEC Research Institute
and Pennsylvania State University. They published the research in the
April 16, 2002 issue of the Proceedings of the National Academy of Sciences.
The research was funded by NEC Corporation.
Timeline: Now, 2 years
Funding: Corporate
TRN Categories: Internet
Story Type: News
Related Elements: Technical paper, "Winners Don't Take All:
Characterizing the Competition for Links on the Web," Proceedings of the
National Academy of Sciences, April 16, 2002