Jump to content

Webgraph

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Trappist the monk (talk | contribs) at 19:50, 20 March 2014 (Properties: Fix CS1 unknown parameter errors using AWB). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The webgraph describes the directed links between pages of the World Wide Web. A graph, in general, consists of several vertices, some pairs connected by edges. In a directed graph, edges are directed lines or arcs. The webgraph is a directed graph, whose vertices correspond to the pages of the WWW, and a directed edge connects page X to page Y if there exists a hyperlink on page X, referring to page Y.

Properties

  • The degree distribution of the webgraph strongly differs from the degree distribution of the classical random graph model, the Erdős–Rényi model:[1] in the Erdős–Rényi model, there are very few large degree nodes, relative to the webgraph's degree distribution. The precise distribution is unclear, however: it is well described by a lognormal distribution, as well as the Barabási–Albert model for power laws.[2][3]
  • The webgraph is an example of a scale-free network.

Applications

  • The webgraph is used for computing the PageRank [4] of the WWW pages.
  • The webgraph is used for computing the personalized PageRank.[5]
  • The webgraph can be used for detecting webpages of similar topics, through graph-theoretical properties only, like co-citation [6]
  • The webgraph is applied in the HITS algorithm for identifying hubs and authorities in the web.

References

  1. ^ P. Erdős, A. Renyi, Publ. Math. Inst. Hung. Acad. Sci. 5 (1960)
  2. ^ Clauset, A.; Shalizi, C. R.; Newman, M. E. J. (2007). "Power-law distributions in empirical data". SIAM Rev. (51): 661–703. doi:10.1137/070710111.
  3. ^ Barabási, Albert-László; Albert, Réka (October 1999). "Emergence of scaling in random networks" (PDF). Science. 286 (5439): 509–512. doi:10.1126/science.286.5439.509. PMID 10521342..
  4. ^ S. Brin, L. Page, Computer Networks and ISDN Systems 30, 107 (1998)
  5. ^ Glen Jeh and Jennifer Widom. 2003. Scaling personalized web search. In Proceedings of the 12th international conference on World Wide Web (WWW '03). ACM, New York, NY, USA, 271–279. DOI=10.1145/775152.775191 http://doi.acm.org/10.1145/775152.775191
  6. ^ Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins, Trawling the Web for emerging cyber-communities, Computer Networks, Volume 31, Issues 11–16, 17 May 1999, Pages 1481–1493, ISSN 1389–1286, doi:10.1016/S1389-1286(99)00040-7.