Webgraph
Appearance
The webgraph describes the directed links between pages of the World Wide Web. A graph, in general, consists of several vertices, some pairs connected by edges. In a directed graph, edges are directed lines or arcs. The webgraph is a directed graph, whose vertices correspond to the pages of the WWW, and a directed edge connects page X to page Y if there exists a hyperlink on page X, referring to page Y.
Properties
- The degree distribution of the webgraph strongly differs from the degree distribution of the classical random graph model, the Erdős–Rényi model:[1] in the Erdős–Rényi model, there are very few large degree nodes, relative to the webgraph's degree distribution. The precise distribution is unclear,[2] however: it is relatively well described by a lognormal distribution, as well as the Barabási–Albert model for power laws.[3][4]
- The webgraph is an example of a scale-free network.
Applications
- The webgraph is used for computing the PageRank [5] of the WWW pages.
- The webgraph is used for computing the personalized PageRank.[6]
- The webgraph can be used for detecting webpages of similar topics, through graph-theoretical properties only, like co-citation [7]
- The webgraph is applied in the HITS algorithm for identifying hubs and authorities in the web.
References
- ^ P. Erdős, A. Renyi, Publ. Math. Inst. Hung. Acad. Sci. 5 (1960)
- ^ Meusel, R.; Vigna, S.; Lehmberg, O.; Bizer, C. (2015). "The Graph Structure in the Web - Analyzed on Different Aggregation Levels". Journal of Web Science. 1 (1): 33–47. doi:10.1561/106.00000003.
- ^ Clauset, A.; Shalizi, C. R.; Newman, M. E. J. (2007). "Power-law distributions in empirical data". SIAM Rev. 51 (4): 661–703. doi:10.1137/070710111.
- ^ Barabási, Albert-László; Albert, Réka (October 1999). "Emergence of scaling in random networks" (PDF). Science. 286 (5439): 509–512. doi:10.1126/science.286.5439.509. PMID 10521342..
- ^ S. Brin, L. Page, Computer Networks and ISDN Systems 30, 107 (1998)
- ^ Glen Jeh and Jennifer Widom. 2003. Scaling personalized web search. In Proceedings of the 12th international conference on World Wide Web (WWW '03). ACM, New York, NY, USA, 271–279. DOI 10.1145/775152.775191
- ^ Ravi Kumar, Prabhakar Raghavan, Sridhar Rajagopalan, Andrew Tomkins, 1999 Trawling the Web for emerging cyber-communities, Computer Networks, Volume 31, Issues 11–16, Pages 1481–1493 doi:10.1016/S1389-1286(99)00040-7.