Random surfing model

The random surfing model is a subset of the random effects model which describes the probability of a random user visiting a web page. Each new node describes a person setting up a web page. Nodes are connected to the existing graph by choosing a start node at random, then performing a short and random traversal of the nodes, or random walk.^[1] Connections to other nodes in this graph are formed when outbound links are placed on the page.

Introduction

A user navigates the internet in two primary ways; the user may access a site directly by entering the site's URL or clicking a bookmark, or the user may use a series of hyperlinks to get to the desired page. The random surfer model assumes that the link which the user selects next is picked at random. The model also assumes that the number of successive links is not infinite – the user will at some point lose interest and leave the current site for a completely new site.^[1]

Graph definitions

In the random surfing model, webgraphs are presented as a sequence of directed graphs $G_{t},t=1,2,...$ such that a graph $G_{t}$ has $t$ vertices and $t$ edges. The process of defining graphs is parameterized with a probability $p$ , thus we let $q=1-p$ .^[2]

Nodes of the model arrive one at time, forming $k$ connections to the existing graph $G_{t}$ . In some models, connections represent directed edges, and in others, connections represent undirected edges. Models start with a single node $v_{0}$ and have $k$ self-loops. $v_{t}$ denotes a vertex added in the $t^{th}$ step, and $n$ denotes the total number of vertices.^[1]

Model 1. (1-Step Walk With Self-Loop)

Parameters
$k$	$p$	$t$

At time $t$ , vertex $v_{t}$ makes $k$ connections by $k$ iterations of the following steps:

Pick an existing node $v$ uniformly at random from $\{v_{0},v_{1},...,v_{t-1}\}$
With probability $p$ stay at $v$ ; with probability $1-p$ take a 1-step walk to a random neighbor of $v$
Add an edge from $v_{t}$ to the current node

For directed graphs, edges added are directed from $v_{t}$ into the existing graph. Edges are undirected in respective undirected graphs.

Model 2. (Random walks with coin flips)

Parameters
$k$	$p$	$t$

At time $t$ , vertex $v_{t}$ makes $k$ connections by $k$ iterations of the following steps:

Pick an existing node $v$ uniformly at random from $\{v_{0},v_{1},...,v_{t-1}\}$
Flip a coin of bias $p$
If the coin comes up heads add an edge from $v_{t}$ to the current node and stop
If the coin comes up tails, move to a random neighbor of the current node and go back to step 2

For directed graphs, edges added are directed from $v_{t}$ into the existing graph. Edges are undirected in respective undirected graphs.

Limitations

There are some caveats to the standard random surfer model, one of which is that the model ignores the content of the sites which users select – since the model assumes links are selected at random. Because users tend to have a goal in mind when surfing the internet, the content of the linked sites is a determining factor of whether or not the user will click a link.^[1]^[2]

Application

The normalized eigenvector centrality combined with random surfer model's assumption of random jumps created the foundation of Google's PageRank algorithm.^[2]^[3]

References

^ ^a ^b ^c ^d Blum, Avrim; Chan, T-H. Hubert; Rwebangira, Mugizi Robert (21 January 2006). Written at 3600 University City Science Center Philadelphia, PA, United States. "A Random-Surfer Web-Graph Model" (PDF). Computer Science Department. ANALCO '06: Proceedings of the Meeting on Analytic Algorithmics and Combinatorics. Carnegie Mellon University: Society for Industrial and Applied Mathematics: 238–246.{{cite journal}}: CS1 maint: location (link)
^ ^a ^b ^c Chebolu, Prasad; Melsted, Páll (1 January 2008). "PageRank and the random surfer model" (PDF). Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Department of Mathematical Sciences, Carnegie Mellon University: 1010–1018. doi:10.1145/1347082.1347193.
^ Zaki, Mohammed J.; Meira, Jr., Wagner (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press. ISBN 9780521766333.

External Links

[:0-1] Blum, Avrim; Chan, T-H. Hubert; Rwebangira, Mugizi Robert (21 January 2006). Written at 3600 University City Science Center Philadelphia, PA, United States. "A Random-Surfer Web-Graph Model" (PDF). Computer Science Department. ANALCO '06: Proceedings of the Meeting on Analytic Algorithmics and Combinatorics. Carnegie Mellon University: Society for Industrial and Applied Mathematics: 238–246.{{cite journal}}: CS1 maint: location (link)

[:1-2] Chebolu, Prasad; Melsted, Páll (1 January 2008). "PageRank and the random surfer model" (PDF). Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms. Department of Mathematical Sciences, Carnegie Mellon University: 1010–1018. doi:10.1145/1347082.1347193.

[3] Zaki, Mohammed J.; Meira, Jr., Wagner (2014). Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press. ISBN 9780521766333.

[1]

[2]

[3]