PageRank

  • PR

Best for: Node importance Aliases: PR

How it works

$$PR(p_i)=\frac{1-d}{N}+d\sum_{p_j\in M(p_i)}\frac{PR(p_j)}{L(p_j)}$$

Models a random surfer: with probability $d$ follow an outgoing link, with probability $1-d$ teleport to any page uniformly at random. The PageRank of $p_i$ is $PR(p_i)=\frac{1-d}{N}+d\sum_{p_j\in M(p_i)}\frac{PR(p_j)}{L(p_j)}$, where $M(p_i)$ are pages linking to $p_i$ and $L(p_j)$ is the out-degree of $p_j$. The score vector is computed by iterating to a fixed point, equivalently the leading eigenvector of the Google matrix $G=d\tilde A+\tfrac{1-d}{N}\mathbf 1\mathbf 1^\top$.

When to use

Computing node importance from link structure alone; SEO, citation analysis, and seed selection.

Watch out

Insensitive to node/edge features; biased toward high-degree nodes; dangling-node handling required.

Common fields

Search · citation networks · fraud signals