Consider adding support for Personalized PageRank (aka Topic Sensitive PageRank) #271

Description
An extension of the PageRank (PR) algorithm exists referred to as both Personalized PageRank (PPR) and Topic Sensitive PageRank.
The PR algorithm ranks all vertices of a graph, where more “important” nodes gains high scores. This score is highly dependent on the degree of the node. The PR algorithm and the random surfer view on this algorithm is described in multiple sources ( start here https://en.wikipedia.org/wiki/PageRank) .
In PPR each vertex has a probability/score representing the likelihood that the random surfer visits that node, in the case where he selects to “teleport”. In PPR the probability of teleportation is also controlled by the damping factor.
The parameter introduced in PPR is often referred to as the restart vector, being a vector of probabilities of length N. N being the number of vertices in the graph.
More effort and research should done in finding the parallell implementation of PPR, most suitable to implement in neo4-graph-algorithms.
The PPR is more personalized than PR, in the sense that the graph walk are biased towards the nodes with high scores in the restart vector. This algorithm has several use cases i.e. recommendation systems, personalized searches, who-follows-who, fraud propagation.
My current application of PPR is fraud propagation, and in my opinion it would make sense to support this in neo4j. Both because you now have support for PR. And because neo4j already is powerful when it comes to fraud detection using “pattern matching” with Cypher. PPR support can take this a step further.
Some thoughts on implementation
- The PPR should take a parameter referring to the node property holding the restart probability
- In the case where only a few nodes have restart prob. > 0 some optimisation is necessary. In these sparse scenarios it does not make sense to visit all nodes to fetch the restart prob.
- Consider adding support for weights on edges, this is more relevant for PPR than PR.
- Consider adding support for directed edges. Only following outgoing edges.