PrimeKG Pathfinding Algorithm Benchmark Laboratory

Benchmarking Graph Algorithms for Drug Repurposing Discovery

Benchmark Results

We evaluated eight algorithms on 150 curated drug–disease pathways mapped from DrugMechDB into PrimeKG. The main result is that edge weighting alone produces very small differences, while changing the search strategy leads to the first clear improvement in mechanism recovery.

Headline Findings

Phase 1 Convergence

The five edge-weighting methods cluster tightly together. The overall F1 spread is only 0.023, which shows that graph topology dominates edge reweighting.

Best Performer

Bidirectional search achieved the best overall performance with F1 = 0.5709 and Edit Distance = 0.4291, outperforming all Phase 1 baselines.

Main Interpretation

The problem is not just which edge weights we use. The deeper issue is that shortest-path methods systematically collapse long mechanisms into short shortcut paths.

Algorithm Performance Across 150 Pathways

Phase 1 compares edge-weighting strategies. Phase 2 introduces search-strategy changes. Bidirectional search is the first method that improves both node-overlap accuracy and path-order similarity.

F1 Score ↑

Phase 1
Semantic Bridging
0.5569
PageRank-Inverse
0.5472
Dijkstra
0.5448
Hub-Penalized
0.5403
Meta-Path BFS
0.5338
Phase 2
Bidirectional ★
0.5709
K-Shortest + Bio
0.5553
Bidir + Relation Wt
0.5491

Edit Distance ↓

Phase 1
Semantic Bridging
0.5374
PageRank-Inverse
0.5437
Hub-Penalized
0.5513
Meta-Path BFS
0.5612
Dijkstra
0.6146
Phase 2
Bidirectional ★
0.4291
K-Shortest + Bio
0.4447
Bidir + Relation Wt
0.4509

Phase 1 convergence: F1 spread = 0.023. Edge weighting alone does not substantially change performance. The graph structure itself dominates.

Case Studies

These examples show two common outcomes. Some algorithms collapse long mechanisms into short shortcuts. Others recover the correct intermediate proteins when shortcut hubs are discouraged.

Case Study

Regorafenib → GIST: 5-Node Kinase Signaling Mechanism

Ground Truth 5 nodes
DRGREG
Regorafenib
PRTPDGFRA
PDGF Receptor α
BIOPROLIF
Cell Proliferation
PRTKIT
KIT Receptor
DISGIST
Gastroint. Stromal Tumor
Semantic Bridging 1 / 3 intermediates
DRGREG
Regorafenib
PRTPDGFRA
PDGF Receptor α
DISGIST
Gastroint. Stromal Tumor
▸ Upstream target only. Finds PDGFRA via name similarity, but skips cell proliferation and KIT.
PageRank-Inverse 1 / 3 intermediates
DRGREG
Regorafenib
PRTKIT
KIT Receptor
DISGIST
Gastroint. Stromal Tumor
▸ Downstream hub only. Finds KIT, but misses the upstream signaling chain and biological process.
Case Study

Pegvisomant → Acromegaly: 4-Node Growth Hormone Mechanism

Ground Truth 4 nodes
DRGPEG
Pegvisomant
PRTGHR
Growth Hormone Receptor
PRTGH1
Somatotropin
DISACR
Acromegaly
Dijkstra — Shortest Path 0 / 2 intermediates
DRGPEG
Pegvisomant
DISACR
Acromegaly
✗ Direct indication edge. Skips GHR and GH1 entirely, so no mechanism is recovered.
Hub-Penalized 2 / 2 intermediates
DRGPEG
Pegvisomant
PRTGHR
Growth Hormone Receptor
PRTGH1
Somatotropin
DISACR
Acromegaly
✓ Full mechanism recovered. Hub penalization recovers both correct protein intermediates.

Failure Analysis

Failure patterns are not random. The strongest recurring issue is routing through highly connected hub nodes, which lowers biological specificity.

Hub Routing Hurts F1

For bidirectional search, pathways that route through high-degree hub intermediates have a lower average F1 than pathways that avoid them. This supports the idea that shortcut nodes reduce mechanistic fidelity.

Shortest Paths Are Too Short

Ground-truth mechanisms average 5.8 nodes, while most predicted paths are much shorter. This means many algorithms terminate early before traversing the actual biological cascade.

Node Selection Is the Bottleneck

Forcing longer paths does not automatically solve the problem. Once the first hop is wrong, downstream predictions tend to remain mechanistically incorrect.

Key Takeaways