Why don’t we have something more “torrent-like” for search?
Imagine a decentralized network where volunteers run crawler nodes that each fetch and extract a tiny slice of the web. Those partial results get merged into open, versioned indexes that can be distributed via P2P (or mirrored anywhere). Then anyone can build ranking, vertical search, or specialized tools on top of that shared index layer.
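To make the idea a bit more concrete, here is a rough Python sketch of the two pieces that paragraph describes: splitting the crawl across volunteer nodes, and merging their partial results into a versioned index. Everything in it is an assumption for illustration, not an existing protocol: the function names (`assign_node`, `build_partial_index`, `merge_indexes`, `index_version`) are made up, assignment is a plain hash of the hostname, the "index" is a bare term-to-URL map, and the version ID is just a SHA-256 of the merged result. It skips fetching, trust, and spam entirely.

```python
import hashlib
import json
from collections import defaultdict
from urllib.parse import urlsplit


def assign_node(url, num_nodes):
    """Pick a node for a URL by hashing its host, so one node owns a whole site.

    Hypothetical scheme: a real network would need to handle nodes
    joining and leaving (e.g. consistent hashing), which this does not.
    """
    host = urlsplit(url).netloc.lower()
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def build_partial_index(pages):
    """Build a tiny inverted index (term -> set of URLs) from fetched page text.

    `pages` maps URL -> extracted text; tokenization is a naive whitespace split.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for term in text.lower().split():
            index[term].add(url)
    return dict(index)


def merge_indexes(partials):
    """Union partial indexes into one; the result does not depend on merge order."""
    merged = defaultdict(set)
    for partial in partials:
        for term, urls in partial.items():
            merged[term] |= set(urls)
    # Sort terms and postings so the merged index has one canonical form.
    return {term: sorted(urls) for term, urls in sorted(merged.items())}


def index_version(index):
    """Derive a content-addressed version ID from the canonical merged index."""
    payload = json.dumps(index, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because the merged index is canonicalized before hashing, two mirrors that merge the same partial results in a different order end up with the same version ID, which is the property you would want for distributing snapshots over P2P.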
I get that reproducing Google’s “Coca-Cola formula” (ranking, spam fighting, infra, freshness, etc.) is probably unrealistic. But I’d happily use the coconut-water version: an open baseline index that’s good enough, extensible, and not owned by a single gatekeeper.
I know we have Common Crawl, but a network of small processing nodes could be more efficient and keep the index fresher.