Original author here. This is interesting -- I tried to do my best to find out w...

aristus · on June 3, 2015

HipHop's replacement was pretty widely reported: http://www.wired.com/2013/06/facebook-hhvm-saga/

This link has papers on pub/sub, HHVM, and so on: https://research.facebook.com/publications/

Re Haystack: possible I am misremembering, or the project to completely replace Haystack stalled since I left.

If you want to gather a more complete picture of infrastructure at these companies I suggest, well, not imposing the strange limitation of only reading peer-reviewed papers. Almost none of the stuff I worked on ended up in conference proceedings.

ms705 · on June 3, 2015

Thanks, I've added HHVM and marked HipHop as superseded.

I also added Wormhole, which I think is the pub/sub system you're referring to (published in NSDI 2015: https://www.usenix.org/system/files/conference/nsdi15/nsdi15...).

Updated version at original URL: http://malteschwarzkopf.de/research/assets/facebook-stack.pd...

Regarding the focus on academic papers: I agree that this does not reflect the entirety of what goes on inside companies (or indeed how open they are; FB and Google also release lots of open-source software). Certainly, only reading the papers would be a bad idea. However, a peer-reviewed paper (unlike, say, a blog post) is a permanent, citable reference and is part of the scientific literature. This sets a quality bar (enforced through peer review, which deemed the description to be plausible and accurate), and allows the amount of information to remain manageable. The number of other sources of information makes them impractical to write up concisely, and it is hard to say what ought to be included and what should not when going beyond published papers.

I don't think anyone should base their perception of how Google's or Facebook's stack works on these charts and bibliographies -- not least because they will quickly be out of date. However, I personally find them useful as a quick reference to comprehensive, high-quality descriptions of systems that are regularly mentioned in discussions :-)

Maro · on June 4, 2015

Other ways to get information:

1. Ask the developers per email.

2. Fly out to SF, visit the campus, have lunch.

Will work for FB (I do it all the time), Google people won't tell you anything.

kajecounterhack · on June 3, 2015

Re Hiphop I think HHVM + Hack (Facebook's internal "improved PHP") has superseded it but while HHVM is open sourced Hack isn't public.

chrsm · on June 3, 2015

Hack is just a piece of HHVM, it's open: http://hacklang.org/

kajecounterhack · on June 3, 2015

Ah thanks, I didn't realize it was out. I read a separate article that said it hadn't been released yet -- it was probably outdated.

nulltype · on June 3, 2015

Does Spanner talk to Bigtable? From reading the paper, I thought it was built directly on Colossus.

ms705 · on June 3, 2015

Ah, you're right!

I misread "This section [...] illustrate[s] how replication and distributed transactions have been layered onto our Bigtable-based implementation." in the paper as meaning that Spanner is partly layered upon BigTable, but what it really means is that the implementation is based upon (as in, inspired by) BigTable.

Spanner actually has its own tablet implementation as part of the spanserver (stored in Colossus) and does not use BigTable. I've amended the diagram to reflect this.