Relational databases are awesome if you are not dealing with huge amounts of dat...

jeltz · on Sept 17, 2012

A relational database on very modest hardware can handle millions of rows, and with a solid database, good hardware, and a DBA who knows his shit you can handle billions.

OpenStreetMap has over a billion nodes stored in a PostgreSQL database.

http://www.openstreetmap.org/stats/data_stats.html

My point is that you can get very far with a classic relational database before you have to scale horizontally.

einhverfr · on Sept 18, 2012

Billions of rows btw are not a problem even on modest hardware with a half-decent db. It's analytics on that data where it possibly becomes an issue. Retrieving one row out of 1 billion is not much more complex than retrieving one row out of 1 million, nor is it much more expensive computationally, assuming the right index is in place.
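
To put a rough number on the index point, here is a back-of-the-envelope sketch of how many B-tree levels a lookup has to walk. The function name `btree_levels` and the fanout of 256 keys per page are assumptions for illustration only; real fanout depends on the database, the page size, and the key width.

```python
def btree_levels(rows: int, fanout: int = 256) -> int:
    """Estimate how many B-tree levels are needed to index `rows` keys.

    `fanout` is an assumed number of keys per page, not a figure
    taken from any particular database.
    """
    levels = 1
    capacity = fanout  # keys reachable through a tree with `levels` levels
    while capacity < rows:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    for rows in (1_000_000, 1_000_000_000):
        print(f"{rows:>13,} rows -> about {btree_levels(rows)} levels")
```

Under that assumption, going from a million rows to a billion adds one level to the tree, which is why a single-row lookup barely gets more expensive.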

The problem comes with high concurrency, in particular very high write concurrency, or with very complex queries which require a lot of RAM to do properly. But that's where you need a solid db, good hardware, and a solid DBA.

jeltz · on Sept 18, 2012

Very true. If you do not do much with your data it is trivial to handle.

einhverfr · on Sept 18, 2012

I always get a kick out of Stonebraker selling NewSQL/VoltDB as the solution to this sort of problem. He provides a triangle where at the vertices are VoltDB, column stores, and NoSQL.

However, what he always fails to mention is that some of the most important OLTP happens in enterprise ERP systems, and these involve all of those kinds of workload at once. Yes, you could probably get some additional performance gains by splitting up your ERP into VoltDB, Vertica, and CouchDB with connectors between them, but why?

And relational data doesn't even start to get moderately big until you are in the TB range. This is true even of ERP apps.

Millions of rows are a problem? What are you using? Microsoft Access?

gregjor · on Sept 17, 2012

There may be cases where non-relational databases are the right solution, but unless you are using a toy RDBMS, millions of rows is not considered a lot of data. Millions of rows added every day... now you're talking. Oracle handles that kind of thing just fine.
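
For a sense of scale, a quick sketch that converts a daily insert volume into an average write rate. The helper name `rows_per_second` is hypothetical, and it assumes writes are spread evenly over the day, which real workloads rarely are.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def rows_per_second(rows_per_day: float) -> float:
    """Average sustained insert rate for a given daily row volume.

    Assumes an even spread over 24 hours; peak rates will be higher.
    """
    return rows_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    for daily in (1_000_000, 10_000_000, 100_000_000):
        print(f"{daily:>11,} rows/day is about {rows_per_second(daily):,.1f} rows/s")
```

One million rows a day averages out to under 12 inserts per second, so it takes a good deal more than that before the write load itself gets interesting.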