The best work I've ever done was for one of the biggest energy/agriculture companies in the world. They wanted an outside contractor to build something from scratch and ignore their normal corporate bureaucracy. They couldn't believe how much better the system we (I) built for them was than their usual in-house tooling. The crazy part was that the solutions were mostly just normal cloud tech: Postgres, DynamoDB, some AWS tools. They paid over a million dollars (nothing to them) for a set of tools that weren't technically advanced at all.
In my experience, the problems stem from the constraints imposed by the physical hardware/sensor devices: mountings, power, and connectivity. Receiving data from an array of devices into a shared storage system for analysis is a well-solved problem. Receiving data on a fixed interval from multiple sensors that may be made by different companies and work totally differently is where the complexity lives. Combine that with each company trying to build its own awful closed-source proprietary data system on top of its sensors, and you've got a really terrible time.
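For illustration, the ingest side of that normalization often ends up as a shared landing table that every vendor feed gets mapped into by a per-vendor adapter. A minimal sketch in plain Postgres (the table and column names here are hypothetical, not from any particular system):

```sql
-- Hypothetical shared landing table: each vendor's feed is mapped
-- into this one shape by an adapter before any analysis happens.
CREATE TABLE sensor_readings (
    sensor_id   TEXT             NOT NULL,  -- stable ID we assign, not the vendor's
    vendor      TEXT             NOT NULL,  -- e.g. 'acme', 'globex'
    metric      TEXT             NOT NULL,  -- e.g. 'tank_level', 'soil_moisture'
    ts          TIMESTAMPTZ      NOT NULL,  -- reading time, normalized to UTC
    value       DOUBLE PRECISION NOT NULL,
    unit        TEXT             NOT NULL,  -- normalized unit, e.g. 'mm', 'kPa'
    raw_payload JSONB,                      -- original vendor message, kept for audit
    PRIMARY KEY (sensor_id, metric, ts)
);
```

The point of the shape is that all the vendor weirdness lives in the adapters; everything downstream queries one schema.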
> SQL turns out to be really slow for this kind of data
I think this is just poor modeling. SQL is just fine for the work you're talking about.
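As a sketch of what "just fine" modeling can look like in stock Postgres (names are made up for illustration): a narrow append-only table, a composite B-tree for per-sensor range scans, and a BRIN index for cheap time-ordered bulk scans, assuming rows arrive roughly in time order.

```sql
-- Narrow, append-only readings table; assumes inserts arrive roughly
-- in time order, which is what makes the BRIN index effective.
CREATE TABLE readings (
    sensor_id INT              NOT NULL,
    ts        TIMESTAMPTZ      NOT NULL,
    value     DOUBLE PRECISION NOT NULL
);

-- Supports per-sensor time-range queries ("last 24h for sensor 42"):
CREATE INDEX readings_sensor_ts ON readings (sensor_id, ts);

-- Tiny index for large time-ordered scans across all sensors:
CREATE INDEX readings_ts_brin ON readings USING brin (ts);

-- The typical query shape the first index serves:
SELECT ts, value
FROM readings
WHERE sensor_id = 42
  AND ts >= now() - interval '24 hours'
ORDER BY ts;
```

If queries like that are slow, the problem is usually a wide denormalized table or a missing composite index, not SQL itself.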
> Combine that with each company trying to build its own awful closed-source proprietary data system on top of its sensors, and you've got a really terrible time.
The more I think about right to repair, the more I become convinced it's a symptom of hazy interface specifications.
Radical idea: Ban sensor / actuator companies from building software on top of them in-house.
They're welcome to offer a turn-key solution to market, but it must (1) have hardware and software built by two separate, independent companies & (2) publish its interface specs, between those two companies, to all customers or end users.
Things would cost more. But I'm not convinced this would be a worse world, in aggregate.
Simpler, to me, is just making closed-source drivers illegal. I like the idea of being able to buy something cheap because they were too lazy to actually write a spec for how it works. But not being able to read the de facto spec, i.e. the source code for the driver, is just silly. I've paid for the device, and the driver is basically useless except with that device. Give me the code.
Not completely related, but I also briefly consulted at Monsanto, in an entire department that seemed to be composed of consultants hired to build things from scratch and ignore the normal bureaucracy.
It was fun because we were hired to do things that in house employees had failed to do for whatever reason, so when I built the little dashboard I was hired for, everyone was so amazed.
I just completed the job I was given, but to them I was a superstar.
Consulting can be fun like that.
> I think this is just poor modeling. SQL is just fine for the work you're talking about.
I'm as big a booster of good old SQL as anyone, but there's a lot to be said for more targeted time series solutions when it comes to sensors.
I'm working on a platform for monitoring water tank levels. It slices Grafana and InfluxDB horizontally to share the resources between multiple users and multiple tanks.
The productivity of such a stack is high when it comes to getting beautifully rendered, interactive graphs of, e.g., stacked water levels. And with InfluxDB's Flux language, you can write joins that combine data from the time-series database and the RDBMS (for reference data, like the names and calibration data of individual tanks).
Yes, you can do anything with SQL, but there's a reason dedicated time-series databases exist.
Sensor data isn't really different from any other time-series data.
> The productivity of such a stack is high when it comes to getting beautifully rendered, interactive graphs of, e.g., stacked water levels. And with InfluxDB's Flux language, you can write joins that combine data from the time-series database and the RDBMS (for reference data, like the names and calibration data of individual tanks).
Your productivity being high with a given tech stack does not disqualify an alternative tech stack from having equally high (or much higher) productivity for equally trained users.
> Yes, you can do anything with SQL, but there's a reason dedicated time-series databases exist.
> Your productivity being high with a given tech stack does not disqualify an alternative tech stack from having equally high (or much higher) productivity for equally trained users.
Try implementing classic time-series features like downsampling in a straight RDBMS.
Certainly you can do it, just as you can build a house with a hammer and nails rather than a nail gun.
But you'll spend lots of time building undifferentiated infrastructure that you could have got out of the box.
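For concreteness, a hand-rolled hourly downsample in plain Postgres might look like the sketch below (names are illustrative). The query itself is easy; the refresh scheduling, retention, and backfill you now own around it are the "undifferentiated infrastructure".

```sql
-- Hand-rolled hourly downsample: the query is trivial, but you now
-- own the refresh schedule, retention policy, and backfill yourself.
CREATE MATERIALIZED VIEW readings_hourly AS
SELECT
    sensor_id,
    date_trunc('hour', ts) AS bucket,
    avg(value)             AS avg_value,
    min(value)             AS min_value,
    max(value)             AS max_value,
    count(*)               AS n_samples
FROM readings
GROUP BY sensor_id, date_trunc('hour', ts);

-- ...which something external (cron, pg_cron, your app) must keep fresh:
REFRESH MATERIALIZED VIEW readings_hourly;
```

A dedicated time-series store (or TimescaleDB's continuous aggregates, for that matter) gives you the incremental refresh and retention policies out of the box instead.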