I got my undergrad in physics, and data hacking was discussed at length in every lab class. I don't know if this is a common experience, but it was one of the most beneficial lessons.
In the beginning it always felt obvious what hacking was or wasn't, but towards the end it was genuinely hard to distinguish. I think that was the point. It created a lot of self-doubt, which led to high levels of scrutiny.
Later I worked as an engineer and saw frequent examples of the errors you describe. One time another engineer asked if we could extrapolate data in a certain way; I said no, and that doing so would likely lead to catastrophic failure. The lead engineer said I was being a perfectionist. Well, the rocket engine exploded during the second test fire, costing the company millions and years of work. The perfectionist label never went away, despite several more instances (not at that scale). Whatever extra time and money my "perfectionism" cost was greatly offset by the failures it would have prevented.
Later I went to grad school for CS, and it doesn't feel much different. Academia, big tech, small tech, whatever: people think you plug data into algorithms and the result you get is all there is. But honestly, that's where the real work starts.
Algorithms aren't oracles; you need to study them deeply to understand their limits and flaws. If you don't, you get burned, and worse, the flame is often invisible. A lot of time and money gets wasted fighting those fires, and people frequently believe the only flames that exist are the obvious, highly visible ones.
I'm not sure there's a great universal book. Generally you learn this through formal education and as parts of textbooks. There are dedicated topics like Bayesian Experimental Design (might have "Optimal" in the name) and similar subjects, but I'm not sure that's what you're looking for. One point of contention I had in grad school (CS) was the lack of this training for CS students, especially in data analysis and ML classes. I'm not surprised students end up believing "output = correct".
These are topics you can generally learn on your own (maybe that's why there's no consolidated class?). The real key is to ask a lot of questions about your metrics. Remember: all metrics are guides; they are never perfectly aligned with the thing you actually want to measure. You need to understand that divergence to know when a metric works and when it doesn't. This can be tricky, but to get into the habit, constantly ask yourself "what is being assumed?" There are always a lot of assumptions, and they're usually not communicated well...
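To make the metric/goal divergence concrete, here's a minimal, hypothetical sketch (assuming scikit-learn; the class proportions and the "predict nothing" model are made up for illustration). A model that never flags the rare event you actually care about still posts a great accuracy number, because accuracy isn't aligned with what you want to measure:

    # Toy illustration of metric divergence (assumes scikit-learn is installed).
    import numpy as np
    from sklearn.metrics import accuracy_score, recall_score

    rng = np.random.default_rng(0)

    # 1000 samples, ~2% positives (the rare failures you actually care about).
    y_true = (rng.random(1000) < 0.02).astype(int)

    # A "model" that just predicts the majority class every time.
    y_pred = np.zeros(1000, dtype=int)

    print(accuracy_score(y_true, y_pred))  # ~0.98, looks fantastic
    print(recall_score(y_true, y_pred))    # 0.0, catches none of the failures

The hidden assumption baked into accuracy here is that every kind of mistake costs the same, which is exactly the sort of thing that never gets written down.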