People who take this attitude against distributed processing usually have never had to process more data than fits on one machine, or have always had the budget to fit everything on one very expensive large machine. It's lack of experience masquerading as cleverness. The only people with a right to make this argument are those who spread all their processing out as streams on one machine, or who use distributed streams, but even that has serious limitations.
If there were something better than Spark for distributed processing, we would be using it. The rest of your comment is a straw man, assuming everybody uses it for datasets that fit in the memory of a single node.
The first step is to decide whether you really need distributed data processing, and I think that's the point the author is making. I've seen GB-sized data treated as "BIG DATA", and the architectural patterns built to support it are unbelievable.
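For a sense of scale, here is a minimal single-machine sketch (assuming a hypothetical events.csv of a few GB with customer_id and amount columns) doing the kind of aggregation that often gets wrapped in a whole cluster:

    # Single-machine aggregation with DuckDB; no cluster, no Spark session.
    # 'events.csv' and its columns are hypothetical, just for illustration.
    import duckdb

    top = duckdb.sql("""
        SELECT customer_id, SUM(amount) AS total
        FROM read_csv_auto('events.csv')   -- multi-GB file, streamed from disk
        GROUP BY customer_id
        ORDER BY total DESC
        LIMIT 10
    """).df()
    print(top)

Something like this (or plain pandas with chunked reads) runs comfortably on a laptop; reaching for Spark at that scale mostly adds operational overhead.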
Spark and PySpark are just a PITA to the max.