I am mostly a C++ developer but I have been on some Java projects recently, and I am a bit shocked by the "what if it changes?" culture. Lots of abstraction, loose coupling, design patterns, etc... It looks like a cult of the Gang of Four.
Of course, it is not exclusive to Java, and these patterns are here for a reason, but I have a feeling that Java developers tend to overdo it.
A language is nothing but the code it engenders. Bad code, bad language.
A language that evolves "is" the new code that is written in it, with deep legacy substrata written to previous versions. You can have a good top level but an embarrassing legacy. We should always strive to make our legacy embarrassing, because that marks improvement.
If ten-year-old code in your language is not an embarrassment, your language is stagnating.
I think a large part of the difference is that interfaces and polymorphism are effectively free in Java, whereas virtual methods in C++ come at a cost.
Virtual dispatch always has a cost. It's "free" in Java only in the sense that it's always done at the VM level, so you might as well use it. Even final methods are just an agreement with the compiler, the VM doesn't care, it will dynamically look up the method in the class hierarchy like God intended. C++ makes it painful and obvious what you're getting yourself into.
The JVM is of course very clever and is, I'm sure, doing tons of shenanigans to reduce the cost. But that's not free; that's someone else investing tons of time, effort, and complexity to reduce the cost of a fundamentally expensive operation.
> even final methods are just an agreement with the compiler, the VM doesn't care, it will dynamically look up the method
You’re right about the “final” keyword being a placebo, but you got the rest exactly backwards.
The JVM is ridiculously aggressive in optimizing for throughput over latency: it assumes that everything is final and compiles the code with that assumption until proven otherwise. If it sees a method getting overridden, it will go back and recompile all the callers and everything that was incorrectly inlined.
A lot of Java code depends on this. For example if you only load one of several plugins at runtime, there’s no overhead vs implementing that plugin’s feature in the main code base.
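The single-plugin situation described above can be sketched roughly like this (the `Plugin`/`DoublerPlugin` names are made up for illustration, and the devirtualization itself is a HotSpot behavior you can't observe from source code):

```java
public class PluginHost {
    interface Plugin {
        int run(int input);
    }

    // The only implementation loaded at runtime in this sketch.
    static class DoublerPlugin implements Plugin {
        public int run(int input) { return input * 2; }
    }

    static int dispatch(Plugin p, int x) {
        // Monomorphic call site: with a single loaded implementation the
        // JIT can speculatively devirtualize and inline p.run(). Loading
        // a second implementation later triggers deoptimization and
        // recompilation of this caller.
        return p.run(x);
    }

    public static void main(String[] args) {
        System.out.println(dispatch(new DoublerPlugin(), 21)); // prints 42
    }
}
```

So as long as only one plugin class is ever loaded, the interface call should cost about the same as a direct call into the main code base.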
The single-plugin case is kind of optimistic wishful thinking. Sure, it happens. Sometimes.
But in real code you often have plenty of things like iterators or lambdas, and you'll have many types of those. So the calls to them will be megamorphic, and no JVM magic can do anything about it.
While in C++ world you'd use templates or in Rust you'd use traits, which are essentially zero cost, guaranteed.
Somehow I never noticed it happening in practice. In all the cases where cache was the problem, it was caused by data, not code. CPUs prefetch the code into cache quite well.
They are related in the sense that Java's implementation of generics does not help with devirtualization, while C++ templates / Rust traits do help, by not needing virtual calls in the first place.
If you load more than one Comparator type, then the calls to comparator are megamorphic and devirtualization won't happen unless the whole sort is inlined.
In languages like C++, you'd make it a template and the compiler would always know the target type, so no need for virtual.
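The Comparator point can be sketched like this (names and lambdas are illustrative; whether the JIT actually devirtualizes the call site depends on inlining decisions you can't control from source):

```java
import java.util.Arrays;
import java.util.Comparator;

public class MegamorphicSort {
    // Each lambda compiles to its own class, so once all three are used,
    // the comparator.compare() call site inside the shared sort code sees
    // three receiver types. That makes it megamorphic, and it typically
    // won't be devirtualized unless the whole sort is inlined per caller.
    static final Comparator<Integer> ASC  = (a, b) -> Integer.compare(a, b);
    static final Comparator<Integer> DESC = (a, b) -> Integer.compare(b, a);
    static final Comparator<Integer> ABS  =
        (a, b) -> Integer.compare(Math.abs(a), Math.abs(b));

    static Integer[] sortedCopy(Integer[] in, Comparator<Integer> cmp) {
        Integer[] out = in.clone();
        Arrays.sort(out, cmp); // cmp.compare() is the shared call site
        return out;
    }

    public static void main(String[] args) {
        Integer[] data = { 3, -1, 2 };
        System.out.println(Arrays.toString(sortedCopy(data, ASC)));  // [-1, 2, 3]
        System.out.println(Arrays.toString(sortedCopy(data, DESC))); // [3, 2, -1]
    }
}
```

A C++ `std::sort` with a comparator type as a template parameter, or a Rust sort generic over a closure type, stamps out a separate copy of the sort per comparator, so the equivalent call is direct by construction.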
Why? The JVM has complete knowledge of the entire code base at runtime. It knows which methods require virtual function calls and which ones are just regular function calls. If nothing extends the class, then there will be no virtual functions in the entire class. If something extends the class but does not override any methods, then there again will be no virtual functions. If a class overrides a single method, only that method is going to be a virtual function.
Sure, but if you have the same preferences (throughput over latency), you'll find that there are no performance benefits to be gained from non-virtual functions in any JITed language. The "final" keyword is just there for documentation.
Virtual or not isn't about performance; it is about system architecture. Virtual methods are structurally about implementation: to exactly the degree that the public interface matches the inheritance interface, the abstraction is a failure.
At least, if you are being object-oriented, which Java tries to force on you. Of course, you are free to violate that expectation, and sometimes you must, since Java offers no other means of organization; if you do, more power to you.
The issue is that when you are trying to understand or modify someone else's code, it always comes down to confusing implementation details rather than the abstract architecture on top of them.
Oh yeah, I've heard this so many times - JVM can optimize all that dynamism out. Except in cases when it can't or just won't.
The reality is, it is very far from free. Most Java developers are simply not aware of the real cost. Then they are surprised how the code gets 5x speedup and needs 20x less memory after rewriting it to Rust or C++.
You don't have the option of telling Java whether to use "that dynamism". All Java calls are virtual calls, in most simple cases the JIT optimizes it out (i.e. single implementer of an interface), sometimes it can't.
This isn't just speculation about what the JVM does; you can examine the machine code and inlining decisions produced by the JIT compiler to verify whether this optimization actually happens.
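One way to check, as a small experiment rather than a guarantee: run a hot monomorphic call under HotSpot's diagnostic inlining log. The class and method names below are made up for illustration.

```java
// Run with:
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining Devirt
// and look for lines reporting Square::area being inlined into totalArea.
public class Devirt {
    interface Shape {
        double area();
    }

    static class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static double totalArea(Shape[] shapes) {
        double sum = 0;
        for (Shape s : shapes) {
            sum += s.area(); // monomorphic: only Square is ever loaded
        }
        return sum;
    }

    public static void main(String[] args) {
        Shape[] shapes = new Shape[100_000];
        for (int i = 0; i < shapes.length; i++) shapes[i] = new Square(2.0);
        double sum = 0;
        // Repeat enough for totalArea to get hot and JIT-compiled.
        for (int i = 0; i < 100; i++) sum = totalArea(shapes);
        System.out.println(sum); // 400000.0
    }
}
```

Tools like JITWatch can also visualize the same logs if you prefer not to read the raw output.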
Do you have real-world examples of this 5x speedup from this decade? 20x memory I can sort of see because the GC overhead can be pretty nasty in some edge cases, but I'd expect closer to 1.5-2x speedup from C++ if the Java code is anywhere near well written.
This is just Rust code. Where is the equivalent Java code?
Like, the 50x-100x memory consumption is highly suspicious. If Java uses 50x the memory compared to C++, how come I can allocate a long[Integer.MAX_VALUE - 10] on a machine with 32 GB of RAM? Shouldn't the process require on the order of 0.8 TB of RAM if this claim were correct? Or can C++ allocate a 2-billion-element array of longs in 40 MB of RAM? If so, let me know; I would be very interested in using this novel compression technology.
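The arithmetic behind those numbers, as a back-of-the-envelope check (a sketch, not a measurement):

```java
public class MemoryMath {
    static long arrayBytes(long elements) {
        return elements * 8L; // 8 bytes per long, ignoring the array header
    }

    public static void main(String[] args) {
        long elems = Integer.MAX_VALUE - 10L; // the array size from the comment
        long bytes = arrayBytes(elems);       // 17_179_869_096 bytes, ~16 GiB
        System.out.printf("raw data: %.1f GiB%n", bytes / (double) (1L << 30));
        // A 50x overhead on ~16 GiB of raw data would indeed be ~0.86 TB.
        System.out.printf("at 50x overhead: %.2f TB%n", 50.0 * bytes / 1e12);
    }
}
```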
Why aren't you using OptionalLong[1]? You shouldn't use Optional<Long>; that's never a good choice. At any rate, nobody should be claiming Java optionals are free, they're a high level abstraction and absolutely do not belong in hot codepaths.
In general it's fairly easy to construct benchmarks that favor any particular language, which is why you constantly see these blog posts about how high level interpreted languages (JS, PHP, Haskell) are faster than C++.
You can easily construct "comparisons" that make JVM languages look superior to C++ as well, just carelessly allocate throwaway objects of different sizes and lifetimes (like you can in Java), oh no, why is C++ slowing down? Surely there's no heap fragmentation! That's a bad faith benchmark though. It doesn't really demonstrate anything other than that C++ following Java idioms isn't very good.
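The OptionalLong vs Optional<Long> distinction above comes down to boxing. A minimal sketch (the method names are made up for illustration):

```java
import java.util.Optional;
import java.util.OptionalLong;

public class OptionalBoxing {
    // Optional<Long> allocates an Optional wrapper AND boxes the long into
    // a Long object. OptionalLong stores the primitive directly, so it at
    // least avoids the extra box; it still allocates the OptionalLong
    // itself unless escape analysis eliminates it.
    static Optional<Long> boxed(long x) {
        return Optional.of(x); // autoboxes x to Long
    }

    static OptionalLong unboxed(long x) {
        return OptionalLong.of(x); // keeps the primitive
    }

    public static void main(String[] args) {
        System.out.println(boxed(42L).get());         // 42
        System.out.println(unboxed(42L).getAsLong()); // 42
    }
}
```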
> And what do I do with that knowledge if it turns out the optimization didn't happen?
The way the JIT works is by making aggressive assumptions and then recompiling with more general interpretations of the code when those assumptions turn out to be false. But the wider problem of compilers occasionally generating suboptimal instructions isn't Java-specific.
> At any rate, nobody should be claiming Java optionals are free, they're a high level abstraction and absolutely do not belong in hot codepaths.
1. In this particular case you might be lucky, because someone provided a hand-coded, specialized workaround. But that was not the purpose of that benchmark. And in bigger code bases you often are not that lucky or don't have time to roll your own, so you must rely on generic optimizations. Sure, with Java you may get quite close to C by forgetting OOP and implementing everything on ints or longs in a giant array. But that defeats the purpose of using Java, and it would make it lower-level and less productive than C.
2. One of the commenters on Reddit actually tried OptionalLong, and it did not help. See the comments section; there should be a link somewhere.
3. I can use this high-level abstraction in C++ at negligible cost in hot paths.
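The "everything on longs in a giant array" style mentioned in point 1 looks roughly like this (a sketch with invented names, not code from the benchmark):

```java
// Struct-of-arrays style: a list of 2D points stored as parallel long
// arrays instead of one Point object per element. No per-element
// allocation, no pointer chasing, but also none of Java's usual
// abstraction, which is the trade-off the comment is complaining about.
public class PointsSoA {
    final long[] xs;
    final long[] ys;
    int size;

    PointsSoA(int capacity) {
        xs = new long[capacity];
        ys = new long[capacity];
    }

    void add(long x, long y) {
        xs[size] = x;
        ys[size] = y;
        size++;
    }

    long sumX() {
        long sum = 0;
        for (int i = 0; i < size; i++) {
            sum += xs[i]; // tight loop over a flat primitive array
        }
        return sum;
    }

    public static void main(String[] args) {
        PointsSoA pts = new PointsSoA(3);
        pts.add(1, 10);
        pts.add(2, 20);
        pts.add(3, 30);
        System.out.println(pts.sumX()); // prints 6
    }
}
```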
> This is just Rust code. Where is the equivalent Java code?
You probably won't find exactly equivalent code for software bigger than a tiny microbenchmark. The closest you can get are other tools built for a similar purpose, e.g. cassandra-stress or nosqlbench. I can assure you that the majority of CPU consumption in those benchmarking tools comes from the database driver, not the tool itself. And comparing a tool using a well-optimized, battle-tested Java driver with a similar tool using a C++ or Rust driver can already tell you something about the performance of those drivers. In general I found that all of the C++ drivers and the Rust driver for Cassandra are significantly more efficient than the Java one. Fortunately, outside of the area of benchmarking, that might not matter at all, because in many cases it is the server that is the bottleneck. Actually, all those drivers have excellent performance and have been optimized far more than typical application code out there.
> Like the 50x-100x memory consumption is highly suspicious.
This isn't a linear scaling factor. It applies to this particular case only. The reasons this number is so huge are:
1. the Rust tool runs in a fraction of the memory needed even for the JVM alone; Rust has a very tiny runtime
2. the Java tools are configured for speed, so they don't even specify -Xmx and just let GC autotuning configure that. And I guess the GC overshoots by a large margin, because it often ends up at levels of ~1 GB. It could likely be tuned down, but at the expense of speed.
Better tested, maybe, I don't know but I believe you.
More portable, yes, but it is complicated. Java runs on a VM, so it gets its portability from there: if your platform runs the VM, it will run the project. However, as a user, I still had issues with using the right VM at the right version; OpenJDK and Oracle JDK are not completely interchangeable. Messing with the classpath and libraries, too. Not so different from C actually, but at the JVM level instead of the platform level, the advantage being that it is easier to switch JVMs than to switch platforms.