> What greater 'tax' could you possibly conceive of than a whopping 38%
I always find it fascinating looking at the number of lines of code which do the same thing in different languages. FoundationDB maintains official bindings in Java, Go, Ruby and Python. The implementations have almost complete feature parity. They're all maintained by the same team, and as far as I can tell they all follow best practices in their respective languages.
The sizes (minus comments and blank lines) are[1]:
Python: 4053
Ruby: 2397
Go: 3968
Java: 10077
38% seems like a lot but language expressiveness dominates that. If ruby and java code were equally difficult to write, you'd be paying a 320% tax moving from ruby to java for an equivalent program.
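For anyone who wants to check the arithmetic, here is a minimal sketch that recomputes the size ratios from the four line counts quoted above. The helper name `extra_percent` is made up for illustration; the only inputs are the tokei numbers from this comment.

```python
# Line counts quoted in the comment above (tokei "code lines").
loc = {"Python": 4053, "Ruby": 2397, "Go": 3968, "Java": 10077}


def extra_percent(base: str, other: str) -> float:
    """Percentage by which `other` is larger than `base` (hypothetical helper)."""
    return (loc[other] / loc[base] - 1) * 100


# Compare every binding against the smallest one (Ruby).
for lang, lines in loc.items():
    print(f"{lang:6s} {lines:6d} lines, {extra_percent('Ruby', lang):+.0f}% vs Ruby")
```

Java comes out roughly 320% larger than Ruby, which is where the "320% tax" figure comes from. This only compares size; it says nothing about how hard each line was to write.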
I'd love to see a "programming language benchmark game" that only allowed idiomatic code and compared languages based on code length.
I have no idea what the equivalent size ratio is for javascript and typescript, but having worked with both, I find typescript projects end up bigger. I write more typescript because the type system and the tooling encourages verbose and explicit over terse and clever. (In typescript I end up with more classes, longer function names and fewer polymorphic higher order functions).
The typescript tax is real. That said, I believe the defect rate decreases by 38% too. Depending on what you're doing, the tax might be worth it.
[1] Measured from commit 48d84faa3 using tokei "code lines". Includes JNI binding code and makefiles, but not documentation
> Java size has nothing to do with types and everything to do with the language.
Yes. I'm making two claims. First to disagree with the GP's claim that a 38% difference between languages was 'whopping':
> What greater 'tax' could you possibly conceive of than a whopping 38%
The fact that the type system isn't entirely to blame is also clear looking at the ruby and python sizes. Ruby and python have very similar type systems, but the ruby code is only about 60% of the size of its python equivalent.
My second claim is that if we did the same comparison with javascript and typescript, typescript would be bigger. I don't have any stats for this though - just lots of experience. The type system seems to push code in a bit more of a classical OO direction because there aren't higher kinded types, associated types, or any of that fun stuff.
As others have said, it'd be fascinating to compare typescript and purescript / elm codebases. I'd expect the haskell-derived languages would come out way ahead on expressiveness. It's constantly surprising to me that we don't have any actual numbers.
> First to disagree with the GP's claim that a 38% difference between languages was 'whopping'
??? You're comparing LOCs and number of bugs. That 320% larger java code might have 50% fewer bugs and may have taken 50% less time to write. Terse code is harder and usually takes longer to write than expressive code.
When writing in Ruby the "what the hell is this object I'm dealing with here and what methods does it have available?" Tax is like 320% for me. I love Ruby. It's so much fun and it's just neat in general but I find it much more taxing than a nice statically typed codebase I can navigate and reason about with ease.
Everything is just taxed differently. No such thing as duty free programming.
Is LOC really a significant measurement? Sure, you have to type more characters when working with a strictly typed language, but that takes a couple of seconds, as opposed to however long it might take to track down a bug in a massive enterprise system. Not to mention the type system often forces you to consider cases you might otherwise overlook when writing dynamic code, hopefully resulting in better code to begin with; it imposes a bit of discipline.
In my opinion whatever perceived tax a type system entails is superficial. Sure, coming from a dynamic language, having to specify types for everything seems like taking on an extra chore, but this is just the surface experience of using a type system. Once you’ve used one for a while and, furthermore, had the benefit of reading typed code, you really start to appreciate having types around, and any so-called “tax” involved quickly becomes negligible.
While it might be a “tax” when writing code, it’s definitely a boon when reading code: it’s much faster to look at types to get a handle on an API than it is to dig through the actual implementation to decipher what exactly a function expects and returns. When using a Haskell library, for example, I rarely have to read documentation extensively or read implementations; looking at the types of functions and a brief (one-sentence) description is often sufficient. It takes a couple of seconds. This is the norm with Haskell code. By contrast, it is the exception for dynamic languages: if the documentation isn’t superb you always wind up having to peek at the implementations at some point, which typically takes longer than glancing at types and a one-sentence description.
Dynamic languages might be more “expressive” from a writer’s perspective (since you can leave out types) but they’re far from expressive from a reader’s perspective, since the writer’s convenience often puts the onus of sussing things out on the reader. In enterprise systems, reader convenience is arguably more important.
15-50 being quite a big range and the 2004 publication date of the book (the comments hint at the reference being earlier) make me reluctant to take it at face value without looking into it in more depth, especially regarding new technology like TS, Rust, or enterprise Haskell.
In the original post AirBnB said 38% of their bugs could be prevented by static typing. This would bring this to 0.9-3%. Not insignificant, but LoC is still more important.
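One way to reproduce those numbers, as a rough sketch: this assumes the "15-50" above means defects per 1000 lines of code (the thread does not spell that out here) and that static typing removes a flat 38% of them. The function name is made up for illustration.

```python
# Assumption: "15-50" is read as defects per 1000 lines of code.
defects_per_kloc = (15, 50)
# Share of bugs the original post says static typing could have prevented.
preventable = 0.38


def remaining_rate_percent(per_kloc: float, prevented: float) -> float:
    """Defects left per 100 lines after removing the preventable share."""
    return per_kloc / 1000 * (1 - prevented) * 100


low, high = (remaining_rate_percent(d, preventable) for d in defects_per_kloc)
print(f"{low:.1f}% to {high:.1f}% of lines")
```

Under that reading, the remaining defect rate lands at about 0.9% to 3.1% of lines, which matches the 0.9-3% range quoted above. A different reading of "15-50" would give different numbers.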
Haskell has the reputation that if it compiles, it works.
Whether it’s true or not, it would be interesting to consider languages in their ability to produce correct code more quickly.
While I don’t like the idea of writing 3x the amount of code, if one implementation is only 30% longer, for example, and is more likely correct on the first run, it’s still a huge productivity improvement.
This is my experience with statically typed languages. It may be a bit more verbose in the implementation, but it takes much less time to reach a correct solution.
I find that whatever "savings" there are in terms of code volume in dynamically typed languages is usually more than made up for in the need for more tests.
Java has a lot more boilerplate—or near boilerplate—than the others. For example, you probably want to implement some kind of toString method, if only for your own sanity during debugging. You can do that in Python, but __str__ is often good enough.
I also wonder how many Java "lines" are just a closing brace or something trivial.
> I'd love to see a "programming language benchmark game" that only allowed idiomatic code and compared languages based on code length.
You just described the old Debian language benchmark game. You could select the various coefficients to apply to code size, program speed etc. I remember it played an important role when I decided to give OCaml a try: I selected 1 for size, 1 for speed, 1 for RAM and zero for everything else, and that resulted in OCaml and Pascal, the two languages I was prejudiced against from school years. Of course I went with OCaml.
Java is a terrible exemplar for a statically typed language. It's incredibly verbose, and is often held up as a bad example for how to do typing. More modern languages manage to have type systems which are more effective at identifying issues at compile time while also being more flexible in terms of the design patterns they allow.
You might have also noticed that Go is statically typed, and it's actually fewer lines of code than Python (maybe a lot fewer if trailing braces are counted in that LOC number).