Ugh... The type hierarchy of Python is painfully intertwined with the implementation of its type system. That implementation varies by Python implementation, and since the Python language is as old as Visual Basic, Borland Pascal, Werther's caramels, and trilobites, it varies dramatically over time. Add to that the fact that Python has a philosophy of wearing leaky abstractions proudly on its sleeve and barfing implementation-level details at its users.
Python really is one of the most complex, arbitrary, and poorly abstracted languages in popular use. I have no idea how anyone lives with it for anything except writing extremely repetitive, formulaic, and superficial code, because having to dig into its guts feels like a nightmare. Even Java's type system, as antiquated and simplistic as it is, is easier to understand and work with by comparison.
For all its criticisms, it's an unusually good language for just getting shit done.
Sure, you can build whatever you want in almost any language you want, but 90% of those will be more verbose, harder to read, require compile steps, and be 10x the LOC, and take 4x longer to write... You can have it; I'll keep writing Python.
> unusually good language for just getting shit done.
I agree, but only for very specific/small values of "shit".
A codebase with thousands of lines of Python is IMHO already stretching the limits of readability; when you get to the tens/hundreds of thousands of lines, it's definitely past them.
I've tried and used 10-15 programming languages in my days, and while I haven't used them all professionally Python is still by far my preferred language out of all that I've tried. TypeScript comes in at a close second, although it depends a lot on what you use it for of course.
I've managed to write both buggy and working code in all languages, and can't say that Python is more buggy than any other. Slower at runtime, for sure. But not less correct.
Not the person you replied to, but I was a big Python fan until I found Scala. I try to write code that looks like Python (which means avoiding some libraries that rely too heavily on symbol-heavy names), but the type system can help me check that I really know the things I think I know, and even in small Python scripts there are conveniences that I miss (e.g. case classes - though nowadays attrs is more or less equivalent).
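For anyone who hasn't seen it, a tiny sketch of what I mean by attrs closing that gap (the class here is made up):

    import attr

    @attr.s(auto_attribs=True, frozen=True)
    class Point:
        x: float
        y: float

    # __init__, __repr__, __eq__ and immutability come for free,
    # much like a Scala case class.
    assert Point(1.0, 2.0) == Point(1.0, 2.0)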
Correctness is not necessarily the issue. The hard thing with large Python codebases is that the lack of types and data definitions makes it hard to understand what data any piece of code is operating on. This may not be an issue for a project written/read by a single person, but when working on big team projects it's a different story.
Any statically-typed language will avoid this issue.
Python supports type annotations and static typing via mypy and co. I find statically typed Python is absolutely comparable to other statically typed languages. At least I don't feel much of a difference working with it compared to go, typescript.
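For anyone who hasn't tried it, a rough sketch of what that looks like in practice (the names here are made up); mypy flags the bad call before anything runs:

    from dataclasses import dataclass

    @dataclass
    class User:
        name: str
        age: int

    def greeting(user: User, shout: bool = False) -> str:
        msg = f"Hello, {user.name}"
        return msg.upper() if shout else msg

    greeting(User("Ada", 36))   # fine
    # greeting("Ada")           # mypy: incompatible type "str"; expected "User"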
I think Python is the language I've written the largest variety of programs in.
I recently tried to get a stream of controller locations from Steam VR.
I was stumbling around looking for information about that in any language and the thing became a breeze only after I found Python bindings for it.
I've never done commercial work in Python in my life. I did PHP, some Java, some Ruby, C# and JavaScript. But I keep encountering Python in stuff I do out of curiosity.
> I have no idea how anyone lives with it for anything except writing extremely repetitive, formulaic, and superficial code, because having to dig into its guts feels like a nightmare.
It’s interesting that Python powers an outsized chunk of contemporary scientific research.
Perhaps its fitness for purpose is a result of its messiness, or perhaps that messiness maps better to the way that many people think? Would a more rigorously-typed alternate-universe Python have been as successful?
> Would a more rigorously-typed alternate-universe Python have been as successful?
One of the reasons Python became popular was its reputation as “executable pseudocode”. Rigorous typing would detract from that.
Guido and the authors of Mypy clearly believe that now the language is popular, it can afford to add explicit types and become less like pseudocode. But I don’t think it would have become as popular had it taken this direction from the beginning.
> One of the reasons Python became popular was its reputation as “executable pseudocode”. Rigorous typing would detract from that.
Not necessarily, if it had good type inference. I write Scala that looks pretty close to Python (unfortunately the community as a whole makes extensive use of symbolic names, which I think is a mistake, but you can avoid those libraries), and with Scala 3's indentation-based syntax it can become even more so.
Hmm, now that I look at it my personal code is mostly gnarly functional libraries. https://github.com/m50d/plus-minus-zero/ is probably the project that's closest to something I'd do in Python, but I've still used a fancy reactive library and some functional tools for dealing with the explicit async futures.
(It's a quick calculator tool to compute your European Mahjong Association rank score based on your tournament results, and can import your existing results through some quick and dirty web scraping. The domain linked from github has expired, but it's running at https://adoring-khorana-c20461.netlify.app/ - Scala usually runs on the JVM but this build is set up to compile to JS and run in the browser).
> Would a more rigorously-typed alternate-universe Python have been as successful?
Yes, perhaps even more so. It certainly would've been easier for alternative implementations like PyPy.
Maybe even easier to make the transition to Python3 if the interface had been better designed.
But, hey, with all its faults I still think it's really great and deserves a lot of credit. That said, it could be a good time for Nim or some other new language to shine.
You make the common assumption that success is caused by some property of the thing itself, rather than by the most important criterion: being in the right place at the right time.
It would not surprise me at all if, in an alternate universe, the exact same language had been released a mere six months later and been forgotten into obscurity.
Microsoft is undeniably the giant it is today largely because at that exact moment IBM needed a cheap operating system for its line of personal computers; that Bill Gates had a family connection to IBM's leadership at the time and could help arrange that deal certainly also helped.
The success of many giant companies can be traced to a pivotal event that was sheer dumb luck — for FedEx, this event was actually gambling in a casino with what money they had left when they were close to bankruptcy.
You're right that that observation explains the cause of many things.
I fully agree that you can trace the shape and popularity of C# to many specific "place and time" business events.
However, Python is different. When it got traction it wasn't the only thing around, and its usage over the years follows a curve that no other language is following.
Because of that I believe there's more to Python's popularity than just the circumstances of its birth and development.
IMO it's because researchers like it: you can slap together some super sloppy, hacky mess of a prototype and it will usually work. Stricter languages stop you from generating pure chaos.
Researchers also love excel, Fortran, and Matlab. Let that sink in for a bit.
This is true. I primarily write research code as well as some code that goes to prod.
I love writing Python for exactly the reason you've mentioned. If something needs to be rewritten in a "proper" language, we've got guys who can do that quickly, and I don't need to worry about them that much. Did I mention those guys usually get paid half as much as I do? As long as things work, man.
> Would a more rigorously-typed alternate-universe Python have been as successful?
Maybe at some point we'll get TypedPython (like we got TypeScript).
In my opinion that's the way language design should be done.
First discover what semantics people want to use to express themselves, and if you get that right enough, then try to describe those semantics with a type system as rich as you need to let people express themselves more precisely.
Starting by designing the type system is too hard and leads to the kind of fiasco that kept Haskell in obscurity and spawned a whole slew of dynamic languages.
Although maybe we got better at making flexible enough type systems... Rust seems to be doing a lot of stuff right.
We already have TypedPython. It's called Mypy, and any organization doing serious or larger scale Python development should already have this hooked up to their CI.
Together with PyCharm you get almost the same level of fearless refactoring that only C# or Java can offer.
It's a bit shocking to see people not using it or not even being aware of its existence, because without it large codebases quickly become a steaming pile of unsafe mess.
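For reference, the setup is small. Something along these lines (the options and paths here are just an example, not a prescription):

    # mypy.ini (or the [tool.mypy] table in pyproject.toml)
    [mypy]
    python_version = 3.8
    disallow_untyped_defs = True
    no_implicit_optional = True
    warn_return_any = True
    warn_unused_ignores = True
    ignore_missing_imports = True

Then the CI job is basically `pip install mypy && mypy your_package/`, failing the build on any error.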
The same arguments I used to make against MATLAB are used against Python on HN. Yet both are as successful as or more successful than your favorite language (from where I’m sitting).
Sure! In Python 2.x, "class" was an instance of "type." In Python 3.x, "type" is an instance of "class." (This is, of course, an over-simplification of a much more complicated reality.)
Could you expand on that? I’m not sure what exactly you’re saying, because you’re not using standard or unambiguous terminology. (`type` is a thing, but “class” isn’t a thing, but rather a category of things, a syntactic construct.)
Python 2:
>>> class OldStyleClass: pass
>>> isinstance(OldStyleClass, type)
False
>>> class NewStyleClass(object): pass
>>> isinstance(NewStyleClass, type)
True
Python 3:
>>> class Class: pass
>>> isinstance(Class, type)
True
Old-style classes were kind of their own thing in their own space, but with new-style classes, classes are to `type` as instances are to `object`, and I don’t recall there being any major 2/3 difference beyond the simple removal of old-style classes.
The 'class' versus 'type' distinction here is actually arbitrary. The same C level code is responsible for the message that you get from 'type(whatever)' in both Python 2 and Python 3, but in Python 2 it drew a distinction between heap-allocated things (which were reported as 'class') and things that were not heap allocated (which were reported as 'type'). Non heap allocated things had to be created in C; heap allocated things were usually implemented in Python and were usually made with 'class X(base): ...'.
(This change was introduced in Python 3.0a5, bug #2565. Looking at the bug, this is a followup of making the type() of new style classes be reported as 'class ...', but preserving old behavior of type() for built-ins, done in 2001.)
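Concretely, the only user-visible difference was in the repr (reconstructed from memory, so treat the exact strings as approximate):
Python 2:
>>> class Foo(object): pass
>>> type(Foo())
<class '__main__.Foo'>
>>> type(1)
<type 'int'>
Python 3:
>>> type(1)
<class 'int'>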
Ah, gotcha. But I don’t think that that’s actually a difference—it’s just a slight terminology change around “type” and “class”, which I think may have been related to tidying up old-style classes. (That is: yeah, if you’re comparing old-style classes, it may have been a difference (I’m not certain), but I believe it’s completely superficial once you’re comparing the recommended form of classes for the last quite a few years of Python 2.)
As I recall, in the early 1.x days it wasn't apparent that the class/type dichotomy was a problem. It took a lot of work during the 2.x era to bring them together.
Thanks for the correction; it’s about five years since I’ve done any serious work in Python, and six or seven since I last did any involved metaprogramming, and I assiduously avoided old-style classes even then, and so I had completely forgotten about types.ClassType. Python 2.4 was the oldest version I ever worked with, also.
> I have no idea how anyone lives with it for anything except writing extremely repetitive, formulaic, and superficial code, because having to dig into its guts feels like a nightmare.
This is sort of why I like Python: it prioritizes getting something working today and trades off being able to build a complex system over several years. Java is a great language if you're going to build a large-scale application and want a bunch of developers across time and space to productively contribute to it. But business requirements change, and if you can get what you want done by using a series of applications that live for a few weeks or months, why wouldn't you do that?
Another commenter mentioned scientific computing, which is a good example. When you're evaluating hypotheses, it's important to be able to write new code with high throughput. That means you need enough power of abstraction to be able to have libraries like NumPy etc., but you don't want so much structure that it's hard to write 10 lines of code and validate or falsify your gut feeling and then move on. Python was built around a REPL; Java only grew JShell fairly recently, and Jupyter was built for Python, not Java.
And even when you're trying to reproduce old work, it's a lot more scientifically valuable to have someone be able to look at the paper (the spec) and write a new implementation than to say "Start with this 100 kloc Java codebase which we wrote for Java 1.2 but still works today." There are of course useful applications for that capability and it's great that Java supports it, but this isn't one of those applications.
I don't do scientific Python myself, but I support users who do and I do a lot of infrastructure work. For that, it's still much more valuable to be able to write new code and get rid of it than to work within an old and well-built system. If I want to answer questions of, say, what our storage use patterns look like so the business can figure out what to invest in, it's a lot more valuable to be able to take that problem statement and produce an answer quickly than to build a long-lasting application that can keep producing answers to variants of that question, because the next question isn't likely to look too similar in terms of implementation.
Or maybe put it this way - if you're the IRS, and you're writing code to process and verify people's tax returns, Java sounds great. You're going to want to keep the tax code from N years ago still working, and you're going to want this application to live many years. But for me, I have a little program to do the math for me to fill out my taxes, and that program is in Python, because I have no need to keep maintaining the script when the tax code changes (or when I move states), and I'm the only author. I make a copy of the Python file each year and I drop anything irrelevant. The ability for me to read the entire script top to bottom and be convinced "OK, this solves this one task accurately" is far more valuable than the program being able to solve hundreds of tasks it needed to solve in the past.
This is a traditional argument and I understand where you're coming from.
But look at where it got us. At every company where I've worked with data scientists, every piece of "data science" code is written twice -- once in Python, and again in another language. We had to hire at least one software engineer for every data scientist. There is an entire industry that targets "productionalizing" things written in Python by data scientists, because Python code is not production-ready code.
Python for education? Absolutely. Python for whiteboard interviews? Great. Python as a DSL for data science? Obviously. Python as a scripting language? Sure.
Python as a production-ready language for a growing company? I have spoken to people at lots of companies that started with Python, and then had to dedicate months or years for a full rewrite. And if they have people writing Python code, they have hired more people to rewrite that code immediately so it's not in Python anymore. I say "bye" to every interviewer who pitches me on a job writing Python: every time I have looked into an opportunity like that, it was an attempt to throw fresh meat at rewriting a creaky, unmaintainable Django monstrosity. This isn't me being an armchair philosopher, this is the industry around me.
I can agree that large projects in dynamically typed languages can be unwieldy without type hinting, but there are tools to make it manageable (Ruby, JavaScript, and PHP are no different with regard to typing).
On that list, I only have first-hand experience with Uber. The Uber entry on that list links to a blog post from Uber engineering. This four-year-old blog post says:
> We rip out and replace older Python code
How many other companies on that list also "use" Python the same way that Uber "uses" Python?
If you follow the link next to that text, it says that they're ripping out sync Python using Flask from their monolith and replacing it with async Python using Tornado in a microservice, though some teams are also exploring Go.
Which seems like an entirely reasonable way to use Python (no quotation marks needed), and exactly what I'm advocating - Python is a language where you can ship something today and reimplement it next year, also in Python, for the same engineering effort that you'd spend doing it once in a more highly structured language. Alternatively, you can reimplement it in another language. You can safely rip out and replace the original Python, because it's a language that optimizes for humans both reading and writing it.
And just about every place I've worked, business requirements are constantly changing, and the scale and structure of the company (and associated Conway's Law implications) are changing, and so code you write today is going to be tech debt in a year anyway. A language that encourages you to write outrippable code and makes it easy to replace it is your ally under these conditions.
(Put another way: Python is a language that is readable enough to avoid the https://www.joelonsoftware.com/2000/04/06/things-you-should-... trap, which is fundamentally about code that is so complex that a human can't figure out all of what's going on and the only safe way is to treat the existing code as a relic.)
Again, I'm not saying this is the language for everyone to use for all cases. There are cases where you want to make the code a little harder for humans to read and write so that the computer can help you with things. If that is indeed your use case, go write Java! But I think there's plenty of stuff you can call "production-ready" that doesn't fit this particular mold.
I wrote that I have first-hand experience at Uber because I was ripping out that Python. There is no more Python, and certainly no more has been added in the four years since that blog post was written. Python is only for scripting data pipelines and automation. Almost everything else is Java/Go.
Python isn't unwieldy only because of dynamic typing, but also because it eschews functional idioms. JavaScript is much better in this regard, and it makes the code much more maintainable.
Sure: `reduce()` was removed from the builtins between Python 2 and Python 3 (it survives only in `functools`). Here is Guido arguing that `lambda` and several other functional builtins should be removed as well.[1]
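To make the point concrete (a trivial sketch): the functional spelling still works, it has just been pushed out of the builtins, and the blessed answer is usually a builtin or a comprehension anyway.

    from functools import reduce
    from operator import add

    total = reduce(add, [1, 2, 3, 4])   # still works, but now lives in functools
    total = sum([1, 2, 3, 4])           # the spelling the language nudges you toward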
To the point about data scientists - there isn't really a way you could solely hire the software engineers, right? So there is value in Python allowing the data scientists to iterate much quicker than they could if they were writing in a "production-ready" language.
I agree there's work to be done in closing that gap so you don't need an extra software engineer for productionalizing mostly-working Python (and I'm excited about tools for managing large-scale Python - e.g., several of my coworkers are trying out MyPy, which I haven't personally felt too much of a need for but seems like it could help), but the gap exists precisely because you can write something working in Python quickly, and it's not so much of an investment that you'd feel bad throwing it away if it doesn't work.
The company that started with Python and spent years on a full rewrite made enough money with Python to survive those years. If they had started in a more "production-ready" language they might not have shipped at all, and if they did they might not have shipped the right thing.
And at least for me personally, as an infra person, the question I'm evaluated on at the end of the day (or year) is "Did the infrastructure work," not "Did you write production-quality software to support the infrastructure." Some of the most critical software I've written (across multiple companies) has been 50 lines of Python shared via a Slack snippet or a shared homedir and only retreated into version control months later. There are a lot of problems that genuinely require only 50 lines of complexity, and the ceremony of a language like Java makes it much harder to understand what's going on. For those problems that do require unmanageable Django monoliths, by all means, write it in something else.
Given that 90% of companies fail, maybe that's a good tradeoff. Build the thing that might work in Python, see if it gains any traction (the famous product-market fit), and then rewrite it if it worked. If Python saves you more than 10% of the effort on the first write, then on average it's worthwhile.
Thankfully, this rewriting has never been my job. Companies where I've seen it were typically targeting Spark as the execution environment, so the production languages were Java and Scala, at a ratio of about 2:1. PySpark in production was either disallowed up front or quickly disallowed retroactively after experiencing the magic and delight of data scientists shipping production PySpark.
I don’t dislike Python, but as far as I know reproducibility is a big problem in scientific circles (in both senses, actually, but here I mean copy-pasted code not running at all).
ML is famous for it and python is the most used language there.
Reproducibility would be a problem in scientific circles regardless of language. Researchers are negatively incentivized on reproducibility, to avoid getting scooped and the like.
You're right, reproducibility is a problem in science, all science and not just ML (again, as you say), so I don't think that proves anything wrong with Python.
That is true, but maybe Python doesn't help, with its primarily interpreter-based ecosystem (think of Jupyter), where a result may have only worked for the author because a variable they use refers to something that is no longer in the source code (but is still defined in the running kernel). This bit me more than once with Sage, for example, but it may simply be a problem with Jupyter and the way the scientific community writes code.
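A contrived sketch of that failure mode (the cell contents are made up):

    # In [1]:
    scores = [0.2, 0.8, 0.9]

    # In [2]:  (this cell was later deleted from the notebook)
    threshold = 0.7

    # In [3]:  still "works" for the author because `threshold` lives on in
    # the running kernel; a fresh Restart & Run All raises NameError instead.
    good_scores = [s for s in scores if s > threshold]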
I always love coming into these Python threads. It's extremely cathartic to know there are enough people who hate the language as much as I do, even if we are stuck with it for any number of good reasons.
All of the reasons we're stuck with it related to datascience and scientific computing are the exact reasons why we hate it. It's exceptionally fast to write code that does a very specific task without sparing a single solitary thought for any sort of software development best practices. It's incredibly approachable, which means it can be used by any keen scientific or mathematical minds who spend more time reading and writing whitepapers than they do writing code. The code, as an entity in and of itself, clearly shows that. But the _output_ of that code also shows that, which is why it isn't going to go away or be swapped out for anything else.
This is like the ideal breeding ground for further scientific or academic pursuits, and since everyone is standing on the shoulders of numpy/scipy/sklearn/pytorch-esque giants, you're stuck in the scenario of having to replicate all of these libraries in any other language. I can't honestly say it's worth the effort to replace the entire stack, and it's also not going to help when these same researchers continue to use python going forward and start using some other library that you didn't port yet.
And so, they will continue pumping out new papers and prototypes-masquerading-as-quality-libraries, and we'll continue roping them in and duct-taping fix after fix after fix on this kaleidoscope of horrors, and we'll continue self-medicating to dull the daily frustration of Python while also accepting that it's the price we're going to need to pay if we want to work with these brilliant minds with all their neat new ideas pushing forth such a breadth of new understanding from the data _we already have_. Or at least that's what we say when we try to ignore the growing bald spot and bursts of incandescent rage every time we get a "quick script that needs a little polish" that clocks in at 2000 loc, literally doesn't have a single function definition in it, and has variables that take on up to 14 different types depending on some global state and...
You know, I'm going to stop typing this. It's christmas eve and I wasn't planning on being this angry today
Edit: Some people weren't liking this; it may be because they think I'm being sarcastic when I talk about these brilliant people. I want to just clarify - I am not being sarcastic. They ARE brilliant. They're far more intelligent than I, and they're really pushing the boundaries of human understanding in mathematics and sciences. I acknowledge they bring a whole hell of a lot more good to the table than bad, and there's a damn good reason why I am going to be doing this thing I'm doing in perpetuity, or until I can't handle it anymore.
But that doesn't mean their coding practices aren't often _abysmal_, and I won't apologize for my frustrations in this and the language that just _lets_ them do it.
I'm a deep learning researcher. I have no idea what you're talking about. When I crank out some Pytorch code of questionable quality it's not because of Python. It's because I either don't know better, or don't care. Switching to another language won't change this.
Some languages force you into better practices. The extra overhead you may not want to deal with, and that learning curve, are exactly the kind of friction you want if you want to write code of less questionable quality.
Good Python looks a lot like what you would get from a compiled language, and every researcher I work with looks at it and goes "holy shit this is so much extra work I don't wanna do it"
And like I said, that's fair. The code is not the point, it's just a tool to get the data - that's the point.
I equate it to the one-time-use jigs woodworkers use. Some can get pretty fancy and be awesome, but most are just slapped together and will probably get chucked in the bin. I don't mean to shame anyone - there's a reason for it, and it serves its purpose.
It's when someone hands you that shitty-ass jig, built in a language that lets you do some pretty heinous things by design, that the frustration builds like crazy! I've seen some pretty gnarly Java and C# in my day too, and all I know is the worst Java is still an order of magnitude easier to handle than the worst Python I see on the regular. It's wild.
Can you please show me some examples of what good Python code looks like, and point out the overhead? Next month I will be working on a major redesign of a fairly complicated simulations code (Pytorch) with the goal of making it more flexible, and incorporating some new features. This code will be used by many others, so I want to follow good software engineering practices.
I don't have any offhand, but I didn't want to leave you hanging too long without saying anything either.
The two main things that get me all bent out of shape are purposefully disjoint types passed in as the same parameter, with a bunch of logic around handling each of them (often in very different ways), and never, ever checking the types or ranges of values.
If you start throwing in type hinting and make use of mypy, it keeps your own code pretty coherent. If you do need to have disjoint types coming in, spend a lot of time thinking about how you want it to work. It may be reasonably cheap to force everything into a single type from the many possible types coming in, which should simplify things a lot. If that won't work, consider wrapping any of these types in a composite object that unifies the _how_ of accessing the data inside the type into a single low-cost abstraction. Whatever you do, don't let the logic about how to operate over your abstract data input bleed into the logic of how you're building off of it.
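A minimal sketch of what I mean (all the names here are hypothetical): normalize the messy inputs once at the boundary, so everything downstream only ever sees one boring type.

    from dataclasses import dataclass
    from pathlib import Path
    from typing import List, Sequence, Union

    @dataclass
    class Samples:
        values: List[float]

    def as_samples(raw: Union[Sequence[float], str, Path]) -> Samples:
        # The only place that knows about the three possible input shapes.
        if isinstance(raw, (str, Path)):
            text = Path(raw).read_text()
            return Samples([float(line) for line in text.split()])
        return Samples(list(raw))

    def mean(samples: Samples) -> float:
        # Downstream code never plays the isinstance() guessing game again.
        return sum(samples.values) / len(samples.values)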
This is one of those things that Python makes hard, not because it purposefully stops you, but because it makes it _so very easy_ to spew implementation details through every single function call. It's easy, and programming is hard, and people have deadlines and it's something you can easily convince yourself you don't need to do, and suddenly you're writing bad code and nothing and nobody is there to stop you from it.
I'm a big fan of keeping my type hints inline in my code rather than in separate stub files (typeshed-style); I WANT people to see what types I'm agreeing to support. You don't have to agonize over documentation or dig 14 levels deep into my code to see what I'm handling. The goal here should be for someone to read my function signature and go "oh, okay, I know exactly what I need to provide to use this".
Another vexing thing that comes about ALL the time in dsci/ml code is single letter identifiers. A lot of this is because the paper says this variable is `p` or maybe even `epsilon` so... that's what the variables get named. I've even seen `f()`, `g()`, and `h()` in the wild, and of course there wasn't a lick of documentation around it.
Unless your audience is only ever the people who wrote the paper itself, or those who studied it vigorously (more so than just reading it), these are terrible choices.
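A hypothetical but representative before/after:

    # straight from the paper:
    def f(p, eps):
        return p * (1 - p) < eps

    # the same check, readable without having the paper open:
    def is_low_variance(success_prob: float, tolerance: float) -> bool:
        return success_prob * (1 - success_prob) < tolerance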
This advice is not python specific - it's language agnostic, but I notice it most in Python solely because I run into it the most dealing with non-devs taking a stab at writing their first libraries (or, rather, a bit of polish on their initial prototype). But in general, write your code not so that it's easy to write, but so that it's easy to read, easy to reason about, and doesn't require a chain of whitepapers to understand. The goal is so that someone reading this later doesn't have to literally be you, at this exact moment in time, to understand what is going on. You want your code to live long past your current attention on it, so write it in a way that is easy for another maintainer to pick up and run with.
In the end, other languages make some of this table stakes - not the naming, obviously, but the types? Range checking? Handling error conditions explicitly? Python gives you all the rope in the world necessary to bind yourself into a knot with, so it's all up to the developer to do the right thing. When I lament python, it's not because it's inherently bad, but because allowing people to play fast and loose with the rules means you're going to find a lot of people who just don't give a shit.
You know, what you are describing sounds exactly like most production code I've inherited over the years. It's what imperative programming devolves into. Start at the top, run to the bottom. You see a lot less of this with compiled languages, and I suspect that has a lot to do with the additional complexity of actually compiling the code filtering out people who don't want to be bothered with details. Enjoy what's left of Christmas Eve.
It seems like a lot of criticism of Python is directed at the data science community but I'm curious what the alternative is here?
Is it surprising people trained in math are writing hard to maintain programs? Is biology+Perl any better (I'd argue worse).
What languages exist that are hard to write bad code in? I've worked at places that rewrite abominations of Java every 5 years because they are such coupled, unmaintainable messes.
JPMC has Athena, and Pinterest seems firmly committed to Python. I know some people who worked at Braintree (now part of PayPal) that heavily used Python.
Home Assistant and Ansible are two massive OSS projects
I'm hoping the default choice becomes Julia. For one, it's Lisp. Imagine tensorflow or whatever being distributed as one Julia codebase, no separate C++ to compile, no need because Julia can do it all. I hope it replaces R, Mathematica, Matlab, Excel VBA.
> What languages exist that are hard to write bad code in?
A language with a good type system (ex. Haskell, OCaml, Rust etc.) makes it hard to write bad code, because it eliminates a large class of potentially "bad" programs, and thus provides guarantees about the code that does type check.
If I have an arbitrary Haskell program, it might still be messy and hard to read, but I know I'll have certain guarantees about the code. For example, if a function isn't in IO, it won't have any effects, and replacing it with a function that returns the same values for the same arguments will be safe.
Additionally, the language makes it easy for libraries to have well-designed APIs that ensure the user doesn't forget to make the required checks. For example, removing null and forcing Option/Maybe for functions that might fail forces the caller to handle the failure case, instead of just forgetting to check for null, which is common.
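You can get a watered-down version of that last guarantee even in typed Python: with mypy's strict optional checking, the caller has to deal with the None case before touching the value. A sketch (names made up; it's not a real Maybe, but it catches the "forgot to check" class of bug):

    from typing import Optional

    def find_user(user_id: int) -> Optional[str]:
        return {1: "ada", 2: "grace"}.get(user_id)

    name = find_user(3)
    # name.upper()            # mypy: Item "None" of "Optional[str]" has no attribute "upper"
    if name is not None:
        print(name.upper())   # fine: the None case has been handled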
If they don’t have bad code it’s because it’s a bunch of enthusiasts who are good at coding. Realistically I’m sure there are even now examples of code that would be considered “bad” in any of those languages.
It's more the fact that these languages force certain best practices using their compiler. While Python can be great when used with best practices, its level of freedom makes it easy for the user to get away with obvious mistakes. It's great to have freedom like that when you're writing small projects, but in a large project it could lead to unintended effects that cause more damage than the time lost using a stricter language.
That’s all great in principle, but is there any evidence to support this claim?
Certain properties are of course enforced, so it's impossible for a project to do certain things that make the code harder to read. On the other hand, humans are so creative that I have a hard time imagining that, given time and a wider mix of talent, you won't still have code hygiene issues. Maybe not the exact problems another language might have, but certainly your own flavor will develop as developers have less contact with the core language enthusiasts who establish said best practices.
Again, happy to be proven wrong but I’d like at least anecdotes or some kind of evidence rather than a theoretical argument from first principles that completely ignores the human element.
Words like "any effects" and "safe" have broader meanings than the way they are used in type systems. I ask that you be a bit more precise in your advocacy.
I can replace a sorting method having O(N log N) time with one having O(N^2) time, without changing the type. But it will certainly have an effect on my system, and if the run-time is too long, may result in an unsafe system (eg, in a real-time control system which requires a maximum response time).
While I know what you mean, "bad code" includes things like unacceptable performance.
I also find it hard to understand how to apply type systems to algorithms. What is the type of an algorithm which returns a graph diameter, and how does it differ from one returning the graph eccentricity or other graph property?
(That's where "same values for the same arguments" comes in, but that's that hard part of the problem, yes?)
In a world where unsafePerformIO, error, etc don't exist, sure.
The reason most Haskell codebases are cleaner is because the culture manages to uniquely emphasize good practices such that people don't try to break abstraction barriers.
I'm sure it's possible to write Haskell code that "misbehaves" as much as python code does. Once smart people with no training are let loose on something, that'll definitely happen.
We're lucky that Haskell hasn't been attacked yet :)
> and a replacing it with a function that returns the same values for the same arguments will be safe
No. That is only safe at compile time but not at runtime.
Because the type alone often does not guarantee the meaning and semantics of the values in the specific context and program are the same.
I'm not sure I get the argument here. So long as the input-output mapping for the function is identical, it cannot change the behavior of the program. This is what I would consider safe.
The semantics of values are irrelevant because the function will behave identically regardless of those semantics.
This is only true if the type describes the whole meaning of the values.
If the exit code of a unix program changes from 0 to 1 it still returns an int, but the meaning of that value is a completely different one.
But the exit code can't change from 0 to 1. For any given input to the Unix program, if the output was 0 with one implementation, then it would always remain 0 with any other _equivalent_ implementation.
This is literally what is meant by "returns the same values for the same arguments". Any function that fits this description, by definition, cannot suffer from the problem you are describing.
I often think that stopping people from writing code might be the best way to avoid bad code.
Stricter languages like Rust, Ada, or even F# (quite strongly typed) will not let you compile until you reach a certain level of explicitness/correctness, thereby teaching you some basic things before you can even compile.
Languages like Perl and Python allow you to do a lot of things, which ends up unmaintainable because it's too easy. The easy version then just breaks at runtime.
I don't know if it's related to C classes at all, but an interesting semi-related tidbit: 2.x has a weird distinction between "classic" classes and "new-style" classes that inherit from object. In 3.x, everything descends from object.
And “classes in C” is really just a matter of agreeing on a convention. GObject and COM are object systems that have C implementations.
I suppose it’s a mental thing when you’re “below” the abstract object system instead of getting to treat them as tangible things unto themselves. Like how C doesn’t really have strings — they’re pointers or structs and so you don’t get any sort of encapsulation that’s not enforced by convention. You have to maintain the invariants or only use methods that maintain them.
Could you be specific? The article's targeted at Python 3 and I don't see anything that's really obviously Python 2 only. Did you mean the link in the sidebar maybe? That was written back in 2011 and I don't know whether it's still true or not, but it only talks about "new style" (at the time) classes so it still could be.