The unreasonable effectiveness of the Julia programming language (arstechnica.com)
284 points by nikbackm on Oct 9, 2020 | 212 comments


I quite liked this article, but both the article and the comments here mostly skip over my absolute favorite Julia feature: because Julia is fast end-to-end, you can code in the style that naturally matches your use case and mental models without sacrificing performance. Functional patterns and imperative patterns, vector-based or element-by-element, your types or built-in: they all work well and run quickly!

I do most of my programming in Julia and Python. Python libraries like numpy and pandas are fast and efficient—if you're staying "within the lines" of how the library is designed to work. And most of the time this is okay! But not infrequently I want to do array operations on arrays of user-defined types, or I want to walk through a dataframe row by row without paying a huge performance penalty, etc. And all of a sudden the well-tuned Python ecosystem feels very restrictive.

In Julia, my workflow is roughly: 1) think about the problem. 2) Code an intuitive solution. 3) If necessary, tweak a little bit of code to improve performance by reducing allocations or type instability.

That's a lot less mental work than my Python/Matlab/Java workflow of 1) think about the problem. 2) Think about how the solution can be expressed in the paradigm the language performs well with. 3) Write a solution in that particular paradigm. 4) Tune for performance, which may be awkward if the initial solution was not intuitive.


I believe the essence of what you are saying is that Julia is great for explorative programming, while Python et al are better suited for slightly more structured, more productionized programming.

I personally use Mathematica for this purpose and find it indispensable. I wonder if the PL designers should focus more on this class of languages, instead of features for production languages (like fancy type systems).


> Julia is great for explorative programming, while Python et al are better suited for slightly more structured, more productionized programming

In Julia, explorative programming is productionized programming, you don't have to switch to a different or restricted set of tools to make your program fast, it's the same code that you can profile on the fly. With types, you can patch your hotspots by adding a specific method to the generic function, and it will get invoked automatically instead of the more general case algorithm.
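A toy sketch of what I mean (the function name and types here are made up):

    # generic fallback: works for any iterable of numbers
    mynorm(v) = sqrt(sum(x -> x^2, v))

    # later, when profiling shows a hotspot, add a specialized method;
    # callers don't change - dispatch picks this one automatically
    function mynorm(v::Vector{Float64})
        s = 0.0
        @inbounds @simd for x in v
            s += x * x
        end
        return sqrt(s)
    end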


How fast is Julia with respect to CUDA C++, or C++/Rust?

In those languages, I can actually take the practical hardware limits (memory BW, throughput, latency) and relatively easily achieve 99% of their utilization.

When people say “very fast” and “my cores are at 100%” they often mean “my program achieves 1% of the perf the hardware can deliver”.


Julia is in that class of languages. We regularly look at roofline plots when performance tuning.


Is Julia's JIT deterministic / reproducible? It doesn't matter if you are writing the code to run yourself on your own machine, which is usually the case in research. But would you recommend Julia for a project that has soft realtime requirements on the order of 10 ms, has to run on hundreds of machines scattered across the world, and that you have to support personally?


Probably not in the default configuration, but we have some people interested in this kind of thing. The compiler can definitely be configured to make that work, but so far interest hasn't been sufficient to implement that. If you have a particular use case in mind, do let us know to see if we can help you out.


No, that is what people cannot wrap their head around. They always think there is a catch or a downside with Julia. But with Julia you really can have your cake and eat it too.... ok, you've got to accept that your first plot is slow ;-)


You have to accept that compilation is on the fly, so you cannot test the actual binary code your customers will run. There is an attempt at AOT compilation in a library, but it did not look stable last I checked (when Julia went 1.0).


>You have to accept that compilation is on the fly, so you cannot test the actual binary code your customers will run

So? That's the case for every JIT or interpreted language...


If the comment was made with respect to the alleged fit for demanding performance work - JIT-ted languages usually can't compete, and unpredictable performance could mean unpredictable failures.


Julia, by virtue of being compiled, is also probably better for production. But it’s much less well known and has less library support, which is a significant barrier to use.


PyCall and RCall really help with this.

Dragging something in from scikit-learn was literally like...two lines of code. It's not the same...investment as most FFI systems.
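Something like this (illustrative; assumes PyCall is installed and the linked Python environment has scikit-learn):

    using PyCall
    X = rand(100, 4); y = rand(0:1, 100)
    ensemble = pyimport("sklearn.ensemble")
    model = ensemble.RandomForestClassifier(n_estimators=100)
    model.fit(X, y)   # Julia arrays are converted automatically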


No, Julia is just better than Python in every possible way


How much of a speed difference is there for iterating over the rows of a data frame in Julia vs pandas `itertuples` with `index=False` and `name=None`? Any ballpark figure will do, I’m curious.


For a DataFrame with 1 column containing integers 0 to 99,999, in Python I'm seeing ~16ms to add up the column by iterating with itertuples (index=False, name=None - which I didn't know helped performance, so thanks!). In Julia it's 8ms to add up when iterating with eachrow and accessing the column by name.

So ~2x faster in Julia? Though it's a sort of awkward comparison, since this type of iteration seems a little anti-idiomatic in Python/pandas and doesn't fully leverage type information in Julia.
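For reference, the Julia side might look roughly like this (the column name is made up; exact timings depend on your setup):

    using DataFrames
    df = DataFrame(x = 0:99_999)

    function sumrows(df)
        s = 0
        for row in eachrow(df)
            s += row.x   # access the column by name, row by row
        end
        return s
    end

    # time the second call, so compilation is excluded:
    # @time sumrows(df)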


Since everybody is so hyped about Julia, I took on learning it and using it in one of my pet projects. The use case seemed to be perfect for Julia: financial portfolio optimization.

But honestly, it is a royal pain in the ass to use. Mostly because tooling is so crappy (no decent IDE).

For example, I've made these notes on my journey with Julia:

- Juno and VSCode julia is a pain

- Debugging is pain - no watch expressions (but debug> works)

- Works like Matlab (workspaces, designed for single-script use cases)

- Long registry lookup times (2-5mins) when starting a new session

- No hints from IDE (Juno) on what methods are available for which types

- Ctrl-click navigation does not work to look up definitions in libraries

- Debugger does not show the full list of elements - there are triple dots in the middle

- Filters and maps are not lazy

- Arrays start at one

- Precompilation times can take 1-2min

- Package management is done as a part of script execution

- Delay (0.5-1s) when processing commands (compilation?)

- Editors (Juno) do not check types at runtime when the information is readily available

- Include() includes (is this PHP all over again?) a file into the script!

- Reuses same REPL between different files

- DataFrames.jl: Transform does nonsense (use map instead)

- DataFrames.jl: No transpose

Obviously, there is a bunch of good things, like the freakin insane speed, and vectorized method calls are awesome, but I struggle to see myself becoming more productive than with Python + Pandas.

It's really a shame that the language is strongly and semi-statically typed, yet no tooling makes use of that.


To be fair, most of these are ide complaints, not language complaints.

Much like all development work, language designers are now required to know much more than how to write an effective static compiler, an effective language syntax, and an effective standard library. A working IDE that integrates easily with all mainstream editors, extensions for all mainstream editors, cross-platform compilation for all major environments, a secure and effective dependency manager, a secure and effective build tool, and incremental compilation are all requirements for a language to get off the ground. Honestly, that seems like a big hurdle for one person or even a dedicated team to manage effectively without major funding outside of what is normally present in research grants. Hosting and curation alone for the dependency package system is a costly problem.

Are we as developers painting ourselves into a corner of stagnating pl technology with our expectations that all these things exist in order for us to call a language "good?"


Yes. But only computer scientists are interested in the language alone. I am an engineer, so I have to deal with the practical aspects of it.


I'm an engineer. I primarily care about how good the language is at expressing and executing my intent. I've been around long enough to see several tooling fads come and go. I've also been around enough to know that the barrier for delivering quality software is becoming insurmountable, and the additional tooling and infrastructure isn't resulting in projects being more successful, nor being delivered more quickly, nor being less costly than when I started. Teams are enormous now. Software projects are enormous. Deployment requires deep knowledge of multiple hosting platforms and various cloud provider APIs at the enterprise level.

Available libraries, increasing reliance upon testing, type safety, and better development life cycle practices have had a far larger impact than the IDEs I've seen come and go. In my experience, a particular tool has a popularity lifecycle of about ten years. VS Code and LSP will probably be the same. Give me types, good compiler error output, and access to a large universe of libraries with good documentation over relying upon IDE API discovery any day of the week and twice on Sundays. The rest is an ever-changing tide of requirements fashion.


Give us some more time; there are just so many things to do, and the reliance issues favor getting the fundamentals right first (it's much easier to switch the debugger UI than the names of functions that'll get put all over the place). We certainly understand the value of good tooling, so just stay tuned and maybe try it again every once in a while to see if it's good enough for you yet. Too much to do, too little time :)


I think this is an important point. What a practitioner can do with a language is influenced as much by the tooling as by the language itself. The implication is that languages should be constructed with as much thought given to the external tooling as to the compiler or interpreter.

Julia in particular is built to be a practical language, so these points are valuable.


I agree with you 100%.

Julia is a language that is half-baked for engineering use. Anyone who has ever built anything beyond examples and tutorials - something production-worthy and used by hundreds of people - will attest to how immature Julia is.

That's not the fault of the programming language per se... just that it takes 10+ years to get up to speed in terms of tooling and engineering worthiness.

In addition, there are a lot of problems with the language itself some of which are highlighted in the top comment of this thread.


We use Julia in production at RelationalAI. Our cloud hosted next generation knowledge graph management system is a highly performant database and query optimization engine, and it's written 100% in Julia.

We love Julia for the same reasons others have stated here: It's easy and intuitive to write, making it productive to work with for both our engineers and our mathematicians. Its reflection abilities make performance tuning and optimization easier than any experience I've had before. The ability to fine-tune the assembly emitted for your hotspots _matters_, and it's straightforward to do in Julia. And for the rest of the code, being easy to write, easy to read, and close to the math are all a big win.


I've built production software being shipped to pharmaceutical companies and used in FDA submissions using Julia (pumas.ai) because it is quite mature in the field of modeling and simulation in comparison to other languages. The stability of the language along with the ability to easily get fast code is the main reason, and it's the reason why if you go to something like the American Conference on Pharmacology (ACoP) you'll hear lots of mentions of Julia.


I've also built several production systems and it has been nothing but a nightmare. Stack traces are meaningless, and there were so many issues with things like ISO time zones, HTTP requests (why isn't there a solid HTTP request library? it's a mess), Docker problems with compilation, start times were atrocious and still are I think, package management is insane, and setting up an internal Julia repository is a total nightmare - just go on SO and search for it.

I've lived through this.


Another way of thinking about it is that when we switch to a new language, we have to start over building nice tools for it, which seems inefficient. It's like doing a big rewrite.

That's less true than it used to be. Julia wouldn't be where it is without LLVM, and things like VS Code's language server protocol do help. But there is still a lot of redoing things for people who write language tools.


Ceylon had good IDE support when it was 4 years old and nobody used it. That's just an excuse.


That doesn't prove that "it's just an excuse".

It just shows that while IDE support is a factor, it's not the single determining factor.

Ceylon had horrible marketing and advocacy, for starters. And the IDE was Eclipse, at a time that it was getting increasingly unpopular...


Most of his complaints can be solved via a Julia LSP. LSPs aren't that hard to write.


There already exists a Julia LSP. It's sufficient, could be better though.


>>To be fair, most of these are ide complaints, not language complaints.

The language itself doesn't matter without its ecosystem. This includes tooling, libraries, quality of documentation, the community, and more.


This is why I think it's mostly non-engineers commenting about how great Julia is. No one has tried to deploy it to prod. Sure, there might be exceptions, but if you interview 100 engineers who've tried deploying Julia, a vanishingly small % would recommend it, if at all.


I'm not commenting on Julia per se. I'm commenting on the state of discussions surrounding programming languages in general.

> if you interview 100 engineers who've tried deploying Julia, a vanishingly small % would recommend it if at all.

Popularity of a choice does not correlate with a choice being correct. It is a fallacy to evaluate a particular tool's usefulness for a particular job by how popular that tool is. And doubly so when most of a random sampling of engineers includes virtually no one experienced in using said tool. In discussions of virtually any non-popular programming language today, the vast majority of commenters on the language have absolutely 0 experience using it day in and day out for its intended use case. All you have in that discussion are hammer users. To them everything is a nail. And they universally hate any new tool that doesn't look like a hammer, because they can't take the time to learn the new tool. Why would they? Everything is a nail anyway.

This is disappointing, especially here, because Hacker News is the place where the concept of using a non-familiar language was expressed as a super-hack for startups. My original reply expresses the conjecture that eventually, all PLs will look the same, and nobody will be able to develop a language that will truly provide an increase in quality, decrease development or maintenance costs, or improve the reliability and safety of the systems the world runs on, because people who evaluate languages actually evaluate them for the tooling that largely gets bolted on after the fact by nonexperts (the LSP server / VS Code crowd).

There are three LSP servers for Python. Only one of them works with pipenv on Windows. Does that make Python bad? If Python were introduced today, as it was when it first came out, would it gain popularity with all the expectations we place upon a language? No. And that would be a shame. The same is true of Javascript and Java. C and C++ would go nowhere. About the only languages that would have made it are C#, Visual Basic, and Kotlin, because they were written to SELL IDEs and tools.


Well said. Also, the article starts with a litany of praise from a variety of research groups. Some of these are big projects with government funding and serious computing requirements. Julia is deployed all over the place now.


>>Popularity of a choice does not correlate with a choice being correct. It is a fallacy to evaluate a particular tool's usefulness for a particular job by how popular that tool is.

Okay, but this is a very engineering-specific mindset. I encourage you to broaden your perspective. "Correctness" of a language choice is about more than the language itself. In most situations, it is better to choose a language that scores 9/10 on how well it fits the problem domain than one that scores 10/10, if the 9/10 language will let you solve the problems faster (pre-existing solutions), with less frustration (good docs), and with zero need to reinvent the wheel (good libraries). This is especially true if the 9/10 language will also allow you to recruit more easily due to its popularity.


> The same is true of Javascript and Java.

The world would have been a bit better place to live.


> Editors (Juno) do not check types at runtime when the information is readily available

This complaint is really common with people that see types and expect them to do something like TypeScript. Julia is not a statically typed language. The types are not there for a type checker to check your code for correctness.

Instead, the types allow Julia to dispatch your method call to the correct implementation for the type of your variable at runtime. This is Julia's secret sauce, and it's the main reason you'll hear Julia programmers declaring how composable Julia packages are. I can "reach into your package" and define the behaviour of your functions on my own custom types, and then whenever anyone tries to call the function/method with one of my types, it'll just work. I don't need to author a pull request to your package or futz around with anything that you've written.

Novice Julia programmers often come in thinking that the type annotations are there for ensuring correctness, but that's not it at all. In actuality, you want to be as general with your types as you can, and by default most parameters will probably be untyped. If you need certain behaviour for the function, then you should probably annotate it, but it's definitely not required.

I'm guilty of this misstep as well; one of the biggest things I struggled with when I was new was using ::Array everywhere, when what I wanted was ::AbstractArray.
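A minimal sketch of that composability, with a toy type:

    struct Point
        x::Float64
        y::Float64
    end

    # extend a generic function from Base for my own type:
    Base.:+(a::Point, b::Point) = Point(a.x + b.x, a.y + b.y)

    # generic code written with no knowledge of Point now just works:
    sum([Point(1, 2), Point(3, 4)])   # Point(4.0, 6.0)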

> Include() includes (is this PHP all over again?) a file into the script!

Use Modules. Don't fault Julia for you being a beginner and not reading the necessary parts of the documentation.


I am not sure how much I agree with this. I have spent most of my professional career in Python and have watched the trajectory of type hinting in the Python ecosystem somewhat closely.

Python is not a statically typed language and yet type hinting sentiment seems to have gone from: "Why do you want it? Python isn't statically typed?" to "Well, mypy is a useful optional package" to "Let's just support type hinting in the language itself". Which is to say I think there is value to type checking by development tools even in dynamically typed languages.

--

On a slightly unrelated note, Julia seems to be similar to Python when it comes to types; that is to say both appear to be dynamically and strongly typed. But take that with a grain of salt as I don't have any first-hand knowledge of Julia.


Oh, I’m not saying that it’s a bad thing to want static type checking. I’m just saying that the type annotations in current Julia are absolutely not for that. They’re for multiple dispatch and for helping the compiler optimize your code.

The thing that’s interesting about Julia is that it’s sort of a verb-first language instead of noun-first. You describe the behaviors that you want by creating methods with one name - getindex, setindex!, etc. - that define the API. Then you can implement that for any type by creating a new method with the same name and a different type signature.

Edit: In fact, I think that over-annotating your methods is probably the number one mistake that new Julia programmers make. Adding a type to the type signature doesn't make your code fast. Adding too many types will restrict your code and it won't work when you think it will because you've told the compiler that this only works with 3-dimensional arrays of type Float64 and you're trying to call it with a 3-dimensional array of type Float32 or Int64.

What makes code fast is type stability, but that's a whole other topic.
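A small illustration of the over-annotation trap:

    # over-constrained: only accepts 3-dimensional arrays of Float64
    scale(v::Array{Float64,3}) = 2 .* v

    # generic: the compiler still specializes per concrete type,
    # so the flexibility costs nothing at runtime
    scale(v::AbstractArray{<:Number}) = 2 .* v

    scale(rand(Float32, 2, 2, 2))   # only works thanks to the generic method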


> Instead the types allow Julia to dispatch your method call to the correct implementation for the type of your variable at runtime. This is Julia's secret sauce, and it's the main reason you'll hear about Julia programmers declaring about how composable Julia packages are. I can "reach into your package", and define the behaviour of your functions on my own custom types and then whenever anyone tries to call the function/method with one of my types, it'll just work. I don't need to author a pull request to your package or futz around with anything that you've wrote.

That's not a bug, that's a feature. Libraries are abstractions. I get an API and it encapsulates the behavior of the library. That is a nice thing. Being able to modify runtime behavior of internal libraries can lead to an insane jumblygoo of code spaghetti that would bring the finest programmers to their knees.

I don't want implicit behavior at runtime. I want explicit behavior in case of non-contractual externalities (e.g. wrong user input at runtime). I want the program to fail so it can be patched.


You still get an API that encapsulates the behavior. This is not like monkey-patching (directly changing the behavior of libraries), but separating the abstraction layers. Every complex enough system will have multiple layers (for example when working on communications, if you're working with the network layer you don't need to focus on the physical layer below or the application layer above). Multiple dispatch allows the library ecosystem to better work in the same way:

For machine learning models we have the layer that handles the low-level operations (sums, multiplication), which are swappable (you can have an implementation that runs on the CPU - Julia's Base - and an implementation that runs on the GPU - CUDA.jl - and even a TPU - XLA.jl - or Torch as backend). Above you have the tracker (the layer responsible for the autodifferentiation logic, which includes Tracker, Zygote, ForwardDiff). And above you have the library with rules for generating gradients (DiffRules, ChainRules), and above you have ML constructs (NNLib), and above ML frameworks (Flux, Knet), and above more specialized libraries like DiffEqFlux.

Whoever writes the ML framework doesn't need to care about the backend; whoever writes the GPU backend doesn't need to care about the ML framework. This is not because the person writing the GPU backend patched the ML framework, but because the ML framework legitimately doesn't care about how the low-level operations are executed; it doesn't work on that level of abstraction. And the user of the ML library can still see it as a monolith not unlike PyTorch or Tensorflow when he imports a library like Flux, until he wants to extend it, and then he will find that it is in fact many independent swappable systems that compose into something more than the sum of its parts.


I think the previous poster's use of the phrase "reach into your package" was a bit colorful. You're never actually modifying the internal code of a library. You're only extending a generic function from a library to work on your own custom type. So that extension only affects code that uses your new custom type. It's akin to using class inheritance in OOP languages.


> Long registry lookup times (2-5mins) when starting a new session

I'm not sure what you are referring to here. Yes, sometimes after you've installed new packages, the Julia VS Code Language Server has to do some re-indexing which can take some time, but they've improved this and you can still program and use the integrated REPL while this is occurring.

> Ctrl-click navigation does not work to look up definitions in libraries

In VS Code, F12 (Go to Definition) does work. You can also use `@edit foo(x)` in the REPL.

> Filters and maps are not lazy

Use Iterators.filter or use generator expressions.
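E.g., something like:

    # both are lazy; nothing is materialized here
    evens   = Iterators.filter(iseven, 1:10^9)
    squares = (x^2 for x in evens)

    collect(Iterators.take(squares, 5))   # computes only the first 5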

> Package management is done as a part of script execution

I'm also not sure what this refers to, but the package manager in Julia is one of the best things about Julia.

> Editors (Juno) do not check types at runtime when the information is readily available

The VS Code Julia extension has a linter that is pretty helpful. It's not perfect, but they're actively working on developing it.

> DataFrames.jl: Transform does nonsense (use map instead)

The new select and transform functions in DataFrames.jl are actually quite powerful and useful.

> DataFrames.jl: No transpose

Transpose is not a generic concept for a table. Sure, it might make sense in specific cases, but in general it doesn't make sense to transpose a table.


>Sure, it might make sense in specific cases, but in general it doesn't make sense to transpose a table.

From my intuition, I don't see anything stopping the function from existing -- it can be applied to any arbitrary table. You probably don't want to transpose your table, except when you want to, but that's true of any function -- I can't imagine any scenario where transpose(table)->table would fail as an algorithm (unless, I suppose, Julia tables include header rows, in which case there's probably no generally correct definition)


> From my intuition, I don't see anything stopping the function from existing

A datatable is (or is isomorphic to and can be analyzed as) a mapping from row numbers to tuples of a given shape[0].

The transpose of a table will only be a table (a mapping from row numbers to tuples of a common shape) if the tuples of the starting table were homogeneous (every field of the same type).

This works with, say, matrices where all the elements are numbers; but it fails in the general case.

[0] Yes, I know, Julia defines them as columns of arrays, and columnar organization is ideal for all kinds of processing tasks. For me, thinking about and explaining the problem with transpose works easier in row-oriented form (which is logically equivalent). In the column-oriented description, a datatable is an ordered set of columns, each of which is a homogeneous array, but if you try to transpose it, each of the columns of the result would be a heterogeneous array unless the columns were all of the same type to start with. So, again, it fails to be a table->table function except in the case where the starting table consists of columns of identical type.


No, I'm still not understanding. Here's my thinking:

Scenario #1: If the tuples are the same shape (type, size), it's fine

    [
       (string, int, date)
       (string, int, date)
       (string, int, date)
    ]
transposed:

    [
       (string, string, string)
       (int, int, int)
       (date, date, date)
    ]
Both input and output have tables with consistent shapes (type, size)

Scenario #2: Assuming it's legal, if the tuples are differently shaped (by datatype), it's weird (but that was true of your original table anyway), but you can still do a valid transposition to produce a valid table

    [
       (string, int, date)
       (int, date, string)
    ]
transposed:

    [
       (string, int)
       (int, date)
       (date, string)
    ]
It was weird to begin with, and it's similarly weird to end with. I can't imagine the output not being a legal table by any rule that does not also disallow the input.

Scenario #3: Similarly to scenario 2, If your tuples are differently shaped (by size), you can still do a transposition

    [
       (string, int)
       (string, int, date)
    ]
transposed:

    [
       (string, string)
       (int, int)
       (date)
    ]
and like Scenario #2, the output is as illegal as the input

In the latter two cases, I don't know what you'd want to do with the transposition (or even its input), but I don't see anything stopping the operation itself from being reasonable/consistent/valid.

Is there another scenario I'm failing to imagine?


> Scenario #1: If the tuples are the same shape (type, size), it's fine

In the row-oriented view: each row of a datatable is a tuple of the same shape (size and order of types as every other row) -- just like a database table. So, if the shape is (string, int, date) for row #1, it's that shape for every row.

In the column-oriented view, each column is a homogeneous array: every element in the column has the same type.

> [ (string, int, date) (string, int, date) (string, int, date) ]

Sure, this is a fine starting table; in the row-oriented view, its shape (the shape of every row) is (string, int, date). In the column-oriented view, the table as a whole can be viewed as a tuple of shape (string[3], int[3], date[3]) because it has three rows. Cool.

> transposed:

> [ (string, string, string) (int, int, int) (date, date, date) ]

Right, this is no longer a datatable. The first row has shape (string, string, string). So, if it's a table, the other two rows must also have shape (string, string, string); but instead, each has a different shape.


Ah!

Ok, that makes sense.


These problems mostly seem to come from trying to use Julia in some kind of visual studio paradigm. The key is to leave the REPL running, I find. Then, with Revise.jl the speed is insane because nothing has to start up; only the changes you make will be recompiled during execution. Also take a look at Pluto.jl.
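The workflow sketch (file and function names here are placeholders):

    julia> using Revise

    julia> includet("analysis.jl")   # tracked include; edits are picked up

    julia> run_model(data)           # edit analysis.jl, call again - only
                                     # the changed methods get recompiled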

Edit: By the Visual Studio paradigm, I mean a paradigm where everything is built around visual interactions with the IDE. For example, showing autocomplete, which works because the programmer is very constrained in OOP (foo.b completes to foo.bar) and statically typed languages, with Java and C# as prime examples.

I think these constraints are built around the idea that programmers are dumb, and that it mostly is a distraction from how one actually wants to think about the problem at hand.


> - DataFrames.jl: Transform does nonsense (use map instead)

What is this supposed to mean?

> - DataFrames.jl: No transpose

This is actually going to be fixed in the next few days (https://github.com/JuliaData/DataFrames.jl/pull/2447). Given the limited amount of work it required, it sounds quite exaggerated to mention it as a major limitation of the language.


  - Arrays start at one
Why is this such a big deal? You just change the range of your for-loops from 0<=i<n to 1<=i<=n. Are you using cyclic buffers?


It matters when doing multidimensional index arithmetic. The formulae for ranking and unranking multi-dimensional to flat indexes are neater with 0-indexing than with 1-indexing.
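E.g., flattening an (i, j, k) index in an ni × nj × nk array (column-major, as in Julia):

    # 0-based:  flat = i + ni*(j + nj*k)
    # 1-based:  flat = i + ni*((j - 1) + nj*(k - 1))   # extra ±1 corrections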


Julia has really nice abstractions for dealing with this. Things for which you might do such calculations in other languages can often be done in a way generic to the number of dimensions, as well as the initial index, using CartesianIndices & friends. And this should be zero cost.
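For instance, a dimension-generic loop might look like:

    A = rand(4, 5, 6)
    s = 0.0
    for I in CartesianIndices(A)   # generic over dimensionality and offsets
        s += A[I]
    end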


Almost all languages above the C level have nice abstractions for this, but you still sometimes need to do it for various reasons. Basically, whenever the indexes don't matter, 1-indexing and 0-indexing are equivalently good, but when they do matter, 0-indexing leads to neater calculations.


OffsetArrays.jl is your friend.

I'm working on a project where indexing naturally runs from -n to n. I kludged the usual index hackery and was so plagued with off-by-one errors that I gave up on it. Then I remembered Tim Holy had a blog post about OffsetArrays.jl and started using that.

Problem solved.
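The gist, in case it saves someone else the search:

    using OffsetArrays
    n = 5
    a = OffsetArray(zeros(2n + 1), -n:n)   # indices run from -n to n
    a[-n], a[0], a[n] = 1.0, 2.0, 3.0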

Seriously, the most annoying thing I find about Julia is that the various packages fairly often have "creative" names that make it difficult to discover that they might actually be useful.

YMMV.


> I'm working on a project where indexing naturally runs from -n to n. I kludged the usual index hackery and was so plagued with off-by-one errors that I gave up on it. Then I remembered Tim Holy had a blog post about OffsetArrays.jl and started using that.

Ah, thank you. I've been tinkering around with Julia, and was trying out 2D random walks. Not being able to use a 2-dimensional array and store the position [0, 0] confused me. I ended up starting at X,X (where X > total number of moves) and then got stuck with plotting it nicely, because I wanted the plot axes to show how far it had moved, not X ± how far it had moved.


There are plenty of matrix uses where 1-indexing is more natural. I think the main reason Julia uses it is to be more consistent with Matlab.


Abstraction discovery is a hard problem in general. Good names help, but there's probably more we can do. Tooling that finds a common pattern the abstraction eliminates, perhaps?


I guess it could poke you every time you say `strides(A)`?

Here's a recent example of an upgrade, from code with stride calculations to a more abstracted version, which is generic to offsets & number of dimensions:

https://github.com/JuliaLang/julia/pull/37367/files#diff-29c...


Fair enough. But is this a common enough occurrence in the life of a programmer that it's worth being a religious extremist about?


No need for an inquisition. In both Fortran and Julia you can set your index origin to be whatever works best for the problem you are expressing.


It's in a list of annoyances, together with 16 others. I don't think anyone is being "a religious extremist about" it...


In Fortran, you can start the index from any number you want. A(-10:-3) is a completely valid slice or full array declaration, so are A(0:7) or A(1:8). and this feature is absolutely useful in scientific computing.


I also didn't think it was a big deal - I write a lot of lua, and I love lua - but every time I have to do anything involving modulo maths and arrays it's a pain in the ass.


Julia makes it easier, for example, with functions that handle mod in the context of 1-indexed arrays (if you don't want to use OffsetArrays).

https://docs.julialang.org/en/v1/base/math/#Base.mod1
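E.g., stepping around a 1-based circular buffer:

    buf = collect(1:8)
    n = length(buf)
    next(i) = mod1(i + 1, n)   # mod1 maps into 1:n rather than 0:n-1
    buf[next(8)]               # wraps around to buf[1]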


Well, that seems cool, but isn't indexing from zero simpler?

I don't really remember if zero-indexing was a big deal when I was learning to program. Are there big advantages for non-programmers? I figure that's why Lua went the 1-index route.


>I don't really remember if zero-indexing was a big deal when I was learning to program

And neither is 1-indexing, even though people are always complaining about it. I kinda like that indexing from the third element to the seventh is just 3:7 instead of 2:6, and that the last element is length(x) instead of length(x)-1. And that use case is much more frequent for me than creating circular buffers on an array. And in Julia you can just use begin:end as well, or any iterator; it's really overall a non-issue for me.

Not sure about Lua, but in Julia (and Matlab, and R) it's just because they actually have vector/matrix types with a convention that predates even programming, while it makes sense for languages where an array is basically a memory offset (mapping to pointer arithmetic) to start at 0.


Arrays should start from whatever index the scientist wishes. That is how it is done in Fortran.



Dijkstra has a certain tone. He can make his opinions sound almost like a mathematical proof. But when you dig into it, he just says 0-based is "nicer".


The points he makes are so extremely minor... You could just as easily say that natural language and mathematical convention are index-1.


I never understood the fascination with preferring "the element at zero offset from the start" to "the first element" - and I don't think Dijkstra makes a compelling argument.


Edit: just learned about OffsetArrays.jl - ok now that's really cool. Keeping comment for context.

> - Arrays start at one

Oof. Big turnoff for me. For some reason, this particular context switch really grinds my gears. All languages I use (except bash, which also grinds my gears) use Dijkstra-style array slices: Python, Go, C/C++, JS. I'm sure it makes it easier for Matlab converts.

This, plus the mentioned weaknesses with libs like HTTP, gRPC, et al., IMHO will relegate Julia to many years of just being wrapped by other languages. To avoid that, I think they should be thinking about continuing to woo more of the engineering crowd - which, to their credit, I think they've done pretty well so far.

https://www.cs.utexas.edu/users/EWD/transcriptions/EWD08xx/E...

https://github.com/JuliaArrays/OffsetArrays.jl


They’re making the language mostly for statisticians, mathematicians and scientific programmers, where it makes sense to have arrays start at 1. Why should they cater to whiny engineers who can’t deal with anything they aren’t used to?


Arrays start at 1!? What Joy! Maybe I can try this language!


I'm an engineer who writes the occasional matlab/python code to automate tasks, process data and fit models. Numerical libraries are extremely important to me.

I recently started using Julia for simulating/fitting some differential equations and am thoroughly impressed with the speed, syntax and library documentation. Startup speed was initially painful but I am used to it now since I only pay it once.

The IDE situation is not fully stable yet, but Juno is very good. I wish they'd package a standalone Julia IDE (I am aware of the new VS Code plugin) based on either Atom or Code, just to make it easier for those switching over from MATLAB and Python (Spyder) IDEs.

Don't know how far they've progressed on static compilation yet, but if they get that even for a language subset it would truly be a game changer for general purpose programming.


> I wish they package a standalone Julia IDE

That's precisely what Julia Computing's JuliaPro is. https://juliacomputing.com/products/juliapro

(Full disclosure, I am employed by Julia Computing)


From a quick glance at the web page, I cannot tell what JuliaPro really is.

"Fastest on-ramp"? What is that?

What I would want to see in the first 3 seconds is something like "JuliaPro is a subscription service giving you access to curated packages"... You know, something "tangible"... Maybe it's just me being lazy...


Yeah, fair enough. I've filed an issue internally against the website copy. It should definitely tell you what it is above the fold.


Thank you. I thought this was some paid enterprise product. Didn't realize there was a free version.

It looks like I have to sign in to download, but I can't find a link anywhere to create an account, either on the homepage or anywhere else on juliacomputing.com.


Here's the register with email link: https://auth.juliahub.com/register.html


thanks!


Static compilation works - see PackageCompiler.jl.
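Rough usage (package list and file names here are illustrative):

    using PackageCompiler
    create_sysimage([:DataFrames];
                    sysimage_path = "sys_custom.so",
                    precompile_execution_file = "warmup.jl")
    # then launch with:  julia --sysimage sys_custom.so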


I’d be particularly interested in opinions about whether my popularized attempt to explain the expression problem is effective, or just confusing.


It was rather confusing and I consider myself well-informed about the expression problem.

But to give you some constructive feedback, your understanding, or your writing, of the expression problem lacks two fundamental concepts:

First of all, in functional languages it is as simple as in object-oriented languages to add new data types. It is hard to add a variant of an existing datatype. In object-oriented languages it is also trivial to add new functions, but it is hard to add a new function for all derivatives of a class. So it would be more like having a table for perfectly cooking all variants of fish.

And secondly, the solution offered by Julia is also incomplete. Both functional and object-oriented languages give you completeness guarantees. So the compiler will warn you when you miss a variant of fish (functional) for a method, or a method for a particular fish (object-oriented). At least to my understanding, the completeness of methods (that's the right term, no?) in Julia is unchecked. You can extend, but you can easily miss a case.

The same (unchecked extensions) can be done in functional languages with open algebraic datatypes and in object-oriented languages with default methods that throw exceptions.


Thanks for this detailed criticism, it’s quite useful.

My understanding is that for functions with more than a few arguments, it is expected in Julia that only a small subset of the possibly thousands of combinations will be covered. Thousands, because with multiple dispatch you dispatch on the types of all the arguments. The binary operator “*” in Julia has 364 methods, and that just takes two arguments. So you define the ones that are useful.


Sure, that is an absolutely sensible design decision. But for other cases completeness is vital. Consider for instance the trivial case of the map function over lists (constructed by :: and []), often abbreviated as * :

  f * [ ] = [ ]

  f * a::as = (f a)::(f * as)
Even if that is a nearly trivial case, it is easy to see that one could forget the first case, as all the interesting bits happen elsewhere. Now imagine I want to have a lazy reversal of lists, which I store as an alternative representation until I need to deconstruct it. I write that x~xs, meaning "x added to the end of xs".

Now I have to expand my map function as follows (because map does not care about ordering):

  f * a~as = (f a) ~ (f * as)
If I miss that case (or method, in Julia's case) my extension of the datatype is plain wrong.


As someone who would probably understand an explanation in real words, my eyes glazed over and I stopped parsing after about the third mention of fish. But I'm not the target audience.

In popular science articles, I much prefer paragraphs laid out with a sentence of technical description followed by a layperson's explanation and/or example. That way, if I understand the first sentence, I can skip to the next paragraph. It's been a while, but I recall this being a popular format for textbooks as well.


For a more technical description, I don't think there's a better source than Stefan Karpinski's 2019 JuliaCon talk "The Unreasonable Effectiveness of Multiple Dispatch".

https://www.youtube.com/watch?v=kc9HwsxE1OY


Thanks for the feedback.


I started your section about the fish and didn't finish it. I expect ArsTechnica articles to be technical; in this case I think actual code examples would be good. I think Stefan K's video referenced elsewhere in this thread used examples that are both accessible and educational.


While the explanation was a little wordy, it really helped me visualize what was happening — particularly the image with cross-links, and the two different kinds of table of contents.

Now I see that the essence of the concept relates to the semantics of “categories” (mutually exclusive, collectively exhaustive) -vs- “tags”, especially when there might be multiple ways to split the domain into categories (especially when they fill out an incomplete/sparse subset of the full Cartesian product). Conventional dispatch (presumably an implementation detail of the language?) is based on either functions or types as categories, while multiple dispatch uses all of these inputs as tags to select the appropriate specialized implementation.

If this perspective is correct, I would like to understand why/how the implementation of conventional languages forced them into category-based single dispatch.

Thanks for a nice article :-)


Many older languages don't: Common Lisp, Erlang, and even C++ templates have multiple dispatch or something similar.

Object-oriented languages, however, largely use single dispatch due to their focus on objects and their implementation's use of vtables, which work on a single object, as in C++/Java/Python.


Interesting perspective. Reminds me of how trying to index the web using categories and hierarchies gave way to search using keywords. Similar with email interfaces.


I keep reading about the up-and-coming Julia programming language and did a quick read-through of the article. Coming from a programming background, I was interested in what ways Julia was different from Python and the more mainstream Java.

One thing that would've made things easier to understand would've been comparing Julia's multiple dispatch to Java's dynamic dispatch (they're more or less the same from what I gather). Great article overall, easy to understand for the layman.


You might mean C# here instead of Java? I don't know of anything similar to Julia's multiple dispatch in Java land.


I really liked the nontechnical explanation, although the diagrams are a bit busy.

Something it could have benefited from is an explanation of what "dispatch" means, and why "multiple dispatch" is so called.


I think the recipe metaphor section was far too long (should have been 10% of the size maybe) and I would have preferred the actual numeric examples to be filled out instead.

More worked examples with perhaps some of the recipe metaphor mixed in would be better.


I think it is a difficult topic to explain in lay terms. Your analogy made sense, but man was it wordy. I loved the diagrams! Nice article overall.


As a data point, I also found the cooking examples confusing. Probably because I am more familiar with programming languages than cooking :)


Fyi, I liked this to get a quick and initial understanding of the language (I just read it): https://juliabyexample.helpmanual.io

Question:

I see that on Gentoo Linux all versions of the language compiler (pkg "dev-lang/julia" versions 1.2, 1.3, 1.4, 1.5) are marked as not yet being officially stable ( https://packages.gentoo.org/packages/dev-lang/julia ) => any "real" reason for this? Are there any open/important bugs, or is it just because of e.g. low usage of the language on the Gentoo distro, or maybe because the specs of the language are still changing, etc.?

I think that e.g. Rust is more or less as old as Julia, but Rust on Gentoo has been marked as stable for a long time.


That makes no sense. Everything since Julia v. 1.0 should be considered stable. There have been no real breaking changes since that release.


Thx

EDIT:

Ok, maybe it's because of some dependencies. 3 are marked as not yet stable ("~" character).

  # emerge -pv dev-lang/julia
  These are the packages that would be merged, in order:
  Calculating dependencies... done!
  [ebuild  N     ] net-libs/mbedtls-2.24.0:0/5.13.1::gentoo  USE="threads -doc -havege -libressl -programs -static-libs -test -zlib" ABI_X86="(64) -32 (-x32)" CPU_FLAGS_X86="sse2" 3,821 KiB
  [ebuild  N     ] media-libs/qhull-2015.2::gentoo  USE="-doc -static-libs" 987 KiB
  [ebuild  N     ] sci-libs/amd-2.4.6::gentoo  USE="fortran -doc" 336 KiB
  [ebuild  N     ] sci-libs/camd-2.4.6::gentoo  USE="-doc" 310 KiB
  [ebuild  N     ] sci-libs/ccolamd-2.9.6::gentoo  299 KiB
  [ebuild  N    ~] sci-libs/openlibm-0.7.0:0/0.7.0.0::gentoo  USE="-static-libs" 358 KiB
  [ebuild  N     ] sci-libs/lapack-3.8.0-r1::gentoo  USE="-deprecated -doc -eselect-ldso -lapacke" 7,253 KiB
  [ebuild  N     ] media-libs/glfw-3.2.1::gentoo  USE="-examples -wayland" 462 KiB
  [ebuild  N     ] sci-libs/metis-5.1.0-r4::gentoo  USE="openmp -doc" 4,869 KiB
  [ebuild  N     ] virtual/lapack-3.8::gentoo  USE="-eselect-ldso" 0 KiB
  [ebuild  N     ] virtual/blas-3.8::gentoo  USE="-eselect-ldso" 0 KiB
  [ebuild  N    ~] dev-libs/openspecfun-0.5.1::gentoo  USE="-static-libs" 119 KiB
  [ebuild  N     ] sci-mathematics/glpk-4.65:0/40::gentoo  USE="-doc -examples -gmp -mysql -odbc" 4,070 KiB
  [ebuild  N    ~] sci-visualization/gr-0.50.0-r1::gentoo  USE="X tiff truetype -cairo -ffmpeg -postscript" 8,411 KiB
  [ebuild  N     ] sci-libs/cholmod-3.0.13::gentoo  USE="lapack matrixops modify partition (-cuda) -doc" 680 KiB
  [ebuild  N     ] sci-libs/arpack-3.1.5::gentoo  USE="-doc -examples -mpi" 1,481 KiB
  [ebuild  N     ] sci-libs/spqr-2.0.9::gentoo  USE="-doc -partition -tbb" 2,111 KiB
  [ebuild  N     ] sci-libs/umfpack-5.7.9::gentoo  USE="cholmod -doc" 754 KiB
  [ebuild  N    ~] dev-lang/julia-1.4.0-r2::gentoo  USE="-system-llvm" 44,278 KiB


This is a Gentoo distribution thing. I interpret it mostly as a quality bar. See: https://devmanual.gentoo.org/keywording/index.html#moving-fr...

One of the requirements of being in arch (vs. ~arch) is not having any ~arch dependencies, so that's the first direct issue with Julia moving from ~arch to arch.

(Also, I don't feel particularly bad about pulling in a package or two from ~arch.)


Ok, makes sense, thank you - personally, I agree with the current approach (that basically says "hey, careful, there might be complications") :)

Yeah, in general I'm ok as well with using in some cases one or two direct "~"/unstable packages, but historically I always got "burned" when using more than 3-4, so nowadays, being older and hopefully a little bit wiser, I'm super-extra-cautious :)


Julia is one of those programming languages where I find it interesting but I cannot imagine a practical use case for myself. I even took the time to `brew install` it and was thinking about it again recently when I saw it update. The same is true with numpy or R - my work just doesn't involve high performance numerical computing. And even if it did, it would always be secondary to some other purpose. Outside of a REPL, if I were to productionize some heavy computational work, even numerically based, I would probably still lean towards containerizing a C++ (or maybe Rust) binary.

It's one of those "right tool for the job" type quandaries. For many popular languages (Javascript, Python, Go, C/C++, Rust, Java, OCaml) I have an intuition on when I would reach for them based on my experience. With Julia - I am not sure the shape or character of the problem where I would reach for it.


Sounds like you just may be the wrong target. Numpy, R and Julia are not designed for software engineering and are likely more approachable and useful to scientists than something like C++ or Rust.


So how is Julia nowadays? I stopped using and following it some time ago, as I found it as slow as Python, or worse, for anything that wasn't numerical computing.

Really wanted to love it.


In my experience, the big "problem" with Julia's performance is that while it's actually a compiled language, it's also dynamic.

In most compiled languages, if you write code the compiler can't completely infer, it just won't compile. Conversely, once your program compiles, you don't have to worry about the compiler anymore when running the binary - you know your program is compiled effectively.

A Julia program, on the other hand, will run just fine if the compiler can't figure out the types at compile time. It will just infer whatever it can, run that, and then use runtime values when it needs to. That's a big plus for people like me where performance doesn't matter 95% of the time, but it's a bit of a performance trap for newbies.

Almost all cases of people complaining about Julia being slow come from them inadvertently writing code the compiler can't infer properly. Luckily, Julia has interactive tools to check the inference of functions. In my experience, once you get used to writing Julia, it's rare that you accidentally write non-inferrable code.


Yeah, I think type instability is definitely the biggest culprit when someone's ported over some Python or Matlab code for the first time and isn't getting as much of a speedup as expected. It isn't hard to fix, you just have to know about it (and how to check for it with `@code_warntype`).

That and perhaps the excess unnecessary allocations from array indexing on the right hand side of an assignment, if you don't know about `view`s.
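A classic instance of the trap, and the fix:

    # unstable: x starts as Int, but i / 2 is Float64,
    # so x ends up inferred as Union{Int64, Float64}
    function f_unstable(n)
        x = 0
        for i in 1:n
            x += i / 2
        end
        return x
    end

    # stable: x keeps one concrete type throughout
    function f_stable(n)
        x = 0.0
        for i in 1:n
            x += i / 2
        end
        return x
    end

    # @code_warntype f_unstable(10)   # highlights the Union in red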


I think julia has evolved a ton over the past few years and a lot of work has gone into making things that aren't numerical computing feel like first class uses of the language.

A very popular thing right now in the community is people making websites, dashboards, visualization tools, etc. with julia.

There's a lot of people thinking about things like string handling due to NLP, bioinformatics and a few other technical fields that don't use the same standard datatypes you'd expect from an engineer or whatever.

I do a lot of random hobby programming, often building random tools that aren't necessarily numerical programming related and I find it quite natural and easy to make these things highly performant (often with no runtime overhead).


Really? My only experience with Julia was porting a simulation from Python and the speedup was incredible. Admittedly, that's an instance of numerical computing, but I would wonder why it was slow for you.


This is only meta analysis, but some topics come up more frequently when people say how impressed they are with Julia.

Simulations, ODEs, tight for-loops seem to be high on the lists.

“Generic” data science doesn’t come up as much, nor does general, non-scientific programming.

Perhaps this is not right, or outdated, but my impression is that Julia is perhaps very well suited to _some_ scientific programming, not necessarily all of it, despite the broad statements.


I think ODEs and numerical computing gets mentioned because it was the first area where Julia packages outshined all alternatives. It originated among physicists and mathematicians, so it has a "head start" in these areas.

Julia is a general-purpose programming language, despite having roots in scientific programming. Its performance characteristics with its high latency and runtime memory overhead makes it unsuitable for a number of non-scientific applications. But not all. I think Julia would work excellent as a webserver backend, for example.

I mostly use it for DNA sequence processing, which is not really numerical programming, but more string processing. There it shines. I wouldn't want to make a video game in Julia or fly a plane using Julia software.


Really? My only experience with Julia was porting a simulation from Python and the speedup was incredible.

The bang-for-your-buck of adding Numba to existing Python code is why we passed on Julia. Give it a try.


The problem with Numba is that it only really works in small, limited usecases. Good for e.g. Numpy arrays, but can it provide these massive speedups with custom classes? Arrays of strings? Sets? Probably not, since both Julia and Numba relies on inferring the data types of all objects, and on them being stored efficiently in memory. The Julia type system allows easy type inference, and all custom structs are stored efficiently in memory. In Python, neither of those are true.


hijacking this discussion to raise the banner of seamless interoperability between numba (quick numerical code) and pypy (quick algorithmic code + business logic).

My dream python is pypy+mypy+numba. Alas, pypy doesn't play nice with the others :(


It used to have slower dicts and string handling like around version 0.5.

It's come a very very long way since then. Now at 1.5.

It's much faster for arbitrary code.


If you are not being specific, it is impossible to know what you are talking about. It could be a library thing, or it could be that you are really just describing the initial latency of the JIT compilation.

If you look at the well-known language shootout, you will see Julia today beats almost everything, even on multicore. The clear exception is anything related to garbage collection. But you can circumvent this by turning off the garbage collector temporarily.


First time I've heard anything like that. Most people report a massive speedup without doing anything special. Maybe try the current version?


Let me second GP’s sentiment. I find Julia really slow for my purposes. I don’t know his reasoning, but I will explain mine. None of this is surprising and is oft discussed.

Julia (at least by default) is constantly recompiling everything. This is a huge pain in a REPL style setup where you want to tweak one thing and see the changes, again and again. I know the Julia ecosystem is working on better caching etc to fix this problem but it’s a problem.

Also, despite the marketing claims around the language, expertly crafted C usually beats Julia in performance. So if your “python” program is spending most of its time in Numpy/PyTorch/etc, it will beat Julia, unless you’re writing a fancy “put a differential equation in a neural network in a Bayesian Monte Carlo” program that benefits from cross compiling across specialized libraries.

Finally, the Julia libraries are just not as mature as python’s. Armies of developers and larger armies of users have battle tested and perfected python’s crown jewel libraries over many years. Often when someone posts a bad benchmark to the Julia forums they can “fix” it in the library implementation, proving the correctness of the theoretical case for Julia. But in reality many such problems remain to be fixed.

Julia is really cool and does have many inherent advantages over python. But it’s not the silver bullet many of its proponents suggest it to be. At least not yet. Every few years I check out Julia and I hope one day it does become that perfect language. I think it will. I just fear it will take longer than many others hope.


I appreciate your well-balanced critique, thanks.

> Julia (at least by default) is constantly recompiling everything. This is a huge pain in a REPL style setup where you want to tweak one thing and see the changes, again and again. I know the Julia ecosystem is working on better caching etc to fix this problem but it’s a problem.

Maybe try Revise.jl? There are a few changes it can't handle, but you can do a lot of development without ever restarting. (Disclaimer: I'm its main author.)
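
The basic workflow is just this (the filename is a placeholder):

  using Revise
  includet("analysis.jl")   # track the file; edits are re-evaluated automatically

  # now edit analysis.jl in your editor and simply call the revised
  # function again at the REPL, without restarting Julia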

> expertly crafted C usually beats Julia in performance

This isn't generically true, and there are now quite a few examples of the converse. I linked to it above as well, but check out the benchmarks in LoopVectorization's documentation (https://chriselrod.github.io/LoopVectorization.jl/latest/exa...) for examples of beating MKL, one of the most carefully-engineered libraries in existence.

I think an exciting area of growth for Julia will be exploiting the fact that Julia's compiler, written mostly in Julia, is more "morphable" than most and may develop its own plug-in architecture. This seems likely to provide performance opportunities that many fields seem hungry for.

> the Julia libraries are just not as mature as python’s

On balance I agree. While there are already many examples where Julia makes things easier than Python, as of today there are many more examples to the contrary. Julia's libraries are advancing rapidly, but I expect it will take a few more years of development until it's no longer so one-sided.


My thoughts exactly.

I would just add, I feel Python is stagnating as a scientific programming _language_. The libraries, ecosystem etc are still great, and Python is still a great language, but these days the development focus seems to be on type hints and unicode support.

I wouldn’t be surprised if Julia takes over, simply because it actually focuses on scientific programming. To me, personally, that would be a shame; good for Julia, but I still find Python a better language overall.


Preach, brother.

I’m cautiously optimistic that JAX (or something like JAX) can save the python programming language from stagnation by essentially building a feature-complete reimplementation of the language with JIT and autograd baked into the core. I’m praying that Google diverts like 10% of TF’s budget to JAX.

That way I don’t have to learn to love a bunch of unnecessary semicolons and “end”s littering up my beautiful zero-indexed code ;-)


Julia code almost never has semicolons. Semicolons can be used at the REPL to suppress printing, but actual Julia code does not normally use semicolons.
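
For example:

  julia> x = rand(10^6);   # trailing semicolon suppresses printing
  julia> sum(x)            # without one, the REPL echoes the result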

I personally like the "end"s because I like the symmetry and they're prettier than curly braces. Also, there are some syntax color themes that color the "end"s in a darker color in order to de-emphasize the "end"s, which can be nice depending on your taste.


That was mostly meant as a joke, thus the “;-)”

I don’t really care much about syntax choices, but my small complaint about “end” is that it takes up a line which reduces the amount of business-logic code I can fit on one screen, especially if you ever get into lots of nested loops and conditionals.


To each their own, but I usually try to refactor when I hit too many nested loops or branched statements - it's usually a sign of missing some abstraction or trying to be too clever.


@Sukera

Fair, but, if I break up all the loops and if statements into functions, those functions still have “end”s


For the record, I'm a fan of both Python and Julia (though I believe the latter is not yet ready for mainstream general-purpose industrial use). As for Python "stagnating as a scientific programming _language_", it is totally understandable. It tries to be the language of choice for everyone. However, as we know, "you can please some of the people all of the time, you can please all of the people some of the time, but you can’t please all of the people all of the time".


My point is, Python is no longer _trying_ to please scientific programming circles. It is developing, just not in these directions.


Understood, fair enough.


> but these days the development focus seems to be on type hints and unicode support.

It's a result of the language maturing. The tradeoff is between groundbreaking innovation and being a stable language with a large user base. You can't have it both ways.


There are features Python is missing as a scientific language that are just not a focus: proper multicore support (the GIL), and any static safety (which type hints sort of address, but not really).

It was fine in Python 2; it was cheap and cheerful. It feels like Python 3 is running out of ideas for improvement, and yet these are not at all in the scope of work.

Cf. Julia, which treats all of these as first-class problems.


I am wondering if this is really a problem? I just parallelized some numerical code, and instead of threads (GIL problem) I use processes. As far as I understand, the only drawback is that the parallel workers cannot share memory, but do you actually do that? I find it hard to reason about correctness in those cases.


This works, up to a point. But sometimes sharing memory is useful. And sometimes you may want to parallelise a local function, or a callable class instance, except you can't.

And when I say, "it works", that's clearly on Linux and Mac. On Windows multiprocessing is very severely stunted by lack of forking.

Meanwhile, probably even my watch supports threads.


This might interest you: you can now turn off compiler optimizations at the module level, using a macro. For some people this speeds up the development cycle, as it skips most of the time-consuming compilation activity.
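
If I remember right, the macro is Base.Experimental.@optlevel (available since 1.5); for example (the module name is hypothetical):

  module MyPlots
  Base.Experimental.@optlevel 0   # compile this module with minimal optimization
  # ...code that only runs a handful of times...
  end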


oh that is interesting. I'll give it a spin again this weekend. Thanks.


Caching has improved a ton in the last three minor releases.


Thanks. I watched the JuliaCon state of Julia presentation. As I wrote in my original post, I appreciate the investments the Julia core developers are making, that have improved but not eliminated this problem. I wish them luck.


I had an experience slightly like that at first where I was surprised to only be going 1-2x faster than my old MATLAB, but then I realized my code was full of trivially avoidable type instabilities and got another 100x speedup in Julia.
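
The classic example of what I mean (a sketch; @code_warntype flags the instability):

  function unstable(xs::Vector{Float64})
      s = 0                  # starts as Int, becomes Float64: type-unstable
      for x in xs
          s += x
      end
      return s
  end

  function stable(xs::Vector{Float64})
      s = zero(eltype(xs))   # accumulator matches the element type
      for x in xs
          s += x
      end
      return s
  end

  # @code_warntype unstable(rand(100)) shows s::Union{Float64, Int64}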


My experience with Julia is that for many (most?) things, you do get a free speedup basically. With other things though, it's about the same. What this means in practice is that if you expand your use cases enough, and integrate enough libraries, you'll incorporate some bottleneck that slows it down.

This is true of a lot of languages, python included, but I think when part of the language's selling point is something like "c-like speed with python-like syntax" it can be a little (although not entirely) misleading.

Having said that, I still prefer Julia over Python (at least for numerical computing; not so sure about other things). I just like the language more. I also think, with some exceptions, that I don't hit the same dependency hell I've run into with Python. Even now, I'm in the process of switching a project from Python to Julia, because the Python library I'm using depends on about six other libraries, at only specific versions, which you have to run in a dedicated standalone conda environment to avoid the wrong combination of packages, all of which are pre-Python-3, and so forth and so on. Even then, when you manage to thread the needle, it still falls apart later for unknown reasons. This is surely down to this particular package, but my experience with Julia is that things are much cleaner (R is usually similarly problem-free, but it's a lot slower, and Julia as a language is more coherent to me).

I'd really prefer something like Nim to be getting the attention that Julia is getting, something more general-purpose, but there's no consensus or momentum around anything like that at the moment. Maybe in the near future OCaml will pick up steam, or maybe the next version of C++ will essentially make it look like Python, or maybe it will be something not quite on people's radar yet, but for now it is what it is.


Probably because it is mostly compared to Python - which is arguably a low bar to beat.


Python is a high bar to beat, though: no one understands "beating Python" as beating some hypothetical pure-Python numerical library; it means beating mature precompiled C/C++/Fortran libraries with a small Python overhead. There are very few dynamic languages that can even compete on this level (without reaching for the FFI), and even static languages have trouble matching the level of optimization that went into those libraries.

But everything has compromises, and in Julia's case it's the JIT lag from aggressively optimizing compilation, which makes it a slow language for simple do-once tasks (the part that is usually written in Python itself) and a fast language when runtime performance matters most (the part usually done by the libraries in C/C++/Fortran), so it can come out slower in some situations. New releases are attacking this issue, though, including the ability to not optimize these do-once tasks (like plotting) and improved precompilation.

But as for now, the ability to not have black boxes, and to not be required to twist my code into a particular style just to avoid being prohibitively slow, makes up for the slow startup (plus, using Revise.jl to automatically recompile code in a REPL and keeping the session alive for my whole programming routine makes it not that big of a deal during development).


FWIW that was my experience too. And the JIT was too slow for my pace of code changes. Also really wanted to love it.

Maybe I need to try the new version.


You need to start using Revise.jl; that is the key to using Julia effectively. If the JIT is too slow even then, I really wonder what you are doing. I have no performance issues even on a 5-year-old computer, apart from the famous time-to-first-plot or bringing in a whole new library. I don't have issues with modifying my own code when using Revise.jl.


Common Lisp has had multiple dispatch for decades. Yet another Lisp feature "taken" by another language. Why not just start with a Lisp and improve it (e.g. Typed Racket, for performance), rather than repeatedly creating new languages and just adding a tiny piece of Lisp to them each time?


That is how Julia started. In the early days it was a Scheme reader macro. You can still run `julia --lisp` to get a Scheme prompt, since the frontend is still written in it. Obviously lots of improvements have happened since then. Julia itself also feels a lot like a Lisp, since Jeff is a huge fan, but of course people get hung up on the syntax.


It's nice that you can write Julia in a more Lispy way, but if you did that you'd probably be one of the only Julia users who did so, and your code would not be readily understandable or acceptable to the rest of Julia's users. And if you wanted to integrate other Julia code into your own, you'd be stuck with the more Python-like syntax that the vast majority of Julia code is written in.

In short, the Julia ecosystem is not a Lisp ecosystem.

If you wanted Lisp, you'd be far better off using a real Lisp to begin with, so you can unreservedly participate in an entire Lisp ecosystem, instead of using a language that hides its Lispyness behind a Python-like syntax.


Having a language that hides its Lispiness behind a Python-like syntax can be good for the Lisp community, though, even if Lispers don't use it. People do have prejudice against those parentheses, so having an entry point that is "Python-like" (the most popular style for beginners and non-programmers) but can still teach the core features of Lisp (everything is an expression, reasonably easy AST manipulation, macros, parts of CLOS like multiple dispatch) will only help people appreciate the languages it was inspired by. And possibly they'll also feel the limitations of non-sexp macros and decide to actually move to Racket or Common Lisp to free themselves from those restrictions once they become aware of them.

Possibly those people will even do something like Clojure for the Julia compiler (a mature version of [1]), with an entire community around it so you don't have to worry about writing "unacceptably Lispy" code (and it would certainly interop much better than Clojure does with Java).

[1] https://github.com/swadey/LispSyntax.jl


Creating your own little universe that nobody else using the language can understand sounds exactly like lisp ;).


"why not just start with a Lisp and improve it"

The reason is pretty clear. Mainstream programmers are just allergic to Lisp syntax.

They don't like all the parentheses, and want a more Algol-like syntax (which, these days, and especially for the users Julia is trying to attract, means a more Python-like syntax).

Lisp for a lot of developers also has an old, stale feel to it. Programming is dominated by fads, and most programmers (especially younger ones) want instead to chase the new shiny.

Decades ago, I read that there were already something like 4,000 languages. I expect in another couple of decades there'll be 4,000 more, with every new generation of developers eager to jump on the bandwagon of yet another new language.

Python, once the new kid on the block dethroning Perl, is now itself considered to be kind of old and stale, so people are looking for the new thing. Maybe Julia is it!


This is an issue for me. Although I find the Lisp idea beautiful, my programming usually has some relation to mathematics, to equations. In Lisp, math just doesn't look like math. Julia goes the opposite way: its use of Unicode for identifiers, and such things as allowing juxtaposition for multiplication (when not ambiguous), make math look more like math than in any other language.
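
A small example of what I mean (α is typed as \alpha<TAB> in the REPL):

  α = 0.5
  f(x) = 2x^2 + 3x + 1    # juxtaposition: 2x^2 parses as 2*(x^2)
  y = α * f(2.0)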


I do like the option of using Unicode symbols in Julia, but this need not be thought of as only a language feature.

Emacs allows you to substitute any text for any other when viewing a document, so you can pretty much do the same in Emacs for any language.

I did this when editing LaTeX papers full of logic symbols, which Emacs would display as the symbols themselves rather than as the underlying LaTeX markup.

It could do the same for Lisp or any other language.


Vim can do this too (with a plugin). And you can enter Unicode in the Julia REPL by typing the TeX command and hitting TAB.


> Emacs allows you to substitute any text for any other when viewing a document

Do you have a reference or search term for that? My Google-fu is failing me.


It's called prettify-symbols-mode: http://www.modernemacs.com/post/prettify-mode/


I believe the bigger problem is that macros are too powerful. This leads to every project becoming its own DSL, incompatible with other projects. So code sharing and the availability of tested libraries are low, which leads to a low adoption rate.

Of course, if you want to build your own (mega-)fortress all by yourself, Lisp is the language to go to. Most people and stakeholders prefer not to.


Yes, common lisp has certainly had multiple dispatch for decades. I think what makes Julia interesting is that the Dispatch Ratio and reuse is significantly higher in Julia than in other languages with multiple dispatch. Some of that is certainly attributable to the ways in which Julia differs from a lisp.

See Sec 6.2 "Julia: dynamism and performance reconciled by design" in https://dl.acm.org/doi/pdf/10.1145/3276490.


As I mention in the article, Julia is in a sense Lisp-based, and you can easily see the AST of your functions. You can write +(a, b, c) in Julia, for example.
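
For instance, at the REPL:

  julia> dump(:(a + b + c))
  Expr
    head: Symbol call
    args: Array{Any}((4,))
      1: Symbol +
      2: Symbol a
      3: Symbol b
      4: Symbol c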


But unlike Lisp, the Julia language is not homoiconic, so its ASTs don't look like the parent language.


Just swap out the parser ;) https://github.com/swadey/LispSyntax.jl


Don't fight the universe. Common Lisp is too old for the mainstream to care; you can argue the same for ES6, Python, PHP... they were all ridiculously bad and evolved by absorbing old ideas.

If you want to make CL exist in a commercially driven world, make a successful CL business; that's the only thing this soil understands.


Fun fact: You're commenting on Hackernews, which was made with extra cash belonging to a man whose fortunes came from a successful CL business.


As if I didn't know that.

ps: viaweb was ages ago; it was cool and famous, and the world didn't care. Social structures are odd and people want to belong to their own cult. PHP, JS, and Python were a "better" fit for the uneducated people of that time; now they are the world and still only care about their own little world, until they get a new idea, which they now see the value of and will integrate shamelessly. CL is nowhere to be found. It's a sad fact of life: CL heads don't have the desire to dominate, so their language features will be assimilated silently.


>Yet another Lisp feature "taken" by another language

Julia is a Lisp.


This might sound harsh, but Common Lisp fans should stop whining about what everybody else is doing and start making some killer apps to bring back some life into their language. Hearing all about Lisp's past glories is far less convincing than working code. It's just off-putting to hear complaints about what someone else did. As always in free software, the one who puts in the hours gets to call the shots.


Please don't take HN threads further into programming language flamewar. I realize that you're trying to defend something good, but good intentions don't remove the duty to take out name-calling and swipes.

https://news.ycombinator.com/newsguidelines.html


There's a psychological paradox at play. Many idealistic Lispers want a lot of things, but they don't play the current game; they (I can partially include myself) care more about the tool than the product, and so it dies a quick death.

For this to happen, they have to turn their view of Lisp from a thing of beauty into an effective tool that wins on pragmatic points: time to develop, number of bugs, ease of adaptability. CL has a hammer for all of these points, but you have to see the field that way and produce things. Basically, "less is more" vs. "perfectionism".


As a huge Lisp fan (though more of Scheme than Common Lisp), I kind of agree.

If a Lisper had come up with Jupyter notebooks before Python did, they could have captured the interest of academia.

If they'd come out with a Ruby on Rails-style framework before Ruby did, they could have made inroads into the web development community.

That is, of course, if you could somehow get developers to embrace all the parentheses, which seem to be an albatross around Lisp's neck that it can never shake.

Lisp does have one killer app, though: Emacs.


They did. It was called org-babel. It is still superior to notebooks because you can weave the source code and mix a dozen languages.


Don't get me wrong, I love org, but this is really not nearly the same as Jupyter notebooks.

Jupyter notebooks are web based, and only require a web browser to use. That's not the case for org.

Using org effectively also means using Emacs, which is a high barrier to entry for most people.

Jupyter notebooks have no such requirements, and anyone can get rolling by just installing some packages and opening a web browser.

Jupyter notebooks are also really polished, while org is more bare bones visually.

Jupyter notebooks also have all these tools integrated into them (the graphing and inline image display tools are particularly appealing).

Yes, you could potentially do all of these with org, but it'd take way more knowledge and setup than is required for Jupyter notebooks.

If something as visually appealing, powerful, and easy to set up and use existed for people who don't know Emacs, then it'd have a chance of getting widespread adoption. But that's not org.


It's better. Already serialized. Can be kept under version control. Can be shared between people without needing to signup for anything. And as already stated, code is automatically extractable.

Notebooks are only a thing because you don't need to install anything on work laptops and get around terrible IT policies.


> This might sound harsh, but Common Lisp fans should stop whining about what everybody else is doing and start making some killer apps to bring back some life into their language.

They can't. They are all busy implementing another Lisp in Lisp.



I wish Julia were much more Lisp/Scheme-like in terms of syntax. If they'd just focused on making a more performant typed Scheme (or improving one of the existing ones), that would be much more interesting to me.

I took a dip into Julia recently, and found it to be a mishmash of Lisp, Haskell, Matlab and Fortran ideas, wrapped in Python-like syntax.

I guess that can be appealing for people coming from those worlds, and for those for whom speed trumps everything else.

For me, the loss of a Lispy syntax and the price of working with a relatively immature language weren't worth it. I'd rather just use something with reasonable performance that's more Lispy.


At a conference presentation the other day a speaker made the claim that well-written Julia code was as fast as well-written C code while as expressive as Python, so it solved the two language problem.

My Python package uses a C extension for performance. That C extension uses AVX2 intrinsics. How well does Julia support intrinsics?

Even supporting something like __builtin_popcountll() would be nice, but http://rosettacode.org/wiki/Population_count#Julia suggests that it's not supported in Julia.

I've no experience with Julia and my attempt at finding this out on my own failed - does anyone here know about Julia's support for intrinsics?


Check out LoopVectorization: https://github.com/chriselrod/LoopVectorization.jl

From its benchmarks (https://chriselrod.github.io/LoopVectorization.jl/latest/exa...), a 9-line naive matrix multiplication routine in Julia + LV slightly edges out Intel's MKL.
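
The routine in question is essentially this (paraphrased from the LV docs; @avx handles the reordering, unrolling, and vectorizing):

  using LoopVectorization

  function mygemm!(C, A, B)
      # naive triple loop over the output and the shared dimension
      @avx for m in axes(A, 1), n in axes(B, 2)
          Cmn = zero(eltype(C))
          for k in axes(A, 2)
              Cmn += A[m, k] * B[k, n]
          end
          C[m, n] = Cmn
      end
      return C
  end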


It looks like that @avx macro just tells Julia that it's okay to use the AVX instructions?

My specific question is, how do I tell Julia that I want to compute the popcount of the intersection of two byte strings of length 256 bytes?

A reference C implementation is at http://www.dalkescientific.com/writings/diary/archive/2020/1... in byte_intersect_256() and threshold_bin_tanimoto_search(); my blog post shows the important parts, and I link to the full definition for the C code.


@avx does a ton more than just use AVX instructions. It'll reorder and unroll loops when advantageous, swap out some functions for more vectorizable versions, and apply a few other tricks.

Julia uses AVX instructions by default if your code is amenable to it.


Sorry, I see how what I wrote could be interpreted that way, and part of what I wrote was out of ignorance. While I didn't write it, I assumed the macro was doing some sort of equivalent to the template metaprogramming I've heard about in C++ to do similar things.

What I didn't see was how I could use the AVX2 instructions myself.

Checking now, since Julia's count_ones() maps to the LLVM popcount instruction, and recent clang versions know how to optimize that fixed-length sequence in C even for AVX-512, the Julia equivalent to the code I wrote should have good performance.
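
For reference, a direct (untested) translation of my C loop would be something like this, assuming the byte strings' lengths are multiples of 8:

  function byte_intersect_256(fp1::Vector{UInt8}, fp2::Vector{UInt8})
      # view the byte vectors as 64-bit words, like the uint64_t cast in C
      a = reinterpret(UInt64, fp1)
      b = reinterpret(UInt64, fp2)
      n = 0
      @inbounds for i in eachindex(a)
          n += count_ones(a[i] & b[i])
      end
      return n
  end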

There are a few optimizations (keeping one AVX register loaded with a constant byte string, and using prefetch instructions) which might be missing. I'll be talking with the conference participant who brought up Julia to work this out in more detail.

Thanks for the comment!


Ah I see, yeah I misunderstood you.

> What I didn't see was how I could use the AVX2 instructions myself.

If you ever find yourself in a situation where you want manual control over vectorization, the package SIMD.jl [1] is pretty good for manual, handwritten vectorization. There's also VectorizationBase.jl [2] which LoopVectorization.jl uses. Which one of these two packages are most appropriate just kinda depends on what sort of interface you prefer.

[1] https://github.com/eschnett/SIMD.jl

[2] https://github.com/chriselrod/VectorizationBase.jl


Thanks for the pointers! The first looks like it gives a language for SIMD operations, but not all of the intrinsics. The second is for vectorization, which also doesn't include all of the intrinsics.

However! The second shows an example with Base.llvmcall, which lets me write LLVM assembly. That should let me do anything I want.

However however - the stuff I actually do doesn't need all that, and could likely be implemented with LoopVectorization.


My understanding is that the Julia community is quite interested in having SIMD via e.g. AVX “just work”. I recall reading this post on it a while back: https://juliacomputing.com/blog/2017/09/27/auto-vectorizatio...


Sure, but you can only do that if either there's a way to express what you want directly or, in the example you gave, there's a common idiomatic style that the compiler can recognize and handle.

What is the idiomatic way to write the popcount of the intersection of two 256-byte byte strings? My C code is:

  #include <stdint.h>

  static int
  byte_intersect_256(const unsigned char *fp1, const unsigned char *fp2) {
      int num_words = 2048 / 64;  /* 256 bytes = 32 64-bit words */
      int intersect_popcount = 0;

      /* Interpret as 64-bit integers and assume possible mis-alignment is okay. */
      const uint64_t *fp1_64 = (const uint64_t *) fp1, *fp2_64 = (const uint64_t *) fp2;

      for (int i=0; i<num_words; i++) {
          intersect_popcount += __builtin_popcountll(fp1_64[i] & fp2_64[i]);
      }
      return intersect_popcount;
  }
I haven't figured out the Julia way to write it so it would use the POPCNT instruction (if available), the AVX2 popcount technique (if available), or the VPOPCNTDQ AVX-512 instruction (if available) - falling back, I suppose, to the SSSE3 and Lauradoux implementations - the last being the fastest generic C implementation I found. (See https://jcheminf.biomedcentral.com/articles/10.1186/s13321-0... ).


Isn’t __builtin_popcountll() exactly the same as Julia’s count_ones() function?


Thanks! Yes!

I kept looking for "popcount" and "population count", not "Number of ones in the binary representation of x." A Julia person should update the Rosetta Code implementation to use that function.

That function maps to ctpop_int which is implemented as uu_iintrinsic_slow(LLVMCountPopulation, ctpop_int, u). I ... don't know what the "slow" means.

There are follow-on problems, like my AVX2 implementation for 128 bytes stores one of the byte strings in AVX registers, and I get extra performance by prefetching. Once I get to the AVX-512 implementation, the same will happen. I don't know how well the optimizer can handle those details. ("My" = implemented by HN user nkurz.)

But this is enough of a clue for a first decently-high-performance implementation - thanks again! I talked with the person from the conference who promoted Julia. He's going to work on this/we'll work on it. I'm looking forward to seeing how the timings work out.


You know what? In the past, I toyed with the idea of building a Julia IDE. In the end, I decided to build something else.

So the pain point is still there. Looking at this thread, it could be a good idea to build a Julia IDE. Then you could apply for funding in the next YC batch. Six months should be enough to build an MVP.

Bonus points if you could build the Julia IDE in Rust. On Show HN, you'd surely get a lot of karma points. But we need to be realistic: Rust does not have a solid GUI library. If I chose to build a Julia IDE, I would use either C++ with Qt or C++ with wxWidgets.


It doesn't sound good that you don't trust Julia as the implementation language for the IDE. Didn't Julia want to solve the two-language problem?


I got the impression that Julia caters to the numerical computing domain (data science and the like), and is not a general-purpose programming language like Python. Does Julia have a good desktop programming library?

I guess I am wrong. However when I looked at the website (https://julialang.org), in the Ecosystem section, they listed these things:

Visualization, General Purpose, Data Science, Machine Learning, Scientific Domains, Parallel Computing.

I think they should market it in a different way. "General Purpose" should be expanded to "Web Programming", "Desktop Programming", etc.


I tried to get started with Julia a couple months ago, downloaded several libraries, and they wouldn't compile due to missing dependencies. Is there a problem with the package manager?


That's weird; using libraries almost always just works. Did you install them via the package manager? If you just git cloned them, then you will need to tell the package manager about them, e.g. via `add`ing/`dev`ing the local path, plus `Pkg.resolve`.

I recommend asking for help on the Julia Discourse. http://Discourse.julialang.org/


The package manager is one of my favorite things about Julia. No more dependency hell. Just type "add <package>" in the REPL.
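
(To be precise, you hit ] first to enter Pkg mode; the package names below are just examples.)

  julia> ]
  (@v1.5) pkg> add DataFrames CSV
  (@v1.5) pkg> dev ~/code/MyLocalPackage   # for a local checkout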


I have only seen this happen with packages that depend on non-julia libraries, such as ARPACK, but the move to providing binaries with BinaryBuilder should fix this.


That might well be the issue, these were game/gui libraries. What's the timeframe on BinaryBuilder?


BinaryBuilder/BinaryProvider are 100% functional and have been deployed and working for a long time now. The timeframes for individual packages to move to using them instead of ad-hoc build scripts, however, varies. It's generally very quick to do this, but complicated chains of binary dependencies can be a pain.


https://dl.acm.org/doi/pdf/10.1145/3276490 ("Julia: Dynamism and Performance Reconciled by Design") section 4.2.1 shows how multiple dispatch enables a library to compute the derivative of another program without modifying that other program to be aware of the derivative-computing library.
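
The trick is essentially dual numbers riding on dispatch. A minimal toy sketch (my own, not the paper's code), where f is written with no knowledge of the AD type:

  struct Dual <: Number
      val::Float64
      der::Float64
  end
  Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
  Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)
  Base.convert(::Type{Dual}, x::Real) = Dual(float(x), 0.0)
  Base.promote_rule(::Type{Dual}, ::Type{<:Real}) = Dual

  f(x) = x * x + 3x      # an ordinary generic function
  f(Dual(2.0, 1.0))      # Dual(10.0, 7.0): f(2) and f'(2)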


How easy would it be to use Julia on a problem that would take advantage of the 16 hyperthreaded cores on my computer? Does it use threads, async/await mechanism, something else? Is it intuitive to use?


Julia's main parallel primitive (Threads.@spawn) [1] is heavily inspired by Go: you just run anything as a managed lightweight task, using a channel to pass data and a fetch to sync. That API is quite recent though (Julia 1.3), so there is still a lot of work going on, both on the language side and the library side, to provide higher-level mechanisms, such as [2].

[1] https://julialang.org/blog/2019/07/multithreading/

[2] https://juliafolds.github.io/data-parallelism/tutorials/quic...
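
For a flavor, a minimal threaded-sum sketch (start Julia with e.g. -t 16; psum is just an illustrative name):

  using Base.Threads

  function psum(xs)
      # one chunk per thread, one lightweight task per chunk
      chunks = Iterators.partition(eachindex(xs), cld(length(xs), nthreads()))
      tasks = map(chunks) do c
          Threads.@spawn sum(@view xs[c])
      end
      return sum(fetch.(tasks))
  end

  psum(rand(10^8))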


As easy as starting up the interpreter with -p auto and then using, e.g., pmap instead of map.


I wanted to like Julia, but loading the graphing library takes longer than starting the entirety of MATLAB. Not a good design.


Why should I use Julia over C++11?


It's another R, something that seems like progress, but will end up stopping further progress.


I didn't like R at first. Some scientists I work with would say, "it's amazing, it just works the way I think". One Biostatistics class later, and I get it. I still don't love it, but I get it. Plus, the RStudio work environment and ggplot2 are simply amazing.


What would true progress look like to you? Specifically regarding Julia's main use case of interactive yet performant programming?


Mind elaborating on why you think so?


Here is my main gripe with Julia: They blatantly copy Matlab. The same is also true to a lesser degree of Python's numpy and matplotlib, but Julia goes the whole nine yards, by basically replicating the syntax and copying most of the numerical APIs including the indexing convention. I understand why they do it and I get that Matlab is hugely popular, I just wish that there was more creative energy in the Open Source world and not just rehashing of 20-40 year old designs.


This is not true. Indexing works very differently in Julia than it does in Matlab. There is some overlap in function names, but less than you're implying, mostly in the names that everyone uses anyway (sin, cos, etc.), and even where the names are the same, the API details often aren't. Ironically, the number of overlapping method names has increased because Matlab has been copying method names from Julia in recent years. That said, Julia and Matlab are completely different languages, so I'd encourage you to look a step further.


The thing is, there are parts of Matlab's syntax which are fantastic, and I would be disappointed if they didn't copy it.


If this is blatant, what is Octave?


Wow, Julia is a "baby language" and there is not a single project where its dominating the usage landscape.

What's next, the unreasonable effectiveness of an idea I just had last night?

Come back in 10 years.


SciML and JuMP are both best in class ecosystems, in some cases by enormous margins. The PPL landscape is also very significant.



