
I don't think Julia belongs in the same list as C++, C and Fortran. It is true that for some algorithms it is almost the same speed as C++ out of the box, but for many others it is still slower by factors of 10 or even 100. It also often requires significant tweaking to reach its best performance (e.g. don't use abstract types), so it is almost like saying Python is a fast language because you can use Cython or Pythran. I really wish Julia fans would stop overstating the language's capabilities; it really does a disservice to an otherwise great language.


I was talking about reality: https://www.hpcwire.com/off-the-wire/julia-joins-petaflop-cl...

Nothing overstated; just the circumstance in the real world that these four languages are the only ones used in the most demanding HPC.

C and Fortran are also not as fast as they could be if you use them incorrectly. Using concrete types is not “significant tweaking”.
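
To make the concrete-versus-abstract point tangible, here is a minimal sketch. It is my own illustration, not code from the linked posts, and the names `loose` and `tight` are made up. It only uses `eltype` and `isconcretetype` from Base to show that `Complex` on its own is not a concrete type, while `Complex{Float64}` is:

```julia
# Illustrative sketch; `loose` and `tight` are hypothetical names.

# Element type is the unparameterized `Complex`, which is not concrete.
loose = Complex[1.0 + 2.0im, 3.0 - 1.0im]

# Element type is fully specified, so every element has the same known layout.
tight = Complex{Float64}[1.0 + 2.0im, 3.0 - 1.0im]

println(eltype(loose), " concrete? ", isconcretetype(eltype(loose)))
println(eltype(tight), " concrete? ", isconcretetype(eltype(tight)))

# Both arrays hold the same values; only the declared element type differs.
println(sum(loose) == sum(tight))
```

An array whose element type is not concrete generally has to store references to separately allocated values, which is why an annotation this small can matter for speed.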


This is just false. You cannot find reasonably well-written Julia code that runs 10x slower than equivalent Fortran.


https://jochenschroeder.com/blog/articles/DSP_with_Python2/

There, the "naive" Julia code, which simply implements the algorithm the way I would in Fortran, is a factor of 10 or 15 slower than the optimised Cython version (which would be the same as a regular C version); the optimised Julia version is still a factor of 5 slower than the Cython and Pythran versions. Can you show me how to optimise it so that Julia performs on par with Pythran or Cython?


The naive Julia code made a few pretty fundamental mistakes (Complex vs Complex{Float64}, and row vs column major). The following is non-optimized Julia code that is roughly 6x faster (and much simpler) than the "optimized" code in the blogpost. Some further optimizations would give another 2-4x over this (like using StaticArrays), but I'll leave that as an exercise for the reader.

    apply_filter(x, y) = vec(y)' * vec(x)

    function cma!(wxy, E, mu, R, os, ntaps)
        L, pols = size(E)
        N = (L ÷ os ÷ ntaps - 1) * ntaps  # ÷ or div are integer division
        err = similar(E)  # allocate array without initializing its values
        @inbounds for k in axes(E, 2)  # avoid assuming 1-based arrays. Just need a single inbounds macro call
            @views for i in 1:N   # everything in this block is a view
                X = E[i*os-1:i*os+ntaps-2, :]
                Xest = apply_filter(X, wxy[:,:, k])
                err[i,k] = (R - abs2(Xest)) * Xest  # abs2 avoids needless extra work
                wxy[:,:,k] .+= (mu * conj(err[i,k])) .* X  # remember the dots!
            end
        end
        return wxy, err  # note order of returns, seems more idiomatic
    end
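
On the row vs column major point, a small sketch may help. It is an illustration of the general idea rather than anything taken from the blog post, and the function names `sum_colmajor` and `sum_rowmajor` are made up. Julia stores matrices column by column, so the loop over the first index should normally be the innermost one:

```julia
# Illustrative sketch; function names are hypothetical.

# Inner loop walks down a column, i.e. over elements adjacent in memory.
function sum_colmajor(A)
    s = zero(eltype(A))
    for j in axes(A, 2)
        for i in axes(A, 1)
            s += A[i, j]
        end
    end
    return s
end

# Inner loop walks along a row, jumping size(A, 1) elements on each step.
function sum_rowmajor(A)
    s = zero(eltype(A))
    for i in axes(A, 1)
        for j in axes(A, 2)
            s += A[i, j]
        end
    end
    return s
end

A = reshape(collect(1.0:12.0), 3, 4)

# The second element in memory is A[2, 1], not A[1, 2].
println(vec(A)[2] == A[2, 1])

# Same result either way; only the memory access pattern differs.
println(sum_colmajor(A) == sum_rowmajor(A))
```

On a matrix this small the two orderings are indistinguishable; any difference would only show up on arrays large enough that cache behaviour matters, and I have not benchmarked it here.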



