I am one of the maintainers. From first-hand information, Torch is used by:
- Facebook
- Google DeepMind, and Google Brain is slowly moving over as well.
- Certain people at IBM
- NYU
- IDIAP
- LISA Lab (not exclusively, but some students have started using it)
- Purdue e-lab
- Several smaller companies (somewhere between 10 and 100 of them)
There will definitely be several commonly asked questions, so here's my personal perspective on them.
Why torch/lua, why not python?
No reason. Just because. Mostly because LuaJIT is awesome (with its quirks) and extremely portable; we routinely embed torch in tiny devices, which afaik isn't practically possible with python.
Is Torch better than Theano/etc.?
Better and worse. Every framework has its oddities.
I like the super-simple design and the compactness of traversing from high-level easy-to-use API to bare-metal C/assembly.
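To give a concrete feel for that high-level layer, here's a minimal sketch using the standard nn package (a toy two-layer net and one forward/backward pass, nothing exotic):

    require 'torch'
    require 'nn'

    local mlp = nn.Sequential()
    mlp:add(nn.Linear(10, 25))   -- 10 inputs -> 25 hidden units
    mlp:add(nn.Tanh())
    mlp:add(nn.Linear(25, 1))    -- 25 hidden -> 1 output

    local x  = torch.randn(10)       -- random input vector
    local y  = mlp:forward(x)        -- forward pass
    local dy = torch.ones(1)         -- pretend gradient from some loss
    local dx = mlp:backward(x, dy)   -- backprop through the net

Every one of those modules bottoms out in C if you care to dig down.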
Also, torch's ecosystem wasn't grown with exclusively lab experiments in mind; with Yann's strong robotics research, packages were developed with practicality in mind all along. Custom chips are being developed for convnets (TeraDeep), and they use Torch.
I like Julia a lot, and it's definitely cool, but its packages for NNs and GPUs aren't very strong yet, so Torch's advantage over Julia is simply the code that's already written.
If there are any more questions, feel free to ask them here or just open an issue on the github package.
Thanks for reading.
Edit: Apologies for the formatting; I'm not very good at hackernews.
I'd love to try out Torch7, but I've grown accustomed to Theano's automatic differentiation, which is the killer feature for me. That's the one thing keeping me from exploring the alternatives. Not having to calculate any gradients makes trying different architectures and activation functions a lot less daunting.
Are there any plans for Torch7 to support something similar? I suppose it would be nontrivial, since to my knowledge Torch7 does not use symbolic representations of the computations internally, which is what enables Theano to do automatic differentiation in the first place.
I'm writing a calculus package that's inspired by Julia's, which would have an automatic differentiation engine, but it is nowhere close to being finished, nor is it a priority at this point. So sadly, we won't see Sander using torch anytime soon :)
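For the curious, the core idea behind automatic differentiation is small enough to sketch in a few lines of plain Lua with dual numbers (forward-mode AD). To be clear, this is just a toy illustration of the concept, not how Theano works internally and not the design of the package I mentioned:

    -- Dual numbers: carry a value v and a derivative d through the math.
    local Dual = {}
    Dual.__index = Dual

    local function dual(v, d)
      return setmetatable({v = v, d = d or 0}, Dual)
    end

    Dual.__add = function(a, b)
      return dual(a.v + b.v, a.d + b.d)
    end

    Dual.__mul = function(a, b)
      -- product rule: (uv)' = u'v + uv'
      return dual(a.v * b.v, a.d * b.v + a.v * b.d)
    end

    -- f(x) = x*x + x, so f'(x) = 2x + 1
    local function f(x) return x * x + x end

    local y = f(dual(3, 1))   -- seed dx/dx = 1
    print(y.v, y.d)           -- prints 12  7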
Torch is an amazing project and has been a great inspiration for my leap into deep learning. I wish the maintainers the best of luck. It's ambitious to write a deep learning lib. That being said, you guys are doing a great job.
The comparison helps quite a bit. Mind if I lift this for our site? I'd link back to this thread as well.
My biggest annoyance with LuaJIT is the lack of 64-bit addressing, and there's no easy fix for this. Mike Pall said he'd fix it in LuaJIT 3.0 (with an overhaul of the GC), whenever that'll be.
Edit: to clarify, this does not mean torch can't use more than 2GB; native lua allocations can't exceed 2GB, but ffi-based or C-based allocations (which is how torch tensors are allocated) have no such limit.
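If it helps, the pattern looks roughly like this (a sketch of an FFI-side allocation, not torch's actual allocator code):

    local ffi = require('ffi')
    ffi.cdef[[
    void *malloc(size_t size);
    void free(void *ptr);
    ]]

    -- 4GB: far beyond what the Lua-side GC heap can address,
    -- but fine for a C allocation reached through the FFI.
    local nbytes = 4 * 1024 * 1024 * 1024
    local raw = ffi.C.malloc(nbytes)
    assert(raw ~= nil, 'malloc failed')
    local buf = ffi.gc(ffi.cast('double *', raw), ffi.C.free)

    buf[0] = 1.0   -- indexable like a C array
    print(buf[0])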
This is a big annoyance for me as well. I use jemalloc to make sure that my FFI allocations don't 'steal' this lower memory from my pure Lua allocations. It works great on Linux, but only so-so on OSX.
Sure, the best way to get attention is to open an issue on github.com/torch/torch7 (that way you get the attention of all the core developers). If you'd like to ask something more discreet, I'd suggest you email one of the four here: https://github.com/orgs/torch/members
It may seem obvious, but I feel like the page you linked could use an occurrence of the word 'C++'. The code snippet was obvious to me, but just a thought.
Thanks for the feedback! While the code snippet is from C++, we have language wrappers for Java, R and Fortran. We'll try to be more clear about this in our documentation.
Thanks for the response. Currently I'm doing R&D with easier-to-use high-level representations, and if it proves fruitful I'll be looking at getting the most cost-effective speedups in about 6 months.
What's the license for the free edition? I'd be sticking to open-source code until we secure capital, but would be very interested in your commercial services down the line.
That comment from LeCun is very interesting, because this means he moved from his own language, Lush, which he maintained for many years, to Torch7. And LeCun has very high standards for expressiveness and efficiency.
Oh, don't get me started about Lush :). This is my personal opinion, but I tried to use lush for a bit and ran away as fast as I could. In retrospect, though, that's probably because it was the first functional programming language I was introduced to.
Also, torch has a great set of people building the ecosystem, with lots of communication and collaboration online (github, forums, across labs), which lush lacked.
I found https://github.com/BVLC/caffe to be 20 to 40x faster for image classification when comparing with Overfeat, which uses Torch - YMMV (The type of BLAS you use makes a gigantic difference. MKL was 2x faster than ATLAS and 5x faster than OpenBLAS). Caffe also has a more active community and cleaner code IMO.
caffe definitely grew in popularity very quickly, and it does a narrow set of tasks very well, but I don't agree that the code is cleaner (in fact I think the opposite), and I don't see the design as particularly broad-minded (i.e. generic neural networks, or a general scientific computing framework).
I haven't used Torch7 yet, but from heavy usage I can recommend http://www.nongnu.org/gsl-shell/.
GSL-Shell combines Lua + LuaJIT + GSL (the GNU Scientific Library) + additional syntactic sugar that helps with matrix/vector calculations. I use it for all kinds of linear algebra projects and gsl-shell does an excellent job.
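A quick taste, from memory of the gsl-shell docs (so details may be slightly off); the |i,j| form is gsl-shell's short function syntax:

    -- 3x3 Hilbert matrix built from a generator function
    local m = matrix.new(3, 3, |i,j| 1/(i + j - 1))
    local v = matrix.vec {1, 2, 3}
    print(m * v)   -- matrix-vector product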
I'd love to try this out, but it seems to depend on some luarocks that are unavailable (sundown, cwrap) - or at least, my attempt at a standard install with the instructions as given resulted in luarocks not finding any of the required rocks. Anyone know what might be causing this?
(EDIT: Never mind, I figured it out - if you have luarocks installed previously, you must do:
I don't understand: how is this better than/different from python (with numpy, scipy, theano and FFI)? Is this a language preference thing where people just want to avoid python?
Thanks to LuaJIT, the code you write is compiled to efficient bytecode that runs very fast. Lua as a language is simple enough to produce light bytecode that can fit in the CPU's cache.
Numpy will always do the job, but when speed is really critical, you might want to look into it.
I'm not an expert in LuaJIT, but it sounds unlikely that the performance characteristics of torch7 are due to the efficiency of Lua bytecode. The speed with which you can train NNs will be dominated by the performance of the linear algebra libraries which are utilised by the numerical optimisation algorithms (SGD, L-BFGS and the like). Almost everyone ends up using some variant of the BLAS libraries for this.
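To make that concrete, the hot path when training is dominated by calls like this one, which Torch hands straight to whatever BLAS it was built against (ATLAS, OpenBLAS, MKL, ...); the variable names here are just illustrative:

    require 'torch'

    local w = torch.randn(512, 784)   -- e.g. a weight matrix
    local x = torch.randn(784, 128)   -- e.g. a minibatch of inputs
    local y = torch.mm(w, x)          -- a single gemm call underneath
    print(y:size())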
Someone I met recently said this (in rough words): "When I'm writing rough code in lua, it's completely acceptable to do a couple of for loops here and there without a disaster in speed. With python, this was a complete meltdown"
I'm not saying it all boils down to just this quote, but I thought it was interesting and wanted to relay it here.
Specifically, LuaJIT has a very well-designed bytecode optimized for decoding and with specialization for types. The bytecode dispatch is hand-written in assembly which exploits this. And this is all even if your code never sees a JIT -- LuaJIT JIT'd numerical code is competitive in microbenchmarks.
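As a toy illustration of the kind of plain loop being discussed - written naively, with no vectorized library calls - LuaJIT will trace and compile something like this once it gets hot:

    -- naive dot product over plain Lua tables
    local function dot(a, b, n)
      local s = 0
      for i = 1, n do
        s = s + a[i] * b[i]
      end
      return s
    end

    local n = 1e6
    local a, b = {}, {}
    for i = 1, n do a[i], b[i] = i, 1 / i end
    print(dot(a, b, n))

In CPython the same loop crawls unless you hand it off to numpy; under LuaJIT it stays in the neighborhood of compiled C.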
Wow, this looks awesome (I usually use C). I had added Lua to a previous project which had been entirely C and it simplified the "outer bits" drastically, as well as being relatively painless to set up.
It seems to be written mostly in C, so I wonder how difficult it would be to port this library into an interpreted Lua 5.2 environment. Has anyone tried that yet?
I thought LuaJIT and other Lua runtimes were great for scripting and embedded environments, while Julia focused primarily on being a MATLAB replacement (i.e. desktop environments).
EDIT: Based on comments in the thread, seems both could do the same job but Torch7 has libraries that make running on GPUs easier vs. Julia.
I came across Torch in this FB comment [1] by Yann LeCun:
Apparently it is used in many deep learning labs. I think that in Toronto they mostly used Matlab and Python; does anybody know if this is still true?
[1] https://www.facebook.com/yann.lecun/posts/10152077631217143?...