One quick comment: in general it is a bad idea to compute the inverse of a matrix (to solve a linear system). It's much better to compute a QR factorization or SVD instead (or simply call a least-squares solver).
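To make that concrete, here's a toy numpy comparison (my own sketch, nothing to do with the repo's code): the explicit inverse is slower and typically less accurate than a factorization-based solve.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((500, 500))
b = rng.standard_normal(500)

x_inv = np.linalg.inv(A) @ b                      # explicit inverse: avoid this
x_solve = np.linalg.solve(A, b)                   # LU-based solve
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # SVD-based least-squares solve

# Compare residual norms ||Ax - b||; the solve/lstsq variants are usually smaller.
print(np.linalg.norm(A @ x_inv - b))
print(np.linalg.norm(A @ x_solve - b))
print(np.linalg.norm(A @ x_lstsq - b))
```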
GMRES is definitely my go-to these days. It is worth noting that direct methods do have the benefit of letting you quickly solve many successive linear systems involving the same matrix but different right-hand sides. But iterative methods scale so well for large sparse problems that they are very often the only tool to consider.
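A rough sketch of that trade-off with SciPy (toy matrix, my own example): a direct method factors once and reuses the factorization for many right-hand sides, while an iterative solve never stores a factorization at all.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu, gmres

n = 5_000
# Sparse, diagonally dominant tridiagonal test matrix
A = sp.diags([-1, 4, -1], [-1, 0, 1], shape=(n, n), format="csc")

# Direct: factor once, then each new right-hand side is a cheap triangular solve.
lu = splu(A)
for _ in range(100):
    b = np.random.rand(n)
    x = lu.solve(b)

# Iterative: no factorization kept around; scales to problems where an LU would not fit.
b = np.ones(n)
x, info = gmres(A, b)   # info == 0 means it converged
print(info, np.linalg.norm(A @ x - b))
```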
Block Krylov methods are a thing, but I haven't experimented with them yet.
Has anyone implemented them in production? I read their paper a while ago, but thought that the dual methods they showed might have practical limitations (like insane cache misses compared to more standard methods).
These problems you sketch can already be overcome by using the approximate eigenspectrum information of the matrix obtained as a by-product of Krylov methods. This has been invented many times and goes by names like 'thick restart', 'augmented Krylov subspace methods', and 'deflation'. In particular it was developed for scattering problems with many electromagnetic incident waves hitting an object, changing only the rhs.
You can also use an approximate LU decomposition as a preconditioner for Krylov methods.
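For instance, a minimal SciPy sketch (my own toy example): an incomplete LU of A used as the preconditioner M for GMRES.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import gmres, spilu, LinearOperator

n = 5_000
A = sp.diags([-1, 4, -1], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spilu(A, drop_tol=1e-4)                  # approximate (incomplete) LU of A
M = LinearOperator((n, n), matvec=ilu.solve)   # M^{-1} acts roughly like A^{-1}
x, info = gmres(A, b, M=M)
print(info, np.linalg.norm(A @ x - b))
```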
Similarly, Tikhonov regularization (solving for a range of slightly perturbed matrices "A + lambda I", where lambda is a parameter) is easily tackled with Krylov subspace methods by noting that the Krylov subspace is invariant under shifts like these. So an orthonormal basis for the Krylov subspace only has to be found once, and can then be reused for every lambda of interest.
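A quick numerical illustration of that shift invariance (my own toy check): the monomial Krylov bases of A and A + lambda*I span the same subspace, so stacking them does not increase the rank.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, lam = 50, 5, 0.37
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

def krylov_basis(M, b, m):
    # Columns b, M b, M^2 b, ..., M^(m-1) b
    cols = [b]
    for _ in range(m - 1):
        cols.append(M @ cols[-1])
    return np.column_stack(cols)

K = krylov_basis(A, b, m)
K_shift = krylov_basis(A + lam * np.eye(n), b, m)

# Both ranks print as m: the two bases span the same m-dimensional subspace.
print(np.linalg.matrix_rank(K), np.linalg.matrix_rank(np.hstack([K, K_shift])))
```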
Depends on the number of unknowns, sparsity of the matrix, whether you want 3 correct digits or 16, etc.
Direct methods such as Gaussian elimination with pivoting are proven to be stable. Iterative methods are not, but can be a lot cheaper computationally. They can also stop as soon as a certain relative residual is reached, unlike direct methods.
If iterative methods like Krylov subspace methods were stable, they could actually be seen as direct methods themselves: the Krylov subspace has dimension at most N, where N is the number of unknowns, so after at most N iterations the solution could be extracted exactly from the search space. In practice this is not the case due to rounding errors.
Not very important, and for a learning project I find using cvxpy a better idea as it's more readable (like you did), but:
Solving the full quadratic optimization problem for SVMs is basically impossible. You are forming an n x n matrix, so I'll let you imagine what happens when n = 100 000.
Usually people use either approximation methods (Incomplete Cholesky, Nyström) or solve it exactly but iteratively (SMO, Pegasos...).
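For what it's worth, here is roughly what the dual QP looks like in cvxpy for a linear kernel (a toy sketch of my own, with made-up sizes); for a general kernel you would have to form the full n x n Gram matrix, which is exactly what blows up at n = 100 000.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d, C = 200, 2, 1.0
X = rng.standard_normal((n, d))
y = np.where(X[:, 0] + 0.1 * rng.standard_normal(n) > 0, 1.0, -1.0)

alpha = cp.Variable(n)
# For a linear kernel, alpha^T diag(y) K diag(y) alpha == ||X^T (alpha * y)||^2,
# which keeps the objective DCP-compliant without explicitly forming the n x n kernel matrix.
objective = cp.Maximize(cp.sum(alpha) - 0.5 * cp.sum_squares(X.T @ cp.multiply(alpha, y)))
constraints = [alpha >= 0, alpha <= C, y @ alpha == 0]
cp.Problem(objective, constraints).solve()

w = X.T @ (alpha.value * y)   # recover the primal weight vector from the duals
print(w)
```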
I'm implementing them for class right now so it's still fresh in my head haha
Great resource, but it could be a phenomenal resource if you documented each method and explained how and why it does what it does.
Don't get me wrong, having working code to play with is key, but when you don't fully grasp the concepts behind it, an explanation can become so valuable.
That being said, you've included names, so research can be done. Great work and I hope you're enjoying it!
Given the amount of publicity this repository has gained, I will make sure to put more work into documenting the code, and also add references. Thank you for your feedback!
This is a nice project. I think it would be great to add the references used for the implementations and some tests that demonstrate they return what is expected (or perhaps the same results as sklearn).
This is impressive, and kind of exactly what I am in the process of doing. It's certainly a better way to get familiar with the internal workings of these methods than just tuning parameters like an oblivious albeit theoretically informed monkey. How long did it take you to do them?
Pattern Recognition and Machine Learning by Bishop is one of the canonical textbooks. It helps to have a linear algebra background, though it includes a refresher.
This one is also interesting for seeing the "statistical" other side of the industry versus the machine-learning people. For example, I don't think gradient descent is used even once in that book.
This could become a fantastic resource for anybody who is teaching machine learning.
One vital suggestion that would make that path more attractive: use the Jupyter notebook format. It would make it easier to add more documentation and references.
Better yet, write your own simple versions. That's what we did in undergrad and grad classes in AI, ML, Neural Nets, etc. There is nothing like building one yourself, even a simple model like kNN!
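For example, a bare-bones kNN classifier is only a handful of lines of numpy (a sketch of my own, not from the repo):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, X_test, k=3):
    preds = []
    for x in X_test:
        # Euclidean distance from x to every training point
        dists = np.linalg.norm(X_train - x, axis=1)
        # Majority vote among the k nearest labels
        nearest = y_train[np.argsort(dists)[:k]]
        preds.append(Counter(nearest).most_common(1)[0][0])
    return np.array(preds)

X_train = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [0.9, 1.1]])
y_train = np.array([0, 1, 0, 1])
print(knn_predict(X_train, y_train, np.array([[0.2, 0.1]])))   # -> [0]
```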
In your RandomForest implementation, on the line in fit() where you're building the training subsets to give to each tree, it appears that your bagging approach doesn't use a 'sampling with replacement' strategy.
It would appear that the replace=False prevents the 'sampling with replacement' behavior usually implemented by bagging algorithms. Should the replace=False be changed to replace=True?
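A quick illustration of the difference (my own sketch, not the repo's code):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 10

# With replacement (usual bagging): duplicates are likely and some rows are left out,
# so every tree sees a different bootstrap sample.
print(rng.choice(n_samples, size=n_samples, replace=True))

# Without replacement and size == n_samples: just a permutation of all the rows,
# so every tree ends up training on exactly the same data.
print(rng.choice(n_samples, size=n_samples, replace=False))
```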
Thank you for your feedback! I have read up on the feature bagging part of the algorithm and I believe that you are correct. This is fixed in the latest commit.
scikit-learn is excellent, but their implementations are a bit too complicated to learn from.
this is for people who don't just want to tune parameters but build the whole thing from scratch
I can buy a pie with all the fixings from a bakery, or I can buy the ingredients myself and make it exactly to my liking. It may not be professional quality.
A friend sent me a link to this - nice work, and I happen to be intermittently working on a very similar (and unfortunately similarly named) project - https://github.com/jarfa/ML_from_scratch/. Check my commit history if you suspect me of copying you ;)
I don't think I'll be implementing as many algorithms as you though, I should force myself to work on more projects outside my comfort zone.
I have taken some ML courses during my university studies and have also done some model implementations in other programming languages. So I didn't have to start from scratch. But I have been working on this project for about three weeks now.
Nice! I have started brushing up my maths and reading about machine learning in general. Next step is to get my feet wet in the implementation. I think looking at your project can give me a good idea as to how to implement some of the most basic algorithms.
Good luck!
Awesome! :) Doing the nitty-gritty has been a great way of learning the limitations and benefits of using certain models for a given task. Good luck to you as well!
Nice project! I'm doing something similar in Julia, with the added advantage that as I build it the numerical types are generic, so I can play around with numbers that aren't IEEE floats.
See for example: https://www.johndcook.com/blog/2010/01/19/dont-invert-that-m...