Microsoft Research wins image recognition competition (venturebeat.com)
166 points by espeed on Dec 13, 2015 | 41 comments


"Baidu does not show up in this year’s rankings. The company made more submissions than were permitted and ultimately apologized and fired the team leader who directed juniors to make the unacceptable submissions."




Pretty good read, provides all the context needed and raises some important points about benchmarks. Thanks for sharing.


Might be worth noting that it is very likely that Google, Facebook, and Baidu did not even participate in this year's ImageNet. It is widely believed in the vision community that this dataset is at the end of its run, i.e. any subsequent improvements in performance are due not to scientific breakthroughs but to hyperparameter optimization and ensembling.


Google participated with the ReCeption / Inception v3 model; see the results: http://image-net.org/challenges/LSVRC/2015/results

In the classification task, MSRA and ReCeption show very similar performance: 0.03567 vs. 0.03581 top-5 error. The gap is much more drastic in the localization task: 0.090178 vs. 0.195792.

The residual learning presented by Microsoft Research seems to be a breakthrough, based on the early evidence. But yes, ImageNet needs to be updated to stay relevant.
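
For readers who haven't seen the paper yet, the core idea is that each block learns a residual F(x) and adds the input back through an identity shortcut, so the block outputs F(x) + x. A toy, framework-agnostic sketch follows (the real network uses stacked convolutions and batch normalization; the fully-connected layers and dimensions below are only for illustration):

    import numpy as np

    # Toy sketch of a residual block (http://arxiv.org/abs/1512.03385): instead of
    # learning a mapping H(x) directly, the block learns F(x) = H(x) - x and an
    # identity shortcut adds x back. Real ResNet blocks use convolutions and batch
    # norm; the two fully-connected layers here are a simplified stand-in.
    def relu(x):
        return np.maximum(0.0, x)

    def residual_block(x, w1, w2):
        f = relu(x @ w1) @ w2    # F(x): two layers with a nonlinearity in between
        return relu(f + x)       # identity shortcut: add the input back, then ReLU

    rng = np.random.default_rng(0)
    x = rng.standard_normal((1, 64))
    w1 = rng.standard_normal((64, 64)) * 0.1
    w2 = rng.standard_normal((64, 64)) * 0.1
    print(residual_block(x, w1, w2).shape)  # (1, 64): same shape, so blocks can be stacked

The paper's argument is that these shortcuts make very deep networks (the winning entry has 152 layers) much easier to optimize, since gradients get an unobstructed path back to early layers.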


The article and the title specifically mention Google.


Google entered with ReCeption and finished just after the Microsoft team in the "Ordered by classification error" section (which is the usual "headline" number).

Baidu was banned. I don't think Facebook has ever done ImageNet?

But I agree that most of the interesting work will happen outside ImageNet now that human performance has been comprehensively surpassed.

The only exception is non-Deep Learning based systems, where some people remain convinced that alternative approaches can match DL systems and have other advantages.


I bet there are breakthroughs to be had if they competed on training time and computing resources rather than just accuracy.


Surely that's not true until someone achieves human-level performance with their entry.


Human level performance was surpassed earlier this year:

http://arxiv.org/abs/1502.01852
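
That paper ("Delving Deep into Rectifiers", He et al.) introduced the PReLU activation, where the slope for negative inputs is learned rather than fixed. A rough sketch of the idea (the 0.25 initial slope follows the paper; the input values are made up, and the paper's accompanying weight initialization is not shown):

    import numpy as np

    # Sketch of the PReLU activation from arXiv:1502.01852: unlike ReLU, the
    # slope `a` for negative inputs is a learned parameter, initialized to 0.25.
    def prelu(x, a=0.25):
        return np.where(x > 0, x, a * x)

    print(prelu(np.array([-2.0, -0.5, 0.0, 1.5])))  # -> [-0.5 -0.125 0. 1.5]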


I won't be surprised when human level performance is surpassed, but isn't that paper based on work that has the advantage of knowing the full test dataset in advance? You can say "yes but they only trained on the training data" but that doesn't rule out tweaking across several experiments and measuring them against the known test data, then cherry picking the experiment that was best overfit to that data, right? I'm not saying there are shenanigans here, just wondering how you know there are not.


Did they also train the human on the training data?


The test labels are not known to them. The test labels are held by a different team that tells them their test error rate. Furthermore, multiple, separate teams have surpassed human performance.

FWIW, "human performance" is not as meaningful as most people make it out to be, because this is such a narrow task.


The paper mentions it's unfair to humans to have to tell, say, a coucal from an indigo bunting (both blue-coloured birds). I've never heard of either, so I guess I would have got that wrong, but it's not the fault of my visual system.


But you only have to guess right once in 5 tries!
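
To make the convention concrete: the top-5 error figures quoted elsewhere in this thread count a prediction as correct if the true label is anywhere among the model's five highest-scoring classes. A rough sketch of the computation (scores and labels below are randomly generated placeholders):

    import numpy as np

    # Sketch of the ILSVRC top-5 error metric: an image counts as correctly
    # classified if the ground-truth label appears among the model's five
    # highest-scoring classes.
    def top5_error(scores, labels):
        top5 = np.argsort(scores, axis=1)[:, -5:]        # indices of the 5 best guesses per image
        hits = np.any(top5 == labels[:, None], axis=1)   # is the true label among them?
        return 1.0 - hits.mean()

    rng = np.random.default_rng(0)
    scores = rng.random((8, 1000))          # fake scores: 8 images, 1000 classes
    labels = rng.integers(0, 1000, size=8)  # fake ground-truth labels
    print(top5_error(scores, labels))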


I replied to abrichr's link. Did you look at it at all?

They used a dataset from 2012. Are you saying that the test labels from a 2012 dataset have been kept secret until now? Really?


I am familiar with the ImageNet competition. Let me know if you find the ImageNet test labels. (Hint: they are not released, since they are re-used for competitions).


Interesting, did not know that. Do you think that proves this team was not able to get and exploit some inside knowledge? Judging from recent competitions, ethics does not seem to be a strong point in some cultures.


It's mentioned in the article, but here's a direct link to the paper on arXiv: http://arxiv.org/abs/1512.03385

And direct link to the PDF: http://arxiv.org/pdf/1512.03385v1.pdf


I'm disappointed to hear this after watching the video of Andrew Ng's talk. He was very proud of Baidu's accomplishment in this competition.


What products does Microsoft use this technology in?

I've seen what Google's doing in Google Photos but I've noticed it's not accurate all the time.


Can't speak for this particular team, but a submission for a competition is usually done at the last minute, so it's highly unlikely that _this_ algorithm is already part of a product like the Oxford API. But a predecessor might be, and this work might get integrated over time. A research prototype usually requires a lot of duct tape to work. Productization usually requires a different mindset (i.e. an engineering rather than a research team) that cares more about things like test coverage, distribution, (runtime) performance, etc. So by definition, MSR is ahead of the curve; product teams are on the curve.


> What products does Microsoft use this technology in?

http://venturebeat.com/2015/11/11/microsoft-launches-project...


Developer APIs. It would be nice to see a consumer use case in Windows.


It could be used in Project Adam: https://youtu.be/zOPIvC0MlA4


Sometimes even with political repercussions.


Is there a private one-shot hold-out in this particular competition?


Yes. More like a 5-shot hold-out, and the held-out set is pretty big as well, so it's more legit.


So you mean there are 5 entries you can select to be scored in the end, and there is one private leaderboard no one can see but the admins? Do you know the size of the private leaderboard test set?


Has anyone implemented this yet for TensorFlow or Caffe? Also, any more details on the MS COCO segmentation part?


I'd like to have seen how IBM would have fared with Watson.


Watson doesn't have many unique computer vision capabilities, unfortunately. They did some good work in NLP a decade ago but haven't added much since.


Have they really not added much to it since the Jeopardy stunt? We get periodic rumblings about integrating it with a customer-service tool, but it's starting to sound a little like Duke Nukem Forever.


They have it doing medical analysis (I'm hesitant to say "diagnosis" but I think it's being used to find symptom patterns doctors might otherwise miss).

http://www.ibm.com/smarterplanet/us/en/ibmwatson/health/

    Watson for Oncology analyzes a patient’s medical 
    information against a vast array of data, including 
    ongoing expert training from MSK physicians, cancer 
    case histories, and more, to provide evidence-based 
    treatment options.


Does Chef Watson count? It's actually pretty neat as a consumer tool; it suggests the weirdest things, but somehow they taste fine. No idea about the inner workings, though.

https://www.ibmchefwatson.com/


These are "invented" by the computer, right? I just had a suggestion for:

shrimp, peanut, olive oil, ginger, garlic, grapefruit juice, blood orange, dill, coriander seed, turmeric

That's a really odd mix but it honestly seems ok ... although I'd sub the shrimp for salmon ... apparently I can in the interface. Nice.

Also the ingredient input seems to lack exotic things like frogs, rattlesnakes, camels, durian, or ostrich. But quinoa and quail eggs are there.


AlchemyVision[1] from IBM Watson's Developer Cloud/BlueMix[2] is part of the Watson stack, and is pretty good. One of the few public APIs that lets you train your own categories.

[1] http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercl...

[2] Pretty sure I need some (c) and TMs in there somewhere


Isn't Watson a question answering system? I don't think it's specialized for image recognition.


https://www.ibm.com/smarterplanet/us/en/ibmwatson/developerc...

Watson is a marketing term.

IBM provides a variety of machine learning services and calls them all Watson.


Well, that's just confusing.



