Hacker News
The Top 50 Gawker Media Passwords (wsj.com)
28 points by m3mb3r on Dec 15, 2010 | 29 comments


I personally use 'password' for my password on sites like Gawker, where I'm being forced to create an account I don't care about. Using 'password' for my password is my note to myself that this is a junk account that I have no interest in. I just don't care if somebody accesses it, period.

I suspect that others do the same thing, and little weight should be given to the strength of passwords recovered from a site such as this.



Success.


The list is basically identical to every "most common passwords" leak that's come out since the beginning of the web. Even "monkey", which the author seems to think is a quirk of the Gawker community, is known to frequently be a top 20 password.


Yep. Here's an article from earlier this year that includes it as the 14th most common password: http://www.tomshardware.com/news/imperva-rockyou-most-common...

And here's one from 2008: http://www.whatsmypass.com/the-top-500-worst-passwords-of-al...


This led me to a question about DES. If no salt is provided, it uses a static or default two-character salt. In the Gawker leak, the first two characters of the stored hash were the default salt. How is that two-character default derived?


I'm not sure what you mean by 'default salt'.

In the Gawker leak, the first two characters of the stored hash were not a default salt; they were random salts. As for how they're generated? Well, randomly.


The thing is that they don't appear to be random. Everyone with the same password had the same salt. I guess default isn't the right word. Whatever algorithm they used to generate the salt always generated the same two characters given the same input. I'm curious as to whether anyone knows the details of that function.


What? No. Not at all. The largest number of times a given salt||hash occurs in the database is seven.

Why do you think that what you said is true?


I misinterpreted what I was looking at. Because the same hash for a known password appeared several times in my narrow view, I made the bad assumption that it was the same for all occurrences of that password.

Your comment that the maximum number of occurrences of a given salt||hash was seven also threw me for a second. Am I correct that that is purely coincidental? Given the limit of only two characters for the salt (in what set? all printable characters?) and the sheer volume of accounts, is there simply some unintended overlap? It just happens that the most it occurred was seven times?
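On the "in what set?" question: traditional DES-based crypt(3) draws each salt character from the 64-character alphabet `./0-9A-Za-z`, which gives 4096 possible salts. A minimal sketch of what that implies for collisions, assuming salts are chosen uniformly and taking the roughly one million accounts cited elsewhere in this thread (`random_des_salt` is a hypothetical helper, not Gawker's code):

```python
import secrets
import string

# Salt alphabet of traditional DES crypt(3): 64 characters, 6 bits each.
SALT_ALPHABET = "./" + string.digits + string.ascii_uppercase + string.ascii_lowercase


def random_des_salt() -> str:
    """Pick a two-character salt uniformly at random (hypothetical helper)."""
    return "".join(secrets.choice(SALT_ALPHABET) for _ in range(2))


possible_salts = len(SALT_ALPHABET) ** 2  # 64 * 64 two-character salts

# Assumption: about one million leaked accounts, the figure quoted in the thread.
accounts = 1_000_000
expected_accounts_per_salt = accounts / possible_salts

print(random_des_salt())
print(f"{possible_salts} salts, about {expected_accounts_per_salt:.0f} accounts per salt")
```

With only a few thousand salts shared among that many accounts, every salt is reused hundreds of times, so popular passwords are bound to repeat under the same salt.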


Ah, alright.

Yeah, seven is just coincidental. But it appears that you are correct in one respect: seven seems a bit high to me. I don't have time to work out the probability distributions; if someone cares, could they do the calculation and check?

EDIT: The salt 'sV' occurred 215 times. sV39Fw5at18zo occurs seven times. Assuming that there were only 300 possible passwords each of which occurred with probability ~.3% (the probability of '123456'), then the probability of seven passwords hashing to the same value is incredibly low. Less than a thousandth of one percent. Does anyone know why this is? Or was it just the case that Scorpion's assumption that the distribution is very non-random is correct?
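For anyone who wants to check the estimate in the EDIT, here is a rough sketch of the calculation under its stated assumptions: 215 accounts share the salt, each independently picks a given password with probability 0.003, and there are 300 such passwords. The independence and equal-probability assumptions come from the comment, not from the data, and `binomial_tail` is a hypothetical helper name.

```python
from math import comb


def binomial_tail(n: int, p: float, k_min: int) -> float:
    """P(X >= k_min) for X ~ Binomial(n, p), summed directly over the tail."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


n_with_salt = 215    # accounts sharing the salt 'sV' (from the comment above)
p_password = 0.003   # assumed probability of one specific password
n_passwords = 300    # assumed number of equally likely passwords

# Chance that one specific password shows up seven or more times under this salt.
p_one = binomial_tail(n_with_salt, p_password, 7)

# Union bound on the chance that any of the 300 passwords does so.
p_any = min(1.0, n_passwords * p_one)

print(f"one specific password: {p_one:.2e}")
print(f"any of {n_passwords} passwords (upper bound): {p_any:.2e}")
```

If the observed count is far more likely than this model predicts, the simplest explanation is that the real password distribution is much more skewed than 300 equally likely choices.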


The takeaway here? If you want to "hack" into sites like these, you're virtually guaranteed to succeed by picking a few random usernames, and trying some combination of "123456", "password", "12345678", site name, and "qwerty" for password.

I think it's time for someone to come up with a radically better authentication mechanism.


There were one million passwords released, and only about 3,000 use '123456'. That's only a 0.3% chance.

Yes, you could have a bot that checked the top 50 passwords against a few thousand or so accounts, but even then, you'd only get one or two matches at most.

What if sites blocked passwords that have been used more than twice already? So, at most, there would be two "123456" passwords. Any secure password is more than likely something that no more than one other person on any given site would be using.

[Edited: Whoops, did the math too quickly]


> There were one million passwords released, and only about 3,000 use '123456'. That's only a 0.003% chance.

Actually, 3000/1000000 = .003, so a .3% chance.
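The arithmetic is easy to get wrong by a factor of 100, so here it is spelled out, along with what the rate would mean for someone trying '123456' against a batch of accounts. The batch size of 1,000 is a made-up figure for illustration, and the estimate assumes the leaked sample is representative.

```python
leaked_passwords = 1_000_000
using_123456 = 3_000

fraction = using_123456 / leaked_passwords  # a proportion, not a percentage
percent = fraction * 100                    # the same value expressed in percent

# Hypothetical: try '123456' once against each of 1,000 accounts.
accounts_tried = 1_000
expected_hits = accounts_tried * fraction

print(f"fraction={fraction}, percent={percent:.1f}%, expected hits={expected_hits:.0f}")
```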


> What if sites blocked passwords that have been used more than twice already?

"I'm sorry, another user on this system is already using the password you have submitted. Please choose a different one."

That's not the sort of information you want to be leaking.


"Your password is too weak"


Yes, this would suffice. But I think it points out that a duplicate password is just a proxy for a weak password: strong passwords tend to be locally unique.

If you are going to have a strength requirement, then run your strength validation routine and deny weak passwords. That a password is duplicated seems like a special case of weakness that is not worth checking for.

i.e. You probably don't want special checks for 'password' or '123456' either, since your strength validation routine should catch these.


Not to mention storing the passwords in plaintext (or unsalted) to be able to query them.


I think good practices around encryption/storage of passwords would make it infeasible to check how many times a password is already in your database.

How about instead just implementing a blacklist of unacceptable passwords?
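A blacklist check along these lines could look like the sketch below. The entries are passwords mentioned in this thread; the minimum length, the site-name rule, and the name `is_acceptable` are illustrative assumptions rather than a recommended policy.

```python
# Passwords named in this thread; a real list would be far longer.
COMMON_PASSWORDS = {"123456", "password", "12345678", "qwerty", "monkey"}


def is_acceptable(password: str, site_name: str = "gawker", min_length: int = 8) -> bool:
    """Reject passwords that are too short, blacklisted, or contain the site name."""
    candidate = password.strip().lower()
    if len(candidate) < min_length:
        return False
    if candidate in COMMON_PASSWORDS:
        return False
    if site_name.lower() in candidate:
        return False
    return True


for attempt in ["password", "12345678", "gawker2010", "correct horse battery staple"]:
    print(attempt, "->", "ok" if is_acceptable(attempt) else "too weak")
```

Because the check runs on the submitted plaintext before hashing, it needs no lookup against stored passwords and reveals nothing about other users.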


Wouldn't you just need to hash the password and do a lookup on the table to see how many results were returned?


If you are generating a per-password salt, that won't work. In order to find prior occurrences of a given password, you would have to hash the password for every salt value that you've ever used. And since you're using BCrypt[1], that will be very slow.

[1] http://codahale.com/how-to-safely-store-a-password/
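To make the cost concrete, here is a small sketch using PBKDF2 from Python's standard library as a stand-in for BCrypt (a deliberate substitution so the example has no third-party dependency). The helper names and the iteration count are assumptions chosen for illustration.

```python
import hashlib
import hmac
import os
from typing import List, Optional, Tuple

ITERATIONS = 10_000  # kept low so the example runs quickly; not a recommendation


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest), generating a fresh random salt if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def count_users_with_password(candidate: str, records: List[Tuple[bytes, bytes]]) -> int:
    """Counting duplicates requires one slow hash per stored salt."""
    matches = 0
    for salt, stored_digest in records:
        _, digest = hash_password(candidate, salt)
        if hmac.compare_digest(digest, stored_digest):
            matches += 1
    return matches


records = [hash_password("123456"), hash_password("123456"), hash_password("hunter2")]

# Same password, different salts: the stored digests differ, so a plain
# equality lookup on the hash column finds nothing.
print(records[0][1] != records[1][1])
print(count_users_with_password("123456", records))
```

The duplicate count is still recoverable, but only by rehashing the candidate once per stored record, which is exactly the work a slow hash is designed to make expensive.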


How long would you spend in the "another user has this password, please try a different one" loop before getting fed up and leaving?


...which would allow you to comment on a blog article as someone else. Yikes!


I am personally surprised by the number of proper names on there: Jennifer, Jordan, Michelle, Michael. I know these are pretty common names (Jordan?), but when you figure the percentage of the population that would have these names, and then the percentage of those who would use their name as a password (assuming they are using their name, and not picking it for some other reason), it's surprising that so many would make a top 50 list.


The names are probably picked for other reasons. Each of those names is the name of a popular figure in society (actor, entertainer, sports star).


This is actually a pretty good analysis by the mainstream press. While the information is well-known to the point of being common sense for us, for readers of the WSJ it will likely be a learning experience.


Duplicate:

http://news.ycombinator.com/item?id=2002805

Many comments there.


Interesting list. "qwerty" is up there as well. I wondered what that was, and it's just the first row of letters on your keyboard.


Also the name of the standard US/UK English keyboard layout: http://en.wikipedia.org/wiki/QWERTY



