I was rather surprised at this assertion, at first. Then I thought about it… there are only about 95 printable characters in the basic seven-bit ASCII character set. A very conservative estimate puts the number of distinct English words at well over 65,000, most of which are many letters long.
If you knew that someone’s password was several properly-spelled English words separated by single spaces, you’d still have 65,000-to-the-power-of-X combinations to go through for X words. That’s over four billion combinations to try for a two-word phrase, and about 275 trillion combinations for a three-word phrase. Four billion is easy enough for a computer to handle, but 275 trillion is a bigger search space than a password of seven random printable characters (95 to the seventh power, or roughly 70 trillion). And if you didn’t know that the password followed that pattern, or there was even one deliberately misspelled word or different bit of punctuation, the numbers skyrocket.
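As a rough check of that arithmetic, here is a minimal Python sketch. It assumes the figures quoted above (a 65,000-word vocabulary and 95 printable ASCII characters); the function and constant names are made up for illustration, and the only thing being counted is the raw size of the search space.

```python
# Assumed figures, taken from the estimates quoted above.
VOCABULARY_SIZE = 65_000   # conservative count of distinct English words
PRINTABLE_ASCII = 95       # printable characters in seven-bit ASCII


def passphrase_space(n_words, vocabulary=VOCABULARY_SIZE):
    """Number of possible phrases made of n_words dictionary words."""
    return vocabulary ** n_words


def random_password_space(length, alphabet=PRINTABLE_ASCII):
    """Number of possible passwords of `length` random printable characters."""
    return alphabet ** length


two_words = passphrase_space(2)          # 4,225,000,000
three_words = passphrase_space(3)        # 274,625,000,000,000
seven_chars = random_password_space(7)   # 69,833,729,609,375

print(f"two words:   {two_words:,}")
print(f"three words: {three_words:,}")
print(f"seven chars: {seven_chars:,}")
print("three words beats seven random characters:", three_words > seven_chars)
```

This only counts combinations under the stated assumptions. It says nothing about how quickly an attacker can actually test guesses.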
That said, I’m not about to abandon my current password scheme, but for some passwords, this might be useful.
The problem is, over 90% of users use a word out of only about a hundred words, and a quite large percentage use words or number sequences out of fewer than 20 commonly used passwords (like “password” or “12345678”). So advising users to use gibberish is a Good Thing.
A lot of those 65,000 words in English, incidentally, are technical or medical terminology. Most people’s working vocabulary is fewer than 2,000 words, especially among less literate folks.
Many “cracking dictionaries” rely on this phenomenon, and I was surprised to see a password I’d liked to use, “fubar”, as one of the words in a cracking dictionary. 🙂 (Since then I no longer use that, but that’s just one example.)
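To illustrate why a cracking dictionary works so well against popular choices, here is a minimal Python sketch of the lookup an attacker (or a cautious signup form) might do. The word list is a tiny hypothetical stand-in built from the examples mentioned in this thread, not a real cracking dictionary, and `is_in_dictionary` is a made-up helper name.

```python
# Hypothetical, tiny stand-in for a real cracking dictionary;
# the entries are just the examples mentioned in this thread.
CRACKING_DICTIONARY = {"password", "12345678", "fubar"}


def is_in_dictionary(candidate, dictionary=CRACKING_DICTIONARY):
    """Return True if the candidate password appears in the dictionary.

    The comparison ignores case and surrounding whitespace, on the
    assumption that trivial variations are among the first things tried.
    """
    return candidate.strip().lower() in dictionary


print(is_in_dictionary("Fubar"))                    # True
print(is_in_dictionary("several unrelated words"))  # False
```

A real dictionary would hold thousands of entries plus common substitutions, but the idea is the same: anything on the list falls after a handful of guesses, no matter how long it is.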
According to this page, if you add technical terminology (and don’t include formal names for organisms, just technical vocabulary), the word-count may be as high as two million. The same page also concludes:
So I imagine that the original article is more correct than you think.
I was talking about the words people use most frequently for passwords, which is a tiny fraction of the overall vocabulary of a person. People usually pick something “easy to remember” or “cute”, and due to those patterns, cracking dictionaries exist and are highly effective.
Though apparently I did quote total vocabulary figures that were erroneous. That wasn’t my only point tho. 🙂
“Most frequently” simply means that more than a handful of people used them. Considering that half the population is less than average intelligence (by definition), that’s not surprising. It doesn’t affect the pool of words that people of at least average intelligence can draw from to create good passphrases.
(The most common password is 123456, but you never see how common it is — I’ll bet that it’s in the low single digits, percentage-wise, even on low-value accounts that are open to everyone, where you’d logically expect to see lots of easy-to-guess passwords.)
http://www.nytimes.com/2010/01/21/technology/21password.html
Right, the most common password on the web.
Overall, 20% of the people picked a password common enough to be put into a 5,000-word cracking dictionary, which is smaller than I thought, but still…
Ah… thanks for that link, it confirms that “123456” was used by less than 1% of the people in that list. I’m also shocked that only 20% of people used an easy-to-guess password. Presumably “RockYou” was a low-value target where people wouldn’t be too concerned about being hacked.
xkcd just posted a strip on this very topic here: http://xkcd.com/936/
Probably inspired by the same report that the LifeHacker article was. xkcd puts it so much better though. 🙂