“Why Multi-word Phrases Make for More Secure Passwords Than Incomprehensible Gibberish”

I was rather surprised at this assertion, at first. Then I thought about it… there are only 95 printable characters in the basic seven-bit ASCII character set, counting the space. A very conservative estimate puts the number of distinct English words at well over 65,000, most of which are many letters long.

If you knew that someone’s password was several properly spelled English words separated by single spaces, you’d still have 65,000-to-the-power-of-X combinations to go through for X words. That’s over four billion combinations to try for a two-word phrase, and about 275 trillion for a three-word phrase. Four billion is easy enough for a computer to handle, but 275 trillion is a larger search space than a seven-random-character password offers (95 to the seventh power, or roughly 70 trillion). And if you didn’t know that the password followed that pattern, or there was even one deliberately misspelled word or different bit of punctuation, the numbers skyrocket.
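For anyone who wants to check that arithmetic, here is a rough Python sketch. It assumes every word is picked uniformly at random from a 65,000-word pool and every character uniformly from the 95 printable ASCII characters, which real users rarely do. The function names are made up for illustration.

```python
import math

WORD_POOL = 65_000  # the post's conservative estimate of distinct English words
CHARSET = 95        # printable seven-bit ASCII characters, space included


def passphrase_space(words: int, pool: int = WORD_POOL) -> int:
    """Number of possible phrases of `words` words drawn from `pool`."""
    return pool ** words


def random_password_space(length: int, charset: int = CHARSET) -> int:
    """Number of possible passwords of `length` random characters."""
    return charset ** length


def bits(space: int) -> float:
    """Size of a search space expressed in bits (log base 2)."""
    return math.log2(space)


def equivalent_random_length(words: int, pool: int = WORD_POOL,
                             charset: int = CHARSET) -> int:
    """Smallest random-character length whose search space is at least
    as large as that of a `words`-word passphrase."""
    target = passphrase_space(words, pool)
    length = 0
    while charset ** length < target:
        length += 1
    return length


if __name__ == "__main__":
    for n in (2, 3, 4):
        space = passphrase_space(n)
        print(f"{n} words: {space:,} combinations, "
              f"{bits(space):.1f} bits, "
              f"matched by {equivalent_random_length(n)} random characters")
    print(f"7 random characters: {random_password_space(7):,} combinations")
```

Under those assumptions, a three-word phrase lands between a seven-character and an eight-character random password. The uniform-choice assumption is doing a lot of work here, though: if people draw from a much smaller pool of favourite words, the real numbers shrink accordingly.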

That said, I’m not about to abandon my current password scheme, but for some passwords, this might be useful.

10 Comments

  1. The problem is, over 90% of users pick a word from a pool of only about a hundred, and quite a large percentage use words or number sequences from a set of fewer than 20 commonly used passwords (like “password” or “12345678”). So advising users to use gibberish is a Good Thing.

    A lot of those 65,000 English words, incidentally, are technical or medical terminology. Most people’s working vocabulary is under 2,000 words, especially among less literate folks.

    Many “cracking dictionaries” rely on this phenomenon, and I was surprised to see a password I used to like, “fubar”, listed in one of them. 🙂 (I no longer use it, but that’s just one example.)

  2. According to this page, if you add technical terminology (and don’t include formal names for organisms, just technical vocabulary), the word-count may be as high as two million. The same page also concludes:

    It’s common to see figures for vocabulary quoted such as 10,000-12,000 words for a 16-year-old, and 20,000-25,000 for a college graduate. These seem not to have much research to back them up. Usually they don’t make clear whether active or passive vocabulary is being quoted, and they don’t account for differences in lifestyle, profession and hobby interests between individuals.

    David Crystal described a simple research project — using random pages from a dictionary — that suggests these figures are severe underestimates. He concludes that a better average for a college graduate might be 60,000 active words and 75,000 passive ones. But this method of assessing vocabulary counts dictionary headwords only; it would be possible to multiply it several-fold to include different senses, inflected forms, and compounds. Another assessment — of a million-word collection of American texts — identified about 38,000 headwords. Bearing in mind this was all general writing, this doesn’t sound so different from David Crystal’s estimates for graduate vocabularies.

    So I imagine that the original article is more correct than you think.

    • I was talking about the words people use most frequently for passwords, which is a tiny fraction of the overall vocabulary of a person. People usually pick something “easy to remember” or “cute”, and due to those patterns, cracking dictionaries exist and are highly effective.

  3. “Most frequently” simply means that more than a handful of people used them. Considering that half the population is of less than average intelligence (by definition), that’s not surprising. It doesn’t affect the pool of words that people of at least average intelligence can draw from to create good passphrases.

  4. (The most common password is 123456, but you never see how common it is — I’ll bet that it’s in the low single digits, percentage-wise, even on low-value accounts that are open to everyone, where you’d logically expect to see lots of easy-to-guess passwords.)

  5. Ah… thanks for that link, it confirms that “123456” was used by fewer than 1% of the people in that list. I’m also shocked that only 20% of people used an easy-to-guess password. Presumably “RockYou” was a low-value target where people wouldn’t be too concerned about being hacked.

  6. Probably inspired by the same report that the LifeHacker article was. xkcd puts it so much better though. 🙂
