Message-ID: <CAJocqxPpB8nO1w-ZJeQY0QNF7WiTS4KZD9xc7vf4wmjGCrH==w@mail.gmail.com>
Date: Sat, 31 Dec 2011 10:24:12 -0600
From: Wesley Tansey <tansey@...utexas.edu>
To: john-users@...ts.openwall.com
Subject: Re: Rules for realistic words

Hi Alex,

Have you seen Markov mode?

http://openwall.info/wiki/john/markov

That seems to be more or less what you are describing in the first half of
your email.

Wesley

2011/12/31 Alex Sicamiotis <alekshs@...mail.com>

>
> From an analysis I've conducted on a file containing greeklish (greek
> words written in english) and english passwords, of ~1500 DES (max 8
> length) passwords, the following came up:
>
> Very high frequency letters:
> a=850 i=597 o=584 e=525
>
> Medium to low frequency letters
> s=498 r=472 n=418 t=405 l=366 m=277 p=247 c=211 d=201 k=193 g=159 u=148
> h=144 b=113 y=97 f=87
>
> Very low frequency letters
> v=66 w=53 x=47 j=31 z=31 q=18
>
>
> Number frequency:
> 1=448 occurrences
> 2=293 occurrences
> 3=249 occurrences
> 9=219 occurrences
> 0=203 occurrences
> 4=185 occurrences
> 6=175 occurrences
> 5=174 occurrences
> 7=156 occurrences
> 8=132 occurrences
>
> ...what this means is that a new method of brute forcing could be used.
>
> Currently it's something like
>
> 1) single
> 2) dictionary
> 3) dictionary with rules
> 4) incremental with digits, Alpha, Lanman, All from lower characters to
> more characters.
>
> Now for the 26 letters of Alpha, it goes like 26x26x26x26x26x26x26x26 =
> 208.8 billion combos
> For the Alpha+Digits it goes 36x36x36x36x36x36x36x36 = 2.82 trillion combos
>
> What if there were intermediate character sets of frequently used letters
> as an intermediate step between dictionaries with rules and incremental
> with full character sets? For example the top 16 letters and 4 numbers = 20
> characters in total. In such a case it's only 25.6 billion combos for 8
> char length - and with multiple hashes, it's always worth checking these
> first in order to crack them and speed up the rest.
> I think incremental mode already applies some sort of "more frequent"
> type of cracking, but I don't know how optimized it is in relation to
> this. If it already covers this area, ignore this comment.
>
> Another aspect that can be improved (not in cracking speed, but in
> cracking the easier ones first) is to emulate how language is constructed.
> For example, the greek & italian languages use a lot of alternation between
> consonants and vowels. This means that you can have a rule which goes like
> this:
>
> (V)owel
> (C)onsonant
> (B)oth+numbers+symbols
>
> 1-4 lengths are cracked in incremental
> From 4 char length onwards:
>
> VCVCV => italy
> CVCVC => begar
> VCVCB => epic6
> CVCVB => nike@
> VCVCVC
> CVCVCV
> VCVCVB
> CVCVCB
> VCVCVCV
> CVCVCVC
> VCVCVCB
> CVCVCVB
> VCVCVCVC
> CVCVCVCV
> VCVCVCVB
> CVCVCVCB
>
> By splitting words into human-like syllables, I achieved a hefty increase in
> effective cracking speed, because instead of 26x26x26... it goes like
> 18x8x18x8x18 - which means far fewer combinations, with no time spent on
> non-words like zzxaeseq.
>
> (the following is a greeklish example - you may see some letters listed as
> vowels which are consonants in english, but in greeklish for example w is
> used phonetically as o.. it's the omega letter)
>
>
> [bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy]"
>
> [aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz]"
>
> [bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz]"
>
> [aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy][bcdfgjklmnpqrstvxz][aehiouwy]"
>
> In some cases it needs tweaking to account for two consonants or two
> vowels in some part of the word (for example peNTagon, aCRopolis, bicyCLe,
> AErodynamic), so a few variations of the above are necessary to cover a
> large percentage of words.
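
As a rough illustration of the V/C/B pattern idea quoted above, here is a minimal Python sketch (not John the Ripper rule syntax). The consonant and vowel classes are copied from the greeklish example; the contents of the (B) class are not spelled out in the message, so the digits and the four symbols used here are an assumption, and the function names are made up for this example.

```python
from itertools import islice, product

# Character classes copied from the quoted greeklish example.
CONSONANTS = "bcdfgjklmnpqrstvxz"  # 18 letters
VOWELS = "aehiouwy"                # 8 letters
# Assumption: the message does not list the (B) class exactly; all letters,
# digits and a few symbols stand in for it here.
BOTH = CONSONANTS + VOWELS + "0123456789" + "!@#$"

CLASSES = {"C": CONSONANTS, "V": VOWELS, "B": BOTH}


def keyspace(pattern):
    """Number of candidates a pattern such as 'CVCVC' produces."""
    total = 1
    for symbol in pattern:
        total *= len(CLASSES[symbol])
    return total


def matches(word, pattern):
    """True if every character of word belongs to the class at its position."""
    return len(word) == len(pattern) and all(
        ch in CLASSES[symbol] for ch, symbol in zip(word, pattern)
    )


def candidates(pattern):
    """Lazily yield every string that fits the pattern."""
    for combo in product(*(CLASSES[symbol] for symbol in pattern)):
        yield "".join(combo)


if __name__ == "__main__":
    # 18x8x18x8x18 versus 26x26x26x26x26 for five lowercase letters.
    print(keyspace("CVCVC"), 26 ** 5)
    # Peek at the first few candidates without generating the whole set.
    print(list(islice(candidates("CVCVC"), 3)))
```

Printing the candidates one per line would let them be piped into a cracker that reads words from standard input; patterns with doubled consonants or vowels (CCV, VVC and so on) can be passed in the same way.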
>
> An analysis of the english language and linguistic patterns might give a
> significant increase in human-like words or composite words (that the
> dictionaries do not contain - like name&surname). Ideally, we could have a
> statistics program or an AI program to extract rules for 95%+ of the
> words contained in a certain language, so that combinations could be based
> on this structure (with possible twists like adding stuff at the end).
> English is a bit more difficult to do in a letter-by-letter format
> compared to greek/italian, but, ultimately, it's just more variations. A
> syllable approach (ie combos of one, two and three letter sequences) might
> also be appropriate for english or other languages. For example instead of
> combining words, we could combine ready syllables... The syllable MO +
> syllable RE = word MORE. The combinations compared to 26^8 will drop
> dramatically.
>
> Have a great 2012...
>
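
The syllable approach from the last quoted paragraph could be sketched along these lines in Python. The syllable list is a hypothetical toy inventory, not data from the message; a real one would have to be extracted from a word list or password corpus. The 8-character cap mirrors the DES length limit mentioned at the top of the quoted message.

```python
from itertools import product

# Hypothetical toy inventory for illustration only.
SYLLABLES = ["mo", "re", "ta", "ni", "ko"]


def syllable_words(syllables, max_syllables, max_len=8):
    """Yield concatenations of 1..max_syllables syllables, capped at max_len.

    The same word can appear more than once if the inventory allows it to be
    split in different ways; deduplicate downstream if that matters.
    """
    for count in range(1, max_syllables + 1):
        for combo in product(syllables, repeat=count):
            word = "".join(combo)
            if len(word) <= max_len:
                yield word


if __name__ == "__main__":
    words = list(syllable_words(SYLLABLES, 2))
    print(len(words), "more" in words)
```

With S syllables and up to k syllables per word, this produces at most S + S^2 + ... + S^k candidates, so how much smaller the search gets than 26^8 depends on how many syllables the inventory holds.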