john-users - Re: Contributing significant changes to the jumbo patch (mostly performance improvements)


Message-ID: <D86290232B8A496F88F9FA449EE61A86@ath64dual>
Date: Wed, 15 Jul 2009 08:55:06 -0500
From: "JimF" <jfoug@....net>
To: <john-users@...ts.openwall.com>
Subject: Re: Contributing significant changes to the jumbo patch (mostly performance improvements)

> from bartavelle:
>
> Small suggestion : adding an ETA would be a nice addition.

That sounds like a good idea.  The only 'problem' with a percentage done (or
any percentage computation) is that if there are lots of rejection rules
within the rule set, it is not possible to know in advance which words will
be skipped due to rejection.  So, if you had 10 rules each having 100
preprocessing rules (say, appending num-num), followed by 10 rules each
having 1000 preprocessing rules, but the 2nd set of 10 has rejection rules
that filter out 90% of the words, both halves would take the same
theoretical time.  Yet when the first half is complete, john would claim
only about 10% of the work was done, rather than 50%.  I do not see a way
around that: until you run the words, john has no way of knowing that it
will skip a bunch.
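
To put numbers on the example above, here is a rough sketch of the
arithmetic in Python.  The variable names are made up for illustration and
are not anything in john; it simply assumes the estimate is based on the
count of expanded rules, and that every word costs the same to try.

```python
from fractions import Fraction

# Expanded rule counts from the example (assumed, for illustration only).
first_half = 10 * 100     # 10 rules x 100 preprocessing rules, nothing rejected
second_half = 10 * 1000   # 10 rules x 1000 preprocessing rules, 90% rejected

# What an estimate based purely on rule counts would report
# once the first half is complete.
reported = Fraction(first_half, first_half + second_half)

# Work that actually gets done: only 10% of the second half survives rejection.
effective_second = second_half * Fraction(1, 10)
actual = first_half / (first_half + effective_second)

print(f"reported: {float(reported):.1%}, actual: {float(actual):.1%}")
# prints "reported: 9.1%, actual: 50.0%"
```

So the "10%" above is really 1/11, and the gap only becomes visible after
the rejected words have been run through the rules.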

One other idea I have is to change markov to allow a range [min .. max].
Thus, you could run up to 220, then later pick up the search and run from
221 to 240, ending up with only the 'new' material, assuming you use the
same statsfile.  Right now, about the only way to 'resume' is to dump both
levels out to flat files (using --stdout), and then write code to strip
from the larger file the lines that also appear in the smaller one.  But
changing the markov processing to not output or use values that are under a
certain threshold should not be hard at all.
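
For what it is worth, both approaches can be sketched in a few lines of
Python.  This is only an illustration: only_new() is the flat-file
workaround, in_range() is the proposed [min .. max] filter, and the
(word, cost) pairs are invented toy data that assume each candidate has a
single cost which is compared against the level.  None of these names
exist in john.

```python
def only_new(smaller, larger):
    """Yield candidates from `larger` that are not in `smaller`, keeping order."""
    seen = set(smaller)  # the whole smaller list has to fit in memory
    for word in larger:
        if word not in seen:
            yield word


def in_range(candidates, min_level, max_level):
    """Yield words whose (assumed) cost lies within [min_level, max_level]."""
    for word, cost in candidates:
        if min_level <= cost <= max_level:
            yield word


# Invented toy data: (candidate, cost) pairs.
toy = [("aaa", 200), ("abc", 215), ("xqz", 230), ("zzq", 240)]

# Stand-ins for the two --stdout dumps, at levels 220 and 240.
dump_220 = [w for w, c in toy if c <= 220]
dump_240 = [w for w, c in toy if c <= 240]

via_diff = list(only_new(dump_220, dump_240))
via_range = list(in_range(toy, 221, 240))
print(via_diff, via_range)
```

With real dumps you would pass the two open files instead of lists.  The
range filter gives the same result without ever generating, storing, or
re-reading the first 220 levels.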

Jim. 
