Message-ID: <20150826054905.GA2804@openwall.com>
Date: Wed, 26 Aug 2015 08:49:05 +0300
From: Solar Designer <solar@...nwall.com>
To: john-users@...ts.openwall.com
Subject: Re: Anyone looked at the Ashley Madison data yet?

On Wed, Aug 26, 2015 at 06:15:55AM +0300, Solar Designer wrote:
> For the "top N" work, you need to "shuf" the dump and choose specific
> e.g. 100k lines from it (e.g. for intending to produce a top 100 list).
> To make this even safer, "shuf" the 100k sub-list of hashes for each
> potential contributor separately, and give each contributor only their
> shuffled list. This extra measure is in case of interrupted attacks, so
> that with a large number of contributors the original 100k list is
> attacked uniformly anyway. (It wouldn't be fatal even if it's not,
> though, since it's already shuffled. However, if a particularly common
> password is found closer to the start of the 100k list, it might appear
> as even more common than it actually is if some attacks are interrupted.)

Actually, for a likely top 100 list from a 100k sub-list, you don't need
a community effort. This can be done by one person using one machine in
a few days. Just take a few hundred top passwords from existing such
lists, add four lines:

ashley
madison
Ashley
Madison

and run it until completion against the 100k sample (it's crucial to
"shuf" the original list before you extract this sample).

Out of the four lines I suggested adding, I guess the all-lowercase ones
are somewhat likely to appear in top 100. The capitalized ones probably
aren't popular enough, but are worth testing as well (can't rule out
them being in top 100 without testing).

To test 300 candidate passwords against a 100k sample at 50 c/s (one
modern quad-core CPU), you need:

300 * 100000 / 50 / 86400 = ~7 days

300 is probably enough to have good confidence that ~90% of the eventual
top 100 were included in testing.
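For anyone who wants to redo the arithmetic with their own numbers, here
is a minimal Python sketch. It assumes, as the formula in the message
implies, that every candidate has to be tried against every hash
separately. The names draw_sample and attack_days are made up for
illustration and are not part of any existing tool; draw_sample simply
mimics "shuf" followed by taking the first N lines.

```python
import random

SECONDS_PER_DAY = 86400


def draw_sample(hashes, size, seed=None):
    """Shuffle a copy of the full list, then keep the first `size` entries.

    Hypothetical helper: the equivalent of "shuf" followed by "head".
    """
    shuffled = list(hashes)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:size]


def attack_days(candidates, hashes, rate_cps):
    """Days to test every candidate against every hash at rate_cps c/s."""
    return candidates * hashes / rate_cps / SECONDS_PER_DAY


# The figures from the message: 300 candidates, 100k hashes, 50 c/s.
print(round(attack_days(300, 100_000, 50), 2))  # 6.94
```

Doubling the candidate list or the sample size doubles the run time, so
the size of both is a direct trade-off against CPU time.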
Someone might want to confirm or disprove that 300-candidate estimate by
comparing similar portions of existing top lists from various leaks,
assuming that AM is similar in this respect.

Adding a few hundred of the top already-cracked AM passwords (cracked
without following this methodology, so without being limited to this
sample) to the list of candidate passwords to test against the 100k
sample is also a good idea (if you already have those other cracks).
They will compete against the usual top 300 (derived from other top
lists), in case AM is specific enough that some (many?) passwords
outside the usual top 300 are in AM's top 100. This may take a second
week, or a second CPU. Or those passwords may be probed in a day against
a 10k sample first, and only those that are common enough in that sample
to potentially be in top 100 then tested against the 100k sample. Then
it's just a day more.

So it's unclear if a community effort is justified. For a top 100 list,
if desired, someone just needs to do it right. And doing it right is
more important than testing a larger candidate password list against a
larger sample.

Alexander
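The two-stage idea above (probe the extra candidates against a 10k
sample first, then re-test only the promising ones against the 100k
sample) can be put in numbers with a rough Python sketch. It assumes the
same 50 c/s rate as in the message. The function names and the figure of
20 survivors are assumptions picked purely for illustration; the real
number of survivors depends on the data.

```python
SECONDS_PER_DAY = 86400


def candidates_per_day(sample_size, rate_cps):
    """How many candidates can be fully tested against a sample in one day."""
    return SECONDS_PER_DAY * rate_cps // sample_size


def two_stage_days(new_candidates, survivors, rate_cps=50,
                   small=10_000, large=100_000):
    """Days for a 10k pre-screen plus a 100k re-test of the survivors."""
    stage1 = new_candidates * small / rate_cps / SECONDS_PER_DAY
    stage2 = survivors * large / rate_cps / SECONDS_PER_DAY
    return stage1 + stage2


# Candidates that fit into one day against a 10k sample at 50 c/s.
print(candidates_per_day(10_000, 50))  # 432

# 300 extra candidates, of which a hypothetical 20 survive the pre-screen.
print(round(two_stage_days(300, 20), 2))  # 1.16
```

Testing the same 300 extra candidates directly against the 100k sample
would take about 7 days, which is the "second week" mentioned above.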