Message-ID: <CAC6_mQN+8L5MXub-4akTuCZ63KtVc7dW4P992p=3sv3T+X2Xrw@mail.gmail.com>
Date: Sat, 19 Nov 2011 22:38:30 -0500
From: Stephen Reese <rsreese@...il.com>
To: john-users@...ts.openwall.com
Subject: Re: OpenMP not using all threads

On Sat, Nov 19, 2011 at 8:35 PM, Solar Designer <solar@...nwall.com> wrote:
> On Sat, Nov 19, 2011 at 07:55:50PM -0500, Stephen Reese wrote:
>> I had a feeling that the 32-bit architecture might be an issue as I
>> noticed that "OpenMP example" was only twice as fast (32-bit OpenMP)
>> instead of four times (64-bit OpenMP).
>> http://openwall.info/wiki/internal/gcc-local-build#OpenMP-example.
>> Though OpenMP example is four times as fast neither the CVS nor
>> stable/patch versions of John would provide the 4x speed-up I was
>> hoping for even on the 64-bit. Maybe XEN and the other respective
>> hosts across the multiple Linodes I am testing are causing roughly a
>> 45 - 60% slowdown from a bare-metal instance but not affecting the
>> "OpenMP Example".
>
> It appears that you simply have unstable system performance (changing
> over time as load from other VMs changes).
>
>> root@:~# time ./loop2
>> 615e5600
>> real 0m2.229s
>> user 0m2.226s
>> sys 0m0.002s
>> root@:~# time ./loop
>> 615e5600
>> real 0m0.333s
>> user 0m1.313s
>> sys 0m0.003s
>
> This would be a 7x speedup if it were for real, but notice how the user
> time decreased as well - indicating that load from other VMs probably
> halved between these two invocations. You'll need many more invocations
> of your benchmarks to see the overall difference between the different
> builds despite of the changing load.
>
>> What I am trying to achieve: I have 42 DES passwords and three
>> Linodes. Password list is currently split-up so each host has 12
>> entries and are running in incremental mode. Is there a better way,
>> such as specifying a thread per instance on a single host?
>>
>> Is there a performance/time benefit in splitting up the password list
>> amongst multiple hosts or is one host going to achieve the same
>> results as the three?
>
> This depends on the hashes per salt ratio. You didn't mention how many
> different salts you have. Is it 42 hashes with 42 different salts?
>
> Anyhow, you may achieve a very slight increase in c/s rate (due to lower
> key setup overhead) by not splitting your 42 hashes (have all nodes load
> all 42), but instead splitting the candidate password space. However,
> this improvement would probably be negated by slightly less optimal
> order in which candidate passwords would be tested then (e.g., you'd
> split by length: 0-6, 7, 8). So continuing like you have started is
> fine. 3*12 is 36, not 42, though.
>
> Also note that OpenMP generally performs poorly when the system is
> under other load. In your case, this other load comes from other VMs.
> Even a 10% load from other processes/VMs may result in a 50% slowdown of
> your task with OpenMP, unfortunately. And it can be even worse than
> that: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43706 (yes, you may
> try the GOMP_SPINCOUNT workarounds from there).
>
> As an alternative, you may try an MPI build of -jumbo, even across all
> three of your Linodes.
>
> Or as a simpler alternative, yes, you may choose to use many instances
> of non-OpenMP builds. Then other load will have less of an effect, but
> the key setup overhead will increase. The CVS version and the
> -fast-des-key-setup-3 patch (your choice) reduce the key setup overhead,
> though, making it almost negligible. In 1.7.8 release, it's about 10%
> when cracking just one DES-based crypt(3) hash. With the newer code or
> the patch, it reduces to about 3%. You probably lose a lot more than
> that to OpenMP's unfriendliness to system load, so you'll improve things
> overall by going for separate processes.
>
> Alexander
>

Alexander,

Thanks for the quick response.
There are 42 hashes with 42 unique salts (13/node). I am going to change this so that every node loads all 42 hashes and the candidate space is split by length instead (1-6, 7, and 8 for All.chr). OpenMP has consistently been around 5000K, but I tested your other recommendation of running non-OpenMP builds because of the system load woes discussed earlier (GOMP_SPINCOUNT did not help). Four non-OpenMP instances run at 2000K each and a fifth at ~1000K, all using the same john.pot and password file via multiple sessions. They seem to even out after a bit, but a combined 9000K is great! This is what I was looking for.

Thanks for your help!
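
For anyone checking the arithmetic in this thread, here is a minimal Python sketch. It only reuses numbers quoted above (the two `time` runs and the reported rates). The function names are made up for illustration, and the rates are assumed to be in thousands of candidates per second as reported by John.

```python
# Figures are copied from the thread; function names are illustrative only.

def speedup(slower_s, faster_s):
    """Wall-clock speedup of the faster run over the slower one."""
    return slower_s / faster_s

def combined_rate(rates_k):
    """Total rate of independent processes, in the same unit as the inputs."""
    return sum(rates_k)

# `time ./loop2` (real 2.229s, user 2.226s) vs `time ./loop` (real 0.333s, user 1.313s)
real_speedup = speedup(2.229, 0.333)
# Total CPU (user) time should stay roughly constant if only the thread count
# changed; a large drop suggests the background load changed between runs.
user_ratio = 1.313 / 2.226

# Four non-OpenMP instances at 2000K plus a fifth at ~1000K, vs ~5000K with OpenMP
separate_k = combined_rate([2000, 2000, 2000, 2000, 1000])
openmp_k = 5000

print(f"wall-clock speedup: {real_speedup:.1f}x")
print(f"user time ratio (faster/slower): {user_ratio:.2f}")
print(f"separate processes: {separate_k}K vs OpenMP {openmp_k}K "
      f"({separate_k / openmp_k:.1f}x)")
```

The user-time ratio being well below 1 is what points to changing load from other VMs rather than a genuine speedup of that size, so single runs like these say little about the builds themselves.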