john-users - Re: DES-based crypt(3) cracking on ZTEX 1.15y FPGA boards (descrypt-ztex)

Message-ID: <CA+E3k9220jEnL8YrKiD15fjmY_veVR0Pb_QEouuT_mKs1Qy=Ug@mail.gmail.com>
Date: Thu, 29 Jun 2017 21:49:13 -0800
From: Royce Williams <royce@...ho.org>
To: john-users@...ts.openwall.com
Cc: apingis@...nwall.net
Subject: Re: DES-based crypt(3) cracking on ZTEX 1.15y FPGA
 boards (descrypt-ztex)

On Thu, Jun 29, 2017 at 10:39 AM, Solar Designer <solar@...nwall.com> wrote:

> Besides his recently committed work on bcrypt-ztex Denis has also been
> trying to redesign descrypt-ztex.  While his attempts were promising
> (with ~50% greater expected speeds), they mostly failed so far with
> difficult to debug issues.  Given the low demand for any of this (with
> it being mostly an experiment), I asked Denis that rather than keep
> trying to get much better speeds he gathers whatever minor optimizations
> he could get working quickly and commits those - and he did.  The result
> is a design that should run approx. 19/17 times = ~12% faster, and can
> be overclocked slightly (5% or so) on top of that.  This went into
> bleeding-jumbo earlier today, and we welcome testing by others (Royce?)

Fantastic! :)  I'm still being excessively cautious with my
substandard cooling setup (and I do understand that these boards
should run much cooler with descrypt-ztex than when they are used for
cryptocurrency work), so I haven't experimented with overclocking just
yet. But I have at least now done some basic testing.

> I am now getting ~806M c/s at standard clocks, ~840M at 5% overclocking
> (which appears stable on this board, but YMMV).  This is with the same
> Qubes USB pass-through as I described for my bcrypt-ztex testing here:
>
> http://www.openwall.com/lists/john-users/2017/06/25/1
>
> Performance should be higher without the virtualization (or with USB
> controller pass-through rather than individual device proxying).

Here are standard-clocks results on my setup - controlled directly
from a Raspberry Pi 2, through a couple of powered USB 2.0 hubs, to my
(now down to 14) functional boards:

$ time ./john -form=descrypt-ztex -inc=alpha -min-len=8 -max-len=8 -mask='?w?l?l?l?l' pw-fake-unix
ZTEX XXXXXXXXXX bus:1 dev:91 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:88 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:84 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:80 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:90 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:86 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:82 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:79 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:89 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:85 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:81 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:87 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:83 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:78 Frequency:220,160 220,160 220,160 220,160
Using default input encoding: UTF-8
Loaded 3269 password hashes with 2243 different salts (descrypt-ztex, traditional crypt(3) [DES ZTEX])
Press 'q' or Ctrl-C to abort, almost any other key for status
promethe         (u1286-des)
campbell         (u1822-des)
[...]
cocacola         (u371-des)
transfer         (u2807-des)
christop         (u1861-des)
139g 0:00:06:37  0.3499g/s 0p/s 9926Mc/s 14382MC/s loveaaaa..lovealks
salasana         (u1324-des)
liverpoo         (u1172-bigcrypt:1)
[...]
kimberly         (u1151-des)
kathleen         (u671-des)
198g 0:00:13:34 0.06% (ETA: 2017-07-15 13:43) 0.2431g/s 2514Kp/s 9906Mc/s 14295MC/s tuteaaaa..tutealks
robotics         (u2592-des)
katherin         (u427-des)
[...]
napoleon         (u2420-des)
223g 0:00:19:59 0.12% (ETA: 2017-07-11 12:28) 0.1859g/s 3414Kp/s 9896Mc/s 14209MC/s saswrkzc..dupoalks
nightsha         (u2434-des)
saturday         (u2630-des)
[...]
acropoli         (u894-bigcrypt:1)
december         (u1934-des)
mountain         (u305-des)
244g 0:00:29:42 0.18% (ETA: 2017-07-11 09:59) 0.1369g/s 3446Kp/s 9888Mc/s 14191MC/s shzbaaaa..shzbalks
chocolat         (u368-bigcrypt:1)
[...]
highland         (u2152-des)
swimming         (u1381-des)
294g 0:00:39:30 0.31% (ETA: 2017-07-09 03:35) 0.1240g/s 4319Kp/s 9885Mc/s 14117MC/s hestaaaa..hestalks
electric         (u1049-des)
[...]
majordom         (u693-bigcrypt:1)
lawrence         (u1163-des)
303g 0:00:46:41 0.37% (ETA: 2017-07-09 00:20) 0.1081g/s 4385Kp/s 9883Mc/s 14099MC/s emctaaaa..emctalks
charlott         (u573-des)
charlott         (u573-bigcrypt:1)
[...]
gilgames         (u2083-bigcrypt:1)
goodluck         (u195-des)
316g 0:00:53:08 0.43% (ETA: 2017-07-08 19:10) 0.09911g/s 4495Kp/s 9881Mc/s 14089MC/s tmriaaaa..tmrialks
gretchen         (u1096-des)
rastafar         (u1297-des)
[...]
gilgames         (u2083-des)
gammaphi         (u2064-des)
325g 0:00:59:59 0.49% (ETA: 2017-07-08 16:38) 0.09029g/s 4550Kp/s 9881Mc/s 14078MC/s nujuaaaa..nujualks
spitfire         (u794-des)


So after an hour, performance was ~706Mc/s per board (9881Mc/s across
14 boards, if I'm reading it right).
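
For anyone who wants to double-check that reading, here is a minimal
Python sketch that pulls the lowercase "c/s" field out of the last
status line above and divides it by the board count. The regular
expression and the helper name crypts_per_second are my own
illustration, not part of john, and the sketch assumes the status line
format shown in this run.

```python
import re

# Last status line from the run above, rejoined onto one line.
STATUS = ("325g 0:00:59:59 0.49% (ETA: 2017-07-08 16:38) 0.09029g/s "
          "4550Kp/s 9881Mc/s 14078MC/s nujuaaaa..nujualks")
BOARDS = 14  # functional boards in this run

# Match the lowercase "c/s" field only; the uppercase "C/s" field is skipped.
RATE_RE = re.compile(r"(\d+(?:\.\d+)?)([KMG]?)c/s")
SCALE = {"": 1, "K": 1e3, "M": 1e6, "G": 1e9}


def crypts_per_second(status_line):
    """Return the c/s figure from a status line, in crypts per second."""
    match = RATE_RE.search(status_line)
    if match is None:
        raise ValueError("no c/s field found")
    return float(match.group(1)) * SCALE[match.group(2)]


total = crypts_per_second(STATUS)
per_board = total / BOARDS
print(f"total: {total / 1e6:.0f} Mc/s, per board: {per_board / 1e6:.1f} Mc/s")
# prints: total: 9881 Mc/s, per board: 705.8 Mc/s
```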

I also used a Kill-A-Watt to roughly measure power consumption (just
of the supply to the boards, not any of the supporting gear):

~110W idle = ~7.9W/board
~470W under load = ~33.6W/board
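
As a rough cross-check of those per-board numbers, and to turn them
into an efficiency figure, here is a small Python sketch. It only
combines the Kill-A-Watt readings above with the 9881Mc/s total from
the last status line; the variable names are mine, and the efficiency
number ignores the Pi, hubs, and any power supply losses not seen by
the meter.

```python
BOARDS = 14
IDLE_W = 110.0       # Kill-A-Watt reading, boards idle
LOAD_W = 470.0       # Kill-A-Watt reading, boards under load
TOTAL_MCS = 9881.0   # Mc/s from the last status line of the run

idle_per_board = IDLE_W / BOARDS
load_per_board = LOAD_W / BOARDS
# Mc/s per watt is the same thing as millions of crypts per joule.
mcs_per_watt = TOTAL_MCS / LOAD_W

print(f"idle: {idle_per_board:.1f} W/board")
print(f"load: {load_per_board:.1f} W/board")
print(f"efficiency: {mcs_per_watt:.1f} Mc/s per watt")
# prints:
# idle: 7.9 W/board
# load: 33.6 W/board
# efficiency: 21.0 Mc/s per watt
```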

I also noted that CPU usage was around 40-44% during the run.

The only troubles that I've encountered so far are from either
apparent problems of scale, or problems with individual boards. (Most
users won't encounter the scale problems, but given the age of these
boards, some users may run into the per-board issues, so they may
warrant closer examination; I will file issues for these as you
suggest).

As for the individual board problems: during earlier (very kind!)
troubleshooting sessions, Denis already significantly improved how his
communication methods (ztex_inouttraffic?) handle my various failure
modes, but a few remain.

Specifically, I have a board or two that seem to fade in and out.
(I've swapped USB and power connectors among known good boards, so I
think that this is a problem associated with the boards themselves.)
Example behavior:

[...]
promethe         (u1286-des)
campbell         (u1822-des)
SN 04A36E0FE0 FPGA #2 error: app_status=0x02
SN 04A36E0FE0 error -1 doing r/w of FPGAs (LIBUSB_ERROR_IO)
Deassigned: 4
Found 1 device(s) ZTEX 1.15y
SN: 04A36E0FE0 productId: 10.15.0.0 "inouttraffic UFM 1.15y" busnum:1 devnum:101
stephani         (u234-bigcrypt:1)
pinkfloy         (u1273-des)


This is handled gracefully for a while, allowing the overall job to
continue, with the problematic board passing in and out of service for
a few minutes and being successfully redetected. And as far as I can
tell, that board is working during that time, because the c/s are
consistent with the number of boards. But eventually (usually within
the first five minutes of work), a segfault occurs and the entire job
stops:

[...]
monopoly         (u716-des)
blowfish         (u946-des)
XXXXXXXXXX CMP_EQUAL id=445: no task
SN XXXXXXXXXX: bad input packet
Deassigned: 4
Segmentation fault


This "no task" error is always associated with the same serial number
as that of the problematic board(s). Once the problematic boards are
removed from the cluster, these errors stop. It would of course be
preferable if a long-running job could continue after dropping an iffy
board, but I don't know if this would be feasible.

The apparent scale problem (which has been happening for me all along,
but is included here for completeness) is that during the firmware
upload that happens after initial power-up, if more than about 12
boards (it seems to vary) are present, a segmentation fault occurs:

SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
Segmentation fault


For this problem, there is a simple and consistently effective
workaround: re-running john with the same parameters
picks up where it left off, uploading firmware to any remaining
devices, and then proceeding on with uploading bitstreams and starting
the job. This is speculation, but it's almost as though there's a
problem with uploading firmware to more than a certain number of
devices in a row. If I had a much larger cluster, my guess is that the
second run would segfault after another 12 or so devices, and the
workaround would need to be reapplied until the last group of devices
falls under the segfault threshold, at which point work would begin.
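
The manual re-run could be automated along these lines. This is only a
sketch of the workaround, not anything john provides: the helper name
run_with_retries and its runner parameter are hypothetical, and it
relies on Python's subprocess module reporting a child killed by a
signal as a negative return code on POSIX systems. It also cannot tell
a firmware-upload segfault from a later one, so the attempt limit is
there to keep it from looping on a crash that happens mid-job.

```python
import signal
import subprocess

# The command line used earlier in this message (no shell quoting needed here).
JOHN_CMD = ["./john", "-form=descrypt-ztex", "-inc=alpha", "-min-len=8",
            "-max-len=8", "-mask=?w?l?l?l?l", "pw-fake-unix"]


def run_with_retries(cmd, max_attempts=5, runner=subprocess.run):
    """Re-run cmd while it dies from SIGSEGV.

    Returns (attempts_used, last_return_code). The runner argument is
    only there so the loop can be exercised without real hardware.
    """
    result = None
    for attempt in range(1, max_attempts + 1):
        result = runner(cmd)
        # On POSIX, a child killed by signal N has returncode == -N.
        if result.returncode != -signal.SIGSEGV:
            return attempt, result.returncode
    return max_attempts, result.returncode
```

Calling run_with_retries(JOHN_CMD) would then stand in for re-typing
the command by hand after each firmware-upload segfault.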

I can do further testing as needed. I do not have remotely
controllable power management of the cluster, so I usually have to be
physically present, but I should almost always be able to test within
a day or so of any request. I can do tests with a single board, or
with smaller groups of boards, if that would help.

I'm quite thrilled with both the descrypt and bcrypt work, and very
grateful to Denis and everyone else who has helped to work on it.
Kudos!

Royce