Message-ID: <CA+E3k9220jEnL8YrKiD15fjmY_veVR0Pb_QEouuT_mKs1Qy=Ug@mail.gmail.com>
Date: Thu, 29 Jun 2017 21:49:13 -0800
From: Royce Williams <royce@...ho.org>
To: john-users@...ts.openwall.com
Cc: apingis@...nwall.net
Subject: Re: DES-based crypt(3) cracking on ZTEX 1.15y FPGA boards (descrypt-ztex)

On Thu, Jun 29, 2017 at 10:39 AM, Solar Designer <solar@...nwall.com> wrote:
> Besides his recently committed work on bcrypt-ztex Denis has also been
> trying to redesign descrypt-ztex. While his attempts were promising
> (with ~50% greater expected speeds), they mostly failed so far with
> difficult to debug issues. Given the low demand for any of this (with
> it being mostly an experiment), I asked Denis that rather than keep
> trying to get much better speeds he gathers whatever minor optimizations
> he could get working quickly and commits those - and he did. The result
> is a design that should run approx. 19/17 times = ~12% faster, and can
> be overclocked slightly (5% or so) on top of that. This went into
> bleeding-jumbo earlier today, and we welcome testing by others (Royce?)

Fantastic! :)

I'm still being excessively cautious with my substandard cooling setup
(and I do understand that these boards should run much cooler with
descrypt-ztex than when they are used for cryptocurrency work), so I
haven't experimented with overclocking just yet. But I have at least
now done some basic testing.

> I am now getting ~806M c/s at standard clocks, ~840M at 5% overclocking
> (which appears stable on this board, but YMMV). This is with the same
> Qubes USB pass-through as I described for my bcrypt-ztex testing here:
>
> http://www.openwall.com/lists/john-users/2017/06/25/1
>
> Performance should be higher without the virtualization (or with USB
> controller pass-through rather than individual device proxying).

Here are standard-clocks results on my setup - controlled directly from
a Raspberry Pi 2, through a couple of powered USB 2.0 hubs, to my (now
down to 14) functional boards:

$ time ./john -form=descrypt-ztex -inc=alpha -min-len=8 -max-len=8 -mask='?w?l?l?l?l' pw-fake-unix
ZTEX XXXXXXXXXX bus:1 dev:91 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:88 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:84 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:80 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:90 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:86 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:82 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:79 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:89 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:85 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:81 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:87 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:83 Frequency:220,160 220,160 220,160 220,160
ZTEX XXXXXXXXXX bus:1 dev:78 Frequency:220,160 220,160 220,160 220,160
Using default input encoding: UTF-8
Loaded 3269 password hashes with 2243 different salts (descrypt-ztex, traditional crypt(3) [DES ZTEX])
Press 'q' or Ctrl-C to abort, almost any other key for status
promethe (u1286-des)
campbell (u1822-des)
[...]
cocacola (u371-des)
transfer (u2807-des)
christop (u1861-des)
139g 0:00:06:37 0.3499g/s 0p/s 9926Mc/s 14382MC/s loveaaaa..lovealks
salasana (u1324-des)
liverpoo (u1172-bigcrypt:1)
[...]
kimberly (u1151-des)
kathleen (u671-des)
198g 0:00:13:34 0.06% (ETA: 2017-07-15 13:43) 0.2431g/s 2514Kp/s 9906Mc/s 14295MC/s tuteaaaa..tutealks
robotics (u2592-des)
katherin (u427-des)
[...]
napoleon (u2420-des)
223g 0:00:19:59 0.12% (ETA: 2017-07-11 12:28) 0.1859g/s 3414Kp/s 9896Mc/s 14209MC/s saswrkzc..dupoalks
nightsha (u2434-des)
saturday (u2630-des)
[...]
acropoli (u894-bigcrypt:1)
december (u1934-des)
mountain (u305-des)
244g 0:00:29:42 0.18% (ETA: 2017-07-11 09:59) 0.1369g/s 3446Kp/s 9888Mc/s 14191MC/s shzbaaaa..shzbalks
chocolat (u368-bigcrypt:1)
[...]
highland (u2152-des)
swimming (u1381-des)
294g 0:00:39:30 0.31% (ETA: 2017-07-09 03:35) 0.1240g/s 4319Kp/s 9885Mc/s 14117MC/s hestaaaa..hestalks
electric (u1049-des)
[...]
majordom (u693-bigcrypt:1)
lawrence (u1163-des)
303g 0:00:46:41 0.37% (ETA: 2017-07-09 00:20) 0.1081g/s 4385Kp/s 9883Mc/s 14099MC/s emctaaaa..emctalks
charlott (u573-des)
charlott (u573-bigcrypt:1)
[...]
gilgames (u2083-bigcrypt:1)
goodluck (u195-des)
316g 0:00:53:08 0.43% (ETA: 2017-07-08 19:10) 0.09911g/s 4495Kp/s 9881Mc/s 14089MC/s tmriaaaa..tmrialks
gretchen (u1096-des)
rastafar (u1297-des)
[...]
gilgames (u2083-des)
gammaphi (u2064-des)
325g 0:00:59:59 0.49% (ETA: 2017-07-08 16:38) 0.09029g/s 4550Kp/s 9881Mc/s 14078MC/s nujuaaaa..nujualks
spitfire (u794-des)

So after an hour, performance was ~706Mc/s per board (9881Mc/s across
14 boards, if I'm reading it right).

I also used a Kill-A-Watt to roughly measure power consumption (just of
the supply to the boards, not any of the supporting gear):

~110W idle = ~7.8W/board
~470W under load = ~33.6W/board

I also noted that CPU usage was around 40-44% during the run.

The only troubles that I've encountered so far come from either
apparent problems of scale or problems with individual boards. (Most
users won't encounter the scale problems, but given the age of these
boards, some users may run into the per-board issues, so they may
warrant closer examination; I will file issues for these as you
suggest.)

With the individual board problems, during earlier (very kind!)
troubleshooting sessions, Denis already significantly improved how his
communication methods (ztex_inouttraffic?)
handled my various failure modes, but a few remain.

Specifically, I have a board or two that seem to fade in and out.
(I've swapped USB and power connectors among known good boards, so I
think that this is a problem associated with the boards themselves.)
Example behavior:

[...]
promethe (u1286-des)
campbell (u1822-des)
SN 04A36E0FE0 FPGA #2 error: app_status=0x02
SN 04A36E0FE0 error -1 doing r/w of FPGAs (LIBUSB_ERROR_IO)
Deassigned: 4
Found 1 device(s) ZTEX 1.15y
SN: 04A36E0FE0 productId: 10.15.0.0 "inouttraffic UFM 1.15y" busnum:1 devnum:101
stephani (u234-bigcrypt:1)
pinkfloy (u1273-des)

This is handled gracefully for a while, allowing the overall job to
continue, with the problematic board passing in and out of service for
a few minutes and being successfully redetected. And as far as I can
tell, that board is working during that time, because the c/s are
consistent with the number of boards.

But eventually (usually within the first five minutes of work), a
segfault occurs and the entire job stops:

[...]
monopoly (u716-des)
blowfish (u946-des)
XXXXXXXXXX CMP_EQUAL id=445: no task
SN XXXXXXXXXX: bad input packet
Deassigned: 4
Segmentation fault

This "no task" error is always associated with the same serial number
as that of the problematic board(s). Once the problematic boards are
removed from the cluster, these errors stop. It would of course be
preferable if a long-running job could continue after dropping an iffy
board, but I don't know if this would be feasible.

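A minimal sketch of how one might pick out the flaky boards from a saved copy of john's terminal output, by counting error-looking lines per serial number. It assumes the error lines look exactly like the excerpts above; the function name tally_board_errors and both regular expressions are made up for illustration and are not part of John the Ripper.

```python
import re
from collections import Counter

# Line shapes are copied from the log excerpts above. The ZTEX code may
# print other variants, so both patterns are assumptions.
SN_ERROR = re.compile(r"^SN\s+(\S+?):?\s+(?:.*\berror\b.*|bad input packet)$")
NO_TASK = re.compile(r"^(\S+)\s+CMP_EQUAL\s+id=\d+:\s+no task$")


def tally_board_errors(lines):
    """Return a Counter mapping board serial number to error-line count."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Cracked-password lines, "SN: ..." redetection lines and
        # "firmware uploaded" lines match neither pattern and are skipped.
        match = SN_ERROR.match(line) or NO_TASK.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open file of captured output to tally_board_errors and looking at counts.most_common() would then list the serial numbers worth pulling from the cluster first.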
The apparent scale problem -- which has been happening for me all
along, but is included here for completeness -- is that during the
firmware upload that happens after initial power-up, if more than about
12 boards (it seems to vary) are present, a segmentation fault occurs:

SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
SN XXXXXXXXXX: firmware uploaded
Segmentation fault

For this problem, there is a simple and consistently effective
workaround. Re-running john with the same parameters picks up where it
left off, uploading firmware to any remaining devices, and then
proceeding with uploading bitstreams and starting the job.

This is speculation, but it's almost as though there's a problem with
uploading firmware to more than a certain number of devices in a row.
If I had a much larger cluster, my guess is that the second run would
segfault after another 12 or so devices, and the workaround would need
to be reapplied until the last group of devices falls under the
segfault threshold, at which point work would begin.

I can do further testing as needed. I do not have remotely controllable
power management of the cluster, so I usually have to be physically
present, but I should almost always be able to test within a day or so
of any request. I can also test with a single board, or with smaller
groups of boards, if that would help.

I'm quite thrilled with both the descrypt and bcrypt work, and very
grateful to Denis and everyone else who has helped to work on it.
Kudos!

Royce