Message-ID: <20240206170142.GA2656@openwall.com>
Date: Tue, 6 Feb 2024 18:01:42 +0100
From: Solar Designer <solar@...nwall.com>
To: oss-security@...ts.openwall.com
Subject: CVE-2024-1048: grub2-set-bootflag may be abused to fill up /boot, bypass RLIMIT_NPROC

Hi,

Summary:

This message is about issues in grub-set-bootflag.c commonly installed as grub2-set-bootflag, which is Red Hat's addition (not part of upstream GRUB project) used at least in Fedora and RHEL and its downstreams. It is a SUID root program. I think its latest development source code is currently located in this branch:

https://github.com/rhboot/grub2/tree/fedora-40

On non-OSTree distros, this program's purpose appears to be purely cosmetic - hide the boot menu if the system had already successfully booted up with its current kernel and a user had successfully logged in.

Impact of the issues I identified (through my work at CIQ on Rocky Linux) is rather limited - denial of service and resource limit bypass.

I pre-notified Red Hat grub2 package maintainers about upcoming issues in this program in late December, and reported them in detail via Red Hat Bugzilla on January 3:

https://bugzilla.redhat.com/show_bug.cgi?id=2256678

(This is currently a private "bug", hopefully it will be opened soon.)

I also reported this to linux-distros on January 24, and today February 6 is the coordinated public disclosure.

Attached are my currently proposed patches (two revisions, see below), tested by me on Rocky Linux 9.3, and (for the later revision) also by people at Red Hat.

Red Hat assigned this issue CVE-2024-1048 and rated it as CVSSv3.1 Base Score 3.3 and Moderate severity, which I agree with:

CVSS:3.1/AV:L/AC:L/PR:L/UI:N/S:U/C:N/I:N/A:L - 3.3

Technically, the RLIMIT_NPROC bypass could mean S:C A:H, resulting in a score of 6.5, however in practice for this to matter the resource limits would need to be set up, which by default and on most systems they are not anyway. That is, by default almost the same kind and extent of DoS is possible by a simple "fork bomb" from the user's account, so there's no additional vulnerability.

I'd like to thank Red Hat, and especially Marta Lewandowska for her help in coordinating this disclosure and testing the patches.

Overall, I think that at least on Enterprise Linux distros unprivileged setting of boot flags should be disabled by default. It is of questionable value and isn't worth the risk. That said, I understand that for now it may be easier for distros to patch than to re-think it.

Detail:

In 2019, Tavis Ormandy reported that the original implementation of grub2-set-bootflag could be abused to truncate the grubenv file. This is CVE-2019-14865 and was fixed back then:

https://bugzilla.redhat.com/show_bug.cgi?id=CVE-2019-14865
https://access.redhat.com/errata/RHSA-2020:0335

Taking a fresh look at grub2-set-bootflag, I saw some other ways in which users could still abuse this little program:

1. After CVE-2019-14865 fix, grub2-set-bootflag no longer rewrites the grubenv file in-place, but writes into a temporary file and renames it over the original, checking for error returns from each call first. This prevents the original file truncation vulnerability, but it can leave the temporary file around if the program is killed before it can rename or remove the file. There are still many ways to get the program killed, such as through RLIMIT_FSIZE triggering SIGXFSZ (tested, reliable) or by careful timing (tricky) of signals sent by process group leader, pty, pre-scheduled timers, SIGXCPU (probably not an exhaustive list). Invoking the program multiple times fills up /boot (or if /boot is not separate, then it can fill up the root filesystem). Since the files are tiny, the filesystem is likely to run out of free inodes before it'd run out of blocks, but the effect is similar - can't create new files after this point (but still can add data to existing files, such as logs).

2. After CVE-2019-14865 fix, grub2-set-bootflag naively tries to protect itself from signals by becoming full root. (This does protect it from signals sent by the user directly to the PID, but e.g. "kill -9 -1" by the user still works.) A side effect of such "protection" is that it's possible to invoke more concurrent instances of grub2-set-bootflag than the user's RLIMIT_NPROC would normally permit (as specified e.g. in /etc/security/limits.conf, or say in Apache httpd's RLimitNPROC if grub2-set-bootflag would be abused by a website script), thereby exhausting system resources (e.g., bypassing RAM usage limit if RLIMIT_AS was also set).

3. umask is inherited. Again, due to how the CVE-2019-14865 fix creates a new file, and due to how mkstemp() works, this affects grubenv's new file permissions. Luckily, mkstemp() forces them to be no more relaxed than 0600, but the user ends up being able to set them e.g. to 0. Luckily, at least in my testing GRUB still works fine even when the file has such (lack of) permissions.

The attached -1 patch deals with my example abuses above as follows:

1. RLIMIT_FSIZE is pre-checked, so this specific way to get the process killed should no longer work. However, this isn't a complete fix because there are other ways to get the process killed after it has created the temporary file. The patch also fixes bug 1975892 ("RFE: grub2-set-bootflag should not write the grubenv when the flag being written is already set") and similar for "menu_show_once", which further reduces the abuse potential.

2. RLIMIT_NPROC bypass should be avoided by not becoming full root (aka dropping the partial "kill protection").

3. A safe umask is set.

The -1 patch is a partial fix (temporary files can still accumulate, but this is harder to trigger). It should be safe to use.

The attached -7 patch additionally switches to usage of per-user fixed temporary filenames along with a weird locking mechanism, which is explained in source code comments. This is a more complete fix (temporary files can't accumulate). Unfortunately, it introduces new risks (by working on a temporary file shared between the user's invocations), which are _hopefully_ avoided by the patch's elaborate logic. I actually got it wrong at first, which suggests that this logic is hard to reason about, and more errors or omissions are possible. It also relies on the kernel's primitives' exact semantics to a greater extent (nothing out of the ordinary, though).

Both patches also fix potential 1- or 2-byte over-read of env[] if its content is malformed - this was not a security issue since the grubenv file is trusted input, and the fix is just for robustness.

Also attached is a program I wrote and used to test the unusual approach to locking implemented in the -7 patch here.

Remaining issues that I think cannot reasonably be fixed without a redesign (e.g., having per-flag files with nothing else in them) and without introducing new issues:

A. A user can still revert a concurrent user's attempt of setting the other flag - or of making other changes to grubenv by means other than this program.

B. One leftover temporary file per user is still possible.

Needs comments by people more familiar with GRUB and its configurations in use:

C. One hopefully non-issue (but I am not sure): can "menu_show_once" possibly make the system stuck at next boot? Apparently, not with defaults, but maybe along with other GRUB settings in place? If so, it could be unsafe to expose setting this flag to users. A misfeature?

Security hardening not yet implemented (would require changes or at least decisions outside of this program's code):

D. If this program's functionality is really desirable anywhere at all, perhaps its availability should vary by distro - e.g., have it on (some builds of) Fedora, but not on Enterprise Linux distros - and then don't make this program SUID root where that is not needed.

E. The program could refuse to work (exit early) if invoked by an unexpected system pseudo-user. Apparently, it's expected to be invoked by all normal users, but we can nevertheless disallow uid < 1000, so the program couldn't be abused by a compromised system pseudo-user account in a multi-vulnerability multi-step attack.

F. grubenv could be made a symlink into a subdirectory writable by a group, then SGID to that group could be used, mostly to reduce impact of some other (yet unidentified) vulnerabilities/attacks on the program.

Regarding remaining issue/idea D above, even RHEL installs /usr/lib/systemd/user/grub-boot-success.service, which then fails to run upon user login when the program is not user-accessible. The impact from this failure, however, appears to be very limited - just some noise in the logs. The -7 patch includes a piece to reduce such noise if the program is installed e.g. mode 755.

Overall, my understanding is that the program (and other related parts using the boot success flag) is most useful on systems with OSTree, which means some builds of Fedora, right? Per Wikipedia it's "Fedora's atomic spins (Silverblue, Kinoite, and Sericea)".

https://en.wikipedia.org/wiki/OSTree

Should we get rid of it on other distros? Or on the contrary, should we make real, non-cosmetic use of the boot flag? If not setting the flag would trigger automatic fallback to the previous kernel, that could be a valuable enough feature to justify some risks, but on the other hand such fallback would also be unexpected by many and it'd be a security concern on its own. A server could successfully boot into the new kernel and be in use without any Unix user logins to it occurring until next reboot. It shouldn't then revert to the old kernel just because no one had logged in. So the feature would need to be opt-in by the sysadmin or/and the criteria for fallback would need to be different.
Alexander

Attachment "grub-set-bootflag-rocky-1.patch" of type "text/plain" (2112 bytes)
Attachment "grub-set-bootflag-rocky-7.patch" of type "text/plain" (6387 bytes)
Attachment "locktest.c" of type "text/x-c" (2348 bytes)