qemu/include
Jason A. Donenfeld eac7a7791b x86: don't let decompressed kernel image clobber setup_data
The setup_data links are appended to the compressed kernel image. Since
the kernel image is typically loaded at 0x100000, setup_data lives at
`0x100000 + compressed_size`, which does not get relocated during the
kernel's boot process.

The kernel typically decompresses the image starting at address
0x1000000 (note: one more zero than the compressed image's load address
above). This is fine for most kernels.

However, if the compressed image is actually quite large, then
setup_data will live at a `0x100000 + compressed_size` that extends into
the decompressed zone at 0x1000000. In other words, if compressed_size
is larger than `0x1000000 - 0x100000`, then the decompression step will
clobber setup_data, resulting in crashes.
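
The condition above can be written as a small range check. This is a minimal sketch, not QEMU code: the two addresses come from the text, while the function name and the parameters l1, l2 and l3 (compressed size, setup_data size, decompressed size, matching the diagrams below) are hypothetical. It tests for any overlap between the two ranges, which is slightly more general than the "compressed_size is larger than" wording.

```c
#include <stdbool.h>
#include <stdint.h>

/* Addresses taken from the commit message. */
#define COMPRESSED_LOAD_ADDR 0x100000ULL   /* where the compressed image is loaded */
#define DECOMPRESS_ADDR      0x1000000ULL  /* where decompression typically starts */

/*
 * Hypothetical helper: returns true if setup_data, appended right after a
 * compressed image of l1 bytes, overlaps the decompressed kernel.
 * l2 is the setup_data size, l3 the decompressed kernel size.
 */
static bool setup_data_clobbered(uint64_t l1, uint64_t l2, uint64_t l3)
{
    uint64_t sd_start = COMPRESSED_LOAD_ADDR + l1;
    uint64_t sd_end   = sd_start + l2;
    uint64_t dk_start = DECOMPRESS_ADDR;
    uint64_t dk_end   = dk_start + l3;

    /* Two half-open ranges overlap when each starts before the other ends. */
    return sd_start < dk_end && dk_start < sd_end;
}
```

With these constants, a setup_data that starts right after an image of 0x1000000 - 0x100000 = 0xF00000 bytes (15 MiB) or more lands inside the decompression zone.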

Visually, what happens now is that QEMU appends setup_data to the kernel
image:

          kernel image            setup_data
   |--------------------------||----------------|
0x100000                  0x100000+l1     0x100000+l1+l2

The problem is that this decompresses to 0x1000000 (one more zero). So
if l1 is > (0x1000000-0x100000), then this winds up looking like:

          kernel image            setup_data
   |--------------------------||----------------|
0x100000                  0x100000+l1     0x100000+l1+l2

                                 d e c o m p r e s s e d   k e r n e l
                     |-------------------------------------------------------------|
                0x1000000                                                     0x1000000+l3

The decompressed kernel seemingly overwriting the compressed kernel
image isn't a problem, because the compressed image gets relocated to a
higher address early on in the boot process, at the end of startup_64.
setup_data, however, stays in the same place, since those links are
self-referential and nothing fixes them up. So the decompressed kernel
clobbers it.
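
To illustrate why the links are self-referential, here is a minimal sketch of building a setup_data chain. The header fields (next, type, len) follow the Linux x86 boot protocol; the struct name, the build_chain helper and its parameters are hypothetical, and the sketch skips the alignment and real type values that an actual loader would use.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Node header as described by the Linux x86 boot protocol. */
struct setup_data_hdr {
    uint64_t next;  /* absolute physical address of the next node, 0 ends the list */
    uint32_t type;
    uint32_t len;   /* number of payload bytes following this header */
};

/*
 * Hypothetical helper: lay out n nodes back to back in blob, which is
 * assumed to be loaded at guest physical address load_addr.
 * Returns the total number of bytes written.
 */
static size_t build_chain(uint8_t *blob, uint64_t load_addr,
                          const uint32_t *lens, size_t n)
{
    size_t off = 0;

    for (size_t i = 0; i < n; i++) {
        struct setup_data_hdr hdr;
        size_t next_off = off + sizeof(hdr) + lens[i];

        /* The link bakes in load_addr, so the chain only works at that address. */
        hdr.next = (i + 1 < n) ? load_addr + next_off : 0;
        hdr.type = 0;  /* placeholder type for illustration */
        hdr.len = lens[i];

        memcpy(blob + off, &hdr, sizeof(hdr));
        memset(blob + off + sizeof(hdr), 0, lens[i]);  /* empty payload */
        off = next_off;
    }
    return off;
}
```

Because every next field holds an absolute address, copying the blob elsewhere without rewriting each link leaves the chain pointing at the old location, which is why setup_data cannot simply follow the relocated kernel image.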

Fix this by appending setup_data to the cmdline blob rather than the
kernel image blob, which remains at a lower address that won't get
clobbered.

This could have been done by overwriting the initrd blob instead, but
that poses big difficulties. It would no longer be possible to use
memory-mapped files for initrd, which hurts performance. More
importantly, the initrd address calculation is hard-coded in qboot, and
it always grows down rather than up, so lots of brittle semantics would
have to be changed, incurring more complexity. In contrast, using
cmdline is simple and doesn't interfere with anything.

The microvm machine has a gross hack where it fiddles with fw_cfg data
after the fact. So this hack is updated to account for this appending,
by reserving some bytes.

Fixup-by: Michael S. Tsirkin <mst@redhat.com>
Cc: x86@kernel.org
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Message-Id: <20221230220725.618763-1-Jason@zx2c4.com>
Message-ID: <20230128061015-mutt-send-email-mst@kernel.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eric Biggers <ebiggers@google.com>
Tested-by: Mathias Krause <minipli@grsecurity.net>
2023-01-28 06:21:29 -05:00
authz Prefer 'on' | 'off' over 'yes' | 'no' for bool options 2021-01-29 17:07:53 +00:00
block include/block: Untangle inclusion loops 2023-01-20 07:24:28 +01:00
chardev chardev: src buffer const for write functions 2022-09-29 14:38:05 +04:00
crypto crypto: Support export akcipher to pkcs8 2022-11-02 06:56:32 -04:00
disas target/loongarch: Add disassembler 2022-06-06 18:09:03 +00:00
exec intel-iommu: Document iova_tree 2023-01-27 11:47:02 -05:00
fpu fpu: Add rebias bool, value and operation 2022-08-31 14:08:05 -03:00
hw x86: don't let decompressed kernel image clobber setup_data 2023-01-28 06:21:29 -05:00
io coroutine: Split qemu/coroutine-core.h off qemu/coroutine.h 2023-01-20 07:21:46 +01:00
libdecnumber Replace config-time define HOST_WORDS_BIGENDIAN 2022-04-06 10:50:37 +02:00
migration migration: Remove load_state_old and minimum_version_id_old 2022-03-02 18:20:45 +00:00
monitor Header cleanup patches for 2023-01-20 2023-01-20 13:17:55 +00:00
net virtio-net: add support for configure interrupt 2023-01-08 01:54:22 -05:00
qapi qerror: QERR_PERMISSION_DENIED is no longer used, drop 2022-10-27 07:57:18 +02:00
qemu coroutine: Split qemu/coroutine-core.h off qemu/coroutine.h 2023-01-20 07:21:46 +01:00
qom qom/object: Remove circular include dependency 2022-06-28 10:53:32 +02:00
scsi coroutine: Clean up superfluous inclusion of qemu/coroutine.h 2023-01-19 10:18:28 +01:00
semihosting semihosting: Allow optional use of semihosting from userspace 2022-09-13 17:18:21 +01:00
standard-headers m68k: rework BI_VIRT_RNG_SEED as BI_RNG_SEED 2022-10-21 20:46:10 +02:00
sysemu include/block: Untangle inclusion loops 2023-01-20 07:24:28 +01:00
tcg tcg: Move tb_target_set_jmp_target declaration to tcg.h 2023-01-17 10:22:35 -10:00
ui ui: Split hmp_mouse_set() and move the HMP part to ui/ 2023-01-19 13:30:01 +01:00
user include: Include headers where needed 2023-01-08 01:54:22 -05:00
elf.h include/elf.h: add s390x note types 2022-10-26 12:54:59 +04:00
glib-compat.h compiler.h: replace QEMU_NORETURN with G_NORETURN 2022-04-21 17:03:51 +04:00
qemu-io.h Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
qemu-main.h ui/cocoa: Run qemu_init in the main thread 2022-09-23 14:36:33 +02:00