mirror of https://github.com/Motorhead1991/qemu.git (synced 2025-12-11 16:00:50 -07:00)
scsi: build qemu-pr-helper
Introduce a privileged helper to run persistent reservation commands.
This lets virtual machines send persistent reservations without using
CAP_SYS_RAWIO or out-of-tree patches. The helper uses Unix permissions
and SCM_RIGHTS to restrict access to processes that can access its
socket and prove that they have an open file descriptor for a raw SCSI
device.

The next patch will also correct the usage of persistent reservations
with multipath devices.

It would also be possible to add support for Linux's IOC_PR_* ioctls
in the future, to support NVMe devices. For now, however, only SCSI is
supported.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
parent 7c9e527659
commit b855f8d175
6 changed files with 905 additions and 5 deletions
83
docs/interop/pr-helper.rst
Normal file
@@ -0,0 +1,83 @@
..

======================================
Persistent reservation helper protocol
======================================

QEMU's SCSI passthrough devices, ``scsi-block`` and ``scsi-generic``,
can delegate implementation of persistent reservations to an external
(and typically privileged) program. Persistent reservations allow
restricting access to block devices to specific initiators in a shared
storage setup.

For a more detailed reference please refer to the SCSI Primary
Commands standard, specifically the section on Reservations and the
"PERSISTENT RESERVE IN" and "PERSISTENT RESERVE OUT" commands.

This document describes the socket protocol used between QEMU's
``pr-manager-helper`` object and the external program.

.. contents::

Connection and initialization
-----------------------------

All data transmitted on the socket is big-endian.

After QEMU connects to the helper program's socket, the helper starts a
simple feature negotiation process by writing four bytes corresponding to
the features it exposes (``supported_features``). QEMU reads them,
then writes four bytes corresponding to the features it wants the
helper program to enable (``requested_features``).

If a bit is 1 in ``requested_features`` and 0 in ``supported_features``,
the corresponding feature is not supported by the helper and the connection
is closed. On the other hand, it is acceptable for a bit to be 0 in
``requested_features`` and 1 in ``supported_features``; in this case,
the helper will not enable the feature.

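As a rough illustration, the client side of this handshake could be sketched
in Python as follows. The names ``recv_exact`` and ``negotiate_features`` are
hypothetical and not part of QEMU. The sketch assumes an already connected
Unix stream socket; refusing locally when a wanted bit is unsupported is a
choice made by this sketch, not something the protocol mandates.

```python
import socket
import struct


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes the socket early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed during read")
        buf += chunk
    return bytes(buf)


def negotiate_features(sock: socket.socket, wanted: int = 0) -> int:
    """Hypothetical client-side handshake; returns supported_features."""
    # The helper speaks first: four big-endian bytes of supported_features.
    (supported,) = struct.unpack(">I", recv_exact(sock, 4))
    # A bit set in requested_features but clear in supported_features would
    # make the helper close the connection, so this sketch refuses up front.
    if wanted & ~supported:
        raise ConnectionError("helper lacks a requested feature")
    # Reply with four big-endian bytes of requested_features.
    sock.sendall(struct.pack(">I", wanted))
    return supported
```

Calling ``negotiate_features(sock)`` with the default ``wanted=0`` requests no
features at all.
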
Right now no feature is defined, so each party always writes four
zero bytes.

Command format
--------------

It is invalid to send multiple commands concurrently on the same
socket. It is however possible to connect multiple sockets to the
helper and send multiple commands to the helper for one or more
file descriptors.

A command consists of a request and a response. A request consists
of a 16-byte SCSI CDB. A file descriptor must be passed to the helper
together with the SCSI CDB using ancillary data.

The CDB has the following limitations:

- the command (stored in the first byte) must be one of 0x5E
  (PERSISTENT RESERVE IN) or 0x5F (PERSISTENT RESERVE OUT).

- the allocation length (stored in bytes 7-8 of the CDB for PERSISTENT
  RESERVE IN) or parameter list length (stored in bytes 5-8 of the CDB
  for PERSISTENT RESERVE OUT) is limited to 8 KiB.

For PERSISTENT RESERVE OUT, the parameter list is sent right after the
CDB. The length of the parameter list is taken from the CDB itself.

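The request framing described above might look like the following Python
sketch. The function names are hypothetical, and placing the service action
in the low five bits of byte 1 follows the SCSI Primary Commands layout
rather than anything stated in this document. The sketch assumes Python 3.9
or later on a Unix host, where ``socket.send_fds`` wraps ``SCM_RIGHTS``
ancillary data.

```python
import socket
import struct

PR_IN = 0x5E    # PERSISTENT RESERVE IN opcode
PR_OUT = 0x5F   # PERSISTENT RESERVE OUT opcode
CDB_SIZE = 16   # a request always carries a 16-byte CDB
MAX_LEN = 8192  # 8 KiB limit on allocation / parameter list length


def build_pr_in_cdb(service_action: int, alloc_len: int) -> bytes:
    """Build a PERSISTENT RESERVE IN CDB, zero-padded to 16 bytes."""
    if not 0 <= alloc_len <= MAX_LEN:
        raise ValueError("allocation length must be at most 8 KiB")
    cdb = bytearray(CDB_SIZE)
    cdb[0] = PR_IN
    cdb[1] = service_action & 0x1F           # assumption: SPC layout
    cdb[7:9] = struct.pack(">H", alloc_len)  # bytes 7-8, big-endian
    return bytes(cdb)


def build_pr_out_cdb(service_action: int, param_len: int) -> bytes:
    """Build a PERSISTENT RESERVE OUT CDB, zero-padded to 16 bytes."""
    if not 0 <= param_len <= MAX_LEN:
        raise ValueError("parameter list length must be at most 8 KiB")
    cdb = bytearray(CDB_SIZE)
    cdb[0] = PR_OUT
    cdb[1] = service_action & 0x1F           # assumption: SPC layout
    cdb[5:9] = struct.pack(">I", param_len)  # bytes 5-8, big-endian
    return bytes(cdb)


def send_request(sock: socket.socket, dev_fd: int, cdb: bytes,
                 params: bytes = b"") -> None:
    """Send the CDB with the device fd attached, then any parameter list."""
    socket.send_fds(sock, [cdb], [dev_fd])
    if params:
        sock.sendall(params)  # PERSISTENT RESERVE OUT only
```

Here ``dev_fd`` stands for a file descriptor the caller has already opened
on the SCSI device.
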

The helper's reply has the following structure:

- 4 bytes for the SCSI status

- 4 bytes for the payload size (nonzero only for PERSISTENT RESERVE IN
  and only if the SCSI status is 0x00, i.e. GOOD)

- 96 bytes for the SCSI sense data

- if the size is nonzero, the payload follows

The sense data is always sent to keep the protocol simple, even though
it is only valid if the SCSI status is CHECK CONDITION (0x02).

The payload size is always less than or equal to the allocation length
specified in the CDB for the PERSISTENT RESERVE IN command.

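A reader for this reply layout could be sketched in Python as below. The
names ``REPLY_HEADER``, ``recv_exact`` and ``read_reply`` are hypothetical.
Returning ``None`` for the sense data when the status is not CHECK CONDITION
is a convenience of the sketch, since the helper always transmits the 96
bytes.

```python
import socket
import struct

# Big-endian: 4-byte status, 4-byte payload size, 96 bytes of sense data.
REPLY_HEADER = struct.Struct(">II96s")
SCSI_GOOD = 0x00
SCSI_CHECK_CONDITION = 0x02


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or raise if the peer closes the socket early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed during read")
        buf += chunk
    return bytes(buf)


def read_reply(sock: socket.socket, alloc_len: int):
    """Return (status, sense_or_None, payload) for one helper reply."""
    header = recv_exact(sock, REPLY_HEADER.size)
    status, size, sense = REPLY_HEADER.unpack(header)
    # The payload may never exceed the allocation length from the CDB.
    if size > alloc_len:
        raise ValueError("payload larger than allocation length")
    payload = recv_exact(sock, size) if size else b""
    # Sense data is only meaningful for CHECK CONDITION.
    if status != SCSI_CHECK_CONDITION:
        sense = None
    return status, sense, payload
```
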

If the protocol is violated, the helper closes the socket.

@@ -49,3 +49,36 @@ Alternatively, using ``-blockdev``::
    -object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock
    -blockdev node-name=hd,driver=raw,file.driver=host_device,file.filename=/dev/sdb,file.pr-manager=helper0
    -device scsi-block,drive=hd

----------------------------------
Invoking :program:`qemu-pr-helper`
----------------------------------

QEMU provides an implementation of the persistent reservation helper,
called :program:`qemu-pr-helper`. The helper should be started as a
system service and supports the following options:

-d, --daemon              run in the background
-q, --quiet               decrease verbosity
-f, --pidfile=path        PID file when running as a daemon
-k, --socket=path         path to the socket
-T, --trace=trace-opts    tracing options

By default, the socket and PID file are placed in the runtime state
directory, for example :file:`/var/run/qemu-pr-helper.sock` and
:file:`/var/run/qemu-pr-helper.pid`. The PID file is not created
unless :option:`-d` is passed too.

:program:`qemu-pr-helper` can also use the systemd socket activation
protocol. In this case, the systemd socket unit should specify a
Unix stream socket, like this::

    [Socket]
    ListenStream=/var/run/qemu-pr-helper.sock

After connecting to the socket, :program:`qemu-pr-helper` can optionally drop
root privileges, except for those capabilities that are needed for
its operation. To do this, add the following options:

-u, --user=user           user to drop privileges to
-g, --group=group         group to drop privileges to