mirror of
https://github.com/Motorhead1991/qemu.git
synced 2025-08-03 15:53:54 -06:00
virtiofsd: Remove source
Now remove all the source. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
This commit is contained in:
parent
8ab5e8a503
commit
e0dc2631ec
19 changed files with 0 additions and 13693 deletions
|
@ -1,403 +0,0 @@
|
|||
QEMU virtio-fs shared file system daemon
|
||||
========================================
|
||||
|
||||
Synopsis
|
||||
--------
|
||||
|
||||
**virtiofsd** [*OPTIONS*]
|
||||
|
||||
Description
|
||||
-----------
|
||||
|
||||
Share a host directory tree with a guest through a virtio-fs device. This
|
||||
program is a vhost-user backend that implements the virtio-fs device. Each
|
||||
virtio-fs device instance requires its own virtiofsd process.
|
||||
|
||||
This program is designed to work with QEMU's ``--device vhost-user-fs-pci``
|
||||
but should work with any virtual machine monitor (VMM) that supports
|
||||
vhost-user. See the Examples section below.
|
||||
|
||||
This program must be run as the root user. The program drops privileges where
|
||||
possible during startup although it must be able to create and access files
|
||||
with any uid/gid:
|
||||
|
||||
* The ability to invoke syscalls is limited using seccomp(2).
|
||||
* Linux capabilities(7) are dropped.
|
||||
|
||||
In "namespace" sandbox mode the program switches into a new file system
|
||||
namespace and invokes pivot_root(2) to make the shared directory tree its root.
|
||||
A new pid and net namespace is also created to isolate the process.
|
||||
|
||||
In "chroot" sandbox mode the program invokes chroot(2) to make the shared
|
||||
directory tree its root. This mode is intended for container environments where
|
||||
the container runtime has already set up the namespaces and the program does
|
||||
not have permission to create namespaces itself.
|
||||
|
||||
Both sandbox modes prevent "file system escapes" due to symlinks and other file
|
||||
system objects that might lead to files outside the shared directory.
|
||||
|
||||
Options
|
||||
-------
|
||||
|
||||
.. program:: virtiofsd
|
||||
|
||||
.. option:: -h, --help
|
||||
|
||||
Print help.
|
||||
|
||||
.. option:: -V, --version
|
||||
|
||||
Print version.
|
||||
|
||||
.. option:: -d
|
||||
|
||||
Enable debug output.
|
||||
|
||||
.. option:: --syslog
|
||||
|
||||
Print log messages to syslog instead of stderr.
|
||||
|
||||
.. option:: -o OPTION
|
||||
|
||||
* debug -
|
||||
Enable debug output.
|
||||
|
||||
* flock|no_flock -
|
||||
Enable/disable flock. The default is ``no_flock``.
|
||||
|
||||
* modcaps=CAPLIST
|
||||
Modify the list of capabilities allowed; CAPLIST is a colon separated
|
||||
list of capabilities, each preceded by either + or -, e.g.
|
||||
''+sys_admin:-chown''.
|
||||
|
||||
* log_level=LEVEL -
|
||||
Print only log messages matching LEVEL or more severe. LEVEL is one of
|
||||
``err``, ``warn``, ``info``, or ``debug``. The default is ``info``.
|
||||
|
||||
* posix_lock|no_posix_lock -
|
||||
Enable/disable remote POSIX locks. The default is ``no_posix_lock``.
|
||||
|
||||
* readdirplus|no_readdirplus -
|
||||
Enable/disable readdirplus. The default is ``readdirplus``.
|
||||
|
||||
* sandbox=namespace|chroot -
|
||||
Sandbox mode:
|
||||
- namespace: Create mount, pid, and net namespaces and pivot_root(2) into
|
||||
the shared directory.
|
||||
- chroot: chroot(2) into shared directory (use in containers).
|
||||
The default is "namespace".
|
||||
|
||||
* source=PATH -
|
||||
Share host directory tree located at PATH. This option is required.
|
||||
|
||||
* timeout=TIMEOUT -
|
||||
I/O timeout in seconds. The default depends on cache= option.
|
||||
|
||||
* writeback|no_writeback -
|
||||
Enable/disable writeback cache. The cache allows the FUSE client to buffer
|
||||
and merge write requests. The default is ``no_writeback``.
|
||||
|
||||
* xattr|no_xattr -
|
||||
Enable/disable extended attributes (xattr) on files and directories. The
|
||||
default is ``no_xattr``.
|
||||
|
||||
* posix_acl|no_posix_acl -
|
||||
Enable/disable posix acl support. Posix ACLs are disabled by default.
|
||||
|
||||
* security_label|no_security_label -
|
||||
Enable/disable security label support. Security labels are disabled by
|
||||
default. This will allow client to send a MAC label of file during
|
||||
file creation. Typically this is expected to be SELinux security
|
||||
label. Server will try to set that label on newly created file
|
||||
atomically wherever possible.
|
||||
|
||||
* killpriv_v2|no_killpriv_v2 -
|
||||
Enable/disable ``FUSE_HANDLE_KILLPRIV_V2`` support. KILLPRIV_V2 is enabled
|
||||
by default as long as the client supports it. Enabling this option helps
|
||||
with performance in write path.
|
||||
|
||||
.. option:: --socket-path=PATH
|
||||
|
||||
Listen on vhost-user UNIX domain socket at PATH.
|
||||
|
||||
.. option:: --socket-group=GROUP
|
||||
|
||||
Set the vhost-user UNIX domain socket gid to GROUP.
|
||||
|
||||
.. option:: --fd=FDNUM
|
||||
|
||||
Accept connections from vhost-user UNIX domain socket file descriptor FDNUM.
|
||||
The file descriptor must already be listening for connections.
|
||||
|
||||
.. option:: --thread-pool-size=NUM
|
||||
|
||||
Restrict the number of worker threads per request queue to NUM. The default
|
||||
is 0.
|
||||
|
||||
.. option:: --cache=none|auto|always
|
||||
|
||||
Select the desired trade-off between coherency and performance. ``none``
|
||||
forbids the FUSE client from caching to achieve best coherency at the cost of
|
||||
performance. ``auto`` acts similar to NFS with a 1 second metadata cache
|
||||
timeout. ``always`` sets a long cache lifetime at the expense of coherency.
|
||||
The default is ``auto``.
|
||||
|
||||
Extended attribute (xattr) mapping
|
||||
----------------------------------
|
||||
|
||||
By default the name of xattr's used by the client are passed through to the server
|
||||
file system. This can be a problem where either those xattr names are used
|
||||
by something on the server (e.g. selinux client/server confusion) or if the
|
||||
``virtiofsd`` is running in a container with restricted privileges where it
|
||||
cannot access some attributes.
|
||||
|
||||
Mapping syntax
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
A mapping of xattr names can be made using -o xattrmap=mapping where the ``mapping``
|
||||
string consists of a series of rules.
|
||||
|
||||
The first matching rule terminates the mapping.
|
||||
The set of rules must include a terminating rule to match any remaining attributes
|
||||
at the end.
|
||||
|
||||
Each rule consists of a number of fields separated with a separator that is the
|
||||
first non-white space character in the rule. This separator must then be used
|
||||
for the whole rule.
|
||||
White space may be added before and after each rule.
|
||||
|
||||
Using ':' as the separator a rule is of the form:
|
||||
|
||||
``:type:scope:key:prepend:``
|
||||
|
||||
**scope** is:
|
||||
|
||||
- 'client' - match 'key' against a xattr name from the client for
|
||||
setxattr/getxattr/removexattr
|
||||
- 'server' - match 'prepend' against a xattr name from the server
|
||||
for listxattr
|
||||
- 'all' - can be used to make a single rule where both the server
|
||||
and client matches are triggered.
|
||||
|
||||
**type** is one of:
|
||||
|
||||
- 'prefix' - is designed to prepend and strip a prefix; the modified
|
||||
attributes then being passed on to the client/server.
|
||||
|
||||
- 'ok' - Causes the rule set to be terminated when a match is found
|
||||
while allowing matching xattr's through unchanged.
|
||||
It is intended both as a way of explicitly terminating
|
||||
the list of rules, and to allow some xattr's to skip following rules.
|
||||
|
||||
- 'bad' - If a client tries to use a name matching 'key' it's
|
||||
denied using EPERM; when the server passes an attribute
|
||||
name matching 'prepend' it's hidden. In many ways it's use is very like
|
||||
'ok' as either an explicit terminator or for special handling of certain
|
||||
patterns.
|
||||
|
||||
- 'unsupported' - If a client tries to use a name matching 'key' it's
|
||||
denied using ENOTSUP; when the server passes an attribute
|
||||
name matching 'prepend' it's hidden. In many ways it's use is very like
|
||||
'ok' as either an explicit terminator or for special handling of certain
|
||||
patterns.
|
||||
|
||||
**key** is a string tested as a prefix on an attribute name originating
|
||||
on the client. It maybe empty in which case a 'client' rule
|
||||
will always match on client names.
|
||||
|
||||
**prepend** is a string tested as a prefix on an attribute name originating
|
||||
on the server, and used as a new prefix. It may be empty
|
||||
in which case a 'server' rule will always match on all names from
|
||||
the server.
|
||||
|
||||
e.g.:
|
||||
|
||||
``:prefix:client:trusted.:user.virtiofs.:``
|
||||
|
||||
will match 'trusted.' attributes in client calls and prefix them before
|
||||
passing them to the server.
|
||||
|
||||
``:prefix:server::user.virtiofs.:``
|
||||
|
||||
will strip 'user.virtiofs.' from all server replies.
|
||||
|
||||
``:prefix:all:trusted.:user.virtiofs.:``
|
||||
|
||||
combines the previous two cases into a single rule.
|
||||
|
||||
``:ok:client:user.::``
|
||||
|
||||
will allow get/set xattr for 'user.' xattr's and ignore
|
||||
following rules.
|
||||
|
||||
``:ok:server::security.:``
|
||||
|
||||
will pass 'security.' xattr's in listxattr from the server
|
||||
and ignore following rules.
|
||||
|
||||
``:ok:all:::``
|
||||
|
||||
will terminate the rule search passing any remaining attributes
|
||||
in both directions.
|
||||
|
||||
``:bad:server::security.:``
|
||||
|
||||
would hide 'security.' xattr's in listxattr from the server.
|
||||
|
||||
A simpler 'map' type provides a shorter syntax for the common case:
|
||||
|
||||
``:map:key:prepend:``
|
||||
|
||||
The 'map' type adds a number of separate rules to add **prepend** as a prefix
|
||||
to the matched **key** (or all attributes if **key** is empty).
|
||||
There may be at most one 'map' rule and it must be the last rule in the set.
|
||||
|
||||
Note: When the 'security.capability' xattr is remapped, the daemon has to do
|
||||
extra work to remove it during many operations, which the host kernel normally
|
||||
does itself.
|
||||
|
||||
Security considerations
|
||||
~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
Operating systems typically partition the xattr namespace using
|
||||
well defined name prefixes. Each partition may have different
|
||||
access controls applied. For example, on Linux there are multiple
|
||||
partitions
|
||||
|
||||
* ``system.*`` - access varies depending on attribute & filesystem
|
||||
* ``security.*`` - only processes with CAP_SYS_ADMIN
|
||||
* ``trusted.*`` - only processes with CAP_SYS_ADMIN
|
||||
* ``user.*`` - any process granted by file permissions / ownership
|
||||
|
||||
While other OS such as FreeBSD have different name prefixes
|
||||
and access control rules.
|
||||
|
||||
When remapping attributes on the host, it is important to
|
||||
ensure that the remapping does not allow a guest user to
|
||||
evade the guest access control rules.
|
||||
|
||||
Consider if ``trusted.*`` from the guest was remapped to
|
||||
``user.virtiofs.trusted*`` in the host. An unprivileged
|
||||
user in a Linux guest has the ability to write to xattrs
|
||||
under ``user.*``. Thus the user can evade the access
|
||||
control restriction on ``trusted.*`` by instead writing
|
||||
to ``user.virtiofs.trusted.*``.
|
||||
|
||||
As noted above, the partitions used and access controls
|
||||
applied, will vary across guest OS, so it is not wise to
|
||||
try to predict what the guest OS will use.
|
||||
|
||||
The simplest way to avoid an insecure configuration is
|
||||
to remap all xattrs at once, to a given fixed prefix.
|
||||
This is shown in example (1) below.
|
||||
|
||||
If selectively mapping only a subset of xattr prefixes,
|
||||
then rules must be added to explicitly block direct
|
||||
access to the target of the remapping. This is shown
|
||||
in example (2) below.
|
||||
|
||||
Mapping examples
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
1) Prefix all attributes with 'user.virtiofs.'
|
||||
|
||||
::
|
||||
|
||||
-o xattrmap=":prefix:all::user.virtiofs.::bad:all:::"
|
||||
|
||||
|
||||
This uses two rules, using : as the field separator;
|
||||
the first rule prefixes and strips 'user.virtiofs.',
|
||||
the second rule hides any non-prefixed attributes that
|
||||
the host set.
|
||||
|
||||
This is equivalent to the 'map' rule:
|
||||
|
||||
::
|
||||
|
||||
-o xattrmap=":map::user.virtiofs.:"
|
||||
|
||||
2) Prefix 'trusted.' attributes, allow others through
|
||||
|
||||
::
|
||||
|
||||
"/prefix/all/trusted./user.virtiofs./
|
||||
/bad/server//trusted./
|
||||
/bad/client/user.virtiofs.//
|
||||
/ok/all///"
|
||||
|
||||
|
||||
Here there are four rules, using / as the field
|
||||
separator, and also demonstrating that new lines can
|
||||
be included between rules.
|
||||
The first rule is the prefixing of 'trusted.' and
|
||||
stripping of 'user.virtiofs.'.
|
||||
The second rule hides unprefixed 'trusted.' attributes
|
||||
on the host.
|
||||
The third rule stops a guest from explicitly setting
|
||||
the 'user.virtiofs.' path directly to prevent access
|
||||
control bypass on the target of the earlier prefix
|
||||
remapping.
|
||||
Finally, the fourth rule lets all remaining attributes
|
||||
through.
|
||||
|
||||
This is equivalent to the 'map' rule:
|
||||
|
||||
::
|
||||
|
||||
-o xattrmap="/map/trusted./user.virtiofs./"
|
||||
|
||||
3) Hide 'security.' attributes, and allow everything else
|
||||
|
||||
::
|
||||
|
||||
"/bad/all/security./security./
|
||||
/ok/all///'
|
||||
|
||||
The first rule combines what could be separate client and server
|
||||
rules into a single 'all' rule, matching 'security.' in either
|
||||
client arguments or lists returned from the host. This stops
|
||||
the client seeing any 'security.' attributes on the server and
|
||||
stops it setting any.
|
||||
|
||||
SELinux support
|
||||
---------------
|
||||
One can enable support for SELinux by running virtiofsd with option
|
||||
"-o security_label". But this will try to save guest's security context
|
||||
in xattr security.selinux on host and it might fail if host's SELinux
|
||||
policy does not permit virtiofsd to do this operation.
|
||||
|
||||
Hence, it is preferred to remap guest's "security.selinux" xattr to say
|
||||
"trusted.virtiofs.security.selinux" on host.
|
||||
|
||||
"-o xattrmap=:map:security.selinux:trusted.virtiofs.:"
|
||||
|
||||
This will make sure that guest and host's SELinux xattrs on same file
|
||||
remain separate and not interfere with each other. And will allow both
|
||||
host and guest to implement their own separate SELinux policies.
|
||||
|
||||
Setting trusted xattr on host requires CAP_SYS_ADMIN. So one will need
|
||||
add this capability to daemon.
|
||||
|
||||
"-o modcaps=+sys_admin"
|
||||
|
||||
Giving CAP_SYS_ADMIN increases the risk on system. Now virtiofsd is more
|
||||
powerful and if gets compromised, it can do lot of damage to host system.
|
||||
So keep this trade-off in my mind while making a decision.
|
||||
|
||||
Examples
|
||||
--------
|
||||
|
||||
Export ``/var/lib/fs/vm001/`` on vhost-user UNIX domain socket
|
||||
``/var/run/vm001-vhost-fs.sock``:
|
||||
|
||||
.. parsed-literal::
|
||||
|
||||
host# virtiofsd --socket-path=/var/run/vm001-vhost-fs.sock -o source=/var/lib/fs/vm001
|
||||
host# |qemu_system| \\
|
||||
-chardev socket,id=char0,path=/var/run/vm001-vhost-fs.sock \\
|
||||
-device vhost-user-fs-pci,chardev=char0,tag=myfs \\
|
||||
-object memory-backend-memfd,id=mem,size=4G,share=on \\
|
||||
-numa node,memdev=mem \\
|
||||
...
|
||||
guest# mount -t virtiofs myfs /mnt
|
Loading…
Add table
Add a link
Reference in a new issue