
Merge tag 'pull-vfio-20250509' of https://github.com/legoater/qemu into staging

vfio queue:

* Preparatory changes for the introduction of CPR support
* Automatic enablement of OpRegion for IGD device passthrough
* Linux headers update
* Preparatory changes for the introduction of vfio-user

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEEoPZlSPBIlev+awtgUaNDx8/77KEFAmgd/0kACgkQUaNDx8/7
# 7KHRmRAArw1PXMCmoVBBeLcZ8BZPGjBZHtsvRzwS1yhVnNQadlpDlq4wd9HrfDFK
# BTr7//Ag2Q1dKgibesh0A8hSjorXHUGQCmdkcCuGGTFnEwC86q5jCH1lUxgI0cs5
# 3bVwc43zhXGoKqmo07g4f2UFbjDYHe89LgWz2c7TFFGz7Tda/LCOdhnmXlXcIwz+
# v1ocutXd7VbDWvUzN7uZbf0SIH3Zj3p96dwmpLDtdzdliDA0JidNvS27+Z5gtvWe
# O+1NW9MDzNfd6zLXCxL3GLeT61WZCe1dRCHEPX4cBo+DhnrifsC25DtJwYlDFvi2
# NMFfGzdKcEVSpeDp7WeM6MJgCZsGHC7ytmAKOKgN2M2kFSj3SI3sTFNlE1rzUhe6
# yjjCa59HzNLIi7L7xYCrVtCLGC/VXOp9kh67Sjs7FY7v778QUEdiudFBdBki7Bwh
# bpRhdFJgCLHuKc6XrM7hsMnsRyM28MywyfHDo3M/pRSFNKfeImW6zSMXnyncZztK
# W8e8OIz2DBMfH8pIu8hPw9Gsm5VAAs4aVmVFNa0CLl0oBko0Ew2YXcA5pTK5gGqv
# x24uc/BhbLcfFUtK0OnP4N/B4rcoADebPV2u4eWoUK3aF5u4+7BY235bFuoTj+sb
# 55DPDyWm5cmkX58Tdq46tD39dbD1hlUYkcydPbANH51wYx/lPpc=
# =OqYP
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 09 May 2025 09:12:41 EDT
# gpg:                using RSA key A0F66548F04895EBFE6B0B6051A343C7CFFBECA1
# gpg: Good signature from "Cédric Le Goater <clg@redhat.com>" [full]
# gpg:                 aka "Cédric Le Goater <clg@kaod.org>" [full]
# Primary key fingerprint: A0F6 6548 F048 95EB FE6B  0B60 51A3 43C7 CFFB ECA1

* tag 'pull-vfio-20250509' of https://github.com/legoater/qemu: (28 commits)
  vfio/container: pass listener_begin/commit callbacks
  vfio: add vfio-pci-base class
  vfio: add read/write to device IO ops vector
  vfio: add region info cache
  vfio: add device IO ops vector
  vfio: implement unmap all for DMA unmap callbacks
  vfio: add unmap_all flag to DMA unmap callback
  vfio: add vfio_pci_config_space_read/write()
  vfio: add strread/writeerror()
  vfio: consistently handle return value for helpers
  vfio: add vfio_device_get_irq_info() helper
  vfio: add vfio_attach_device_by_iommu_type()
  vfio: add vfio_device_unprepare()
  vfio: add vfio_device_prepare()
  linux-headers: Update to Linux v6.15-rc3
  linux-header: update-linux-header script changes
  vfio/igd: Remove generation limitation for IGD passthrough
  vfio/igd: Only emulate GGC register when x-igd-gms is set
  vfio/igd: Allow overriding GMS with 0xf0 to 0xfe on Gen9+
  vfio/igd: Enable OpRegion by default
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
commit 7be29f2f1a
Stefan Hajnoczi, 2025-05-09 12:04:34 -04:00
50 changed files with 1083 additions and 405 deletions
@@ -47,6 +47,7 @@ Intel document [1] shows how to dump VBIOS to file. For UEFI Option ROM, see
 QEMU also provides a "Legacy" mode that implicitly enables full functionality
 on IGD, it is automatically enabled when
+* IGD generation is 6 to 9 (Sandy Bridge to Comet Lake)
 * Machine type is i440fx
 * IGD is assigned to guest BDF 00:02.0
 * ROM BAR or romfile is present
@@ -101,7 +102,7 @@ digital formats work well.
 Options
 =======
-* x-igd-opregion=[on|*off*]
+* x-igd-opregion=[*on*|off]
   Copy host IGD OpRegion and expose it to guest with fw_cfg
 * x-igd-lpc=[on|*off*]
@@ -123,7 +124,7 @@ Examples
 * Adding IGD with OpRegion and LPC ID hack, but without VGA ranges
   (For UEFI guests)
-  -device vfio-pci,host=00:02.0,id=hostdev0,addr=2.0,x-igd-legacy-mode=off,x-igd-opregion=on,x-igd-lpc=on,romfile=efi_oprom.rom
+  -device vfio-pci,host=00:02.0,id=hostdev0,addr=2.0,x-igd-legacy-mode=off,x-igd-lpc=on,romfile=efi_oprom.rom
 
 Guest firmware
@@ -156,6 +157,12 @@ fw_cfg requirements on the VM firmware:
   it's expected that this fw_cfg file is only relevant to a single PCI
   class VGA device with Intel vendor ID, appearing at PCI bus address 00:02.0.
 
+  Starting from Meteor Lake, IGD devices access stolen memory via its MMIO
+  BAR2 (LMEMBAR) and removed the BDSM register in config space. There is
+  no need for guest firmware to allocate data stolen memory in guest address
+  space and write it to BDSM register. Value of this fw_cfg file is 0 in
+  such case.
+
 Upstream Seabios has OpRegion and BDSM (pre-Gen11 device only) support.
 However, the support is not accepted by upstream EDK2/OVMF. A recommended
 solution is to create a virtual OpRom with following DXE drivers:


@@ -74,10 +74,10 @@ static bool vfio_ap_register_irq_notifier(VFIOAPDevice *vapdev,
                                           unsigned int irq, Error **errp)
 {
     int fd;
-    size_t argsz;
+    int ret;
     IOHandler *fd_read;
     EventNotifier *notifier;
-    g_autofree struct vfio_irq_info *irq_info = NULL;
+    struct vfio_irq_info irq_info;
     VFIODevice *vdev = &vapdev->vdev;
 
     switch (irq) {
@@ -96,14 +96,15 @@ static bool vfio_ap_register_irq_notifier(VFIOAPDevice *vapdev,
         return false;
     }
 
-    argsz = sizeof(*irq_info);
-    irq_info = g_malloc0(argsz);
-    irq_info->index = irq;
-    irq_info->argsz = argsz;
+    ret = vfio_device_get_irq_info(vdev, irq, &irq_info);
 
-    if (ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO,
-              irq_info) < 0 || irq_info->count < 1) {
-        error_setg_errno(errp, errno, "vfio: Error getting irq info");
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "vfio: Error getting irq info");
+        return false;
+    }
+
+    if (irq_info.count < 1) {
+        error_setg(errp, "vfio: Error getting irq info, count=0");
         return false;
     }


@@ -376,8 +376,8 @@ static bool vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
                                            Error **errp)
 {
     VFIODevice *vdev = &vcdev->vdev;
-    g_autofree struct vfio_irq_info *irq_info = NULL;
-    size_t argsz;
+    struct vfio_irq_info irq_info;
+    int ret;
     int fd;
     EventNotifier *notifier;
     IOHandler *fd_read;
@@ -406,13 +406,15 @@ static bool vfio_ccw_register_irq_notifier(VFIOCCWDevice *vcdev,
         return false;
     }
 
-    argsz = sizeof(*irq_info);
-    irq_info = g_malloc0(argsz);
-    irq_info->index = irq;
-    irq_info->argsz = argsz;
-    if (ioctl(vdev->fd, VFIO_DEVICE_GET_IRQ_INFO,
-              irq_info) < 0 || irq_info->count < 1) {
-        error_setg_errno(errp, errno, "vfio: Error getting irq info");
+    ret = vfio_device_get_irq_info(vdev, irq, &irq_info);
+
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "vfio: Error getting irq info");
+        return false;
+    }
+
+    if (irq_info.count < 1) {
+        error_setg(errp, "vfio: Error getting irq info, count=0");
         return false;
     }
@@ -502,7 +504,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
     vcdev->io_region_offset = info->offset;
     vcdev->io_region = g_malloc0(info->size);
-    g_free(info);
 
     /* check for the optional async command region */
     ret = vfio_device_get_region_info_type(vdev, VFIO_REGION_TYPE_CCW,
@@ -515,7 +516,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
         }
         vcdev->async_cmd_region_offset = info->offset;
         vcdev->async_cmd_region = g_malloc0(info->size);
-        g_free(info);
     }
 
     ret = vfio_device_get_region_info_type(vdev, VFIO_REGION_TYPE_CCW,
@@ -528,7 +528,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
         }
         vcdev->schib_region_offset = info->offset;
         vcdev->schib_region = g_malloc(info->size);
-        g_free(info);
     }
 
     ret = vfio_device_get_region_info_type(vdev, VFIO_REGION_TYPE_CCW,
@@ -542,7 +541,6 @@ static bool vfio_ccw_get_region(VFIOCCWDevice *vcdev, Error **errp)
         }
         vcdev->crw_region_offset = info->offset;
         vcdev->crw_region = g_malloc(info->size);
-        g_free(info);
     }
 
     return true;
@@ -552,7 +550,6 @@ out_err:
     g_free(vcdev->schib_region);
     g_free(vcdev->async_cmd_region);
     g_free(vcdev->io_region);
-    g_free(info);
 
     return false;
 }


@@ -85,12 +85,12 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer,
 
 int vfio_container_dma_unmap(VFIOContainerBase *bcontainer,
                              hwaddr iova, ram_addr_t size,
-                             IOMMUTLBEntry *iotlb)
+                             IOMMUTLBEntry *iotlb, bool unmap_all)
 {
     VFIOIOMMUClass *vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
 
     g_assert(vioc->dma_unmap);
-    return vioc->dma_unmap(bcontainer, iova, size, iotlb);
+    return vioc->dma_unmap(bcontainer, iova, size, iotlb, unmap_all);
 }
 
 bool vfio_container_add_section_window(VFIOContainerBase *bcontainer,
@@ -198,11 +198,7 @@ static int vfio_device_dma_logging_report(VFIODevice *vbasedev, hwaddr iova,
     feature->flags = VFIO_DEVICE_FEATURE_GET |
                      VFIO_DEVICE_FEATURE_DMA_LOGGING_REPORT;
 
-    if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
-        return -errno;
-    }
-
-    return 0;
+    return vbasedev->io_ops->device_feature(vbasedev, feature);
 }
 
 static int vfio_container_iommu_query_dirty_bitmap(const VFIOContainerBase *bcontainer,


@@ -119,12 +119,9 @@ unmap_exit:
     return ret;
 }
 
-/*
- * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86
- */
-static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
-                                 hwaddr iova, ram_addr_t size,
-                                 IOMMUTLBEntry *iotlb)
+static int vfio_legacy_dma_unmap_one(const VFIOContainerBase *bcontainer,
+                                     hwaddr iova, ram_addr_t size,
+                                     IOMMUTLBEntry *iotlb)
 {
     const VFIOContainer *container = container_of(bcontainer, VFIOContainer,
                                                   bcontainer);
@@ -181,6 +178,34 @@ static int vfio_legacy_dma_unmap_one(const VFIOContainerBase *bcontainer,
     return 0;
 }
 
+/*
+ * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86
+ */
+static int vfio_legacy_dma_unmap(const VFIOContainerBase *bcontainer,
+                                 hwaddr iova, ram_addr_t size,
+                                 IOMMUTLBEntry *iotlb, bool unmap_all)
+{
+    int ret;
+
+    if (unmap_all) {
+        /* The unmap ioctl doesn't accept a full 64-bit span. */
+        Int128 llsize = int128_rshift(int128_2_64(), 1);
+
+        ret = vfio_legacy_dma_unmap_one(bcontainer, 0, int128_get64(llsize),
+                                        iotlb);
+
+        if (ret == 0) {
+            ret = vfio_legacy_dma_unmap_one(bcontainer, int128_get64(llsize),
+                                            int128_get64(llsize), iotlb);
+        }
+
+    } else {
+        ret = vfio_legacy_dma_unmap_one(bcontainer, iova, size, iotlb);
+    }
+
+    return ret;
+}
+
 static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
                                ram_addr_t size, void *vaddr, bool readonly)
 {
@@ -205,7 +230,7 @@ static int vfio_legacy_dma_map(const VFIOContainerBase *bcontainer, hwaddr iova,
      */
     if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
         (errno == EBUSY &&
-         vfio_legacy_dma_unmap(bcontainer, iova, size, NULL) == 0 &&
+         vfio_legacy_dma_unmap(bcontainer, iova, size, NULL, false) == 0 &&
          ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
         return 0;
     }
@@ -511,16 +536,10 @@ static bool vfio_legacy_setup(VFIOContainerBase *bcontainer, Error **errp)
     return true;
 }
 
-static bool vfio_container_connect(VFIOGroup *group, AddressSpace *as,
-                                   Error **errp)
+static bool vfio_container_attach_discard_disable(VFIOContainer *container,
+                                                  VFIOGroup *group, Error **errp)
 {
-    VFIOContainer *container;
-    VFIOContainerBase *bcontainer;
-    int ret, fd;
-    VFIOAddressSpace *space;
-    VFIOIOMMUClass *vioc;
-
-    space = vfio_address_space_get(as);
+    int ret;
 
     /*
      * VFIO is currently incompatible with discarding of RAM insofar as the
@@ -553,97 +572,118 @@ static bool vfio_container_connect(VFIOGroup *group, AddressSpace *as,
      * details once we know which type of IOMMU we are using.
      */
 
+    ret = vfio_ram_block_discard_disable(container, true);
+    if (ret) {
+        error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
+        if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER, &container->fd)) {
+            error_report("vfio: error disconnecting group %d from"
+                         " container", group->groupid);
+        }
+    }
+    return !ret;
+}
+
+static bool vfio_container_group_add(VFIOContainer *container, VFIOGroup *group,
+                                     Error **errp)
+{
+    if (!vfio_container_attach_discard_disable(container, group, errp)) {
+        return false;
+    }
+    group->container = container;
+    QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+    vfio_group_add_kvm_device(group);
+    return true;
+}
+
+static void vfio_container_group_del(VFIOContainer *container, VFIOGroup *group)
+{
+    QLIST_REMOVE(group, container_next);
+    group->container = NULL;
+    vfio_group_del_kvm_device(group);
+    vfio_ram_block_discard_disable(container, false);
+}
+
+static bool vfio_container_connect(VFIOGroup *group, AddressSpace *as,
+                                   Error **errp)
+{
+    VFIOContainer *container;
+    VFIOContainerBase *bcontainer;
+    int ret, fd = -1;
+    VFIOAddressSpace *space;
+    VFIOIOMMUClass *vioc = NULL;
+    bool new_container = false;
+    bool group_was_added = false;
+
+    space = vfio_address_space_get(as);
+
     QLIST_FOREACH(bcontainer, &space->containers, next) {
         container = container_of(bcontainer, VFIOContainer, bcontainer);
         if (!ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &container->fd)) {
-            ret = vfio_ram_block_discard_disable(container, true);
-            if (ret) {
-                error_setg_errno(errp, -ret,
-                                 "Cannot set discarding of RAM broken");
-                if (ioctl(group->fd, VFIO_GROUP_UNSET_CONTAINER,
-                          &container->fd)) {
-                    error_report("vfio: error disconnecting group %d from"
-                                 " container", group->groupid);
-                }
-                return false;
-            }
-            group->container = container;
-            QLIST_INSERT_HEAD(&container->group_list, group, container_next);
-            vfio_group_add_kvm_device(group);
-            return true;
+            return vfio_container_group_add(container, group, errp);
         }
     }
 
     fd = qemu_open("/dev/vfio/vfio", O_RDWR, errp);
     if (fd < 0) {
-        goto put_space_exit;
+        goto fail;
     }
 
     ret = ioctl(fd, VFIO_GET_API_VERSION);
     if (ret != VFIO_API_VERSION) {
         error_setg(errp, "supported vfio version: %d, "
                    "reported version: %d", VFIO_API_VERSION, ret);
-        goto close_fd_exit;
+        goto fail;
     }
 
     container = vfio_create_container(fd, group, errp);
     if (!container) {
-        goto close_fd_exit;
+        goto fail;
     }
+    new_container = true;
     bcontainer = &container->bcontainer;
 
     if (!vfio_cpr_register_container(bcontainer, errp)) {
-        goto free_container_exit;
-    }
-
-    ret = vfio_ram_block_discard_disable(container, true);
-    if (ret) {
-        error_setg_errno(errp, -ret, "Cannot set discarding of RAM broken");
-        goto unregister_container_exit;
+        goto fail;
     }
 
     vioc = VFIO_IOMMU_GET_CLASS(bcontainer);
     assert(vioc->setup);
 
     if (!vioc->setup(bcontainer, errp)) {
-        goto enable_discards_exit;
+        goto fail;
     }
 
-    vfio_group_add_kvm_device(group);
-
     vfio_address_space_insert(space, bcontainer);
 
-    group->container = container;
-    QLIST_INSERT_HEAD(&container->group_list, group, container_next);
+    if (!vfio_container_group_add(container, group, errp)) {
+        goto fail;
+    }
+    group_was_added = true;
 
     if (!vfio_listener_register(bcontainer, errp)) {
-        goto listener_release_exit;
+        goto fail;
     }
 
     bcontainer->initialized = true;
 
     return true;
-listener_release_exit:
-    QLIST_REMOVE(group, container_next);
-    vfio_group_del_kvm_device(group);
+
+fail:
     vfio_listener_unregister(bcontainer);
-    if (vioc->release) {
+
+    if (group_was_added) {
+        vfio_container_group_del(container, group);
+    }
+    if (vioc && vioc->release) {
         vioc->release(bcontainer);
     }
-
-enable_discards_exit:
-    vfio_ram_block_discard_disable(container, false);
-
-unregister_container_exit:
-    vfio_cpr_unregister_container(bcontainer);
-
-free_container_exit:
-    object_unref(container);
-
-close_fd_exit:
-    close(fd);
-
-put_space_exit:
+    if (new_container) {
+        vfio_cpr_unregister_container(bcontainer);
+        object_unref(container);
+    }
+    if (fd >= 0) {
+        close(fd);
+    }
     vfio_address_space_put(space);
 
     return false;
@@ -811,18 +851,14 @@ static bool vfio_device_get(VFIOGroup *group, const char *name,
         }
     }
 
+    vfio_device_prepare(vbasedev, &group->container->bcontainer, info);
+
     vbasedev->fd = fd;
     vbasedev->group = group;
     QLIST_INSERT_HEAD(&group->device_list, vbasedev, next);
 
-    vbasedev->num_irqs = info->num_irqs;
-    vbasedev->num_regions = info->num_regions;
-    vbasedev->flags = info->flags;
-
     trace_vfio_device_get(name, info->flags, info->num_regions, info->num_irqs);
 
-    vbasedev->reset_works = !!(info->flags & VFIO_DEVICE_FLAGS_RESET);
-
     return true;
 }
@@ -875,7 +911,6 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev,
     int groupid = vfio_device_get_groupid(vbasedev, errp);
     VFIODevice *vbasedev_iter;
     VFIOGroup *group;
-    VFIOContainerBase *bcontainer;
 
     if (groupid < 0) {
         return false;
@@ -904,11 +939,6 @@ static bool vfio_legacy_attach_device(const char *name, VFIODevice *vbasedev,
         goto device_put_exit;
     }
 
-    bcontainer = &group->container->bcontainer;
-    vbasedev->bcontainer = bcontainer;
-    QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next);
-    QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next);
-
     return true;
 
 device_put_exit:
@@ -922,10 +952,10 @@ static void vfio_legacy_detach_device(VFIODevice *vbasedev)
 {
     VFIOGroup *group = vbasedev->group;
 
-    QLIST_REMOVE(vbasedev, global_next);
-    QLIST_REMOVE(vbasedev, container_next);
-    vbasedev->bcontainer = NULL;
     trace_vfio_device_detach(vbasedev->name, group->groupid);
+
+    vfio_device_unprepare(vbasedev);
+
     object_unref(vbasedev->hiod);
     vfio_device_put(vbasedev);
     vfio_group_put(group);


@@ -82,7 +82,7 @@ void vfio_device_irq_disable(VFIODevice *vbasedev, int index)
         .count = 0,
     };
 
-    ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+    vbasedev->io_ops->set_irqs(vbasedev, &irq_set);
 }
 
 void vfio_device_irq_unmask(VFIODevice *vbasedev, int index)
@@ -95,7 +95,7 @@ void vfio_device_irq_unmask(VFIODevice *vbasedev, int index)
         .count = 1,
     };
 
-    ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+    vbasedev->io_ops->set_irqs(vbasedev, &irq_set);
 }
 
 void vfio_device_irq_mask(VFIODevice *vbasedev, int index)
@@ -108,7 +108,7 @@ void vfio_device_irq_mask(VFIODevice *vbasedev, int index)
         .count = 1,
     };
 
-    ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, &irq_set);
+    vbasedev->io_ops->set_irqs(vbasedev, &irq_set);
 }
 
 static inline const char *action_to_str(int action)
@@ -167,7 +167,7 @@ bool vfio_device_irq_set_signaling(VFIODevice *vbasedev, int index, int subindex
     pfd = (int32_t *)&irq_set->data;
     *pfd = fd;
 
-    if (!ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irq_set)) {
+    if (!vbasedev->io_ops->set_irqs(vbasedev, irq_set)) {
         return true;
     }
@@ -185,10 +185,28 @@ bool vfio_device_irq_set_signaling(VFIODevice *vbasedev, int index, int subindex
     return false;
 }
 
+int vfio_device_get_irq_info(VFIODevice *vbasedev, int index,
+                             struct vfio_irq_info *info)
+{
+    memset(info, 0, sizeof(*info));
+    info->argsz = sizeof(*info);
+    info->index = index;
+
+    return vbasedev->io_ops->get_irq_info(vbasedev, info);
+}
+
 int vfio_device_get_region_info(VFIODevice *vbasedev, int index,
                                 struct vfio_region_info **info)
 {
     size_t argsz = sizeof(struct vfio_region_info);
+    int ret;
+
+    /* check cache */
+    if (vbasedev->reginfo[index] != NULL) {
+        *info = vbasedev->reginfo[index];
+        return 0;
+    }
 
     *info = g_malloc0(argsz);
@@ -196,10 +214,11 @@ int vfio_device_get_region_info(VFIODevice *vbasedev, int index,
 retry:
     (*info)->argsz = argsz;
 
-    if (ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, *info)) {
+    ret = vbasedev->io_ops->get_region_info(vbasedev, *info);
+    if (ret != 0) {
         g_free(*info);
         *info = NULL;
-        return -errno;
+        return ret;
     }
 
     if ((*info)->argsz > argsz) {
@@ -209,6 +228,9 @@ retry:
         goto retry;
     }
 
+    /* fill cache */
+    vbasedev->reginfo[index] = *info;
+
     return 0;
 }
@@ -227,7 +249,6 @@ int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type,
         hdr = vfio_get_region_info_cap(*info, VFIO_REGION_INFO_CAP_TYPE);
         if (!hdr) {
-            g_free(*info);
             continue;
         }
@@ -239,8 +260,6 @@ int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type,
         if (cap_type->type == type && cap_type->subtype == subtype) {
             return 0;
         }
-
-        g_free(*info);
     }
 
     *info = NULL;
@@ -249,7 +268,7 @@ int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type,
 
 bool vfio_device_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type)
 {
-    g_autofree struct vfio_region_info *info = NULL;
+    struct vfio_region_info *info = NULL;
     bool ret = false;
 
     if (!vfio_device_get_region_info(vbasedev, region, &info)) {
@@ -305,11 +324,14 @@ void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **errp)
     vbasedev->fd = fd;
 }
 
+static VFIODeviceIOOps vfio_device_io_ops_ioctl;
+
 void vfio_device_init(VFIODevice *vbasedev, int type, VFIODeviceOps *ops,
                       DeviceState *dev, bool ram_discard)
 {
     vbasedev->type = type;
     vbasedev->ops = ops;
+    vbasedev->io_ops = &vfio_device_io_ops_ioctl;
     vbasedev->dev = dev;
     vbasedev->fd = -1;
@@ -370,27 +392,35 @@ bool vfio_device_hiod_create_and_realize(VFIODevice *vbasedev,
 VFIODevice *vfio_get_vfio_device(Object *obj)
 {
     if (object_dynamic_cast(obj, TYPE_VFIO_PCI)) {
-        return &VFIO_PCI(obj)->vbasedev;
+        return &VFIO_PCI_BASE(obj)->vbasedev;
     } else {
         return NULL;
     }
 }
 
-bool vfio_device_attach(char *name, VFIODevice *vbasedev,
-                        AddressSpace *as, Error **errp)
+bool vfio_device_attach_by_iommu_type(const char *iommu_type, char *name,
+                                      VFIODevice *vbasedev, AddressSpace *as,
+                                      Error **errp)
 {
     const VFIOIOMMUClass *ops =
-        VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
-
-    if (vbasedev->iommufd) {
-        ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
-    }
+        VFIO_IOMMU_CLASS(object_class_by_name(iommu_type));
 
     assert(ops);
 
     return ops->attach_device(name, vbasedev, as, errp);
 }
 
+bool vfio_device_attach(char *name, VFIODevice *vbasedev,
+                        AddressSpace *as, Error **errp)
+{
+    const char *iommu_type = vbasedev->iommufd ?
+                             TYPE_VFIO_IOMMU_IOMMUFD :
+                             TYPE_VFIO_IOMMU_LEGACY;
+
+    return vfio_device_attach_by_iommu_type(iommu_type, name, vbasedev,
+                                            as, errp);
+}
+
 void vfio_device_detach(VFIODevice *vbasedev)
 {
     if (!vbasedev->bcontainer) {
@@ -398,3 +428,120 @@ void vfio_device_detach(VFIODevice *vbasedev)
     }
     VFIO_IOMMU_GET_CLASS(vbasedev->bcontainer)->detach_device(vbasedev);
 }
+
+void vfio_device_prepare(VFIODevice *vbasedev, VFIOContainerBase *bcontainer,
+                         struct vfio_device_info *info)
+{
+    vbasedev->num_irqs = info->num_irqs;
+    vbasedev->num_regions = info->num_regions;
+    vbasedev->flags = info->flags;
+    vbasedev->reset_works = !!(info->flags & VFIO_DEVICE_FLAGS_RESET);
+
+    vbasedev->bcontainer = bcontainer;
+    QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next);
+
+    QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next);
+
+    vbasedev->reginfo = g_new0(struct vfio_region_info *,
+                               vbasedev->num_regions);
+}
+
+void vfio_device_unprepare(VFIODevice *vbasedev)
+{
+    int i;
+
+    for (i = 0; i < vbasedev->num_regions; i++) {
+        g_free(vbasedev->reginfo[i]);
+    }
+    g_free(vbasedev->reginfo);
+    vbasedev->reginfo = NULL;
+
+    QLIST_REMOVE(vbasedev, container_next);
+    QLIST_REMOVE(vbasedev, global_next);
+    vbasedev->bcontainer = NULL;
+}
+
+/*
+ * Traditional ioctl() based io
+ */
+static int vfio_device_io_device_feature(VFIODevice *vbasedev,
+                                         struct vfio_device_feature *feature)
+{
+    int ret;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
+
+    return ret < 0 ? -errno : ret;
+}
+
+static int vfio_device_io_get_region_info(VFIODevice *vbasedev,
+                                          struct vfio_region_info *info)
+{
+    int ret;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_REGION_INFO, info);
+
+    return ret < 0 ? -errno : ret;
+}
+
+static int vfio_device_io_get_irq_info(VFIODevice *vbasedev,
+                                       struct vfio_irq_info *info)
+{
+    int ret;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, info);
+
+    return ret < 0 ? -errno : ret;
+}
+
+static int vfio_device_io_set_irqs(VFIODevice *vbasedev,
+                                   struct vfio_irq_set *irqs)
+{
+    int ret;
+
+    ret = ioctl(vbasedev->fd, VFIO_DEVICE_SET_IRQS, irqs);
+
+    return ret < 0 ? -errno : ret;
+}
+
+static int vfio_device_io_region_read(VFIODevice *vbasedev, uint8_t index,
+                                      off_t off, uint32_t size, void *data)
+{
+    struct vfio_region_info *info;
+    int ret;
+
+    ret = vfio_device_get_region_info(vbasedev, index, &info);
+    if (ret != 0) {
+        return ret;
+    }
+
+    ret = pread(vbasedev->fd, data, size, info->offset + off);
+
+    return ret < 0 ? -errno : ret;
+}
+
+static int vfio_device_io_region_write(VFIODevice *vbasedev, uint8_t index,
+                                       off_t off, uint32_t size, void *data)
+{
+    struct vfio_region_info *info;
+    int ret;
+
+    ret = vfio_device_get_region_info(vbasedev, index, &info);
+    if (ret != 0) {
+        return ret;
+    }
+
+    ret = pwrite(vbasedev->fd, data, size, info->offset + off);
+
+    return ret < 0 ? -errno : ret;
+}
+
+static VFIODeviceIOOps vfio_device_io_ops_ioctl = {
+    .device_feature = vfio_device_io_device_feature,
+    .get_region_info = vfio_device_io_get_region_info,
+    .get_irq_info = vfio_device_io_get_irq_info,
+    .set_irqs = vfio_device_io_set_irqs,
+    .region_read = vfio_device_io_region_read,
+    .region_write = vfio_device_io_region_write,
+};


@@ -103,6 +103,7 @@ static int igd_gen(VFIOPCIDevice *vdev)
     /*
      * Unfortunately, Intel changes it's specification quite often. This makes
      * it impossible to use a suitable default value for unknown devices.
+     * Return -1 for not applying any generation-specific quirks.
      */
     return -1;
 }
@@ -182,16 +183,13 @@ static bool vfio_pci_igd_opregion_init(VFIOPCIDevice *vdev,
 
     trace_vfio_pci_igd_opregion_enabled(vdev->vbasedev.name);
 
-    pci_set_long(vdev->pdev.config + IGD_ASLS, 0);
-    pci_set_long(vdev->pdev.wmask + IGD_ASLS, ~0);
-    pci_set_long(vdev->emulated_config_bits + IGD_ASLS, ~0);
-
     return true;
 }
 
-static bool vfio_pci_igd_setup_opregion(VFIOPCIDevice *vdev, Error **errp)
+static bool vfio_pci_igd_opregion_detect(VFIOPCIDevice *vdev,
+                                         struct vfio_region_info **opregion,
+                                         Error **errp)
 {
-    g_autofree struct vfio_region_info *opregion = NULL;
     int ret;
 
     /* Hotplugging is not supported for opregion access */
@@ -202,17 +200,13 @@ static bool vfio_pci_igd_opregion_detect(VFIOPCIDevice *vdev,
     ret = vfio_device_get_region_info_type(&vdev->vbasedev,
                     VFIO_REGION_TYPE_PCI_VENDOR_TYPE | PCI_VENDOR_ID_INTEL,
-                    VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION, &opregion);
+                    VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION, opregion);
     if (ret) {
         error_setg_errno(errp, -ret,
                          "Device does not supports IGD OpRegion feature");
         return false;
     }
 
-    if (!vfio_pci_igd_opregion_init(vdev, opregion, errp)) {
-        return false;
-    }
-
     return true;
 }
@@ -355,8 +349,8 @@ static int vfio_pci_igd_lpc_init(VFIOPCIDevice *vdev,

 static bool vfio_pci_igd_setup_lpc_bridge(VFIOPCIDevice *vdev, Error **errp)
 {
-    g_autofree struct vfio_region_info *host = NULL;
-    g_autofree struct vfio_region_info *lpc = NULL;
+    struct vfio_region_info *host = NULL;
+    struct vfio_region_info *lpc = NULL;
     PCIDevice *lpc_bridge;
     int ret;
@@ -419,6 +413,44 @@ static bool vfio_pci_igd_setup_lpc_bridge(VFIOPCIDevice *vdev, Error **errp)
     return true;
 }

+static bool vfio_pci_igd_override_gms(int gen, uint32_t gms, uint32_t *gmch)
+{
+    bool ret = false;
+
+    if (gen == -1) {
+        error_report("x-igd-gms is not supported on this device");
+    } else if (gen < 8) {
+        if (gms <= 0x10) {
+            *gmch &= ~(IGD_GMCH_GEN6_GMS_MASK << IGD_GMCH_GEN6_GMS_SHIFT);
+            *gmch |= gms << IGD_GMCH_GEN6_GMS_SHIFT;
+            ret = true;
+        } else {
+            error_report(QERR_INVALID_PARAMETER_VALUE, "x-igd-gms", "0~0x10");
+        }
+    } else if (gen == 8) {
+        if (gms <= 0x40) {
+            *gmch &= ~(IGD_GMCH_GEN8_GMS_MASK << IGD_GMCH_GEN8_GMS_SHIFT);
+            *gmch |= gms << IGD_GMCH_GEN8_GMS_SHIFT;
+            ret = true;
+        } else {
+            error_report(QERR_INVALID_PARAMETER_VALUE, "x-igd-gms", "0~0x40");
+        }
+    } else {
+        /* 0x0 to 0x40: 32MB increments starting at 0MB */
+        /* 0xf0 to 0xfe: 4MB increments starting at 4MB */
+        if ((gms <= 0x40) || (gms >= 0xf0 && gms <= 0xfe)) {
+            *gmch &= ~(IGD_GMCH_GEN8_GMS_MASK << IGD_GMCH_GEN8_GMS_SHIFT);
+            *gmch |= gms << IGD_GMCH_GEN8_GMS_SHIFT;
+            ret = true;
+        } else {
+            error_report(QERR_INVALID_PARAMETER_VALUE,
+                         "x-igd-gms", "0~0x40 or 0xf0~0xfe");
+        }
+    }
+
+    return ret;
+}
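For reference, the Gen9+ GMS encoding that the last branch above validates (0x0-0x40 selecting 32MB increments, 0xf0-0xfe selecting 4MB increments starting at 4MB) can be sketched as a stand-alone helper. `gen9_stolen_size_mb()` is illustrative only, not part of the patch or the QEMU API:

```c
#include <stdint.h>

/*
 * Illustrative decoder for the Gen9+ GMS field:
 *   0x00-0x40: stolen memory size in 32MB increments starting at 0MB
 *   0xf0-0xfe: stolen memory size in 4MB increments starting at 4MB
 */
static uint64_t gen9_stolen_size_mb(uint32_t gms)
{
    if (gms <= 0x40) {
        return (uint64_t)gms * 32;              /* 32MB steps */
    }
    if (gms >= 0xf0 && gms <= 0xfe) {
        return (uint64_t)(gms - 0xf0 + 1) * 4;  /* 4MB steps from 4MB */
    }
    return 0;                                   /* invalid encoding */
}
```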
 #define IGD_GGC_MMIO_OFFSET     0x108040
 #define IGD_BDSM_MMIO_OFFSET    0x1080C0
@@ -428,41 +460,35 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
     VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
     int gen;

-    /*
-     * This must be an Intel VGA device at address 00:02.0 for us to even
-     * consider enabling legacy mode. Some driver have dependencies on the PCI
-     * bus address.
-     */
     if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
         !vfio_is_vga(vdev) || nr != 0) {
         return;
     }

-    /*
-     * Only on IGD devices of gen 11 and above, the BDSM register is mirrored
-     * into MMIO space and read from MMIO space by the Windows driver.
-     */
+    /* Only IGD Gen6-12 devices need quirks in BAR 0 */
     gen = igd_gen(vdev);
     if (gen < 6) {
         return;
     }

-    ggc_quirk = vfio_quirk_alloc(1);
-    ggc_mirror = ggc_quirk->data = g_malloc0(sizeof(*ggc_mirror));
-    ggc_mirror->mem = ggc_quirk->mem;
-    ggc_mirror->vdev = vdev;
-    ggc_mirror->bar = nr;
-    ggc_mirror->offset = IGD_GGC_MMIO_OFFSET;
-    ggc_mirror->config_offset = IGD_GMCH;
-
-    memory_region_init_io(ggc_mirror->mem, OBJECT(vdev),
-                          &vfio_generic_mirror_quirk, ggc_mirror,
-                          "vfio-igd-ggc-quirk", 2);
-    memory_region_add_subregion_overlap(vdev->bars[nr].region.mem,
-                                        ggc_mirror->offset, ggc_mirror->mem,
-                                        1);
-    QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, ggc_quirk, next);
+    if (vdev->igd_gms) {
+        ggc_quirk = vfio_quirk_alloc(1);
+        ggc_mirror = ggc_quirk->data = g_malloc0(sizeof(*ggc_mirror));
+        ggc_mirror->mem = ggc_quirk->mem;
+        ggc_mirror->vdev = vdev;
+        ggc_mirror->bar = nr;
+        ggc_mirror->offset = IGD_GGC_MMIO_OFFSET;
+        ggc_mirror->config_offset = IGD_GMCH;
+
+        memory_region_init_io(ggc_mirror->mem, OBJECT(vdev),
+                              &vfio_generic_mirror_quirk, ggc_mirror,
+                              "vfio-igd-ggc-quirk", 2);
+        memory_region_add_subregion_overlap(vdev->bars[nr].region.mem,
+                                            ggc_mirror->offset, ggc_mirror->mem,
+                                            1);
+        QLIST_INSERT_HEAD(&vdev->bars[nr].quirks, ggc_quirk, next);
+    }

     bdsm_quirk = vfio_quirk_alloc(1);
     bdsm_mirror = bdsm_quirk->data = g_malloc0(sizeof(*bdsm_mirror));
@@ -484,44 +510,37 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)

 static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
 {
+    struct vfio_region_info *opregion = NULL;
     int ret, gen;
-    uint64_t gms_size;
+    uint64_t gms_size = 0;
     uint64_t *bdsm_size;
     uint32_t gmch;
     bool legacy_mode_enabled = false;
     Error *err = NULL;

-    /*
-     * This must be an Intel VGA device at address 00:02.0 for us to even
-     * consider enabling legacy mode. The vBIOS has dependencies on the
-     * PCI bus address.
-     */
     if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
         !vfio_is_vga(vdev)) {
         return true;
     }

-    /*
-     * IGD is not a standard, they like to change their specs often.  We
-     * only attempt to support back to SandBridge and we hope that newer
-     * devices maintain compatibility with generation 8.
-     */
-    gen = igd_gen(vdev);
-    if (gen == -1) {
-        error_report("IGD device %s is unsupported in legacy mode, "
-                     "try SandyBridge or newer", vdev->vbasedev.name);
+    /* An IGD device always comes with an OpRegion */
+    if (!vfio_pci_igd_opregion_detect(vdev, &opregion, errp)) {
         return true;
     }
+    info_report("OpRegion detected on Intel display %x.", vdev->device_id);

+    gen = igd_gen(vdev);
     gmch = vfio_pci_read_config(&vdev->pdev, IGD_GMCH, 4);

     /*
      * For backward compatibility, enable legacy mode when
+     * - Device generation is 6 to 9 (inclusive)
      * - Machine type is i440fx (pc_piix)
      * - IGD device is at guest BDF 00:02.0
      * - Not manually disabled by x-igd-legacy-mode=off
      */
     if ((vdev->igd_legacy_mode != ON_OFF_AUTO_OFF) &&
+        (gen >= 6 && gen <= 9) &&
         !strcmp(MACHINE_GET_CLASS(qdev_get_machine())->family, "pc_piix") &&
         (&vdev->pdev == pci_find_device(pci_device_root_bus(&vdev->pdev),
                                         0, PCI_DEVFN(0x2, 0)))) {
@@ -532,7 +551,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
          * - OpRegion
          * - Same LPC bridge and Host bridge VID/DID/SVID/SSID as host
          */
-        g_autofree struct vfio_region_info *rom = NULL;
+        struct vfio_region_info *rom = NULL;

         legacy_mode_enabled = true;
         info_report("IGD legacy mode enabled, "
@@ -566,13 +585,15 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
         vdev->features |= VFIO_FEATURE_ENABLE_IGD_LPC;
     } else if (vdev->igd_legacy_mode == ON_OFF_AUTO_ON) {
         error_setg(&err,
-                   "Machine is not i440fx or assigned BDF is not 00:02.0");
+                   "Machine is not i440fx, assigned BDF is not 00:02.0, "
+                   "or device %04x (gen %d) doesn't support legacy mode",
+                   vdev->device_id, gen);
         goto error;
     }

     /* Setup OpRegion access */
     if ((vdev->features & VFIO_FEATURE_ENABLE_IGD_OPREGION) &&
-        !vfio_pci_igd_setup_opregion(vdev, errp)) {
+        !vfio_pci_igd_opregion_init(vdev, opregion, errp)) {
         goto error;
     }
@@ -580,7 +601,15 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
     if ((vdev->features & VFIO_FEATURE_ENABLE_IGD_LPC) &&
         !vfio_pci_igd_setup_lpc_bridge(vdev, errp)) {
         goto error;
     }

+    /*
+     * ASLS (the OpRegion address) is read-only, emulated.
+     * It contains an HPA; guest firmware needs to reprogram it with a GPA.
+     */
+    pci_set_long(vdev->pdev.config + IGD_ASLS, 0);
+    pci_set_long(vdev->pdev.wmask + IGD_ASLS, ~0);
+    pci_set_long(vdev->emulated_config_bits + IGD_ASLS, ~0);
+
     /*
* Allow user to override dsm size using x-igd-gms option, in multiples of * Allow user to override dsm size using x-igd-gms option, in multiples of
@@ -588,56 +617,44 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
      * set from DVMT Pre-Allocated option in host BIOS.
      */
     if (vdev->igd_gms) {
-        if (gen < 8) {
-            if (vdev->igd_gms <= 0x10) {
-                gmch &= ~(IGD_GMCH_GEN6_GMS_MASK << IGD_GMCH_GEN6_GMS_SHIFT);
-                gmch |= vdev->igd_gms << IGD_GMCH_GEN6_GMS_SHIFT;
-            } else {
-                error_report(QERR_INVALID_PARAMETER_VALUE,
-                             "x-igd-gms", "0~0x10");
-            }
-        } else {
-            if (vdev->igd_gms <= 0x40) {
-                gmch &= ~(IGD_GMCH_GEN8_GMS_MASK << IGD_GMCH_GEN8_GMS_SHIFT);
-                gmch |= vdev->igd_gms << IGD_GMCH_GEN8_GMS_SHIFT;
-            } else {
-                error_report(QERR_INVALID_PARAMETER_VALUE,
-                             "x-igd-gms", "0~0x40");
-            }
-        }
+        if (!vfio_pci_igd_override_gms(gen, vdev->igd_gms, &gmch)) {
+            return false;
+        }
+
+        /* GMCH is read-only, emulated */
+        pci_set_long(vdev->pdev.config + IGD_GMCH, gmch);
+        pci_set_long(vdev->pdev.wmask + IGD_GMCH, 0);
+        pci_set_long(vdev->emulated_config_bits + IGD_GMCH, ~0);
     }

-    gms_size = igd_stolen_memory_size(gen, gmch);
+    if (gen > 0) {
+        gms_size = igd_stolen_memory_size(gen, gmch);
+
+        /* BDSM is read-write, emulated. BIOS needs to be able to write it */
+        if (gen < 11) {
+            pci_set_long(vdev->pdev.config + IGD_BDSM, 0);
+            pci_set_long(vdev->pdev.wmask + IGD_BDSM, ~0);
+            pci_set_long(vdev->emulated_config_bits + IGD_BDSM, ~0);
+        } else {
+            pci_set_quad(vdev->pdev.config + IGD_BDSM_GEN11, 0);
+            pci_set_quad(vdev->pdev.wmask + IGD_BDSM_GEN11, ~0);
+            pci_set_quad(vdev->emulated_config_bits + IGD_BDSM_GEN11, ~0);
+        }
+    }

     /*
      * Request reserved memory for stolen memory via fw_cfg. VM firmware
      * must allocate a 1MB aligned reserved memory region below 4GB with
-     * the requested size (in bytes) for use by the Intel PCI class VGA
-     * device at VM address 00:02.0. The base address of this reserved
-     * memory region must be written to the device BDSM register at PCI
-     * config offset 0x5C.
+     * the requested size (in bytes) for use by the IGD device. The base
+     * address of this reserved memory region must be written to the
+     * device BDSM register.
+     * For newer devices without a BDSM register, this fw_cfg item is 0.
      */
     bdsm_size = g_malloc(sizeof(*bdsm_size));
     *bdsm_size = cpu_to_le64(gms_size);
     fw_cfg_add_file(fw_cfg_find(), "etc/igd-bdsm-size",
                     bdsm_size, sizeof(*bdsm_size));

-    /* GMCH is read-only, emulated */
-    pci_set_long(vdev->pdev.config + IGD_GMCH, gmch);
-    pci_set_long(vdev->pdev.wmask + IGD_GMCH, 0);
-    pci_set_long(vdev->emulated_config_bits + IGD_GMCH, ~0);
-
-    /* BDSM is read-write, emulated. The BIOS needs to be able to write it */
-    if (gen < 11) {
-        pci_set_long(vdev->pdev.config + IGD_BDSM, 0);
-        pci_set_long(vdev->pdev.wmask + IGD_BDSM, ~0);
-        pci_set_long(vdev->emulated_config_bits + IGD_BDSM, ~0);
-    } else {
-        pci_set_quad(vdev->pdev.config + IGD_BDSM_GEN11, 0);
-        pci_set_quad(vdev->pdev.wmask + IGD_BDSM_GEN11, ~0);
-        pci_set_quad(vdev->emulated_config_bits + IGD_BDSM_GEN11, ~0);
-    }
-
     trace_vfio_pci_igd_bdsm_enabled(vdev->vbasedev.name, (gms_size / MiB));

     return true;
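The fw_cfg payload above is just `gms_size` stored little-endian via `cpu_to_le64()`, so firmware on any host endianness reads the same 8 bytes. A minimal stand-alone sketch of that encoding (`encode_bdsm_size()` is a hypothetical name, not QEMU API):

```c
#include <stdint.h>

/*
 * Illustrative encoder for the "etc/igd-bdsm-size" fw_cfg item: one
 * uint64_t serialized least-significant byte first, which is what
 * cpu_to_le64() produces regardless of host endianness.
 */
static void encode_bdsm_size(uint64_t gms_size, uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(gms_size >> (8 * i));    /* LSB first */
    }
}
```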
@@ -664,8 +681,27 @@ error:
  */
 static bool vfio_pci_kvmgt_config_quirk(VFIOPCIDevice *vdev, Error **errp)
 {
+    struct vfio_region_info *opregion = NULL;
+    int gen;
+
+    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
+        !vfio_is_vga(vdev)) {
+        return true;
+    }
+
+    /* FIXME: Cherryview is Gen8, but doesn't support GVT-g */
+    gen = igd_gen(vdev);
+    if (gen != 8 && gen != 9) {
+        return true;
+    }
+
+    if (!vfio_pci_igd_opregion_detect(vdev, &opregion, errp)) {
+        /* Should never reach here, KVMGT always emulates OpRegion */
+        return false;
+    }
+
     if ((vdev->features & VFIO_FEATURE_ENABLE_IGD_OPREGION) &&
-        !vfio_pci_igd_setup_opregion(vdev, errp)) {
+        !vfio_pci_igd_opregion_init(vdev, opregion, errp)) {
         return false;
     }

View file

@@ -46,11 +46,28 @@ static int iommufd_cdev_map(const VFIOContainerBase *bcontainer, hwaddr iova,

 static int iommufd_cdev_unmap(const VFIOContainerBase *bcontainer,
                               hwaddr iova, ram_addr_t size,
-                              IOMMUTLBEntry *iotlb)
+                              IOMMUTLBEntry *iotlb, bool unmap_all)
 {
     const VFIOIOMMUFDContainer *container =
         container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);

+    /* unmap in halves */
+    if (unmap_all) {
+        Int128 llsize = int128_rshift(int128_2_64(), 1);
+        int ret;
+
+        ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
+                                        0, int128_get64(llsize));
+
+        if (ret == 0) {
+            ret = iommufd_backend_unmap_dma(container->be, container->ioas_id,
+                                            int128_get64(llsize),
+                                            int128_get64(llsize));
+        }
+
+        return ret;
+    }
+
     /* TODO: Handle dma_unmap_bitmap with iotlb args (migration) */
     return iommufd_backend_unmap_dma(container->be,
                                      container->ioas_id, iova, size);
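The Int128 arithmetic above exists because the full 2^64-byte span cannot be expressed in a 64-bit size argument, so the unmap-all path issues two back-to-back unmaps of 2^63 bytes each. A sketch, assuming GCC/Clang's `unsigned __int128` as a stand-in for QEMU's `Int128`:

```c
#include <stdint.h>

/*
 * Sketch of the "unmap in halves" arithmetic: 2^64 overflows uint64_t,
 * but 2^64 >> 1 == 2^63 fits, and is used both as the size of each
 * unmap and as the iova of the second one ([0, 2^63) then [2^63, 2^64)).
 */
static uint64_t full_span_half(void)
{
    unsigned __int128 span = (unsigned __int128)1 << 64;  /* int128_2_64() */
    unsigned __int128 half = span >> 1;                   /* int128_rshift */

    return (uint64_t)half;  /* 2^63: representable in a 64-bit field */
}
```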
@@ -588,14 +605,7 @@ found_container:
         iommufd_cdev_ram_block_discard_disable(false);
     }

-    vbasedev->group = 0;
-    vbasedev->num_irqs = dev_info.num_irqs;
-    vbasedev->num_regions = dev_info.num_regions;
-    vbasedev->flags = dev_info.flags;
-    vbasedev->reset_works = !!(dev_info.flags & VFIO_DEVICE_FLAGS_RESET);
-    vbasedev->bcontainer = bcontainer;
-    QLIST_INSERT_HEAD(&bcontainer->device_list, vbasedev, container_next);
-    QLIST_INSERT_HEAD(&vfio_device_list, vbasedev, global_next);
+    vfio_device_prepare(vbasedev, bcontainer, &dev_info);

     trace_iommufd_cdev_device_info(vbasedev->name, devfd, vbasedev->num_irqs,
                                    vbasedev->num_regions, vbasedev->flags);
@@ -622,9 +632,7 @@ static void iommufd_cdev_detach(VFIODevice *vbasedev)
     VFIOIOMMUFDContainer *container = container_of(bcontainer,
                                                    VFIOIOMMUFDContainer,
                                                    bcontainer);
-    QLIST_REMOVE(vbasedev, global_next);
-    QLIST_REMOVE(vbasedev, container_next);
-    vbasedev->bcontainer = NULL;
+    vfio_device_unprepare(vbasedev);

     if (!vbasedev->ram_block_discard_allowed) {
         iommufd_cdev_ram_block_discard_disable(false);

View file

@@ -172,7 +172,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
         }
     } else {
         ret = vfio_container_dma_unmap(bcontainer, iova,
-                                       iotlb->addr_mask + 1, iotlb);
+                                       iotlb->addr_mask + 1, iotlb, false);
         if (ret) {
             error_setg(&local_err,
                        "vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
@@ -201,7 +201,7 @@ static void vfio_ram_discard_notify_discard(RamDiscardListener *rdl,
     int ret;

     /* Unmap with a single call. */
-    ret = vfio_container_dma_unmap(bcontainer, iova, size , NULL);
+    ret = vfio_container_dma_unmap(bcontainer, iova, size , NULL, false);
     if (ret) {
         error_report("%s: vfio_container_dma_unmap() failed: %s", __func__,
                      strerror(-ret));
@@ -411,6 +411,32 @@ static bool vfio_get_section_iova_range(VFIOContainerBase *bcontainer,
     return true;
 }

+static void vfio_listener_begin(MemoryListener *listener)
+{
+    VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
+                                                 listener);
+    void (*listener_begin)(VFIOContainerBase *bcontainer);
+
+    listener_begin = VFIO_IOMMU_GET_CLASS(bcontainer)->listener_begin;
+
+    if (listener_begin) {
+        listener_begin(bcontainer);
+    }
+}
+
+static void vfio_listener_commit(MemoryListener *listener)
+{
+    VFIOContainerBase *bcontainer = container_of(listener, VFIOContainerBase,
+                                                 listener);
+    void (*listener_commit)(VFIOContainerBase *bcontainer);
+
+    listener_commit = VFIO_IOMMU_GET_CLASS(bcontainer)->listener_commit;
+
+    if (listener_commit) {
+        listener_commit(bcontainer);
+    }
+}
+
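The two wrappers above follow the optional-hook pattern: the callback is looked up on the container class and simply skipped when a backend leaves it NULL, so only backends that care about transaction boundaries implement them. A minimal model of that dispatch (names are illustrative, not QEMU API):

```c
#include <stddef.h>

/* Toy class with an optional begin hook; NULL means "no-op". */
typedef struct {
    void (*begin)(int *txn_depth);
} ToyContainerClass;

/* Sample hook: e.g. open a batch that buffers map/unmap requests. */
static void toy_begin(int *txn_depth)
{
    (*txn_depth)++;
}

/* Dispatch only when the class implements the hook. */
static void toy_dispatch_begin(const ToyContainerClass *klass, int *txn_depth)
{
    if (klass->begin) {
        klass->begin(txn_depth);
    }
}
```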
 static void vfio_device_error_append(VFIODevice *vbasedev, Error **errp)
 {
     /*
@@ -634,21 +660,14 @@ static void vfio_listener_region_del(MemoryListener *listener,
     }

     if (try_unmap) {
+        bool unmap_all = false;
+
         if (int128_eq(llsize, int128_2_64())) {
-            /* The unmap ioctl doesn't accept a full 64-bit span. */
-            llsize = int128_rshift(llsize, 1);
-            ret = vfio_container_dma_unmap(bcontainer, iova,
-                                           int128_get64(llsize), NULL);
-            if (ret) {
-                error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
-                             "0x%"HWADDR_PRIx") = %d (%s)",
-                             bcontainer, iova, int128_get64(llsize), ret,
-                             strerror(-ret));
-            }
-            iova += int128_get64(llsize);
+            unmap_all = true;
+            llsize = int128_zero();
         }
-        ret = vfio_container_dma_unmap(bcontainer, iova,
-                                       int128_get64(llsize), NULL);
+
+        ret = vfio_container_dma_unmap(bcontainer, iova, int128_get64(llsize),
+                                       NULL, unmap_all);
         if (ret) {
             error_report("vfio_container_dma_unmap(%p, 0x%"HWADDR_PRIx", "
                          "0x%"HWADDR_PRIx") = %d (%s)",
@@ -801,13 +820,17 @@ static void vfio_devices_dma_logging_stop(VFIOContainerBase *bcontainer)
                               VFIO_DEVICE_FEATURE_DMA_LOGGING_STOP;

     QLIST_FOREACH(vbasedev, &bcontainer->device_list, container_next) {
+        int ret;
+
         if (!vbasedev->dirty_tracking) {
             continue;
         }

-        if (ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature)) {
+        ret = vbasedev->io_ops->device_feature(vbasedev, feature);
+
+        if (ret != 0) {
             warn_report("%s: Failed to stop DMA logging, err %d (%s)",
-                        vbasedev->name, -errno, strerror(errno));
+                        vbasedev->name, -ret, strerror(-ret));
         }
         vbasedev->dirty_tracking = false;
     }
@@ -908,10 +931,9 @@ static bool vfio_devices_dma_logging_start(VFIOContainerBase *bcontainer,
             continue;
         }

-        ret = ioctl(vbasedev->fd, VFIO_DEVICE_FEATURE, feature);
+        ret = vbasedev->io_ops->device_feature(vbasedev, feature);
         if (ret) {
-            ret = -errno;
-            error_setg_errno(errp, errno, "%s: Failed to start DMA logging",
+            error_setg_errno(errp, -ret, "%s: Failed to start DMA logging",
                              vbasedev->name);
             goto out;
         }
@@ -1165,6 +1187,8 @@ static void vfio_listener_log_sync(MemoryListener *listener,

 static const MemoryListener vfio_memory_listener = {
     .name = "vfio",
+    .begin = vfio_listener_begin,
+    .commit = vfio_listener_commit,
     .region_add = vfio_listener_region_add,
     .region_del = vfio_listener_region_del,
     .log_global_start = vfio_listener_log_global_start,

View file

@@ -241,7 +241,7 @@ static void vfio_intx_update(VFIOPCIDevice *vdev, PCIINTxRoute *route)

 static void vfio_intx_routing_notifier(PCIDevice *pdev)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     PCIINTxRoute route;

     if (vdev->interrupt != VFIO_INT_INTx) {
@@ -381,7 +381,7 @@ static void vfio_msi_interrupt(void *opaque)
 static int vfio_enable_msix_no_vec(VFIOPCIDevice *vdev)
 {
     g_autofree struct vfio_irq_set *irq_set = NULL;
-    int ret = 0, argsz;
+    int argsz;
     int32_t *fd;

     argsz = sizeof(*irq_set) + sizeof(*fd);
@@ -396,9 +396,7 @@ static int vfio_enable_msix_no_vec(VFIOPCIDevice *vdev)
     fd = (int32_t *)&irq_set->data;
     *fd = -1;

-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
-
-    return ret;
+    return vdev->vbasedev.io_ops->set_irqs(&vdev->vbasedev, irq_set);
 }

 static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
@@ -455,7 +453,7 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix)
         fds[i] = fd;
     }

-    ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set);
+    ret = vdev->vbasedev.io_ops->set_irqs(&vdev->vbasedev, irq_set);

     g_free(irq_set);
@@ -516,7 +514,7 @@ static void vfio_update_kvm_msi_virq(VFIOMSIVector *vector, MSIMessage msg,
 static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
                                    MSIMessage *msg, IOHandler *handler)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIOMSIVector *vector;
     int ret;
     bool resizing = !!(vdev->nr_vectors < nr + 1);
@@ -581,7 +579,8 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr,
         vfio_device_irq_disable(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX);
         ret = vfio_enable_vectors(vdev, true);
         if (ret) {
-            error_report("vfio: failed to enable vectors, %d", ret);
+            error_report("vfio: failed to enable vectors, %s",
+                         strerror(-ret));
         }
     } else {
         Error *err = NULL;
@@ -621,7 +620,7 @@ static int vfio_msix_vector_use(PCIDevice *pdev,

 static void vfio_msix_vector_release(PCIDevice *pdev, unsigned int nr)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIOMSIVector *vector = &vdev->msi_vectors[nr];

     trace_vfio_msix_vector_release(vdev->vbasedev.name, nr);
@@ -695,7 +694,8 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
     if (vdev->nr_vectors) {
         ret = vfio_enable_vectors(vdev, true);
         if (ret) {
-            error_report("vfio: failed to enable vectors, %d", ret);
+            error_report("vfio: failed to enable vectors, %s",
+                         strerror(-ret));
         }
     } else {
         /*
@@ -712,7 +712,8 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev)
          */
         ret = vfio_enable_msix_no_vec(vdev);
         if (ret) {
-            error_report("vfio: failed to enable MSI-X, %d", ret);
+            error_report("vfio: failed to enable MSI-X, %s",
+                         strerror(-ret));
         }
     }
@@ -765,7 +766,8 @@ retry:
     ret = vfio_enable_vectors(vdev, false);
     if (ret) {
         if (ret < 0) {
-            error_report("vfio: Error: Failed to setup MSI fds: %m");
+            error_report("vfio: Error: Failed to setup MSI fds: %s",
+                         strerror(-ret));
         } else {
             error_report("vfio: Error: Failed to enable %d "
                          "MSI vectors, retry with %d", vdev->nr_vectors, ret);
@@ -881,18 +883,22 @@ static void vfio_update_msi(VFIOPCIDevice *vdev)

 static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
 {
-    g_autofree struct vfio_region_info *reg_info = NULL;
+    VFIODevice *vbasedev = &vdev->vbasedev;
+    struct vfio_region_info *reg_info = NULL;
     uint64_t size;
     off_t off = 0;
     ssize_t bytes;
+    int ret;

-    if (vfio_device_get_region_info(&vdev->vbasedev,
-                                    VFIO_PCI_ROM_REGION_INDEX, &reg_info)) {
-        error_report("vfio: Error getting ROM info: %m");
+    ret = vfio_device_get_region_info(vbasedev, VFIO_PCI_ROM_REGION_INDEX,
+                                      &reg_info);
+    if (ret != 0) {
+        error_report("vfio: Error getting ROM info: %s", strerror(-ret));
         return;
     }

-    trace_vfio_pci_load_rom(vdev->vbasedev.name, (unsigned long)reg_info->size,
+    trace_vfio_pci_load_rom(vbasedev->name, (unsigned long)reg_info->size,
                             (unsigned long)reg_info->offset,
                             (unsigned long)reg_info->flags);
@@ -901,8 +907,7 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
     if (!vdev->rom_size) {
         vdev->rom_read_failed = true;
-        error_report("vfio-pci: Cannot read device rom at "
-                     "%s", vdev->vbasedev.name);
+        error_report("vfio-pci: Cannot read device rom at %s", vbasedev->name);
         error_printf("Device option ROM contents are probably invalid "
                      "(check dmesg).\nSkip option ROM probe with rombar=0, "
                      "or load from file with romfile=\n");
@@ -913,18 +918,22 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
     memset(vdev->rom, 0xff, size);

     while (size) {
-        bytes = pread(vdev->vbasedev.fd, vdev->rom + off,
-                      size, vdev->rom_offset + off);
+        bytes = vbasedev->io_ops->region_read(vbasedev,
+                                              VFIO_PCI_ROM_REGION_INDEX,
+                                              off, size, vdev->rom + off);
         if (bytes == 0) {
             break;
         } else if (bytes > 0) {
             off += bytes;
             size -= bytes;
         } else {
-            if (errno == EINTR || errno == EAGAIN) {
+            if (bytes == -EINTR || bytes == -EAGAIN) {
                 continue;
             }
-            error_report("vfio: Error reading device ROM: %m");
+            error_report("vfio: Error reading device ROM: %s",
+                         strreaderror(bytes));
+
             break;
         }
     }
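The copy loop above handles the three `region_read`-style outcomes: 0 means EOF, a positive count is partial progress, and a negative errno is retried when transient (`-EINTR`/`-EAGAIN`) or aborts otherwise. A self-contained model of that loop with a fake backend (`copy_region` and `fake_read` are hypothetical helpers, not QEMU API):

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

typedef int64_t (*read_cb)(uint64_t off, uint64_t size, void *buf);

/* Copy until EOF or a hard error; returns the number of bytes copied. */
static uint64_t copy_region(read_cb rd, uint64_t size, uint8_t *dst)
{
    uint64_t off = 0;

    while (size) {
        int64_t bytes = rd(off, size, dst + off);

        if (bytes == 0) {
            break;                          /* EOF */
        } else if (bytes > 0) {
            off += bytes;                   /* partial read: advance */
            size -= bytes;
        } else if (bytes == -EINTR || bytes == -EAGAIN) {
            continue;                       /* transient error: retry */
        } else {
            break;                          /* hard error: give up */
        }
    }
    return off;
}

/* Fake backend: serves 10 bytes in 4-byte chunks with one -EAGAIN. */
static int64_t fake_read(uint64_t off, uint64_t size, void *buf)
{
    static int transient = 1;
    uint64_t avail, n;

    if (off == 4 && transient) {
        transient = 0;
        return -EAGAIN;
    }
    avail = off < 10 ? 10 - off : 0;
    n = size < avail ? size : avail;
    if (n > 4) {
        n = 4;
    }
    memset(buf, 0xab, n);
    return (int64_t)n;
}
```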
@@ -960,6 +969,24 @@ static void vfio_pci_load_rom(VFIOPCIDevice *vdev)
     }
 }

+/* "Raw" read of underlying config space. */
+static int vfio_pci_config_space_read(VFIOPCIDevice *vdev, off_t offset,
+                                      uint32_t size, void *data)
+{
+    return vdev->vbasedev.io_ops->region_read(&vdev->vbasedev,
+                                              VFIO_PCI_CONFIG_REGION_INDEX,
+                                              offset, size, data);
+}
+
+/* "Raw" write of underlying config space. */
+static int vfio_pci_config_space_write(VFIOPCIDevice *vdev, off_t offset,
+                                       uint32_t size, void *data)
+{
+    return vdev->vbasedev.io_ops->region_write(&vdev->vbasedev,
+                                               VFIO_PCI_CONFIG_REGION_INDEX,
+                                               offset, size, data);
+}
+
 static uint64_t vfio_rom_read(void *opaque, hwaddr addr, unsigned size)
 {
     VFIOPCIDevice *vdev = opaque;
@@ -1012,10 +1039,9 @@ static const MemoryRegionOps vfio_rom_ops = {

 static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
 {
+    VFIODevice *vbasedev = &vdev->vbasedev;
     uint32_t orig, size = cpu_to_le32((uint32_t)PCI_ROM_ADDRESS_MASK);
-    off_t offset = vdev->config_offset + PCI_ROM_ADDRESS;
     char *name;
-    int fd = vdev->vbasedev.fd;

     if (vdev->pdev.romfile || !vdev->pdev.rom_bar) {
         /* Since pci handles romfile, just print a message and return */
@@ -1032,11 +1058,12 @@ static void vfio_pci_size_rom(VFIOPCIDevice *vdev)
      * Use the same size ROM BAR as the physical device. The contents
      * will get filled in later when the guest tries to read it.
      */
-    if (pread(fd, &orig, 4, offset) != 4 ||
-        pwrite(fd, &size, 4, offset) != 4 ||
-        pread(fd, &size, 4, offset) != 4 ||
-        pwrite(fd, &orig, 4, offset) != 4) {
-        error_report("%s(%s) failed: %m", __func__, vdev->vbasedev.name);
+    if (vfio_pci_config_space_read(vdev, PCI_ROM_ADDRESS, 4, &orig) != 4 ||
+        vfio_pci_config_space_write(vdev, PCI_ROM_ADDRESS, 4, &size) != 4 ||
+        vfio_pci_config_space_read(vdev, PCI_ROM_ADDRESS, 4, &size) != 4 ||
+        vfio_pci_config_space_write(vdev, PCI_ROM_ADDRESS, 4, &orig) != 4) {
+        error_report("%s(%s) ROM access failed", __func__, vbasedev->name);
         return;
     }
@@ -1169,7 +1196,7 @@ static const MemoryRegionOps vfio_vga_ops = {
  */
 static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
     VFIORegion *region = &vdev->bars[bar].region;
     MemoryRegion *mmap_mr, *region_mr, *base_mr;
     PCIIORegion *r;
@@ -1215,7 +1242,8 @@ static void vfio_sub_page_bar_update_mapping(PCIDevice *pdev, int bar)
  */
 uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
     uint32_t emu_bits = 0, emu_val = 0, phys_val = 0, val;

     memcpy(&emu_bits, vdev->emulated_config_bits + addr, len);
@@ -1228,12 +1256,12 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
     if (~emu_bits & (0xffffffffU >> (32 - len * 8))) {
         ssize_t ret;

-        ret = pread(vdev->vbasedev.fd, &phys_val, len,
-                    vdev->config_offset + addr);
+        ret = vfio_pci_config_space_read(vdev, addr, len, &phys_val);
         if (ret != len) {
-            error_report("%s(%s, 0x%x, 0x%x) failed: %m",
-                         __func__, vdev->vbasedev.name, addr, len);
-            return -errno;
+            error_report("%s(%s, 0x%x, 0x%x) failed: %s",
+                         __func__, vbasedev->name, addr, len,
+                         strreaderror(ret));
+            return -1;
         }
         phys_val = le32_to_cpu(phys_val);
     }
@@ -1248,16 +1276,19 @@ uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len)
 void vfio_pci_write_config(PCIDevice *pdev,
                            uint32_t addr, uint32_t val, int len)
 {
-    VFIOPCIDevice *vdev = VFIO_PCI(pdev);
+    VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
+    VFIODevice *vbasedev = &vdev->vbasedev;
     uint32_t val_le = cpu_to_le32(val);
+    int ret;

     trace_vfio_pci_write_config(vdev->vbasedev.name, addr, val, len);

     /* Write everything to VFIO, let it filter out what we can't write */
-    if (pwrite(vdev->vbasedev.fd, &val_le, len, vdev->config_offset + addr)
-        != len) {
-        error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %m",
-                     __func__, vdev->vbasedev.name, addr, val, len);
+    ret = vfio_pci_config_space_write(vdev, addr, len, &val_le);
+    if (ret != len) {
+        error_report("%s(%s, 0x%x, 0x%x, 0x%x) failed: %s",
+                     __func__, vbasedev->name, addr, val, len,
+                     strwriteerror(ret));
     }

     /* MSI/MSI-X Enabling/Disabling */
@ -1345,9 +1376,11 @@ static bool vfio_msi_setup(VFIOPCIDevice *vdev, int pos, Error **errp)
int ret, entries; int ret, entries;
Error *err = NULL; Error *err = NULL;
if (pread(vdev->vbasedev.fd, &ctrl, sizeof(ctrl), ret = vfio_pci_config_space_read(vdev, pos + PCI_CAP_FLAGS,
vdev->config_offset + pos + PCI_CAP_FLAGS) != sizeof(ctrl)) { sizeof(ctrl), &ctrl);
error_setg_errno(errp, errno, "failed reading MSI PCI_CAP_FLAGS"); if (ret != sizeof(ctrl)) {
error_setg(errp, "failed reading MSI PCI_CAP_FLAGS: %s",
strreaderror(ret));
return false; return false;
} }
ctrl = le16_to_cpu(ctrl); ctrl = le16_to_cpu(ctrl);
@ -1554,31 +1587,35 @@ static bool vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp)
uint8_t pos; uint8_t pos;
uint16_t ctrl; uint16_t ctrl;
uint32_t table, pba; uint32_t table, pba;
int ret, fd = vdev->vbasedev.fd; struct vfio_irq_info irq_info;
struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info),
.index = VFIO_PCI_MSIX_IRQ_INDEX };
VFIOMSIXInfo *msix; VFIOMSIXInfo *msix;
int ret;
pos = pci_find_capability(&vdev->pdev, PCI_CAP_ID_MSIX); pos = pci_find_capability(&vdev->pdev, PCI_CAP_ID_MSIX);
if (!pos) { if (!pos) {
return true; return true;
} }
if (pread(fd, &ctrl, sizeof(ctrl), ret = vfio_pci_config_space_read(vdev, pos + PCI_MSIX_FLAGS,
vdev->config_offset + pos + PCI_MSIX_FLAGS) != sizeof(ctrl)) { sizeof(ctrl), &ctrl);
error_setg_errno(errp, errno, "failed to read PCI MSIX FLAGS"); if (ret != sizeof(ctrl)) {
error_setg(errp, "failed to read PCI MSIX FLAGS: %s",
strreaderror(ret));
return false; return false;
} }
if (pread(fd, &table, sizeof(table), ret = vfio_pci_config_space_read(vdev, pos + PCI_MSIX_TABLE,
vdev->config_offset + pos + PCI_MSIX_TABLE) != sizeof(table)) { sizeof(table), &table);
error_setg_errno(errp, errno, "failed to read PCI MSIX TABLE"); if (ret != sizeof(table)) {
error_setg(errp, "failed to read PCI MSIX TABLE: %s",
strreaderror(ret));
return false; return false;
} }
if (pread(fd, &pba, sizeof(pba), ret = vfio_pci_config_space_read(vdev, pos + PCI_MSIX_PBA,
vdev->config_offset + pos + PCI_MSIX_PBA) != sizeof(pba)) { sizeof(pba), &pba);
error_setg_errno(errp, errno, "failed to read PCI MSIX PBA"); if (ret != sizeof(pba)) {
error_setg(errp, "failed to read PCI MSIX PBA: %s", strreaderror(ret));
return false; return false;
} }
@ -1593,7 +1630,8 @@ static bool vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp)
msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK; msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK;
msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1; msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1;
ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info); ret = vfio_device_get_irq_info(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX,
&irq_info);
if (ret < 0) { if (ret < 0) {
error_setg_errno(errp, -ret, "failed to get MSI-X irq info"); error_setg_errno(errp, -ret, "failed to get MSI-X irq info");
g_free(msix); g_free(msix);
@ -1737,10 +1775,10 @@ static void vfio_bar_prepare(VFIOPCIDevice *vdev, int nr)
} }
/* Determine what type of BAR this is for registration */ /* Determine what type of BAR this is for registration */
ret = pread(vdev->vbasedev.fd, &pci_bar, sizeof(pci_bar), ret = vfio_pci_config_space_read(vdev, PCI_BASE_ADDRESS_0 + (4 * nr),
vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr)); sizeof(pci_bar), &pci_bar);
if (ret != sizeof(pci_bar)) { if (ret != sizeof(pci_bar)) {
error_report("vfio: Failed to read BAR %d (%m)", nr); error_report("vfio: Failed to read BAR %d: %s", nr, strreaderror(ret));
return; return;
} }
@ -2443,21 +2481,23 @@ void vfio_pci_pre_reset(VFIOPCIDevice *vdev)
void vfio_pci_post_reset(VFIOPCIDevice *vdev) void vfio_pci_post_reset(VFIOPCIDevice *vdev)
{ {
VFIODevice *vbasedev = &vdev->vbasedev;
Error *err = NULL; Error *err = NULL;
int nr; int ret, nr;
if (!vfio_intx_enable(vdev, &err)) { if (!vfio_intx_enable(vdev, &err)) {
error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name);
} }
for (nr = 0; nr < PCI_NUM_REGIONS - 1; ++nr) { for (nr = 0; nr < PCI_NUM_REGIONS - 1; ++nr) {
off_t addr = vdev->config_offset + PCI_BASE_ADDRESS_0 + (4 * nr); off_t addr = PCI_BASE_ADDRESS_0 + (4 * nr);
uint32_t val = 0; uint32_t val = 0;
uint32_t len = sizeof(val); uint32_t len = sizeof(val);
if (pwrite(vdev->vbasedev.fd, &val, len, addr) != len) { ret = vfio_pci_config_space_write(vdev, addr, len, &val);
error_report("%s(%s) reset bar %d failed: %m", __func__, if (ret != len) {
vdev->vbasedev.name, nr); error_report("%s(%s) reset bar %d failed: %s", __func__,
vbasedev->name, nr, strwriteerror(ret));
} }
} }
@ -2670,7 +2710,7 @@ static VFIODeviceOps vfio_pci_ops = {
bool vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp) bool vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp)
{ {
VFIODevice *vbasedev = &vdev->vbasedev; VFIODevice *vbasedev = &vdev->vbasedev;
g_autofree struct vfio_region_info *reg_info = NULL; struct vfio_region_info *reg_info = NULL;
int ret; int ret;
ret = vfio_device_get_region_info(vbasedev, VFIO_PCI_VGA_REGION_INDEX, &reg_info); ret = vfio_device_get_region_info(vbasedev, VFIO_PCI_VGA_REGION_INDEX, &reg_info);
@ -2735,8 +2775,8 @@ bool vfio_populate_vga(VFIOPCIDevice *vdev, Error **errp)
static bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp) static bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
{ {
VFIODevice *vbasedev = &vdev->vbasedev; VFIODevice *vbasedev = &vdev->vbasedev;
g_autofree struct vfio_region_info *reg_info = NULL; struct vfio_region_info *reg_info = NULL;
struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info) }; struct vfio_irq_info irq_info;
int i, ret = -1; int i, ret = -1;
/* Sanity check device */ /* Sanity check device */
@ -2797,12 +2837,10 @@ static bool vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
} }
} }
irq_info.index = VFIO_PCI_ERR_IRQ_INDEX; ret = vfio_device_get_irq_info(vbasedev, VFIO_PCI_ERR_IRQ_INDEX, &irq_info);
ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info);
if (ret) { if (ret) {
/* This can fail for an old kernel or legacy PCI dev */ /* This can fail for an old kernel or legacy PCI dev */
trace_vfio_populate_device_get_irq_info_failure(strerror(errno)); trace_vfio_populate_device_get_irq_info_failure(strerror(-ret));
} else if (irq_info.count == 1) { } else if (irq_info.count == 1) {
vdev->pci_aer = true; vdev->pci_aer = true;
} else { } else {
@ -2911,17 +2949,18 @@ static void vfio_req_notifier_handler(void *opaque)
static void vfio_register_req_notifier(VFIOPCIDevice *vdev) static void vfio_register_req_notifier(VFIOPCIDevice *vdev)
{ {
struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info), struct vfio_irq_info irq_info;
.index = VFIO_PCI_REQ_IRQ_INDEX };
Error *err = NULL; Error *err = NULL;
int32_t fd; int32_t fd;
int ret;
if (!(vdev->features & VFIO_FEATURE_ENABLE_REQ)) { if (!(vdev->features & VFIO_FEATURE_ENABLE_REQ)) {
return; return;
} }
if (ioctl(vdev->vbasedev.fd, ret = vfio_device_get_irq_info(&vdev->vbasedev, VFIO_PCI_REQ_IRQ_INDEX,
VFIO_DEVICE_GET_IRQ_INFO, &irq_info) < 0 || irq_info.count < 1) { &irq_info);
if (ret < 0 || irq_info.count < 1) {
return; return;
} }
@ -3090,11 +3129,12 @@ static bool vfio_interrupt_setup(VFIOPCIDevice *vdev, Error **errp)
static void vfio_realize(PCIDevice *pdev, Error **errp) static void vfio_realize(PCIDevice *pdev, Error **errp)
{ {
ERRP_GUARD(); ERRP_GUARD();
VFIOPCIDevice *vdev = VFIO_PCI(pdev); VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
VFIODevice *vbasedev = &vdev->vbasedev; VFIODevice *vbasedev = &vdev->vbasedev;
int i, ret; int i, ret;
char uuid[UUID_STR_LEN]; char uuid[UUID_STR_LEN];
g_autofree char *name = NULL; g_autofree char *name = NULL;
uint32_t config_space_size;
if (vbasedev->fd < 0 && !vbasedev->sysfsdev) { if (vbasedev->fd < 0 && !vbasedev->sysfsdev) {
if (!(~vdev->host.domain || ~vdev->host.bus || if (!(~vdev->host.domain || ~vdev->host.bus ||
@ -3149,13 +3189,14 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
goto error; goto error;
} }
config_space_size = MIN(pci_config_size(&vdev->pdev), vdev->config_size);
/* Get a copy of config space */ /* Get a copy of config space */
ret = pread(vbasedev->fd, vdev->pdev.config, ret = vfio_pci_config_space_read(vdev, 0, config_space_size,
MIN(pci_config_size(&vdev->pdev), vdev->config_size), vdev->pdev.config);
vdev->config_offset); if (ret < (int)config_space_size) {
if (ret < (int)MIN(pci_config_size(&vdev->pdev), vdev->config_size)) { ret = ret < 0 ? -ret : EFAULT;
ret = ret < 0 ? -errno : -EFAULT; error_setg_errno(errp, ret, "failed to read device config space");
error_setg_errno(errp, -ret, "failed to read device config space");
goto error; goto error;
} }
@ -3259,7 +3300,7 @@ error:
static void vfio_instance_finalize(Object *obj) static void vfio_instance_finalize(Object *obj)
{ {
VFIOPCIDevice *vdev = VFIO_PCI(obj); VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
vfio_display_finalize(vdev); vfio_display_finalize(vdev);
vfio_bars_finalize(vdev); vfio_bars_finalize(vdev);
@ -3277,7 +3318,7 @@ static void vfio_instance_finalize(Object *obj)
static void vfio_exitfn(PCIDevice *pdev) static void vfio_exitfn(PCIDevice *pdev)
{ {
VFIOPCIDevice *vdev = VFIO_PCI(pdev); VFIOPCIDevice *vdev = VFIO_PCI_BASE(pdev);
VFIODevice *vbasedev = &vdev->vbasedev; VFIODevice *vbasedev = &vdev->vbasedev;
vfio_unregister_req_notifier(vdev); vfio_unregister_req_notifier(vdev);
@ -3301,7 +3342,7 @@ static void vfio_exitfn(PCIDevice *pdev)
static void vfio_pci_reset(DeviceState *dev) static void vfio_pci_reset(DeviceState *dev)
{ {
VFIOPCIDevice *vdev = VFIO_PCI(dev); VFIOPCIDevice *vdev = VFIO_PCI_BASE(dev);
trace_vfio_pci_reset(vdev->vbasedev.name); trace_vfio_pci_reset(vdev->vbasedev.name);
@ -3341,7 +3382,7 @@ post_reset:
static void vfio_instance_init(Object *obj) static void vfio_instance_init(Object *obj)
{ {
PCIDevice *pci_dev = PCI_DEVICE(obj); PCIDevice *pci_dev = PCI_DEVICE(obj);
VFIOPCIDevice *vdev = VFIO_PCI(obj); VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
VFIODevice *vbasedev = &vdev->vbasedev; VFIODevice *vbasedev = &vdev->vbasedev;
device_add_bootindex_property(obj, &vdev->bootindex, device_add_bootindex_property(obj, &vdev->bootindex,
@ -3362,6 +3403,31 @@ static void vfio_instance_init(Object *obj)
pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS; pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS;
} }
static void vfio_pci_base_dev_class_init(ObjectClass *klass, const void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
PCIDeviceClass *pdc = PCI_DEVICE_CLASS(klass);
dc->desc = "VFIO PCI base device";
set_bit(DEVICE_CATEGORY_MISC, dc->categories);
pdc->exit = vfio_exitfn;
pdc->config_read = vfio_pci_read_config;
pdc->config_write = vfio_pci_write_config;
}
static const TypeInfo vfio_pci_base_dev_info = {
.name = TYPE_VFIO_PCI_BASE,
.parent = TYPE_PCI_DEVICE,
.instance_size = 0,
.abstract = true,
.class_init = vfio_pci_base_dev_class_init,
.interfaces = (const InterfaceInfo[]) {
{ INTERFACE_PCIE_DEVICE },
{ INTERFACE_CONVENTIONAL_PCI_DEVICE },
{ }
},
};
static PropertyInfo vfio_pci_migration_multifd_transfer_prop; static PropertyInfo vfio_pci_migration_multifd_transfer_prop;
static const Property vfio_pci_dev_properties[] = { static const Property vfio_pci_dev_properties[] = {
@ -3385,7 +3451,7 @@ static const Property vfio_pci_dev_properties[] = {
DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features, DEFINE_PROP_BIT("x-req", VFIOPCIDevice, features,
VFIO_FEATURE_ENABLE_REQ_BIT, true), VFIO_FEATURE_ENABLE_REQ_BIT, true),
DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features, DEFINE_PROP_BIT("x-igd-opregion", VFIOPCIDevice, features,
VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, true),
DEFINE_PROP_BIT("x-igd-lpc", VFIOPCIDevice, features, DEFINE_PROP_BIT("x-igd-lpc", VFIOPCIDevice, features,
VFIO_FEATURE_ENABLE_IGD_LPC_BIT, false), VFIO_FEATURE_ENABLE_IGD_LPC_BIT, false),
DEFINE_PROP_ON_OFF_AUTO("x-igd-legacy-mode", VFIOPCIDevice, DEFINE_PROP_ON_OFF_AUTO("x-igd-legacy-mode", VFIOPCIDevice,
@ -3432,7 +3498,8 @@ static const Property vfio_pci_dev_properties[] = {
#ifdef CONFIG_IOMMUFD #ifdef CONFIG_IOMMUFD
static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp) static void vfio_pci_set_fd(Object *obj, const char *str, Error **errp)
{ {
vfio_device_set_fd(&VFIO_PCI(obj)->vbasedev, str, errp); VFIOPCIDevice *vdev = VFIO_PCI_BASE(obj);
vfio_device_set_fd(&vdev->vbasedev, str, errp);
} }
#endif #endif
@ -3447,11 +3514,7 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, const void *data)
object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd); object_class_property_add_str(klass, "fd", NULL, vfio_pci_set_fd);
#endif #endif
dc->desc = "VFIO-based PCI device assignment"; dc->desc = "VFIO-based PCI device assignment";
set_bit(DEVICE_CATEGORY_MISC, dc->categories);
pdc->realize = vfio_realize; pdc->realize = vfio_realize;
pdc->exit = vfio_exitfn;
pdc->config_read = vfio_pci_read_config;
pdc->config_write = vfio_pci_write_config;
object_class_property_set_description(klass, /* 1.3 */ object_class_property_set_description(klass, /* 1.3 */
"host", "host",
@ -3576,16 +3639,11 @@ static void vfio_pci_dev_class_init(ObjectClass *klass, const void *data)
static const TypeInfo vfio_pci_dev_info = { static const TypeInfo vfio_pci_dev_info = {
.name = TYPE_VFIO_PCI, .name = TYPE_VFIO_PCI,
.parent = TYPE_PCI_DEVICE, .parent = TYPE_VFIO_PCI_BASE,
.instance_size = sizeof(VFIOPCIDevice), .instance_size = sizeof(VFIOPCIDevice),
.class_init = vfio_pci_dev_class_init, .class_init = vfio_pci_dev_class_init,
.instance_init = vfio_instance_init, .instance_init = vfio_instance_init,
.instance_finalize = vfio_instance_finalize, .instance_finalize = vfio_instance_finalize,
.interfaces = (const InterfaceInfo[]) {
{ INTERFACE_PCIE_DEVICE },
{ INTERFACE_CONVENTIONAL_PCI_DEVICE },
{ }
},
}; };
static const Property vfio_pci_dev_nohotplug_properties[] = { static const Property vfio_pci_dev_nohotplug_properties[] = {
@ -3632,6 +3690,7 @@ static void register_vfio_pci_dev_type(void)
vfio_pci_migration_multifd_transfer_prop = qdev_prop_on_off_auto; vfio_pci_migration_multifd_transfer_prop = qdev_prop_on_off_auto;
vfio_pci_migration_multifd_transfer_prop.realized_set_allowed = true; vfio_pci_migration_multifd_transfer_prop.realized_set_allowed = true;
type_register_static(&vfio_pci_base_dev_info);
type_register_static(&vfio_pci_dev_info); type_register_static(&vfio_pci_dev_info);
type_register_static(&vfio_pci_nohotplug_dev_info); type_register_static(&vfio_pci_nohotplug_dev_info);
} }


@@ -118,8 +118,16 @@ typedef struct VFIOMSIXInfo {
     bool noresize;
 } VFIOMSIXInfo;
 
+/*
+ * TYPE_VFIO_PCI_BASE is an abstract type used to share code
+ * between VFIO implementations that use a kernel driver
+ * with those that use user sockets.
+ */
+#define TYPE_VFIO_PCI_BASE "vfio-pci-base"
+OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI_BASE)
+
 #define TYPE_VFIO_PCI "vfio-pci"
-OBJECT_DECLARE_SIMPLE_TYPE(VFIOPCIDevice, VFIO_PCI)
+/* TYPE_VFIO_PCI shares struct VFIOPCIDevice. */
 
 struct VFIOPCIDevice {
     PCIDevice pdev;


@@ -474,10 +474,10 @@ static bool vfio_populate_device(VFIODevice *vbasedev, Error **errp)
     QSIMPLEQ_INIT(&vdev->pending_intp_queue);
 
     for (i = 0; i < vbasedev->num_irqs; i++) {
-        struct vfio_irq_info irq = { .argsz = sizeof(irq) };
+        struct vfio_irq_info irq;
 
-        irq.index = i;
-        ret = ioctl(vbasedev->fd, VFIO_DEVICE_GET_IRQ_INFO, &irq);
+        ret = vfio_device_get_irq_info(vbasedev, i, &irq);
         if (ret) {
             error_setg_errno(errp, -ret, "failed to get device irq info");
             goto irq_err;


@@ -45,6 +45,7 @@ void vfio_region_write(void *opaque, hwaddr addr,
         uint32_t dword;
         uint64_t qword;
     } buf;
+    int ret;
 
     switch (size) {
     case 1:
@@ -64,11 +65,13 @@ void vfio_region_write(void *opaque, hwaddr addr,
         break;
     }
 
-    if (pwrite(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
+    ret = vbasedev->io_ops->region_write(vbasedev, region->nr,
+                                         addr, size, &buf);
+    if (ret != size) {
         error_report("%s(%s:region%d+0x%"HWADDR_PRIx", 0x%"PRIx64
-                     ",%d) failed: %m",
+                     ",%d) failed: %s",
                      __func__, vbasedev->name, region->nr,
-                     addr, data, size);
+                     addr, data, size, strwriteerror(ret));
     }
 
     trace_vfio_region_write(vbasedev->name, region->nr, addr, data, size);
@@ -96,11 +99,13 @@ uint64_t vfio_region_read(void *opaque,
         uint64_t qword;
     } buf;
     uint64_t data = 0;
+    int ret;
 
-    if (pread(vbasedev->fd, &buf, size, region->fd_offset + addr) != size) {
-        error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %m",
+    ret = vbasedev->io_ops->region_read(vbasedev, region->nr, addr, size, &buf);
+    if (ret != size) {
+        error_report("%s(%s:region%d+0x%"HWADDR_PRIx", %d) failed: %s",
                      __func__, vbasedev->name, region->nr,
-                     addr, size);
+                     addr, size, strreaderror(ret));
         return (uint64_t)-1;
     }
 
     switch (size) {
@@ -182,7 +187,7 @@ static int vfio_setup_region_sparse_mmaps(VFIORegion *region,
 int vfio_region_setup(Object *obj, VFIODevice *vbasedev, VFIORegion *region,
                       int index, const char *name)
 {
-    g_autofree struct vfio_region_info *info = NULL;
+    struct vfio_region_info *info = NULL;
     int ret;
 
     ret = vfio_device_get_region_info(vbasedev, index, &info);


@@ -81,7 +81,7 @@ int vfio_container_dma_map(VFIOContainerBase *bcontainer,
                            void *vaddr, bool readonly);
 int vfio_container_dma_unmap(VFIOContainerBase *bcontainer,
                              hwaddr iova, ram_addr_t size,
-                             IOMMUTLBEntry *iotlb);
+                             IOMMUTLBEntry *iotlb, bool unmap_all);
 bool vfio_container_add_section_window(VFIOContainerBase *bcontainer,
                                        MemoryRegionSection *section,
                                        Error **errp);
@@ -117,12 +117,25 @@ struct VFIOIOMMUClass {
     /* basic feature */
     bool (*setup)(VFIOContainerBase *bcontainer, Error **errp);
+    void (*listener_begin)(VFIOContainerBase *bcontainer);
+    void (*listener_commit)(VFIOContainerBase *bcontainer);
     int (*dma_map)(const VFIOContainerBase *bcontainer,
                    hwaddr iova, ram_addr_t size,
                    void *vaddr, bool readonly);
+    /**
+     * @dma_unmap
+     *
+     * Unmap an address range from the container.
+     *
+     * @bcontainer: #VFIOContainerBase to use for unmap
+     * @iova: start address to unmap
+     * @size: size of the range to unmap
+     * @iotlb: The IOMMU TLB mapping entry (or NULL)
+     * @unmap_all: if set, unmap the entire address space
+     */
     int (*dma_unmap)(const VFIOContainerBase *bcontainer,
                      hwaddr iova, ram_addr_t size,
-                     IOMMUTLBEntry *iotlb);
+                     IOMMUTLBEntry *iotlb, bool unmap_all);
     bool (*attach_device)(const char *name, VFIODevice *vbasedev,
                           AddressSpace *as, Error **errp);
     void (*detach_device)(VFIODevice *vbasedev);
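The new unmap_all flag documented above lets a caller drop every mapping in a container with one call instead of iterating over ranges. A toy model of that dispatch, with hypothetical names (DemoContainer, demo_unmap) standing in for the QEMU types and tracking only a byte count:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t hwaddr;

/* Toy container: tracks only how many bytes are currently mapped. */
typedef struct {
    hwaddr mapped;
} DemoContainer;

/* Mirrors the shape of the dma_unmap hook: range unmap, or everything. */
static int demo_unmap(DemoContainer *c, hwaddr iova, hwaddr size,
                      bool unmap_all)
{
    if (unmap_all) {
        /* iova/size are ignored; the whole address space goes away */
        c->mapped = 0;
        return 0;
    }
    if (size > c->mapped) {
        return -EINVAL;   /* can't unmap more than is mapped */
    }
    c->mapped -= size;
    return 0;
}
```

In the real signature the flag rides alongside iova/size and the IOMMUTLBEntry pointer; a caller tearing down a container would pass unmap_all = true rather than replaying each mapped range.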


@@ -41,6 +41,7 @@ enum {
 };
 
 typedef struct VFIODeviceOps VFIODeviceOps;
+typedef struct VFIODeviceIOOps VFIODeviceIOOps;
 typedef struct VFIOMigration VFIOMigration;
 
 typedef struct IOMMUFDBackend IOMMUFDBackend;
@@ -66,6 +67,7 @@ typedef struct VFIODevice {
     OnOffAuto migration_multifd_transfer;
     bool migration_events;
     VFIODeviceOps *ops;
+    VFIODeviceIOOps *io_ops;
     unsigned int num_irqs;
     unsigned int num_regions;
     unsigned int flags;
@@ -81,6 +83,7 @@ typedef struct VFIODevice {
     IOMMUFDBackend *iommufd;
     VFIOIOASHwpt *hwpt;
     QLIST_ENTRY(VFIODevice) hwpt_next;
+    struct vfio_region_info **reginfo;
 } VFIODevice;
 
 struct VFIODeviceOps {
@@ -115,6 +118,20 @@ struct VFIODeviceOps {
     int (*vfio_load_config)(VFIODevice *vdev, QEMUFile *f);
 };
 
+/*
+ * Given a return value of either a short number of bytes read or -errno,
+ * construct a meaningful error message.
+ */
+#define strreaderror(ret) \
+    (ret < 0 ? strerror(-ret) : "short read")
+
+/*
+ * Given a return value of either a short number of bytes written or -errno,
+ * construct a meaningful error message.
+ */
+#define strwriteerror(ret) \
+    (ret < 0 ? strerror(-ret) : "short write")
+
 void vfio_device_irq_disable(VFIODevice *vbasedev, int index);
 void vfio_device_irq_unmask(VFIODevice *vbasedev, int index);
 void vfio_device_irq_mask(VFIODevice *vbasedev, int index);
@@ -127,6 +144,9 @@ bool vfio_device_hiod_create_and_realize(VFIODevice *vbasedev,
                                          const char *typename, Error **errp);
 bool vfio_device_attach(char *name, VFIODevice *vbasedev,
                         AddressSpace *as, Error **errp);
+bool vfio_device_attach_by_iommu_type(const char *iommu_type, char *name,
+                                      VFIODevice *vbasedev, AddressSpace *as,
+                                      Error **errp);
 void vfio_device_detach(VFIODevice *vbasedev);
 VFIODevice *vfio_get_vfio_device(Object *obj);
 
@@ -134,11 +154,73 @@ typedef QLIST_HEAD(VFIODeviceList, VFIODevice) VFIODeviceList;
 extern VFIODeviceList vfio_device_list;
 
 #ifdef CONFIG_LINUX
+
+/*
+ * How devices communicate with the server.  The default option is through
+ * ioctl() to the kernel VFIO driver, but vfio-user can use a socket to a
+ * remote process.
+ */
+struct VFIODeviceIOOps {
+    /**
+     * @device_feature
+     *
+     * Fill in feature info for the given device.
+     */
+    int (*device_feature)(VFIODevice *vdev, struct vfio_device_feature *);
+
+    /**
+     * @get_region_info
+     *
+     * Fill in @info with information on the region given by @info->index.
+     */
+    int (*get_region_info)(VFIODevice *vdev,
+                           struct vfio_region_info *info);
+
+    /**
+     * @get_irq_info
+     *
+     * Fill in @irq with information on the IRQ given by @info->index.
+     */
+    int (*get_irq_info)(VFIODevice *vdev, struct vfio_irq_info *irq);
+
+    /**
+     * @set_irqs
+     *
+     * Configure IRQs as defined by @irqs.
+     */
+    int (*set_irqs)(VFIODevice *vdev, struct vfio_irq_set *irqs);
+
+    /**
+     * @region_read
+     *
+     * Read @size bytes from the region @nr at offset @off into the buffer
+     * @data.
+     */
+    int (*region_read)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
+                       void *data);
+
+    /**
+     * @region_write
+     *
+     * Write @size bytes to the region @nr at offset @off from the buffer
+     * @data.
+     */
+    int (*region_write)(VFIODevice *vdev, uint8_t nr, off_t off, uint32_t size,
+                        void *data);
+};
+
+void vfio_device_prepare(VFIODevice *vbasedev, VFIOContainerBase *bcontainer,
+                         struct vfio_device_info *info);
+
+void vfio_device_unprepare(VFIODevice *vbasedev);
+
 int vfio_device_get_region_info(VFIODevice *vbasedev, int index,
                                 struct vfio_region_info **info);
 int vfio_device_get_region_info_type(VFIODevice *vbasedev, uint32_t type,
                                      uint32_t subtype, struct vfio_region_info **info);
 bool vfio_device_has_region_cap(VFIODevice *vbasedev, int region, uint16_t cap_type);
+
+int vfio_device_get_irq_info(VFIODevice *vbasedev, int index,
+                             struct vfio_irq_info *info);
 #endif
 
 /* Returns 0 on success, or a negative errno. */


@@ -18,7 +18,7 @@
 #define SETUP_INDIRECT (1<<31)
 #define SETUP_TYPE_MAX (SETUP_ENUM_MAX | SETUP_INDIRECT)
 
-#ifndef __ASSEMBLY__
+#ifndef __ASSEMBLER__
 
 #include "standard-headers/linux/types.h"
 
@@ -78,6 +78,6 @@ struct ima_setup_data {
     uint64_t size;
 } QEMU_PACKED;
 
-#endif /* __ASSEMBLY__ */
+#endif /* __ASSEMBLER__ */
 
 #endif /* _ASM_X86_SETUP_DATA_H */


@@ -420,6 +420,7 @@ extern "C" {
 #define DRM_FORMAT_MOD_VENDOR_ARM 0x08
 #define DRM_FORMAT_MOD_VENDOR_ALLWINNER 0x09
 #define DRM_FORMAT_MOD_VENDOR_AMLOGIC 0x0a
+#define DRM_FORMAT_MOD_VENDOR_MTK 0x0b
 
 /* add more to the end as needed */
 
@@ -1452,6 +1453,46 @@ drm_fourcc_canonicalize_nvidia_format_mod(uint64_t modifier)
  */
 #define AMLOGIC_FBC_OPTION_MEM_SAVING (1ULL << 0)
 
+/* MediaTek modifiers
+ * Bits  Parameter                Notes
+ * ----- ------------------------ ---------------------------------------------
+ *  7: 0 TILE LAYOUT              Values are MTK_FMT_MOD_TILE_*
+ * 15: 8 COMPRESSION              Values are MTK_FMT_MOD_COMPRESS_*
+ * 23:16 10 BIT LAYOUT            Values are MTK_FMT_MOD_10BIT_LAYOUT_*
+ *
+ */
+
+#define DRM_FORMAT_MOD_MTK(__flags) fourcc_mod_code(MTK, __flags)
+
+/*
+ * MediaTek Tiled Modifier
+ * The lowest 8 bits of the modifier is used to specify the tiling
+ * layout. Only the 16L_32S tiling is used for now, but we define an
+ * "untiled" version and leave room for future expansion.
+ */
+#define MTK_FMT_MOD_TILE_MASK 0xf
+#define MTK_FMT_MOD_TILE_NONE 0x0
+#define MTK_FMT_MOD_TILE_16L32S 0x1
+
+/*
+ * Bits 8-15 specify compression options
+ */
+#define MTK_FMT_MOD_COMPRESS_MASK (0xf << 8)
+#define MTK_FMT_MOD_COMPRESS_NONE (0x0 << 8)
+#define MTK_FMT_MOD_COMPRESS_V1 (0x1 << 8)
+
+/*
+ * Bits 16-23 specify how the bits of 10 bit formats are
+ * stored out in memory
+ */
+#define MTK_FMT_MOD_10BIT_LAYOUT_MASK (0xf << 16)
+#define MTK_FMT_MOD_10BIT_LAYOUT_PACKED (0x0 << 16)
+#define MTK_FMT_MOD_10BIT_LAYOUT_LSBTILED (0x1 << 16)
+#define MTK_FMT_MOD_10BIT_LAYOUT_LSBRASTER (0x2 << 16)
+
+/* alias for the most common tiling format */
+#define DRM_FORMAT_MOD_MTK_16L_32S_TILE DRM_FORMAT_MOD_MTK(MTK_FMT_MOD_TILE_16L32S)
+
 /*
  * AMD modifiers
  *
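The MTK modifier flags above ride in the low bits of a 64-bit DRM format modifier; in drm_fourcc.h, fourcc_mod_code() places the vendor ID in the top byte and masks the flags to the low 56 bits. A self-contained restatement of that packing (demo_* names are ours; the MTK_* values match the hunk):

```c
#include <assert.h>
#include <stdint.h>

/* Assumed packing of fourcc_mod_code(): vendor in bits 63-56, flags below. */
#define DEMO_MOD_VENDOR_MTK           0x0bULL
#define demo_mod_code(vendor, flags) \
    (((vendor) << 56) | ((uint64_t)(flags) & 0x00ffffffffffffffULL))

/* Field definitions copied from the hunk above. */
#define MTK_FMT_MOD_TILE_MASK         0xf
#define MTK_FMT_MOD_TILE_16L32S       0x1
#define MTK_FMT_MOD_COMPRESS_MASK     (0xf << 8)
#define MTK_FMT_MOD_COMPRESS_V1       (0x1 << 8)
```

Composing a tiled, compressed modifier is then an OR of the fields, and each parameter can be recovered with its mask, e.g. `mod & MTK_FMT_MOD_TILE_MASK` yields the tile layout.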


@@ -33,7 +33,7 @@
  * Missing __asm__ support
  *
  * __BIT128() would not work in the __asm__ code, as it shifts an
- * 'unsigned __init128' data type as direct representation of
+ * 'unsigned __int128' data type as direct representation of
  * 128 bit constants is not supported in the gcc compiler, as
  * they get silently truncated.
  *


@@ -2059,6 +2059,24 @@ enum ethtool_link_mode_bit_indices {
 	ETHTOOL_LINK_MODE_10baseT1S_Half_BIT = 100,
 	ETHTOOL_LINK_MODE_10baseT1S_P2MP_Half_BIT = 101,
 	ETHTOOL_LINK_MODE_10baseT1BRR_Full_BIT = 102,
+	ETHTOOL_LINK_MODE_200000baseCR_Full_BIT = 103,
+	ETHTOOL_LINK_MODE_200000baseKR_Full_BIT = 104,
+	ETHTOOL_LINK_MODE_200000baseDR_Full_BIT = 105,
+	ETHTOOL_LINK_MODE_200000baseDR_2_Full_BIT = 106,
+	ETHTOOL_LINK_MODE_200000baseSR_Full_BIT = 107,
+	ETHTOOL_LINK_MODE_200000baseVR_Full_BIT = 108,
+	ETHTOOL_LINK_MODE_400000baseCR2_Full_BIT = 109,
+	ETHTOOL_LINK_MODE_400000baseKR2_Full_BIT = 110,
+	ETHTOOL_LINK_MODE_400000baseDR2_Full_BIT = 111,
+	ETHTOOL_LINK_MODE_400000baseDR2_2_Full_BIT = 112,
+	ETHTOOL_LINK_MODE_400000baseSR2_Full_BIT = 113,
+	ETHTOOL_LINK_MODE_400000baseVR2_Full_BIT = 114,
+	ETHTOOL_LINK_MODE_800000baseCR4_Full_BIT = 115,
+	ETHTOOL_LINK_MODE_800000baseKR4_Full_BIT = 116,
+	ETHTOOL_LINK_MODE_800000baseDR4_Full_BIT = 117,
+	ETHTOOL_LINK_MODE_800000baseDR4_2_Full_BIT = 118,
+	ETHTOOL_LINK_MODE_800000baseSR4_Full_BIT = 119,
+	ETHTOOL_LINK_MODE_800000baseVR4_Full_BIT = 120,
 
 	/* must be last entry */
 	__ETHTOOL_LINK_MODE_MASK_NBITS
@@ -2271,6 +2289,10 @@ static inline int ethtool_validate_duplex(uint8_t duplex)
  * be exploited to reduce the RSS queue spread.
  */
 #define RXH_XFRM_SYM_XOR (1 << 0)
+/* Similar to SYM_XOR, except that one copy of the XOR'ed fields is replaced by
+ * an OR of the same fields
+ */
+#define RXH_XFRM_SYM_OR_XOR (1 << 1)
 #define RXH_XFRM_NO_CHANGE 0xff
 
 /* L2-L4 network traffic flow types */


@@ -229,6 +229,9 @@
  *  - FUSE_URING_IN_OUT_HEADER_SZ
  *  - FUSE_URING_OP_IN_OUT_SZ
  *  - enum fuse_uring_cmd
+ *
+ *  7.43
+ *  - add FUSE_REQUEST_TIMEOUT
  */
 
 #ifndef _LINUX_FUSE_H
@@ -260,7 +263,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 42
+#define FUSE_KERNEL_MINOR_VERSION 43
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -431,6 +434,8 @@ struct fuse_file_lock {
  *	of the request ID indicates resend requests
  * FUSE_ALLOW_IDMAP: allow creation of idmapped mounts
  * FUSE_OVER_IO_URING: Indicate that client supports io-uring
+ * FUSE_REQUEST_TIMEOUT: kernel supports timing out requests.
+ *	init_out.request_timeout contains the timeout (in secs)
  */
 #define FUSE_ASYNC_READ (1 << 0)
 #define FUSE_POSIX_LOCKS (1 << 1)
@@ -473,11 +478,11 @@ struct fuse_file_lock {
 #define FUSE_PASSTHROUGH (1ULL << 37)
 #define FUSE_NO_EXPORT_SUPPORT (1ULL << 38)
 #define FUSE_HAS_RESEND (1ULL << 39)
-
 /* Obsolete alias for FUSE_DIRECT_IO_ALLOW_MMAP */
 #define FUSE_DIRECT_IO_RELAX FUSE_DIRECT_IO_ALLOW_MMAP
 #define FUSE_ALLOW_IDMAP (1ULL << 40)
 #define FUSE_OVER_IO_URING (1ULL << 41)
+#define FUSE_REQUEST_TIMEOUT (1ULL << 42)
 
 /**
  * CUSE INIT request/reply flags
@@ -905,7 +910,8 @@ struct fuse_init_out {
 	uint16_t map_alignment;
 	uint32_t flags2;
 	uint32_t max_stack_depth;
-	uint32_t unused[6];
+	uint16_t request_timeout;
+	uint16_t unused[11];
 };
 
 #define CUSE_INIT_INFO_MAX 4096


@@ -486,6 +486,7 @@
 #define PCI_EXP_TYPE_RC_EC 0xa /* Root Complex Event Collector */
 #define PCI_EXP_FLAGS_SLOT 0x0100 /* Slot implemented */
 #define PCI_EXP_FLAGS_IRQ 0x3e00 /* Interrupt message number */
+#define PCI_EXP_FLAGS_FLIT 0x8000 /* Flit Mode Supported */
 #define PCI_EXP_DEVCAP 0x04 /* Device capabilities */
 #define PCI_EXP_DEVCAP_PAYLOAD 0x00000007 /* Max_Payload_Size */
 #define PCI_EXP_DEVCAP_PHANTOM 0x00000018 /* Phantom functions */
@@ -795,6 +796,8 @@
 #define PCI_ERR_CAP_ECRC_CHKC 0x00000080 /* ECRC Check Capable */
 #define PCI_ERR_CAP_ECRC_CHKE 0x00000100 /* ECRC Check Enable */
 #define PCI_ERR_CAP_PREFIX_LOG_PRESENT 0x00000800 /* TLP Prefix Log Present */
+#define PCI_ERR_CAP_TLP_LOG_FLIT 0x00040000 /* TLP was logged in Flit Mode */
+#define PCI_ERR_CAP_TLP_LOG_SIZE 0x00f80000 /* Logged TLP Size (only in Flit mode) */
 #define PCI_ERR_HEADER_LOG 0x1c /* Header Log Register (16 bytes) */
 #define PCI_ERR_ROOT_COMMAND 0x2c /* Root Error Command */
 #define PCI_ERR_ROOT_CMD_COR_EN 0x00000001 /* Correctable Err Reporting Enable */
@@ -1013,7 +1016,7 @@
 /* Resizable BARs */
 #define PCI_REBAR_CAP 4 /* capability register */
-#define PCI_REBAR_CAP_SIZES 0x00FFFFF0 /* supported BAR sizes */
+#define PCI_REBAR_CAP_SIZES 0xFFFFFFF0 /* supported BAR sizes */
 #define PCI_REBAR_CTRL 8 /* control register */
 #define PCI_REBAR_CTRL_BAR_IDX 0x00000007 /* BAR index */
 #define PCI_REBAR_CTRL_NBAR_MASK 0x000000E0 /* # of resizable BARs */
@@ -1061,8 +1064,9 @@
 #define PCI_EXP_DPC_CAP_RP_EXT 0x0020 /* Root Port Extensions */
 #define PCI_EXP_DPC_CAP_POISONED_TLP 0x0040 /* Poisoned TLP Egress Blocking Supported */
 #define PCI_EXP_DPC_CAP_SW_TRIGGER 0x0080 /* Software Triggering Supported */
-#define PCI_EXP_DPC_RP_PIO_LOG_SIZE 0x0F00 /* RP PIO Log Size */
+#define PCI_EXP_DPC_RP_PIO_LOG_SIZE 0x0F00 /* RP PIO Log Size [3:0] */
 #define PCI_EXP_DPC_CAP_DL_ACTIVE 0x1000 /* ERR_COR signal on DL_Active supported */
+#define PCI_EXP_DPC_RP_PIO_LOG_SIZE4 0x2000 /* RP PIO Log Size [4] */
 #define PCI_EXP_DPC_CTL 0x06 /* DPC control */
 #define PCI_EXP_DPC_CTL_EN_FATAL 0x0001 /* Enable trigger on ERR_FATAL message */
@@ -1205,9 +1209,12 @@
 #define PCI_DOE_DATA_OBJECT_DISC_REQ_3_INDEX 0x000000ff
 #define PCI_DOE_DATA_OBJECT_DISC_REQ_3_VER 0x0000ff00
 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_VID 0x0000ffff
-#define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL 0x00ff0000
+#define PCI_DOE_DATA_OBJECT_DISC_RSP_3_TYPE 0x00ff0000
 #define PCI_DOE_DATA_OBJECT_DISC_RSP_3_NEXT_INDEX 0xff000000
+/* Deprecated old name, replaced with PCI_DOE_DATA_OBJECT_DISC_RSP_3_TYPE */
+#define PCI_DOE_DATA_OBJECT_DISC_RSP_3_PROTOCOL PCI_DOE_DATA_OBJECT_DISC_RSP_3_TYPE
 /* Compute Express Link (CXL r3.1, sec 8.1.5) */
 #define PCI_DVSEC_CXL_PORT 3
 #define PCI_DVSEC_CXL_PORT_CTL 0x0c


@@ -327,6 +327,19 @@ struct virtio_net_rss_config {
     uint8_t hash_key_data[/* hash_key_length */];
 };
+
+struct virtio_net_rss_config_hdr {
+    uint32_t hash_types;
+    uint16_t indirection_table_mask;
+    uint16_t unclassified_queue;
+    uint16_t indirection_table[/* 1 + indirection_table_mask */];
+};
+
+struct virtio_net_rss_config_trailer {
+    uint16_t max_tx_vq;
+    uint8_t hash_key_length;
+    uint8_t hash_key_data[/* hash_key_length */];
+};
 #define VIRTIO_NET_CTRL_MQ_RSS_CONFIG 1
 /*


@@ -25,7 +25,7 @@ struct virtio_snd_config {
     uint32_t streams;
     /* # of available channel maps */
     uint32_t chmaps;
-    /* # of available control elements */
+    /* # of available control elements (if VIRTIO_SND_F_CTLS) */
     uint32_t controls;
 };


@@ -105,6 +105,7 @@ struct kvm_regs {
 #define KVM_ARM_VCPU_PTRAUTH_ADDRESS 5 /* VCPU uses address authentication */
 #define KVM_ARM_VCPU_PTRAUTH_GENERIC 6 /* VCPU uses generic authentication */
 #define KVM_ARM_VCPU_HAS_EL2 7 /* Support nested virtualization */
+#define KVM_ARM_VCPU_HAS_EL2_E2H0 8 /* Limit NV support to E2H RES0 */
 struct kvm_vcpu_init {
     __u32 target;
@@ -365,6 +366,7 @@ enum {
     KVM_REG_ARM_STD_HYP_BIT_PV_TIME = 0,
 };
+/* Vendor hyper call function numbers 0-63 */
 #define KVM_REG_ARM_VENDOR_HYP_BMAP KVM_REG_ARM_FW_FEAT_BMAP_REG(2)
 enum {
@@ -372,6 +374,14 @@ enum {
     KVM_REG_ARM_VENDOR_HYP_BIT_PTP = 1,
 };
+
+/* Vendor hyper call function numbers 64-127 */
+#define KVM_REG_ARM_VENDOR_HYP_BMAP_2 KVM_REG_ARM_FW_FEAT_BMAP_REG(3)
+
+enum {
+    KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_VER = 0,
+    KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_CPUS = 1,
+};
 /* Device Control API on vm fd */
 #define KVM_ARM_VM_SMCCC_CTRL 0
 #define KVM_ARM_VM_SMCCC_FILTER 0
@@ -394,6 +404,7 @@ enum {
 #define KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS 6
 #define KVM_DEV_ARM_VGIC_GRP_LEVEL_INFO 7
 #define KVM_DEV_ARM_VGIC_GRP_ITS_REGS 8
+#define KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ 9
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT 10
 #define KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_MASK \
     (0x3fffffULL << KVM_DEV_ARM_VGIC_LINE_LEVEL_INFO_SHIFT)


@@ -323,6 +323,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_64_H */


@@ -85,6 +85,7 @@
 /* compatibility flags */
 #define MAP_FILE 0
+#define PKEY_UNRESTRICTED 0x0
 #define PKEY_DISABLE_ACCESS 0x1
 #define PKEY_DISABLE_WRITE 0x2
 #define PKEY_ACCESS_MASK (PKEY_DISABLE_ACCESS |\


@@ -849,9 +849,11 @@ __SYSCALL(__NR_getxattrat, sys_getxattrat)
 __SYSCALL(__NR_listxattrat, sys_listxattrat)
 #define __NR_removexattrat 466
 __SYSCALL(__NR_removexattrat, sys_removexattrat)
+#define __NR_open_tree_attr 467
+__SYSCALL(__NR_open_tree_attr, sys_open_tree_attr)
 #undef __NR_syscalls
-#define __NR_syscalls 467
+#define __NR_syscalls 468
 /*
  * 32 bit systems traditionally used different

@@ -319,6 +319,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_64_H */


@@ -395,5 +395,6 @@
 #define __NR_getxattrat (__NR_Linux + 464)
 #define __NR_listxattrat (__NR_Linux + 465)
 #define __NR_removexattrat (__NR_Linux + 466)
+#define __NR_open_tree_attr (__NR_Linux + 467)
 #endif /* _ASM_UNISTD_N32_H */


@@ -371,5 +371,6 @@
 #define __NR_getxattrat (__NR_Linux + 464)
 #define __NR_listxattrat (__NR_Linux + 465)
 #define __NR_removexattrat (__NR_Linux + 466)
+#define __NR_open_tree_attr (__NR_Linux + 467)
 #endif /* _ASM_UNISTD_N64_H */


@@ -441,5 +441,6 @@
 #define __NR_getxattrat (__NR_Linux + 464)
 #define __NR_listxattrat (__NR_Linux + 465)
 #define __NR_removexattrat (__NR_Linux + 466)
+#define __NR_open_tree_attr (__NR_Linux + 467)
 #endif /* _ASM_UNISTD_O32_H */


@@ -448,6 +448,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_32_H */


@@ -420,6 +420,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_64_H */


@@ -182,6 +182,8 @@ enum KVM_RISCV_ISA_EXT_ID {
     KVM_RISCV_ISA_EXT_SVVPTC,
     KVM_RISCV_ISA_EXT_ZABHA,
     KVM_RISCV_ISA_EXT_ZICCRSE,
+    KVM_RISCV_ISA_EXT_ZAAMO,
+    KVM_RISCV_ISA_EXT_ZALRSC,
     KVM_RISCV_ISA_EXT_MAX,
 };


@@ -314,6 +314,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_32_H */


@@ -324,6 +324,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_64_H */


@@ -439,5 +439,6 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_S390_UNISTD_32_H */


@@ -387,5 +387,6 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_S390_UNISTD_64_H */


@@ -557,6 +557,9 @@ struct kvm_x86_mce {
 #define KVM_XEN_HVM_CONFIG_PVCLOCK_TSC_UNSTABLE (1 << 7)
 #define KVM_XEN_HVM_CONFIG_SHARED_INFO_HVA (1 << 8)
+
+#define KVM_XEN_MSR_MIN_INDEX 0x40000000u
+#define KVM_XEN_MSR_MAX_INDEX 0x4fffffffu
 struct kvm_xen_hvm_config {
     __u32 flags;
     __u32 msr;


@@ -457,6 +457,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_32_H */


@@ -380,6 +380,7 @@
 #define __NR_getxattrat 464
 #define __NR_listxattrat 465
 #define __NR_removexattrat 466
+#define __NR_open_tree_attr 467
 #endif /* _ASM_UNISTD_64_H */


@@ -333,6 +333,7 @@
 #define __NR_getxattrat (__X32_SYSCALL_BIT + 464)
 #define __NR_listxattrat (__X32_SYSCALL_BIT + 465)
 #define __NR_removexattrat (__X32_SYSCALL_BIT + 466)
+#define __NR_open_tree_attr (__X32_SYSCALL_BIT + 467)
 #define __NR_rt_sigaction (__X32_SYSCALL_BIT + 512)
 #define __NR_rt_sigreturn (__X32_SYSCALL_BIT + 513)
 #define __NR_ioctl (__X32_SYSCALL_BIT + 514)


@@ -4,13 +4,9 @@
 #ifndef _LINUX_BITS_H
 #define _LINUX_BITS_H
-#define __GENMASK(h, l) \
-    (((~_UL(0)) - (_UL(1) << (l)) + 1) & \
-     (~_UL(0) >> (__BITS_PER_LONG - 1 - (h))))
-#define __GENMASK_ULL(h, l) \
-    (((~_ULL(0)) - (_ULL(1) << (l)) + 1) & \
-     (~_ULL(0) >> (__BITS_PER_LONG_LONG - 1 - (h))))
+#define __GENMASK(h, l) (((~_UL(0)) << (l)) & (~_UL(0) >> (BITS_PER_LONG - 1 - (h))))
+#define __GENMASK_ULL(h, l) (((~_ULL(0)) << (l)) & (~_ULL(0) >> (BITS_PER_LONG_LONG - 1 - (h))))
 #define __GENMASK_U128(h, l) \
     ((_BIT128((h)) << 1) - (_BIT128(l)))


@@ -33,7 +33,7 @@
  * Missing __asm__ support
  *
  * __BIT128() would not work in the __asm__ code, as it shifts an
- * 'unsigned __init128' data type as direct representation of
+ * 'unsigned __int128' data type as direct representation of
  * 128 bit constants is not supported in the gcc compiler, as
  * they get silently truncated.
  *


@@ -55,6 +55,7 @@ enum {
     IOMMUFD_CMD_VIOMMU_ALLOC = 0x90,
     IOMMUFD_CMD_VDEVICE_ALLOC = 0x91,
     IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92,
+    IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93,
 };
 /**
@@ -392,6 +393,9 @@ struct iommu_vfio_ioas {
  *        Any domain attached to the non-PASID part of the
  *        device must also be flagged, otherwise attaching a
  *        PASID will blocked.
+ *        For the user that wants to attach PASID, ioas is
+ *        not recommended for both the non-PASID part
+ *        and PASID part of the device.
  *        If IOMMU does not support PASID it will return
  *        error (-EOPNOTSUPP).
  */
@@ -608,9 +612,17 @@ enum iommu_hw_info_type {
  *        IOMMU_HWPT_GET_DIRTY_BITMAP
  *        IOMMU_HWPT_SET_DIRTY_TRACKING
  *
+ * @IOMMU_HW_CAP_PCI_PASID_EXEC: Execute Permission Supported, user ignores it
+ *                               when the struct
+ *                               iommu_hw_info::out_max_pasid_log2 is zero.
+ * @IOMMU_HW_CAP_PCI_PASID_PRIV: Privileged Mode Supported, user ignores it
+ *                               when the struct
+ *                               iommu_hw_info::out_max_pasid_log2 is zero.
  */
 enum iommufd_hw_capabilities {
     IOMMU_HW_CAP_DIRTY_TRACKING = 1 << 0,
+    IOMMU_HW_CAP_PCI_PASID_EXEC = 1 << 1,
+    IOMMU_HW_CAP_PCI_PASID_PRIV = 1 << 2,
 };
 /**
@@ -626,6 +638,9 @@ enum iommufd_hw_capabilities {
  *        iommu_hw_info_type.
  * @out_capabilities: Output the generic iommu capability info type as defined
  *        in the enum iommu_hw_capabilities.
+ * @out_max_pasid_log2: Output the width of PASIDs. 0 means no PASID support.
+ *        PCI devices turn to out_capabilities to check if the
+ *        specific capabilities is supported or not.
  * @__reserved: Must be 0
  *
  * Query an iommu type specific hardware information data from an iommu behind
@@ -649,7 +664,8 @@ struct iommu_hw_info {
     __u32 data_len;
     __aligned_u64 data_uptr;
     __u32 out_data_type;
-    __u32 __reserved;
+    __u8 out_max_pasid_log2;
+    __u8 __reserved[3];
     __aligned_u64 out_capabilities;
 };
 #define IOMMU_GET_HW_INFO _IO(IOMMUFD_TYPE, IOMMUFD_CMD_GET_HW_INFO)
@@ -1014,4 +1030,115 @@ struct iommu_ioas_change_process {
 #define IOMMU_IOAS_CHANGE_PROCESS \
     _IO(IOMMUFD_TYPE, IOMMUFD_CMD_IOAS_CHANGE_PROCESS)
+
+/**
+ * enum iommu_veventq_flag - flag for struct iommufd_vevent_header
+ * @IOMMU_VEVENTQ_FLAG_LOST_EVENTS: vEVENTQ has lost vEVENTs
+ */
+enum iommu_veventq_flag {
+    IOMMU_VEVENTQ_FLAG_LOST_EVENTS = (1U << 0),
+};
+
+/**
+ * struct iommufd_vevent_header - Virtual Event Header for a vEVENTQ Status
+ * @flags: Combination of enum iommu_veventq_flag
+ * @sequence: The sequence index of a vEVENT in the vEVENTQ, with a range of
+ *            [0, INT_MAX] where the following index of INT_MAX is 0
+ *
+ * Each iommufd_vevent_header reports a sequence index of the following vEVENT:
+ *
+ * +----------------------+-------+----------------------+-------+---+-------+
+ * | header0 {sequence=0} | data0 | header1 {sequence=1} | data1 |...| dataN |
+ * +----------------------+-------+----------------------+-------+---+-------+
+ *
+ * And this sequence index is expected to be monotonic to the sequence index of
+ * the previous vEVENT. If two adjacent sequence indexes has a delta larger than
+ * 1, it means that delta - 1 number of vEVENTs has lost, e.g. two lost vEVENTs:
+ *
+ * +-----+----------------------+-------+----------------------+-------+-----+
+ * | ... | header3 {sequence=3} | data3 | header6 {sequence=6} | data6 | ... |
+ * +-----+----------------------+-------+----------------------+-------+-----+
+ *
+ * If a vEVENT lost at the tail of the vEVENTQ and there is no following vEVENT
+ * providing the next sequence index, an IOMMU_VEVENTQ_FLAG_LOST_EVENTS header
+ * would be added to the tail, and no data would follow this header:
+ *
+ * +--+----------------------+-------+-----------------------------------------+
+ * |..| header3 {sequence=3} | data3 | header4 {flags=LOST_EVENTS, sequence=4} |
+ * +--+----------------------+-------+-----------------------------------------+
+ */
+struct iommufd_vevent_header {
+    __u32 flags;
+    __u32 sequence;
+};
+
+/**
+ * enum iommu_veventq_type - Virtual Event Queue Type
+ * @IOMMU_VEVENTQ_TYPE_DEFAULT: Reserved for future use
+ * @IOMMU_VEVENTQ_TYPE_ARM_SMMUV3: ARM SMMUv3 Virtual Event Queue
+ */
+enum iommu_veventq_type {
+    IOMMU_VEVENTQ_TYPE_DEFAULT = 0,
+    IOMMU_VEVENTQ_TYPE_ARM_SMMUV3 = 1,
+};
+
+/**
+ * struct iommu_vevent_arm_smmuv3 - ARM SMMUv3 Virtual Event
+ *                                  (IOMMU_VEVENTQ_TYPE_ARM_SMMUV3)
+ * @evt: 256-bit ARM SMMUv3 Event record, little-endian.
+ *       Reported event records: (Refer to "7.3 Event records" in SMMUv3 HW Spec)
+ *       - 0x04 C_BAD_STE
+ *       - 0x06 F_STREAM_DISABLED
+ *       - 0x08 C_BAD_SUBSTREAMID
+ *       - 0x0a C_BAD_CD
+ *       - 0x10 F_TRANSLATION
+ *       - 0x11 F_ADDR_SIZE
+ *       - 0x12 F_ACCESS
+ *       - 0x13 F_PERMISSION
+ *
+ * StreamID field reports a virtual device ID. To receive a virtual event for a
+ * device, a vDEVICE must be allocated via IOMMU_VDEVICE_ALLOC.
+ */
+struct iommu_vevent_arm_smmuv3 {
+    __aligned_le64 evt[4];
+};
+
+/**
+ * struct iommu_veventq_alloc - ioctl(IOMMU_VEVENTQ_ALLOC)
+ * @size: sizeof(struct iommu_veventq_alloc)
+ * @flags: Must be 0
+ * @viommu_id: virtual IOMMU ID to associate the vEVENTQ with
+ * @type: Type of the vEVENTQ. Must be defined in enum iommu_veventq_type
+ * @veventq_depth: Maximum number of events in the vEVENTQ
+ * @out_veventq_id: The ID of the new vEVENTQ
+ * @out_veventq_fd: The fd of the new vEVENTQ. User space must close the
+ *                  successfully returned fd after using it
+ * @__reserved: Must be 0
+ *
+ * Explicitly allocate a virtual event queue interface for a vIOMMU. A vIOMMU
+ * can have multiple FDs for different types, but is confined to one per @type.
+ * User space should open the @out_veventq_fd to read vEVENTs out of a vEVENTQ,
+ * if there are vEVENTs available. A vEVENTQ will lose events due to overflow,
+ * if the number of the vEVENTs hits @veventq_depth.
+ *
+ * Each vEVENT in a vEVENTQ encloses a struct iommufd_vevent_header followed by
+ * a type-specific data structure, in a normal case:
+ *
+ * +-+---------+-------+---------+-------+-----+---------+-------+-+
+ * | | header0 | data0 | header1 | data1 | ... | headerN | dataN | |
+ * +-+---------+-------+---------+-------+-----+---------+-------+-+
+ *
+ * unless a tailing IOMMU_VEVENTQ_FLAG_LOST_EVENTS header is logged (refer to
+ * struct iommufd_vevent_header).
+ */
+struct iommu_veventq_alloc {
+    __u32 size;
+    __u32 flags;
+    __u32 viommu_id;
+    __u32 type;
+    __u32 veventq_depth;
+    __u32 out_veventq_id;
+    __u32 out_veventq_fd;
+    __u32 __reserved;
+};
+#define IOMMU_VEVENTQ_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VEVENTQ_ALLOC)
 #endif


@@ -921,6 +921,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_PRE_FAULT_MEMORY 236
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_ARM_WRITABLE_IMP_ID_REGS 239
 struct kvm_irq_routing_irqchip {
     __u32 irqchip;


@@ -73,13 +73,20 @@ typedef enum {
     SEV_RET_INVALID_PARAM,
     SEV_RET_RESOURCE_LIMIT,
     SEV_RET_SECURE_DATA_INVALID,
-    SEV_RET_INVALID_KEY = 0x27,
-    SEV_RET_INVALID_PAGE_SIZE,
-    SEV_RET_INVALID_PAGE_STATE,
-    SEV_RET_INVALID_MDATA_ENTRY,
-    SEV_RET_INVALID_PAGE_OWNER,
-    SEV_RET_INVALID_PAGE_AEAD_OFLOW,
-    SEV_RET_RMP_INIT_REQUIRED,
+    SEV_RET_INVALID_PAGE_SIZE = 0x0019,
+    SEV_RET_INVALID_PAGE_STATE = 0x001A,
+    SEV_RET_INVALID_MDATA_ENTRY = 0x001B,
+    SEV_RET_INVALID_PAGE_OWNER = 0x001C,
+    SEV_RET_AEAD_OFLOW = 0x001D,
+    SEV_RET_EXIT_RING_BUFFER = 0x001F,
+    SEV_RET_RMP_INIT_REQUIRED = 0x0020,
+    SEV_RET_BAD_SVN = 0x0021,
+    SEV_RET_BAD_VERSION = 0x0022,
+    SEV_RET_SHUTDOWN_REQUIRED = 0x0023,
+    SEV_RET_UPDATE_FAILED = 0x0024,
+    SEV_RET_RESTORE_REQUIRED = 0x0025,
+    SEV_RET_RMP_INITIALIZATION_FAILED = 0x0026,
+    SEV_RET_INVALID_KEY = 0x0027,
     SEV_RET_MAX,
 } sev_ret_code;


@@ -70,4 +70,6 @@
 #define __counted_by_be(m)
 #endif
+
+#define __kernel_nonstring
 #endif /* _LINUX_STDDEF_H */


@@ -671,6 +671,7 @@ enum {
  */
 enum {
     VFIO_AP_REQ_IRQ_INDEX,
+    VFIO_AP_CFG_CHG_IRQ_INDEX,
     VFIO_AP_NUM_IRQS
 };
@@ -931,29 +932,34 @@ struct vfio_device_bind_iommufd {
  * VFIO_DEVICE_ATTACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 19,
  *                                      struct vfio_device_attach_iommufd_pt)
  * @argsz: User filled size of this data.
- * @flags: Must be 0.
+ * @flags: Flags for attach.
  * @pt_id: Input the target id which can represent an ioas or a hwpt
  *         allocated via iommufd subsystem.
  *         Output the input ioas id or the attached hwpt id which could
  *         be the specified hwpt itself or a hwpt automatically created
  *         for the specified ioas by kernel during the attachment.
+ * @pasid: The pasid to be attached, only meaningful when
+ *         VFIO_DEVICE_ATTACH_PASID is set in @flags
  *
  * Associate the device with an address space within the bound iommufd.
  * Undo by VFIO_DEVICE_DETACH_IOMMUFD_PT or device fd close. This is only
  * allowed on cdev fds.
  *
- * If a vfio device is currently attached to a valid hw_pagetable, without doing
- * a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl
- * passing in another hw_pagetable (hwpt) id is allowed. This action, also known
- * as a hw_pagetable replacement, will replace the device's currently attached
- * hw_pagetable with a new hw_pagetable corresponding to the given pt_id.
+ * If a vfio device or a pasid of this device is currently attached to a valid
+ * hw_pagetable (hwpt), without doing a VFIO_DEVICE_DETACH_IOMMUFD_PT, a second
+ * VFIO_DEVICE_ATTACH_IOMMUFD_PT ioctl passing in another hwpt id is allowed.
+ * This action, also known as a hw_pagetable replacement, will replace the
+ * currently attached hwpt of the device or the pasid of this device with a new
+ * hwpt corresponding to the given pt_id.
  *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_device_attach_iommufd_pt {
     __u32 argsz;
     __u32 flags;
+#define VFIO_DEVICE_ATTACH_PASID (1 << 0)
     __u32 pt_id;
+    __u32 pasid;
 };
 #define VFIO_DEVICE_ATTACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 19)
@@ -962,17 +968,21 @@ struct vfio_device_attach_iommufd_pt {
  * VFIO_DEVICE_DETACH_IOMMUFD_PT - _IOW(VFIO_TYPE, VFIO_BASE + 20,
  *                                      struct vfio_device_detach_iommufd_pt)
  * @argsz: User filled size of this data.
- * @flags: Must be 0.
+ * @flags: Flags for detach.
+ * @pasid: The pasid to be detached, only meaningful when
+ *         VFIO_DEVICE_DETACH_PASID is set in @flags
  *
- * Remove the association of the device and its current associated address
- * space. After it, the device should be in a blocking DMA state. This is only
- * allowed on cdev fds.
+ * Remove the association of the device or a pasid of the device and its current
+ * associated address space. After it, the device or the pasid should be in a
+ * blocking DMA state. This is only allowed on cdev fds.
  *
  * Return: 0 on success, -errno on failure.
  */
 struct vfio_device_detach_iommufd_pt {
     __u32 argsz;
     __u32 flags;
+#define VFIO_DEVICE_DETACH_PASID (1 << 0)
+    __u32 pasid;
 };
 #define VFIO_DEVICE_DETACH_IOMMUFD_PT _IO(VFIO_TYPE, VFIO_BASE + 20)


@@ -177,7 +177,7 @@ EOF
     # Remove everything except the macros from bootparam.h avoiding the
     # unnecessary import of several video/ist/etc headers
-    sed -e '/__ASSEMBLY__/,/__ASSEMBLY__/d' \
+    sed -e '/__ASSEMBLER__/,/__ASSEMBLER__/d' \
         "$hdrdir/include/asm/bootparam.h" > "$hdrdir/bootparam.h"
     cp_portable "$hdrdir/bootparam.h" \
                 "$output/include/standard-headers/asm-$arch"