AioContext: speed up aio_notify

In many cases, the call to event_notifier_set in aio_notify is unnecessary.
In particular, if we are executing aio_dispatch, or if aio_poll is not
blocking, we know that we will soon get to the next loop iteration (if
necessary); the thread that hosts the AioContext's event loop does not
need any nudging.

The patch includes a Promela formal model that shows that this really
works and does not need any further complication such as generation
counts. It needs a memory barrier though.

The generation counts are not needed because any change to
ctx->dispatching after the memory barrier is okay for aio_notify.
If it changes from zero to one, it is the right thing to skip
event_notifier_set. If it changes from one to zero, the
event_notifier_set is unnecessary but harmless.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2025-12-11 16:00:50 -07:00 · 2014-07-07 15:18:04 +02:00 · 2014-07-07 15:18:04 +02:00 · 0ceb849bd3
commit 0ceb849bd3
parent ef508f427b
4 changed files with 164 additions and 2 deletions
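
The commit message describes the fast path of aio_notify, which is not part of the aio-posix.c hunks shown here. The following is a minimal sketch of that idea, not the patch's actual code. It assumes C11 atomics as a stand-in for the memory barrier, and every name in it (SketchCtx, sketch_init, sketch_set_dispatching, sketch_notify, the work_pending and wakeups fields) is hypothetical. A counter replaces the real event_notifier_set call.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for AioContext; only what the sketch needs. */
typedef struct {
    atomic_bool dispatching;   /* true while the loop will re-check work soon */
    atomic_bool work_pending;  /* stands in for state such as a scheduled bottom half */
    atomic_int wakeups;        /* counts simulated event_notifier_set() calls */
} SketchCtx;

static void sketch_init(SketchCtx *ctx)
{
    atomic_init(&ctx->dispatching, false);
    atomic_init(&ctx->work_pending, false);
    atomic_init(&ctx->wakeups, 0);
}

/* Called by the thread that runs the event loop. */
static void sketch_set_dispatching(SketchCtx *ctx, bool dispatching)
{
    atomic_store_explicit(&ctx->dispatching, dispatching,
                          memory_order_relaxed);
    if (!dispatching) {
        /* Entering the phase where notifiers must wake us up: order the
         * write to dispatching before any later read of pending work. */
        atomic_thread_fence(memory_order_seq_cst);
    }
}

/* Called by any thread that wants the event loop to notice new work. */
static void sketch_notify(SketchCtx *ctx)
{
    /* Publish the work, then order that write before reading dispatching. */
    atomic_store_explicit(&ctx->work_pending, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);

    if (!atomic_load_explicit(&ctx->dispatching, memory_order_relaxed)) {
        /* The loop may be about to block: the expensive wakeup goes here. */
        atomic_fetch_add(&ctx->wakeups, 1);
    }
    /* Otherwise the loop re-evaluates everything soon; skip the wakeup. */
}
```

With the flag set, sketch_notify leaves the wakeup counter unchanged; after sketch_set_dispatching(ctx, false), the next sketch_notify increments it. The two fences pair up so that either the notifier sees dispatching as false and issues the wakeup, or the loop thread sees the pending work before it blocks. A flag change that races with the notifier is covered by the argument in the commit message: zero to one makes skipping correct, and one to zero makes the wakeup unnecessary but harmless.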
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -175,11 +175,38 @@ static bool aio_dispatch(AioContext *ctx)
 bool aio_poll(AioContext *ctx, bool blocking)
 {
    AioHandler *node;
+    bool was_dispatching;
    int ret;
    bool progress;

+    was_dispatching = ctx->dispatching;
    progress = false;

+    /* aio_notify can avoid the expensive event_notifier_set if
+     * everything (file descriptors, bottom halves, timers) will
+     * be re-evaluated before the next blocking poll().  This happens
+     * in two cases:
+     *
+     * 1) when aio_poll is called with blocking == false
+     *
+     * 2) when we are called after poll().  If we are called before
+     *    poll(), bottom halves will not be re-evaluated and we need
+     *    aio_notify() if blocking == true.
+     *
+     * The first aio_dispatch() only does something when AioContext is
+     * running as a GSource, and in that case aio_poll is used only
+     * with blocking == false, so this optimization is already quite
+     * effective.  However, the code is ugly and should be restructured
+     * to have a single aio_dispatch() call.  To do this, we need to
+     * reorganize aio_poll into a prepare/poll/dispatch model like
+     * glib's.
+     *
+     * If we're in a nested event loop, ctx->dispatching might be true.
+     * In that case we can restore it just before returning, but we
+     * have to clear it now.
+     */
+    aio_set_dispatching(ctx, !blocking);
+
    /*
     * If there are callbacks left that have been queued, we need to call them.
     * Do not call select in this case, because it is possible that the caller
@@ -190,12 +217,14 @@ bool aio_poll(AioContext *ctx, bool blocking)
        progress = true;
    }

+    /* Re-evaluate condition (1) above.  */
+    aio_set_dispatching(ctx, !blocking);
    if (aio_dispatch(ctx)) {
        progress = true;
    }

    if (progress && !blocking) {
-        return true;
+        goto out;
    }

    ctx->walking_handlers++;
@@ -234,9 +263,12 @@ bool aio_poll(AioContext *ctx, bool blocking)
    }

    /* Run dispatch even if there were no readable fds to run timers */
+    aio_set_dispatching(ctx, true);
    if (aio_dispatch(ctx)) {
        progress = true;
    }

+out:
+    aio_set_dispatching(ctx, was_dispatching);
    return progress;
 }
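
The save/clear/restore handling of ctx->dispatching in the hunks above can be shown in isolation. This is a simplified sketch under stated assumptions, not the patch's code: it uses a plain bool with no memory barrier, replaces the poll() call with a field that records the flag value at the wait point, and all names (NestedSketchCtx, seen_during_wait, nested_sketch_poll) are hypothetical.

```c
#include <stdbool.h>

typedef struct {
    bool dispatching;       /* mirrors ctx->dispatching from the patch */
    bool seen_during_wait;  /* hypothetical: flag value at the wait point */
} NestedSketchCtx;

static bool nested_sketch_poll(NestedSketchCtx *ctx, bool blocking)
{
    /* In a nested event loop the caller may already be dispatching. */
    bool was_dispatching = ctx->dispatching;

    /* Before the wait, a notifier may skip the wakeup only if we are not
     * going to block, so the flag is cleared for a blocking call. */
    ctx->dispatching = !blocking;

    /* A real loop would call poll() here; the sketch records the flag. */
    ctx->seen_during_wait = ctx->dispatching;

    /* After the wait everything is re-evaluated, so no wakeup is needed. */
    ctx->dispatching = true;
    /* ... handlers, bottom halves and timers would run here ... */

    /* Restore the caller's value so an outer dispatch phase stays marked. */
    ctx->dispatching = was_dispatching;
    return false;  /* the sketch never reports progress */
}
```

A blocking call made from inside an outer dispatch phase (dispatching already true) records false at the wait point and leaves the flag true on return, which is the nested-loop case the comment in aio_poll describes. A non-blocking call records true at the wait point, matching case 1 of that comment.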