add non-arbitrary migration stop condition

Currently, we enter migration's stage 3 when a threshold of 10 pages remains to be transferred in the system.

This has hurt some users. However, any proposed threshold is arbitrary by nature, and would only shift the annoyance.

The proposal of this patch is to define a max_downtime variable, which represents the maximum downtime a migration user is willing to suffer. Then, based on the bandwidth of the last iteration, we calculate how much data we can transfer in such a window of time.

Whenever the remaining data reaches that value (or lower), we know it is safe to enter stage 3.

This has largely improved the situation for me. On localhost migrations, where one would expect things to go as quickly as me running away from the duty of writing software for Windows, a kernel compile was enough to get the migration stuck.

It takes 20 ~ 30 iterations now.

Signed-off-by: Glauber Costa <glommer@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2025-07-27 20:33:54 -06:00 · 2009-05-28 15:22:57 -04:00 · 2009-05-28 15:22:57 -04:00 · a0a3fd60f6
commit a0a3fd60f6
parent 8c14c17395
3 changed files with 30 additions and 2 deletions
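The stop condition described in the commit message can be sketched as follows. This is a minimal illustration, not code from the patch: the helper names `downtime_budget_bytes` and `can_enter_stage3` and their parameters are hypothetical, and the only value taken from the patch is the 30000000 ns default for max_downtime.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the patch): number of bytes that can be
 * sent within max_downtime_ns at the bandwidth observed during the last
 * iteration, i.e. bytes_sent / elapsed_ns bytes per nanosecond. */
static uint64_t downtime_budget_bytes(uint64_t bytes_sent, uint64_t elapsed_ns,
                                      uint64_t max_downtime_ns)
{
    if (elapsed_ns == 0) {
        /* No bandwidth measurement yet: report a zero budget so the
         * caller keeps iterating. */
        return 0;
    }
    /* Multiply before dividing, in floating point, to avoid both integer
     * overflow and early truncation of the bandwidth. */
    double budget = (double)bytes_sent * (double)max_downtime_ns
                    / (double)elapsed_ns;
    return (uint64_t)budget;
}

/* Hypothetical helper: returns 1 when the data still to be transferred
 * fits in the downtime budget, meaning stage 3 can be entered. */
static int can_enter_stage3(uint64_t remaining_bytes, uint64_t bytes_sent,
                            uint64_t elapsed_ns, uint64_t max_downtime_ns)
{
    return remaining_bytes <= downtime_budget_bytes(bytes_sent, elapsed_ns,
                                                    max_downtime_ns);
}
```

As a worked example under assumed numbers: if the last iteration sent 100,000,000 bytes in one second and max_downtime is the default 30000000 ns (30 ms), the budget is 3,000,000 bytes, so stage 3 would be entered once 3,000,000 bytes or fewer remain.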
--- a/migration.c
+++ b/migration.c
@@ -107,6 +107,17 @@ void do_migrate_set_speed(Monitor *mon, const char *value)
    
 }

+/* Maximum downtime, in nanoseconds, that we are willing to tolerate
+ * during migration. Nanoseconds are used because that is the finest
+ * resolution get_clock() can achieve. This is an internal measure; all
+ * user-visible units must be in seconds. */
+static uint64_t max_downtime = 30000000;
+
+uint64_t migrate_max_downtime(void)
+{
+    return max_downtime;
+}
+
 void do_info_migrate(Monitor *mon)
 {
    MigrationState *s = current_migration;