frame: Handle reparent failure avoiding duplicate stack windows instances

In some cases, reparenting a window with its frame may fail; this seems
to happen especially during initialization of a window that may be
unmapped and re-mapped quickly and multiple times.

If this happens, we never receive a remove event on the stack tracker,
so we may end up adding the window twice to the list of windows to
synchronize with the compositor. This breaks the compositor's assumption
that the stack list is unique and eventually leads to a crash, because
we do not remove all the instances of a window on its destruction.

In particular we may end up in this situation:

  Syncing Window 10485927: 0x555558863540 (actual xid is 10485927),
    user time is 10485928 frame is 0x5555588715c0, frame xid 6291591
  Syncing Window 14680081: 0x5555588664b0 (actual xid is 14680081),
    user time is 14680082 frame is 0x555558871d80, frame xid 6291595
  Syncing Window 6291460: 0x55555796dc80 (actual xid is 10485763),
    user time is 10485764 frame is 0x555557a6f630, frame xid 6291460
  Syncing Window 6291465: 0x555557a68af0 (actual xid is 14680067),
    user time is 14680068 frame is 0x555557a73e80, frame xid 6291465
  Syncing Window 6291509: 0x555557f9d830 (actual xid is 8388623),
    user time is 0 frame is 0x555557fac780, frame xid 6291509
  Syncing Window 6291586: 0x5555586e1690 (actual xid is 4194363),
    user time is 0 frame is 0x55555886e550, frame xid 6291586
  Syncing Window 6291591: 0x555558863540 (actual xid is 10485927),
    user time is 10485928 frame is 0x5555588715c0, frame xid 6291591

Here the same meta window 0x555558863540 is added twice because it is
mapped both by the window itself (10485927) and by its frame (6291591).

This happens because, for historical reasons, the xids hash table
managed by the x11-display maps the X11 windows, their frames and their
user time windows all to the same meta-window, so if we don't filter
them out properly we end up duplicating entries in the compositor list.
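
To make the aliasing concrete, here is a minimal sketch in plain C. It
is not mutter's code: `Win`, `XidEntry`, `lookup_xid` and
`build_unique_stack` are hypothetical names, and a linear array stands
in for the xids hash table. It only shows why resolving every XID of the
X stack through a table that maps several XIDs to one window needs a
uniqueness filter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only, not mutter's types. */
typedef unsigned long Xid;

typedef struct {
  Xid xwindow;    /* client window XID */
  Xid frame_xid;  /* frame window XID, 0 if the window has no frame */
} Win;

typedef struct {
  Xid  xid;
  Win *window;
} XidEntry;

/* Linear lookup standing in for the display's xids hash table. */
static Win *
lookup_xid (const XidEntry *map, size_t n_entries, Xid xid)
{
  for (size_t i = 0; i < n_entries; i++)
    if (map[i].xid == xid)
      return map[i].window;
  return NULL;
}

/* Resolve each XID of the X stack to a window and skip windows that
 * are already in `out`. Returns the number of windows written. */
static size_t
build_unique_stack (const XidEntry *map, size_t n_entries,
                    const Xid *stack, size_t n_stack, Win **out)
{
  size_t n_out = 0;

  for (size_t i = 0; i < n_stack; i++)
    {
      Win *window = lookup_xid (map, n_entries, stack[i]);
      int seen = 0;

      if (window == NULL)
        continue;

      for (size_t j = 0; j < n_out; j++)
        if (out[j] == window)
          seen = 1;

      if (!seen)
        out[n_out++] = window;
    }

  return n_out;
}

/* The client XID, the frame XID and the user time XID from the log
 * above all resolve to one window; the X stack lists two of them. */
static size_t
example_unique_stack_length (void)
{
  Win hexchat = { 10485927, 6291591 };
  XidEntry map[] = {
    { 10485927, &hexchat },  /* client window */
    { 6291591,  &hexchat },  /* frame */
    { 10485928, &hexchat },  /* user time window */
  };
  Xid stack[] = { 10485927, 6291591 };
  Win *out[2];

  return build_unique_stack (map, 3, stack, 2, out);
}
```

Without the `seen` check the same pointer would be written twice, which
is the situation shown in the "Syncing Window" log.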

Such duplicates eventually make mutter crash in
meta_compositor_sync_stack() because we may try to access an invalid
window: its actor has been destroyed, but not all of its instances have
been removed from the compositor windows list:

  0x00007ffff71059 in meta_compositor_sync_stack (compositor=0x555555b8,
  stack=0x555558701b80) at ../../mutter/src/compositor/compositor.c:773
  773	          if ((old_window->hidden || old_window->unmanaging) &&
  (gdb) print old_window
  $1 = (MetaWindow *) 0x0
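
The failure mode can be reproduced in miniature. The sketch below is an
assumption-laden model, not mutter's implementation: `Actor`,
`ActorList` and the helper functions are hypothetical, and it assumes
that destruction removes a single list entry for the actor. With a
duplicated entry, one stale reference to a disposed actor (whose window
is NULL) survives in the list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_ACTORS 8

/* Hypothetical model: `window` becomes NULL once the actor is disposed. */
typedef struct {
  void *window;
} Actor;

typedef struct {
  Actor *items[MAX_ACTORS];
  size_t len;
} ActorList;

static void
list_append (ActorList *list, Actor *actor)
{
  assert (list->len < MAX_ACTORS);
  list->items[list->len++] = actor;
}

/* Remove only the first matching entry. Returns 1 if one was removed. */
static int
list_remove_first (ActorList *list, Actor *actor)
{
  for (size_t i = 0; i < list->len; i++)
    if (list->items[i] == actor)
      {
        for (size_t j = i + 1; j < list->len; j++)
          list->items[j - 1] = list->items[j];
        list->len--;
        return 1;
      }
  return 0;
}

/* Count entries whose actor no longer has a window. */
static size_t
count_stale (const ActorList *list)
{
  size_t n_stale = 0;

  for (size_t i = 0; i < list->len; i++)
    if (list->items[i]->window == NULL)
      n_stale++;

  return n_stale;
}

static size_t
example_stale_entries_after_destroy (void)
{
  int window_storage = 0;
  Actor actor = { &window_storage };
  ActorList list = { { 0 }, 0 };

  list_append (&list, &actor);
  list_append (&list, &actor);  /* the duplicate entry */

  /* Destruction: the actor loses its window and is removed once. */
  actor.window = NULL;
  list_remove_first (&list, &actor);

  return count_stale (&list);
}
```

Any later walk over the list that dereferences the window of each entry
hits the stale one, which matches the NULL old_window in the trace.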

So, in order to prevent this, check that XReparentWindow does not fail
and, in case of failure, reset the window state to the one it had before
the attempt. More importantly, remove the association between the frame
X11 window and the MetaWindow, since it no longer holds. This way, at
the next stack synchronization there won't be any meta window associated
with that frame XID (unless further stack changes affect it).
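
The rollback described above can be sketched as follows. This is a
simplified model under stated assumptions, not the actual patch:
`XidMap`, `attach_frame` and `ReparentFn` are hypothetical names, the
array-backed map stands in for the xids hash table, and the callback
abstracts the reparent request together with its error detection (in
real Xlib code, XReparentWindow errors are reported asynchronously
through the X error handler, so they have to be trapped). The
`unmaps_pending` and `reparents_pending` counters mirror the fields
visible in the gdb dump of the window.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long Xid;

/* Hypothetical, reduced window state. */
typedef struct {
  Xid xwindow;
  Xid frame_xid;  /* 0 when the window has no frame */
  int unmaps_pending;
  int reparents_pending;
} Win;

#define MAX_XIDS 16

typedef struct {
  Xid    xids[MAX_XIDS];
  Win   *windows[MAX_XIDS];
  size_t len;
} XidMap;

static void
xid_map_insert (XidMap *map, Xid xid, Win *window)
{
  assert (map->len < MAX_XIDS);
  map->xids[map->len] = xid;
  map->windows[map->len] = window;
  map->len++;
}

static void
xid_map_remove (XidMap *map, Xid xid)
{
  for (size_t i = 0; i < map->len; i++)
    if (map->xids[i] == xid)
      {
        /* Move the last entry into the freed slot. */
        map->len--;
        map->xids[i] = map->xids[map->len];
        map->windows[i] = map->windows[map->len];
        return;
      }
}

static Win *
xid_map_lookup (const XidMap *map, Xid xid)
{
  for (size_t i = 0; i < map->len; i++)
    if (map->xids[i] == xid)
      return map->windows[i];
  return NULL;
}

/* Returns nonzero on success. Stand-in for the reparent request plus
 * whatever error trapping detects its failure. */
typedef int (*ReparentFn) (Xid child, Xid new_parent);

/* Associate a frame with the window; undo everything if reparenting fails. */
static int
attach_frame (XidMap *map, Win *window, Xid frame_xid, ReparentFn reparent)
{
  const int saved_unmaps = window->unmaps_pending;
  const int saved_reparents = window->reparents_pending;

  window->frame_xid = frame_xid;
  xid_map_insert (map, frame_xid, window);
  window->unmaps_pending++;
  window->reparents_pending++;

  if (!reparent (window->xwindow, frame_xid))
    {
      /* Restore the previous state and drop the frame association. */
      window->unmaps_pending = saved_unmaps;
      window->reparents_pending = saved_reparents;
      window->frame_xid = 0;
      xid_map_remove (map, frame_xid);
      return 0;
    }

  return 1;
}

static int
reparent_always_fails (Xid child, Xid new_parent)
{
  (void) child;
  (void) new_parent;
  return 0;
}

/* Returns 1 when a failed reparent leaves no trace of the frame. */
static int
example_failed_reparent_is_rolled_back (void)
{
  XidMap map = { { 0 }, { 0 }, 0 };
  Win window = { 10485927, 0, 0, 0 };
  int attached;

  xid_map_insert (&map, window.xwindow, &window);
  attached = attach_frame (&map, &window, 6291591, reparent_always_fails);

  return !attached
         && xid_map_lookup (&map, 6291591) == NULL
         && xid_map_lookup (&map, 10485927) == &window
         && window.frame_xid == 0
         && window.unmaps_pending == 0
         && window.reparents_pending == 0;
}
```

The important part is the removal from the map: once the frame XID no
longer resolves to the window, a later stack synchronization cannot
produce a second entry for it.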

In particular these are some logs that may be useful to see what was happening.

The original crash was:

Thread 1 "gnome-shell" received signal SIGSEGV, Segmentation fault.
0x00007ffff71059de in meta_compositor_sync_stack (compositor=0x555555b85670, 
    stack=0x555558701b80) at ../../mutter/src/compositor/compositor.c:773
773	          if ((old_window->hidden || old_window->unmanaging) &&
(gdb) print old_window
$1 = (MetaWindow *) 0x0
(gdb) list
768	      while (old_stack)
769	        {
770	          old_actor = old_stack->data;
771	          old_window = meta_window_actor_get_meta_window (old_actor);
772	
773	          if ((old_window->hidden || old_window->unmanaging) &&
774	              !meta_window_actor_effect_in_progress (old_actor))
775	            {
776	              old_stack = g_list_delete_link (old_stack, old_stack);
777	              old_actor = NULL;
(gdb) print old_actor
$2 = (MetaWindowActor *) 0x5555588b2700
(gdb) print *old_actor 
$3 = {parent_instance = {parent_instance = {g_type_instance = {g_class = 0x555557967830}, 
      ref_count = 1, qdata = 0x5555588c0b40}, flags = 0, private_flags = 0, 
    priv = 0x5555588b23b0}}
(gdb) call g_type_class_get_instance_private_offset (g_type_class_peek_static (meta_window_actor_get_type()))
$4 = -928
(gdb) print *(MetaWindowActorPrivate*) ($2+$4)
$6 = {window = 0x0, compositor = 0x555555b85670, stage_views_changed_id = 0, surface = 0x0, 
  surface_actors = 0x0, geometry_scale = 1, minimize_in_progress = 0, 
  unminimize_in_progress = 0, size_change_in_progress = 0, map_in_progress = 0, 
  destroy_in_progress = 0, freeze_count = 0, screen_cast_usage_count = 0, visible = 0, 
  disposed = 1, needs_destroy = 1, updates_frozen = 0, first_frame_state = 2}

The full stack trace is like the ones in the Red Hat bugs referenced at the end of this description.

The window info at the time we were adding a duplicate window to the windows list:

  Window 10485927: 0x555558863540 (actual xid is 10485927),
    user time is 10485928 frame is 0x5555588715c0, frame xid 6291591
  Window 14680081: 0x5555588664b0 (actual xid is 14680081),
    user time is 14680082 frame is 0x555558871d80, frame xid 6291595
  Window 6291460: 0x55555796dc80 (actual xid is 10485763),
    user time is 10485764 frame is 0x555557a6f630, frame xid 6291460
  Window 6291465: 0x555557a68af0 (actual xid is 14680067),
    user time is 14680068 frame is 0x555557a73e80, frame xid 6291465
  Window 6291509: 0x555557f9d830 (actual xid is 8388623),
    user time is 0 frame is 0x555557fac780, frame xid 6291509
  Window 6291586: 0x5555586e1690 (actual xid is 4194363),
    user time is 0 frame is 0x55555886e550, frame xid 6291586
  Window 6291591: 0x555558863540 (actual xid is 10485927),
    user time is 10485928 frame is 0x5555588715c0, frame xid 6291591

6291591 "hexchat": ("mutter-x11-frames" "mutter-x11-frames")  76x114+2+15  +2+15
        1 child:
        6291592 (has no name): ()  1x1+-1+-1  +1+14

10485927 "hexchat": ("hexchat" "Hexchat")  48x48+14+49  +14+49
        1 child:
        10485928 (has no name): ()  1x1+-1+-1  +13+48

xwininfo: Window id: 6291591 "hexchat"

  Absolute upper-left X:  2
  Absolute upper-left Y:  15
  Relative upper-left X:  2
  Relative upper-left Y:  15
  Width: 76
  Height: 114
  Depth: 32
  Visual: 0x3e9
  Visual Class: TrueColor
  Border width: 0
  Class: InputOutput
  Colormap: 0x600001 (not installed)
  Bit Gravity State: NorthWestGravity
  Window Gravity State: NorthWestGravity
  Backing Store State: NotUseful
  Save Under State: no
  Map State: IsViewable
  Override Redirect State: no
  Corners:  +2+15  -1522+15  -1522-1071  +2-1071
  -geometry 76x114+2+15

xwininfo: Window id: 10485927 "hexchat"

  Absolute upper-left X:  14
  Absolute upper-left Y:  49
  Relative upper-left X:  14
  Relative upper-left Y:  49
  Width: 48
  Height: 48
  Depth: 24
  Visual: 0x21
  Visual Class: TrueColor
  Border width: 0
  Class: InputOutput
  Colormap: 0x20 (installed)
  Bit Gravity State: NorthWestGravity
  Window Gravity State: NorthWestGravity
  Backing Store State: NotUseful
  Save Under State: no
  Map State: IsViewable
  Override Redirect State: no
  Corners:  +14+49  -1538+49  -1538-1103  +14-1103
  -geometry 48x48+14+49


(gdb) print meta_window
$1 = (MetaWindow *) 0x555558863540
(gdb) print *meta_window
$2 = {parent_instance = {g_type_instance = {g_class = 0x55555796d860}, ref_count = 4, 
    qdata = 0x555558871801}, display = 0x555555b73430, id = 315847097, stamp = 4294967313, 
  monitor = 0x555555595ad0, highest_scale_monitor = 0x555555595ad0, 
  workspace = 0x555555b86200, client_type = META_WINDOW_CLIENT_TYPE_X11, 
  frame = 0x5555588715c0, depth = 24, desc = 0x555558867610 "0xa000a7", 
  title = 0x5555586beb00 "hexchat", type = META_WINDOW_NORMAL, 
  res_class = 0x55555885ca90 "Hexchat", res_name = 0x555558867960 "hexchat", role = 0x0, 
  startup_id = 0x0, mutter_hints = 0x0, sandboxed_app_id = 0x0, gtk_theme_variant = 0x0, 
  gtk_application_id = 0x0, gtk_unique_bus_name = 0x0, gtk_application_object_path = 0x0, 
  gtk_window_object_path = 0x0, gtk_app_menu_object_path = 0x0, gtk_menubar_object_path = 0x0, 
  transient_for = 0x0, initial_workspace = 0, initial_timestamp = 0, 
  tile_mode = META_TILE_NONE, tile_monitor_number = -1, edge_constraints = {
    top = META_EDGE_CONSTRAINT_NONE, right = META_EDGE_CONSTRAINT_NONE, 
    bottom = META_EDGE_CONSTRAINT_NONE, left = META_EDGE_CONSTRAINT_NONE}, 
  tile_hfraction = -1, preferred_output_winsys_id = 1031, fullscreen_monitors = {top = 0x0, 
    bottom = 0x0, left = 0x0, right = 0x0}, frame_bounds = 0x0, opacity = 255 '\377', 
  struts = 0x0, unmaps_pending = 1, reparents_pending = 1, stable_sequence = 18, 
  net_wm_user_time = 83208168, has_custom_frame_extents = 0, custom_frame_extents = {left = 0, 
    right = 0, top = 0, bottom = 0}, rect = {x = 16, y = 27, width = 48, height = 85}, 
  saved_rect = {x = 16, y = 27, width = 48, height = 48}, saved_rect_fullscreen = {x = 16, 
    y = 27, width = 48, height = 48}, unconstrained_rect = {x = 16, y = 27, width = 48, 
    height = 85}, buffer_rect = {x = 2, y = 15, width = 76, height = 114}, icon_geometry = {
    x = 0, y = 0, width = 0, height = 0}, size_hints = {flags = 1008, x = 16, y = 27, 
    width = 48, height = 48, min_width = 48, min_height = 48, max_width = 2147483647, 
    max_height = 2147483647, width_inc = 1, height_inc = 1, min_aspect = {x = 1, 
      y = 2147483647}, max_aspect = {x = 2147483647, y = 1}, base_width = 48, 
    base_height = 48, win_gravity = 1}, layer = META_LAYER_NORMAL, stack_position = 4, 
  close_dialog = 0x0, compositor_private = 0x55555886c2b0, attached_focus_window = 0x0, 
  tile_match = 0x0, placement = {rule = 0x0, state = META_PLACEMENT_STATE_UNCONSTRAINED, 
    pending = {x = 0, y = 0, rel_x = 0, rel_y = 0}, current = {rel_x = 0, rel_y = 0}}, 
  close_dialog_timeout_id = 0, client_pid = 644877, has_valid_cgroup = 1, cgroup_path = 0x0, 
  events_during_ping = 0, override_redirect = 0, maximized_horizontally = 0, 
  maximized_vertically = 0, maximize_horizontally_after_placement = 0, 
  maximize_vertically_after_placement = 0, minimize_after_placement = 0, saved_maximize = 0, 
  fullscreen = 0, urgent = 0, require_fully_onscreen = 1, require_on_single_monitor = 1, 
  require_titlebar_visible = 1, on_all_workspaces = 0, on_all_workspaces_requested = 0, 
  minimized = 0, mapped = 1, hidden = 0, visible_to_compositor = 1, known_to_compositor = 1, 
  pending_compositor_effect = 4, iconic = 0, initially_iconic = 0, initial_workspace_set = 0, 
  initial_timestamp_set = 0, net_wm_user_time_set = 0, icon_geometry_set = 0, input = 1, 
  mwm_decorated = 1, mwm_border_only = 0, mwm_has_close_func = 1, mwm_has_minimize_func = 1, 
  mwm_has_maximize_func = 1, mwm_has_move_func = 1, mwm_has_resize_func = 1, decorated = 1, 
  border_only = 0, always_sticky = 0, has_close_func = 1, has_minimize_func = 1, 
  has_maximize_func = 1, has_move_func = 1, has_resize_func = 1, has_fullscreen_func = 1, 
  skip_taskbar = 0, skip_pager = 0, skip_from_window_list = 0, wm_state_above = 0, 
  wm_state_below = 0, wm_state_demands_attention = 0, has_focus = 0, appears_focused = 0, 
  placed = 1, denied_focus_and_not_transient = 0, showing_for_first_time = 0, unmanaging = 0, 
  constructing = 0, withdrawn = 0, calc_placement = 0, have_focus_click_grab = 1, 
  attached = 0, is_remote = 0, restore_focus_on_map = 0, is_alive = 1, in_workspace_change = 0}
(gdb) print *meta_window->frame
$6 = {window = 0x555558863540, xwindow = 6291591, rect = {x = 2, y = 15, width = 76, 
    height = 114}, cached_borders = {visible = {left = 0, right = 0, top = 37, bottom = 0}, 
    invisible = {left = 14, right = 14, top = 12, bottom = 17}, total = {left = 14, 
      right = 14, top = 49, bottom = 17}}, opaque_region = 0x0, sync_counter = {
    window = 0x555558863540, xwindow = 6291591, sync_request_counter = 6291594, 
    sync_request_serial = 2, sync_request_wait_serial = 0, sync_request_timeout_id = 0, 
    sync_request_alarm = 4194381, frame_drawn_time = 0, frames = 0x5555586e0d50, 
    extended_sync_request_counter = 1, disabled = 0, needs_frame_drawn = 1}, child_x = 14, 
  child_y = 49, right_width = 14, bottom_height = 17, borders_cached = 1}
(gdb) call g_type_class_get_instance_private_offset (g_type_class_peek_static (meta_window_x11_get_type()))
$3 = -400
(gdb) print (*(MetaWindowX11Private*) ($1+$3))
$4 = {wm_state_skip_taskbar = 0, wm_state_skip_pager = 0, wm_take_focus = 0, wm_ping = 0, 
  wm_delete_window = 0, wm_state_modal = 0, using_net_wm_name = 0, 
  using_net_wm_visible_name = 0, type_atom = 0, attributes = {x = 0, y = 0, width = 0, 
    height = 0, border_width = 0, depth = 0, visual = 0x0, root = 0, class = 0, 
    bit_gravity = 0, win_gravity = 0, backing_store = 0, backing_planes = 0, 
    backing_pixel = 0, save_under = 0, colormap = 0, map_installed = 0, map_state = 0, 
    all_event_masks = 0, your_event_mask = 0, do_not_propagate_mask = 0, 
    override_redirect = 0, screen = 0x0}, border_width = 0, showing_resize_popup = 0, 
  client_rect = {x = 0, y = 0, width = 0, height = 0}, opaque_region = 0x0, 
  input_region = 0x0, shape_region = 0x0, wm_hints_pixmap = 0, wm_hints_mask = 0, 
  thaw_after_paint = 0, xvisual = 0x0, xwindow = 0, xclient_leader = 0, xgroup_leader = 0, 
  user_time_window = 0, bypass_compositor = META_BYPASS_COMPOSITOR_HINT_AUTO, group = 0x0, 
  sync_counter = {window = 0x0, xwindow = 0, sync_request_counter = 0, 
    sync_request_serial = 0, sync_request_wait_serial = 0, sync_request_timeout_id = 0, 
    sync_request_alarm = 0, frame_drawn_time = 0, frames = 0x0, 
    extended_sync_request_counter = 0, disabled = 0, needs_frame_drawn = 0}, keys_grabbed = 0, 
  grab_on_frame = 0, wm_client_machine = 0x0, sm_client_id = 0x0}

A full log of when the duplicate window ended up being added to the compositor list: duplicate-windows-to-stack-compositor.log

Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2213872 https://bugzilla.redhat.com/show_bug.cgi?id=2021919 https://bugzilla.redhat.com/show_bug.cgi?id=1544267

Edited by Marco Trevisan
