frame: Handle reparent failure avoiding duplicate stack windows instances
In some cases, reparenting a window into its frame may fail; this seems
to happen especially during the initialization of a window that is
unmapped and re-mapped quickly several times.
If this happens, we never receive a remove event on the stack tracker,
so the window may be added twice to the list of windows to synchronize
with the compositor. This breaks the compositor's assumption that the
stack list contains no duplicates and eventually leads to a crash,
because not all instances of the window are removed when it is destroyed.
In particular we may end up in this situation:
Syncing Window 10485927: 0x555558863540 (actual xid is 10485927),
user time is 10485928 frame is 0x5555588715c0, frame xid 6291591
Syncing Window 14680081: 0x5555588664b0 (actual xid is 14680081),
user time is 14680082 frame is 0x555558871d80, frame xid 6291595
Syncing Window 6291460: 0x55555796dc80 (actual xid is 10485763),
user time is 10485764 frame is 0x555557a6f630, frame xid 6291460
Syncing Window 6291465: 0x555557a68af0 (actual xid is 14680067),
user time is 14680068 frame is 0x555557a73e80, frame xid 6291465
Syncing Window 6291509: 0x555557f9d830 (actual xid is 8388623),
user time is 0 frame is 0x555557fac780, frame xid 6291509
Syncing Window 6291586: 0x5555586e1690 (actual xid is 4194363),
user time is 0 frame is 0x55555886e550, frame xid 6291586
Syncing Window 6291591: 0x555558863540 (actual xid is 10485927),
user time is 10485928 frame is 0x5555588715c0, frame xid 6291591
Here the same meta window 0x555558863540 is added twice, because it is
reachable both through the window's own XID (10485927) and through its
frame's XID (6291591).
This happens because, for historical reasons, the xids hash table managed
by the x11-display maps X11 windows, their frames and their user time
windows all to the same meta-window, so if we don't filter them out
properly we end up duplicating entries in the compositor list.
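The effect of that shared mapping can be illustrated with a small, self-contained model. This is only a sketch and not mutter's code: `ModelWindow`, `lookup_xid`, `toplevel_xid` and `build_sync_list` are hypothetical names, and the filter shown (keep an XID only when it is the frame XID, or the client XID for unframed windows) is an assumption about what "filtering properly" means here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long Xid; /* stand-in for the X11 Window type */

/* Hypothetical, trimmed-down model of a managed window. */
typedef struct {
  Xid client_xid;    /* the client's own X11 window */
  Xid frame_xid;     /* 0 when the window has no frame */
  Xid user_time_xid; /* 0 when there is no user time window */
} ModelWindow;

/* Models the xids table: several XIDs resolve to the same window. */
static ModelWindow *
lookup_xid (ModelWindow *windows, size_t n_windows, Xid xid)
{
  for (size_t i = 0; i < n_windows; i++)
    {
      ModelWindow *w = &windows[i];

      if (xid == w->client_xid ||
          (w->frame_xid != 0 && xid == w->frame_xid) ||
          (w->user_time_xid != 0 && xid == w->user_time_xid))
        return w;
    }

  return NULL;
}

/* The XID expected to represent the window in the X stack. */
static Xid
toplevel_xid (const ModelWindow *w)
{
  return w->frame_xid != 0 ? w->frame_xid : w->client_xid;
}

/* Builds the list of windows to synchronize and returns its length.
 * Without filtering, every stacked XID that resolves to a window
 * contributes an entry, so one window can appear more than once. */
static size_t
build_sync_list (ModelWindow *windows, size_t n_windows,
                 const Xid *stack, size_t n_stack,
                 bool filter, ModelWindow **out)
{
  size_t n_out = 0;

  assert (out != NULL);

  for (size_t i = 0; i < n_stack; i++)
    {
      ModelWindow *w = lookup_xid (windows, n_windows, stack[i]);

      if (w == NULL)
        continue;
      if (filter && stack[i] != toplevel_xid (w))
        continue;

      out[n_out++] = w;
    }

  return n_out;
}
```

With the XIDs from the log above, a stack that still lists both 10485927 and 6291591 produces two entries for the same window when `filter` is false and a single entry when it is true.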
Such duplicates eventually make mutter crash in
meta_compositor_sync_stack(), because we may try to access an invalid
window: its actor has been destroyed, but not all of its instances have
been removed from the compositor windows list:
0x00007ffff71059 in meta_compositor_sync_stack (compositor=0x555555b8,
stack=0x555558701b80) at ../../mutter/src/compositor/compositor.c:773
773 if ((old_window->hidden || old_window->unmanaging) &&
(gdb) print old_window
$1 = (MetaWindow *) 0x0
To prevent this, check that XReparentWindow does not fail. On failure,
reset the window state to what it was before the attempt and, more
importantly, remove the association between the frame X11 window and the
MetaWindow, since it no longer holds. That way, at the next stack
synchronization no meta window is associated with that frame XID (unless
further stack changes affect it).
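The rollback described above can be sketched as follows. This is a simplified model under stated assumptions, not the actual patch: `ModelWindow`, `FrameRegistry`, `attach_frame` and the two `reparent_*` callbacks are hypothetical, and the callback stands in for an XReparentWindow call whose X error would be caught by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned long Xid; /* stand-in for the X11 Window type */

/* Hypothetical, trimmed-down window state touched when framing. */
typedef struct {
  Xid client_xid;
  Xid frame_xid;         /* 0 when no frame is attached */
  int reparents_pending; /* reparent events we still expect */
} ModelWindow;

/* One-slot stand-in for the "frame XID -> window" registration. */
typedef struct {
  Xid xid;
  ModelWindow *window;
} FrameRegistry;

/* Hypothetical reparent hook: returns true when reparenting worked. */
typedef bool (*ReparentFunc) (Xid client, Xid frame);

/* Test doubles for the hook. */
static bool
reparent_always_succeeds (Xid client, Xid frame)
{
  (void) client;
  (void) frame;
  return true;
}

static bool
reparent_always_fails (Xid client, Xid frame)
{
  (void) client;
  (void) frame;
  return false;
}

/* Attaches a frame; on reparent failure, restores the previous
 * window state and drops the frame XID registration. */
static bool
attach_frame (ModelWindow *w, FrameRegistry *registry,
              Xid frame_xid, ReparentFunc reparent)
{
  const ModelWindow saved = *w; /* state to restore on failure */

  assert (w->frame_xid == 0);

  registry->xid = frame_xid;
  registry->window = w;
  w->frame_xid = frame_xid;
  w->reparents_pending++;

  if (!reparent (w->client_xid, frame_xid))
    {
      *w = saved;              /* undo the state changes */
      registry->xid = 0;       /* frame XID no longer maps to w */
      registry->window = NULL;
      return false;
    }

  return true;
}
```

The point of the design is that a failed reparent leaves no trace: the window looks unframed again and a later lookup of the frame XID finds no window, so the stack synchronization cannot pick it up a second time.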
The following logs may be useful to see what was happening.
The original crash was:
Thread 1 "gnome-shell" received signal SIGSEGV, Segmentation fault.
0x00007ffff71059de in meta_compositor_sync_stack (compositor=0x555555b85670,
stack=0x555558701b80) at ../../mutter/src/compositor/compositor.c:773
773 if ((old_window->hidden || old_window->unmanaging) &&
(gdb) print old_window
$1 = (MetaWindow *) 0x0
(gdb) list
768 while (old_stack)
769 {
770 old_actor = old_stack->data;
771 old_window = meta_window_actor_get_meta_window (old_actor);
772
773 if ((old_window->hidden || old_window->unmanaging) &&
774 !meta_window_actor_effect_in_progress (old_actor))
775 {
776 old_stack = g_list_delete_link (old_stack, old_stack);
777 old_actor = NULL;
(gdb) print old_actor
$2 = (MetaWindowActor *) 0x5555588b2700
(gdb) print *old_actor
$3 = {parent_instance = {parent_instance = {g_type_instance = {g_class = 0x555557967830},
ref_count = 1, qdata = 0x5555588c0b40}, flags = 0, private_flags = 0,
priv = 0x5555588b23b0}}
(gdb) call g_type_class_get_instance_private_offset (g_type_class_peek_static (meta_window_actor_get_type()))
$4 = -928
(gdb) print *(MetaWindowActorPrivate*) ($2+$4)
$6 = {window = 0x0, compositor = 0x555555b85670, stage_views_changed_id = 0, surface = 0x0,
surface_actors = 0x0, geometry_scale = 1, minimize_in_progress = 0,
unminimize_in_progress = 0, size_change_in_progress = 0, map_in_progress = 0,
destroy_in_progress = 0, freeze_count = 0, screen_cast_usage_count = 0, visible = 0,
disposed = 1, needs_destroy = 1, updates_frozen = 0, first_frame_state = 2}
The full stack trace matches the ones in the Red Hat bugs listed under "Closes" at the end.
The windows' info at the time the duplicate window was being added to the windows list:
Window 10485927: 0x555558863540 (actual xid is 10485927),
user time is 10485928 frame is 0x5555588715c0, frame xid 6291591
Window 14680081: 0x5555588664b0 (actual xid is 14680081),
user time is 14680082 frame is 0x555558871d80, frame xid 6291595
Window 6291460: 0x55555796dc80 (actual xid is 10485763),
user time is 10485764 frame is 0x555557a6f630, frame xid 6291460
Window 6291465: 0x555557a68af0 (actual xid is 14680067),
user time is 14680068 frame is 0x555557a73e80, frame xid 6291465
Window 6291509: 0x555557f9d830 (actual xid is 8388623),
user time is 0 frame is 0x555557fac780, frame xid 6291509
Window 6291586: 0x5555586e1690 (actual xid is 4194363),
user time is 0 frame is 0x55555886e550, frame xid 6291586
Window 6291591: 0x555558863540 (actual xid is 10485927),
user time is 10485928 frame is 0x5555588715c0, frame xid 6291591
6291591 "hexchat": ("mutter-x11-frames" "mutter-x11-frames") 76x114+2+15 +2+15
1 child:
6291592 (has no name): () 1x1+-1+-1 +1+14
10485927 "hexchat": ("hexchat" "Hexchat") 48x48+14+49 +14+49
1 child:
10485928 (has no name): () 1x1+-1+-1 +13+48
xwininfo: Window id: 6291591 "hexchat"
Absolute upper-left X: 2
Absolute upper-left Y: 15
Relative upper-left X: 2
Relative upper-left Y: 15
Width: 76
Height: 114
Depth: 32
Visual: 0x3e9
Visual Class: TrueColor
Border width: 0
Class: InputOutput
Colormap: 0x600001 (not installed)
Bit Gravity State: NorthWestGravity
Window Gravity State: NorthWestGravity
Backing Store State: NotUseful
Save Under State: no
Map State: IsViewable
Override Redirect State: no
Corners: +2+15 -1522+15 -1522-1071 +2-1071
-geometry 76x114+2+15
xwininfo: Window id: 10485927 "hexchat"
Absolute upper-left X: 14
Absolute upper-left Y: 49
Relative upper-left X: 14
Relative upper-left Y: 49
Width: 48
Height: 48
Depth: 24
Visual: 0x21
Visual Class: TrueColor
Border width: 0
Class: InputOutput
Colormap: 0x20 (installed)
Bit Gravity State: NorthWestGravity
Window Gravity State: NorthWestGravity
Backing Store State: NotUseful
Save Under State: no
Map State: IsViewable
Override Redirect State: no
Corners: +14+49 -1538+49 -1538-1103 +14-1103
-geometry 48x48+14+49
(gdb) print meta_window
$1 = (MetaWindow *) 0x555558863540
(gdb) print *meta_window
$2 = {parent_instance = {g_type_instance = {g_class = 0x55555796d860}, ref_count = 4,
qdata = 0x555558871801}, display = 0x555555b73430, id = 315847097, stamp = 4294967313,
monitor = 0x555555595ad0, highest_scale_monitor = 0x555555595ad0,
workspace = 0x555555b86200, client_type = META_WINDOW_CLIENT_TYPE_X11,
frame = 0x5555588715c0, depth = 24, desc = 0x555558867610 "0xa000a7",
title = 0x5555586beb00 "hexchat", type = META_WINDOW_NORMAL,
res_class = 0x55555885ca90 "Hexchat", res_name = 0x555558867960 "hexchat", role = 0x0,
startup_id = 0x0, mutter_hints = 0x0, sandboxed_app_id = 0x0, gtk_theme_variant = 0x0,
gtk_application_id = 0x0, gtk_unique_bus_name = 0x0, gtk_application_object_path = 0x0,
gtk_window_object_path = 0x0, gtk_app_menu_object_path = 0x0, gtk_menubar_object_path = 0x0,
transient_for = 0x0, initial_workspace = 0, initial_timestamp = 0,
tile_mode = META_TILE_NONE, tile_monitor_number = -1, edge_constraints = {
top = META_EDGE_CONSTRAINT_NONE, right = META_EDGE_CONSTRAINT_NONE,
bottom = META_EDGE_CONSTRAINT_NONE, left = META_EDGE_CONSTRAINT_NONE},
tile_hfraction = -1, preferred_output_winsys_id = 1031, fullscreen_monitors = {top = 0x0,
bottom = 0x0, left = 0x0, right = 0x0}, frame_bounds = 0x0, opacity = 255 '\377',
struts = 0x0, unmaps_pending = 1, reparents_pending = 1, stable_sequence = 18,
net_wm_user_time = 83208168, has_custom_frame_extents = 0, custom_frame_extents = {left = 0,
right = 0, top = 0, bottom = 0}, rect = {x = 16, y = 27, width = 48, height = 85},
saved_rect = {x = 16, y = 27, width = 48, height = 48}, saved_rect_fullscreen = {x = 16,
y = 27, width = 48, height = 48}, unconstrained_rect = {x = 16, y = 27, width = 48,
height = 85}, buffer_rect = {x = 2, y = 15, width = 76, height = 114}, icon_geometry = {
x = 0, y = 0, width = 0, height = 0}, size_hints = {flags = 1008, x = 16, y = 27,
width = 48, height = 48, min_width = 48, min_height = 48, max_width = 2147483647,
max_height = 2147483647, width_inc = 1, height_inc = 1, min_aspect = {x = 1,
y = 2147483647}, max_aspect = {x = 2147483647, y = 1}, base_width = 48,
base_height = 48, win_gravity = 1}, layer = META_LAYER_NORMAL, stack_position = 4,
close_dialog = 0x0, compositor_private = 0x55555886c2b0, attached_focus_window = 0x0,
tile_match = 0x0, placement = {rule = 0x0, state = META_PLACEMENT_STATE_UNCONSTRAINED,
pending = {x = 0, y = 0, rel_x = 0, rel_y = 0}, current = {rel_x = 0, rel_y = 0}},
close_dialog_timeout_id = 0, client_pid = 644877, has_valid_cgroup = 1, cgroup_path = 0x0,
events_during_ping = 0, override_redirect = 0, maximized_horizontally = 0,
maximized_vertically = 0, maximize_horizontally_after_placement = 0,
maximize_vertically_after_placement = 0, minimize_after_placement = 0, saved_maximize = 0,
fullscreen = 0, urgent = 0, require_fully_onscreen = 1, require_on_single_monitor = 1,
require_titlebar_visible = 1, on_all_workspaces = 0, on_all_workspaces_requested = 0,
minimized = 0, mapped = 1, hidden = 0, visible_to_compositor = 1, known_to_compositor = 1,
pending_compositor_effect = 4, iconic = 0, initially_iconic = 0, initial_workspace_set = 0,
initial_timestamp_set = 0, net_wm_user_time_set = 0, icon_geometry_set = 0, input = 1,
mwm_decorated = 1, mwm_border_only = 0, mwm_has_close_func = 1, mwm_has_minimize_func = 1,
mwm_has_maximize_func = 1, mwm_has_move_func = 1, mwm_has_resize_func = 1, decorated = 1,
border_only = 0, always_sticky = 0, has_close_func = 1, has_minimize_func = 1,
has_maximize_func = 1, has_move_func = 1, has_resize_func = 1, has_fullscreen_func = 1,
skip_taskbar = 0, skip_pager = 0, skip_from_window_list = 0, wm_state_above = 0,
wm_state_below = 0, wm_state_demands_attention = 0, has_focus = 0, appears_focused = 0,
placed = 1, denied_focus_and_not_transient = 0, showing_for_first_time = 0, unmanaging = 0,
constructing = 0, withdrawn = 0, calc_placement = 0, have_focus_click_grab = 1,
attached = 0, is_remote = 0, restore_focus_on_map = 0, is_alive = 1, in_workspace_change = 0}
(gdb) print *meta_window->frame
$6 = {window = 0x555558863540, xwindow = 6291591, rect = {x = 2, y = 15, width = 76,
height = 114}, cached_borders = {visible = {left = 0, right = 0, top = 37, bottom = 0},
invisible = {left = 14, right = 14, top = 12, bottom = 17}, total = {left = 14,
right = 14, top = 49, bottom = 17}}, opaque_region = 0x0, sync_counter = {
window = 0x555558863540, xwindow = 6291591, sync_request_counter = 6291594,
sync_request_serial = 2, sync_request_wait_serial = 0, sync_request_timeout_id = 0,
sync_request_alarm = 4194381, frame_drawn_time = 0, frames = 0x5555586e0d50,
extended_sync_request_counter = 1, disabled = 0, needs_frame_drawn = 1}, child_x = 14,
child_y = 49, right_width = 14, bottom_height = 17, borders_cached = 1}
(gdb) call g_type_class_get_instance_private_offset (g_type_class_peek_static (meta_window_x11_get_type()))
$3 = -400
(gdb) print (*(MetaWindowX11Private*) ($1+$3))
$4 = {wm_state_skip_taskbar = 0, wm_state_skip_pager = 0, wm_take_focus = 0, wm_ping = 0,
wm_delete_window = 0, wm_state_modal = 0, using_net_wm_name = 0,
using_net_wm_visible_name = 0, type_atom = 0, attributes = {x = 0, y = 0, width = 0,
height = 0, border_width = 0, depth = 0, visual = 0x0, root = 0, class = 0,
bit_gravity = 0, win_gravity = 0, backing_store = 0, backing_planes = 0,
backing_pixel = 0, save_under = 0, colormap = 0, map_installed = 0, map_state = 0,
all_event_masks = 0, your_event_mask = 0, do_not_propagate_mask = 0,
override_redirect = 0, screen = 0x0}, border_width = 0, showing_resize_popup = 0,
client_rect = {x = 0, y = 0, width = 0, height = 0}, opaque_region = 0x0,
input_region = 0x0, shape_region = 0x0, wm_hints_pixmap = 0, wm_hints_mask = 0,
thaw_after_paint = 0, xvisual = 0x0, xwindow = 0, xclient_leader = 0, xgroup_leader = 0,
user_time_window = 0, bypass_compositor = META_BYPASS_COMPOSITOR_HINT_AUTO, group = 0x0,
sync_counter = {window = 0x0, xwindow = 0, sync_request_counter = 0,
sync_request_serial = 0, sync_request_wait_serial = 0, sync_request_timeout_id = 0,
sync_request_alarm = 0, frame_drawn_time = 0, frames = 0x0,
extended_sync_request_counter = 0, disabled = 0, needs_frame_drawn = 0}, keys_grabbed = 0,
grab_on_frame = 0, wm_client_machine = 0x0, sm_client_id = 0x0}
A full log of when the duplicate window ended up being added to the compositor list: duplicate-windows-to-stack-compositor.log
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2213872 https://bugzilla.redhat.com/show_bug.cgi?id=2021919 https://bugzilla.redhat.com/show_bug.cgi?id=1544267
Edited by Marco Trevisan