X Window System core protocol

The X Window System is based on a client-server model. The X protocol forms the base of the interaction between the X server and the other programs. Other protocols related to the X Window System exist, either built on top of the X protocol or defined as separate protocols.

Overview

Communication between server and clients is done by exchanging packets over a channel. The connection is established by the client, which sends the first packet, containing the byte order to be used and information about the version of the protocol and the kind of authentication the client expects the server to use. The server answers by sending back a packet stating the acceptance or refusal of the connection, or a request for further authentication. If the connection is accepted, the acceptance packet contains data for the client to use in the subsequent interaction with the server.
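The first packet described above can be sketched in a few lines. This is a minimal illustration, assuming the field layout of the core protocol's connection setup request (byte-order byte, protocol version 11.0, lengths of the authorization name and data, each padded to a multiple of four bytes). The function name build_setup_request and the all-zero cookie in the usage note are placeholders invented for the example.

```python
import struct
import sys

def build_setup_request(auth_name=b"", auth_data=b"", big_endian=None):
    """Encode the first packet a client sends on a new X11 connection."""
    if big_endian is None:
        big_endian = sys.byteorder == "big"
    # 0x42 ('B') announces most-significant-byte-first, 0x6C ('l') the opposite.
    order_byte = b"B" if big_endian else b"l"
    fmt = ">" if big_endian else "<"

    def pad4(data):
        # Variable-length fields are padded to a multiple of 4 bytes.
        return data + b"\x00" * (-len(data) % 4)

    header = order_byte + struct.pack(
        fmt + "xHHHHxx",
        11,               # protocol major version
        0,                # protocol minor version
        len(auth_name),   # length of the authorization protocol name
        len(auth_data),   # length of the authorization protocol data
    )
    return header + pad4(auth_name) + pad4(auth_data)
```

For instance, build_setup_request(b"MIT-MAGIC-COOKIE-1", bytes(16), big_endian=False) yields a 48-byte packet: 12 bytes of fixed header, the 18-byte name padded to 20 bytes, and 16 bytes of (here dummy) cookie data.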

After the connection is established, four types of packets are exchanged by the client and the server over the channel:

Request: The client requests information from the server or requests it to perform an action.
Reply: The server responds to a request. Not all requests generate replies.
Event: The server sends an event to the client, e.g., keyboard or mouse input, or a window being moved, resized or exposed.
Error: The server sends an error packet if a request is invalid. Since requests are queued, error packets generated by a request may not be sent immediately.

Request and reply packets have varying length, while event and error packets have a fixed length of 32 bytes.

The request packets are numbered sequentially by the server as soon as it receives them: the first request from a client is numbered 1, the second 2, etc. The least significant 16 bits of the sequence number of a request are included in the reply and error packets generated by the request, if any. They are also included in event packets to indicate the sequence number of the request that the server is currently processing or has just finished processing.
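The packet kinds and the truncated sequence number can be illustrated with a small decoder. This is a sketch, assuming the core protocol's convention that the first byte of a server packet is 0 for an error, 1 for a reply, and an event code otherwise, with the 16-bit sequence number at byte offset 2. The helper names classify_packet and low16 are invented for the example.

```python
import struct

def low16(sequence_number):
    """Keep only the 16 least significant bits, as carried on the wire."""
    return sequence_number & 0xFFFF

def classify_packet(packet, byte_order="<"):
    """Return (kind, code, sequence16) from the first bytes of a server packet."""
    first, second, seq = struct.unpack_from(byte_order + "BBH", packet)
    if first == 0:
        return ("error", second, seq)   # second byte holds the error code
    if first == 1:
        return ("reply", None, seq)
    # Events: the low 7 bits give the event code; the top bit is set when
    # the event was generated by a SendEvent request from another client.
    # (KeymapNotify events are an exception and carry no sequence number.)
    return ("event", first & 0x7F, seq)
```

A client that has issued 70000 requests would see the 70000th one reported as low16(70000), that is 4464, so it has to track the full count itself if it needs to match packets to requests.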

X services

The X server provides the following services. Anything else is provided by client programs.

Window services: Clients ask the server to create or destroy windows, to change their attributes, to request information about them, etc.
Input handling: Keyboard and mouse input are detected by the server and sent to clients.
Graphic operations: Clients ask the server to draw pixels, lines, strings, etc. A client can also request information about fonts (size, etc.) and request the transfer of graphic content.
Resource management: The X resource manager provides a content addressable database for clients. Clients can be implemented so they are customizable on a system and user basis.

Windows

What is usually called a window is called a top-level window in X Window System terminology. The term window is also used to denote windows that lie within other windows, that is, the subwindows of a parent window. Graphical elements such as buttons, menus, icons, etc. are all realized using windows.

A window can only be created as a subwindow of a parent window. As a result, windows are arranged in a tree, that is, a hierarchy. The root of this hierarchy is called the root window, which is automatically created by the server. The top-level windows are exactly the direct subwindows of the root window. Visibly, the root window is as large as the screen and lies behind all other windows. It contains the background of the screen.

The content of a window is not always guaranteed to be preserved. In particular, the window content may be destroyed when the window is moved, resized, covered by other windows, or in general made totally or partly non-visible. Whether content is lost in these cases depends on whether the server is maintaining a backing store of the window content. The client can request that a backing store be maintained for a window, but there is no obligation for the server to do so. Even worse, there is no way for the client to know whether the server has agreed to maintain the backing store.

Therefore, the clients should assume that backing store is not maintained. If this is the case, a visible part of a window may have an unspecified content. An event is sent to notify the client that part or all of the window content has to be drawn again.

The decorative borders of windows are usually not drawn by the client handling the window, but by the window manager, which is just another client from the point of view of the X server. The role of the window manager in the X Window System is described below.

Data about a window can be obtained by running the xwininfo program.

Pixmaps and drawables

A pixmap is a region of memory that can be used for drawing. Unlike windows, the content of pixmaps is not shown on any part of the screen. However, the content of a pixmap (or a part of it) can be transferred to a window, and vice versa. This makes it possible, for example, to implement double buffering. Most of the graphical operations that can be done on windows can also be done on pixmaps.

Windows and pixmaps are collectively named drawables, and their content data resides on the server. A client can however request the content of a drawable to be transferred from the server to the client or vice versa.

Identifiers

All data about windows, fonts, etc. is stored in the server. The client knows only the identifiers of these objects: integers it can use to ask the server to perform operations on them. For example, if a client wishes a window to be created, it requests the server to create a window with a given identifier. The server creates a window and associates it with the identifier. The identifier can later be used by the client to request, for example, a string to be drawn in the window. The following objects reside in the server and are known to the client via a numerical identifier:

Window, which also covers windows that are not visible, windows that are only used for input, and windows that lie within other windows (subwindows).
Pixmap, an area of memory that can be used for most of the graphical operations that can be done on windows.
Font, which contains the features of a font, such as the maximum height of its characters.
Colormap, which contains colors that can be used for drawing.
Graphic context, which contains the parameters of graphical operations, for example the foreground color used for drawing, the font for drawing text, etc.

These objects are called resources. When a client requests the creation of one such object, it must specify an identifier for it. For example, to create a new window, the client must specify the identifier that will be used for the window along with its parameters (parent, width, height, etc.). Identifiers are 32-bit integers with their three most significant bits equal to zero. Clients choose them so that:

they are in the set of possible identifiers the server has assigned to the client for creation; this set is specified by the server in the acceptance packet, that is, the packet it sends to the client in response to the client's request for connection;
they do not clash: two objects among windows, pixmaps, fonts, colormaps, and graphic contexts cannot have the same identifier.
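The two rules above can be sketched as a generator of usable identifiers. This is a minimal illustration, assuming that the acceptance packet describes the client's set as a base value plus a mask of contiguous bits the client may fill in freely (the resource-id-base and resource-id-mask fields of the core protocol). The base and mask values in the usage note are illustrative, not taken from a real server.

```python
def resource_ids(id_base, id_mask):
    """Yield each identifier a client may create, lowest first.

    Assumes id_mask is a single run of contiguous one bits that does not
    overlap id_base, as the acceptance packet is expected to provide.
    """
    if (id_base | id_mask) & 0xE0000000:
        raise ValueError("the three most significant bits must be zero")
    step = id_mask & -id_mask      # lowest set bit of the mask
    offset = 0
    while offset <= id_mask:
        yield id_base | offset     # never clashes with an earlier value
        offset += step
```

With a base of 0x8200000 and a mask of 0x1FFFFF, the generator yields 0x8200000, 0x8200001, 0x8200002, and so on. Handing out identifiers in increasing order is one simple way to guarantee that no two resources created by the client share an identifier.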

Identifiers are used by the client to request the server to perform some operation on the associated objects. Some operations affect the given object (for example, a request to move a window), while others are requests for object data stored in the server. For example, a client can ask the server for the attributes of a window, the size of a string when drawn with a given font, etc.

Identifiers are unique to the server, not only to the client; for example, no two windows have the same identifier, even if created by two different clients. A client can access any object given its identifier. In particular, it can also access objects created by other clients, even if their identifiers are outside the set of identifiers it is allowed to use for creation.

Two clients connected to the same server can use the same identifier to refer to the same object. For example, if a client has created a window with identifier 0x1e00021, it can pass the number 0x1e00021 to another application by any available means, for example by storing it in a file that is also accessible to the other application. The other application will then be able to operate on the very same window. This possibility is exploited, for example, by the X11 version of ghostview: this program creates a subwindow, stores its identifier in an environment variable, and calls Ghostscript, which draws the content of the PostScript file to be shown in this window.

Atoms

Atoms are identifiers of strings stored in the server. They are similar to the identifiers of the other objects (windows, pixmaps, etc.) but differ from them in three ways:

when a client requests the creation of a new atom, it only sends the server the string to be stored in the server, but not its identifier; this identifier is chosen by the server and sent back as a reply to the client;
while atoms are identifiers and therefore unique, an atom and an object identifier can coincide;
atoms are not associated with clients.

The string associated with an atom is called the atom name. No two atoms can have the same name. As a result, it is common to identify an atom with its corresponding string: “the atom ABCD” means, more precisely, “the atom whose associated string is ABCD” or “the atom whose name is ABCD.”

The client can request a new atom associated with a given string to be created, and the server returns the atom (the identifier). The client can also ask the server for the atom (the identifier) of a given string. Some atoms are predefined (created by the server with a given identifier and string); clients can create new atoms at will, but two atoms cannot have the same name, and the name of an atom cannot be changed after creation.
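The behaviour described above can be modelled with a small table. This is a toy sketch, not server code: the class AtomTable and its method names are invented for illustration, and the numbers it hands out for new atoms are arbitrary. The only_if_exists flag mirrors the option of the core protocol's InternAtom request, which returns 0 (None) when the name is unknown and creation is not wanted.

```python
class AtomTable:
    """Toy model of the server-side mapping between atoms and names."""

    def __init__(self, predefined=()):
        self._by_name = {}
        self._by_id = {}
        for atom, name in predefined:
            self._by_name[name] = atom
            self._by_id[atom] = name
        # The server, not the client, chooses identifiers for new atoms.
        self._next = max(self._by_id, default=0) + 1

    def intern(self, name, only_if_exists=False):
        """Return the atom for `name`, creating it unless only_if_exists."""
        if name in self._by_name:
            return self._by_name[name]   # names are unique: reuse the atom
        if only_if_exists:
            return 0                     # 0 stands for "no such atom"
        atom = self._next
        self._next += 1
        self._by_name[name] = atom
        self._by_id[atom] = name
        return atom

    def name(self, atom):
        """Return the string associated with an atom."""
        return self._by_id[atom]
```

Seeding the table with a few predefined atoms, for example AtomTable([(1, "PRIMARY"), (31, "STRING"), (39, "WM_NAME")]), shows the two cases: interning "WM_NAME" returns the existing atom 39, while interning a new name such as the hypothetical "_MY_APP_STATE" returns a fresh identifier that stays stable on later lookups.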

Atoms are used for a number of purposes, mostly related to communication between different clients connected to the same server. In particular, they are used in association with the properties of windows, which are described below.

The list of all atoms residing in a server can be printed out using the program xlsatoms. In particular, this program prints each atom (the identifier, that is, a number) with its name (its associated string).

Attributes and properties

Every window has a predefined set of attributes and a set of properties, all stored in the server and accessible to the clients via appropriate requests. Attributes are data about the window, such as its size, position, background color, etc. Properties are pieces of data that are attached to a window. Contrary to attributes, properties have no meaning at the level of the X protocol. A client can store arbitrary data in a property of a window.

A property is characterized by a name, a type, and a value. Properties are similar to variables in imperative programming languages, in that the application can create a new property with a given name and of a given type and store a value in it. Properties are associated with windows: two properties with the same name can exist on two different windows while having different types and values.

The name and type of a property are atoms, that is, strings stored in the server and accessible to the clients via identifiers. The value is a sequence of 8-bit, 16-bit, or 32-bit items whose interpretation is left to the clients. A client application can access a given property by using the identifier of the atom containing the name of the property.
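On the wire, a client stores a property by naming the window, the atom for the property name, and the atom for the type, followed by the value bytes. The sketch below assumes the core protocol's ChangeProperty request layout (major opcode 18, mode 0 for Replace, lengths counted in 4-byte units) and handles only values made of 8-bit items. The function name and the identifiers in the usage note are illustrative.

```python
import struct

def encode_change_property(window, prop_atom, type_atom, data, byte_order="<"):
    """Encode a ChangeProperty request replacing an 8-bit-format value."""
    pad = -len(data) % 4                         # pad value to 4-byte boundary
    length_units = 6 + (len(data) + pad) // 4    # whole request, 4-byte units
    header = struct.pack(
        byte_order + "BBHIIIB3xI",
        18,             # major opcode: ChangeProperty
        0,              # mode 0 = Replace any existing value
        length_units,
        window,         # identifier of the window holding the property
        prop_atom,      # atom naming the property
        type_atom,      # atom naming the type
        8,              # format: the value is a list of 8-bit items
        len(data),      # number of items in the value
    )
    return header + data + b"\x00" * pad
```

For example, encode_change_property(0x8200000, 39, 31, b"xterm") produces a 32-byte request that would set the property named by atom 39 (WM_NAME among the predefined atoms) to the STRING value "xterm" on the window 0x8200000.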

Properties are mostly used for interclient communication. For example, the property named WM_NAME (more precisely, the property named by the atom whose associated string is WM_NAME) is used for storing the name for the window; window managers typically read this property and display the name of the window at the top of it.

Some types of inter-client communication use properties of the root window. For example, the freedesktop window manager specification states that the window manager should store the identifier of the currently active window in the property named by the atom _NET_ACTIVE_WINDOW of the root window. The X resources, which contain parameters of programs, are also stored in properties of the root window to allow access from programs running on different computers.

The xprop program prints the properties of a given window. For example, xprop -root prints the name, type, and value of each property of the root window.

Events

Events are packets sent by the server to the client to communicate that something the client may be interested in has happened. A client can also request the server to send an event to another client; this is used for communication between clients. Such an event is generated, for example, when a client requests the text that is currently selected: this request is sent to the client that is currently handling the window that holds the selection.

The content of a window may be destroyed in some conditions (for example, if the window is covered). Whenever this happens and the destroyed content is visible or made visible, the server generates an Expose event to notify the client that a part of the window has to be redrawn.

Other events are used to notify clients of keyboard or mouse input, of the creation of new windows, etc.

Some kinds of events are always sent to clients, but most kinds are sent only if the client has previously stated an interest in them, since a client may care about only some kinds of events. For example, a client may be interested in keyboard-related events but not in mouse-related events.

Clients specify which kinds of events they want to receive by setting an attribute of a window. For example, in order to redraw a window when its content has been destroyed, a client must receive the Expose events that tell it that the window has to be drawn again. For the server to send such events, the client has to state its interest in them by appropriately setting the event mask attribute of the window.

Different clients can set different event masks on the same window, so that a client may request only keyboard events on a window while another client requests only mouse events on the same window. This is possible because the server, for each window, maintains a separate event mask for each client. However, there are some kinds of events that can only be selected by one client at a time for each window.
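The per-client event masks can be modelled as follows. This is a toy sketch of the bookkeeping, not server code: the class and method names are invented, while the bit positions follow the core protocol's event mask definition. It assumes, as the core protocol specifies, that ButtonPress, ResizeRedirect, and SubstructureRedirect are the kinds that only one client at a time may select on a window.

```python
KEY_PRESS = 1 << 0
BUTTON_PRESS = 1 << 2
EXPOSURE = 1 << 15
STRUCTURE_NOTIFY = 1 << 17
RESIZE_REDIRECT = 1 << 18
SUBSTRUCTURE_REDIRECT = 1 << 20

# Event kinds that a single client at a time may select on a given window.
EXCLUSIVE = BUTTON_PRESS | RESIZE_REDIRECT | SUBSTRUCTURE_REDIRECT

class WindowEventMasks:
    """Toy model of the event masks the server keeps for one window."""

    def __init__(self):
        self._masks = {}    # client -> event mask selected by that client

    def select(self, client, mask):
        """Record the mask a client sets on this window."""
        others = 0
        for other, other_mask in self._masks.items():
            if other != client:
                others |= other_mask
        if mask & others & EXCLUSIVE:
            # A real server would answer with an Access error packet.
            raise PermissionError("event kind already selected by another client")
        self._masks[client] = mask

    def recipients(self, event_bit):
        """Return the clients that asked for this kind of event."""
        return sorted(c for c, m in self._masks.items() if m & event_bit)
```

If client "a" selects KEY_PRESS | EXPOSURE and client "b" selects BUTTON_PRESS on the same window, key presses go only to "a" and button presses only to "b"; a later attempt by "a" to add BUTTON_PRESS is rejected while "b" still holds it.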

The xev program shows the events relative to a window. In particular, xev -id WID requests all possible events relative to the window of identifier WID. All received events are then printed.

Example

The following is an example of interaction that takes place when a program creates a window and draws a black box in it.

The client opens the connection with the server and sends the initial packet specifying the byte order it is using.
The server accepts the connection by sending an appropriate packet, which contains other information such as the identifier of the root window and the set of identifiers the client can create.
The client requests the server to create a top-level window (a window that is a child of the root window) with identifier 0x8200000, size 200x200, position (10,10), etc.
The server does not reply, as the above request does not require a reply (in case of error, the server sends an error packet).
The client requests a change in the attributes of the window to specify it is interested in receiving MapNotify, Expose, and KeyPress events.
The server does not reply (not required)
The client requests the window to be mapped, that is, shown on the screen
The server does not reply (not required)
When the window is actually mapped, the server sends the client a MapNotify event
The server sends the client an Expose event to tell it that the window has to be drawn
The client requests a box to be drawn by sending a PolyFillRectangle request
The server does not reply (not required)

If the window is covered by another window and uncovered again:

The server sends another Expose event to tell the client that the window has to be drawn again
The client redraws the window by sending a PolyFillRectangle request
The server does not reply (not required)

If a key is pressed:

The server sends a KeyPress event to the client to notify it that the user has pressed a key
The client reacts appropriately
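Three of the requests in this exchange can be encoded as bytes in a few lines each. This is a sketch assuming the core protocol's request layouts: ChangeWindowAttributes is opcode 2 with the event mask selected by value-mask bit 11, MapWindow is opcode 8, and PolyFillRectangle is opcode 70. The function names are invented, and the graphic context identifier used in the usage note is hypothetical, since the example above does not show its creation.

```python
import struct

KEY_PRESS, EXPOSURE, STRUCTURE_NOTIFY = 1 << 0, 1 << 15, 1 << 17
CW_EVENT_MASK = 1 << 11    # value-mask bit selecting the event-mask attribute

def select_events(window, event_mask, bo="<"):
    """ChangeWindowAttributes carrying only a new event mask (4 units long)."""
    return struct.pack(bo + "BxHIII", 2, 4, window, CW_EVENT_MASK, event_mask)

def map_window(window, bo="<"):
    """MapWindow: ask the server to show the window (2 units long)."""
    return struct.pack(bo + "BxHI", 8, 2, window)

def poly_fill_rectangle(drawable, gc, rects, bo="<"):
    """PolyFillRectangle: fill each (x, y, width, height) rectangle."""
    body = b"".join(struct.pack(bo + "hhHH", *rect) for rect in rects)
    header = struct.pack(bo + "BxHII", 70, 3 + 2 * len(rects), drawable, gc)
    return header + body
```

With the window 0x8200000 from the example and a hypothetical graphic context 0x8200001, the client would write select_events(0x8200000, KEY_PRESS | EXPOSURE | STRUCTURE_NOTIFY), then map_window(0x8200000), and, after receiving the Expose event, poly_fill_rectangle(0x8200000, 0x8200001, [(20, 20, 10, 10)]). MapNotify events are delivered to clients that select STRUCTURE_NOTIFY. None of these requests produces a reply.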

Xlib and other client libraries

Most client programs communicate with the server via the Xlib client library. In particular, most clients use libraries such as Xaw, Motif, GTK+, or Qt, which in turn use Xlib for interacting with the server. The use of Xlib has the following effects:

Xlib makes the client synchronous with respect to replies and events:
1. the Xlib functions that send requests block until the appropriate replies, if any are expected, are received; in other words, an X11 client not using Xlib can send a request to the server and then do other work while waiting for the reply, but a client using Xlib can only call an Xlib function that sends the request and waits for the reply, blocking the client in the meantime (unless the client starts a new thread before calling the function);
2. while the server sends events asynchronously, Xlib stores events received by the client in a queue; the client program can only access them by explicitly calling Xlib functions; in other words, the client is blocked or forced to busy-wait if expecting an event.
Xlib does not send requests to the server immediately, but stores them in a queue, called the output buffer; the requests in the output buffer are actually sent when:
1. the program explicitly requests it by calling a library function such as XFlush;
2. the program calls a function whose result requires a reply from the server, such as XGetWindowAttributes;
3. the program asks for an event in the event queue (for example, by calling XNextEvent) and the call blocks (for example, XNextEvent blocks if the queue is empty).
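The three flushing conditions listed above can be illustrated with a toy model. This is not Xlib and makes no claim about its internals: the class BufferedConnection, its methods, and the list standing in for the network channel are all invented for the sketch, and the reply and event values are placeholders.

```python
class BufferedConnection:
    """Toy model of request buffering; `wire` stands in for the channel."""

    def __init__(self):
        self.output_buffer = []   # requests queued but not yet sent
        self.wire = []            # requests the server has actually received
        self.event_queue = []     # events received but not yet read

    def send(self, request):
        """Queue a request that needs no reply (like most drawing calls)."""
        self.output_buffer.append(request)

    def flush(self):
        """Condition 1: the program explicitly asks for the buffer to be sent."""
        self.wire.extend(self.output_buffer)
        self.output_buffer.clear()

    def round_trip(self, request):
        """Condition 2: a request needing a reply forces the buffer out."""
        self.send(request)
        self.flush()
        return "reply to " + request      # placeholder for the real reply

    def next_event(self):
        """Condition 3: reading from an empty event queue flushes first."""
        if not self.event_queue:
            self.flush()
            # A real client would block here; the model fakes an arrival.
            self.event_queue.append("Expose")
        return self.event_queue.pop(0)
```

In this model, a call to send("MapWindow") leaves wire empty until flush(), round_trip(...), or a blocking next_event() is called. The same pattern explains why a program that draws and then never flushes or reads events may show nothing on the screen.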

Higher-level libraries such as Xt (which is in turn used by Xaw and Motif) allow the client program to specify the callback functions associated with some events; the library takes care of polling the event queue and calling the appropriate function when needed. Some events, such as those indicating the need to redraw a window, are handled internally by Xt.

