
Reputation: 129

Opening sockets to the Xserver directly

I'm trying to understand how Linux desktop environments work with the X server. I've read that most window managers don't open sockets directly; instead they use Xlib bindings for whichever language the WM is written in, or the XCB bindings. What are the advantages of opening a socket directly to the X server?

Upvotes: 6

Views: 3623

Answers (3)

Andrey Sidorov

Reputation: 25456

I've done it with node/javascript. It's not as scary as it seems, and for some particular tasks it might work better than Xlib (non-trivial parallelism, resource dependencies, or indirect GLX), but most of the value is probably educational. If you just want to get some work done, there is probably a solution already available and you don't need to reinvent the wheel.

A good starting point would be the X11 protocol documentation.

Upvotes: 2


Reputation: 1861

I'm not going to tell you the advantages, but I'm going to tell you how to do it.

Some time ago I found an example of someone doing this. I can't find it anymore, but here's more or less their code. It's quite interesting to see what's going on under the hood of client libraries like XCB.

In order to send a request to the X server, you need:

  1. a socket connection to the X server
  2. the opcode of the request
  3. data associated to the request
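To make items 2 and 3 concrete: every core request starts with one 32-bit word that carries the opcode, one request-specific byte, and the total request length counted in 4-byte units. Here is a minimal sketch of packing that word, assuming a little-endian connection (the helper name x11_request_header is made up for illustration and is not part of the code below):

```c
#include <assert.h>
#include <stdint.h>

// Pack the first 32-bit word of a core X11 request (little-endian connection):
// byte 0 is the major opcode, byte 1 is a request-specific detail byte, and
// bytes 2-3 hold the total request length in 4-byte units.
// Hypothetical helper, for illustration only.
static uint32_t x11_request_header(uint8_t opcode, uint8_t detail, uint16_t length_words){
  assert(length_words >= 1);  // the header word itself counts toward the length
  return (uint32_t)opcode | ((uint32_t)detail << 8) | ((uint32_t)length_words << 16);
}
```

For example, MapWindow (opcode 8) is two words long, so its header word would be 0x00020008.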

The API to open a socket, create a graphics context, create a window, and map a window can look something like:

x11_connection_t connection = {0};
x11_init_connection(&connection);  // Open the socket and do the handshake

x11_state_t state = {0};
state.socket_fd   = connection.socket_fd;
state.id_base     = connection.setup->id_base;
state.root_window = connection.root[0].window_id;
state.root_visual = connection.root[0].visual_id;
state.root_depth  = connection.root[0].depth;

state.gcontext = x11_init_gc(&state, X11_GC_GRAPHICS_EXPOSURES, (u32[]){X11_EXPOSURES_NOT_ALLOWED});

state.window = x11_init_window(&state, 0,0, WIDTH,HEIGHT, state.root_window, state.root_visual,
                               X11_CW_BACK_PIXEL | X11_CW_EVENT_MASK,
                               (u32[]){0x000000, X11_EVENT_MASK_EXPOSURE});  // Assumed values: black background, exposure events
x11_map_window(&state, state.window);

But first we need to implement it.

For illustration, here are the first thirteen X protocol opcodes, followed by the two others used below:

#define X11_OPCODE_CREATE_WINDOW                1
#define X11_OPCODE_CHANGE_WINDOW_ATTRIBUTES     2
#define X11_OPCODE_GET_WINDOW_ATTRIBUTES        3
#define X11_OPCODE_DESTROY_WINDOW               4
#define X11_OPCODE_DESTROY_SUBWINDOWS           5
#define X11_OPCODE_CHANGE_SAVE_SET              6
#define X11_OPCODE_REPARENT_WINDOW              7
#define X11_OPCODE_MAP_WINDOW                   8
#define X11_OPCODE_MAP_SUBWINDOWS               9
#define X11_OPCODE_UNMAP_WINDOW                10
#define X11_OPCODE_UNMAP_SUBWINDOWS            11
#define X11_OPCODE_CONFIGURE_WINDOW            12
#define X11_OPCODE_CIRCULATE_WINDOW            13
#define X11_OPCODE_CREATE_GC                   55
#define X11_OPCODE_PUT_IMAGE                   72

You can open the socket with a function like the following (it needs <sys/socket.h>, <sys/un.h>, <string.h>, and <unistd.h>),

int x11_init_socket(){
  int socket_fd                = socket(AF_UNIX, SOCK_STREAM, 0);  // Create the socket!
  struct sockaddr_un serv_addr = {0};
  serv_addr.sun_family         = AF_UNIX;
  strcpy(serv_addr.sun_path, "/tmp/.X11-unix/X0");  // Local socket for display :0
  if(connect(socket_fd, (struct sockaddr*)&serv_addr, sizeof(serv_addr)) < 0)  return -1;  // No X server reachable on display :0
  return socket_fd;
}

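The path above is hard-coded for display :0. As a rough sketch, the socket path could instead be derived from a display string such as the value of the DISPLAY environment variable. The helper below is hypothetical (x11_socket_path is not part of the original code) and only handles local displays of the form ":N" or ":N.S":

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

// Build the Unix-domain socket path for a local display string such as ":0"
// or ":1.0". Returns 0 on success, -1 if the string is not a local display
// or the output buffer is too small. Hypothetical helper, for illustration.
static int x11_socket_path(const char* display, char* out, size_t out_size){
  assert(out != NULL && out_size > 0);
  if(display == NULL || display[0] != ':')  return -1;  // remote displays ("host:0") are not handled
  char* end;
  long number = strtol(display + 1, &end, 10);
  if(end == display + 1 || number < 0)  return -1;      // no display number after the colon
  if(*end != '\0' && *end != '.')  return -1;           // only an optional ".screen" suffix may follow
  int n = snprintf(out, out_size, "/tmp/.X11-unix/X%ld", number);
  return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

You would then copy the resulting path into serv_addr.sun_path instead of the literal string.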
Then you need the handshake,

int x11_init_connection(x11_connection_t* connection){
  connection->socket_fd = x11_init_socket();
  if(connection->socket_fd < 0)  return 0;  // Could not reach the server

  x11_connection_request_t request = {0};
  request.order = 'l';  // Little endian!
  request.major = 11;
  request.minor =  0;  // Version 11.0
  write(connection->socket_fd, &request, sizeof(x11_connection_request_t));             // Send request (no authorization data)
  read(connection->socket_fd, &connection->header, sizeof(x11_connection_reply_t));     // Read reply header
  if(connection->header.success == 0)  return connection->header.success;   // Error handling!

  connection->setup = sbrk(connection->header.length * 4);       // Allocate memory for remainder of data
  read(connection->socket_fd, connection->setup, connection->header.length * 4);  // Read remainder of data

  u8* p = (u8*)connection->setup;  // Byte pointer, so the arithmetic below is standard C
  p += sizeof(x11_connection_setup_t) + ((connection->setup->vendor_length + 3) & ~3);  // Ignore the vendor string, which is padded to a multiple of 4 bytes
  connection->format = (x11_pixmap_format_t*)p;  p += sizeof(x11_pixmap_format_t) * connection->setup->formats;   // Align struct with format sections. Move pointer to end of section
  connection->root   = (x11_root_t*)p;           p += sizeof(x11_root_t);                                         // Align struct with first root section. Its depths follow it directly
  connection->depth  = (x11_depth_t*)p;          p += sizeof(x11_depth_t);                                        // Align depth struct with first depth section. Move pointer to end of section
  connection->visual = (x11_visual_t*)p;  // Align visual with first visual for first depth

  return connection->header.success;
}

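One caveat about the handshake above: read() on a stream socket may return fewer bytes than requested, so a single call is not guaranteed to deliver the whole setup block. A small retry loop along the following lines is a common way to handle that (x11_read_all is a hypothetical name, not something from the original code):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

// Read exactly `size` bytes from `fd`, retrying on short reads and EINTR.
// Returns 0 on success, -1 on error or if the peer closes the stream early.
// Hypothetical helper, for illustration only.
static int x11_read_all(int fd, void* buffer, size_t size){
  assert(buffer != NULL || size == 0);
  unsigned char* p = buffer;
  while(size > 0){
    ssize_t n = read(fd, p, size);
    if(n < 0 && errno == EINTR)  continue;  // interrupted by a signal: try again
    if(n <= 0)  return -1;                  // error, or end of stream before `size` bytes arrived
    p    += n;
    size -= (size_t)n;
  }
  return 0;
}
```

The two read() calls in x11_init_connection could be swapped for this helper without changing anything else.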
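and a bunch of data structures that the X server likes (u8, u16, and u32 are the unsigned fixed-width integer types, e.g. uint8_t, uint16_t, and uint32_t from <stdint.h>):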

typedef struct{
  u8  order;
  u8  pad1;
  u16 major, minor;
  u16 auth_proto, auth_data;
  u16 pad2;
}x11_connection_request_t;

typedef struct{
  u8  success;
  u8  pad1;
  u16 major, minor;
  u16 length;
}x11_connection_reply_t;

typedef struct{
  u32 release;
  u32 id_base, id_mask;
  u32 motion_buffer_size;
  u16 vendor_length;
  u16 request_max;
  u8  roots;
  u8  formats;
  u8  image_order;
  u8  bitmap_order;
  u8  scanline_unit, scanline_pad;
  u8  keycode_min, keycode_max;
  u32 pad;
}x11_connection_setup_t;

typedef struct{
  u8  depth;
  u8  bpp;
  u8  scanline_pad;
  u8  pad1;
  u32 pad2;
}x11_pixmap_format_t;

typedef struct{
  u32 window_id;
  u32 colormap;
  u32 white, black;
  u32 input_mask;   
  u16 width, height;
  u16 width_mm, height_mm;
  u16 maps_min, maps_max;
  u32 visual_id;
  u8  backing_store;
  u8  save_unders;
  u8  depth;
  u8  depths;
}x11_root_t;

typedef struct{
  u8  depth;
  u8  pad1;
  u16 visuals;
  u32 pad2;
}x11_depth_t;

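typedef struct{
  u32 id;     // Visual id: on the wire each 24-byte visual entry starts with it
  u8  group;  // The visual class
  u8  bits;
  u16 colormap_entries;
  u32 mask_red, mask_green, mask_blue;
  u32 pad;
}x11_visual_t;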

typedef struct{
  u8  response_type;  /**< Type of the response */
  u8  pad0;           /**< Padding */
  u16 sequence;       /**< Sequence number */
  u32 pad[7];         /**< Padding */
  u32 full_sequence;  /**< full sequence */
}x11_generic_event_t;  // Name assumed

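typedef struct{
  u8  success;   // Always 0 for an error
  u8  code;
  u16 seq;
  u32 id;
  u16 op_minor;  // On the wire the 16-bit minor opcode comes before the 8-bit major opcode
  u8  op_major;
  u8  pad[21];
}x11_generic_error_t;  // Name assumed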

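The two structs just above describe what comes back from the server: every error and event is 32 bytes long, and the first byte says which kind of message it is. A minimal sketch of that dispatch, with made-up names (x11_response_kind_t, x11_classify_response, x11_event_code):

```c
#include <stdint.h>

// Hypothetical names, for illustration only.
typedef enum{
  X11_RESPONSE_ERROR,  // first byte is 0
  X11_RESPONSE_REPLY,  // first byte is 1
  X11_RESPONSE_EVENT   // first byte is 2 or higher
}x11_response_kind_t;

// Decide what kind of message the server sent from its first byte.
static x11_response_kind_t x11_classify_response(uint8_t first_byte){
  if(first_byte == 0)  return X11_RESPONSE_ERROR;
  if(first_byte == 1)  return X11_RESPONSE_REPLY;
  return X11_RESPONSE_EVENT;
}

// The event code lives in the low 7 bits; the top bit is set when the event
// was generated by another client through a SendEvent request.
static uint8_t x11_event_code(uint8_t first_byte){
  return first_byte & 0x7f;
}
```

A read loop would fetch 32 bytes, classify the first one, and for replies read any extra data announced in the reply's length field.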
typedef struct{
  int socket_fd;
  x11_connection_reply_t  header;
  x11_connection_setup_t* setup;
  x11_pixmap_format_t*    format;
  x11_root_t*             root;
  x11_depth_t*            depth;
  x11_visual_t*           visual;
}x11_connection_t;

typedef struct{
  int  socket_fd;
  u32  id_base;  // We'll use this to generate 32-bit identifiers!

  u32  root_window;
  u32  root_visual;
  u8   root_depth;

  u32  window;
  u32  gcontext;

  u16  tile_width, tile_height;  // Size in pixels of the images sent by x11_put_img
}x11_state_t;

You also need a function to generate 32-bit identifiers:

u32 x11_generate_id(x11_state_t* state){
  static u32 id = 0;
  return (id++ | state->id_base);  // Takes the state, since that is what the callers below pass in
}

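The generator above ignores the id_mask field from the setup block. The server hands each client a base and a mask, and the client is supposed to vary only the bits inside the mask. A sketch of an allocator that respects this might look as follows (x11_id_allocator_t and x11_allocate_id are hypothetical names, and returning 0 on exhaustion is my own convention):

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical allocator state: `base` and `mask` come from the setup block,
// `next` is the next value to OR into the base.
typedef struct{
  uint32_t base, mask, next;
}x11_id_allocator_t;

// Return a fresh resource id, or 0 (the null resource) when the range is used up.
static uint32_t x11_allocate_id(x11_id_allocator_t* a){
  assert(a->mask != 0);
  uint32_t step = a->mask & (~a->mask + 1);  // lowest set bit of the mask
  if(a->next > a->mask)  return 0;           // every id in the range has been handed out
  uint32_t id = a->base | a->next;
  a->next += step;
  return id;
}
```

With a mask whose lowest set bit is bit 0 this behaves like the simple counter above; the difference only shows up when the mask does not start at bit 0 or when the range runs out.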
Finally, you need the actual functions that assemble the packets with the requests. The following are helper functions to send X11_OPCODE_CREATE_GC, X11_OPCODE_CREATE_WINDOW, X11_OPCODE_MAP_WINDOW, and X11_OPCODE_PUT_IMAGE.

#define X11_NONE             0x00000000L  // X11_NONE is the universal null resource or null atom parameter value for many core X requests
#define X11_COPY_FROM_PARENT 0x00000000L  // X11_COPY_FROM_PARENT can be used for many xcb_create_window parameters
#define X11_CURRENT_TIME     0x00000000L  // X11_CURRENT_TIME can be used in most requests that take an xcb_timestamp_t
#define X11_NO_SYMBOL        0x00000000L  // X11_NO_SYMBOL fills in unused entries in xcb_keysym_t tables

enum x11_exposures_t{
  X11_EXPOSURES_NOT_ALLOWED = 0,
  X11_EXPOSURES_ALLOWED     = 1,
  X11_EXPOSURES_DEFAULT     = 2
};

enum x11_gc_t{
  X11_GC_FUNCTION              = 1<<0,
  X11_GC_PLANE_MASK            = 1<<1,
  X11_GC_FOREGROUND            = 1<<2,
  X11_GC_BACKGROUND            = 1<<3,
  X11_GC_LINE_WIDTH            = 1<<4,
  X11_GC_LINE_STYLE            = 1<<5,
  X11_GC_CAP_STYLE             = 1<<6,
  X11_GC_JOIN_STYLE            = 1<<7,
  X11_GC_FILL_STYLE            = 1<<8,
  X11_GC_FILL_RULE             = 1<<9,
  X11_GC_TILE                  = 1<<10,
  X11_GC_STIPPLE               = 1<<11,
  X11_GC_TILE_STIPPLE_ORIGIN_X = 1<<12,
  X11_GC_TILE_STIPPLE_ORIGIN_Y = 1<<13,
  X11_GC_FONT                  = 1<<14,
  X11_GC_SUBWINDOW_MODE        = 1<<15,
  X11_GC_GRAPHICS_EXPOSURES    = 1<<16,
  X11_GC_CLIP_ORIGIN_X         = 1<<17,
  X11_GC_CLIP_ORIGIN_Y         = 1<<18,
  X11_GC_CLIP_MASK             = 1<<19,
  X11_GC_DASH_OFFSET           = 1<<20,
  X11_GC_DASH_LIST             = 1<<21,
  X11_GC_ARC_MODE              = 1<<22
};

u32 x11_init_gc(x11_state_t* state, u32 value_mask, u32* value_list){
  u32 gcontext_id = x11_generate_id(state);

  u16 flag_count = popcnt(value_mask);  // Number of set bits in the mask (e.g. __builtin_popcount on GCC/Clang): one value per bit
  u16 length     = 4 + flag_count;
  u32* packet    = sbrk(length * 4);

  packet[0] = X11_OPCODE_CREATE_GC | (length<<16);  // The first `u32` in the packet is always the opcode and the length of the packet!
  packet[1] = gcontext_id;
  packet[2] = state->root_window;  // The drawable the graphics context is created for
  packet[3] = value_mask;
  for(int i=0; i < flag_count; ++i)
    packet[4 + i] = value_list[i];  // Values must be ordered by ascending mask bit

  write(state->socket_fd, packet, length * 4);
  sbrk(-(length * 4));

  return gcontext_id;
}

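Note that the value list in a request like this has to follow the order of the bits in the mask, lowest bit first, and the number of values must equal the number of set bits. One way to avoid getting that wrong is to keep the values in a 32-entry table indexed by bit number and pack them just before sending. A sketch under that assumption (x11_pack_values is a hypothetical helper):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Copy the values selected by `value_mask` from a table indexed by bit number
// into `out`, in ascending bit order. Returns how many values were written
// (equal to the number of set bits). Hypothetical helper, for illustration.
static size_t x11_pack_values(uint32_t value_mask, const uint32_t values_by_bit[32], uint32_t* out){
  assert(values_by_bit != NULL && out != NULL);
  size_t count = 0;
  for(int bit = 0; bit < 32; ++bit)
    if(value_mask & (UINT32_C(1) << bit))
      out[count++] = values_by_bit[bit];
  return count;
}
```

The returned count plays the role of flag_count above, so the same loop also replaces the popcnt call.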
enum x11_cw_t{
  X11_CW_BACK_PIXMAP       = 1<<0,
  X11_CW_BACK_PIXEL        = 1<<1,
  X11_CW_BORDER_PIXMAP     = 1<<2,
  X11_CW_BORDER_PIXEL      = 1<<3,
  X11_CW_BIT_GRAVITY       = 1<<4,
  X11_CW_WIN_GRAVITY       = 1<<5,
  X11_CW_BACKING_STORE     = 1<<6,
  X11_CW_BACKING_PLANES    = 1<<7,
  X11_CW_BACKING_PIXEL     = 1<<8,
  X11_CW_OVERRIDE_REDIRECT = 1<<9,
  X11_CW_SAVE_UNDER        = 1<<10,
  X11_CW_EVENT_MASK        = 1<<11,
  X11_CW_DONT_PROPAGATE    = 1<<12,
  X11_CW_COLORMAP          = 1<<13,
  X11_CW_CURSOR            = 1<<14
};

enum x11_event_mask_t{
  X11_EVENT_MASK_NO_EVENT               = 0,
  X11_EVENT_MASK_KEY_PRESS              = 1<<0,   // 0x00000001
  X11_EVENT_MASK_KEY_RELEASE            = 1<<1,   // 0x00000002
  X11_EVENT_MASK_BUTTON_PRESS           = 1<<2,   // 0x00000004
  X11_EVENT_MASK_BUTTON_RELEASE         = 1<<3,   // 0x00000008
  X11_EVENT_MASK_ENTER_WINDOW           = 1<<4,   // 0x00000010  
  X11_EVENT_MASK_LEAVE_WINDOW           = 1<<5,   // 0x00000020  
  X11_EVENT_MASK_POINTER_MOTION         = 1<<6,   // 0x00000040  
  X11_EVENT_MASK_POINTER_MOTION_HINT    = 1<<7,   // 0x00000080  
  X11_EVENT_MASK_BUTTON_1_MOTION        = 1<<8,   // 0x00000100  
  X11_EVENT_MASK_BUTTON_2_MOTION        = 1<<9,   // 0x00000200  
  X11_EVENT_MASK_BUTTON_3_MOTION        = 1<<10,  // 0x00000400  
  X11_EVENT_MASK_BUTTON_4_MOTION        = 1<<11,  // 0x00000800  
  X11_EVENT_MASK_BUTTON_5_MOTION        = 1<<12,  // 0x00001000  
  X11_EVENT_MASK_BUTTON_MOTION          = 1<<13,  // 0x00002000  
  X11_EVENT_MASK_KEYMAP_STATE           = 1<<14,  // 0x00004000  
  X11_EVENT_MASK_EXPOSURE               = 1<<15,  // 0x00008000  
  X11_EVENT_MASK_VISIBILITY_CHANGE      = 1<<16,  // 0x00010000  
  X11_EVENT_MASK_STRUCTURE_NOTIFY       = 1<<17,  // 0x00020000  
  X11_EVENT_MASK_RESIZE_REDIRECT        = 1<<18,  // 0x00040000  
  X11_EVENT_MASK_SUBSTRUCTURE_NOTIFY    = 1<<19,  // 0x00080000  
  X11_EVENT_MASK_SUBSTRUCTURE_REDIRECT  = 1<<20,  // 0x00100000  
  X11_EVENT_MASK_FOCUS_CHANGE           = 1<<21,  // 0x00200000  
  X11_EVENT_MASK_PROPERTY_CHANGE        = 1<<22,  // 0x00400000  
  X11_EVENT_MASK_COLOR_MAP_CHANGE       = 1<<23,  // 0x00800000  
  X11_EVENT_MASK_OWNER_GRAB_BUTTON      = 1<<24   // 0x01000000
};

#define X11_DEFAULT_BORDER 0
#define X11_DEFAULT_GROUP  0

u32 x11_init_window(x11_state_t* state, u16 x, u16 y, u16 w, u16 h, u32 window_parent, u32 visual, u32 value_mask, u32* value_list){
  u32 window_id = x11_generate_id(state);

  u16 flag_count = popcnt(value_mask);  // Number of set bits in the mask (e.g. __builtin_popcount on GCC/Clang): one value per bit
  u16 length     = 8 + flag_count;
  u32* packet    = sbrk(length * 4);

  packet[0] = X11_OPCODE_CREATE_WINDOW | length<<16;  // The first `u32` in the packet is always the opcode and the length of the packet! Byte 1 is the depth: 0 copies it from the parent
  packet[1] = window_id;
  packet[2] = window_parent;
  packet[3] = x | ((u32)y<<16);
  packet[4] = w | ((u32)h<<16);
  packet[5] = X11_DEFAULT_BORDER | (X11_DEFAULT_GROUP<<16);  // Border width in the low 16 bits, window class in the high 16 bits
  packet[6] = visual;
  packet[7] = value_mask;
  for(int i=0;i < flag_count; ++i)
    packet[8 + i] = value_list[i];  // Values must be ordered by ascending mask bit

  write(state->socket_fd, packet, length * 4);
  sbrk(-(length * 4));

  return window_id;
}

void x11_map_window(x11_state_t* state, u32 window_id){
  int const length = 2;
  u32 packet[length];
  packet[0] = X11_OPCODE_MAP_WINDOW | (length<<16);  // The first `u32` in the packet is always the opcode and the length of the packet!
  packet[1] = window_id;
  write(state->socket_fd, packet, 8);
}

enum x11_image_format_t{
  X11_IMAGE_FORMAT_XY_BITMAP = 0,
  X11_IMAGE_FORMAT_XY_PIXMAP = 1,
  X11_IMAGE_FORMAT_Z_PIXMAP  = 2
};

void x11_put_img(x11_state_t* state, u16 x, u16 y, u32* data){
  u32 packet[6];
  u16 const length = ((state->tile_width * state->tile_height)) + 6;  // In 4-byte units: one u32 per pixel plus the 6-word header (must fit in 16 bits)

  packet[0] = X11_OPCODE_PUT_IMAGE | (X11_IMAGE_FORMAT_Z_PIXMAP<<8) | ((u32)length<<16);  // The first `u32` in the packet is always the opcode and the length of the packet!
  packet[1] = state->window;
  packet[2] = state->gcontext;
  packet[3] = state->tile_width | ((u32)state->tile_height<<16);
  packet[4] = x | ((u32)y<<16);
  packet[5] = state->root_depth<<8;  // Byte 0 is the left pad (0), byte 1 is the depth

  write(state->socket_fd, packet, 24);
  write(state->socket_fd, data, state->tile_width * state->tile_height * 4);
}

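One thing to watch in x11_put_img: the request length is a 16-bit count of 4-byte words, and the server advertises its own limit in the request_max field of the setup block, so a large image has to be sent as several strips of rows. Here is a rough sketch of working out how many rows fit in one request, assuming 32 bits per pixel as above (x11_put_image_max_rows is a hypothetical helper):

```c
#include <stdint.h>

// Number of whole rows of a `width`-pixel image (one 32-bit word per pixel)
// that fit in a single PutImage request, given the maximum request length in
// 4-byte words (e.g. the request_max field of the setup block).
// Hypothetical helper, for illustration only.
static uint16_t x11_put_image_max_rows(uint16_t request_max_words, uint16_t width){
  const uint32_t header_words = 6;  // the fixed part of a PutImage request
  if(width == 0 || request_max_words <= header_words)  return 0;
  return (uint16_t)((request_max_words - header_words) / width);
}
```

For a 256-pixel-wide image and a 65535-word limit this gives 255 rows per request, so a 256x256 tile would already need two requests.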
Upvotes: 6

Jo So

Reputation: 26501

The question should rather be "What are the advantages of using Xlib instead of a graphical toolkit like gtk"?

Even to master Xlib, you have to spend months or years! It's so intricate that almost every app that has graphical content to display uses a toolkit instead.

Using a plain socket means speaking the X11 protocol directly, which means you would end up recreating Xlib (no, you wouldn't finish!).

As an analogy: plain socket ~ machine code, Xlib ~ assembly code, toolkit ~ higher-level language (notwithstanding that Xlib does a little maintenance for you, but I guess that's rather negligible).

Upvotes: 0
