Trying to address one of my issues with Matrix

2025-01-08 - Matrix is polling-based; here is one idea for changing that.

I have my gripes with Matrix and its design choices. Since it became an accepted messaging standard, I want to point out the parts I don’t agree with and maybe come up with a better solution. The web- and (long) polling-based nature of the protocol is one of these points.

How does Matrix work?

Note: this only very briefly covers the Client-Server interactions.

My Matrix client (nheko) opens a long-lived TCP connection and then enters an HTTP/2 stream. From that point on, the client long-polls /_matrix/client/v3/sync to get new messages and presence information. A long-poll in this case is a GET request that lingers until the server returns data. Once data has been received, a new request is created.
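As a minimal sketch of that loop (not nheko’s actual code): the `get` callable below is a hypothetical stand-in for an HTTP client, the `since` and `timeout` query parameters and the `next_batch` response field follow the sync endpoint as I understand the spec, and the 30-second timeout is an arbitrary choice.

```python
from typing import Callable, Iterator, Optional

SYNC_PATH = "/_matrix/client/v3/sync"


def long_poll(get: Callable[[str, dict], dict], rounds: int) -> Iterator[dict]:
    """Yield one sync response per completed long-poll request.

    `get(path, params)` is a hypothetical blocking HTTP GET that returns
    the decoded JSON body once the server has data (or the timeout hits).
    """
    since: Optional[str] = None
    for _ in range(rounds):
        params: dict = {"timeout": 30000}  # milliseconds, arbitrary value
        if since is not None:
            params["since"] = since  # resume from the previous batch
        response = get(SYNC_PATH, params)  # lingers until data arrives
        since = response["next_batch"]  # token for the next request
        yield response
```

Every round trip is a fresh request, which is the part a push-based transport would avoid.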

Coming from XMPP, where the server pushes data to the client over a single long-lived TCP session, this choice seems very odd to me. I wondered if there is a better solution and came up with a rough idea.

This idea has precedent inside the Matrix spec: “More efficient transports may be specified in future as optional extensions.” Ref. https://spec.matrix.org/unstable/client-server-api/#api-standards.

How to separate namespaces and transports

Every API endpoint in Matrix has a URL path associated with it, hereafter referred to as “namespace”. For example, to go back to the sync example, it is specified as:

GET /_matrix/client/v3/sync 

Ref. https://spec.matrix.org/v1.13/client-server-api/#get_matrixclientv3sync

Currently, namespaces are passed in the URL path, like in a REST API. Over a non-HTTP-based transport there is no URL path, so with the current JSON schema a status update, for example, can’t be associated with a namespace. All JSON objects would need to be extended with an additional field which holds the namespace.
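A rough sketch of that extension, assuming the namespace is simply the URL path with the `/_matrix/` prefix removed and is stored under an `@ns` key (the helper names are made up for illustration):

```python
import json

MATRIX_PREFIX = "/_matrix/"


def namespace_from_path(path: str) -> str:
    """Turn an HTTP path into a transport-independent namespace."""
    if not path.startswith(MATRIX_PREFIX):
        raise ValueError("not a Matrix API path: " + path)
    return path[len(MATRIX_PREFIX):]


def with_namespace(path: str, body: dict) -> str:
    """Serialize `body` with an added "@ns" field (assumed key name)."""
    if "@ns" in body:
        raise ValueError('body already contains an "@ns" field')
    envelope = {"@ns": namespace_from_path(path)}
    envelope.update(body)  # keep all original fields untouched
    return json.dumps(envelope)
```

For instance, `with_namespace("/_matrix/client/v3/sync", body)` would produce an object whose `@ns` is `client/v3/sync`.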

A new way of requesting information is also needed if HTTP is not used. Without HTTP’s request/response pairing, each response has to be matched to its request some other way. This could be accomplished with a message ID.
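One way a client could track this, as a sketch only: the `PendingRequests` class is hypothetical, and using a simple counter for the `@id` value is an assumption, since how IDs are chosen is left open here.

```python
import itertools
from typing import Optional


class PendingRequests:
    """Remember outstanding requests so responses can be matched by "@id"."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)  # assumed ID scheme: a counter
        self._pending: dict = {}

    def new_request(self, namespace: str) -> dict:
        """Build a request object and record it as awaiting a response."""
        message_id = str(next(self._ids))
        self._pending[message_id] = namespace
        return {"@ns": namespace, "@id": message_id}

    def match(self, message: dict) -> Optional[str]:
        """Return the namespace of the request this message answers.

        Returns None for messages that answer no pending request,
        such as data the server pushed on its own.
        """
        return self._pending.pop(message.get("@id"), None)
```

A message whose `@id` is unknown would then be treated as server-initiated rather than as a response.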

For example, a sync sent by the server over a TCP transport could look like this:

Client <<< Server

{
  "@ns": "client/v3/sync",
  "@id": "1234567890",
  [...]
}

Extended example JSON from the Matrix specs:
{
  "@ns": "client/v3/sync",
  "@id": "1234567890",
  "chunk": [
    {
      "content": {
        "body": "This is an example text message",
        "format": "org.matrix.custom.html",
        "formatted_body": "<b>This is an example text message</b>",
        "msgtype": "m.text"
      },
      "event_id": "$143273582443PhrSn:example.org",
      "origin_server_ts": 1432735824653,
      "room_id": "!jEsUZKDJdhlrceRyVU:example.org",
      "sender": "@example:example.org",
      "type": "m.room.message",
      "unsigned": {
        "age": 1234,
        "membership": "join"
      }
    }
  ],
  "end": "s3457_9_0",
  "start": "s3456_9_0"
}

Ref. https://spec.matrix.org/v1.13/client-server-api/#get_matrixclientv3sync

And a request and response for joined members:

Client >>> Server

{
  "@ns": "client/v3/rooms/!jEsUZKDJdhlrceRyVU:example.org/joined_members",
  "@id": "1234567890"
}

Client <<< Server

{
  "@ns": "client/v3/rooms/!jEsUZKDJdhlrceRyVU:example.org/joined_members",
  "@id": "1234567890",
  [...]
}

Extended example JSON from the Matrix specs:

Client >>> Server

{
  "@ns": "client/v3/rooms/!jEsUZKDJdhlrceRyVU:example.org/joined_members",
  "@id": "1234567890"
}

Client <<< Server

{
  "@ns": "client/v3/rooms/!jEsUZKDJdhlrceRyVU:example.org/joined_members",
  "@id": "1234567890",
  "joined": {
    "@bar:example.com": {
      "avatar_url": "mxc://riot.ovh/printErCATzZijQsSDWorRaK",
      "display_name": "Bar"
    }
  }
}

Ref. https://spec.matrix.org/v1.13/client-server-api/#get_matrixclientv3roomsroomidjoined_members

Note that none of this is perfect. This is just a rough idea I had one evening while looking at the Client-Server interactions, and then wrote down. It is missing server responses for, e.g., duplicate message IDs, as well as error handling and how IDs are even chosen. Further, there are probably some objects and interactions that can’t easily be mapped to this process, for example because they depend on HTTP headers. I would be interested in turning this into an actual MSC, although I am missing the time and motivation to write stub client and server implementations.

Huge thanks to hcsch, HarHarLinks, jn and Domi for proofreading