simple, connectionless chat system – design draft

i intend to design a chat system. my goals:

  1. simple protocol spec
  2. no reliance on persistent connection state
  3. the protocol itself should not give an illusion of privacy wherever it is not possible to enable actual privacy (view permissions are for convenience not for privacy)
  4. decent, non-painful basic experience especially in terms of talking to people
  5. flexible enough to allow client implementations to implement all sorts of useful features (without requiring any modification or additional safeguarding in the server)

since web technologies exist, i will use them. the interface will therefore be a RESTful one – i think it’ll let me easily implement my intuitions about how this should work (and, even better, guide my intuitions to be in line with whatever is already known to work).

a good REST interface, it is said, is based on collections of objects. here are some (key: bold means server fills field; plain means client provided, [bracketed] means client optional, (parens) means constraint or other info, <angled> means field takes sub-request, italics means only visible to a certain user):

  • users (who post topics); each has
    • an id (unique)
    • a handle (unique)
    • an info object (an arbitrary json object within a network-set size limit)
    • an authentication thing (you know what i mean)
    • <followed users (list of user refs)>
    • <subscribed tags>
    • <allowed followers (list of user refs)>
    • allow followers automatically (default yes)
    • <blocked users (list of user refs – these ones cannot send requests to this user)>
    • <in topics (list of topic refs)>
  • topics (which contain messages); each topic has
    • an id (unique)
    • an original poster (a user ref)
    • a title (size limited string)
    • tags (set of strings)
    • public (boolean – yes: events will propagate out, anyone can join. no: only participants will see events, joins only by approval or by invitation)
    • <participants (a list of users who will see and post messages in the topic)>
    • <an info object (size limited json)>
    • <messages>; containing
      • an id (unique)
      • a timestamp
      • the poster (a user ref)
      • [a reply-to field (a message ref)]
      • a body object (size limited)
  • events (generated whenever a user interacts with a topic); each has
    • an id (unique)
    • a timestamp
    • an event type (what the user did to the topic)
    • a user ref
    • a topic ref
  • requests (sent from one user to another, visible to both); each has
    • an id
    • a source user ref
    • a type (topic-invitation, permit-follow, permit-participate, etc)
    • a destination user ref
    • [a target topic ref]
    • [a message object (size limited)]
  • tags (topics can be tagged for searching); each has
    • a hash of the tag text
    • the tag text
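
to make the shapes above concrete, here is a minimal sketch of the collections as plain records. python is used only for illustration. the class and field names are my own spelling of the bullets above, refs are modelled as plain id strings, the authentication field is left out, and sha-256 for the tag hash is an assumption (no hash function is chosen yet).

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class User:
    id: str
    handle: str
    info: dict = field(default_factory=dict)               # arbitrary json, size limited
    followed_users: list = field(default_factory=list)     # user refs
    subscribed_tags: set = field(default_factory=set)
    allowed_followers: list = field(default_factory=list)  # user refs
    allow_followers_automatically: bool = True              # "default yes"
    blocked_users: list = field(default_factory=list)      # user refs
    in_topics: list = field(default_factory=list)          # topic refs


@dataclass
class Message:
    id: str
    timestamp: float
    poster: str                      # user ref
    body: dict                       # size limited
    reply_to: Optional[str] = None   # optional message ref


@dataclass
class Topic:
    id: str
    original_poster: str             # user ref
    title: str
    public: bool
    tags: set = field(default_factory=set)
    participants: list = field(default_factory=list)
    info: dict = field(default_factory=dict)
    messages: list = field(default_factory=list)


@dataclass
class Event:
    id: str
    timestamp: float
    event_type: str                  # what the user did to the topic
    user: str                        # user ref
    topic: str                       # topic ref


@dataclass
class Request:
    id: str
    source: str                      # user ref
    type: str                        # e.g. "topic-invitation", "permit-follow"
    destination: str                 # user ref
    target_topic: Optional[str] = None
    message: Optional[dict] = None


@dataclass(frozen=True)
class Tag:
    text: str

    @property
    def hash(self) -> str:
        # sha-256 is an assumption; the draft does not name a hash function
        return hashlib.sha256(self.text.encode("utf-8")).hexdigest()
```

nothing here enforces uniqueness or size limits; that would be the server's job.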

so, where does this get us?
any request you make to these collections must be made in the context of an authenticated user. however, you will not see anything at first, because access must be granted to you. so you follow some users: this sends permit-follow requests to those users, who approve you either manually or automatically. when you are approved, that user is added to your followed-users list, you are added to that user’s allowed-followers list, and the request disappears. you may also decide to follow tags.
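
the follow handshake can be sketched in a few lines. this is an in-memory toy, not the server design: `Server`, `follow` and `approve` are hypothetical names, users are keyed by handle for brevity, and i am assuming that a blocked user's request is rejected outright.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    handle: str
    allow_followers_automatically: bool = True
    followed_users: set = field(default_factory=set)
    allowed_followers: set = field(default_factory=set)
    blocked_users: set = field(default_factory=set)


@dataclass
class Request:
    source: str
    destination: str
    type: str = "permit-follow"


class Server:
    def __init__(self):
        self.users = {}      # handle -> User
        self.requests = []   # outstanding requests

    def add_user(self, handle, auto=True):
        self.users[handle] = User(handle, allow_followers_automatically=auto)
        return self.users[handle]

    def follow(self, source, destination):
        """source asks to follow destination via a permit-follow request."""
        target = self.users[destination]
        if source in target.blocked_users:
            # assumption: blocked users cannot send requests at all
            raise PermissionError(f"{source} is blocked by {destination}")
        request = Request(source, destination)
        self.requests.append(request)
        if target.allow_followers_automatically:
            self.approve(request)
        return request

    def approve(self, request):
        """update both lists, then make the request disappear."""
        self.users[request.source].followed_users.add(request.destination)
        self.users[request.destination].allowed_followers.add(request.source)
        self.requests.remove(request)
```

with automatic approval the request is created and resolved in one call; with manual approval it sits in `requests` until the destination user approves it.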

then you get to query the events collection. events have a number of types – topic posted, message posted in topic, topic title changed, user joined topic, user left topic, etc. the collection can be filtered by type, by tag, by user, by who you’re following, etc. it can also be queried in one of two ways – “before” and “after”.

  • “before” takes an event id (default “now”) and a count N >= 1 (default 1), and returns the N events before (inclusive, and post-filtering) the event referenced (“now” means the most recent event available)
  • “after” takes an event id and a count N (default “all”), and returns the N events after (inclusive, post-filtering) the event referenced (“all” means return them all up to the most recent available event)
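
here is a small sketch of how i read the two query modes, over an in-memory list ordered oldest to newest. the function names, the `keep` filter parameter, and the choice to return results oldest-first are assumptions for illustration; `None` stands in for “now” and “all”.

```python
def _index_of(events, event_id):
    """position of the referenced event in the append-ordered log."""
    for i, event in enumerate(events):
        if event["id"] == event_id:
            return i
    raise KeyError(event_id)


def before(events, event_id=None, count=1, keep=lambda event: True):
    """the last `count` kept events up to and including `event_id`.

    event_id=None means "now", i.e. the most recent event available.
    """
    if count < 1:
        raise ValueError("count must be >= 1")
    end = len(events) if event_id is None else _index_of(events, event_id) + 1
    kept = [e for e in events[:end] if keep(e)]   # filter first, then count
    return kept[-count:]


def after(events, event_id, count=None, keep=lambda event: True):
    """the first `count` kept events from `event_id` onwards (inclusive).

    count=None means "all", up to the most recent available event.
    """
    start = _index_of(events, event_id)
    kept = [e for e in events[start:] if keep(e)]
    return kept if count is None else kept[:count]
```

note that the count is applied after filtering, so asking for N events of one type gives you N matches rather than whatever survives from a window of N.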

it is important to note that the event stream is always filtered first by whether or not an event is permitted to be shown to you. events from private topics will never enter your feed unless you are participating in those topics.
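
a sketch of that ordering: the permission check runs before any filter the client asks for. the dictionary shapes and the names `permitted` and `feed` are made up for the example, and only the private-topic rule is shown; follow permissions would layer on top in the same place.

```python
def permitted(event, topics, viewer):
    """an event is showable if its topic is public or the viewer participates."""
    topic = topics[event["topic"]]
    return topic["public"] or viewer in topic["participants"]


def feed(events, topics, viewer, keep=lambda event: True):
    """apply the permission filter first, then the client's own filter."""
    allowed = [e for e in events if permitted(e, topics, viewer)]
    return [e for e in allowed if keep(e)]
```

because the permission step comes first, no combination of client filters can surface an event from a private topic to a non-participant.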

messages, requested from a topic, work on a similar principle (filtering on them operates on the body, though) – they too have before and after request types. in both collections, it is the after request mode that makes this work as a chat system. a working implementation must fulfil the request such that everything added after the item with the specified id was added will be returned.
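
to show why “after” is enough for chat without any connection state, here is a toy polling client. `TopicLog` is a stand-in for the server's message collection, and `PollingClient`, `backlog` and the integer ids are all hypothetical. since “after” is inclusive, the client drops the first item, which it has already seen.

```python
class TopicLog:
    """stand-in for one topic's message collection on the server."""

    def __init__(self):
        self._messages = []

    def post(self, poster, body):
        message = {"id": len(self._messages) + 1, "poster": poster, "body": body}
        self._messages.append(message)
        return message

    def before(self, count=1):
        """the `count` most recent messages, oldest first."""
        return self._messages[-count:]

    def after(self, message_id):
        """the referenced message and everything added after it."""
        for i, message in enumerate(self._messages):
            if message["id"] == message_id:
                return self._messages[i:]
        raise KeyError(message_id)


class PollingClient:
    """all state lives here: the id of the last message this client saw."""

    def __init__(self, log, backlog=20):
        self.log = log
        self.backlog = backlog
        self.last_seen = None

    def poll(self):
        if self.last_seen is None:
            fresh = self.log.before(count=self.backlog)   # first contact
        else:
            fresh = self.log.after(self.last_seen)[1:]    # skip the anchor
        if fresh:
            self.last_seen = fresh[-1]["id"]
        return fresh
```

the server keeps nothing per client between polls; a client that goes away and comes back simply asks for everything after the last id it remembers.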

now, a brief example of how client convention enables encrypted chat. your client obtains a public-key encryption keypair, and places the public key in your info object (under the index “publicKey”). when someone wants to send a message only you can see, they look up your info object, get your public key, and encrypt the message before sending it to you.

you want to send a message that everyone can verify is from you? you put text under the “text” index of your message object, then you hash it, sign the hash with your private key (in textbook RSA terms, “decrypting” the hash), and attach the result to the message object under the “signature” index. when others in the topic receive it, they look up your public key, use it to recover the hash from the signature (“encrypting” it), and compare that to the hash of the message text they received.

how about group encrypted chat? the person setting up the topic creates a symmetric key and a private topic. this OP then sends out invitations – each invitation contains the symmetric key encrypted with the public key of the invitation’s destination user. messages in the topic can then be encrypted and decrypted by all the invited users, using the same key.

all that is needed is for clients to interpret the same fields the same way, so they can automatically do the encryption and decryption for you. (i suppose you could just manually move stuff to and from your encryption tools, but that’d be painful, i think.)
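
the signature convention can be sketched with textbook RSA using only the standard library. this is a toy to show which fields go where, not something to deploy: the key is far too small, there is no padding, and a real client should use a vetted crypto library. the `{"e": ..., "n": ...}` layout under “publicKey” and the function names are my assumptions; only the “publicKey”, “text” and “signature” indices come from the convention described here. needs python 3.8+ for the modular inverse.

```python
import hashlib

# toy key material built from two known Mersenne primes (NOT secure)
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent

# what the client would publish in its info object
info = {"publicKey": {"e": E, "n": N}}


def digest(text, n):
    """sha-256 of the text as an integer, reduced to fit the toy modulus."""
    raw = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % n


def sign(text, d, n):
    """build a message object: hash the text, apply the private key."""
    return {"text": text, "signature": pow(digest(text, n), d, n)}


def verify(message, sender_info):
    """apply the public key to the signature and compare with the hash."""
    key = sender_info["publicKey"]
    recovered = pow(message["signature"], key["e"], key["n"])
    return recovered == digest(message["text"], key["n"])


message = sign("meet at noon", D, N)
tampered = {"text": "meet at one", "signature": message["signature"]}
```

here `verify(message, info)` succeeds, while `verify(tampered, info)` fails because the hash of the altered text no longer matches the signature. the server never looks inside either object, which is the point of goal 5.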

other things that need consideration:

  • server/service administration interfaces
    • how to add users
  • websocket for when-available streaming mode
  • more client conventions – for instance, implementation of encrypted voice or video chat
  • connecting servers to create networks
  • a name

my plan is to start working on an implementation tomorrow, once i figure out which of the frameworks for building REST APIs in Scala to use.
