Dec 13: Doing it the hard way

Technical

This is the thirteenth post in the FastMail 2015 Advent Calendar. Stay tuned for another post tomorrow.


If you read some of our blog posts, you'd think we always make smart decisions and build simple, great things that work really well.

And that's not even counting last year.

Well, here's a story that goes deeper into the process of getting to the nice happy end result that we can blog about.

Complexity all the way down

One of the things we like to do is display a little badge on our mobile app icon to say how many unread messages are in the Inbox.

This is done through mboxevents and the pusher. We wrote about how it works last year.

We had already implemented the MOVE extension to IMAP, and before it was even proposed as an RFC we had XMOVE, because it makes quota issues so much nicer for users. When mboxevents were added to Cyrus, we added a custom mboxevent for MOVE.

Only it turned out that when you moved messages from your INBOX, the badge didn't update, because MOVE, like COPY, is an event on the destination mailbox.

We did have the oldMailboxID field though, so when we detected a move from INBOX in pusher, we just cleared the badge. At least that's better than leaving it definitely wrong - and for an INBOX-zero user it might even be correct.
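To make that workaround concrete, here is a minimal sketch of the decision the pusher was making. It is an illustration in Python, not the real pusher code: the function name, the "event" and "uri" keys, the "MOVE" and "DELIVERY" type strings and the mailbox URLs are all assumptions made for the example. Only oldMailboxID and vnd.cmu.unseenMessages are field names taken from the actual events.

```python
# Hypothetical sketch of the original workaround; not the real pusher code.
# Placeholder mailbox identifiers, invented for this example.
INBOX = "imap://user@example.com/INBOX"
TRASH = "imap://user@example.com/Trash"


def badge_for_event(event, inbox=INBOX):
    """Return the badge count to push, or None if the badge is unaffected."""
    if event.get("uri") == inbox:
        # The event is on the Inbox, so its counters describe the Inbox.
        return event["vnd.cmu.unseenMessages"]
    if event.get("event") == "MOVE" and event.get("oldMailboxID") == inbox:
        # The counters describe the destination mailbox, so the Inbox
        # unread count is unknown here. Clear the badge rather than
        # leave a number that is definitely wrong.
        return 0
    return None


# Moving a message from the Inbox to Trash: the event is on Trash.
move_to_trash = {
    "event": "MOVE",
    "uri": TRASH,
    "oldMailboxID": INBOX,
    "vnd.cmu.unseenMessages": 0,  # unread count for Trash, not the Inbox
}

# A new delivery to the Inbox carries the Inbox counters directly.
delivery = {"event": "DELIVERY", "uri": INBOX, "vnd.cmu.unseenMessages": 3}

print(badge_for_event(move_to_trash))  # 0
print(badge_for_event(delivery))       # 3
```

The first call shows the problem: the badge is wiped even if the Inbox still holds unread mail, and nothing restores it until an event on the Inbox itself arrives.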

It annoyed people with unread messages though, because the counter didn't reappear until a new email was delivered.

Let's make it more complex

Rob N asked if I could output counters for the source mailbox in the MOVE event as well, so he could make it better.

In fact, the whole exchange is amusing enough that I'm going to paste it from our internal Slack channel, from Dec 3rd. We were all working from home that day, so there's no side channel of yelling between rooms.

robn
  12:31 brong: so, I have a need which I suspect only you can fulfill.
nicola
  12:32 robn: you've decided to take up bodypump?
robn
  12:33 not exactly
  12:33 not at all, in fact
brong
  12:34 oh, the pressure
  12:34 what if I can't perform?
robn
  12:35 then you're not the man I thought you were
brong
  12:35 challenge accepted, name your terms
robn
  12:36 so when a message is moved, cyrus emits an mboxevent.
        the IMAP URI for the event is the dest folder. the
        event includes an oldMailboxID field, with the URL
        for the source folder. all good so far.
  12:36 but
  12:36 vnd.cmu.unseenMessages and vnd.fastmail.convUnseen are
        for the dest folder
  12:36 there's no unread counts for the source folder
  12:37 which means if you trash a message, I see a change on
        the inbox, but I have no counts
  12:37 so I can't push a badge update to ios
brong
  12:37 hmm
robn
  12:37 right now I push a 0, which removes it
  12:37 I did that deliberately, but now someone has complained
nicola
  12:37 YES! That explains why my count goes whack.
robn
  12:37 fair enough I suppose. but yeah, I need the counts for
        the old mailbox too
brong
  12:38 sounds legit
(... I did some code reading here ...)
brong
  12:53 robn: you're going to want CONVUNSEEN aren't you
  12:53 and CONVEXISTS
robn
  12:57 brong: UNSEEN_MESSAGES and CONVUNSEEN. we unpack MESSAGES
        and CONVEXISTS, but we don't use them
brong
  12:57 yeah, OK
  12:57 I'll do the lot, because reasons

I spent most of the rest of the day trying to work out how to get the values I needed without causing a locking inversion, because you don't have a locked source mailbox at the point where the mboxevent is generated. I was doing all sorts of surgery to the code.

Quick shoutout here to git - we could use a different version control tool, but it does make it incredibly easy to try out tons of different ideas on experimental branches, cherry-pick the results back together and whitewash them into a rebased commit that makes it look like you knew what you were doing all along. Thanks git, you've helped me hide the evidence for more bad ideas than you could imagine.

So I spent ages messing around with various options and aborting when they got too complex. Thankfully I got tired. The upside of flexible work hours is that they're flexible. The downside is that at 11pm on some Thursday night you're still hacking away at this rubbish after a 9:30pm teleconference with people on the other side of the world...

It's all too hard

I had already noticed that we were deliberately suppressing an EXPUNGE mboxevent during the cleanup phase of the MOVE command. Normally any expunge would send an event, but we already knew from the MOVE event which messages had been removed, so it was duplicate info.

The EXPUNGE though - that's an event on the source mailbox, and it would contain all the counters and everything, no work required.

So I ran up another branch and tried just removing the suppression. Cyrus sent an EXPUNGE event.

It turned out this actually worked - the pusher would send the wipe for the MOVE, but immediately afterwards would see the EXPUNGE event and send a new badge with the correct number.

And all Rob N had to do was remove the special case code in pusher that detected a MOVE from INBOX and cleared the badge. We actually removed code overall: I dropped an mboxevent suppression (though an additional comment made up for the line count) and Rob dropped a special case in pusher.
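As a rough illustration of where that leaves things, here is a sketch in Python of a badge handler with no MOVE special case, fed the two events a move out of the Inbox now produces. As before, this is not the real pusher code: the function name, the "event" and "uri" keys, the "MOVE" and "EXPUNGE" type strings and the mailbox URLs are assumptions for the example, and the counts are made up.

```python
# Hypothetical sketch of the simplified behaviour; not the real pusher code.
# Placeholder mailbox identifiers, invented for this example.
INBOX = "imap://user@example.com/INBOX"
TRASH = "imap://user@example.com/Trash"


def badge_for_event(event, inbox=INBOX):
    """Return the badge count to push, or None if the badge is unaffected."""
    if event.get("uri") == inbox:
        # Only events on the Inbox carry Inbox counters.
        return event["vnd.cmu.unseenMessages"]
    return None


# One move from the Inbox to Trash now yields two events.
events = [
    # The MOVE event is on the destination, with the destination's counters.
    {"event": "MOVE", "uri": TRASH, "oldMailboxID": INBOX,
     "vnd.cmu.unseenMessages": 0},
    # The no-longer-suppressed EXPUNGE is on the source, with its counters.
    {"event": "EXPUNGE", "uri": INBOX, "vnd.cmu.unseenMessages": 4},
]

pushes = [b for b in (badge_for_event(e) for e in events) if b is not None]
print(pushes)  # [4]
```

The MOVE event is simply ignored for badge purposes, and the EXPUNGE event supplies the correct Inbox count with no extra bookkeeping.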

Take a step back

Far too often as a programmer, you get caught up in a particular solution to a problem. Anything is possible with enough code. We could have solved this problem with special cases and holding the mboxevent inside Cyrus long enough to collect the EXPUNGE data and add it to the MOVE event, but it would have been complex and brittle and hard to maintain.

And honestly, MOVE is a bit special. It's the only command which modifies two mailboxes in a single action, so it's not unreasonable that it emits two events.

This is a great thing about standards and fixed APIs (assuming they aren't awful) - they force you to be creative within constraints.

It's also a great thing about having a small, smart team. If we had hundreds of people, we would have built a monstrosity to solve this problem, because we could. Since we just had me, with different things to do tomorrow, and I just wanted to get the job done and get to sleep, I had to solve it simply. And since I want to sleep next week, I had to solve it maintainably as well!