Dec 13: Doing it the hard way
This is the thirteenth post in the FastMail 2015 Advent Calendar. Stay tuned for another post tomorrow.
Well, here's a story that goes deeper into the process of getting to the nice happy end result that we can blog about.
Complexity all the way down
One of the things we like to do is display a little badge on our mobile app icon to say how many unread messages are in the Inbox.
This is done through mboxevents and the pusher. We wrote about how it works last year.
We had already implemented the MOVE extension to IMAP (and before it was even proposed as an RFC we had XMOVE), because it makes quota issues so much nicer for users. When mboxevents were added to Cyrus we added a custom mboxevent for MOVE.
Only it turned out that when you moved messages from your INBOX, the badge didn't update, because MOVE, like COPY, is an event on the destination mailbox.
We did have the oldMailboxID field though, so when we detected a move from INBOX in pusher, we just cleared the badge. At least that's better than leaving it definitely wrong - and for an INBOX-zero user it might even be correct.
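A minimal sketch of that workaround, in Python purely for illustration. This is not the real pusher code: the function name, the "MessageMove" event name, the URI strings, and the choice of vnd.cmu.unseenMessages as the badge value are all assumptions. Only the oldMailboxID and counter field names come from the events described in this post.

```python
# Illustrative only: a toy version of the badge logic, not the real pusher.
# The "event" and "uri" keys and the "MessageMove" name are assumed here.

def badge_after_event(event, inbox_uri):
    """Return the new badge count, or None if the badge should be left alone."""
    moved_from_inbox = (
        event.get("event") == "MessageMove"
        and event.get("oldMailboxID") == inbox_uri
    )
    if moved_from_inbox:
        # The counters in a MOVE event describe the destination mailbox,
        # so there is no trustworthy Inbox count here. Clear the badge.
        return 0
    if event.get("uri") == inbox_uri:
        # Any other event on the Inbox carries the Inbox's own counters.
        return event.get("vnd.cmu.unseenMessages")
    return None


# Hypothetical URIs for the example.
INBOX = "imap://user@example.com/INBOX"
TRASH = "imap://user@example.com/Trash"

move_to_trash = {
    "event": "MessageMove",
    "uri": TRASH,
    "oldMailboxID": INBOX,
    "vnd.cmu.unseenMessages": 0,  # unseen count of Trash, not of the Inbox
}
print(badge_after_event(move_to_trash, INBOX))
```

With three unread messages still sitting in the Inbox, this still yields 0, which is exactly the behaviour people complained about.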
It annoyed people with unread messages though, because the counter didn't reappear until a new email was delivered.
Let's make it more complex
Rob N asked if I could output counters for the source mailbox in the MOVE event as well, so he could make it better.
In fact, the whole exchange is amusing enough that I'm going to paste it from our internal Slack channel on Dec 3rd. We were all working from home that day, so there's no side channel of yelling between rooms.
robn 12:31 brong: so, I have a need which I suspect only you can fulfill.
nicola 12:32 robn: you've decided to take up bodypump?
robn 12:33 not exactly
12:33 not at all, in fact
brong 12:34 oh, the pressure
12:34 what if I can't perform?
robn 12:35 then you're not the man I thought you were
brong 12:35 challenge accepted, name your terms
robn 12:36 so when a message is moved, cyrus emits an mboxevent. the IMAP URI for the event is the dest folder. the event includes an oldMailboxID field, with the URL for the source folder. all good so far.
12:36 but
12:36 vnd.cmu.unseenMessages and vnd.fastmail.convUnseen are for the dest folder
12:36 there's no unread counts for the source folder
12:37 which means if you trash a message, I see a change on the inbox, but I have no counts
12:37 so I can't push a badge update to ios
brong 12:37 hmm
robn 12:37 right now I push a 0, which removes it
12:37 I did that deliberately, but now someone has complained
nicola 12:37 YES! That explains why my count goes whack.
robn 12:37 fair enough I suppose. but yeah, I need the counts for the old mailbox too
brong 12:38 sounds legit
(... I did some code reading here ...)
brong 12:53 robn: you're going to want CONVUNSEEN aren't you
12:53 and CONVEXISTS
robn 12:57 brong: UNSEEN_MESSAGES and CONVUNSEEN. we unpack MESSAGES and CONVEXISTS, but we don't use them
brong 12:57 yeah, OK
12:57 I'll do the lot, because reasons
I spent most of the rest of the day trying to work out how to get the values I needed without causing a locking inversion, because you don't have a locked source mailbox at the point where the mboxevent is generated. I was doing all sorts of surgery to the code.
Quick shoutout here to git - we could use a different version control tool, but git makes it incredibly easy to try out tons of different ideas on experimental branches, cherry-pick the results back together and whitewash them into a rebased commit that makes it look like you knew what you were doing all along. Thanks git, you've helped me hide the evidence for more bad ideas than you could imagine.
So I spent ages messing around with various options and aborting when they got too complex. Thankfully I got tired. The upside of flexible work hours is that they're flexible. The downside is that at 11pm on some Thursday night you're still hacking away at this rubbish after a 9:30pm teleconference with people on the other side of the world...
It's all too hard
I had already noticed that we were deliberately suppressing an EXPUNGE mboxevent during the cleanup phase of the MOVE command. Normally any expunge would send an event, but we already knew from the MOVE event which messages had been removed, so it was duplicate info.
The EXPUNGE though - that's an event on the source mailbox, and it would contain all the counters and everything, no work required.
So I ran up another branch and tried just removing the suppression. Cyrus sent an EXPUNGE event.
It turned out this actually worked - the pusher would send the wipe for the MOVE, but immediately afterwards would see the EXPUNGE event and send a new badge with the correct number.
And all Rob N had to do was remove the special case code in pusher that detected a MOVE from INBOX and cleared the badge. We actually removed code overall: I removed an mboxevent suppression (though an additional comment made up for the line count), and Rob removed a special case in pusher.
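To see why two events are enough, here is a small Python sketch of the simplified logic. It is illustrative only and not the real pusher: the function names, the "MessageMove" event name, the URIs and the counter values are assumptions made for the example.

```python
# Illustrative only: badge logic with no MOVE special case.
# Event shapes and names are assumed; this is not the real pusher.

def inbox_badge(event, inbox_uri):
    """Return the Inbox unseen count if this event is about the Inbox, else None."""
    if event.get("uri") != inbox_uri:
        return None
    return event.get("vnd.cmu.unseenMessages")


def replay(events, inbox_uri, badge=None):
    """Apply a sequence of events in order and return the final badge value."""
    for event in events:
        count = inbox_badge(event, inbox_uri)
        if count is not None:
            badge = count
    return badge


# Hypothetical URIs and counts for the example.
INBOX = "imap://user@example.com/INBOX"
TRASH = "imap://user@example.com/Trash"

events = [
    # The MOVE event is on the destination; its counters describe Trash.
    {"event": "MessageMove", "uri": TRASH, "oldMailboxID": INBOX,
     "vnd.cmu.unseenMessages": 0},
    # The no-longer-suppressed EXPUNGE is on the source and carries Inbox counters.
    {"event": "MessageExpunge", "uri": INBOX, "vnd.cmu.unseenMessages": 3},
]

print(replay(events, INBOX, badge=4))
```

The MOVE event is ignored because it is not about the Inbox, and the EXPUNGE that follows supplies the correct count, so the badge goes from 4 to 3 without any knowledge of MOVE at all.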
Take a step back
Far too often as a programmer, you get caught up in a particular solution to a problem. Anything is possible with enough code. We could have solved this problem with special cases and holding the mboxevent inside Cyrus long enough to collect the EXPUNGE data and add it to the MOVE event, but it would have been complex and brittle and hard to maintain.
And honestly, MOVE is a bit special. It's the only command which modifies two mailboxes in a single action, so it's not unreasonable that it emits two events.
This is a great thing about standards and fixed APIs (assuming they aren't awful) - they force you to be creative within constraints.
It's also a great thing about having a small, smart team. If we had hundreds of people, we would have built a monstrosity to solve this problem, because we could. Since we just had me, and I had different things to do tomorrow and just wanted to get the job done and get to sleep, I had to solve it simply. And since I want to sleep next week, I had to solve it maintainably as well!