Dec 1: Email Search System

Technical

This blog post is part of the FastMail 2014 Advent Calendar.

The next post on December 2nd is Security - Confidentiality, Integrity and Availability.

Technical level: medium

Our email search system was originally written by Greg Banks, who has moved on to a role at Linked In, so I maintain it now. It's a custom extension to the Cyrus IMAPd mail server.

Fast search was a core required feature for our new web interface. My work account has over half a million emails in it, and even though our interface allows fast scroll, it's still impossible to find anything more than a few weeks old without either knowing exactly when it was sent, or having a powerful search facility.

Greg tried a few different engines, and settled on the Xapian project as the best fit for the one-database-per-user that we wanted.

We tried indexing new emails as they arrived, even directly to fast SSDs, and discovered that the load was just too high. Our servers were overloaded trying to index in time - because adding a single email causes a lot of updates.

Luckily, Xapian supports searching from multiple databases at once, so we came up with the idea of a tiered database structure.

New messages get indexed to tmpfs in a small database. A job runs every hour to see if tmpfs is getting too full (over 50% of the defined size), in which case it compacts immediately, otherwise we automatically compact during the quiet time of the day. Compacted databases are more efficient, but read only.

This allows us to index all email immediately, and return a message that arrived just a second ago in your search results complete with highlighted search terms, yet not overload the servers. It also means that search data can be stored on inexpensive disks, keeping the costs of our accounts down.

Technical level: extreme

Here's some very technical information about how the tiers are implemented, and an example of running the compaction.

We have 4 tiers at FastMail, though we don't actually use the 'meta' one (SSD) at the moment:

  • temp
  • meta
  • data
  • archive

The temp level is on tmpfs, purely in memory. Meta is on SSD, but we don't use that except during shutdown. Data is the main version, and we re-compact all the data level indexes once per week. Finally archive is never automatically updated, but we build it when users are moved or renamed, or can create it manually.

Both external locking (Xapian isn't always happy with multiple writers on one database) and the compaction logic are managed via a separate file called xapianactive. The xapianactive looks like this:

% cat /mnt/ssd30/sloti30t01/store23/conf/user/b/brong.xapianactive
temp:264 archive:2 data:37

The first item in the active file is always the writable index - all the others are read-only.

These map to paths on disk according to the config file:

% grep search /etc/cyrus/imapd-sloti30t01.conf
search_engine: xapian
search_index_headers: no
search_batchsize: 8192
defaultsearchtier: temp
tempsearchpartition-default: /var/run/cyrus/search-sloti30t01
metasearchpartition-default: /mnt/ssd30/sloti30t01/store23/search
datasearchpartition-default: /mnt/i30search/sloti30t01/store23/search
archivesearchpartition-default: /mnt/i30search/sloti30t01/store23/search-archive

(the 'default tier' is to tell the system where to create a new search item)

So based on these paths, we find.

% du -s /var/run/cyrus/search-sloti30t01/b/user/brong/* /mnt/i30search/sloti30t01/store23/search/b/user/brong/* /mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/*
3328    /var/run/cyrus/search-sloti30t01/b/user/brong/xapian.264
1520432 /mnt/i30search/sloti30t01/store23/search/b/user/brong/xapian.37
3365336 /mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian.2

I haven't compacted to archive for a while. Let's watch one of those. I'm selecting all the tiers, and compressing to a single tier. The process is as follows:

  1. take an exclusive lock on the xapianactive file
  2. insert a new default tier database on the front (in this example it
    will be temp:265) and unlock xapianactive again
  3. start compacting all the selected databases to a single database on
    the given tier
  4. take an exclusive lock on the xapianactive file again
  5. if the xapianactive file has changed, discard all our work (we lock
    against this, but it's a sanity check) and exit
  6. replace all the source databases for the compact with a reference to
    the destination database and unlock xapianactive again
  7. delete all now-unused databases

Note that the xapianactive file is only locked for two VERY SHORT times. All the rest of the time, the compact runs in parallel, and both searching on the read-only source databases and indexing to the new temp database can continue.

This allows us to only ever have a single thread compacting to disk, so our search drives are mostly idle, and able to serve
customer search requests very quickly.

When holding an exclusive xapianactive lock, it's always safe to delete any databases which aren't mentioned in the file - at worst you will race against another task which is also deleting the same databases, so this system is self-cleaning after any failures.

Here goes:

% time sudo -u cyrus /usr/cyrus/bin/squatter -C /etc/cyrus/imapd-sloti30t01.conf -v -z archive -t temp,meta,data,archive -u brong
compressing temp:264,archive:2,data:37 to archive:3 for user.brong (active temp:264,archive:2,data:37)
adding new initial search location temp:265
compacting databases
Compressing messages for brong
done /mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian.3.NEW
renaming tempdir into place
finished compact of user.brong (active temp:265,archive:3)
real 4m52.285s
user    2m29.348s
sys 0m13.948s
% du -s /var/run/cyrus/search-sloti30t01/b/user/brong/* /mnt/i30search/sloti30t01/store23/search/b/user/brong/* /mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/*
368 /var/run/cyrus/search-sloti30t01/b/user/brong/xapian.265
du: cannot access `/mnt/i30search/sloti30t01/store23/search/b/user/brong/*': No such file or directory
4614368 /mnt/i30search/sloti30t01/store23/search-archive/b/user/brong/xapian.3

If you want to look at the code, it's all open source. I push the fastmail branch to github regularly. The xapianactive code is in imap/search_xapian.c and the C++ wrapper in imap/xapian_wrap.cpp.