Wednesday, 11 January 2017

Why email's broken (and how to fix it)

The world sends around 210 billion emails per day (source: the Radicati Group, Email Statistics Report 2015-2019). That's something like 30 per man, woman and child on the planet. Every day. That's a lot of messages.

That works out to roughly 4 petabytes a day, give or take, depending on whose figures you trust. Or, counted in characters, a 4 with 15 zeros after it. By way of comparison, the text of Wikipedia's 5.3 million English articles amounts to only about 3 millionths of that, and even that is enough to fill 2398 printed volumes.
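For anyone who wants to check the sums, here is a rough back-of-envelope sketch in Python. The email count, the 4 petabyte total and the "3 millionths" ratio are the figures quoted above; the world population of 7.4 billion is my own assumption for early 2017, so treat the results as ballpark only.

```python
# Back-of-envelope check of the figures quoted in the post (all approximate).
EMAILS_PER_DAY = 210e9         # Radicati estimate quoted above
TOTAL_BYTES_PER_DAY = 4e15     # roughly 4 petabytes
WIKIPEDIA_FRACTION = 3e-6      # "about 3 millionths" of the daily total
WORLD_POPULATION = 7.4e9       # assumption: rough early-2017 figure, not from the post

avg_bytes_per_email = TOTAL_BYTES_PER_DAY / EMAILS_PER_DAY
wikipedia_bytes = TOTAL_BYTES_PER_DAY * WIKIPEDIA_FRACTION
emails_per_person = EMAILS_PER_DAY / WORLD_POPULATION

print(f"Average email size: about {avg_bytes_per_email / 1e3:.0f} kB")
print(f"Implied size of Wikipedia's text: about {wikipedia_bytes / 1e9:.0f} GB")
print(f"Emails per person per day: about {emails_per_person:.0f}")
```

On those numbers the average email comes out at around 19 kB, the Wikipedia text at around 12 GB, and the per-person count at about 28 a day, which is where "something like 30" comes from.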

So by any measure, we generate a lot of email traffic. But here's the thing. Most of that traffic is wasted.

To demonstrate, I did a little, very unscientific experiment - I found a recent 10-message email conversation in my inbox, and copy-pasted the useful text (i.e. the complete last message with all of the replies below) into a text editor. Then I compared this with the size of the same conversation in my actual mailbox. What I found was that while the messages in the mailbox came to about 650kB, the actual text, even allowing for the hidden metadata that identifies the messages, sender and so on, was more like 30kB. In other words, over 95% of what was stored was padding. One of the messages took 74,000 characters for me to say simply, "I can do 8:30 tomorrow, yes".

Is it any wonder our mailboxes are flooded and overflowing? And that the young and trendy are increasingly moving to messaging apps?

Email is old by computing standards - the Simple Mail Transfer Protocol (SMTP) will be 35 this year. It has been improved a lot for security and spam prevention, but it's still the same basic, simple system where the sending mail server looks up the 'mail exchanger' for the recipient's domain and sends the message to it. No proprietary protocols or software (like WhatsApp), no per-message charges (like SMS), and if everyone implemented encryption in transit it could be fairly secure (not everyone does though...). That simplicity and openness are why it refuses to die, especially in the business world.

But the bloat has nothing to do with the venerability of the protocol. There are actually two causes. First, we like nicely formatted text, with bold characters, different fonts and even embedded images and so on. So the email needs to include markup, the codes that describe the formatting (like a web page) - this makes it bigger (<p class="body-text"><span style="color: blue">Yes I agree</span></p> is obviously a lot bigger than 'Yes I agree'). It is also pretty standard to include a plain text version for 'old' email programs that can't read the rich HTML. Hardly anyone actually uses those these days, so that's a waste too.
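To make that concrete, here is a minimal sketch using Python's standard email library to build a message carrying both a plain text and an HTML version of the same three words. The addresses, the subject and the build_message helper are made up for illustration; real mail clients typically add far more headers and styling than this.

```python
from email.message import EmailMessage


def build_message(plain: str, html: str) -> EmailMessage:
    """Build a message with a plain text body and an HTML alternative.

    The addresses and subject are placeholders for illustration only.
    """
    msg = EmailMessage()
    msg["From"] = "alice@example.com"
    msg["To"] = "bob@example.com"
    msg["Subject"] = "Re: Meeting"
    msg.set_content(plain)                     # the text/plain part
    msg.add_alternative(html, subtype="html")  # the text/html part
    return msg


plain = "Yes I agree"
html = '<p class="body-text"><span style="color: blue">Yes I agree</span></p>'

msg = build_message(plain, html)
wire_size = len(msg.as_bytes())        # bytes actually sent, headers included
useful_size = len(plain.encode("utf-8"))

print(msg.get_content_type())
print(f"{useful_size} useful bytes inside a {wire_size}-byte message")
```

Even this stripped-down message ends up many times larger than the text it carries, and that is before any quoted replies, signatures or images get involved.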

But I can't see any of us giving up our rich text email. I'm old enough to remember text-based email software - I certainly don't want to go back to that. What we could change is the second cause: the quoted text below the line. Why do we need it? Pretty much all email programs these days can group messages by conversation, and behind the scenes each reply also carries the unique identifier of the message it's responding to, which makes the grouping more reliable. So why do I need to quote the message I'm replying to? I certainly don't feel the need when I send a text message on my phone or use Skype chat, because the conversation is there for me to read.
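Here is a rough sketch of how that behind-the-scenes grouping can work, using the standard Message-ID and In-Reply-To headers. The make and group_by_conversation helpers and the sample messages are hypothetical, and real mail programs are more thorough (they generally look at the References header and other clues as well), so take this as an illustration of the idea rather than how any particular program does it.

```python
from collections import defaultdict
from email.message import EmailMessage


def make(msg_id, body, reply_to=None):
    """Create a tiny message; reply_to is the Message-ID being answered."""
    msg = EmailMessage()
    msg["Message-ID"] = msg_id
    if reply_to is not None:
        msg["In-Reply-To"] = reply_to
    msg.set_content(body)
    return msg


def group_by_conversation(messages):
    """Group message bodies under the Message-ID that started each thread."""
    # Map each message to the message it replies to (None for a new thread).
    parent = {}
    for m in messages:
        reply_to = m["In-Reply-To"]
        parent[str(m["Message-ID"])] = None if reply_to is None else str(reply_to)

    def root(msg_id):
        # Follow In-Reply-To links upwards until there is no known parent.
        seen = set()
        while parent.get(msg_id) in parent and msg_id not in seen:
            seen.add(msg_id)
            msg_id = parent[msg_id]
        return msg_id

    threads = defaultdict(list)
    for m in messages:
        threads[root(str(m["Message-ID"]))].append(m.get_content().strip())
    return dict(threads)


inbox = [
    make("<1@example.com>", "Can you do 8:30 tomorrow?"),
    make("<2@example.com>", "I can do 8:30 tomorrow, yes", "<1@example.com>"),
    make("<3@example.com>", "Lunch on Friday?"),
    make("<4@example.com>", "Great, see you then", "<2@example.com>"),
]

for first_id, bodies in group_by_conversation(inbox).items():
    print(first_id, bodies)
```

None of the replies quote anything, yet the whole conversation can still be put back together from the headers alone.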

So here's an idea. Why don't we all turn off the 'quote message when replying' feature? It'd be weird at first, but we'd soon get used to it. And we'd end up sending a lot less crap with our emails, as well as not getting ourselves into trouble by forwarding to the wrong person something we were told in confidence 12 messages back. Now who's going to be brave enough to go first?