On December 7th I sent out a normal batch of emails to the Circumventor mailing list, where I send out new proxy sites for getting around Internet filters. I registered seven new domains and sent each domain to one seventh of the list; the list contains about 420,000 addresses, so each one went to about 60,000 people. (Each new site is only sent to a random subset of the list, so that a blocking company can't just subscribe one address to the list and block all new sites as soon as they're mailed out.)
The list is also comprised of 100%-verified-opt-in addresses, meaning that a new subscriber has to reply to a confirmation message in order to be added to the list. That's considered the gold standard for responsible mailing, but major email providers keep finding new ways to block the emails as "spam," which sometimes provide interesting insights into how the filters work behind the scenes.
After the last mailing, for example, all of my newly registered domains got disabled by the registrar because two of the domains had been incorrectly blacklisted by the Spamhaus Domain Block List. It took two days to discover the problem and then several hours to trace the problem to Spamhaus, although once I found Spamhaus's automated form I was able to get the domains un-blacklisted immediately. So the registrar re-enabled the domains a few hours later, although the traffic to the domains never returned to its previous levels. Spamhaus, meanwhile, continues to claim the DBL is a "zero false-positive" list, and has yet to acknowledge the error or contact me to help get to the bottom of how it happened. Well, they know how to reach me.
At least this time around, my domains didn't get disabled. Instead, the messages rolled out for a few hours with no problem (replies from users indicated that at least some hotmail.com and yahoo.com users were receiving them), until bounces abruptly started coming in from hotmail.com and yahoo.com addresses saying:
----- Transcript of session follows -----
... while talking to mta5.am0.yahoodns.net.:
<<< 550 Message Contains SPAM Content
554 5.0.0 Service unavailable
After pummeling my address with bounce messages (to the point where my own Gmail account started bouncing because it was getting hammered with so many bounce messages from Hotmail and Yahoo), when the dust finally settled, I tried reproducing the error by sending test messages from my server's IP address to a test Hotmail account. It turns out that out of the seven different URLs that I had been mailing to our users, four of the domains in those URLs would generate a "550 Message Contains SPAM Content" error when sent from my IP to a Hotmail address, and the other three did not. The message didn't have to contain the banned domain in the From: address; the message would get blocked if it even mentioned the domain anywhere in the message body. (This only happened when sending from my own IP address at peacefire.org. It didn't happen if I tried sending a message from my Gmail account to a Hotmail address, even if the message contained one of the four banned domain names, so the issue probably won't reproduce if you try sending a test message yourself.)
But interestingly, Yahoo Mail started bouncing my messages at about the same time — out of the seven domain names, the same four domain names were being bounced by Yahoo Mail as by Hotmail, also with the error "550 Message Contains SPAM Content." That's far too unlikely to be a coincidence, so it looks as if Hotmail and Yahoo Mail are using a common secret blacklist of domain names that cause a message to be blocked as spam. (As it happens, the other three domains were also being bounced by Yahoo Mail with the error "Message Contains SUSPECT Content" — as opposed to "SPAM Content" — while those three domains were not blocked by Hotmail at all. That of course is aggravating, but the real clue lies in the fact that both Yahoo Mail and Hotmail were giving "SPAM Content" errors to the exact same subset of domains.)
I don't want to publish the list of all seven domain names here, so as not to make it too easy for censorware companies to block them all, but one of the four blacklisted domains was 'golflanding.com.' (All of the new domains I register are nonsensical two-word combinations, since those are the only .com domains that are likely to be (1) still available and (2) easy to remember.) As soon as it seemed like Hotmail and Yahoo Mail were working off of a common blacklist, I checked to see if Spamhaus had screwed up again and listed our domains, but none of the seven domains were on Spamhaus's lists.
I looked up golflanding.com on the blacklistalert.org service, which checks against all major spam blacklists, but no hits were listed there either (except for on some defunct services which haven't been updated in years).
So if Hotmail and Yahoo Mail are both using the domain blacklist, perhaps it's a list compiled by one company and then licensed to the other, or perhaps it's a third-party list not widely known to the public. (Hotmail uses their own SmartScreen filter, but I've found nothing online about Yahoo using it as well.) It's conceivable that one or more of the domains might have gotten blacklisted as a result of Hotmail or Yahoo users clicking their "This is spam" button. However, Hotmail allows newsletter publishers to view data about what percent of their messages to Hotmail users are being flagged by users as "spam," and when I looked up the stats for our IP, they showed a "complaint rate" of less than 0.1% (usually the rest of people hitting 'Junk Mail' to unsubscribe from the list). Assuming that the complaint rates are similar for Yahoo Mail, it's unlikely that the domains got blacklisted as a result of user complaints, unless the blacklist trigger has a ridiculously low complaint threshold.
Neither the Hotmail postmaster site nor the Yahoo postmaster site mention anything about a list of domain names that could cause a message to be blocked for mentioning the domains in the message body. Yahoo Mail does provide a support form for newsletter publishers to send inquiries about why their mail is being blocked; I submitted that on Saturday and started a thread with email "support," although so far their response has just been to copy and paste articles from the Postmaster site, with tips like "Send email only to those that want it." Each time, I reply saying, No, this is not the problem, the problem is that the domains in the messages are getting incorrectly blacklisted, and each time, support cheerfully sends me another article. If I'm not literally talking to a bot, I might as well be.
I opened a similar ticket with Hotmail, and they sent me a form letter saying that the emails were being blocked because of SmartScreen, and that as a matter of policy, they would refuse to fix any errors being made by the SmartScreen filter. Waiting to see if I get a reply from a human next.
So why should you care? Well, for one thing, if you care about users in China and Iran being able to receive proxies to get around their Internet blockers, right now Hotmail and Yahoo are thwarting these proxies more effectively than those countries' own censors are. Yes, these are real people who really do write back to me after a mailing goes out, telling me about how they were able to use the proxies to receive banned political information, and sometimes how long the proxy lasted before the censors blocked it. This week, they had to do without.
But more importantly, this is an example of a general problem: That there are certain types of issues, like blocking of legitimate mail by spam filters, where the "free market" does not deliver the best experience to consumers, and the costs get passed on to everybody. Sometimes the problems could be solved with some effort, but the effort does not get made, because people believe that the free market will solve the problem, or that it already has.
In theory, if consumers have enough information about different companies and their services, the companies can compete to provide the best product to users. The problem is that if one type of information is systematically hidden from users — in this case, the fact that their mail provider is blocking mails from reaching them — then the "theory" falls apart. Since spam getting into your inbox is a visible problem, but missed email messages are an invisible problem, Hotmail's incentive is not to give the user the best experience, but rather to err on the side of blocking legitimate messages — even if the user might prefer to get slightly more spam, than to miss one important email that they were waiting for.
This means we're not just talking about a few messages getting caught in filters, which could happen even in an efficient marketplace. We're talking about a permanent equilibrium where the user gets a sub-par experience by default — a trade-off that causes them to miss more messages than they want to — and senders have to pay the cost of overcoming the marketplace inefficiencies. (Which means if the sender is a business you buy from or a charity you support, the costs get passed on to you.)
Pretty much the entire financial cost of sending email, is attributable to the failure of the "free market" to motivate email providers to deliver non-spam emails into their user's inboxes. If a company or organization uses an email list hosting company like AWeber or Constant Contact to email their users, they pay a fee of about $1 per month for every 100 users on their list (which would run me about $4,000 per month). That fee doesn't go towards bandwidth — even a 1-million-subscriber list, emailed once a month, would use less than 3 GB per month of bandwidth, which is what GeoCities was was giving away for free 10 years ago. What you're paying for is the fact that AWeber and Constant Contact have friends in the right places at Hotmail, Yahoo, and Gmail, so if your mails are getting blocked, they know the people to call to fix the problem. If you run your own list instead of paying a hosting fee to AWeber or Constant Contact, you'll end up paying other costs indirectly, through loss of income when your messages don't reach recipients, or in time and money spent trying to fix the issue. (I have to take this option anyway, since I send different URLs to different random subsets of my list, which is not supported by AWeber or Constant Contact.)
On the other hand, if the market actually "worked" — if email providers did reliably deliver non-spam messages to their users — a company or charity could run their own list for virtually zero cost, and would be able to keep all of that money. (I incur no up-front fees for running my own list; all of the costs are the time spent trying to get Yahoo, Gmail, and Hotmail to stop blocking it.) So every time you donate to a charity or buy from an online retailer, a little bit of that money goes towards the cost of that organization having to fight past marketplace failures in order to get their email to you.
I don't think there's an easy algorithmic solution, like crowdsourcing Facebook complaints or using random-sample voting on Digg. Generally, I just think we need more awareness of the fact that, under certain conditions (including those surrounding email deliverability), the "free market" is virtually guaranteed to arrive at a non-optimal solution. One manifestation of that awareness would be if Hotmail, Yahoo Mail, and Gmail created public points of contact where legitimate email publishers could find out why their emails were blocked, and had real humans responding to the messages and fixing the problems. By default, the imperfect information in the marketplace leads toward an equilibrium that errs on the side of blocking too much legitimate email, so anything that pushes the equilibrium back towards more legitimate messages getting delivered will improve the experience for users and lower costs for senders.
Besides, there's a more basic ethical issue here. If you're Hotmail and you tell your users that you're providing them with "email accounts," then those users expect those accounts to work — including having the ability to receive mails from mailing lists that they've signed up for. Helping legitimate emails get through to users is not just a matter of addressing a marketplace inefficiency, it's a matter of honesty.
Larry Lessig's book "Code is Law" describes how default choices built into the architecture of the Internet and other environments — the "code" — can steer our behavior in ways that we might not choose otherwise. I'm making essentially the same point in saying that some problems are not fixed by market forces, because people are not aware of the problem at all. I think the evidence and the reasoning are straightforward in this case, but it's hard to convince people who have adopted it as an axiom that whatever the free market arrives at, must be the solution. My favorite single sentence in Lessig's book was, "Put your Ayn Rand away." I could imagine the years of pushing against dogmatic fanaticism that led him to write that sentence, and I knew how he felt.