Mon 14 Dec 2009 10:17:08 AM PST
Correct HTTP status code for removed spam pages?
So, what's the right HTTP status code to return when someone requests a page on your site that used to be spam, but that you have since removed?
In my humble opinion, if a spammer used a URL on your site, you should not re-use that URL for legit content for a long time. The recipient of a spam link could be checking the target long after you get around to cleaning up the spam. (For example, I'm experimenting with heuristics for marking domains as permanently bad based on whether they're still serving spamvertised pages as 200, hours or days after the spam went out.)
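Here's a rough sketch of the kind of check I mean. It is not the actual heuristic, just an illustration: the function name, the 12-hour grace period, and the idea of feeding in an already-fetched status code are all assumptions made up for this example.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how long a site gets to clean up before
# a still-live spamvertised page counts against its domain.
GRACE_PERIOD = timedelta(hours=12)


def still_serving_spam(spam_seen_at, checked_at, status_code, grace=GRACE_PERIOD):
    """Return True if the spamvertised URL still answers 200 after the grace period.

    spam_seen_at -- when the spam that advertised the URL went out
    checked_at   -- when the URL was re-requested
    status_code  -- the HTTP status code that re-request returned
    """
    past_grace = (checked_at - spam_seen_at) >= grace
    return past_grace and status_code == 200
```

A domain whose spamvertised pages keep coming back True over repeated checks would be a candidate for the "permanently bad" list. A page that has switched to 403, 404, or 410 would not count against it.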
If a user takes a legit page down, that user might decide to put it back up. (Joe remembered that his Grandma follows him on your site, so he took down http://example.com/~joe/vacation-photos/ to change it around, then put it back up again without those photos.) That's an appropriate use for a 404.
If you took a spam page down, then it's probably going to have to be either "410 Gone" or "403 Forbidden." I like "403 Forbidden," since removing a spam page sounds like the HTTP/1.1 spec's description of that code: "The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated." But you could make a case for 410, too. The important thing is that it should be something that signals "yes, we had a spam problem, but we dealt with it," not "we're spammers" or "we're people who don't maintain our web site."
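On the server side, the policy above could look something like this minimal sketch. The PageState names and the status_for function are hypothetical, invented for illustration. The sketch assumes your site records why a page went away, which not every site does.

```python
from enum import Enum
from http import HTTPStatus


class PageState(Enum):
    """Hypothetical per-URL states a site might track."""
    LIVE = "live"
    REMOVED_BY_OWNER = "removed_by_owner"  # the user took it down and may restore it
    REMOVED_AS_SPAM = "removed_as_spam"    # the site operator removed it as spam


def status_for(state, spam_status=HTTPStatus.FORBIDDEN):
    """Pick the HTTP status code for a requested URL based on its state.

    spam_status defaults to 403; pass HTTPStatus.GONE to use 410 instead.
    """
    if state is PageState.LIVE:
        return HTTPStatus.OK
    if state is PageState.REMOVED_BY_OWNER:
        return HTTPStatus.NOT_FOUND
    return spam_status
```

The point of keeping the spam case separate is that the URL never falls back to an ordinary 404, and never gets handed out again as a 200, just because the page behind it is gone.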