System: Referer Spam from Microsoft Bing

With the demise of Live Search and the launch of Bing a few weeks ago, we were hopeful that Microsoft would also change its practice of spamming websites with fake search referrals. Until this week it looked like they had, but now we're seeing exactly the same pattern of abuse, coming from the same IP addresses as the notorious 'LVSP' and 'QBHP' requests described earlier.

Evidence from the log files

Here's an example of the type of traffic we're seeing. These referrals appear to come from bing.com, but can't possibly be genuine given their one-word search terms.

HTTP_REFERER

This is how the HTTP_REFERER field appears for the fake traffic from Bing:

http://www.bing.com/search?q=about
http://www.bing.com/search?q=africa
http://www.bing.com/search?q=atherton
http://www.bing.com/search?q=australia
http://www.bing.com/search?q=backpacker
http://www.bing.com/search?q=beaded
http://www.bing.com/search?q=calls
http://www.bing.com/search?q=community
http://www.bing.com/search?q=coolbaroo
http://www.bing.com/search?q=corowa
http://www.bing.com/search?q=emergency
http://www.bing.com/search?q=family
http://www.bing.com/search?q=farmkeeper
http://www.bing.com/search?q=films
http://www.bing.com/search?q=glenelg
http://www.bing.com/search?q=health
http://www.bing.com/search?q=higher
http://www.bing.com/search?q=history
http://www.bing.com/search?q=links
http://www.bing.com/search?q=malua
http://www.bing.com/search?q=massacre
http://www.bing.com/search?q=member
http://www.bing.com/search?q=merimbula
http://www.bing.com/search?q=ocean
http://www.bing.com/search?q=policy
http://www.bing.com/search?q=president
http://www.bing.com/search?q=screen
http://www.bing.com/search?q=search
http://www.bing.com/search?q=selling
http://www.bing.com/search?q=simulations
http://www.bing.com/search?q=sister
http://www.bing.com/search?q=sisters
http://www.bing.com/search?q=street
http://www.bing.com/search?q=sydney
http://www.bing.com/search?q=wedding
http://www.bing.com/search?q=ylang

For comparison, here is a sample of the HTTP_REFERER strings for real search traffic:

http://www.bing.com/search?q=php++output+examples+good+looking+tables&filt=all&first=11&FORM=PERE
http://www.bing.com/search?q=escaping+characters+in+javascript&FORM=HPDTDF&src=IE-SearchBox
http://www.bing.com/search?q=atom+reader&go=&form=QBRE
http://www.bing.com/search?q=transitions+using+CSS+javascript&form=QBRE&qs=n
http://www.bing.com/search?q=art+of+web&form=QBLH&filt=all&qs=n

IP addresses

As before, all the suspect traffic is coming from the same location, Microsoft Corporation in Redmond WA:

65.55.104.*
65.55.107.*
65.55.109.*
65.55.110.*
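If you are filtering log files after the fact rather than blocking at the server, the ranges above can be treated as /24 networks. The following is only a rough Python sketch: the function name from_suspect_network is our own invention, and we are assuming that each * stands for the whole final octet.

```python
import ipaddress

# The four ranges listed above, written as /24 networks.
# Assumption: the trailing * covers every value of the last octet.
SUSPECT_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "65.55.104.0/24",
        "65.55.107.0/24",
        "65.55.109.0/24",
        "65.55.110.0/24",
    )
]


def from_suspect_network(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside one of the listed ranges."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address (e.g. a hostname or a blank field).
        return False
    return any(addr in network for network in SUSPECT_NETWORKS)
```

For example, from_suspect_network("65.55.104.7") is true, while an address in the neighbouring 65.55.105.* range is not matched.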

User agent

The user agents being used are interesting in that they mix up different versions of .NET and other components, but all seem to have double spaces where normally you would see a single space:

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30707; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30729; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30707)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30729)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.40607)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30707)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; InfoPath.2)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.40607)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30729; InfoPath.2)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30707; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30707)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30707; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30729)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; InfoPath.2)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.30729)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30707; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30729)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30729; InfoPath.2)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SLCC1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.30729)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30707; InfoPath.2)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.40607; .NET CLR 3.0.30729; .NET CLR 3.5.30729; MS-RTC LM 8)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.40607)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.40607; .NET CLR 3.0.04506.648)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.2; SV1; .NET CLR 1.1.4325; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30707)

If they carry on using the double spaces that might be another option for filtering out this traffic. We'll be monitoring the logs in any case and will report back any changes.

Update: So far it's consistent that these are the only requests in our server logs containing MSIE 6.0; surrounded by double spaces. If you wanted, you could use that as the blocking trigger as well as/instead of the rules below. For example:

RewriteCond %{REMOTE_ADDR} ^65\.55\.(104|107|109|110|165|232)
RewriteCond %{HTTP_USER_AGENT} "  MSIE 6.0;  "
RewriteRule .* - [F]

Note that there are two spaces at each end of the quoted string!
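The same test is easy to apply when filtering raw log files offline. A minimal Python sketch, assuming you have the user agent exactly as logged (many viewers and HTML pages collapse runs of spaces, which would hide the pattern); the function name is hypothetical:

```python
# The marker: "MSIE 6.0;" with TWO spaces on each side.
DOUBLE_SPACED_MSIE = "  MSIE 6.0;  "


def has_double_spaced_msie(user_agent: str) -> bool:
    """Return True if the raw user agent contains the double-spaced token."""
    return DOUBLE_SPACED_MSIE in user_agent


# Illustrative strings only: the first has the doubled spaces, the second does not.
suspect = "Mozilla/4.0 (compatible;  MSIE 6.0;  Windows NT 5.1)"
ordinary = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
print(has_double_spaced_msie(suspect), has_double_spaced_msie(ordinary))
```

A plain substring test is used rather than a regular expression so that the spaces are matched literally.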

Blocking referer spam from Bing

As shown in our previous article, it's possible to block this traffic from your website. The requests will still appear in the log files, but with a 403 Forbidden status, and the robot won't be able to request subsequent files, which has been a problem in the past.

Here is the code we're using now to block both the old Live Search referer spam, which may or may not be relevant any more, and the new wave from Bing:

# block Microsoft referer spam
RewriteCond %{REMOTE_ADDR} ^65\.55\.(104|107|109|110|165|232)
RewriteCond %{HTTP_REFERER} (www\.bing|search\.live)\.com
RewriteCond %{HTTP_REFERER} !\&
RewriteRule .* - [F]

Most of this is explained in the preceding article. The fake requests from Bing are being blocked based on the fact that they contain only a single GET parameter and therefore do not contain the & character as would almost any 'real' search traffic.
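If you would rather clean up analytics data than block requests, the single-parameter test can be approximated in a script. This is only a sketch under our own assumptions: the function name is made up, it counts query parameters rather than looking for a literal & (so it is slightly stricter than the rewrite rule), and it deliberately ignores the IP address check.

```python
from urllib.parse import parse_qsl, urlsplit

# Hosts named in the rewrite rule above.
SEARCH_HOSTS = {"www.bing.com", "search.live.com"}


def is_single_parameter_referral(referer: str) -> bool:
    """Return True for a Bing/Live referer carrying at most one GET parameter."""
    try:
        parts = urlsplit(referer)
    except ValueError:
        # Malformed URL: treat as not matching.
        return False
    if parts.hostname not in SEARCH_HOSTS:
        return False
    # keep_blank_values so that empty parameters such as "go=" still count.
    params = parse_qsl(parts.query, keep_blank_values=True)
    return len(params) <= 1
```

Run against the samples above, http://www.bing.com/search?q=africa is flagged, while http://www.bing.com/search?q=atom+reader&go=&form=QBRE is not. In practice you would combine this with the IP address test, since a genuine one-parameter referral from elsewhere is at least conceivable.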

What do you mean it's not spam?

There are a few alternative theories for why we might be seeing this traffic in the server logs and analytics. Some of them are reasonable, while others are seriously confused:

Microsoft is trying to detect cloaked websites

Cloaking is a black hat search engine optimization (SEO) technique in which the content presented to the search engine spider is different to that presented to the user's browser.

This is an excuse that Microsoft put out in relation to the Live Search referral spam. At first it may seem reasonable, until you ask why none of the other major search engines (Google, Yahoo, Ask, Cuil, ichiro, ...) feel the need to do the same. Are they all just that much smarter in coming up with a solution?

Perpetrators of this myth often follow up by warning that 'if you block these requests your website will receive a ranking penalty or not be indexed'. That may be true or not, but it's a strange way to run a global search engine.

Search terms are being truncated by a browser bug

This is by far the most absurd theory I've heard to date. The suggestion is that even though someone may have searched for "back packer travel insurance", the request could arrive with just the word "travel" in the referrer string.

This completely ignores the fact, as shown above, that: a) all the other GET parameters normally associated with a search referral are not present; and b) all these requests come from inside Microsoft Corporation!

Now there may be some kind of 'anonymizer' letting people make searches without their location or actual search terms being revealed, but what's the point then in sending a search string at all?!

Conclusion

Occam's razor tells us that the simplest explanation is usually the best, so until we see any evidence to the contrary we have to assume that it's spam intended to inflate Bing's search numbers.

On one of our servers we are now blocking up to 300 instances of referrer spam from Microsoft per day. The ratio of real to fake traffic is almost 1:1, meaning that if you're not doing any filtering you need to cut your numbers for Bing in half!
