The rise in fake traffic - pointless or just irritating?
MJB Data's Mat Barnett tries to shine some light on one of the Internet's darker activities.
Being a successful web-developer and online marketing expert, (tall, funny, handsome, modest, etc., etc.), I don't spend as much time looking at my own websites' traffic reports as I used to. I should though. I should make time to assess and improve the performance of my sites as it can be a bit embarrassing to extol my expertise in Search Engine Optimisation, (SEO), only to have to explain why mjbdata.co.uk receives such a dismal trickle of traffic*.
Like most people with a website, I have viewed sudden surges in traffic as an encouraging sign that my efforts are paying off. However, having scrutinised things a bit more closely than most people either can, or would want to, I've discovered that this site is a bit of a magnet for fake referrals.
Traffic to this site is often related to various User Agents. As far as generating new business is concerned, this is about as likely to get the phone ringing as lightning striking the telegraph pole outside my office, but at least it's genuine traffic. At least there are some people in some way appreciating or in some way benefiting from the work I've put into this thing over the years. So, it's a bit irritating to see the list of top 10 referring domains featuring spampoker.com, im-a-desparate-casino-affiliate.com, and other rubbish websites which patently have never really referred any visitors to my site. Instead, someone has hit my site with a SpamBot using bogus headers.
Now, while I know there are a few people out there who will know exactly what I'm on about, I expect that people reading this who have lives and complexions might need a bit of background on what headers are, what referrals are and why anyone would want to fake them.
What are Headers?
As well as the text and HTML that you see in your browser as a web page, there are other bits of information that are exchanged between your browser and a website's server that you generally can't see - known as 'headers'. Your browser, (e.g. Internet Explorer, Firefox, Safari), tells the server its name and what kind of documents it likes, where it's based and in some cases where you have just been.
If you are reading this online, when you loaded this page into your browser, either by typing the address or clicking a link, your browser will have sent a collection of bits of info as a header as part of its request. At the risk of making more technically experienced readers wince, it's a bit like introducing yourself to a librarian before asking for a book. "Hello, I'm Internet Explorer 6. My IP address is a load of numbers and dots. I like Jazz, Hip Hop and organic food. Can I have a copy of Factotum by Charles Bukowski please?" To which the server might reply, "I don't really give two hoots who you are or where you came from. I'll record your details though along with a note to say that your document was successfully delivered. Enjoy."
The header that is most relevant to the sort of fake traffic I'm discussing is known as the REFERRER, (spelled 'Referer' in HTTP itself, thanks to a long-standing typo in the specification). When you type the address of a page into your browser the referrer header will be blank. However, when you click a link your browser sends the server the address of the page that the link was on - the referring page, web address or URL. Details of referrals, along with other request information such as your IP address and User Agent, are then usually written to a log file.
Example headers - Is this you?
REFERRER : the page containing the link you clicked, (if there was one)
User agent: the Browser you are using
CCBot/1.0 (+http://www.commoncrawl.org/bot.html)
IP Address: your IP address
38.103.63.16
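For the technically curious, here is roughly what those headers look like on the wire and how a server might pick them apart. This is only a minimal sketch in Python: the request text, the example.com and example.org addresses and the function name parse_request_headers are all made up for illustration, and a real web server does a great deal more checking than this.

```python
def parse_request_headers(raw_request):
    """Split a raw HTTP request into its request line and a dict of headers."""
    head = raw_request.split("\r\n\r\n", 1)[0]  # ignore any request body
    lines = head.split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, since values such as URLs contain colons.
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return request_line, headers


# A made-up request, as a browser might send it after a link is clicked.
EXAMPLE_REQUEST = (
    "GET /articles/fake-traffic.html HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: CCBot/1.0 (+http://www.commoncrawl.org/bot.html)\r\n"
    "Referer: http://www.example.org/links.html\r\n"
    "\r\n"
)

request_line, headers = parse_request_headers(EXAMPLE_REQUEST)
print(request_line)
print("Referrer:  ", headers.get("referer", "(blank)"))
print("User agent:", headers.get("user-agent", "(unknown)"))
```

Two things are worth noticing. The IP address does not appear anywhere in the request text, because the server learns it from the network connection rather than from a header. Everything else is simply whatever the client chose to send, which is exactly why a referrer can be faked.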
These log files are commonly processed later to give website owners an idea of how much traffic they're getting and where it's coming from. In many cases this log-file processing is automated and the reports themselves published as HTML files on the same server as the website being reported on. It is also quite common for the log-file crunching programs that produce these reports, (e.g. Webalizer and Analog), to list the domains or websites that have referred traffic as links, and it is these lists of links that our fake-referral traffic-spammers are trying to infiltrate.
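To give a feel for what this kind of log crunching involves, here is a rough sketch of a 'top referring domains' tally. It assumes the server writes the widely used Apache 'combined' log format; the sample lines, dates and the function name top_referring_domains are invented for illustration, and this is not how any particular reporting program actually does it.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Apache "combined" format: ip, identity, user, [time], "request", status,
# size, "referrer", "user agent".
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def top_referring_domains(log_lines, limit=10):
    """Count hits per referring domain, skipping lines with no referrer."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line)
        if not match:
            continue  # not in the assumed format
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # address was typed in, or no referrer was sent
        domain = urlparse(referrer).hostname
        if domain:
            counts[domain] += 1
    return counts.most_common(limit)


# Made-up sample lines for illustration only.
SAMPLE_LOG = [
    '38.103.63.16 - - [10/Oct/2007:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 '
    '"-" "CCBot/1.0 (+http://www.commoncrawl.org/bot.html)"',
    '192.0.2.10 - - [10/Oct/2007:13:56:01 +0000] "GET / HTTP/1.1" 200 2326 '
    '"http://www.spampoker.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.11 - - [10/Oct/2007:13:57:12 +0000] "GET / HTTP/1.1" 200 2326 '
    '"http://www.spampoker.com/" "Mozilla/4.0 (compatible; MSIE 6.0)"',
    '192.0.2.12 - - [10/Oct/2007:13:58:40 +0000] "GET /seo.html HTTP/1.1" 200 5120 '
    '"http://www.example.org/links.html" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

for domain, hits in top_referring_domains(SAMPLE_LOG):
    print(domain, hits)
```

The tally takes the referrer field entirely on trust, which is the weakness the spammers are relying on.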
So spammers are trying to get their web addresses to appear in traffic reports as web-links by hitting websites with fake referral headers, but why?
The most obvious reason for this kind of behaviour is the potential improvements in PageRank, Google's measure of a website's popularity based on the number of links that point to it. Does it work? Probably not, but everyone seems to be a bit Google-mad these days and will try almost anything to gain a decent position for one search term or another.
It looks like about half of the traffic to mjbdata.co.uk that is appearing in my reports for the past couple of months is fake. I've been looking at traffic reports for years now and this sort of thing stands out a mile. If you've recently seen a surge in traffic you might want to take a close look. One thing to watch out for is referrals coming from websites' default pages as opposed to deeper content, (e.g. www.spamcasino.com as opposed to www.spamcasino.com/links.html). It's easy to double-check - just visit the address of the pages that seem to be referring traffic to your site and see if there's a link.
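Both of those checks can be roughed out in a few lines of Python. This is a sketch under some assumptions: referrers are full URLs including the http:// part, the function names is_default_page and page_links_to are my own inventions, and the HTML is passed in as text so that the example runs without touching the network (you would need to fetch the referring page yourself first).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def is_default_page(referrer_url):
    """True if the referrer is a bare home page rather than deeper content."""
    parsed = urlparse(referrer_url)
    return parsed.path in ("", "/") and not parsed.query


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag fed to it."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def page_links_to(html_text, my_domain):
    """True if any link in the HTML points at my_domain or a subdomain of it."""
    collector = LinkCollector()
    collector.feed(html_text)
    for href in collector.hrefs:
        host = urlparse(href).hostname or ""
        if host == my_domain or host.endswith("." + my_domain):
            return True
    return False


print(is_default_page("http://www.spamcasino.com/"))            # suspicious
print(is_default_page("http://www.spamcasino.com/links.html"))  # more plausible

sample_html = '<p><a href="http://www.example.org/">Play poker now</a></p>'
print(page_links_to(sample_html, "mjbdata.co.uk"))
```

Neither test is proof on its own. Plenty of genuine sites link from their home page, and a link can be removed after the visit, so treat a home-page referrer with no link back as a strong hint rather than a verdict.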
My reporting system lets me filter out anything I think is dodgy so I'm off to get a better picture of how my website is really doing. But before I do I think I'll have a look at a few of the spammers' websites and see if their nefarious practices are having any positive effects for them.
Later...
I've now trawled through the major referrers for the past couple of months and filtered out around 35 dodgy domains. Despite the tedious nature of this task I am slightly encouraged that firstly the impact on my traffic figures is nowhere near as significant as I had feared, and secondly the PageRank of the sites I checked was 0 in every case, which hopefully means that this has been a temporary, experimental spam exercise which will now die out.
*it's not that dismal really
This site receives 1 or 2 thousand visitors a month, which is lower than I would expect for my clients' websites. The main reason for this site being a bit quiet is the amount of competition. There aren't that many websites on tiddlywinks for example, (115,000 pages on Google), so achieving top ranking for tiddlywinks terms would be relatively simple compared with trying to get a decent placement for terms to do with website design, (383,000,000 pages on Google). And I am very busy.