Filtering out hostname spam in Google Analytics

This is the weirdest kind of spam, if that's even what it is. I assume it's intended to make the people who read the reports go and check out the fake-sounding hostnames, probably so they pick up a drive-by infection. That makes sense, I guess: the readers are people with websites, which makes them quite a good target.

I made a couple of regular expressions to help me filter hostname spam out of my reports.

First I exclude my own sites using the regex below. The customisations are mostly there to deal with .com and .co, since many of my sites use the fairly unusual .nz TLD:

localhost|\.nz$|tomachi\.co$|(damnative|triptonites|carbonmade|boomboom|design|youtube|googleusercontent|sites\.google)\.com$|\.fritz\.box$|\.guru$|\.dev$|auctiontix\.net$

This gives a filtered 5-year view with just the spam domains showing - great!

Hostname Spam in Google Analytics
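
Before pasting an exclusion pattern like this into a filter, it can help to try it against a few hostnames locally. The following is only a rough sketch: it assumes a version of the pattern with the literal dots escaped, the sample hostnames are made up, and Python's re module is not the engine GA itself uses, so treat the result as a sanity check rather than a guarantee.

```python
import re

# Assumed variant of the exclusion pattern, with literal dots escaped
# so that "." cannot stand in for an arbitrary character.
OWN_HOSTS = re.compile(
    r"localhost|\.nz$|tomachi\.co$"
    r"|(damnative|triptonites|carbonmade|boomboom|design|youtube"
    r"|googleusercontent|sites\.google)\.com$"
    r"|\.fritz\.box$|\.guru$|\.dev$|auctiontix\.net$"
)


def is_own_host(hostname: str) -> bool:
    """Return True if the pattern matches anywhere in the hostname."""
    return OWN_HOSTS.search(hostname) is not None


# Hypothetical hostnames, invented purely for illustration.
samples = ["www.example.nz", "tomachi.co", "damnative.com", "spam-site.xyz", "fakenz"]

for host in samples:
    label = "own site" if is_own_host(host) else "leftover (possible spam)"
    print(f"{host}: {label}")
```

The "fakenz" sample is there to show why the escaping matters: with an unescaped .nz$ the dot matches any character, so "fakenz" would be treated as one of my own sites.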


From here I created the following regex to outright block TLDs that I don't use, and even the (not set) hostnames I found:

not set|\.us$|\.cn$|\.ru$|\.info$|\.eu$|\.br$
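
A block pattern of this kind can also be tried outside GA. This is a minimal sketch, assuming every TLD alternative is written with a leading escaped dot and a trailing $ anchor. The function name split_hostnames and the hostnames are hypothetical, and Python's re module only approximates how GA evaluates filter patterns.

```python
import re

# Assumed form of the block pattern: each TLD is anchored to the end of
# the hostname and preceded by a literal dot.
SPAM_HOSTS = re.compile(r"not set|\.us$|\.cn$|\.ru$|\.info$|\.eu$|\.br$")


def split_hostnames(hostnames):
    """Split hostnames into (kept, blocked) lists using the block pattern."""
    kept, blocked = [], []
    for host in hostnames:
        (blocked if SPAM_HOSTS.search(host) else kept).append(host)
    return kept, blocked


# Hypothetical hostnames, invented purely for illustration.
hosts = ["(not set)", "cheap-seo.ru", "mysite.nz", "bonus.example.com", "brand.example.nz"]
kept, blocked = split_hostnames(hosts)
print("kept:", kept)
print("blocked:", blocked)
```

The anchoring is the design choice worth noting: a bare "br" would match anywhere in a hostname, so "brand.example.nz" would be blocked, and a bare "us$" would catch any hostname that merely ends in those two letters.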

Luckily, the block known bots feature works well enough that this kind of filtering is mostly unnecessary now. It can still be useful for looking at historic reports, though.

GA Bot Filtering
