There are two types of spam bots:
- Spam bots that visit websites (spam crawlers) – send HTTP requests to websites with fake referrer headers
- Spam bots that don’t visit websites (ghost traffic) – send fake data directly to Google Analytics without hitting the website
Using .htaccess rules won’t help with ghost traffic since it never actually hits your site.
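To make the difference concrete, here is a minimal Python sketch of the kind of referrer check a server-side rule performs. The domains and the function name are hypothetical, and this is an illustration rather than a replacement for .htaccess. A check like this only ever sees requests that reach your server, which is why it can stop spam crawlers but not ghost traffic.

```python
from urllib.parse import urlparse

# Hypothetical spam referrer domains, for illustration only.
BLOCKED_REFERRER_DOMAINS = {"spamcrawler.example", "freebuttons.example"}


def is_blocked_referrer(referer_header):
    """Return True if the request's Referer header points at a blocked domain."""
    if not referer_header:
        return False
    # hostname is None when the header is not a full URL.
    host = urlparse(referer_header).hostname or ""
    # Match the domain itself and any of its subdomains.
    return any(
        host == domain or host.endswith("." + domain)
        for domain in BLOCKED_REFERRER_DOMAINS
    )
```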
What to Look For
Hostnames
**Audience** > **Technology** > **Network** > **Hostnames**
Identify the hostnames for your site.
Referral Spam
These will show up in your acquisition reports.
- Source/Medium
- Referrals
Custom Segment
This segment is a series of 5 filters – 1 hostname filter and 4 spam crawler filters.
Hostname filter
Segment: Sessions > Include
Hostname > matches regex > *your valid hostnames*
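As a rough sketch, the hostname expression can be built and sanity-checked offline before pasting it into the segment. The hostnames below are placeholders, and the matching here (partial, case-insensitive) is an assumption meant to approximate how the segment evaluates the regex, so test the final expression in Google Analytics as well.

```python
import re

# Placeholder hostnames; replace with the ones from your Hostnames report.
VALID_HOSTNAMES = ["example.com", "www.example.com", "shop.example.com"]


def build_hostname_regex(hostnames):
    """Join hostnames into one alternation, escaping dots so they match literally."""
    return "|".join(re.escape(hostname) for hostname in hostnames)


def is_valid_hostname(hostname, pattern):
    """Assumed matching behavior: partial match, case-insensitive."""
    return re.search(pattern, hostname, re.IGNORECASE) is not None


pattern = build_hostname_regex(VALID_HOSTNAMES)
print(pattern)
```

Because the match is partial, a hostname that merely contains one of your hostnames would also pass. Anchor the expression with `^` and `$` if that matters for your site.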
Spam Crawler Filters
There are several blogs and articles with different filter patterns and examples. You can also construct your own, identifying only the spam affecting your site. Mike Sullivan’s article, linked below, maintains a fairly up-to-date set of filters.
Segment: Sessions > Exclude
Source > matches regex > *spam crawler expression*
Repeat as necessary, adding one Exclude filter per spam crawler expression.
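If you construct your own expressions, one approach is to pack your list of spam sources into as few regex patterns as will fit, then add one Exclude filter per pattern. The sketch below assumes a 255-character limit per expression (an assumption; confirm the limit that applies to you) and uses made-up `.example` domains.

```python
import re

# Made-up spam sources; use the ones that show up in your own reports.
SPAM_SOURCES = ["spamcrawler.example", "freebuttons.example", "seooffers.example"]

# Assumed per-expression length limit.
MAX_PATTERN_LENGTH = 255


def build_spam_expressions(sources, max_length=MAX_PATTERN_LENGTH):
    """Pack escaped sources into as few alternation patterns as fit the limit."""
    expressions = []
    current = ""
    for source in sources:
        escaped = re.escape(source)
        candidate = escaped if not current else current + "|" + escaped
        if len(candidate) > max_length and current:
            # Adding this source would overflow, so start a new expression.
            expressions.append(current)
            current = escaped
        else:
            # Note: a single source longer than the limit is kept as-is.
            current = candidate
    if current:
        expressions.append(current)
    return expressions


def is_spam(source, expressions):
    """Assumed matching behavior: partial match, case-insensitive."""
    return any(re.search(e, source, re.IGNORECASE) for e in expressions)
```

Each string returned by `build_spam_expressions` goes into its own Source > matches regex filter.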
Google Analytics Bot and Spider Filtering
Check the box for this option under View Settings.
Link to Segment:
https://analytics.google.com/analytics/web/template?uid=RwRGeBruR5Kz8uPZrSvheA
Note: The hostname filter is pretty general, but may require some changes based on your site.