Reputation: 7701
In my Google Analytics reports, I see traffic that I am almost sure comes from bots. Note that the service provider is amazon technologies inc. (from Ashburn, Virginia, apparently Amazon's AWS bots) and microsoft corporation (from Coffeyville, Kansas).
I want to exclude all traffic from all bots, including Google, Amazon, Microsoft and any other company. I only want to see traffic from real people who visit my site, not from web robots. Thank you.
Upvotes: 0
Views: 3855
Reputation: 1467
You can use robots.txt to try to exclude the bots; see the Robots exclusion standard.
Some excerpts follow, not that the link is ever likely to fail.
The robots exclusion standard, also known as the robots exclusion protocol or simply robots.txt, is a standard used by websites to communicate with web crawlers and other web robots. The standard specifies how to inform the web robot about which areas of the website should not be processed or scanned. Robots are often used by search engines to categorize websites. Not all robots cooperate with the standard; email harvesters, spambots, malware, and robots that scan for security vulnerabilities may even start with the portions of the website where they have been told to stay out. The standard is different from but can be used in conjunction with, Sitemaps, a robot inclusion standard for websites.
About the Standard
When a site owner wishes to give instructions to web robots they place a text file called robots.txt in the root of the web site hierarchy (e.g. https://www.example.com/robots.txt). This text file contains the instructions in a specific format (see examples below). Robots that choose to follow the instructions try to fetch this file and read the instructions before fetching any other file from the website. If this file doesn't exist, web robots assume that the web owner wishes to provide no specific instructions and crawl the entire site.
A robots.txt file on a website will function as a request that specified robots ignore specified files or directories when crawling a site. This might be, for example, out of a preference for privacy from search engine results, or the belief that the content of the selected directories might be misleading or irrelevant to the categorization of the site as a whole, or out of a desire that an application only operates on certain data. Links to pages listed in robots.txt can still appear in search results if they are linked to from a page that is crawled.
Some Simple Examples
This example tells all robots that they can visit all files because the wildcard * stands for all robots and the Disallow directive has no value, meaning no pages are disallowed.
User-agent: *
Disallow:
The same result can be accomplished with an empty or missing robots.txt file.
This example tells all robots to stay out of a website:
User-agent: *
Disallow: /
This example tells all robots not to enter three directories:
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk/
This example tells all robots to stay away from one specific file:
User-agent: *
Disallow: /directory/file.html
Note that all other files in the specified directory will be processed.
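To check how a compliant robot would read the rules quoted above, here is a minimal sketch using Python's standard urllib.robotparser module. The helper name allowed and the example.com URLs are made up for illustration. It only shows what the rules request; robots that ignore the standard are unaffected.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Return True if a compliant robot may fetch url under these rules.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of downloading them from a live site.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The four examples from the excerpt, one directive per line.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_DIRS = (
    "User-agent: *\n"
    "Disallow: /cgi-bin/\n"
    "Disallow: /tmp/\n"
    "Disallow: /junk/\n"
)
BLOCK_FILE = "User-agent: *\nDisallow: /directory/file.html\n"

if __name__ == "__main__":
    base = "https://www.example.com"  # placeholder host
    print(allowed(ALLOW_ALL, base + "/any/page.html"))
    print(allowed(BLOCK_ALL, base + "/any/page.html"))
    print(allowed(BLOCK_DIRS, base + "/tmp/cache.html"))
    print(allowed(BLOCK_FILE, base + "/directory/other.html"))
```

Each call answers whether a robot that honors the file would be permitted to fetch the given URL, which is a quick way to test a rule set before publishing it.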
Upvotes: 0
Reputation: 21
Most of these bots come from other tools. Last Friday we received a lot of sessions from Coffeyville with microsoft corporation as the service provider. It happened because we had used a tool to scan our website for cookies. My best option is to exclude all data from this city. Screenshot from Google Analytics showing how I implemented the filter in that view
Upvotes: 1
Reputation: 121
In Google Analytics View Settings, you'll see an option for "Bot Filtering". Check the box to "Exclude all hits from known bots and spiders". If Google Analytics recognizes those hits from Ashburn and Coffeyville as bots, the data from those bots won't be recorded in your view.
If Google Analytics doesn't recognize them as bots, you could investigate the impact of adding a filter to your view(s) that would exclude traffic from the ISP Organization(s).
View Filter for ISP Organization
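Before adding such a filter, it may help to try the pattern against the provider names seen in the reports. The sketch below is only an offline illustration in Python, assuming the filter pattern is a regular expression. The row layout (a "service_provider" key and a "sessions" count) and the sample values are hypothetical, not a Google Analytics export format.

```python
import re

# Assumed pattern: matches the two service providers named in the question.
BOT_ISP_PATTERN = re.compile(
    r"^(amazon technologies inc\.|microsoft corporation)$",
    re.IGNORECASE,
)


def drop_bot_isp_rows(rows):
    """Keep only rows whose service provider does not match the pattern."""
    return [
        row for row in rows
        if not BOT_ISP_PATTERN.search(row["service_provider"])
    ]


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    sample = [
        {"service_provider": "amazon technologies inc.", "sessions": 120},
        {"service_provider": "microsoft corporation", "sessions": 80},
        {"service_provider": "some residential isp", "sessions": 15},
    ]
    for row in drop_bot_isp_rows(sample):
        print(row["service_provider"], row["sessions"])
```

Anchoring the pattern with ^ and $ keeps it from excluding providers whose names merely contain one of these strings. Keep in mind that real visitors on those networks would be excluded as well, which is the impact worth investigating first.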
Upvotes: 4