### QQDownload User Agent Examples

Source: https://github.com/jaybizzle/crawler-detect/blob/master/tests/data/user_agent/crawlers.txt

User agent strings showing MSIE versions with QQDownload client information.

```user-agent
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; QQDownload 1.7; .NET CLR 2.0.50727)
```

```user-agent
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; Trident/4.0; QQDownload 663; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET CLR 1.1.4322)
```

```user-agent
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; QQDownload 760; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2; .NET4.0E; .NET4.0C; .NET CLR 1.1.4322)
```

--------------------------------

### User-Agent Parsing Examples

Source: https://github.com/jaybizzle/crawler-detect/blob/master/tests/data/sec_ch_ua/crawlers.txt

Demonstrates parsing of `Sec-CH-UA` client hint values (the brand lists stored in the `sec_ch_ua` test data) to identify browser brands and versions. These examples help distinguish ordinary browsers from headless, automated ones.

```APIDOC
Sec-CH-UA Value Parsing:

This section details the interpretation of Sec-CH-UA brand lists sent by Chromium-based browsers, particularly headless ones.

Example 1:
  Input: "HeadlessChrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"
  Analysis: Identifies 'HeadlessChrome' with version '129' and 'Chromium' with version '129'. 'Not=A?Brand' is a placeholder.
  Purpose: Detects a specific version of Headless Chrome.

Example 2:
  Input: "HeadlessChrome";v="113", "Chromium";v="113", "Not-A.Brand";v="24"
  Analysis: Identifies 'HeadlessChrome' with version '113' and 'Chromium' with version '113'. 'Not-A.Brand' is a placeholder.
  Purpose: Detects an older version of Headless Chrome.

Example 3:
  Input: "Chromium";v="128", "Not;A=Brand";v="24", "HeadlessChrome";v="128"
  Analysis: Identifies 'Chromium' with version '128' and 'HeadlessChrome' with version '128'. 'Not;A=Brand' is a placeholder.
  Purpose: Detects another version of Headless Chrome.

General Parsing Rules:
  - Sec-CH-UA values are comma-separated lists of brands.
  - Each segment typically follows the format "ProductName";v="Version".
  - The presence and version of 'HeadlessChrome' are key indicators.
  - Placeholder brands like 'Not=A?Brand' should be ignored for detection purposes.
```

--------------------------------

### Ghost Inspector User Agent Examples

Source: https://github.com/jaybizzle/crawler-detect/blob/master/tests/data/user_agent/crawlers.txt

Multiple user agent strings for Ghost Inspector, a browser automation and testing tool, showing different Firefox versions and one QtWebEngine build on Linux.

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:52.0.2) Gecko/20100101 Firefox/52.0.2 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Firefox/52.0 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:50.1.0) Gecko/20100101 Firefox/50.1.0 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) QtWebEngine/5.3.0 Safari/538.1 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:45.0.1) Gecko/20100101 Firefox/45.0.1 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:46.0.1) Gecko/20100101 Firefox/46.0.1 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:47.0.2) Gecko/20100101 Firefox/47.0.2 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:48.0.2) Gecko/20100101 Firefox/48.0.2 Ghost Inspector
```

```user-agent
Mozilla/5.0 (X11; Linux x86_64; rv:49.0.2) Gecko/20100101 Firefox/49.0.2 Ghost Inspector
```

--------------------------------

### Example User Agent Strings

Source: https://github.com/jaybizzle/crawler-detect/blob/master/tests/data/user_agent/crawlers.txt

This snippet showcases a variety of user agent strings commonly encountered from different web crawlers and bots. These strings often contain information about the bot's name, version, operating system, and a URL for further details.
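
The fields named in the description above (bot name, version, and an information URL) can often be recovered with two regular expressions. The following is a heuristic sketch only, assuming a `Name/Version` product token and an `http(s)://` contact URL; the helper `describe_crawler`, the `CrawlerInfo` tuple, and the list of generic browser tokens are hypothetical and are not part of crawler-detect.

```python
import re
from typing import NamedTuple, Optional


class CrawlerInfo(NamedTuple):
    name: Optional[str]
    version: Optional[str]
    url: Optional[str]


# Assumption: a product token looks like Name/Version (an optional "v" prefix is dropped).
_PRODUCT = re.compile(r"([A-Za-z][\w.-]*)/v?(\d[\w.]*)")
# Assumption: the contact URL runs until whitespace, ';', ',' or ')'.
_URL = re.compile(r"https?://[^\s;,)]+")
# Generic browser tokens that should not be reported as the bot name (illustrative list).
_GENERIC = {"mozilla", "applewebkit", "gecko", "chrome", "safari",
            "firefox", "khtml", "version", "mobile", "trident"}


def describe_crawler(ua: str) -> CrawlerInfo:
    """Return the first non-generic product token and the first URL found in ua."""
    name = version = None
    for match in _PRODUCT.finditer(ua):
        if match.group(1).lower() not in _GENERIC:
            name, version = match.group(1), match.group(2)
            break
    url = _URL.search(ua)
    return CrawlerInfo(name, version, url.group(0) if url else None)


info = describe_crawler(
    "Mozilla/5.0 (compatible; Feedspotbot/1.0; +http://www.feedspot.com/fs/bot)"
)
print(info.name, info.version, info.url)
```

Strings without a `Name/Version` token or URL, such as `Site24x7 Tools`, yield `None` for every field, so a heuristic like this complements a name-based match list and does not replace one.
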
```text
Mozilla/5.0 (compatible; Feedspotbot/1.0; +http://www.feedspot.com/fs/bot)
```

```text
sg-Orbiter/1.0 (+http://searchgears.de/uber-uns/crawling-faq.html)
```

```text
Site24x7 Tools
```

```text
NextGenSearchBot 1 (for information visit http://www.zoominfo.com/About/misc/NextGenSearchBot.aspx)
```

```text
Aboundex/0.3 (http://www.aboundex.com/crawler/)
```

```text
yacybot (freeworld/global; i386 Linux 2.6.37.6-0.5-desktop; java 1.6.0_20; Europe/en) http://yacy.net/bot.html
```

```text
Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.4 (KHTML, like Gecko) Chrome/98 Safari/537.4 (StatusCake)
```

```text
Mozilla/5.0 (compatible; DomainSONOCrawler/0.1; +http://domainsono.com)
```

```text
Mozilla/5.0 (compatible; Online Domain Tools - Server Monitor/1.0; +http://server-monitoring.online-domain-tools.com)
```

```text
Mozilla/5.0 (compatible; Konqueror/3.5; Linux) KHTML/3.5.5 (like Gecko) (Exabot-Thumbnails)
```

```text
Mozilla/5.0 (compatible; spbot/4.0.4; +http://www.seoprofiler.com/bot )
```

```text
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
```

```text
Factbot 1.09 (see http://www.factbites.com/webmasters.php)
```

```text
CopperEgg/RevealUptime/SaoPauloBR(aws)
```

```text
CRIM Crawler/Nutch-2.3 (Crawler du Centre de Recherche Informatique de Montréal (CRIM))
```

```text
Mozilla/5.0 (compatible; heritrix/1.14.3 +http://www.webarchiv.cz)
```

```text
Mozilla/5.0 (compatible; WormlyBot; +http://wormly.com)
```

```text
yacybot (i386 Linux 2.6.23; java 1.6.0_17; Europe/en) http://yacy.net/bot.html
```

```text
Mozilla/5.0 (compatible; Cliqzbot/0.1 +http://cliqz.com/company/cliqzbot)
```

```text
URLitor.com
```

```text
RavenCrawler
```

```text
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.18) Gecko/20130331 HeartRails_Capture/1.0.5 (+http://capture.heartrails.com/) Namoroka/3.6.18
```

```text
Mozilla/5.0 (compatible; WebThumbnail/3.x; Website Thumbnail Generator; +http://webthumbnail.org)
```

```text
Kyoto-Crawler/2.0 (Mozilla-compatible; kyoto-crawler-contact(at)nlp(dot)kuee(dot)kyoto-u(dot)ac(dot)jp; http://nlp.ist.i.kyoto-u.ac.jp/)
```

```text
Mozilla/5.0 (compatible; SputnikImageBot/2.3; +http://corp.sputnik.ru/webmaster)
```

```text
WatchMouse/8.4.0.3 (http://watchmouse.com/ ; lvps91-250-96-109.dedicated.hosteurope.de)
```

```text
MetaGeneratorCrawler/1.3.3 (www.metagenerator.info)
```

```text
LongURL API
```

```text
Mozilla/5.0 ( compatible; SETOOZBOT/0.30 ; http://www.setooz.com/bot.html ; agentname at setooz dot_com )
```

```text
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (FlipboardProxy/2.0; +http://flipboard.com/browserproxy)
```

```text
http://tools.geek-tools.org/link-counter/
```

```text
SCFCrawler/Nutch-1.8 (Image Crawler for StolenCameraFinder.com; http://www.stolencamerafinder.com/; crawler@stolencamerafinder.com)
```

```text
Mozilla/5.0 (compatible; Uptimebot/0.2.41; +http://www.uptime.com/uptimebot)
```

```text
Mozilla/5.0 (compatible; DomainAppender /1.0; +http://www.profound.net/domainappender)
```

```text
Mozilla/5.0 (compatible; hypestat/1.0; +http://www.hypestat.com/bot)
```

```text
yacybot (/global; amd64 Linux 3.13.0-042stab093.4; java 1.7.0_79; Europe/en) http://yacy.net/bot.html
```

```text
PagesInventory (robot +http://www.pagesinventory.com)
```

```text
Sosospider+(+http://help.soso.com/webspider.htm)
```

```text
Mozilla/5.0 (compatible; AntBot/1.0; +http://www.ant.com/)
```

```text
PagesInventory (robot http://www.pagesinvenotry.com)
```

```text
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; http://www.changedetection.com/bot.html )
```

```text
Wotbox/2.01 (+http://www.wotbox.com/bot/)
```

```text
Mozilla/5.0 (compatible; bixolabs/1.0; +http://bixolabs.com/crawler/general; crawler@bixolabs.com)
```

```text
Mozilla/5.0 (compatible; SpiderLing (a SPIDER for LINGustic research); +http://nlp.fi.muni.cz/projects/biwec/)
```

```text
CopperEgg/RevealUptime/DallasTXUSA
```

```text
yacybot (freeworld/global; amd64 Linux 2.6.23; java 1.6.0_26; Europe/en) http://yacy.net/bot.html
```

```text
Bad-Neighborhood Link Analyzer (http://www.bad-neighborhood.com/)
```

```text
Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)
```

```text
Pixray-Seeker/2.0 (Pixray-Seeker; http://www.pixray.com/pixraybot; crawler@pixray.com)
```

```text
Cityreview Robot (+http://www.cityreview.org/crawler/)
```

```text
yacybot (/global; amd64 Linux 3.16.0-4-amd64; java 1.7.0_95; Europe/en) http://yacy.net/bot.html
```

```text
Dlvr.it/1.0 (http://dlvr.it/)
```

```text
Zookabot/2.4;++http://zookabot.com
```

```text
AddThis.com robot tech.support@clearspring.com
```

```text
Mozilla/5.0 (compatible; discoverybot/2.0; +http://discoveryengine.com/discoverybot.html)
```

```text
TwengaBot
```

```text
Mozilla/5.0 (compatible; 4SeoHuntBot; +http://4seohunt.biz/about.html)
```

```text
Web-sniffer.me/1.0.0 (+http://web-sniffer.me/)
```

```text
KDDI-CA31 UP.Browser/6.2.0.7.3.129 (GUI) MMP/2.0 (compatible; ichiro/mobile goo; +http://search.goo.ne.jp/option/use/sub4/sub4-1/)
```

```text
gosquared-thumbnailer/1.0
```

```text
yacybot (/global; amd64 Windows 7 6.1; java 1.7.0_55; Asia/zh) http://yacy.net/bot.html
```

```text
BPImageWalker/2.0 (www.bdbrandprotect.com)
```

```text
Acoon v4.10.3 (www.acoon.de)
```

```text
Mozilla/4.0 (CMS Crawler: http://www.cmscrawler.com)
```

```text
WP Engine Install Performance API
```

```text
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.24; ips-agent) Gecko/20111107 Ubuntu/10.04 (lucid) Firefox/3.6.24
```

```text
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko PTST/277
```

--------------------------------

### Mozilla Firefox User Agents

Source: https://github.com/jaybizzle/crawler-detect/blob/master/tests/data/user_agent/crawlers.txt

Examples of user agent strings for Mozilla Firefox, including specific
versions and operating system information. Each string also carries a tool token (Observatory, Project-Resonance, Hardenize, or PageThing); the Project-Resonance entry reports Chrome rather than Firefox.

```APIDOC
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:82.0) Gecko/20100101 Firefox/82.0 Observatory/82.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) Project-Resonance (http://project-resonance.com/) (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:46.0) Gecko/20100101 Firefox/46.0 | Hardenize (https://www.hardenize.com)
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:46.0) Gecko/20100101 Firefox/46.0 | Hardenize/v1.1196.1 (https://www.hardenize.com)
Mozilla/5.0 (Windows; U; Windows NT 5.1; pl; PageThing http://pagething.com); rv:1.9; Gecko/2008052906 Firefox/3.0
```

--------------------------------

### Common Web Crawler User Agents

Source: https://github.com/jaybizzle/crawler-detect/blob/master/tests/data/user_agent/crawlers.txt

This snippet provides a sample of user agent strings commonly encountered from various web crawlers and bots. It includes examples from search engines, monitoring services, and research initiatives.

```user-agent
Mozilla/5.0 (compatible; NLNZ_IAHarvester2014 +https://natlib.govt.nz/publishers-and-authors/web-harvesting/domain-harvest)
Mozilla/5.0 (compatible; YandexImageResizer/2.0; +http://yandex.com/bots)
mahonie, neofonie search:robot/search:robot/0.0.1 (This is the MIA Bot - crawling for mia research project. If you feel unhappy and do not want to be visited by our crawler send an email to spider@neofonie.de; http://spider.neofonie.de; spider@neofonie.de)
Mozilla/5.0 (Linux; Android 5.0; Nexus 5 Build/LRX21O) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.76 Mobile Safari/537.36 PTST/276
Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; WOW64; Trident/5.0; BingPreview/1.0b)
Mozilla/5.0 (compatible; OptimizationCrawler/0.2; +http://www.domainoptima.com/bot.html)
Mozilla/5.0 (compatible; Cloudinary/1.0)
rogerbot/1.0 (http://www.seomoz.org/dp/rogerbot, rogerbot-crawler+shiny@seomoz.org)
Iframely/0.9.8 (+http://iframely.com/)
yacybot (webportal-global; amd64 Linux 2.6.32-5-amd64; java 1.6.0_18; Europe/en) http://yacy.net/bot.html
Interdose AntiSpamBot/2.0 (+http://www.piep.net)
CheckMarkNetwork/1.0 (+http://www.checkmarknetwork.com/spider.html)
Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/537.36 (KHTML, like Gecko) Version/8.0 Mobile/12F70 Safari/600.1.4 (compatible; Google Search Console)
findlinks/2.5 (+http://wortschatz.uni-leipzig.de/findlinks/)
Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; WOW64; Trident/6.0) PTST/277
yacybot (/global; x86 Windows 2003 5.2; java 1.7.0_67; Europe/de) http://yacy.net/bot.html
NeutrinoAPI/2.0.3
Mozilla/5.0 (compatible; spbot/5.0.3; +http://OpenLinkProfiler.org/bot )
Mozilla/5.0 (compatible; MJ12bot/v1.4.4; http://www.majestic12.co.uk/bot.php?+)
Site24x7 Agent
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/6.0; .NET4.0E; .NET4.0C; .NET CLR 3.5.30729; .NET CLR 2.0.50727; .NET CLR 3.0.30729) CrawlerProcess (http://www.PowerMapper.com) /5.7.704.0
Mozilla/5.0 (compatible; Falconsbot; +http://iws.seu.edu.cn/services/falcons/contact_us.jsp)
Super Monitoring
Seznam-Zbozi-robot/3.0
LinkedInBot/1.0 (compatible; Mozilla/5.0; Jakarta Commons-HttpClient/4.3 +http://www.linkedin.com)
Mozilla/5.0 (compatible; EveryoneSocialBot/1.0; support@everyonesocial.com http://everyonesocial.com/)
Mozilla/5.0 (compatible; IstellaBot/1.01.18 +http://www.tiscali.it/)
Mozilla/5.0 (compatible; ToutiaoSpider/1.0; http://web.toutiao.com/media_cooperation/;)
Mozilla/5.0 (compatible; BegunAdvertising/3.0; +http://begun.ru/begun/technology/indexer/)
Netcraft SSL Server Survey - contact info@netcraft.com
Mozilla/5.0 (compatible; Qualidator.com SiteAnalyzer 1.0;)
Mozilla/5.0 (compatible; OrangeBot/2.0; support.orangebot@orange.com)
Mozilla/5.0 (compatible; CloudFlare-AlwaysOnline/1.0; +http://www.cloudflare.com/always-online)
seebot/1.0.0 (http://www.seegnify.com/bot)
Slackbot-LinkExpanding 1.0 (+https://api.slack.com/robots)
Eurobot/1.1 (http://eurobot.ayell.eu)
Mozilla/5.0 (compatible; IXEbot; +http://medialab.di.unipi.it/IXEbot.html)
Mozilla/5.0 (GetLinkInfo.com - http://www.getlinkinfo.com)
Mozilla/5.0 (compatible; spbot/4.0.5; +http://www.seoprofiler.com/bot )
yacybot (freeworld/global; amd64 Windows 7 6.1; java 1.6.0_31; America/en) http://yacy.net/bot.html
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0 PTST/314
Mozilla/5.0 (compatible; AhrefsBot/4.0; +http://ahrefs.com/robot/)
UXCrawlerBot
Pixray-Seeker/2.0 (http://www.pixray.com/pixraybot; crawler@pixray.com)
Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.94 Safari/537.36 PTST/276
FAST-WebCrawler/3.7 (atw-crawler at fast dot no; http://fast.no/support/crawler.asp)
yacybot (freeworld/global; amd64 Windows 10 10.0; java 1.8.0_60; Europe/de) http://yacy.net/bot.html
Technoratibot/7.0
Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0; DomainDB-1.1; http://domaindb.com/crawler/)
Abrave v6.0 (http://robot.abrave.com)
Testomatobot/1.0 (Linux x86_64; +http://www.testomato.com/testomatobot)
minicrawler/4.0.0~beta11
DowntimeDetector/1.0 (+http://foreveryoneorjustme.com)
Comodo Spider 1.1
Link Valet Online 1.1
yacybot (freeworld/global; amd64 Windows XP 5.2; java 1.7.0_04; America/en) http://yacy.net/bot.html
li_viewer (larbin2.6.3@unspecified.mail)
Mozilla/5.0 (compatible; Mail.RU/2.0c)
Mozilla/5.0 (compatible; aiHitBot/1.0; +http://www.aihit.com/)
Mozilla/5.0 (compatible; Uptimebot/0.2.19; +http://www.uptime.com/uptimebot)
Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
Mozilla/5.0 (iPhone; CPU iPhone OS 6_0 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10A5376e Safari/8536.25 ( compatible; CloudServerMarketSpider/1.0; +http://cloudservermarket.com/spider.html )
yacybot (i386 Linux 2.6.28-13-generic; java 1.6.0_13; Europe/en) http://yacy.net/bot.html
```
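
--------------------------------

The General Parsing Rules listed in the User-Agent Parsing Examples section (comma-separated segments of the form `"ProductName";v="Version"`, with placeholder brands ignored) can be sketched in a few lines. This is a minimal illustration, not crawler-detect's implementation: `parse_brands` and `is_headless` are hypothetical names, and treating every brand that starts with `Not` as a placeholder is an assumption drawn only from the three examples in that section.

```python
import re
from typing import Dict

# One segment: a quoted brand, then ;v= and a quoted version.
_BRAND = re.compile(r'"([^"]*)";\s*v="([^"]*)"')


def parse_brands(value: str) -> Dict[str, str]:
    """Map each real brand in a brand-list value to its version string."""
    brands: Dict[str, str] = {}
    for name, version in _BRAND.findall(value):
        # Assumption: placeholder brands in the examples all start with "Not".
        if name.startswith("Not"):
            continue
        brands[name] = version
    return brands


def is_headless(value: str) -> bool:
    """True when the brand list advertises HeadlessChrome."""
    return "HeadlessChrome" in parse_brands(value)


sample = '"HeadlessChrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"'
print(parse_brands(sample), is_headless(sample))
```

Matching on the quoted pairs instead of splitting on commas keeps the sketch independent of brand order, which differs between Example 1 and Example 3.
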