### Example Robots.txt Configuration

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

This example shows how to configure rules for a specific user-agent and declare multiple sitemap locations. Note that the 'disallow' path is case-sensitive.

```robots.txt
user-agent: otherbot
disallow: /kale

sitemap: https://example.com/sitemap.xml
sitemap: https://cdn.example.org/other-sitemap.xml
sitemap: https://ja.example.org/テスト-サイトマップ.xml
```

--------------------------------

### Storebot-Google robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the Storebot-Google user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: Storebot-Google
allow: /archive/1Q84
disallow: /archive/konbini
```

--------------------------------

### Basic robots.txt file example

Source: https://developers.google.com/crawling/docs/robots-txt/create-robots-txt

This example shows a simple robots.txt file with rules for Googlebot and all other user agents, along with a sitemap declaration. Ensure your robots.txt file is UTF-8 encoded.

```robots.txt
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
```

--------------------------------

### Googlebot-Video robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot-Video.

```text
user-agent: Googlebot-Video
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Example Robots.txt with Sitemaps

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

This example shows how to disallow crawling for a specific user-agent and provides multiple sitemap locations. Sitemap URLs must be fully qualified.

```robots.txt
user-agent: otherbot
disallow: /kale

sitemap: https://example.com/sitemap.xml
sitemap: https://cdn.example.org/other-sitemap.xml
sitemap: https://ja.example.org/テスト-サイトマップ.xml
```

--------------------------------

### GoogleOther-Video robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the GoogleOther-Video user agent. This specifies which video archives are allowed or disallowed for crawling.

```robots.txt
user-agent: GoogleOther-Video
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Google-CloudVertexBot robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the Google-CloudVertexBot user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: Google-CloudVertexBot
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### GoogleOther robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the GoogleOther user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: GoogleOther
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Google-InspectionTool robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the Google-InspectionTool user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: Google-InspectionTool
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Googlebot robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot.

```text
user-agent: Googlebot
allow: /archive/1Q84
disallow: /archive
```

--------------------------------

### GoogleOther-Image robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the GoogleOther-Image user agent. This specifies which image archives are allowed or disallowed for crawling.

```robots.txt
user-agent: GoogleOther-Image
allow: /archive/1Q84
disallow: /archive/moon.jpg
```

--------------------------------

### Internal Merged Rules Example

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

This is the internally merged representation of the groups shown in the "Internal Merging of Rules for a Specific User Agent" example, with the two 'googlebot-news' groups combined into one and the '*' group left unchanged.

```robots.txt
user-agent: googlebot-news
disallow: /fish
disallow: /shrimp

user-agent: *
disallow: /carrots
```

--------------------------------

### Googlebot-Image robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot-Image.

```text
user-agent: Googlebot-Image
allow: /archive/1Q84
disallow: /archive/moons.jpg
```

--------------------------------

### Googlebot-News robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot-News.

```text
user-agent: Googlebot-News
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Example faceted navigation URL

Source: https://developers.google.com/crawling/docs/faceted-navigation

A typical URL structure using query parameters to filter content.

```text
https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny
```

--------------------------------

### APIs-Google robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the APIs-Google crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: APIs-Google
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### AdSense robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the Mediapartners-Google crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: Mediapartners-Google
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### AdsBot robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the AdsBot-Google crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: AdsBot-Google
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Example robots.txt configuration

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

This robots.txt file controls crawling of URLs under a specific domain. It disallows all crawlers from accessing files in the 'includes' directory but allows Googlebot to crawl them for rendering purposes. It also specifies the location of the sitemap.

```robots.txt
# This robots.txt file controls crawling of URLs under https://example.com.
# All crawlers are disallowed to crawl files in the "includes" directory, such
# as .css, .js, but Google needs them for rendering, so Googlebot is allowed
# to crawl them.
User-agent: *
Disallow: /includes/

User-agent: Googlebot
Allow: /includes/

Sitemap: https://example.com/sitemap.xml
```

--------------------------------

### AdsBot Mobile Web robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the AdsBot-Google-Mobile crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: AdsBot-Google-Mobile
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Matching User-Agent Fields for Precedence

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Crawlers select the most specific user agent group. This example illustrates how 'googlebot-news', 'googlebot', and '*' are matched.

```robots.txt
user-agent: googlebot-news
(group 1)

user-agent: *
(group 2)

user-agent: googlebot
(group 3)
```

--------------------------------

### Define user-agent access rules

Source: https://developers.google.com/crawling/docs/robots-txt/create-robots-txt

Examples of blocking specific crawlers or all crawlers using User-agent and Disallow directives.

```text
# Example 1: Block only Googlebot
User-agent: Googlebot
Disallow: /

# Example 2: Block Googlebot and Adsbot
User-agent: Googlebot
User-agent: AdsBot-Google
Disallow: /

# Example 3: Block all crawlers except AdsBot (AdsBot crawlers must be named explicitly)
User-agent: *
Disallow: /
```

--------------------------------

### Ignoring Non-Rule Directives in Grouping

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Rules other than 'allow', 'disallow', and 'user-agent' are ignored. This example shows 'sitemap' being ignored, effectively grouping 'a' and 'b' under the 'disallow: /' rule.

```robots.txt
user-agent: a
sitemap: https://example.com/sitemap.xml

user-agent: b
disallow: /
```

--------------------------------

### Internal Merging of Rules for a Specific User Agent

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

When multiple groups are relevant to a user agent, crawlers internally merge the rules. This example shows merging for 'googlebot-news'.

```robots.txt
user-agent: googlebot-news
disallow: /fish

user-agent: *
disallow: /carrots

user-agent: googlebot-news
disallow: /shrimp
```

--------------------------------

### Wildcard '/fish*.php' Matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

This rule matches any path that starts with '/fish' and contains '.php' later in the path, for example '/fish.php' or '/fish_data.php'. Because the pattern has no trailing '$', other characters may follow '.php'.

```robots.txt
allow: /fish*.php
```

--------------------------------

### Wildcard '/*.php' Matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

This rule matches any path that contains '.php', at any directory level. Because the pattern is not anchored with '$', it also matches when other characters follow '.php'.

```robots.txt
allow: /*.php
```

--------------------------------

### Wildcard '/fish*' Matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

This rule matches all paths that begin with '/fish', including '/fish' itself and any sub-path or file name that begins with '/fish'. The trailing wildcard is ignored, so the rule is equivalent to '/fish'.

```robots.txt
allow: /fish*
```

--------------------------------

### Verify name server configuration using dig

Source: https://developers.google.com/crawling/docs/troubleshooting/dns-network-errors

Use these commands to check that name servers are correctly configured and pointing to the expected IP addresses for your domain.

```bash
dig +nocmd example.com ns +noall +answer
example.com. 86400 IN NS a.iana-servers.net.
example.com. 86400 IN NS b.iana-servers.net.
dig +nocmd @a.iana-servers.net example.com +noall +answer
example.com. 86400 IN A 93.184.216.34
dig +nocmd @b.iana-servers.net example.com +noall +answer
...
```

--------------------------------

### GET /special-crawlers.json

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Retrieves the IP ranges for Google's special-case crawlers.

```APIDOC
## GET /special-crawlers.json

### Description
Returns the IP ranges used by Google's special-case crawlers. These crawlers operate from different IP ranges than common crawlers.

### Method
GET

### Endpoint
/special-crawlers.json
```

--------------------------------

### Grouping Rules for Multiple User-Agents

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

Demonstrates how to apply the same rules to multiple user-agents by repeating the user-agent lines. This creates distinct rule groups.

```robots.txt
user-agent: a
disallow: /c

user-agent: b
disallow: /d

user-agent: e
user-agent: f
disallow: /g

user-agent: h
```

--------------------------------

### Wildcard '$' Matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Use the '$' wildcard to designate the end of the URL. This is useful for matching one specific path exactly, for example only the root.

```robots.txt
allow: /$
```

```robots.txt
allow: /*.php$
```

--------------------------------

### Wildcard '*' Matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Use the '*' wildcard to match zero or more occurrences of any character in the path. The wildcard is useful inside path values, but a trailing '*' is generally ignored.

```robots.txt
allow: /
```

```robots.txt
allow: /*
```

```robots.txt
allow: /fish*
```

```robots.txt
allow: /*.php
```

--------------------------------

### robots.txt for Google-Extended

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers?hl=zh-cn

This robots.txt configuration defines rules for the Google-Extended user agent, which is used to manage content availability for training future Gemini models. It allows specific archive paths while disallowing others.

```robots.txt
user-agent: Google-Extended
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Rule Conflict Example 2

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

When 'allow' and 'disallow' rules conflict, Google uses the least restrictive rule. In this example both rules have the same path, so 'allow: /folder' is the least restrictive and applies.

```robots.txt
allow: /folder
disallow: /folder
```

--------------------------------

### Download robots.txt with curl

Source: https://developers.google.com/crawling/docs/robots-txt/submit-updated-robots-txt

Use this command to download a copy of your robots.txt file from your server to your local machine.

```bash
curl https://example.com/robots.txt -o robots.txt
```

--------------------------------

### Rule Conflict Example 4

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

When rules conflict, Google uses the least restrictive rule. In this example the two paths are the same length (five characters), so neither is more specific and 'allow: /page' applies as the least restrictive rule.

```robots.txt
allow: /page
disallow: /*.ph
```

--------------------------------

### Rule Conflict Example 3

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Google chooses the more specific rule. In this example, '/*.htm' is more specific than '/page' because its path is longer and it matches more characters of the URL.

```robots.txt
allow: /page
disallow: /*.htm
```

--------------------------------

### Configure robots.txt for Google-Extended

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Use this configuration in your robots.txt file to restrict Google-Extended from accessing specific directories while allowing access to others.

```text
user-agent: Google-Extended
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Allow Rule Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

Use the allow rule to specify URL paths that designated crawlers can access. Rules without a path are ignored. Paths are case-sensitive.

```robots.txt
allow: [path]
```

--------------------------------

### Verify DNS records using dig

Source: https://developers.google.com/crawling/docs/troubleshooting/dns-network-errors

Use these commands to inspect A and CNAME records to ensure they point to the correct IP addresses and hostnames.

```bash
dig +nocmd example.com a +noall +answer
```

```bash
dig +nocmd www.example.com cname +noall +answer
```

--------------------------------

### Rule Conflict Example 1

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

When both 'allow' and 'disallow' rules match, Google chooses the more specific rule. In this example, '/p' is more specific than '/'.

```robots.txt
allow: /p
disallow: /
```

--------------------------------

### Path Matching Examples

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Shows how different path values and their wildcards affect URL matching. Note that matching is case-sensitive and that a trailing slash changes which paths match.

```robots.txt
allow: /fish
```

```robots.txt
allow: /fish/
```

--------------------------------

### Rule Conflict Example 5

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Google chooses the more specific rule. For the root URL, 'allow: /$' is more specific than 'disallow: /' because it matches the root exactly.

```robots.txt
allow: /$
disallow: /
```

--------------------------------

### Googlebot Desktop User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The primary user agent string used by Googlebot Desktop in HTTP requests.

```text
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### Verify rate-limited proxy IP address

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests

Check rate-limited proxy IP addresses by using 'host' for reverse and forward DNS lookups. The domain should be google.com.

```bash
host 66.249.90.77
77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.
host rate-limited-proxy-66-249-90-77.google.com
rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77
```

--------------------------------

### Googlebot-Video User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The user agent string used by Googlebot-Video in HTTP requests.

```text
Googlebot-Video/1.0
```

--------------------------------

### Verify common crawler IP address

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests

Use the 'host' command for reverse and forward DNS lookups to verify common Google crawlers. Ensure the domain name is googlebot.com and the IP matches.

```bash
host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
```

--------------------------------

### Storebot-Google Desktop User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the Storebot-Google crawler when making desktop HTTP requests. This crawler affects Google Shopping surfaces.

```http
Mozilla/5.0 (X11; Linux x86_64; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### Sitemap Directive Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

Specify the absolute URL of a sitemap or sitemap index file. This directive is not tied to a specific user-agent and can be specified multiple times.

```robots.txt
sitemap: [absoluteURL]
```

--------------------------------

### Specify sitemap locations

Source: https://developers.google.com/crawling/docs/robots-txt/create-robots-txt

Optional directive to provide the fully-qualified URL of a sitemap file.

```text
Sitemap: https://example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap.xml
```

--------------------------------

### Combine Multiple User Agents in a Single Group

Source: https://developers.google.com/crawling/docs/robots-txt/useful-robots-txt-rules

Consolidate rules for multiple crawlers into one group for easier management. All rules within the group apply to every listed user agent.

```robots.txt
User-agent: Googlebot
User-agent: Storebot-Google
Allow: /cats
Disallow: /
```

--------------------------------

### Allow Rule Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Use the 'allow' field to specify URL paths that designated crawlers may access. The field name is case-insensitive, but the path value is case-sensitive. Rules without a path are ignored.

```robots.txt
allow: [path]
```

--------------------------------

### GoogleOther Generic User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

A generic User-Agent string for the GoogleOther crawler. This crawler is used for fetching publicly accessible content for internal research and development.

```http
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### GoogleOther-Video User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the GoogleOther-Video crawler. This version of GoogleOther is optimized for fetching publicly accessible video URLs.

```http
GoogleOther-Video/1.0
```

--------------------------------

### Verify geo-crawled IP address

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests

Verify geo-crawled IP addresses by performing reverse and forward DNS lookups using the 'host' command. The domain should be geo.googlebot.com.

```bash
host 35.247.243.240
240.243.247.35.in-addr.arpa domain name pointer geo-crawl-35-247-243-240.geo.googlebot.com.
host geo-crawl-35-247-243-240.geo.googlebot.com
geo-crawl-35-247-243-240.geo.googlebot.com has address 35.247.243.240
```

--------------------------------

### Sitemap Declaration Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Use the 'sitemap' field to declare the absolute URL of your sitemap or sitemap index file. This field is case-insensitive and not tied to a specific user agent.

```robots.txt
sitemap: [absoluteURL]
```

--------------------------------

### Google-CloudVertexBot User-Agent Substring

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent substring for the Google-CloudVertexBot crawler. This crawler affects crawls for building Vertex AI Agents.

```http
Google-CloudVertexBot
```

--------------------------------

### Google-InspectionTool Desktop User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the Google-InspectionTool crawler when making desktop HTTP requests. This crawler is used for Search testing tools.

```http
Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)
```

--------------------------------

### Googlebot-Image User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The user agent string used by Googlebot-Image in HTTP requests.

```text
Googlebot-Image/1.0
```

--------------------------------

### GoogleOther User Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers?hl=zh-cn

This is the user agent string for the general GoogleOther crawler, used for various internal research and development tasks. It may appear in HTTP requests.

```text
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; GoogleOther)
```

```text
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### Google Site Verifier User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-user-triggered-fetchers

Identifies the User-Agent string used by the Google Site Verifier service.

```APIDOC
## Google Site Verifier User-Agent

### Description
Google Site Verifier fetches Search Console verification tokens to confirm site ownership.

### User-Agent
Mozilla/5.0 (compatible; Google-Site-Verification/1.0)
```

--------------------------------

### Googlebot Smartphone User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The user agent string used by Googlebot Smartphone in HTTP requests.

```text
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
```

--------------------------------

### GoogleOther Mobile User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the GoogleOther crawler when making mobile HTTP requests. This is a generic crawler for various product teams.

```http
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; GoogleOther)
```

--------------------------------

### Use URL fragments for filters

Source: https://developers.google.com/crawling/docs/faceted-navigation

Replace query parameters with URL fragments to prevent crawlers from treating filtered views as distinct crawlable pages.

```text
https://example.com/items.shtm#products=fish&color=radioactive_green&size=tiny
```

--------------------------------

### Allow Crawling of Entire Site

Source: https://developers.google.com/crawling/docs/robots-txt/useful-robots-txt-rules

This rule explicitly permits all crawlers to access the entire site. It is equivalent to having no robots.txt file or using an 'Allow: /' rule.

```robots.txt
User-agent: *
Disallow:
```
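
--------------------------------

### Sketch: evaluating wildcard and conflicting rules

The wildcard and rule-conflict entries in this collection describe three behaviours: '*' matches any run of characters, a trailing '$' anchors the end of the URL, and when several rules match, the longest path wins, with 'allow' preferred on a tie. The Python sketch below is one way to model that; it is not Google's parser. The function names `pattern_to_regex`, `applicable_rule`, and `is_allowed` are hypothetical, and it assumes that "most specific" means the length of the path pattern as written, including any wildcard characters.

```python
import re


def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the URL path; everything else is literal.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def applicable_rule(rules, url_path):
    """Return the winning (kind, path) pair, or None if nothing matches.

    rules is a list of ("allow" | "disallow", path) tuples for one group.
    """
    best = None
    for kind, path in rules:
        if not path:
            continue  # rules without a path are ignored
        # re.match anchors at the start, so patterns act as prefixes.
        if pattern_to_regex(path).match(url_path) is None:
            continue
        if best is None or len(path) > len(best[1]):
            best = (kind, path)  # longer path is treated as more specific
        elif len(path) == len(best[1]) and kind == "allow":
            best = (kind, path)  # tie: prefer the least restrictive rule
    return best


def is_allowed(rules, url_path):
    """A URL path is crawlable unless a disallow rule wins."""
    rule = applicable_rule(rules, url_path)
    return rule is None or rule[0] == "allow"


# Rule Conflict Example 3 from this collection, checked against /page.htm.
example_3 = [("allow", "/page"), ("disallow", "/*.htm")]
print(applicable_rule(example_3, "/page.htm"))
```

Translating each pattern to a regex keeps the matching logic short. A production parser would also need to handle percent-encoding and user-agent group selection, which this sketch leaves out.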