### Example Robots.txt Configuration

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

This example shows how to configure rules for a specific user-agent and declare multiple sitemap locations. Note that the 'disallow' path is case-sensitive.

```robots.txt
user-agent: otherbot
disallow: /kale

sitemap: https://example.com/sitemap.xml
sitemap: https://cdn.example.org/other-sitemap.xml
sitemap: https://ja.example.org/テスト-サイトマップ.xml
```
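As a rough illustration of how a consumer might read these declarations, the sketch below collects `sitemap` lines from a robots.txt body and keeps only fully qualified URLs. The function name `sitemap_urls` is hypothetical and not part of any Google tooling; comment handling is simplified.

```python
from urllib.parse import urlsplit


def sitemap_urls(robots_txt):
    """Return the absolute sitemap URLs declared in a robots.txt body."""
    urls = []
    for raw in robots_txt.splitlines():
        # Simplification: treat everything after '#' as a comment.
        line = raw.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are case-insensitive.
        if not sep or field.strip().lower() != "sitemap":
            continue
        value = value.strip()
        parts = urlsplit(value)
        # Keep only fully qualified URLs (scheme and host present).
        if parts.scheme and parts.netloc:
            urls.append(value)
    return urls


example = """user-agent: otherbot
disallow: /kale

sitemap: https://example.com/sitemap.xml
sitemap: https://cdn.example.org/other-sitemap.xml
sitemap: https://ja.example.org/テスト-サイトマップ.xml
"""
print(sitemap_urls(example))
```

Sitemap lines are collected regardless of which group they appear near, since the directive is not tied to a user agent.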

--------------------------------

### Storebot-Google robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the Storebot-Google user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: Storebot-Google
allow: /archive/1Q84
disallow: /archive/konbini
```

--------------------------------

### Basic robots.txt file example

Source: https://developers.google.com/crawling/docs/robots-txt/create-robots-txt

This example shows a simple robots.txt file with rules for Googlebot and all other user agents, along with a sitemap declaration. Ensure your robots.txt file is UTF-8 encoded.

```robots.txt
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Allow: /

Sitemap: https://www.example.com/sitemap.xml
```

--------------------------------

### Googlebot-Video robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot-Video.

```text
user-agent: Googlebot-Video
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Example Robots.txt with Sitemaps

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

This example shows how to disallow crawling for a specific user-agent and provides multiple sitemap locations. Sitemap URLs must be fully qualified.

```robots.txt
user-agent: otherbot
disallow: /kale

sitemap: https://example.com/sitemap.xml
sitemap: https://cdn.example.org/other-sitemap.xml
sitemap: https://ja.example.org/テスト-サイトマップ.xml

```

--------------------------------

### GoogleOther-Video robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the GoogleOther-Video user agent. This specifies which video archives are allowed or disallowed for crawling.

```robots.txt
user-agent: GoogleOther-Video
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Google-CloudVertexBot robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the Google-CloudVertexBot user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: Google-CloudVertexBot
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### GoogleOther robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the GoogleOther user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: GoogleOther
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Google-InspectionTool robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the Google-InspectionTool user agent. This specifies which archives are allowed or disallowed for crawling.

```robots.txt
user-agent: Google-InspectionTool
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Googlebot robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot.

```text
user-agent: Googlebot
allow: /archive/1Q84
disallow: /archive
```

--------------------------------

### GoogleOther-Image robots.txt Example

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

An example robots.txt configuration for the GoogleOther-Image user agent. This specifies which image archives are allowed or disallowed for crawling.

```robots.txt
user-agent: GoogleOther-Image
allow: /archive/1Q84
disallow: /archive/moon.jpg
```

--------------------------------

### Internal Merged Rules Example

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

This is the internally merged representation of a file in which 'googlebot-news' appears in two separate groups: the rules from both groups are combined under 'googlebot-news', and the '*' group is kept as is.

```robots.txt
user-agent: googlebot-news
disallow: /fish
disallow: /shrimp

user-agent: *
disallow: /carrots

```

--------------------------------

### Googlebot-Image robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot-Image.

```text
user-agent: Googlebot-Image
allow: /archive/1Q84
disallow: /archive/moons.jpg
```

--------------------------------

### Googlebot-News robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Example of a robots.txt group configuration for Googlebot-News.

```text
user-agent: Googlebot-News
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Example faceted navigation URL

Source: https://developers.google.com/crawling/docs/faceted-navigation

A typical URL structure using query parameters to filter content.

```text
https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny
```

--------------------------------

### APIs-Google robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the APIs-Google crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: APIs-Google
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### AdSense robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the Mediapartners-Google crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: Mediapartners-Google
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### AdsBot robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the AdsBot-Google crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: AdsBot-Google
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Example robots.txt configuration

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

This robots.txt file controls crawling of URLs under a specific domain. It disallows all crawlers from accessing files in the 'includes' directory but allows Googlebot to crawl them for rendering purposes. It also specifies the location of the sitemap.

```robots.txt
# This robots.txt file controls crawling of URLs under https://example.com.
# All crawlers are disallowed to crawl files in the "includes" directory, such
# as .css, .js, but Google needs them for rendering, so Googlebot is allowed
# to crawl them.
User-agent: *
Disallow: /includes/

User-agent: Googlebot
Allow: /includes/

Sitemap: https://example.com/sitemap.xml
```

--------------------------------

### AdsBot Mobile Web robots.txt Configuration

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Example robots.txt group for the AdsBot-Google-Mobile crawler. This crawler ignores the global user agent (*).

```robots.txt
user-agent: AdsBot-Google-Mobile
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Matching User-Agent Fields for Precedence

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Crawlers select the most specific user agent group. This example illustrates how 'googlebot-news', 'googlebot', and '*' are matched.

```robots.txt
user-agent: googlebot-news
(group 1)

user-agent: *
(group 2)

user-agent: googlebot
(group 3)

```
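The selection step can be sketched as a lookup that tries a crawler's product tokens from most to least specific and falls back to `*`. This is a minimal illustration, not Google's parser: the function name `select_group` and the idea of passing an ordered token list are assumptions made for the example.

```python
def select_group(groups, tokens):
    """Pick the group for a crawler.

    groups: dict mapping a user-agent value to a group label.
    tokens: the crawler's product tokens, most specific first
            (hypothetical input shape for this sketch).
    """
    # User-agent values are matched case-insensitively.
    lowered = {name.lower(): label for name, label in groups.items()}
    for token in tokens:
        label = lowered.get(token.lower())
        if label is not None:
            return label
    # No named group matched: fall back to the wildcard group, if any.
    return lowered.get("*")


groups = {
    "googlebot-news": "group 1",
    "*": "group 2",
    "googlebot": "group 3",
}
print(select_group(groups, ["Googlebot-News", "Googlebot"]))
print(select_group(groups, ["Googlebot"]))
print(select_group(groups, ["Otherbot"]))
```

Only one group is selected per crawler; the order in which groups appear in the file does not matter.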

--------------------------------

### Define user-agent access rules

Source: https://developers.google.com/crawling/docs/robots-txt/create-robots-txt

Examples of blocking specific crawlers or all crawlers using User-agent and Disallow directives.

```text
# Example 1: Block only Googlebot
User-agent: Googlebot
Disallow: /

# Example 2: Block Googlebot and Adsbot
User-agent: Googlebot
User-agent: AdsBot-Google
Disallow: /

# Example 3: Block all crawlers except AdsBot (AdsBot crawlers must be named explicitly)
User-agent: *
Disallow: /
```

--------------------------------

### Ignoring Non-Rule Directives in Grouping

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Rules other than 'allow', 'disallow', and 'user-agent' are ignored. This example shows 'sitemap' being ignored, effectively grouping 'a' and 'b' under the 'disallow: /' rule.

```robots.txt
user-agent: a
sitemap: https://example.com/sitemap.xml

user-agent: b
disallow: /

```

--------------------------------

### Internal Merging of Rules for a Specific User Agent

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

When multiple groups are relevant to a user agent, crawlers internally merge the rules. This example shows merging for 'googlebot-news'.

```robots.txt
user-agent: googlebot-news
disallow: /fish

user-agent: *
disallow: /carrots

user-agent: googlebot-news
disallow: /shrimp

```
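A minimal sketch of this merging, assuming a simplified line format (`field: value`, with `#` comments): consecutive `user-agent` lines share the rules that follow, repeated groups for the same agent are combined, and fields other than `user-agent`, `allow`, and `disallow` are skipped without ending a group. The name `merge_groups` is hypothetical.

```python
from collections import defaultdict


def merge_groups(robots_txt):
    """Return {user_agent: [(rule, path), ...]} with repeated groups merged."""
    merged = defaultdict(list)
    current_agents = []
    seen_rule = False  # True once the current group has an allow/disallow line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            # A user-agent line after a rule starts a new group.
            if seen_rule:
                current_agents = []
                seen_rule = False
            current_agents.append(value.lower())
        elif field in ("allow", "disallow"):
            seen_rule = True
            if value:  # rules without a path are ignored
                for agent in current_agents:
                    merged[agent].append((field, value))
        # Any other field (for example sitemap) is ignored for grouping.
    return dict(merged)


example = """user-agent: googlebot-news
disallow: /fish

user-agent: *
disallow: /carrots

user-agent: googlebot-news
disallow: /shrimp
"""
print(merge_groups(example))
```

Because unrecognized fields do not break a group, a `sitemap` line placed between two `user-agent` lines leaves both agents in the same group.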

--------------------------------

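### Wildcard '/fish*.php' matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

This rule matches any path that starts with '/fish' and contains '.php' later in the path, for example '/fish.php' or '/fish_data.php'. The end of the path is not anchored, so characters may follow '.php'.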

```robots.txt
allow: /fish*.php

```

--------------------------------

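### Wildcard '/*.php' matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

This rule matches any path that contains '.php', at any directory depth. The end of the path is not anchored, so paths with additional characters after '.php' also match.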

```robots.txt
allow: /*.php

```

--------------------------------

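### Wildcard '/fish*' matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

This rule matches every path that starts with '/fish', including '/fish' itself and any subpath or file name beginning with '/fish'. The trailing wildcard is ignored, so the rule is equivalent to 'allow: /fish'.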

```robots.txt
allow: /fish*

```

--------------------------------

### Verify name server configuration using dig

Source: https://developers.google.com/crawling/docs/troubleshooting/dns-network-errors

Use these commands to check that name servers are correctly configured and pointing to the expected IP addresses for your domain.

```bash
dig +nocmd example.com ns +noall +answer
example.com.    86400  IN  NS  a.iana-servers.net.
example.com.    86400  IN  NS  b.iana-servers.net.
dig +nocmd @a.iana-servers.net example.com +noall +answer
example.com.    86400  IN  A  93.184.216.34
dig +nocmd @b.iana-servers.net example.com +noall +answer
...
```

--------------------------------

### GET /special-crawlers.json

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-special-case-crawlers

Retrieves the IP ranges for Google's special-case crawlers.

```APIDOC
## GET /special-crawlers.json

### Description
Returns the IP ranges used by Google's special-case crawlers. These crawlers operate from different IP ranges than common crawlers.

### Method
GET

### Endpoint
/special-crawlers.json
```
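Once the file has been fetched and parsed, checking an address against the published ranges might look like the sketch below. It assumes the JSON has a top-level `prefixes` list whose entries carry either an `ipv4Prefix` or an `ipv6Prefix` CIDR string; that layout is an assumption here and is not shown in the source above. The sample data uses reserved documentation ranges, not real Google ranges, and `ip_in_prefixes` is a hypothetical helper name.

```python
import ipaddress


def ip_in_prefixes(ip, ranges):
    """Return True if ip falls inside any CIDR prefix listed in ranges."""
    address = ipaddress.ip_address(ip)
    for entry in ranges.get("prefixes", []):
        # Assumed keys: "ipv4Prefix" or "ipv6Prefix".
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr is None:
            continue
        network = ipaddress.ip_network(cidr)
        if address.version == network.version and address in network:
            return True
    return False


# Made-up sample using documentation-only ranges (RFC 5737 / RFC 3849).
sample = {
    "prefixes": [
        {"ipv4Prefix": "192.0.2.0/24"},
        {"ipv6Prefix": "2001:db8::/32"},
    ]
}
print(ip_in_prefixes("192.0.2.10", sample))
print(ip_in_prefixes("198.51.100.1", sample))
```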

--------------------------------

### Grouping Rules for Multiple User-Agents

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

Consecutive user-agent lines share the rules that follow them. This example defines four distinct groups: one for 'a', one for 'b', one for 'e' and 'f' together, and an empty group for 'h'.

```robots.txt
user-agent: a
disallow: /c

user-agent: b
disallow: /d

user-agent: e
user-agent: f
disallow: /g

user-agent: h


```

--------------------------------

### Wildcard '$' matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Use the '$' wildcard to mark the end of the URL. This is useful for matching one exact path, for example only the root.

```robots.txt
allow: /$

```

```robots.txt
allow: /*.php$

```

--------------------------------

### Wildcard '*' matching

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Use the '*' wildcard to match zero or more occurrences of any character. It is useful inside path values, but a trailing '*' is generally ignored.

```robots.txt
allow: /

```

```robots.txt
allow: /*

```

```robots.txt
allow: /fish*

```

```robots.txt
allow: /*.php

```
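One way to reason about these patterns is to translate them into regular expressions: `*` becomes "any run of characters", a trailing `$` anchors the end, and everything else is literal and case-sensitive. The sketch below does that; `path_matches` is a hypothetical helper, not a function from any Google library.

```python
import re


def path_matches(pattern, path):
    """Return True if a robots.txt path pattern matches the URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each '*' wildcard.
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    # re.match anchors at the start, so the pattern must match from the
    # beginning of the path; without '$' anything may follow.
    return re.match(regex, path) is not None


print(path_matches("/fish*", "/fish.html"))
print(path_matches("/*.php", "/folder/filename.php?parameters"))
print(path_matches("/*.php$", "/filename.php?parameters"))
print(path_matches("/$", "/"))
```

Under this translation `/fish` and `/fish*` behave identically, which is why a trailing wildcard adds nothing.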

--------------------------------

### robots.txt for Google-Extended

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers?hl=zh-cn

This robots.txt configuration defines rules for the Google-Extended user agent, which is used to manage content availability for training future Gemini models. It allows specific archive paths while disallowing others.

```robots.txt
user-agent: Google-Extended
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Rule conflict example 2

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

When 'allow' and 'disallow' rules conflict, Google uses the least restrictive rule. In this example both paths are identical, so 'allow: /folder' applies.

```robots.txt
allow: /folder
disallow: /folder

```

--------------------------------

### Download robots.txt with curl

Source: https://developers.google.com/crawling/docs/robots-txt/submit-updated-robots-txt

Use this command to download a copy of your robots.txt file from your server to your local machine.

```bash
curl https://example.com/robots.txt -o robots.txt
```

--------------------------------

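### Rule conflict example 4

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

When rules conflict, Google uses the least restrictive rule. In this example both paths are the same length, so 'allow: /page' applies.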

```robots.txt
allow: /page
disallow: /*.ph

```

--------------------------------

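### Rule conflict example 3

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Google chooses the more specific rule. In this example 'disallow: /*.htm' is more specific than 'allow: /page' because its path is longer and matches more characters of the URL.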

```robots.txt
allow: /page
disallow: /*.htm

```

--------------------------------

### Configure robots.txt for Google-Extended

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

Use this configuration in your robots.txt file to restrict Google-Extended from accessing specific directories while allowing access to others.

```text
user-agent: Google-Extended
allow: /archive/1Q84
disallow: /archive/
```

--------------------------------

### Allow Rule Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

Use the allow rule to specify URL paths that designated crawlers can access. Rules without a path are ignored. Paths are case-sensitive.

```robots.txt
allow: [path]

```

--------------------------------

### Verify DNS records using dig

Source: https://developers.google.com/crawling/docs/troubleshooting/dns-network-errors

Use these commands to inspect A and CNAME records to ensure they point to the correct IP addresses and hostnames.

```bash
dig +nocmd example.com a +noall +answer
```

```bash
dig +nocmd www.example.com cname +noall +answer
```

--------------------------------

### Rule conflict example 1

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

When both an 'allow' and a 'disallow' rule match, Google chooses the more specific rule. In this example '/p' is more specific than '/'.

```robots.txt
allow: /p
disallow: /

```

--------------------------------

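### Path matching examples

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Shows how different path values, with and without wildcards, affect URL matching. Matching is case-sensitive, and '/fish' and '/fish/' match different sets of URLs.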

```robots.txt
allow: /fish

```

```robots.txt
allow: /fish/

```

--------------------------------

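### Rule conflict example 5

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=zh-cn

Google chooses the more specific rule. In this example 'allow: /$' is more specific than 'disallow: /' for the root URL because it matches the root exactly; every other URL falls under 'disallow: /'.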

```robots.txt
allow: /$
disallow: /

```
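The rule conflict examples in this document follow one precedence scheme: among the rules that match, the one with the longest path wins, and a tie goes to `allow` as the least restrictive choice. The sketch below implements that scheme under simplifying assumptions (path length is taken as the raw pattern length, and a URL with no matching rule is allowed). The names `is_allowed` and `_matches` are hypothetical.

```python
import re


def _matches(pattern, path):
    """Match a robots.txt path pattern ('*' wildcard, optional trailing '$')."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        regex += r"\Z"
    return re.match(regex, path) is not None


def is_allowed(rules, path):
    """rules: list of ("allow" | "disallow", pattern) for the selected group."""
    best_kind, best_len = "allow", -1  # no matching rule means allowed
    for kind, pattern in rules:
        if not pattern or not _matches(pattern, path):
            continue
        length = len(pattern)
        # Longer pattern is more specific; on a tie prefer allow.
        if length > best_len or (length == best_len and kind == "allow"):
            best_kind, best_len = kind, length
    return best_kind == "allow"


print(is_allowed([("allow", "/$"), ("disallow", "/")], "/"))
print(is_allowed([("allow", "/$"), ("disallow", "/")], "/page.htm"))
print(is_allowed([("allow", "/page"), ("disallow", "/*.htm")], "/page.htm"))
```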

--------------------------------

### Googlebot Desktop User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The primary user agent string used by Googlebot Desktop in HTTP requests.

```text
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### Verify rate-limited proxy IP address

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests

Check rate-limited proxy IP addresses by using 'host' for reverse and forward DNS lookups. The domain should be google.com.

```bash
host 66.249.90.77
77.90.249.66.in-addr.arpa domain name pointer rate-limited-proxy-66-249-90-77.google.com.

host rate-limited-proxy-66-249-90-77.google.com
rate-limited-proxy-66-249-90-77.google.com has address 66.249.90.77
```

--------------------------------

### Googlebot-Video User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The user agent string used by Googlebot-Video in HTTP requests.

```text
Googlebot-Video/1.0
```

--------------------------------

### Verify common crawler IP address

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests

Use the 'host' command for reverse and forward DNS lookups to verify common Google crawlers. Ensure the domain name is googlebot.com and the IP matches.

```bash
host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.

host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1
```
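The same two-step check can be scripted. The sketch below performs a reverse lookup, confirms the host name ends in one of the domains mentioned in this document (`googlebot.com`, which also covers `geo.googlebot.com`, and `google.com`), then does a forward lookup and confirms the original IP is among the results. The function name and the injectable lookup parameters are assumptions made so the logic can be exercised without network access; `socket.gethostbyname_ex` resolves IPv4 only.

```python
import socket

# Domains taken from the verification examples in this document.
ALLOWED_SUFFIXES = (".googlebot.com", ".google.com")


def verify_google_crawler(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True if ip passes reverse and forward DNS verification."""
    try:
        # Step 1: reverse DNS lookup on the accessing IP address.
        hostname = reverse_lookup(ip)[0].rstrip(".")
        # Step 2: the host name must belong to an expected domain.
        if not hostname.lower().endswith(ALLOWED_SUFFIXES):
            return False
        # Step 3: forward DNS lookup must return the original IP (IPv4 only).
        addresses = forward_lookup(hostname)[2]
    except OSError:
        # Covers socket.herror and socket.gaierror (lookup failures).
        return False
    return ip in addresses


# Stubbed lookups mirroring the host output above (no network needed).
def fake_reverse(ip):
    return ("crawl-66-249-66-1.googlebot.com", [], [ip])


def fake_forward(hostname):
    return (hostname, [], ["66.249.66.1"])


print(verify_google_crawler("66.249.66.1", fake_reverse, fake_forward))
```

Calling `verify_google_crawler("66.249.66.1")` without the stub arguments performs real DNS queries.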

--------------------------------

### Storebot-Google Desktop User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the Storebot-Google crawler when making desktop HTTP requests. This crawler affects Google Shopping surfaces.

```http
Mozilla/5.0 (X11; Linux x86_64; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### Sitemap Directive Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec?hl=es-419

Specify the absolute URL of a sitemap or sitemap index file. This directive is not tied to a specific user-agent and can be specified multiple times.

```robots.txt
sitemap: [absoluteURL]

```

--------------------------------

### Specify sitemap locations

Source: https://developers.google.com/crawling/docs/robots-txt/create-robots-txt

Optional directive to provide the fully-qualified URL of a sitemap file.

```text
Sitemap: https://example.com/sitemap.xml
Sitemap: https://www.example.com/sitemap.xml
```

--------------------------------

### Combine Multiple User Agents in a Single Group

Source: https://developers.google.com/crawling/docs/robots-txt/useful-robots-txt-rules

Consolidate rules for multiple crawlers into one group for easier management. All rules within the group apply to every listed user agent.

```robots.txt
User-agent: Googlebot
User-agent: Storebot-Google
Allow: /cats
Disallow: /
```

--------------------------------

### Allow Rule Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Use the 'allow' field to specify URL paths that designated crawlers may access. The field name is case-insensitive, but the path value is case-sensitive. Rules without a path are ignored.

```robots.txt
allow: [path]
```

--------------------------------

### GoogleOther Generic User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

A generic User-Agent string for the GoogleOther crawler. This crawler is used for fetching publicly accessible content for internal research and development.

```http
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### GoogleOther-Video User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the GoogleOther-Video crawler. This version of GoogleOther is optimized for fetching publicly accessible video URLs.

```http
GoogleOther-Video/1.0
```

--------------------------------

### Verify geo-crawled IP address

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests

Verify geo-crawled IP addresses by performing reverse and forward DNS lookups using the 'host' command. The domain should be geo.googlebot.com.

```bash
host 35.247.243.240
240.243.247.35.in-addr.arpa domain name pointer geo-crawl-35-247-243-240.geo.googlebot.com.

host geo-crawl-35-247-243-240.geo.googlebot.com
geo-crawl-35-247-243-240.geo.googlebot.com has address 35.247.243.240
```

--------------------------------

### Sitemap Declaration Syntax

Source: https://developers.google.com/crawling/docs/robots-txt/robots-txt-spec

Use the 'sitemap' field to declare the absolute URL of your sitemap or sitemap index file. This field is case-insensitive and not tied to a specific user agent.

```robots.txt
sitemap: [absoluteURL]
```

--------------------------------

### Google-CloudVertexBot User-Agent Substring

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent substring for the Google-CloudVertexBot crawler. This crawler affects crawls for building Vertex AI Agents.

```http
Google-CloudVertexBot
```

--------------------------------

### Google-InspectionTool Desktop User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the Google-InspectionTool crawler when making desktop HTTP requests. This crawler is used for Search testing tools.

```http
Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)
```

--------------------------------

### Googlebot-Image User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The user agent string used by Googlebot-Image in HTTP requests.

```text
Googlebot-Image/1.0
```

--------------------------------

### GoogleOther User Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers?hl=zh-cn

This is the user agent string for the general GoogleOther crawler, used for various internal research and development tasks. It may appear in HTTP requests.

```text
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; GoogleOther)
```

```text
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/W.X.Y.Z Safari/537.36
```

--------------------------------

### Google Site Verifier User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-user-triggered-fetchers

Identifies the User-Agent string used by the Google Site Verifier service.

```APIDOC
## Google Site Verifier User-Agent

### Description
Google Site Verifier fetches Search Console verification tokens to confirm site ownership.

### User-Agent
Mozilla/5.0 (compatible; Google-Site-Verification/1.0)
```

--------------------------------

### Googlebot Smartphone User-Agent String

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The user agent string used by Googlebot Smartphone in HTTP requests.

```text
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
```

--------------------------------

### GoogleOther Mobile User-Agent

Source: https://developers.google.com/crawling/docs/crawlers-fetchers/google-common-crawlers

The User-Agent string for the GoogleOther crawler when making mobile HTTP requests. This is a generic crawler for various product teams.

```http
Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; GoogleOther)
```

--------------------------------

### Use URL fragments for filters

Source: https://developers.google.com/crawling/docs/faceted-navigation

Replace query parameters with URL fragments to prevent crawlers from treating filtered views as distinct crawlable pages.

```text
https://example.com/items.shtm#products=fish&color=radioactive_green&size=tiny
```
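The difference between the two URL forms can be seen by splitting them: a fragment is not part of the path and query that a client requests, so both filtered views resolve to the same requested resource. The sketch below is an illustration only; `filter_params` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlsplit


def filter_params(url):
    """Split a URL into the part that is requested and its filter values."""
    parts = urlsplit(url)
    requested = parts.path + ("?" + parts.query if parts.query else "")
    return {
        "requested": requested,  # path and query only; fragment excluded
        "query_filters": parse_qs(parts.query),
        "fragment_filters": parse_qs(parts.fragment),
    }


query_url = "https://example.com/items.shtm?products=fish&color=radioactive_green&size=tiny"
fragment_url = "https://example.com/items.shtm#products=fish&color=radioactive_green&size=tiny"

print(filter_params(query_url)["requested"])
print(filter_params(fragment_url)["requested"])
```

With the fragment form, client-side code has to read the filter values from the fragment and apply them in the browser.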

--------------------------------

### Allow Crawling of Entire Site

Source: https://developers.google.com/crawling/docs/robots-txt/useful-robots-txt-rules

This rule explicitly permits all crawlers to access the entire site. It is equivalent to having no robots.txt file or using an 'Allow: /' rule.

```robots.txt
User-agent: *
Disallow:
```