Which AI crawlers are blocked most by hotels?

GPTBot (OpenAI) leads at 2.9%, followed by Google-Extended (2.7%) and CCBot (2.7%). Training-focused bots are blocked ~2.5x more than search bots (1.0%). This gap shows growing awareness of the difference between AI training and AI-powered search.

Should hotels block AI crawlers?

It depends on your strategy. Blocking training bots (GPTBot, ClaudeBot) prevents your content from being used in model training. But blocking search bots (PerplexityBot, OAI-SearchBot) removes you from AI-powered search results. The "smart" approach (2.1% of hotels) blocks training while allowing search.

Does star classification affect AI blocking rates?

No. Blocking rates are nearly identical across 1-star (2.6%) to 5-star (3.1%) hotels. Star classification has zero meaningful effect on whether hotels block AI crawlers.

Hotelrank Research / robots.txt Study / March 2026

Do Hotels Block AI Crawlers?

Name: Hotel robots.txt AI Blocking Study 2026
Creator: Hotelrank
Published: 2026-03-20
License: https://creativecommons.org/licenses/by/4.0/

We checked 105,002 hotel websites for a robots.txt file. 96.7% block no AI crawler at all. The industry is wide open.

82.2% have robots.txt · 3.3% block any AI · 7.5% in France (outlier) · 2.1% use the "smart" strategy


TL;DR

We checked 105,002 hotel websites across 7 countries for a robots.txt file and parsed every file we found. Only 3.3% block any AI crawler, and just 0.9% block all of them. GPTBot (OpenAI) is the most commonly blocked at 2.9%, while AI search bots like PerplexityBot and OAI-SearchBot are blocked by just 1.0%. The most interesting signal: 2.1% of hotels use a "smart" strategy, blocking training bots while allowing search bots through. France is a clear outlier at 7.5%, more than twice the rate of the next-highest country (Italy, 3.3%).

Executive Summary

The robots.txt file is the first line of defense for any website. It tells crawlers what they can and cannot access. With the rise of AI-powered search engines (ChatGPT, Perplexity, Gemini) and AI model training, hotels face a new decision: should they allow AI bots to crawl their content?

Our analysis of 105,002 hotel websites reveals that the vast majority have not yet made this decision — or have decided to leave the door wide open. Only 3.3% block any AI crawler at all. For context, this means 96.7% of hotel websites are fully accessible to AI training bots and AI-powered search engines alike.

The distinction between training and search bots matters. Training bots (GPTBot, ClaudeBot, CCBot) scrape content to build AI models. Search bots (PerplexityBot, OAI-SearchBot) fetch content to answer user queries in real time. Blocking the first protects your content from being used in training. Blocking the second removes you from AI-powered search results. Understanding this distinction is critical — and our anatomy of ChatGPT hotel search article explains exactly how these bots work.

96.7% have no AI blocking rules · 3.3% block at least one AI bot · 2.5x training vs search blocking gap

The key finding: Hotels that do block AI crawlers are making a deliberate distinction between training and search. Training bots are blocked ~2.5x more often than search bots. This "smart" strategy — blocking training while allowing search — is emerging as the most sophisticated approach, adopted by 2.1% of hotels.

robots.txt Adoption

How many hotel websites have a robots.txt file? (n=105,002 hotels)

Status	Hotels	% of Total
Have robots.txt	86,348	82.2%
Have Sitemap	63,110	60.1%
No robots.txt	18,654	17.8%
Blanket Disallow	958	0.9%

Hotel robots.txt status breakdown

The 60.1% sitemap rate is a positive signal. Hotels that declare a sitemap in their robots.txt are actively helping crawlers discover their content. Combined with the 82.2% robots.txt adoption rate, this suggests that most hotel websites have at least basic crawl management in place — they just haven't updated it for the AI era.

AI Blocking Overview

How does AI bot blocking compare to traditional search engine blocking?

Blocking	Hotels	% of Total
Block Any AI	3,458	3.3%
Block All AI	957	0.9%
Block Googlebot	1,325	1.3%
Block Bingbot	1,160	1.1%

Hotel AI blocking vs traditional search engine blocking

AI blocking (3.3%) is higher than traditional search blocking. Hotels that block Googlebot (1.3%) or Bingbot (1.1%) are likely misconfigured, since blocking your primary search engines is almost never intentional. Outside the 0.9% of hotels with a blanket Disallow, AI blocking looks like a deliberate choice: these hotels target AI crawlers specifically while keeping traditional search open.

Per-Bot Blocking Rates

Which AI bots are hotels blocking? (Chart colored by category: training, search, user agent.)

Hotel AI bot blocking rates by bot

GPTBot leads at 2.9%. Training bots cluster between 2.5% and 2.9%, while search and user-agent bots hover around 1.0%. The ~2.5x gap between training and search bot blocking is the key finding — hotels that actively manage AI access are distinguishing between content scraping for model training and real-time search retrieval.

Full hotel AI bot blocking data

Bot	Provider	Category	Hotels Blocking	% of Total
GPTBot	OpenAI	training	3,036	2.9%
Google-Extended	Google	training	2,793	2.7%
CCBot	Common Crawl	training	2,847	2.7%
Bytespider	ByteDance	training	2,782	2.6%
ClaudeBot	Anthropic	training	2,742	2.6%
Applebot-Extended	Apple	training	2,669	2.5%
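The percentages in the table, and the training-versus-search gap, follow directly from the raw counts. The short Python sketch below recomputes them. The two search-bot counts (1,071 for OAI-SearchBot and 1,083 for PerplexityBot) come from the Opting Out section of this study. Averaging only the six training bots listed here is a simplifying assumption, so the ratio is indicative rather than the study's exact calculation.

```python
TOTAL_HOTELS = 105_002

# Hotels blocking each training bot, copied from the table above.
training_blockers = {
    "GPTBot": 3036,
    "Google-Extended": 2793,
    "CCBot": 2847,
    "Bytespider": 2782,
    "ClaudeBot": 2742,
    "Applebot-Extended": 2669,
}

# Search-bot counts as reported in the Opting Out section of this study.
search_blockers = {"OAI-SearchBot": 1071, "PerplexityBot": 1083}


def share(count, total=TOTAL_HOTELS):
    """Percentage of all hotels, rounded to one decimal place."""
    return round(100 * count / total, 1)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


training_rates = {bot: share(n) for bot, n in training_blockers.items()}
search_rates = {bot: share(n) for bot, n in search_blockers.items()}

# Ratio of the average training-bot count to the average search-bot count.
gap = mean(training_blockers.values()) / mean(search_blockers.values())

if __name__ == "__main__":
    for bot, rate in {**training_rates, **search_rates}.items():
        print(f"{bot}: {rate}%")
    print(f"training/search gap: {gap:.1f}x")
```

For these inputs the ratio comes out at roughly 2.6, in line with the ~2.5x gap reported above.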

Selective Blocking: Training vs Search

Among hotels that block AI, what strategy are they using?

Strategy	What it does	% of Hotels
"Smart" Strategy	Block training, allow search	2.1%
Block Everything	All AI bots blocked	0.9%
Reverse Strategy	Block search, allow training	0.1%

Hotel AI blocking strategy distribution

The "smart" strategy is the most interesting signal in this data. 2.1% of hotels (2,214 properties) block training bots like GPTBot and ClaudeBot while allowing search bots like PerplexityBot and OAI-SearchBot to crawl freely. This means they protect their content from model training while remaining visible in AI-powered search results. Only 58 hotels (0.1%) do the reverse — blocking search while allowing training — which suggests either misconfiguration or a very unusual strategy.
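The strategy labels can be made concrete with a small classifier. The sketch below is one possible reading of them, not the study's actual detection code: it uses Python's standard `urllib.robotparser`, takes the 14-bot list from the Methodology section, and only tests whether each bot may fetch the site root, whereas the study flags any Disallow rule. The function name `classify` and the label strings are illustrative.

```python
from urllib.robotparser import RobotFileParser

# Bot lists as given in the Methodology section of this study.
TRAINING_BOTS = ["GPTBot", "Google-Extended", "CCBot", "Bytespider", "ClaudeBot",
                 "Applebot-Extended", "anthropic-ai", "cohere-ai", "Diffbot"]
SEARCH_BOTS = ["PerplexityBot", "OAI-SearchBot", "YouBot"]
USER_BOTS = ["ChatGPT-User", "Claude-Web"]
ALL_BOTS = TRAINING_BOTS + SEARCH_BOTS + USER_BOTS


def classify(robots_txt):
    """Label a robots.txt by which AI bots are blocked at the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    def blocked(bot):
        # Assumption: a bot counts as blocked only if it cannot fetch "/".
        return not parser.can_fetch(bot, "/")

    training = [b for b in TRAINING_BOTS if blocked(b)]
    search = [b for b in SEARCH_BOTS if blocked(b)]
    total = sum(blocked(b) for b in ALL_BOTS)

    if total == 0:
        return "no AI blocking"
    if total == len(ALL_BOTS):
        return "block everything"
    if training and not search:
        return "smart"      # blocks training, allows all search bots
    if search and not training:
        return "reverse"    # blocks search, allows all training bots
    return "mixed"


if __name__ == "__main__":
    example = "User-agent: GPTBot\nUser-agent: ClaudeBot\nDisallow: /\n"
    print(classify(example))
```

A file that disallows only GPTBot and ClaudeBot is labeled "smart", while a blanket `User-agent: *` with `Disallow: /` is labeled "block everything".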

Understanding OpenAI's 3 crawlers

Per OpenAI's official documentation, each crawler serves a distinct purpose:

Training

GPTBot: crawls content for training generative AI models. Blocking it means your content won't be used for training. This is the one most hotels should consider blocking.

Search

OAI-SearchBot: indexes content for ChatGPT's search features. Blocking it means your site won't be shown in ChatGPT search answers, although it can still appear as a navigational link. Hotels wanting AI search visibility should allow this.

User

ChatGPT-User: triggered when a user asks ChatGPT to browse a page. It's user-initiated, and OpenAI states "robots.txt rules may not apply." Blocking this bot is largely pointless, yet 1,190 hotels do it.

The visibility trade-off is real. Hotels blocking OAI-SearchBot opt out of ChatGPT search results. Hotels blocking PerplexityBot vanish from Perplexity. As AI search becomes a primary discovery channel for travelers, blocking search bots is equivalent to delisting from a search engine. Read our anatomy of ChatGPT hotel search to understand how search bots retrieve and present hotel information.

AI Blocking by Country

France is a clear outlier. Germany, despite being another GDPR market, is the lowest.

% of hotels blocking any AI crawler, by country

France at 7.5% is more than 3x the US (2.1%) or UK (2.0%) rate. But the number is misleading: most of it comes from a single chain.

The Logis Effect: One Chain Explains France's Outlier Status

Logis Hotels is a French cooperative of ~2,300 independent hotels, restaurants, and guesthouses. Their shared CMS/platform includes a robots.txt that blocks 6 AI training bots (GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Applebot-Extended) while allowing all search bots. This single decision affects roughly 950 properties in our dataset.

72.1% of French blockers are Logis · 2.1% is France's rate without Logis · equal to the US rate, so no longer an outlier

Remove Logis from the data, and France drops from 7.5% to 2.1% — exactly the US rate. The "French GDPR culture" hypothesis largely evaporates. What looks like a national trend is actually a single platform decision by a cooperative that bundles AI blocking into its shared infrastructure.

France blocking breakdown (1,317 total blockers):

Group	Hotels	% of French Blockers
Logis Hotels	950	72.1%
Independent/Other	283	21.5%
Aparthotel chains	~90	6.9%
Sofitel + Others	13	1.0%
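The Logis adjustment can be reproduced from three figures in this study: 17,634 French hotels (from the country table), 1,317 French blockers, and 950 Logis properties among them. The Python sketch below is a minimal check. The study does not say whether Logis hotels were also removed from the denominator, so both variants are computed, and that choice is an assumption either way.

```python
FRANCE_HOTELS = 17_634    # French hotels in the dataset (country table)
FRANCE_BLOCKERS = 1_317   # French hotels blocking any AI crawler
LOGIS_BLOCKERS = 950      # Logis properties among those blockers


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


rate_with_logis = pct(FRANCE_BLOCKERS, FRANCE_HOTELS)
logis_share = pct(LOGIS_BLOCKERS, FRANCE_BLOCKERS)

remaining = FRANCE_BLOCKERS - LOGIS_BLOCKERS

# Variant 1: drop Logis from the blockers only, keep all French hotels.
rate_numerator_only = pct(remaining, FRANCE_HOTELS)

# Variant 2: drop Logis from both the blockers and the hotel count.
rate_both = pct(remaining, FRANCE_HOTELS - LOGIS_BLOCKERS)

if __name__ == "__main__":
    print(rate_with_logis, logis_share, rate_numerator_only, rate_both)
```

The first variant gives 2.1% and the second 2.2%. Either way, France lands next to the US rate once Logis is set aside.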

The irony: Logis's blocking is actually the "smart" pattern — they block training bots while allowing search bots. Their hotels remain visible in ChatGPT search and Perplexity. This makes them the largest coordinated example of the training-only blocking strategy in our dataset.

Germany at 1.7% undercuts the GDPR hypothesis. If data protection regulation drove AI blocking, Germany, with its equally strong GDPR enforcement, would be expected to match France. Instead, it has the lowest rate of any country in our dataset. AI blocking in hospitality appears to be driven by platform decisions and agency culture, not regulation.

Full country-level hotel AI blocking data

Country	Hotels	Has robots.txt	Blocks Any AI	Blocks GPTBot	Blocks ClaudeBot
France	17,634	89.8%	7.5%	7.2%	6.5%
Italy	27,319	71.7%	3.3%	2.6%	2.4%
Spain	16,411	83.6%	2.6%	2.0%	2.0%
Netherlands	2,891	86.6%	2.2%	1.8%	1.5%
USA	7,445	90.8%	2.1%	1.8%	1.7%
UK	10,547	89.4%	2.0%	1.7%	1.5%

AI Blocking by Star Classification

Does hotel quality affect AI blocking decisions?

Hotel AI blocking rates by star classification

Zero effect. The range is tight: 2.6% to 3.8% across all star classifications. Whether a hotel is 1-star or 5-star has no meaningful impact on whether it blocks AI crawlers. The slightly higher rate for "Unclassified" properties (3.8%) may reflect a different mix of website platforms rather than a deliberate strategic choice.

Hotel AI blocking rates by star classification

Stars	Hotels	Blocks Any AI	Blocks All AI
1-star	2,699	2.6%	1.2%
2-star	10,222	3.1%	0.8%
3-star	30,199	3.0%	1.1%
4-star	16,548	2.6%	1.0%
5-star	2,062	3.1%	0.6%
Unclassified	43,272	3.8%	0.8%

Blocking Distribution

How many AI bots do hotels block? The pattern is bimodal: 0 or most.

Number of AI bots blocked per hotel

Hotels either block 0 or block most/all. The distribution is bimodal: 101,544 hotels (96.7%) block zero AI bots, while 3,065 hotels (2.9%) block 9-14 bots. Very few hotels block just 1-3 bots (704, or 0.7%). This suggests that AI blocking is typically an all-or-nothing decision — when hotels add AI blocking rules, they tend to copy comprehensive blocklists rather than selectively choosing individual bots.

Hotels by number of AI bots blocked

Bots Blocked	Hotels	% of Total
0	101,544	96.7%
1-3	704	0.7%
4-8	1,689	1.6%
9-14	3,065	2.9%

Who's Actually Blocking?

The 3,458 blocking hotels aren't random — most blocking is chain or platform-driven.

Logis Hotels — 944 properties (27% of all blockers)

6 bots blocked: GPTBot, ClaudeBot, Google-Extended, CCBot, Bytespider, Applebot-Extended

The French cooperative hotel chain accounts for the single largest share of AI blocking. Their blocking is training-only — they allow ChatGPT-User, OAI-SearchBot, and PerplexityBot. This is the "smart" pattern: block AI training, stay visible in AI search. Logis alone explains most of France's 7.5% outlier rate.

Block-everything hotels — 957 properties

14 bots blocked (all AI crawlers)

These hotels use a blanket Disallow: / for all user agents, which blocks every crawler including AI. Many are Italian resort booking platforms (bookitalyhotels.com, Greenblu) or vacation club networks (Diamond Resorts). Notable 5-star blockers: Grand Hotel Des Iles Borromee (Stresa, 4.7★), Aquatio Cave Luxury Hotel & Spa (Matera, 4.7★), Hotel Masseria San Domenico (Fasano, 4.7★).

Partial search bot blocking — ~90 properties

GPTBot fully blocked + OAI-SearchBot blocked on specific paths

Some hotel chains block GPTBot entirely (no training) but only restrict OAI-SearchBot from sensitive paths like /booking/. This is actually a nuanced, smart strategy: the hotel remains visible in ChatGPT Search for discovery queries, but protects its booking funnel. Our detection flags any Disallow rule as a "block," but these hotels are still discoverable.

Sercotel Hotels — 71 properties (Spain)

9 bots blocked: GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, Google-Extended, PerplexityBot, cohere-ai, YouBot, Applebot-Extended

The Spanish chain blocks both training and search bots — including PerplexityBot and YouBot. They allow OAI-SearchBot and Claude-Web but block ChatGPT-User. Per OpenAI's docs, blocking ChatGPT-User is largely pointless: it's user-initiated and "robots.txt rules may not apply." Meanwhile, blocking PerplexityBot means Sercotel hotels are invisible to Perplexity search — a real visibility loss.

Sofitel (Accor luxury) — blocks only Google-Extended

1 bot blocked: Google-Extended

Sofitel Le Scribe Paris Opéra, Sofitel Paris le Faubourg, Sofitel Paris Arc de Triomphe, Sofitel London St James, Sofitel Legend The Grand Amsterdam — they all block only Google-Extended (Gemini training). GPTBot, ClaudeBot, and search bots pass freely. This is the most minimal blocking policy: stop Google from training Gemini on your content, allow everything else.

Paris Spotlight: 35 hotels block AI

Of the more than 4,000 hotels in our Paris dataset, only 35 block any AI crawler. Notable 5-star blockers:

Hotel	Stars	Rating	Blocking
Hôtel Madame Rêve	5★	4.6	6 training bots
Sofitel Le Scribe Paris Opéra	5★	4.6	Google-Extended only
Sofitel Paris le Faubourg	5★	4.6	Google-Extended only
Sofitel Paris Arc de Triomphe	5★	4.6	Google-Extended only

The majority of Paris blockers are Logis Hotels (via their CMS) and aparthotel chains (corporate policy). The palace hotels — Ritz, Plaza Athénée, Le Bristol, Four Seasons George V — do not block any AI crawler.

5-Star Hotels: 64 Block AI

Only 64 of the 2,062 five-star hotels (3.1%) block any AI crawler. The most notable:

Hotel	Location	Rating	Blocking
Villa d'Este	Cernobbio, IT	4.7★	GPTBot + ChatGPT-User
Aman Venice	Venice, IT	4.8★	Google-Extended only
Villa la Massa	Candeli, IT	4.8★	GPTBot + ChatGPT-User
Grand Hotel Des Iles Borromee	Stresa, IT	4.7★	All 14 bots
Equinox Hotel New York	New York, US	4.4★	anthropic-ai only
The Royal Horseguards	London, GB	4.4★	6 training bots
Hôtel Madame Rêve	Paris, FR	4.6★	6 training bots
Gran Hotel Inglés	Madrid, ES	4.7★	6 training bots

Italy dominates the 5-star blocking list. Villa d'Este and Villa la Massa (both luxury Italian properties) block GPTBot and ChatGPT-User specifically — an anti-OpenAI stance. Aman blocks only Google-Extended. Equinox New York blocks only anthropic-ai. Each has a different, seemingly deliberate policy.

Most AI blocking is a chain or CMS decision, not an individual hotel decision. Logis alone (944 hotels) accounts for 27% of all blockers. Add other chains and blanket-blocking platforms (~957), and you've explained ~60% of all AI blocking with just a few patterns. The remaining ~40% is a mix of individual hotels, smaller chains, and hosting providers with default blocking rules.

Opting Out

1,071 Hotels Are Invisible to ChatGPT Search

Blocking OAI-SearchBot has nothing to do with training (GPTBot handles that). What it does is remove your hotel from ChatGPT's search answers.

Bots Blocked	Hotels	% of All Hotels
OAI-SearchBot	1,071	1.0%
PerplexityBot	1,083	1.0%
Only search bots (deliberate opt-out)	58	0.1%

Most hotels that block OAI-SearchBot do so as part of a blanket blocklist — they're blocking everything, not specifically targeting search. But 58 hotels block search bots while allowing training bots, which is the exact opposite of the "smart" strategy. These hotels are opting out of AI discovery while still letting their content be used for model training.

Three blocking patterns we observed

Full block

Blanket Disallow: / for all AI bots

~957 hotels block every crawler including all AI bots. These are typically platform-level decisions (booking platforms, resort networks) rather than individual hotel choices. The hotel is invisible to every AI search engine.

Partial block

Block OAI-SearchBot from specific paths only

Some hotel chains block OAI-SearchBot only from sensitive paths (e.g. /booking/) while allowing it on the rest of the site. This is actually a nuanced, smart strategy: the hotel remains visible in ChatGPT Search but protects its booking funnel from being scraped. Our detection counts these as "blocking OAI-SearchBot," but the hotel is still discoverable.

Smart pattern

Block GPTBot (training), allow OAI-SearchBot (search)

2.1% of hotels block training bots while keeping search bots open. This is the optimal approach: your content won't be used to train AI models, but your hotel still appears when travelers ask ChatGPT for recommendations. Some chains go further by also protecting booking paths from search bots — the most sophisticated policy we observed.
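As an illustration of the smart pattern, the Python sketch below embeds a hypothetical robots.txt that blocks six training bots and leaves everything else open, then checks it with the standard library's `urllib.robotparser`. The file content is an assumption written for this example, not a copy of any hotel's actual robots.txt, and `allowed_bots` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one group for six training bots, all else open.
SMART_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: Google-Extended
User-agent: CCBot
User-agent: Bytespider
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allowed_bots(robots_txt, bots, path="/"):
    """Map each bot name to True if it may fetch `path` under this file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, path) for bot in bots}


result = allowed_bots(
    SMART_ROBOTS_TXT,
    ["GPTBot", "ClaudeBot", "OAI-SearchBot", "PerplexityBot", "Googlebot"],
)

if __name__ == "__main__":
    for bot, ok in result.items():
        print(f"{bot}: {'allowed' if ok else 'blocked'}")
```

With this file, GPTBot and ClaudeBot are blocked while OAI-SearchBot, PerplexityBot, and Googlebot remain allowed. Real crawlers may match user-agent names differently from Python's parser, so it is worth confirming against each provider's documentation.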

Common mistake: blocking ChatGPT-User instead of OAI-SearchBot

1,190 hotels block ChatGPT-User — but this is largely pointless. Per OpenAI's documentation: ChatGPT-User is triggered when a user asks ChatGPT to visit a page or interacts with a Custom GPT. It's user-initiated, not automated crawling — and robots.txt rules may not apply.

Critically, ChatGPT-User is not used to determine whether content appears in ChatGPT Search. That's OAI-SearchBot. Hotels blocking ChatGPT-User think they're opting out of ChatGPT — but they're blocking the wrong bot.

We also observed the reverse mistake: some hotel chains block GPTBot + ChatGPT-User but allow OAI-SearchBot. The result is correct (visible in search, opted out of training) — but likely achieved by accident rather than by understanding the bot taxonomy.

Note on partial blocks

Our detection flags any Disallow rule targeting OAI-SearchBot as a "block." But some of the 1,071 hotels only block specific paths (like /booking/) — not the entire site. These hotels are still discoverable in ChatGPT Search. The true "fully invisible" count is lower than 1,071, concentrated among blanket blockers and platform-level decisions.
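One way to separate partial blocks from full ones is to test the site root and a few sensitive paths separately. The Python sketch below does this with `urllib.robotparser`. The function name `block_scope`, the probe paths, and the sample robots.txt are all hypothetical, and a real audit would need a path list suited to each site.

```python
from urllib.robotparser import RobotFileParser


def block_scope(robots_txt, bot, probe_paths=("/booking/", "/reservations/")):
    """Return "full", "partial", or "open" for one bot.

    "full"    -> the bot cannot fetch the site root
    "partial" -> the root is allowed but at least one probe path is not
    "open"    -> the root and every probe path are allowed
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch(bot, "/"):
        return "full"
    if any(not parser.can_fetch(bot, path) for path in probe_paths):
        return "partial"
    return "open"


# Hypothetical chain-style file: no training, booking funnel protected.
CHAIN_STYLE = """\
User-agent: GPTBot
Disallow: /

User-agent: OAI-SearchBot
Disallow: /booking/
"""

if __name__ == "__main__":
    for bot in ("GPTBot", "OAI-SearchBot", "PerplexityBot"):
        print(bot, block_scope(CHAIN_STYLE, bot))
```

Here GPTBot is a full block, OAI-SearchBot is only a partial block, and PerplexityBot is open. A hotel with this file would count toward the 1,071 under the any-Disallow rule while remaining discoverable in ChatGPT Search.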

Blocking OAI-SearchBot is the new "noindex". When a hotel blocks OAI-SearchBot, it won't appear when travelers ask ChatGPT for recommendations — even if the hotel has great reviews and a strong web presence. As AI search grows as a discovery channel, this is equivalent to delisting yourself from a search engine. Hotels that want to opt out of ChatGPT Search should block OAI-SearchBot — not ChatGPT-User.

Methodology

Data Collection

Source: Same 105,002 reachable hotel websites from our schema adoption study
7 countries: France (17.6K), Italy (27.3K), Spain (16.4K), Netherlands (2.9K), USA (7.4K), UK (10.5K), Germany (22.3K)
robots.txt fetched from each domain root with Chrome-like user agent, 10-second timeout
Each robots.txt parsed for User-agent directives and Disallow rules
14 AI-specific bots tracked across training, search, and user categories
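The parsing step can be sketched in a few lines of Python. This is a minimal illustration of reading User-agent groups, Disallow rules, and Sitemap lines from a robots.txt body, not the study's actual parser. The function name `parse_groups` and the sample file are hypothetical, and fetching (user agent, timeout) is left out so the example runs offline.

```python
def parse_groups(robots_txt):
    """Return ({user_agent_lowercase: [disallowed paths]}, [sitemap URLs])."""
    rules = {}
    sitemaps = []
    current_agents = []
    in_rules = False  # True once the current group has seen a rule line

    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()

        if field == "user-agent":
            if in_rules:
                # A User-agent line after rules starts a new group.
                current_agents = []
                in_rules = False
            agent = value.lower()
            current_agents.append(agent)
            rules.setdefault(agent, [])
        elif field == "disallow":
            in_rules = True
            if value:  # an empty Disallow means "allow everything"
                for agent in current_agents:
                    rules[agent].append(value)
        elif field == "allow":
            in_rules = True
        elif field == "sitemap":
            sitemaps.append(value)

    return rules, sitemaps


# Hypothetical robots.txt used only to exercise the parser.
SAMPLE = """\
# example file
User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /admin/
Disallow:

Sitemap: https://hotel.example/sitemap.xml
"""

rules, sitemaps = parse_groups(SAMPLE)

if __name__ == "__main__":
    print(rules)
    print(sitemaps)
```

Under the study's "blocks any AI" definition, a hotel would be flagged whenever one of the 14 tracked bots has a non-empty list in `rules`. A blanket `*` group needs separate handling, since it applies to every crawler without naming one.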

Bot Classification

Training bots: GPTBot, Google-Extended, CCBot, Bytespider, ClaudeBot, Applebot-Extended, anthropic-ai, cohere-ai, Diffbot
Search bots: PerplexityBot, OAI-SearchBot, YouBot
User agent bots: ChatGPT-User, Claude-Web
"Blocks any AI" = at least one AI bot has a Disallow rule
"Smart strategy" = blocks at least one training bot but allows all search bots
