Does ChatGPT have its own search index for hotels?

Yes. OpenAI's API exposes an external_web_access parameter — set it to false and the model searches only cached/indexed results instead of the live web. Our test of 400 hotel queries shows that 83% of cited domains differ between live and cached mode, confirming OpenAI maintains its own index alongside live web access.

How different are ChatGPT hotel results between live and cached mode?

Very different. Across 100 hotel prompts, the domain Jaccard similarity between live and cached mode is only 0.17 for GPT-5.4 and 0.15 for GPT-5.3. For 13 prompts the overlap is exactly zero — completely disjoint source sets for the same question. The index and the live web are essentially different webs.

How does GPT-5.4 search differently from GPT-5.3 for hotels?

GPT-5.4 behaves like an SEO researcher: it runs ~2 searches per response, uses 11-word queries, and 87% of its queries contain "official." It uses site: operators 27-35% of the time to target specific hotel websites and editorial publishers. GPT-5.3 is a simple one-shot retriever: 1 search per response, 6-7 word queries, no advanced operators.

Which websites does ChatGPT cite most for hotels?

It depends on the model and mode. GPT-5.4 live mode favors Michelin Guide (#1 with 22 citations), CN Traveler, TripAdvisor, and Booking.com. GPT-5.3 live mode is dominated by Wikipedia (#1 with 56 citations). In cached mode, TripAdvisor leads for both models. The source mix shifts dramatically between live and cached — hotels visible in one mode may be invisible in the other.

Does this affect free ChatGPT users?

Very likely yes. Roughly 95% of ChatGPT users are on the free tier, which gets a less capable model with lower search frequency. Free users are more likely to receive answers without web search, meaning the model relies on its training data or cached index. The majority of people asking ChatGPT for hotel recommendations may never get live web results at all.

April 2026 · Live vs Cached

ChatGPT's Hotel Index Is a Different Web

Name: ChatGPT Hotel Index vs Live Web Search Comparison 2026
Creator: Hotelrank
Published: 2026-04-09
License: https://creativecommons.org/licenses/by/4.0/

We flipped one API parameter and got different hotel recommendations. 400 queries, 2 models, 2 search modes. 83% of cited domains change when ChatGPT uses its own index instead of the live web.

TL;DR: OpenAI's API has a live-web switch: external_web_access. Set it to false and ChatGPT searches only its cached corpus. We ran 100 hotel prompts on GPT-5.4 and GPT-5.3 in both modes. 83–85% of cited domains differ. For 13 prompts, the overlap is literally zero.

Nicolas Sitter

Founder, Hotelrank · Published April 9, 2026


Key Findings

OpenAI has an external_web_access parameter in their web search tool. Set it to false and the model searches only cached/indexed results — confirming that OpenAI maintains its own search index alongside live web access.

This is not a minor technical detail. For hotel marketers asking “is my property visible in ChatGPT?”, the answer depends on which ChatGPT — the one with live web (aka Google) access, or the one running on OpenAI's own index. They return fundamentally different source sets, cite different domains, and recommend different properties.

17% domain overlap (GPT-5.4), live vs cached

15% domain overlap (GPT-5.3), live vs cached

0% overlap on 13 prompts: completely disjoint

The one-parameter audit trap: if a hotel marketer runs an “is my property visible in ChatGPT?” check through the API without controlling external_web_access, they could get two contradictory answers from the same model, on the same prompt, seconds apart.

The Index Is a Different Web

Across 100 prompts, only 15–17% of cited domains overlap between live and cached responses. The overlap at the URL level is even lower (6–10%).

Overall Jaccard similarity between live and cached modes
Model	Domain Jaccard	URL Jaccard	Query Jaccard
gpt-5.4	0.17	0.06	0.02
gpt-5.3-chat-latest	0.15	0.10	0.43

For 13 prompts (across both models), the domain Jaccard is exactly 0.0 — the live and cached answers don't share a single source domain.

Zero-overlap examples

gpt-5.4: Best hotels in Cape Town
gpt-5.4: Best hotels in Tokyo
gpt-5.4: Best hotels in Barcelona Gothic Quarter
gpt-5.3: Best hotels in Dubai
gpt-5.3: Best boutique hotels in Rio de Janeiro
gpt-5.3: Best hotels in Marrakech Medina

Example: “Best hotels in Dubai”

GPT-5.3-chat-latest · Domain Jaccard = 0.0 — not a single shared source

Live web (4 sources)
en.wikipedia.org: Hotel history & location
bulgarihotels.com: Official brand site
worldtravelawards.com: Award listing
agluxuryproperties.com: Dubai luxury guide

Cached index (4 sources)
traveltodubai.ae: Dubai tourism portal
themiddleeastinsider.com: Regional travel blog
timesofindia.indiatimes.com: Indian newspaper travel section
dubaivisitvisa.online: Visa & travel guide site

Same model, same prompt, same moment. The live web pulls Wikipedia and the actual Bulgari hotel site. The index pulls a Dubai visa site and an Indian newspaper. These are not slightly different source mixes — they are entirely different information ecosystems producing different hotel recommendations.

Example: “Best boutique hotels in Tokyo”

GPT-5.3-chat-latest · Domain Jaccard = 0.0 — 17 sources, zero overlap

Live web (10 sources)
en.wikipedia.org: Hotel & neighborhood articles
wallpaper.com: Design & architecture magazine
thehoteljournal.com: Boutique hotel editorial
smallboutique-hotels.com: Boutique hotel directory
travel.rakuten.com: Japanese booking platform
whimsysoul.com: Travel blog
blog.bespoke-discovery.com: Japan travel blog
jasumo.com: Japan travel guide
cccj.or.jp: Canadian Chamber of Commerce Japan
team.interaction-design.org: Design community

Cached index (7 sources)
tripadvisor.com: Review platform
thehotelguru.com: Hotel comparison site
luxuryhotel.guru: Luxury hotel directory
trulytokyo.com: Tokyo travel guide
touristjapan.com: Japan tourism site
hikemasterjapan.com: Japan outdoor travel
localsinjapan.com: Japan expat blog

17 total sources, not a single one in common. The live web finds Wikipedia, Wallpaper* magazine, and Rakuten Travel. The index falls back to TripAdvisor, niche hotel directories, and Japan-focused blogs. A hotel visible on one side is invisible on the other.

Example: “Best hotels in Singapore Marina Bay”

GPT-5.4 · Domain Jaccard = 0.67 — when there is overlap, it's hotel brand sites

Live web (5 sources)
● marinabaysands.com: Official hotel site
● fullertonhotels.com: Official hotel site
● mandarinoriental.com: Official hotel site
● ritzcarlton.com: Official hotel site
hilton.com: Official hotel site

Cached index (5 sources)
● marinabaysands.com: Official hotel site
● fullertonhotels.com: Official hotel site
● mandarinoriental.com: Official hotel site
● ritzcarlton.com: Official hotel site
panpacific.com: Official hotel site

● = shared across both modes. When there is convergence, it's on official hotel brand websites, the domains GPT-5.4 actively hunts with site: queries. Marina Bay Sands, Fullerton, Mandarin Oriental, and Ritz-Carlton appear in both modes because they're major brands with strong web presence. The only difference: live mode picks Hilton, cached mode picks Pan Pacific. Brand authority is the stabilizing force.
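The 0.67 figure can be checked by hand from the two source lists above. A minimal sketch in Python, assuming the score is the plain intersection-over-union of cited domain sets as described in the Methodology; the function name and the empty-set convention are ours, not the study's:

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Intersection over union; returns 0.0 when both sets are empty (our convention)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Cited domains from the Singapore Marina Bay example above.
live = {"marinabaysands.com", "fullertonhotels.com", "mandarinoriental.com",
        "ritzcarlton.com", "hilton.com"}
cached = {"marinabaysands.com", "fullertonhotels.com", "mandarinoriental.com",
          "ritzcarlton.com", "panpacific.com"}

score = jaccard(live, cached)
print(round(score, 2))  # 4 shared domains out of 6 distinct ones -> 0.67
```

Four shared domains out of six distinct ones gives 4/6, which rounds to the 0.67 reported for this prompt.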

Cape Town, Tokyo, Barcelona, Dubai, Rio, Marrakech — these are not obscure destinations. These are tier-1 travel cities where the index and the live web produce zero shared sources. If your hotel is in one of these markets, which ChatGPT your guest uses matters.

Africa Is the Index's Blind Spot

GPT-5.3's index barely covers Africa — domain Jaccard of just 0.061, meaning the cached and live results share almost nothing. GPT-5.4 is dramatically more uniform across continents.

[Chart: GPT-5.3 live vs cached domain overlap by continent (uneven coverage)]

[Chart: GPT-5.4 live vs cached domain overlap by continent (uniform coverage)]

Per-continent domain Jaccard (live vs cached)
Continent	GPT-5.3	GPT-5.4
MENA	0.217	0.202
North America	0.210	0.187
Oceania	0.177	0.173
Latin America	0.169	0.165
Asia	0.137	0.165
Europe	0.101	0.163
Africa	0.061	0.155

GPT-5.4's fan-out strategy equalizes coverage. Because it issues brand-targeted site: queries, it finds common ground in both modes even where coverage is thin. GPT-5.3's simple one-shot searches expose the raw state of the index — and in Africa, that index is nearly empty.
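One way to put a number on "uneven" versus "uniform" is the ratio between the best- and worst-covered region for each model. The sketch below uses only the values from the per-continent table; the max/min spread is an illustrative metric of our own choosing, not one the study reports.

```python
# Per-continent domain Jaccard (live vs cached), copied from the table above.
# Each tuple is (GPT-5.3, GPT-5.4).
overlap = {
    "MENA": (0.217, 0.202),
    "North America": (0.210, 0.187),
    "Oceania": (0.177, 0.173),
    "Latin America": (0.169, 0.165),
    "Asia": (0.137, 0.165),
    "Europe": (0.101, 0.163),
    "Africa": (0.061, 0.155),
}

def spread(values: list[float]) -> float:
    """Ratio of the highest to the lowest overlap; 1.0 means perfectly uniform."""
    return max(values) / min(values)

gpt53 = [pair[0] for pair in overlap.values()]
gpt54 = [pair[1] for pair in overlap.values()]

print(f"GPT-5.3 max/min: {spread(gpt53):.2f}")  # 0.217 / 0.061
print(f"GPT-5.4 max/min: {spread(gpt54):.2f}")  # 0.202 / 0.155
```

On these figures the best region has roughly 3.6 times the overlap of the worst for GPT-5.3, against roughly 1.3 times for GPT-5.4.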

3-Star Queries Are 2x More Reproducible

Budget/star-rating queries collapse to a small set of OTAs (Booking, Expedia, Hotels.com, TripAdvisor) that exist in both the index and the live web. Boutique and persona queries fan out to editorial sources where the divergence is much higher.

[Chart: live vs cached domain overlap by query type]

Per-tier domain Jaccard (live vs cached)
Query Type	GPT-5.3	GPT-5.4
Broad ("Best hotels in {city}")	0.111	0.146
Boutique	0.098	0.125
3-star	0.298	0.262
Neighborhood	0.102	0.184
Persona (couples)	0.123	0.142

Operational takeaway: if you're auditing your hotel's visibility in ChatGPT, 3-star queries give the most stable results across modes. Boutique and luxury audits are mode-dependent — always run both and reconcile.
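The "2x" headline can be sanity-checked against the per-tier table. A short sketch, assuming the comparison is 3-star overlap divided by the mean of the other four tiers; the study does not state which baseline it used, so treat this as one plausible reading:

```python
# Per-tier domain Jaccard (live vs cached), copied from the table above.
# Each tuple is (GPT-5.3, GPT-5.4).
tiers = {
    "broad": (0.111, 0.146),
    "boutique": (0.098, 0.125),
    "3-star": (0.298, 0.262),
    "neighborhood": (0.102, 0.184),
    "persona": (0.123, 0.142),
}

def three_star_ratio(model_index: int) -> float:
    """3-star overlap divided by the mean overlap of the remaining tiers."""
    others = [v[model_index] for name, v in tiers.items() if name != "3-star"]
    return tiers["3-star"][model_index] / (sum(others) / len(others))

ratio_53 = three_star_ratio(0)
ratio_54 = three_star_ratio(1)
print(f"GPT-5.3: {ratio_53:.2f}x, GPT-5.4: {ratio_54:.2f}x")
```

Under this baseline the ratio comes out near 2.7x for GPT-5.3 and near 1.8x for GPT-5.4, so "about 2x" is a fair summary across the two models rather than an exact figure for either.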

GPT-5.4 Does Keyword Research; GPT-5.3 Does Not

The two models have completely different search strategies. GPT-5.4 behaves like an SEO analyst; GPT-5.3 is a simple one-shot retriever.

Search behavior comparison
Metric	GPT-5.3	GPT-5.4
Searches per response	1.0	~2.0
Avg query length (words)	6.5	10.9
Max query length	11	27
% with year (2023+)	53%	27%
% with site: operator	0%	31%
% containing "official"	0%	87%
% containing "review"	3%	13%

GPT-5.3 queries

"best boutique hotels Paris 2026"

"top luxury hotels Tokyo 2026"

"best 3-star hotels Barcelona"

Simple, natural language. No operators.

GPT-5.4 queries

"site:cntraveler.com best boutique hotels paris 2025"

"site:michelin.com MICHELIN Guide Barcelona hotel"

"site:booking.com Rome 3-star hotel official rating"

Long, intent-loaded. 87% include "official".
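The rows of the comparison table can be derived from raw query strings with a few simple checks. The sketch below is an assumed reconstruction: the study does not publish its exact classification rules, so the regular expressions, the substring matching, and the function name are all ours.

```python
import re

YEAR = re.compile(r"\b(20\d{2})\b")            # four-digit years such as 2025
SITE_OP = re.compile(r"(?:^|\s)site:\S+")      # the site: search operator

def query_features(query: str) -> dict:
    """Return per-query flags matching the rows of the comparison table."""
    lowered = query.lower()
    years = [int(y) for y in YEAR.findall(query)]
    return {
        "words": len(query.split()),
        "has_recent_year": any(y >= 2023 for y in years),
        "has_site_operator": bool(SITE_OP.search(lowered)),
        "has_official": "official" in lowered,
        "has_review": "review" in lowered,
    }

# Two of the example queries quoted above.
simple = query_features("best boutique hotels Paris 2026")
targeted = query_features("site:booking.com Rome 3-star hotel official rating")
print(simple)
print(targeted)
```

Averaging these flags over every query a model issued would yield percentages comparable to the table, provided the rules match the ones actually used.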

GPT-5.4 searches by brand name — both editorial and hotel brands. Forbes, Michelin, CN Traveler, Booking.com appear by name in its queries. It also targets individual hotel domains with site: queries to verify location, amenities, and room types directly from the source. If your hotel's own website is unindexed, blocked to AI crawlers, or has poor structure, GPT-5.4 cannot find it through this path.

Brands GPT-5.4 searches for by name (across 381 queries)

Forbes ×48 · Michelin ×43 · Booking ×24 · TripAdvisor ×21 · Hyatt ×11 · Conde Nast ×8 · Park Hyatt ×7 · Four Seasons ×7 · Hilton ×7 · Mandarin ×6

GPT-5.3 issued zero queries containing any brand or publisher name.

Who Powers Each Mode

The source mix shifts dramatically between modes. Wikipedia dominates GPT-5.3 live mode. Michelin dominates GPT-5.4 live mode. TripAdvisor leads the cached index for both models.

[Chart: GPT-5.3 cached (index), top cited domains]

[Chart: GPT-5.3 live, top cited domains]

Wikipedia: absent from the index, #1 on the live web. For GPT-5.3, Wikipedia jumps from not appearing in the top 20 in cached mode to #1 with 56 citations in live mode. Hotel Wikipedia pages are an underrated visibility lever — but only for the live web path.

[Chart: GPT-5.4 cached (index), top cited domains]

[Chart: GPT-5.4 live, top cited domains]

The index favors TripAdvisor. The live web favors editorial authority. In cached mode, TripAdvisor leads for both models. In live mode, Michelin Guide (#1 for GPT-5.4) and Wikipedia (#1 for GPT-5.3) take over. This means TripAdvisor is a critical presence in OpenAI's own index — but editorial prestige (Michelin Keys, Forbes ratings) matters more when the live web is accessible.

GPT-5.4 Actually Browses Pages

GPT-5.4 doesn't just search — it opens pages and reads them. It issued 17 open_page and 4 find_in_page actions in live mode. GPT-5.3 issued zero.

Browsing actions by model and mode
Model + Mode	search	open_page	find_in_page
GPT-5.4 live	181	17	4
GPT-5.4 cached	200	5	0
GPT-5.3 live	100	0	0
GPT-5.3 cached	100	0	0

Where GPT-5.4 opens pages (live mode)

tripadvisor.com ×6
booking.com ×4
guide.michelin.com ×3
cntraveler.com ×2
oneandonlyresorts.com ×1
casonaroma.com ×1

What GPT-5.4 searches for inside pages

"#1 Best Value" on tripadvisor.com/Hotels-...-Paris

"4.5 of 5 bubbles" on tripadvisor.com/Hotels-...-London

"Palacio Duhau" on cntraveler.com/gallery/best-hotels-in-buenos-aires

The model has learned TripAdvisor's ranking labels and hunts for them by name. This is competitive research, not text generation.

21 out of 22 open_page URLs also appear in citations. Browsing doesn't unlock new sources — it's a deep-read of sources the model already found via search. The signal is which domains GPT-5.4 considers worth reading: TripAdvisor, Booking, Michelin, CN Traveler. Those are the publishers it actively trusts enough to read the body, not just the snippet.
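Both the action counts and the "opened pages that were also cited" check come down to a tally and a set intersection. The sketch below uses made-up records on example.com; the dictionary shape (a type field plus query, url, or pattern) is our assumption about how captured actions might be stored, not a documented schema.

```python
from collections import Counter

# Hypothetical captured actions for a single response (illustrative only).
actions = [
    {"type": "search", "query": "site:example.com best hotels paris official"},
    {"type": "open_page", "url": "https://example.com/hotels-paris"},
    {"type": "find_in_page", "url": "https://example.com/hotels-paris",
     "pattern": "#1 Best Value"},
    {"type": "open_page", "url": "https://example.com/hotels-london"},
]
# Hypothetical URLs taken from the response's citations.
cited_urls = {"https://example.com/hotels-paris"}

# Tally actions by type, as in the browsing-actions table.
counts = Counter(action["type"] for action in actions)

# Which opened pages also show up as citations?
opened = {a["url"] for a in actions if a["type"] == "open_page"}
opened_and_cited = opened & cited_urls

print(dict(counts))
print(f"{len(opened_and_cited)} of {len(opened)} opened pages were cited")
```

Run over all 400 responses, the same two steps would produce the per-mode action totals and the cited share of opened pages.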

What This Means for Hotels

1. It's about authority sources, not just one publisher

GPT-5.4 searches for authority sources by name: Michelin Guide, Forbes Travel Guide, CN Traveler, TripAdvisor, Booking.com — 75+ brand mentions across 100 prompts. But the broader point is that any trusted editorial source matters: travel magazines, award bodies (World Travel Awards, World's 50 Best Hotels), national tourism boards, and respected travel blogs all feed into the live web path. In cached mode, the index falls back to a narrower set dominated by OTAs and niche aggregators. The takeaway isn't “get on Michelin” — it's that editorial authority is the currency of live-web AI recommendations, and hotels that invest in PR, awards, and media coverage have a structural advantage in the live path.

2. Your hotel's own website matters, by name

87% of GPT-5.4's queries contain “official” and ~30% use site: against specific hotel domains. Hotels with sites that are blocked to AI crawlers, slow to render, or missing structured data are invisible to GPT-5.4's strongest research pattern. See our robots.txt study for how many hotels block AI crawlers.

3. TripAdvisor is the index workhorse

TripAdvisor leads the cached index for both models. GPT-5.4 reads specific TripAdvisor list pages with find_in_page and pulls the “#1 Best Value” label. TripAdvisor visibility translates more directly into LLM citations than any other aggregator. See our TripAdvisor in ChatGPT study.

4. Wikipedia is GPT-5.3's live-mode favorite

Wikipedia jumps from absent in cached mode to #1 with 56 citations in live mode for GPT-5.3. Hotel Wikipedia pages are an underrated visibility lever for the chat-tuned model line.

5. Always audit in both modes

3-star queries show roughly 2x the live-vs-cached overlap of the other tiers (0.26–0.30 versus 0.10–0.18). Boutique and luxury audits are mode-dependent. Run both modes and reconcile. And for African markets, only GPT-5.4 produces stable cross-mode results.

Methodology

How We Collected This Data

Setup

Models: GPT-5.4 (latest model, available to paid users) and GPT-5.3-chat-latest (the model currently powering ChatGPT.com for free users). We chose these two to cover both ends of the user base. Note: using the API does not perfectly replicate the ChatGPT.com UI experience (different system prompt, no memory, no tool orchestration), but it lets us isolate the external_web_access variable cleanly.
API: OpenAI Responses API with web_search tool
Modes: external_web_access=true (live) and false (cached)
Tool choice: forced via tool_choice={type: "web_search"} — every call searches
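The setup above maps onto a small request body. The sketch below only builds the payload and makes no network call; placing external_web_access inside the web_search tool definition is our reading of the setup described here, so check the current OpenAI documentation before relying on it. The helper name build_request is hypothetical.

```python
import json

def build_request(model: str, prompt: str, live: bool) -> dict:
    """Build a Responses API request body with web search forced on."""
    return {
        "model": model,
        "input": prompt,
        # live=True -> live web; live=False -> cached/indexed results only.
        "tools": [{"type": "web_search", "external_web_access": live}],
        # Force a search on every call, as in the study setup.
        "tool_choice": {"type": "web_search"},
    }

live_req = build_request("gpt-5.4", "Best hotels in Dubai", live=True)
cached_req = build_request("gpt-5.4", "Best hotels in Dubai", live=False)
print(json.dumps(cached_req, indent=2))
```

The two payloads differ in a single boolean, which is what makes the audit trap described earlier so easy to fall into.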

Prompts

100 hotel discovery prompts
20 cities × 5 prompt tiers spanning all inhabited continents
5 tiers: broad, boutique, 3-star, neighborhood, persona (couples)
1 run per (model, mode, prompt) → 400 total calls, all succeeded
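The call count follows directly from the grid. A quick sketch that enumerates it; the city identifiers are placeholders because the full list of 20 cities is not given here:

```python
from itertools import product

models = ["gpt-5.4", "gpt-5.3-chat-latest"]
modes = ["live", "cached"]  # external_web_access = true / false
tiers = ["broad", "boutique", "3-star", "neighborhood", "persona"]
cities = [f"city_{i:02d}" for i in range(1, 21)]  # placeholder names

# One prompt per (city, tier) pair, one call per (model, mode, prompt).
prompts = list(product(cities, tiers))
calls = list(product(models, modes, prompts))

print(len(prompts), len(calls))  # 20 x 5 = 100 prompts, x 2 x 2 = 400 calls
```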

Why 100 prompts, not 1,000? This study measures source-level divergence (which domains get cited), not answer-level variation (which hotels get named). Source sets stabilize fast — by 100 prompts across 20 cities and 5 tiers, the signal is already unambiguous: 83% domain divergence with 13 zero-overlap cases. More prompts would add volume without changing the conclusion.

Captured Per Call

Full response text
All web_search_call.action items (search, open_page, find_in_page)
All url_citation annotations (URL + title + offsets)
Latency and token usage

Analysis Metrics

Domain Jaccard: intersection/union of cited domains between live and cached for each prompt
URL Jaccard: same metric at the exact-URL level
Query Jaccard: overlap in the search queries the model issues
🥖 Query Jaccard vs Query Jacquouille: Just watch Les Visiteurs
Results aggregated per continent, per prompt tier, and per model
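The gap between Domain Jaccard and URL Jaccard comes from collapsing URLs to their host before comparing. A minimal sketch, assuming a domain is the hostname with a leading "www." removed (subdomains such as guide.michelin.com are kept as-is); the study's exact normalization is not stated, and the URLs below are invented for illustration.

```python
from urllib.parse import urlparse

def domain_of(url: str) -> str:
    """Hostname without a leading 'www.'; other subdomains are preserved."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host

def jaccard(a: set, b: set) -> float:
    """Intersection over union; 0.0 when both sets are empty (our convention)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical citations for one prompt in each mode.
live_urls = {"https://www.example.com/best-hotels-tokyo",
             "https://travel.example.org/tokyo"}
cached_urls = {"https://example.com/tokyo-boutique-hotels",
               "https://another.example.net/guide"}

url_j = jaccard(live_urls, cached_urls)
dom_j = jaccard({domain_of(u) for u in live_urls},
                {domain_of(u) for u in cached_urls})
print(url_j, round(dom_j, 2))
```

Here the two modes cite different pages on the same site, so the URL score is zero while the domain score is not. This is why URL-level overlap tends to sit below domain-level overlap, as in the GPT-5.4 row (0.06 versus 0.17).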

Data Summary

400 API calls (100 prompts × 2 models × 2 modes)
100% success rate — all 400 calls returned results
Data collected: April 2026

Data Access

We believe in open research. Contact us for access to the raw data, analysis scripts, and methodology details.



Continue Reading

This study is part of our ongoing research into how AI search engines recommend hotels.

Anatomy of ChatGPT Hotel Search · AI Hotel Landscape 2026