ChatGPT's Hotel Index Is a Different Web
We flipped one API parameter and got different hotel recommendations. 400 queries, 2 models, 2 search modes. 83% of cited domains change when ChatGPT uses its own index instead of the live web.
TL;DR: OpenAI's API has a live web switch: external_web_access. Set it to false and ChatGPT searches only its cached corpus. We ran 100 hotel prompts on GPT-5.4 and GPT-5.3 in both modes. 83-85% of cited domains differ. For 13 prompts, the overlap is literally zero.
Key Findings
OpenAI has an external_web_access parameter in their web search tool. Set it to false and the model searches only cached/indexed results, confirming that OpenAI maintains its own search index alongside live web access.
This is not a minor technical detail. For hotel marketers asking "is my property visible in ChatGPT?", the answer depends on which ChatGPT: the one with live web (aka Google) access, or the one running on OpenAI's own index. They return fundamentally different source sets, cite different domains, and recommend different properties.
If someone checks visibility without controlling external_web_access, they could get two contradictory answers from the same model, on the same prompt, seconds apart.
The Index Is a Different Web
Across 100 prompts, only 15-17% of cited domains overlap between live and cached responses. The overlap at the URL level is even lower (6-10%).
| Model | Domain Jaccard | URL Jaccard | Query Jaccard |
|---|---|---|---|
| gpt-5.4 | 0.17 | 0.06 | 0.02 |
| gpt-5.3-chat-latest | 0.15 | 0.10 | 0.43 |
For 13 prompts (across both models), the domain Jaccard is exactly 0.0: the live and cached answers don't share a single source domain.
Zero-overlap examples
Example: "Best hotels in Dubai"
GPT-5.3-chat-latest · Domain Jaccard = 0.0 (not a single shared source)
Same model, same prompt, same moment. The live web pulls Wikipedia and the actual Bulgari hotel site. The index pulls a Dubai visa site and an Indian newspaper. These are not slightly different source mixes; they are entirely different information ecosystems producing different hotel recommendations.
Example: "Best boutique hotels in Tokyo"
GPT-5.3-chat-latest · Domain Jaccard = 0.0 (17 sources, zero overlap)
17 total sources, not a single one in common. The live web finds Wikipedia, Wallpaper* magazine, and Rakuten Travel. The index falls back to TripAdvisor, niche hotel directories, and Japan-focused blogs. A hotel visible on one side is invisible on the other.
Example: "Best hotels in Singapore Marina Bay"
GPT-5.4 · Domain Jaccard = 0.67 (when there is overlap, it's hotel brand sites)
When there is convergence, it's on official hotel brand websites, the domains GPT-5.4 actively hunts with site: queries. Marina Bay Sands, Fullerton, Mandarin Oriental, and Ritz-Carlton appear in both modes because they're major brands with strong web presence. The only difference: live mode picks Hilton, cached mode picks Pan Pacific. Brand authority is the stabilizing force.
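The 0.67 figure for the Singapore example can be checked by hand. This is a sketch that assumes the cited sets are exactly the four shared brand sites plus one mode-specific site each (Hilton live, Pan Pacific cached); L and C are notation introduced here for the live and cached domain sets.

```latex
J(L, C) = \frac{|L \cap C|}{|L \cup C|} = \frac{4}{4 + 1 + 1} = \frac{4}{6} \approx 0.67
```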
Africa Is the Index's Blind Spot
GPT-5.3's index barely covers Africa: a domain Jaccard of just 0.061 means the cached and live results share almost nothing. GPT-5.4 is dramatically more uniform across continents.
Chart: GPT-5.3 live vs cached domain overlap by continent (uneven coverage). Bar chart of domain Jaccard by continent, with Africa lowest at 6.1%.
Chart: GPT-5.4 live vs cached domain overlap by continent (uniform coverage). Bar chart of domain Jaccard by continent, relatively uniform between 15% and 20%.
| Continent | GPT-5.3 | GPT-5.4 |
|---|---|---|
| MENA | 0.217 | 0.202 |
| North America | 0.210 | 0.187 |
| Oceania | 0.177 | 0.173 |
| Latin America | 0.169 | 0.165 |
| Asia | 0.137 | 0.165 |
| Europe | 0.101 | 0.163 |
| Africa | 0.061 | 0.155 |
Because GPT-5.4 issues targeted site: queries, it finds common ground in both modes even where coverage is thin. GPT-5.3's simple one-shot searches expose the raw state of the index, and in Africa that index is nearly empty.
3-Star Queries Are 2x More Reproducible
Budget/star-rating queries collapse to a small set of OTAs (Booking, Expedia, Hotels.com, TripAdvisor) that exist in both the index and the live web. Boutique and persona queries fan out to editorial sources where the divergence is much higher.
Chart: live vs cached domain overlap by query type. Bar chart showing 3-star queries at ~28% overlap versus all other tiers at 10-18%.
| Query Type | GPT-5.3 | GPT-5.4 |
|---|---|---|
| Broad ("Best hotels in {city}") | 0.111 | 0.146 |
| Boutique | 0.098 | 0.125 |
| 3-star | 0.298 | 0.262 |
| Neighborhood | 0.102 | 0.184 |
| Persona (couples) | 0.123 | 0.142 |
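The per-tier and per-continent figures can be read as group averages of per-prompt domain Jaccard scores. The sketch below assumes a plain arithmetic mean (the study does not say which aggregate it used), and the four rows are made-up illustrative values, not study data.

```python
from collections import defaultdict
from statistics import mean


def mean_by(rows: list[dict], key: str) -> dict[str, float]:
    """Average the per-prompt domain Jaccard within each group of `key`."""
    groups: dict[str, list[float]] = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["domain_jaccard"])
    return {name: round(mean(values), 3) for name, values in groups.items()}


# Illustrative rows only; these numbers are NOT from the study.
rows = [
    {"tier": "3-star", "continent": "Europe", "domain_jaccard": 0.30},
    {"tier": "3-star", "continent": "Africa", "domain_jaccard": 0.20},
    {"tier": "boutique", "continent": "Europe", "domain_jaccard": 0.10},
    {"tier": "boutique", "continent": "Africa", "domain_jaccard": 0.00},
]

by_tier = mean_by(rows, "tier")
by_continent = mean_by(rows, "continent")
print(by_tier)
print(by_continent)
```

The same helper works for any grouping column, so one table of per-prompt scores can feed both the continent and the query-type breakdowns.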
GPT-5.4 Does Keyword Research; GPT-5.3 Does Not
The two models have completely different search strategies. GPT-5.4 behaves like an SEO analyst; GPT-5.3 is a simple one-shot retriever.
| Metric | GPT-5.3 | GPT-5.4 |
|---|---|---|
| Searches per response | 1.0 | ~2.0 |
| Avg query length (words) | 6.5 | 10.9 |
| Max query length | 11 | 27 |
| % with year (2023+) | 53% | 27% |
| % with site: operator | 0% | 31% |
| % containing "official" | 0% | 87% |
| % containing "review" | 3% | 13% |
GPT-5.3 queries
"best boutique hotels Paris 2026"
"top luxury hotels Tokyo 2026"
"best 3-star hotels Barcelona"
Simple, natural language. No operators.
GPT-5.4 queries
"site:cntraveler.com best boutique hotels paris 2025"
"site:michelin.com MICHELIN Guide Barcelona hotel"
"site:booking.com Rome 3-star hotel official rating"
Long, intent-loaded. 87% include "official".
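The query metrics in the table above are simple string features. A minimal sketch of how they could be computed follows; the regular expressions and the function names are assumptions for illustration, and the six sample queries are the examples quoted above, so their shares will not match the full 381-query figures.

```python
import re

# Matches a four-digit year from 2023 through 2099 (our reading of "2023+").
YEAR_RE = re.compile(r"\b20(?:2[3-9]|[3-9]\d)\b")
SITE_RE = re.compile(r"\bsite:\S+")


def query_features(query: str) -> dict:
    """Return the per-query features used in the strategy comparison."""
    lowered = query.lower()
    return {
        "words": len(query.split()),
        "has_year": bool(YEAR_RE.search(query)),
        "has_site_operator": bool(SITE_RE.search(lowered)),
        "has_official": "official" in lowered,
        "has_review": "review" in lowered,
    }


def share(queries: list[str], feature: str) -> float:
    """Fraction of queries for which a boolean feature is true."""
    return sum(query_features(q)[feature] for q in queries) / len(queries)


gpt53_queries = [
    "best boutique hotels Paris 2026",
    "top luxury hotels Tokyo 2026",
    "best 3-star hotels Barcelona",
]
gpt54_queries = [
    "site:cntraveler.com best boutique hotels paris 2025",
    "site:michelin.com MICHELIN Guide Barcelona hotel",
    "site:booking.com Rome 3-star hotel official rating",
]

print(share(gpt53_queries, "has_site_operator"))
print(share(gpt54_queries, "has_site_operator"))
```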
GPT-5.4 uses site: queries to verify location, amenities, and room types directly from the source. If your hotel's own website is unindexed, blocked to AI crawlers, or poorly structured, GPT-5.4 cannot find it through this path.
Brands GPT-5.4 searches for by name (across 381 queries)
GPT-5.3 issued zero queries containing any brand or publisher name.
Who Powers Each Mode
The source mix shifts dramatically between modes. Wikipedia dominates GPT-5.3 live mode. Michelin dominates GPT-5.4 live mode. TripAdvisor leads the cached index for both models.
Chart: GPT-5.3 cached (index), top cited domains. TripAdvisor leads at 49, followed by Oyster at 27 and Expedia at 17.
Chart: GPT-5.3 live, top cited domains. Wikipedia jumps to #1 with 56 citations; TripAdvisor drops to 23.
Wikipedia is missing from the index's top sources but #1 on the live web. For GPT-5.3, Wikipedia goes from outside the top 20 in cached mode to #1 with 56 citations in live mode. Hotel Wikipedia pages are an underrated visibility lever, but only for the live web path.
Chart: GPT-5.4 cached (index), top cited domains. TripAdvisor leads at 26, followed by CN Traveler at 19 and Forbes Travel Guide at 18.
Chart: GPT-5.4 live, top cited domains. Michelin Guide jumps to #1 with 22 citations.
GPT-5.4 Actually Browses Pages
GPT-5.4 doesn't just search; it opens pages and reads them. It issued 17 open_page and 4 find_in_page actions in live mode. GPT-5.3 issued zero.
| Model + Mode | search | open_page | find_in_page |
|---|---|---|---|
| GPT-5.4 live | 181 | 17 | 4 |
| GPT-5.4 cached | 200 | 5 | 0 |
| GPT-5.3 live | 100 | 0 | 0 |
| GPT-5.3 cached | 100 | 0 | 0 |
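Counts like those in the table can be tallied from the tool-call items in each response. The sketch below works on plain dictionaries and assumes each item looks like {"type": "web_search_call", "action": {"type": ...}}; that shape is inferred from the field names mentioned in this study, and the sample items are invented.

```python
from collections import Counter
from typing import Any, Iterable


def count_actions(output_items: Iterable[dict[str, Any]]) -> Counter:
    """Tally web search actions (search, open_page, find_in_page) by type."""
    counts: Counter = Counter()
    for item in output_items:
        if item.get("type") != "web_search_call":
            continue  # skip message items and anything else
        action = item.get("action") or {}
        counts[action.get("type", "unknown")] += 1
    return counts


# Invented sample items, shaped per the assumption stated above.
sample_output = [
    {"type": "web_search_call", "action": {"type": "search"}},
    {"type": "web_search_call", "action": {"type": "search"}},
    {"type": "web_search_call", "action": {"type": "open_page"}},
    {"type": "message", "content": []},
]

counts = count_actions(sample_output)
print(dict(counts))
```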
Where GPT-5.4 opens pages (live mode)
What GPT-5.4 searches for inside pages
"#1 Best Value" on tripadvisor.com/Hotels-...-Paris
"4.5 of 5 bubbles" on tripadvisor.com/Hotels-...-London
"Palacio Duhau" on cntraveler.com/gallery/best-hotels-in-buenos-aires
The model has learned TripAdvisor's ranking labels and hunts for them by name. This is competitive research, not text generation.
open_page URLs also appear in citations. Browsing doesn't unlock new sources; it's a deep read of sources the model already found via search. The signal is which domains GPT-5.4 considers worth reading: TripAdvisor, Booking, Michelin, CN Traveler. Those are the publishers it trusts enough to read the body, not just the snippet.
What This Means for Hotels
1. It's about authority sources, not just one publisher
GPT-5.4 searches for authority sources by name (Michelin Guide, Forbes Travel Guide, CN Traveler, TripAdvisor, Booking.com), with 75+ brand mentions across 100 prompts. But the broader point is that any trusted editorial source matters: travel magazines, award bodies (World Travel Awards, World's 50 Best Hotels), national tourism boards, and respected travel blogs all feed into the live web path. In cached mode, the index falls back to a narrower set dominated by OTAs and niche aggregators. The takeaway isn't "get on Michelin"; it's that editorial authority is the currency of live-web AI recommendations, and hotels that invest in PR, awards, and media coverage have a structural advantage in the live path.
2. Your hotel's own website matters, by name
87% of GPT-5.4's queries contain "official" and ~30% use site: against specific hotel domains. Hotels with sites that are blocked to AI crawlers, slow to render, or missing structured data are invisible to GPT-5.4's strongest research pattern. See our robots.txt study for how many hotels block AI crawlers.
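A quick way to see whether a robots.txt file blocks AI crawlers is to parse it with Python's standard library. This is a sketch, not the method of the robots.txt study: the user-agent tokens listed are the ones OpenAI publicly documents for its crawlers, which of them feeds the index measured here is an assumption, and the robots.txt content and URL are invented.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens assumed relevant; adjust to the agents you care about.
AI_USER_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User"]


def ai_crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Report, per crawler token, whether robots.txt allows fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_USER_AGENTS}


# Invented example: blocks GPTBot, allows everyone else.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

access = ai_crawler_access(SAMPLE_ROBOTS, "https://hotel.example/rooms")
print(access)
```

Parsing a local string keeps the check offline; to test a real site you would first download its robots.txt and pass the text in.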
3. TripAdvisor is the index workhorse
TripAdvisor leads the cached index for both models. GPT-5.4 reads specific TripAdvisor list pages with find_in_page and pulls the "#1 Best Value" label. TripAdvisor visibility translates more directly into LLM citations than any other aggregator. See our TripAdvisor in ChatGPT study.
4. Wikipedia is GPT-5.3's live-mode favorite
Wikipedia jumps from outside the top 20 in cached mode to #1 with 56 citations in live mode for GPT-5.3. Hotel Wikipedia pages are an underrated visibility lever for the chat-tuned model line.
5. Always audit in both modes
3-star queries give ~2x higher live-vs-cached overlap than any other tier. Boutique and luxury audits are mode-dependent. Run both modes and reconcile. And for African markets, only GPT-5.4 produces stable cross-mode results.
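Reconciling the two modes can be as simple as classifying each domain by where it was cited. The sketch below is one possible audit helper; the function name and the .example domains are hypothetical, loosely modeled on the Singapore case where only one brand differs per mode.

```python
def visibility(domain: str, live: set[str], cached: set[str]) -> str:
    """Classify a domain by the search modes in which it was cited."""
    in_live, in_cached = domain in live, domain in cached
    if in_live and in_cached:
        return "both"
    if in_live:
        return "live-only"
    if in_cached:
        return "cached-only"
    return "neither"


# Hypothetical cited-domain sets for one prompt.
live_domains = {"brand-a.example", "brand-b.example", "brand-c.example"}
cached_domains = {"brand-a.example", "brand-b.example", "brand-d.example"}

report = {
    domain: visibility(domain, live_domains, cached_domains)
    for domain in sorted(live_domains | cached_domains)
}
print(report)
```

A property that shows up as live-only or cached-only is exactly the case where a single-mode audit would give a misleading answer.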
How We Collected This Data
Setup
- Models: GPT-5.4 (latest model, available to paid users) and GPT-5.3-chat-latest (the model currently powering ChatGPT.com for free users). We chose these two to cover both ends of the user base. Note: using the API does not perfectly replicate the ChatGPT.com UI experience (different system prompt, no memory, no tool orchestration), but it lets us isolate the external_web_access variable cleanly.
- API: OpenAI Responses API with web_search tool
- Modes: external_web_access=true (live) and false (cached)
- Tool choice: forced via tool_choice={type: "web_search"}, so every call searches
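The setup above can be sketched as a pair of request payloads that differ only in the switch. This is a minimal illustration built from the parameter names in the list; build_request is a hypothetical helper, and how the study's own scripts assembled their calls is not known.

```python
from typing import Any


def build_request(model: str, prompt: str, live: bool) -> dict[str, Any]:
    """Build keyword arguments for one Responses API call."""
    return {
        "model": model,
        "input": prompt,
        # The web search tool carries the live/cached switch.
        "tools": [{"type": "web_search", "external_web_access": live}],
        # Forcing the tool means every call performs a search.
        "tool_choice": {"type": "web_search"},
    }


live_kwargs = build_request("gpt-5.4", "Best hotels in Dubai", live=True)
cached_kwargs = build_request("gpt-5.4", "Best hotels in Dubai", live=False)
print(cached_kwargs["tools"])
```

With the official openai Python package, a payload like this would presumably be passed as client.responses.create(**cached_kwargs); that usage is an assumption and requires an API key, so it is left out of the runnable sketch.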
Prompts
- 100 hotel discovery prompts
- 20 cities × 5 prompt tiers spanning all inhabited continents
- 5 tiers: broad, boutique, 3-star, neighborhood, persona (couples)
- 1 run per (model, mode, prompt): 400 total calls, all succeeded
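The run count follows directly from the grid. The sketch below enumerates it with placeholder city labels, since the 20 cities are not listed in this section; only the model names, tier names, and counts come from the text.

```python
from itertools import product

MODELS = ["gpt-5.4", "gpt-5.3-chat-latest"]
MODES = [True, False]  # external_web_access: True = live, False = cached
TIERS = ["broad", "boutique", "3-star", "neighborhood", "persona"]
# Placeholder labels; the study's actual city list is not given here.
CITIES = [f"city_{i:02d}" for i in range(1, 21)]

prompts = list(product(CITIES, TIERS))
runs = list(product(MODELS, MODES, prompts))
print(len(prompts), len(runs))
```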
Captured Per Call
- Full response text
- All web_search_call.action items (search, open_page, find_in_page)
- All url_citation annotations (URL + title + offsets)
- Latency and token usage
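Pulling the citation annotations out of a response can be sketched on a plain dictionary. The nesting assumed here (output items of type "message", each with content parts carrying an annotations list, and start_index/end_index offsets) reflects our understanding of the Responses API output and should be checked against the current API reference; the sample response is invented.

```python
from typing import Any


def extract_citations(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect url_citation annotations (URL, title, offsets) from a response."""
    citations = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # tool calls carry no citation annotations
        for part in item.get("content", []):
            for ann in part.get("annotations", []):
                if ann.get("type") == "url_citation":
                    citations.append({
                        "url": ann.get("url"),
                        "title": ann.get("title"),
                        "start": ann.get("start_index"),
                        "end": ann.get("end_index"),
                    })
    return citations


# Invented sample, shaped per the assumption stated above.
sample_response = {
    "output": [
        {"type": "web_search_call", "action": {"type": "search"}},
        {"type": "message", "content": [{
            "type": "output_text",
            "text": "Try Hotel A.",
            "annotations": [{
                "type": "url_citation",
                "url": "https://hotel-a.example/",
                "title": "Hotel A",
                "start_index": 4,
                "end_index": 11,
            }],
        }]},
    ]
}

citations = extract_citations(sample_response)
print(citations)
```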
Analysis Metrics
- Domain Jaccard: intersection/union of cited domains between live and cached for each prompt
- URL Jaccard: same metric at the exact-URL level
- Query Jaccard: overlap in the search queries the model issues
- 🥖 Query Jaccard vs Query Jacquouille: just watch Les Visiteurs
- Results aggregated per continent, per prompt tier, and per model
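The three overlap metrics above share one formula applied to different sets. The sketch below assumes a domain is the URL's hostname with a leading "www." removed (the study's exact normalization is not stated) and treats two empty sets as 0.0 overlap by convention; the URLs are invented.

```python
from urllib.parse import urlsplit


def domain_of(url: str) -> str:
    """Hostname of a URL, lowercased, without a leading 'www.'."""
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def jaccard(a: set[str], b: set[str]) -> float:
    """Intersection over union; two empty sets score 0.0 by convention."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Invented citations for one prompt in each mode.
live_urls = {
    "https://www.tripadvisor.com/Hotels-a",
    "https://en.wikipedia.org/wiki/Example_Hotel",
    "https://www.booking.com/city/x",
}
cached_urls = {
    "https://www.tripadvisor.com/Hotels-b",
    "https://www.booking.com/city/x",
}

url_jaccard = jaccard(live_urls, cached_urls)
domain_jaccard = jaccard(
    {domain_of(u) for u in live_urls},
    {domain_of(u) for u in cached_urls},
)
print(round(domain_jaccard, 2), round(url_jaccard, 2))
```

Query Jaccard uses the same jaccard function on the sets of search query strings issued in each mode. The example also shows why URL overlap sits below domain overlap: two different TripAdvisor pages count as one shared domain but zero shared URLs.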
Data Summary
- 400 API calls (100 prompts × 2 models × 2 modes)
- 100% success rate: all 400 calls returned results
- Data collected: April 2026
Data Access
We believe in open research. Contact us for access to the raw data, analysis scripts, and methodology details.
Continue Reading
This study is part of our ongoing research into how AI search engines recommend hotels.