How Flight Fare Data Is Collected (2026)
Airfare is one of the most dynamically priced products on the internet — and one of the most heavily defended against bots. Here's how fare and hotel price data is actually collected, and the lawsuits that define what's allowed.
Quick Answer
Flight fare data is collected from airline websites, online travel agencies, and metasearch engines. Because airfares vary by region, currency, and point-of-sale, collectors use geo-distributed IPs to see the true local fare a real customer in that market sees — while travel sites run some of the heaviest anti-bot defenses online.
- Sources: airline sites, OTAs (Expedia, Booking.com), metasearch (Google Flights, Kayak, Skyscanner)
- Geo-pricing is real; the cookie price-hike theory is largely a myth
- Courts rarely treat scraping public fares as a CFAA hacking offense, but airline Terms of Service are enforceable contracts
Travel is a price-comparison business. Travel agencies, fare-alert apps, revenue management teams, and corporate travel tools all depend on knowing what every seat costs right now, across every channel. That demand drives large-scale collection of airfare and hotel pricing data, and the airlines fight it hard.
What's collected
Fare data is pulled from three layers of the travel ecosystem, each with different structure and different defenses:
- Airline sites: the source of truth for a carrier's own fares, but with the strictest bot defenses
- Online travel agencies (OTAs): Expedia and Booking.com, which aggregate inventory across many suppliers
- Metasearch engines: Google Flights, Kayak, and Skyscanner, whose comparison results themselves aggregate airlines and OTAs
The same route can show different prices across all three at the same moment — which is exactly why comparison data has commercial value.
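As a minimal sketch of why simultaneous cross-channel observations are useful, the snippet below groups fare observations by route and reports the gap between the cheapest and most expensive channel. The `FareObservation` record, its field names, and the sample prices are illustrative assumptions, not real fares or any vendor's schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FareObservation:
    """One observed price for a route (hypothetical schema)."""
    route: str        # e.g. "DAL-LAX"
    channel: str      # "airline", "ota", or "metasearch"
    price_usd: float  # price already expressed in one currency


def channel_spread(observations):
    """Return {route: (cheapest_channel, priciest_channel, spread_usd)}."""
    by_route = defaultdict(list)
    for obs in observations:
        by_route[obs.route].append(obs)

    result = {}
    for route, group in by_route.items():
        low = min(group, key=lambda o: o.price_usd)
        high = max(group, key=lambda o: o.price_usd)
        spread = round(high.price_usd - low.price_usd, 2)
        result[route] = (low.channel, high.channel, spread)
    return result


# Made-up sample data for illustration only.
sample = [
    FareObservation("DAL-LAX", "airline", 189.00),
    FareObservation("DAL-LAX", "ota", 197.50),
    FareObservation("DAL-LAX", "metasearch", 192.00),
]
spreads = channel_spread(sample)
print(spreads)
```

The comparison only makes sense when the observations were taken at roughly the same moment and cover the same itinerary and fare class.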
Geo-pricing is real — the cookie myth isn't
Airfares genuinely vary by region, currency, and point-of-sale. Carriers and OTAs set different fares for different markets and currencies, and they detect your market partly from your IP and location. This is real geographic price discrimination — and it's the core reason fare collection needs geo-distributed IPs.
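To compare point-of-sale fares, each local price first has to be converted to a common currency. The sketch below shows one way to do that. The exchange rates, market codes, and fares are invented for illustration; a real pipeline would use dated rates from a source of its choosing.

```python
# Hypothetical USD value of one unit of each currency (assumed, not real rates).
ASSUMED_USD_PER_UNIT = {"USD": 1.0, "EUR": 1.10, "BRL": 0.20}


def to_usd(amount, currency):
    """Convert a local-currency fare to USD using the assumed rates above."""
    return round(amount * ASSUMED_USD_PER_UNIT[currency], 2)


def compare_markets(fares_by_market):
    """Rank markets by normalized price.

    fares_by_market maps a market code to (amount, currency).
    Returns a list of (market, usd_price), cheapest first.
    """
    normalized = [
        (market, to_usd(amount, currency))
        for market, (amount, currency) in fares_by_market.items()
    ]
    return sorted(normalized, key=lambda pair: pair[1])


# Made-up fares for one itinerary as seen from three points of sale.
observed = {
    "US": (500.00, "USD"),
    "DE": (440.00, "EUR"),
    "BR": (2400.00, "BRL"),
}
ranking = compare_markets(observed)
print(ranking)
```

Because exchange rates move, a small difference after conversion may reflect the rate used rather than a deliberate regional fare.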
The popular claim that airlines raise a price because your cookies show you searched before is a different story. Academic studies have not found a consistent cookie effect — one study found American, Delta, and United did not price-discriminate based on cookies. A 2024 non-peer-reviewed industry study (MightyTravels, ~2,000 searches) reported a clean browser was cheaper about 59% of the time, but that result contradicts the academic findings and should be treated as disputed.
The anti-bot / technical reality
Travel is among the most heavily defended verticals on the web. Airlines pay for every search that hits their reservation systems (GDS look-to-book costs), so they have a direct financial incentive to block bots. DataDome's published Dohop case study illustrates the scale: Dohop uses DataDome across 75+ airline partners, reporting roughly a 70% reduction in bot traffic and over 3 million malicious requests blocked per month.
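The look-to-book incentive can be made concrete with a little arithmetic: the ratio is searches divided by bookings, so automated searches that never convert push it up. The figures below are invented for illustration and are not taken from any airline or from the Dohop case study.

```python
def look_to_book(searches, bookings):
    """Searches per booking; a higher value means more search cost per sale."""
    if bookings <= 0:
        raise ValueError("bookings must be positive")
    return searches / bookings


# Hypothetical month: the same bookings, with and without bot searches.
human_searches = 200_000
bot_searches = 800_000
bookings = 2_000

baseline_ratio = look_to_book(human_searches, bookings)
with_bots_ratio = look_to_book(human_searches + bot_searches, bookings)
print(baseline_ratio, with_bots_ratio)  # 100.0 500.0
```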
Akamai, DataDome, and PerimeterX/HUMAN are commonly cited as the defenses behind major travel sites. The specific vendor-per-site mappings circulated in scraping blogs are not officially confirmed by the sites themselves — treat those mappings as reported rather than authoritative.
For how these systems flag automated traffic, see our guide on how websites detect proxies.
Legal precedent: the airline cases
Airline fare scraping has produced some of the clearest case law in the field. The pattern is consistent: courts rarely treat scraping public fares as a hacking crime, but they enforce the airline's Terms of Service as a binding contract.
Southwest v. Kiwi.com (N.D. Tex.)
A federal court granted Southwest a preliminary injunction on September 30, 2021, barring Kiwi.com from scraping Southwest fares. Kiwi had assented to Southwest's terms, which prohibited scraping. A permanent injunction followed in December 2021, and the matter settled. The win rested on the contract, not on hacking law.
Southwest v. Skiplagged
Skiplagged filed a preemptive declaratory-judgment suit in New York on July 1, 2021. The SDNY dismissed it as an improper anticipatory filing, and the dispute proceeded in the Northern District of Texas. This guide could not confirm the final outcome of the Texas case, so it does not report one.
Ryanair v. PR Aviation (CJEU C-30/14)
On January 15, 2015, the EU Court of Justice held that Ryanair's flight data was not protected by the Database Directive or copyright. But because the Directive didn't apply, Ryanair was free to enforce its own Terms & Conditions prohibiting screen-scraping. The landmark point: terms and conditions can bar scraping of data that copyright doesn't even protect.
Where mobile & geo IPs fit
Because fares are genuinely geo-priced, you cannot collect an accurate dataset from a single location. To see the real fare a customer in Berlin, São Paulo, or Dallas is offered, your request has to originate from that market. Geo-distributed mobile IPs are the infrastructure that makes location-accurate collection possible.
That infrastructure is for legitimate, geo-distributed price research. It doesn't override a site's Terms of Service or rate limits. Respect both: honor robots.txt and the ToS, throttle your request rate, and collect only what you're entitled to collect.
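A minimal sketch of those two habits, using Python's standard-library `urllib.robotparser` for the robots check and a small hand-written throttle. The `Throttle` class, the bot name, the URLs, and the robots.txt content are all hypothetical. Passing a robots check says nothing about what a site's Terms of Service permit.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots_checker(robots_txt, user_agent):
    """Parse robots.txt text and return a function: url -> allowed?"""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return lambda url: parser.can_fetch(user_agent, url)


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until min_interval_s has passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Hypothetical robots.txt content and bot name, for illustration only.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /booking/
"""
allowed = build_robots_checker(EXAMPLE_ROBOTS, "example-fare-research-bot")
print(allowed("https://example.com/fares/dal-lax"))
print(allowed("https://example.com/booking/checkout"))
```

In use, a collector would call `throttle.wait()` before each request and skip any URL for which `allowed(url)` is false. The clock and sleep functions are injectable so the pacing logic can be tested without real delays.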