
Technical Deep-Dive

How Websites Detect Proxies in 2026

Modern bot protection stacks use 7+ detection layers simultaneously. Here's exactly how they catch proxy traffic — and why mobile carrier IPs remain the hardest to detect.

15 min read·Based on research from Cloudflare, DataDome, MaxMind, Salesforce

Quick Answer

Websites detect proxies through 7 simultaneous layers: IP reputation databases (MaxMind, IPQualityScore), ASN classification (datacenter vs carrier), TLS fingerprinting (JA3/JA4), TCP/IP stack analysis, HTTP header inconsistencies, WebRTC/DNS leaks, and behavioral signals. Mobile proxies on carrier ASNs resist the first six layers because they share CGNAT pools with real users; the seventh, behavior, still depends on how the operator drives the session.

→IP reputation databases flag datacenter and known VPN IPs first
→ASN-based blocking can't reject carrier ranges without losing real customers
→TLS/JA4 fingerprints from real mobile devices match those of ordinary consumer traffic


Detecting proxy traffic isn't one test — it's a stack of them, running in parallel before your request even reaches application code. Cloudflare, DataDome, PerimeterX (HUMAN), and Akamai Bot Manager all evaluate multiple signals simultaneously. A single mismatch can flag your session.

This guide walks through each layer with technical accuracy and the real tools involved. At the end, we'll show why mobile proxies on carrier networks remain the hardest traffic to classify as "proxy" — not because they're invisible, but because blocking them costs websites more than it saves.

1. IP Reputation Databases

The first filter is usually a lookup against commercial IP reputation databases. These are updated continuously from honeypots, botnet logs, customer reports, and ASN ownership data.

Service	Classifications	Update Frequency
MaxMind Anonymous Plus	VPN, residential proxy, hosting, Tor exit, public proxy — with confidence score & provider name	Daily + "last seen" ISO date
IPQualityScore	25+ data points: honeypot traps, ML models, request velocity, abuse history	Real-time
Spur.us	~60M suspect IPs, 1,000+ known VPN/proxy services, session shift detection	Daily
IP2Proxy	Classes: VPN, PUB, WEB, TOR, DCH, SES, RES, CPN, EPN	Daily

Datacenter proxies appear in these databases almost immediately. Residential proxies take longer but eventually surface when abuse patterns accumulate. Mobile carrier IPs are reassigned frequently by the carrier — MaxMind's "last seen" date ages out stale entries quickly.
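The aging-out effect can be sketched in a few lines. This is a minimal illustration, not any vendor's logic: the function name, the record shape, and the 30-day window are all assumptions made for the example.

```python
from datetime import date, timedelta
from typing import Optional


def active_classification(
    classification: str,
    last_seen: date,
    today: date,
    max_age_days: int = 30,  # hypothetical freshness window, not a vendor default
) -> Optional[str]:
    """Return the classification only while the sighting is still recent."""
    if today - last_seen > timedelta(days=max_age_days):
        return None  # stale entry: the IP has probably been reassigned since
    return classification
```

With a window like this, a carrier IP flagged in January no longer counts against whoever holds it in March, while a datacenter IP that is re-observed every day never ages out.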

2. ASN-Based Detection

Every IP belongs to an Autonomous System Number (ASN). Bot protection classifies ASNs before any application logic runs. AWS WAF even exposes ASN matching as a first-class rule primitive.

Flagged Datacenter ASNs

• AWS (AS16509, AS14618)
• Google Cloud (AS15169, AS396982)
• Hetzner (AS24940)
• OVH (AS16276)
• DigitalOcean (AS14061)
• Linode/Akamai (AS63949)

→ Auto-flagged, rate limited, CAPTCHA'd

Protected Carrier ASNs

• AT&T (AS7018, AS20057)
• T-Mobile (AS21928)
• Verizon (AS22394, AS6167)
• Vodafone (AS12430)
• Orange, O2, EE, and others

→ Cannot be blocked without blocking millions of real users

This is the foundational reason mobile proxies work: carrier ASNs front real consumer traffic at massive scale. A website that blocks AS21928 loses every T-Mobile customer.
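As a rough sketch of how that classification might look in code, the example below hard-codes only the ASNs listed above. The function names and the three policy labels are assumptions for illustration; a real deployment would query a full ASN database rather than two small sets.

```python
# ASNs copied from the two lists above; everything else falls through to "unknown".
DATACENTER_ASNS = {16509, 14618, 15169, 396982, 24940, 16276, 14061, 63949}
CARRIER_ASNS = {7018, 20057, 21928, 22394, 6167, 12430}


def classify_asn(asn: int) -> str:
    """Bucket an ASN as datacenter, carrier, or unknown."""
    if asn in DATACENTER_ASNS:
        return "datacenter"
    if asn in CARRIER_ASNS:
        return "carrier"
    return "unknown"


def initial_action(asn: int) -> str:
    """Hypothetical first-pass policy: challenge hosting ranges, never hard-block carriers."""
    policy = {"datacenter": "challenge", "carrier": "allow", "unknown": "score"}
    return policy[classify_asn(asn)]
```

Here initial_action(16509) returns "challenge" and initial_action(21928) returns "allow", which mirrors the asymmetry described above.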

3. TLS Fingerprinting (JA3/JA4)

Before any HTTP content transmits, the TLS ClientHello packet reveals a client fingerprint. Sophisticated detection reads this pre-application and compares it to the claimed User-Agent.

JA3 (Salesforce, 2017)

Hashes TLS ClientHello fields (version, cipher suites, extensions, elliptic curves, EC point formats) into a 32-character MD5 fingerprint. Weakness: the hash depends on the order in which extensions appear, so Chrome's TLS extension randomization (permute-extensions) broke stable JA3 hashes. GREASE values on their own do not, because JA3 ignores them.
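A minimal sketch of the JA3 computation, assuming the ClientHello has already been parsed into integer lists (the parsing itself is out of scope here, and the function names are ours):

```python
import hashlib
from typing import Iterable


def is_grease(value: int) -> bool:
    """GREASE placeholders are 0x0A0A, 0x1A1A, ... 0xFAFA; JA3 skips them."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def _join(values: Iterable[int]) -> str:
    # Decimal values, dash-separated, original order preserved.
    return "-".join(str(v) for v in values if not is_grease(v))


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the comma-separated JA3 string from parsed ClientHello fields."""
    return ",".join(
        [str(version), _join(ciphers), _join(extensions), _join(curves), _join(point_formats)]
    )


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 of the JA3 string, as 32 hex characters."""
    raw = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(raw.encode("ascii")).hexdigest()
```

Because the extension list is hashed in the order it was sent, calling ja3_hash with extensions [0, 10, 11] and then [11, 10, 0] yields two different fingerprints for what is otherwise the same client.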

JA4 (FoxIO)

36-character fingerprint that sorts cipher suites and extensions before hashing, so it survives extension randomization. Full suite:

JA4	TLS ClientHello fingerprint (normalized)
JA4S	TLS ServerHello response
JA4H	HTTP client fingerprint (headers, order)
JA4T	TCP fingerprint
JA4X	X.509 certificate fingerprint
JA4L	Latency-based fingerprint

Cloudflare implements JA4 at the edge using a Rust-based parser and exposes matching primitives like cf.bot_management.ja4 for WAF rules.

DataDome processes ~3 trillion signals daily. For each TLS fingerprint, they record the percentage of known bots, IP quality, and associated OS. Two patterns trigger blocks: (a) fingerprints tied to known bots, and (b) inconsistent combinations — e.g., a curl JA4 with an iPhone User-Agent.
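Pattern (b) amounts to a consistency check between two independent claims. The sketch below is illustrative only: the fingerprint keys are placeholders rather than real JA4 values, and the lookup table, family labels, and function names are assumptions, not DataDome's implementation.

```python
from typing import Optional

# Placeholder keys standing in for real JA4 strings observed per client family.
KNOWN_FINGERPRINTS = {
    "<ja4-observed-from-curl>": "curl",
    "<ja4-observed-from-ios-safari>": "ios-safari",
}


def user_agent_family(user_agent: str) -> str:
    """Very coarse User-Agent bucketing, enough for the example."""
    ua = user_agent.lower()
    if ua.startswith("curl/"):
        return "curl"
    if "iphone" in ua and "safari" in ua:
        return "ios-safari"
    return "other"


def consistency_verdict(ja4: str, user_agent: str) -> str:
    """Compare what the TLS layer implies with what the User-Agent claims."""
    fingerprint_family: Optional[str] = KNOWN_FINGERPRINTS.get(ja4)
    if fingerprint_family is None:
        return "unknown-fingerprint"
    if fingerprint_family == user_agent_family(user_agent):
        return "consistent"
    return "mismatch"
```

A request whose fingerprint maps to curl but whose User-Agent claims an iPhone comes back as "mismatch", which is the combination described above.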

For a scraper-focused breakdown of how these hashes are computed and why libraries like curl and Python requests stand out, see our deep dive on TLS fingerprinting with JA3 and JA4. The same detection stack also inspects the HTTP/2 fingerprint of each connection.

4. TCP/IP Stack Fingerprinting

Even deeper than TLS: the TCP handshake itself leaks OS information. p0f v3 (by lcamtuf) passively analyzes packets to identify the originating OS from the transport layer.

Signature fields

Initial TTL:   Windows=128, Linux=64, routers=255
Window size:   OS-specific default
MSS:           segment size advertised
TCP options:   exact order matters
DF flag:       Don't Fragment bit

If your User-Agent claims iOS 17 but the TCP stack matches Ubuntu Linux (as it would for a Python requests script or headless Chrome on a VPS), the lie collapses. Same for Go, Node, and any client that inherits the host OS TCP stack.
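The TTL field alone already supports a coarse version of this check. The sketch below assumes the observed TTL has been captured from the SYN packet by some other component; the function names and the User-Agent token matching are simplifications made for illustration.

```python
from typing import Optional


def infer_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest common initial value (64, 128, 255)."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    if observed_ttl <= 64:
        return 64
    if observed_ttl <= 128:
        return 128
    return 255


def claimed_initial_ttl(user_agent: str) -> Optional[int]:
    """Initial TTL implied by the OS the User-Agent claims, if recognizable."""
    ua = user_agent.lower()
    if "windows nt" in ua:
        return 128
    if any(token in ua for token in ("iphone", "android", "linux", "mac os x")):
        return 64
    return None


def ttl_contradicts_user_agent(observed_ttl: int, user_agent: str) -> bool:
    """True when the transport layer and the User-Agent imply different OS families."""
    expected = claimed_initial_ttl(user_agent)
    return expected is not None and infer_initial_ttl(observed_ttl) != expected
```

A packet arriving with TTL 52 under a Windows User-Agent is flagged. The same TTL under an iPhone User-Agent is not, because iOS, Android, and Linux all start at 64; separating those requires the window size, MSS, and option order listed above.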

5. HTTP Header Inconsistencies

Many proxy servers insert revealing headers — often without the client's knowledge. Their presence is a direct giveaway.

Headers that leak proxy usage

• Via (RFC 9110)
• X-Forwarded-For
• X-Real-IP
• Forwarded (RFC 7239)
• X-Proxy-ID
• X-Proxy-Connection
• Proxy-Connection
• Client-IP
• X-BlueCoat-Via
• X-Forwarded-Host
• X-Forwarded-Proto
• X-Cache

Any of these arriving at a server when the client claims a direct connection is high-confidence proxy evidence. F5's published guidance treats X-Forwarded-For as untrusted for security decisions.
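A header check of this kind is straightforward to sketch. The example assumes the headers are inspected exactly as received from the client, before the site's own CDN or load balancer adds its forwarding headers; the function name is ours.

```python
from typing import List, Mapping

# Lowercased names from the list above; HTTP header names are case-insensitive.
PROXY_HEADERS = frozenset({
    "via", "x-forwarded-for", "x-real-ip", "forwarded", "x-proxy-id",
    "x-proxy-connection", "proxy-connection", "client-ip", "x-bluecoat-via",
    "x-forwarded-host", "x-forwarded-proto", "x-cache",
})


def proxy_headers_present(headers: Mapping[str, str]) -> List[str]:
    """Return the proxy-revealing header names found in a request, sorted."""
    return sorted(name.lower() for name in headers if name.lower() in PROXY_HEADERS)
```

An empty result says nothing either way, since a well-configured proxy simply omits these headers. A non-empty result is the strong signal.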

6. WebRTC & DNS Leaks

WebRTC uses RTCPeerConnection to query STUN servers over UDP, enumerating both local and public IP candidates. JavaScript can read these candidates silently. Since HTTP(S) proxies only tunnel TCP, WebRTC's UDP traffic escapes the proxy entirely and exposes the real IP.

Mitigations: setting media.peerconnection.enabled=false in Firefox, Chrome's WebRTC Network Limiter extension, or an antidetect browser that spoofs at the API level. On cellular connections specifically, see how to prevent WebRTC leaks on mobile proxies for a full checklist.

DNS resolver mismatch: if the visible client IP geolocates to city A but the site's authoritative nameservers receive DNS queries whose EDNS Client Subnet (ECS) data points to city B, the geo claim is inconsistent. This is a common tell for split-tunnel proxy setups.
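On the server side, the WebRTC check reduces to comparing the candidates a page script reports with the IP the HTTP request arrived from. The sketch below assumes the candidate strings have already been collected in the browser and posted back; the function names are hypothetical, and the addresses in the usage note come from documentation ranges.

```python
import ipaddress
from typing import Iterable, Optional, Set, Tuple


def parse_candidate(candidate: str) -> Optional[Tuple[str, str]]:
    """Extract (address, type) from an ICE candidate line, or None if unusable.

    Expected layout:
    candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
    """
    fields = candidate.split()
    if len(fields) < 8 or fields[6] != "typ":
        return None
    try:
        address = ipaddress.ip_address(fields[4])
    except ValueError:
        return None  # e.g. an mDNS name such as "abcd.local" instead of an IP
    return str(address), fields[7]


def leaked_addresses(candidates: Iterable[str], http_client_ip: str) -> Set[str]:
    """Server-reflexive (STUN-discovered) addresses that differ from the HTTP client IP."""
    seen_over_http = str(ipaddress.ip_address(http_client_ip))
    leaked = set()
    for line in candidates:
        parsed = parse_candidate(line)
        if parsed and parsed[1] == "srflx" and parsed[0] != seen_over_http:
            leaked.add(parsed[0])
    return leaked
```

If the request arrived from 198.51.100.7 but a srflx candidate reports 203.0.113.5, the second address is what STUN saw outside the proxy tunnel. When both match, the set is empty.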

7. Behavioral Signals

Even with a perfect IP + TLS + TCP story, behavioral analysis catches automation. DataDome, PerimeterX (HUMAN), and Akamai Bot Manager monitor:

→Mouse dynamics: cursor curvature, micro-jitter, velocity/acceleration distributions, click-point entropy
→Keystroke timing: dwell time (key-down → key-up), flight time (key-up → next key-down), rhythm variance
→Scroll velocity & momentum: deceleration curves, inertia patterns
→Mobile-specific: touch pressure, device orientation events, battery API, ambient light
→Session-level: request timing entropy, navigation plausibility, form-fill timing

DataDome has publicly stated that fingerprint spoofing alone is insufficient — behavioral signals carry equal weight.
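The keystroke features in the list above are simple to compute once key events are timestamped. The sketch below is a toy illustration: the event format, function names, and the 5 ms variance threshold are assumptions, and production systems rely on far richer models than a single cutoff.

```python
from statistics import pstdev
from typing import List, Sequence, Tuple

KeyEvent = Tuple[float, float]  # (key_down_ms, key_up_ms) for one keystroke


def dwell_times(events: Sequence[KeyEvent]) -> List[float]:
    """Time each key is held: key-up minus key-down."""
    return [up - down for down, up in events]


def flight_times(events: Sequence[KeyEvent]) -> List[float]:
    """Gap between releasing one key and pressing the next."""
    return [events[i + 1][0] - events[i][1] for i in range(len(events) - 1)]


def looks_scripted(events: Sequence[KeyEvent], min_stdev_ms: float = 5.0) -> bool:
    """Flag typing whose rhythm is too regular (hypothetical threshold)."""
    flights = flight_times(events)
    if len(flights) < 2:
        return False  # not enough data to judge rhythm
    return pstdev(dwell_times(events)) < min_stdev_ms and pstdev(flights) < min_stdev_ms
```

Input generated with a fixed delay between keys has near-zero variance in both measures and is flagged, while human typing usually varies by tens of milliseconds per key.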

Why Mobile Proxies Beat Every Layer

Mobile proxies aren't invisible — they're impractical to block. The economics flip against the website.

Layer 1 (IP reputation)

Carrier-assigned IPs churn frequently. MaxMind's 'last seen' date ages out fast. The IP you use today isn't the IP flagged last week.

Layer 2 (ASN)

Blocking AS21928 blocks every T-Mobile customer. Cloudflare's own data shows CGNAT IPs get rate-limited 3× more but are rarely blocked outright.

Layer 3 (TLS/JA4)

Traffic from a real iPhone through a real carrier produces JA3/JA4 fingerprints identical to consumer devices — because it IS a consumer device.

Layer 4 (TCP stack)

The modem's own TCP stack matches real iOS/Android stacks (not Linux VPS stacks). p0f sees exactly what it would see from a phone.

Layer 5 (headers)

No gateway injects Via, X-Forwarded-For, or X-Proxy-* headers. The egress is the phone's modem itself — same as direct traffic.

Layer 6 (leaks)

WebRTC through a mobile carrier network exposes... a mobile carrier IP. DNS resolves through the carrier's resolver. Geography stays consistent.

Layer 7 (behavior)

This is still on you. Behavioral signals must look human regardless of IP quality. Mobile proxies handle infrastructure; behavior is the operator's job.

