How Websites Detect Proxies in 2026
Modern bot protection stacks use 7+ detection layers simultaneously. Here's exactly how they catch proxy traffic — and why mobile carrier IPs remain the hardest to detect.
Detecting proxy traffic isn't one test — it's a stack of them, running in parallel before your request even reaches application code. Cloudflare, DataDome, PerimeterX (HUMAN), and Akamai Bot Manager all evaluate multiple signals simultaneously. A single mismatch can flag your session.
This guide walks through each layer with technical accuracy and the real tools involved. At the end, we'll show why mobile proxies on carrier networks remain the hardest traffic to classify as "proxy" — not because they're invisible, but because blocking them costs websites more than it saves.
1. IP Reputation Databases
The first filter is usually a lookup against commercial IP reputation databases. These are updated continuously from honeypots, botnet logs, customer reports, and ASN ownership data.
| Service | Classifications | Update Frequency |
|---|---|---|
| MaxMind Anonymous Plus | VPN, residential proxy, hosting, Tor exit, public proxy — with confidence score & provider name | Daily + "last seen" ISO date |
| IPQualityScore | 25+ data points: honeypot traps, ML models, request velocity, abuse history | Real-time |
| Spur.us | ~60M suspect IPs, 1,000+ known VPN/proxy services, session shift detection | Daily |
| IP2Proxy | Classes: VPN, PUB, WEB, TOR, DCH, SES, RES, CPN, EPN | Daily |
Datacenter proxies appear in these databases almost immediately. Residential proxies take longer but eventually surface when abuse patterns accumulate. Mobile carrier IPs are reassigned frequently by the carrier — MaxMind's "last seen" date ages out stale entries quickly.
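As a rough illustration of how a "last seen" date can age out stale entries, here is a minimal sketch. The `FEED` table, the `ReputationEntry` type, and the 30-day cutoff are all hypothetical stand-ins, not the schema or policy of any vendor listed above.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ReputationEntry:
    ip: str
    classification: str  # e.g. "hosting", "residential_proxy", "vpn"
    last_seen: date      # last date the IP was observed behaving as a proxy


# Hypothetical in-memory stand-in for a commercial reputation feed.
FEED = {
    "203.0.113.10": ReputationEntry("203.0.113.10", "hosting", date(2026, 1, 14)),
    "198.51.100.7": ReputationEntry("198.51.100.7", "residential_proxy", date(2025, 11, 2)),
}


def lookup(ip: str, today: date, max_age_days: int = 30) -> str:
    """Return the classification, "stale" if the entry aged out, or "clean"."""
    entry = FEED.get(ip)
    if entry is None:
        return "clean"
    # An old sighting says little about an address that has since been reassigned.
    if today - entry.last_seen > timedelta(days=max_age_days):
        return "stale"
    return entry.classification
```

With a short cutoff, an address that a carrier reassigned weeks ago stops counting against whoever holds it now.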
2. ASN-Based Detection
Every routable IP address is announced by an Autonomous System, identified by its Autonomous System Number (ASN). Bot protection classifies ASNs before any application logic runs. AWS WAF even exposes ASN matching as a first-class rule primitive.
Flagged Datacenter ASNs
- AWS (AS16509, AS14618)
- Google Cloud (AS15169, AS396982)
- Hetzner (AS24940)
- OVH (AS16276)
- DigitalOcean (AS14061)
- Linode/Akamai (AS63949)
→ Auto-flagged, rate-limited, CAPTCHA'd
Protected Carrier ASNs
- AT&T (AS7018, AS20057)
- T-Mobile (AS21928)
- Verizon (AS22394, AS6167)
- Vodafone (AS12430)
- Orange, O2, EE, and others
→ Cannot be blocked without blocking millions of real users
This is the foundational reason mobile proxies work: carrier ASNs front real consumer traffic at massive scale. A website that blocks AS21928 loses every T-Mobile customer.
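The ASN split above can be sketched as a simple set lookup. The ASN values come from the two lists in this section; the `classify_asn` function and its three outcome labels are illustrative assumptions, not any vendor's actual rule set.

```python
# ASNs taken from the lists above; real deployments use far larger, maintained sets.
DATACENTER_ASNS = {16509, 14618, 15169, 396982, 24940, 16276, 14061, 63949}
CARRIER_ASNS = {7018, 20057, 21928, 22394, 6167, 12430}


def classify_asn(asn: str) -> str:
    """Map an ASN such as "AS21928" to a hypothetical policy outcome."""
    number = int(asn.upper().removeprefix("AS"))
    if number in DATACENTER_ASNS:
        return "challenge"  # rate limit or CAPTCHA
    if number in CARRIER_ASNS:
        return "allow"      # blocking would hit real subscribers
    return "inspect"        # fall through to the other detection layers
```

For example, `classify_asn("AS24940")` returns `"challenge"`, while `classify_asn("AS21928")` returns `"allow"`.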
3. TLS Fingerprinting (JA3/JA4)
Before any HTTP content transmits, the TLS ClientHello packet reveals a client fingerprint. Sophisticated detection reads this pre-application and compares it to the claimed User-Agent.
JA3 (Salesforce, 2017)
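Hashes TLS ClientHello fields (version, cipher suites, extensions, elliptic curves, EC point formats) into a 32-character MD5 fingerprint. Weakness: JA3 hashes extensions in the order they are sent, so Chrome's TLS extension order randomization (permute-extensions) broke stable JA3 hashes. GREASE values, by contrast, are stripped before hashing and do not change the result.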
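A minimal sketch of the JA3 construction, assuming the ClientHello has already been parsed into integer lists (the parsing step is omitted, and the sample values are made up for illustration). The five fields are joined with commas, values within a field with dashes, and the result is MD5-hashed.

```python
import hashlib

# GREASE values (RFC 8701): 0x0A0A, 0x1A1A, ..., 0xFAFA. JA3 ignores them.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 string from parsed ClientHello fields and return its MD5 hex digest."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    ja3_string = ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )
    return hashlib.md5(ja3_string.encode()).hexdigest()


# Illustrative field values, not captured from a real browser.
base = ja3(771, [4865, 4866], [0, 10, 11, 13], [29, 23], [0])
permuted = ja3(771, [4865, 4866], [13, 0, 11, 10], [29, 23], [0])
```

Here `base` and `permuted` differ even though the client advertises the same extensions, which is why per-connection extension shuffling defeats a stable JA3 hash.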
JA4 (FoxIO)
36-character fingerprint that normalizes fields to survive extension randomization. Full suite:
| Fingerprint | What it covers |
|---|---|
| JA4 | TLS ClientHello fingerprint (normalized) |
| JA4S | TLS ServerHello response |
| JA4H | HTTP client fingerprint (headers, order) |
| JA4T | TCP fingerprint |
| JA4X | X.509 certificate fingerprint |
| JA4L | Latency-based fingerprint |
Cloudflare implements JA4 at the edge using a Rust-based parser and exposes matching primitives like cf.bot_management.ja4 for WAF rules.
DataDome processes ~3 trillion signals daily. For each TLS fingerprint, they record the percentage of known bots, IP quality, and associated OS. Two patterns trigger blocks: (a) fingerprints tied to known bots, and (b) inconsistent combinations — e.g., a curl JA4 with an iPhone User-Agent.
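The two block patterns described above could be sketched roughly as follows. The fingerprint strings, the `FINGERPRINT_FAMILY` table, and the User-Agent heuristics are hypothetical placeholders; a production system would use observed fingerprint statistics rather than a hand-written map.

```python
# Hypothetical map from an observed TLS fingerprint to the client that usually produces it.
FINGERPRINT_FAMILY = {
    "fp-curl-example": "curl",
    "fp-safari-ios-example": "safari_ios",
}


def claimed_family(user_agent: str) -> str:
    """Very rough guess at which client the User-Agent claims to be."""
    ua = user_agent.lower()
    if ua.startswith("curl/"):
        return "curl"
    if "iphone" in ua and "safari" in ua:
        return "safari_ios"
    return "unknown"


def evaluate(fingerprint: str, user_agent: str, known_bot_fingerprints: set) -> str:
    # Pattern (a): the fingerprint itself is tied to known bots.
    if fingerprint in known_bot_fingerprints:
        return "block: known bot fingerprint"
    # Pattern (b): the fingerprint and the User-Agent tell different stories.
    observed = FINGERPRINT_FAMILY.get(fingerprint)
    claimed = claimed_family(user_agent)
    if observed is not None and claimed != "unknown" and observed != claimed:
        return "block: inconsistent fingerprint and User-Agent"
    return "pass"
```

Calling `evaluate("fp-curl-example", "Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) Safari/604.1", set())` hits pattern (b), the curl-with-iPhone case from the paragraph above.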
4. TCP/IP Stack Fingerprinting
Even deeper than TLS: the TCP handshake itself leaks OS information. p0f v3 (by lcamtuf) passively analyzes packets to identify the originating OS from the transport layer.
Signature fields
- Initial TTL: Windows=128, Linux=64, routers=255
- Window size: OS-specific default
- MSS: segment size advertised
- TCP options: exact order matters
- DF flag: Don't Fragment bit

If your User-Agent claims iOS 17 but the TCP stack matches Ubuntu Linux (as it would for a Python requests script or headless Chrome on a VPS), the lie collapses. Same for Go, Node, and any client that inherits the host OS TCP stack.
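To make the TTL signal concrete, here is a minimal sketch that rounds an observed TTL up to the nearest common initial value and compares it with what the User-Agent claims. It is not how p0f works internally; the function names and User-Agent heuristics are assumptions for illustration. Note that iOS, macOS, and Android also start at 64, so TTL alone cannot separate a phone from a Linux VPS. That distinction depends on the other fields listed above (window size, MSS, option order).

```python
COMMON_INITIAL_TTLS = (64, 128, 255)

# Illustrative mapping only; real fingerprinting weighs several fields together.
TTL_FAMILY = {64: "unix-like", 128: "windows", 255: "network-device"}


def infer_initial_ttl(observed_ttl: int) -> int:
    """Round an observed TTL up to the nearest common initial TTL."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    return next(t for t in COMMON_INITIAL_TTLS if observed_ttl <= t)


def family_from_user_agent(user_agent: str) -> str:
    ua = user_agent.lower()
    if "windows nt" in ua:
        return "windows"
    if any(token in ua for token in ("linux", "android", "iphone", "mac os x")):
        return "unix-like"
    return "unknown"


def ttl_mismatch(observed_ttl: int, user_agent: str) -> bool:
    """True when the TTL-implied OS family contradicts the claimed one."""
    claimed = family_from_user_agent(user_agent)
    if claimed == "unknown":
        return False
    return TTL_FAMILY[infer_initial_ttl(observed_ttl)] != claimed
```

A request that arrives with TTL 52 (64 minus 12 hops) while claiming `Windows NT 10.0` is flagged; the same TTL with an iPhone User-Agent is not, which is where window size and option order take over.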
5. HTTP Header Inconsistencies
Many proxy servers insert revealing headers — often without the client's knowledge. Their presence is a direct giveaway.
Headers that leak proxy usage
Headers such as Via, X-Forwarded-For, and X-Proxy-* arriving at a server when the client claims a direct connection are high-confidence proxy evidence. F5's published guidance treats X-Forwarded-For as untrusted for security decisions.
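A server-side check for such headers can be as small as the sketch below. The header list is an illustrative assumption (common forwarding headers plus the `X-Proxy-` prefix), not an exhaustive or vendor-specific set.

```python
# Illustrative list of headers that intermediaries commonly add.
PROXY_HEADERS = {"via", "x-forwarded-for", "forwarded", "x-real-ip", "proxy-connection"}
PROXY_PREFIXES = ("x-proxy-",)


def proxy_headers_present(headers: dict) -> list:
    """Return the names of request headers that suggest an intermediary."""
    found = []
    for name in headers:
        lower = name.lower()  # header names are case-insensitive
        if lower in PROXY_HEADERS or lower.startswith(PROXY_PREFIXES):
            found.append(name)
    return sorted(found)
```

For a request carrying `Via: 1.1 squid` and `X-Forwarded-For: 10.0.0.5`, the function returns `["Via", "X-Forwarded-For"]`; a direct browser request returns an empty list.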
6. WebRTC & DNS Leaks
WebRTC uses RTCPeerConnection to query STUN servers over UDP, enumerating both local and public IP candidates. JavaScript can read these candidates silently. Since HTTP(S) proxies only tunnel TCP, WebRTC's UDP traffic escapes the proxy entirely and exposes the real IP.
Mitigations: setting media.peerconnection.enabled=false in Firefox, Chrome's WebRTC Network Limiter extension, or an antidetect browser that spoofs at the API level.
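On the detection side, the WebRTC check amounts to comparing the server-reflexive (`srflx`) candidate address with the IP seen on the HTTP connection. The sketch below assumes the page's JavaScript has already collected ICE candidate lines and posted them to the server; the function names are hypothetical.

```python
import ipaddress


def candidate_address(candidate_line: str):
    """Extract (address, type) from an ICE candidate line, or None if malformed."""
    parts = candidate_line.split()
    # Layout: candidate:<foundation> <component> <transport> <priority> <address> <port> typ <type> ...
    if len(parts) < 8 or parts[6] != "typ":
        return None
    return parts[4], parts[7]


def leaked_public_ips(candidates, http_client_ip: str) -> set:
    """Return srflx addresses that differ from the IP seen on the HTTP connection."""
    leaked = set()
    for line in candidates:
        parsed = candidate_address(line)
        if parsed is None:
            continue
        address, cand_type = parsed
        try:
            ip = ipaddress.ip_address(address)
        except ValueError:
            continue  # e.g. an mDNS hostname instead of a literal IP
        if cand_type == "srflx" and str(ip) != http_client_ip:
            leaked.add(str(ip))
    return leaked
```

If the HTTP request arrived from `203.0.113.5` but a `srflx` candidate reports `198.51.100.23`, the second address is the likely real egress behind the proxy.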
DNS resolver mismatch: if the visible client IP geolocates to city A but the EDNS Client Subnet (ECS) data on DNS queries reaching the site's authoritative name servers points to city B, the geo claim is inconsistent. This is a common tell for split-tunnel proxy setups.
7. Behavioral Signals
Even with a perfect IP + TLS + TCP story, behavioral analysis catches automation. DataDome, PerimeterX (HUMAN), and Akamai Bot Manager monitor:
- Mouse dynamics: cursor curvature, micro-jitter, velocity/acceleration distributions, click-point entropy
- Keystroke timing: dwell time (key-down → key-up), flight time (key-up → next key-down), rhythm variance
- Scroll velocity & momentum: deceleration curves, inertia patterns
- Mobile-specific: touch pressure, device orientation events, battery API, ambient light
- Session-level: request timing entropy, navigation plausibility, form-fill timing
DataDome has publicly stated that fingerprint spoofing alone is insufficient — behavioral signals carry equal weight.
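The keystroke features named in the list above (dwell time, flight time, rhythm variance) can be computed from raw key events as in this sketch. The event format and the `min_stdev_ms` threshold are assumptions chosen for illustration; vendors do not publish their actual models or cutoffs.

```python
from statistics import pstdev


def keystroke_features(events):
    """events: list of (key_down_ms, key_up_ms) tuples in typing order."""
    # Dwell time: how long each key is held down.
    dwell = [up - down for down, up in events]
    # Flight time: gap between releasing one key and pressing the next.
    flight = [events[i + 1][0] - events[i][1] for i in range(len(events) - 1)]
    return {
        "dwell": dwell,
        "flight": flight,
        "flight_stdev": pstdev(flight) if flight else 0.0,
    }


def looks_scripted(events, min_stdev_ms: float = 2.0) -> bool:
    """Flag typing whose rhythm is suspiciously uniform (hypothetical threshold)."""
    features = keystroke_features(events)
    return len(features["flight"]) >= 2 and features["flight_stdev"] < min_stdev_ms
```

Perfectly even input such as `[(0, 50), (100, 150), (200, 250)]` has zero flight-time variance and is flagged, while human typing tends to show irregular gaps.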
Why Mobile Proxies Beat Every Layer
Mobile proxies aren't invisible — they're impractical to block. The economics flip against the website.
Layer 1 (IP reputation)
Carrier-assigned IPs churn frequently. MaxMind's 'last seen' date ages out fast. The IP you use today isn't the IP flagged last week.
Layer 2 (ASN)
Blocking AS21928 blocks every T-Mobile customer. Cloudflare's own data shows CGNAT IPs get rate-limited 3× more but are rarely blocked outright.
Layer 3 (TLS/JA4)
Traffic from a real iPhone through a real carrier produces JA3/JA4 fingerprints identical to consumer devices — because it IS a consumer device.
Layer 4 (TCP stack)
The modem's own TCP stack matches real iOS/Android stacks (not Linux VPS stacks). p0f sees exactly what it would see from a phone.
Layer 5 (headers)
No gateway injects Via, X-Forwarded-For, or X-Proxy-* headers. The egress is the phone's modem itself — same as direct traffic.
Layer 6 (leaks)
WebRTC through a mobile carrier network exposes... a mobile carrier IP. DNS resolves through the carrier's resolver. Geography stays consistent.
Layer 7 (behavior)
This is still on you. Behavioral signals must look human regardless of IP quality. Mobile proxies handle infrastructure; behavior is the operator's job.
Sources
- Cloudflare: Advancing Threat Intelligence: JA4 fingerprints
- Cloudflare: Detecting CGNAT to reduce collateral damage
- Salesforce Engineering: TLS Fingerprinting with JA3 and JA3S
- DataDome: How TLS Fingerprinting Reinforces Protection
- MaxMind: GeoIP Anonymous IP database
- p0f v3: passive OS fingerprinting
- AWS WAF: ASN match rule documentation