JA3 and JA4 TLS Fingerprinting for Scrapers

Before your scraper sends a single byte of HTTP, it has already announced itself. The TLS handshake carries a fingerprint of your client, and a proxy does not change it. Here is what JA3 and JA4 read, why default clients get flagged, and how to align the handshake with a clean carrier IP.

9 min read · Last updated: July 2026

Quick Answer

JA3 and JA4 are fingerprints of your TLS ClientHello: the cipher suites, extensions, curves, and their order that your client offers before any data is sent. A proxy does not change this handshake, so a Python or Go client keeps a non-browser fingerprint even on a clean mobile IP. You have to align the handshake and the IP together.

→The fingerprint comes from your TLS library, not your network path or exit IP
→A browser User-Agent over an OpenSSL handshake is a mismatch detectors flag
→curl-impersonate, curl_cffi, and utls reshape the handshake to match a real browser

This is a neutral, technical explainer of the TLS layer for people building scrapers and automation. It is the counterpart to the IP-reputation story: a clean carrier address answers one question a website asks, but the TLS handshake answers another, and the two are checked independently. If you want the full detection stack first, start with how websites detect proxies.

What a TLS ClientHello reveals

Every HTTPS connection opens with a TLS handshake, and the client speaks first. The ClientHello message is sent in the clear, before encryption is established, and it lays out everything the client is willing to negotiate. Because different software builds this message in different ways, the ClientHello is effectively a signature of the client that produced it, readable by any server or proxy in the path.

The fields that matter for fingerprinting are:

•TLS version: the protocol versions the client supports
•Cipher suites: the ordered list of encryption suites offered
•Extensions: ALPN, SNI, session tickets, and dozens more
•Supported groups: the elliptic curves the client will use for key exchange
•Signature algorithms: which signing schemes the client accepts
•Ordering: the sequence of all of the above, which used to be stable per client

A real Chrome build offers a specific cipher list, a specific extension set, and specific supported groups. A Python script using OpenSSL offers a different set. Neither one is hidden by a proxy, because a forward proxy (HTTP CONNECT or SOCKS5) tunnels the TLS bytes without terminating the session, so the ClientHello reaches the server exactly as your TLS library wrote it.
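To see part of what a Python client brings to the handshake, you can ask the standard library which cipher suites its default context has enabled. This is a minimal sketch: it lists the configured suites rather than capturing the ClientHello on the wire, so treat it as an approximation of the offer, and expect the result to vary with the OpenSSL build your Python is linked against.

```python
import ssl


def default_cipher_offer() -> list[str]:
    """Return the cipher suite names enabled in Python's default TLS context."""
    ctx = ssl.create_default_context()
    # get_ciphers() returns one dict per enabled suite; "name" is the OpenSSL name.
    return [cipher["name"] for cipher in ctx.get_ciphers()]


ciphers = default_cipher_offer()
print(f"{len(ciphers)} cipher suites enabled, linked against {ssl.OPENSSL_VERSION}")
for name in ciphers[:5]:
    print("  ", name)
```

The list and its order come from OpenSSL, not from anything in your request code, which is why headers and proxies leave it untouched.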

JA3: the original hash

JA3 is the method that made TLS fingerprinting practical. It was created at Salesforce in 2017 by John Althouse, Jeff Atkinson, and Josh Atkins, and open-sourced on GitHub. The idea is simple: take a fixed set of ClientHello fields, turn them into a compact string, then hash that string so it can be stored, compared, and shared as threat intelligence.

Per the Salesforce write-up, JA3 gathers the decimal values of the bytes for the TLS version, accepted ciphers, list of extensions, elliptic curves, and elliptic curve point formats. It concatenates those values in order, using a comma to separate the fields and a hyphen to separate values within a field, then runs the result through MD5 to produce a 32-character fingerprint. GREASE values are ignored so that clients using them still resolve to a single hash.
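The recipe above is small enough to sketch in a few lines. The function names and the sample ClientHello values here are illustrative assumptions, and the input is taken as already-parsed integers rather than raw packet bytes.

```python
import hashlib


def is_grease(value: int) -> bool:
    """GREASE values follow the pattern 0x0A0A, 0x1A1A, ..., 0xFAFA."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Join the five fields with commas and the values inside a field with hyphens."""
    fields = [
        [version],
        [c for c in ciphers if not is_grease(c)],
        [e for e in extensions if not is_grease(e)],
        [g for g in curves if not is_grease(g)],
        list(point_formats),
    ]
    return ",".join("-".join(str(v) for v in field) for field in fields)


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 of the JA3 string, as a 32-character hex digest."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


# Made-up sample values for illustration; 0x1A1A is a GREASE cipher and is dropped.
sample = dict(
    version=771,
    ciphers=[0x1A1A, 4865, 4866, 4867],
    extensions=[0, 10, 11, 13],
    curves=[29, 23, 24],
    point_formats=[0],
)
print(ja3_string(**sample))  # 771,4865-4866-4867,0-10-11-13,29-23-24,0
print(ja3_hash(**sample))
```

Note that nothing is sorted: swapping two entries in `extensions` produces a different string and therefore a different hash.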

The key weakness: JA3 depends on the order of extensions. Chrome began randomizing the order of some ClientHello extensions on every connection specifically to discourage fingerprinting and protocol ossification. For an order-dependent hash that is decisive: the same browser now produces many different JA3 values, so no single JA3 reliably identifies modern Chrome.

JA3 is still widely deployed and still catches plenty of naive automation, because most scripting clients do not randomize anything. But the randomization problem is exactly what its successor was built to solve.

JA4 and the JA4+ suite

JA4 is the successor, announced in 2023 by John Althouse under his company FoxIO. It is part of a broader suite called JA4+ that fingerprints more than just the TLS ClientHello. The core TLS fingerprint, JA4, uses a readable a_b_c layout instead of one opaque hash:

•JA4_a is a short, human-readable string summarizing the connection: transport (TCP or QUIC), TLS version, whether SNI is present, the cipher count, the extension count, and the first ALPN value.
•JA4_b is a truncated SHA-256 hash of the cipher suites, sorted before hashing.
•JA4_c is a truncated SHA-256 hash of the extensions, also sorted, combined with the signature algorithms.

That sorting step is the whole point. Because JA4 sorts the cipher and extension lists before hashing, the fingerprint no longer changes when a browser shuffles extension order. As FoxIO notes, applications tend to choose a distinctive cipher list more than a distinctive ordering, so sorting keeps the signal while dropping the noise that randomization introduced. JA4 also folds in ALPN and works across both TCP and QUIC, dimensions JA3 never captured.
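The effect of sorting is easy to demonstrate. The following is a simplified sketch of the a_b_c construction, assuming already-parsed integer fields; it skips edge cases that the FoxIO specification handles (empty lists, unusual ALPN values), and the function name and sample values are assumptions made for illustration.

```python
import hashlib

# GREASE values 0x0A0A, 0x1A1A, ..., 0xFAFA are ignored.
GREASE = {(b << 8) | b for b in range(0x0A, 0x100, 0x10)}
SNI, ALPN = 0x0000, 0x0010


def _short_hash(text: str) -> str:
    """First 12 hex characters of SHA-256."""
    return hashlib.sha256(text.encode("ascii")).hexdigest()[:12]


def ja4_sketch(transport, tls_version, has_sni, ciphers, extensions, sig_algs, alpn):
    ciphers = [c for c in ciphers if c not in GREASE]
    extensions = [e for e in extensions if e not in GREASE]
    first_alpn = alpn[0] if alpn else "00"
    part_a = (
        f"{transport}{tls_version}{'d' if has_sni else 'i'}"
        f"{min(len(ciphers), 99):02d}{min(len(extensions), 99):02d}"
        f"{first_alpn[0]}{first_alpn[-1]}"
    )
    # Ciphers are sorted before hashing, so their order on the wire is irrelevant.
    part_b = _short_hash(",".join(sorted(f"{c:04x}" for c in ciphers)))
    # Extensions are sorted too (SNI and ALPN are left out of the hash);
    # signature algorithms are appended in the order the client sent them.
    sorted_exts = sorted(f"{e:04x}" for e in extensions if e not in (SNI, ALPN))
    sigs = ",".join(f"{s:04x}" for s in sig_algs)
    part_c = _short_hash(",".join(sorted_exts) + "_" + sigs)
    return f"{part_a}_{part_b}_{part_c}"


# Made-up sample values for illustration.
hello = dict(
    transport="t", tls_version="13", has_sni=True,
    ciphers=[4865, 4866, 4867],
    extensions=[0, 16, 10, 11, 13, 43, 51],
    sig_algs=[0x0403, 0x0804],
    alpn=["h2", "http/1.1"],
)
shuffled = dict(hello, extensions=list(reversed(hello["extensions"])))

print(ja4_sketch(**hello))
print(ja4_sketch(**hello) == ja4_sketch(**shuffled))  # True
```

Reversing the extension list leaves the result unchanged, which is exactly the property that lets JA4 survive Chrome's extension shuffling.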

The wider JA4+ suite adds separate fingerprints for other layers, including JA4S for the server response, JA4H for HTTP client behavior, and JA4X for TLS certificates, among others. For scrapers, the practical takeaway is that a detector can fingerprint your handshake, your HTTP layer, and their consistency with each other, not just one hash in isolation.

Licensing note: the core JA4 TLS fingerprint is published under the permissive BSD 3-Clause license, while the broader JA4+ methods carry FoxIO's own license that permits internal use but requires a commercial license to build them into a product you sell. This matters if you are embedding the algorithms rather than just being fingerprinted by them.

Why default clients look like bots

A stock HTTP client does not try to look like a browser, and it shows in the handshake:

•Python requests (via urllib3) and httpx (via httpcore) both build their ClientHello through Python's ssl module, which wraps OpenSSL, producing a JA3/JA4 that says "OpenSSL client," not "Chrome."
•Go net/http uses Go's own crypto/tls, which emits a distinct and stable Go fingerprint unlike any browser.
•Many HTTP-based headless clients reuse one of the above stacks, so they inherit the same non-browser handshake even when the User-Agent claims otherwise.

The problem is not only that the fingerprint is uncommon. It is the mismatch. DataDome's engineering team documents that its models flag an inconsistent combination of TLS fingerprint and device class, meaning the OS and browser name and version implied by your headers. If your User-Agent says Chrome on Windows but your handshake says OpenSSL, the two disagree, and that disagreement is itself the tell. Setting a browser User-Agent on a Python request can make you more detectable, not less, exactly because it manufactures that contradiction.
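A toy version of that consistency check makes the logic concrete. This is an illustration of the idea only, not DataDome's model: the fingerprint table holds hypothetical placeholder strings rather than real JA4 values, and the User-Agent parsing is deliberately crude.

```python
# HYPOTHETICAL placeholder fingerprints mapped to the client family that emits them.
# A real detector would hold observed JA4 values here.
KNOWN_FAMILIES = {
    "ja4-placeholder-chrome": "chrome",
    "ja4-placeholder-firefox": "firefox",
    "ja4-placeholder-python-openssl": "python",
}


def claimed_family(user_agent: str) -> str:
    """Very rough guess at the client family a User-Agent header claims."""
    ua = user_agent.lower()
    if "firefox/" in ua:
        return "firefox"
    if "chrome/" in ua:
        return "chrome"
    if "python" in ua:
        return "python"
    return "unknown"


def consistency_verdict(user_agent: str, ja4: str) -> str:
    """Return 'match', 'mismatch', or 'unknown' for a header/handshake pair."""
    observed = KNOWN_FAMILIES.get(ja4)
    claimed = claimed_family(user_agent)
    if observed is None or claimed == "unknown":
        return "unknown"
    return "match" if observed == claimed else "mismatch"


chrome_ua = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/126.0.0.0 Safari/537.36"
)
print(consistency_verdict(chrome_ua, "ja4-placeholder-python-openssl"))  # mismatch
print(consistency_verdict("python-requests/2.32.0", "ja4-placeholder-python-openssl"))  # match
```

The honest Python client gets "match" while the one wearing a Chrome User-Agent gets "mismatch", which is the sense in which spoofing the header alone can raise your profile instead of lowering it.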

This is the same consistency principle that governs the browser layer, covered in how Cloudflare bot management works and how DataDome works. Detectors do not grade any single value in isolation; they grade whether the values agree.

Aligning the handshake with a real browser

You cannot fix a TLS fingerprint with headers or a proxy. You fix it by using a client that actually produces a browser-shaped ClientHello. Several open-source projects exist specifically for this:

curl-impersonate

A modified build of curl that makes its TLS and HTTP handshakes look like those of a real browser. It can impersonate Chrome, Edge, Safari, and Firefox, and is MIT-licensed.

curl_cffi (Python)

A Python binding over a curl-impersonate fork. Unlike requests or httpx, it can impersonate browser TLS/JA3 and HTTP/2 fingerprints directly from Python.

utls (Go)

A fork of Go's crypto/tls giving low-level control of the ClientHello for mimicry. It ships browser "parrots" such as HelloChrome_Auto and can emit randomized fingerprints too.

tls-client (Go)

A higher-level Go HTTP client built on utls, packaging browser profiles so you can send requests with a chosen browser fingerprint without wiring the handshake by hand.

The common thread is that all of them reshape the ClientHello itself: the cipher order, the extensions, the supported groups, the ALPN. That is what moves your JA3 and JA4 from a scripting library into the same cluster as a real browser. Whichever you pick, keep the impersonated version current, because browser handshakes change with each release and a stale profile becomes its own anomaly.
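In Python, the switch can be as small as swapping the client. The sketch below uses curl_cffi's requests-style API; the function name is an assumption, and the generic "chrome" alias is assumed to be available in your installed release (older releases need a pinned profile name such as "chrome110").

```python
def fetch_like_chrome(url: str, timeout: float = 30.0):
    """Fetch `url` with a Chrome-shaped TLS and HTTP/2 handshake via curl_cffi."""
    # Imported lazily so this module still loads where curl_cffi is not installed.
    from curl_cffi import requests  # third-party: pip install curl_cffi

    # impersonate selects the browser profile whose ClientHello curl_cffi reproduces.
    return requests.get(url, impersonate="chrome", timeout=timeout)
```

Calling `fetch_like_chrome(...)` against a TLS echo endpoint that reports the fingerprint it saw, and comparing the result with what a real browser gets from the same endpoint, is a quick way to confirm the profile is doing its job.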

Two layers, one identity: TLS plus a clean carrier IP

A website asks two independent questions on every request. "Who is this client?" is answered by the TLS and HTTP fingerprint. "Where is it coming from?" is answered by the IP and its reputation. A proxy only touches the second question. Because the ClientHello is built by your TLS library and merely relayed by the proxy, changing your exit IP does nothing to your JA3 or JA4.

That is why neither layer works alone. A flawless browser handshake coming from a flagged datacenter range still loses on IP reputation. A pristine 4G/5G carrier address still loses if the handshake screams Python. The two have to agree, the same way a browser profile and its exit IP have to agree at the application layer, which we cover in CGNAT and mobile-proxy fingerprinting.

A real mobile proxy gives you a genuine carrier IP shared across many real subscribers behind CGNAT, which carries higher default network trust than a datacenter address. Pair that with a browser-matched TLS fingerprint and both questions get a consistent answer. Also mind the transport: whether you tunnel over SOCKS5 or HTTP the proxy still forwards your handshake untouched, so the fingerprint work stays on your side.
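One way to keep the two layers from drifting apart is to configure them in the same place. The helper below is a hypothetical sketch that builds keyword arguments in the shape curl_cffi's requests API accepts (`impersonate` and `proxies`); the proxy hostname and credentials are placeholders.

```python
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https", "socks5", "socks5h"}


def build_request_options(proxy_url: str, browser_profile: str = "chrome") -> dict:
    """Bundle the browser profile (who) with the exit proxy (where)."""
    scheme = urlsplit(proxy_url).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"unsupported proxy scheme: {scheme!r}")
    return {
        # TLS layer: which browser handshake to reproduce.
        "impersonate": browser_profile,
        # IP layer: the same proxy for both plain and TLS traffic.
        "proxies": {"http": proxy_url, "https": proxy_url},
    }


# Placeholder proxy URL for illustration only.
options = build_request_options("socks5h://user:pass@proxy.example.net:1080")
print(options["impersonate"], options["proxies"]["https"])
```

With curl_cffi installed, these can be passed straight through as `requests.get(url, **options)`, so every request carries both a browser-matched handshake and the intended exit IP.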

Where JA fingerprinting shows up in production

JA-family fingerprinting is not academic. It is documented in the products scrapers actually meet:

•Cloudflare exposes JA3 and JA4 fingerprints to Enterprise Bot Management customers, describing them as stable identifiers of a TLS client across different destination IPs, ports, and certificates, usable in analytics and in custom rules that challenge or block a given fingerprint.
•DataDome uses TLS fingerprint features inside its machine-learning models, both to catch fingerprints that are unique to known bots and to catch handshakes that are inconsistent with the device class a request claims to be.
•Cloudflare's research has extended JA4 into inter-request signals, grouping traffic by fingerprint over time rather than judging a single connection, which raises the bar for a fingerprint that is technically correct but behaves nothing like a person.

The pattern across all of them is the same: the TLS fingerprint is one input into a score, weighed against IP reputation, HTTP/2 fingerprint, header order, and behavior. Matching JA3 and JA4 removes one of the easiest reasons to be flagged; it does not remove the rest.
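The idea of judging a fingerprint across many requests, rather than one connection at a time, can be sketched as a simple aggregation. This is an illustration only and not Cloudflare's method: the fingerprints are placeholders, the IPs come from documentation ranges, and the two statistics are chosen for simplicity.

```python
from collections import defaultdict


def summarize_by_fingerprint(events):
    """Group (fingerprint, client_ip) pairs and compute per-fingerprint totals."""
    counts = defaultdict(int)
    ips = defaultdict(set)
    for fingerprint, client_ip in events:
        counts[fingerprint] += 1
        ips[fingerprint].add(client_ip)
    return {
        fp: {"requests": counts[fp], "distinct_ips": len(ips[fp])}
        for fp in counts
    }


# Placeholder fingerprints and documentation-range IPs.
events = [
    ("fp-placeholder-a", "203.0.113.10"),
    ("fp-placeholder-a", "203.0.113.11"),
    ("fp-placeholder-a", "198.51.100.7"),
    ("fp-placeholder-b", "203.0.113.10"),
]
summary = summarize_by_fingerprint(events)
print(summary["fp-placeholder-a"])  # {'requests': 3, 'distinct_ips': 3}
```

Aggregates like these are why rotating IPs under one unchanged fingerprint does not make traffic look like many unrelated visitors.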

Frequently asked questions

Does a proxy change my JA3 or JA4 fingerprint?

No. The ClientHello that produces a JA3 or JA4 fingerprint is generated by your HTTP client and its TLS library, not by the proxy. A proxy forwards or tunnels that handshake unchanged, so switching to a clean mobile IP does not alter your TLS fingerprint. You change it by changing the client library.

What is the difference between JA3 and JA4?

JA3 is an MD5 hash of ClientHello fields in the order the client sent them, created at Salesforce in 2017. JA4, from FoxIO, sorts the cipher and extension lists before hashing and uses a readable a_b_c layout, which makes it resistant to the extension-order randomization that broke JA3 for modern Chrome.

Why do Python requests and Go net/http get flagged?

Those clients build their ClientHello from OpenSSL or Go's crypto/tls, which produce fingerprints distinct from any browser. When a User-Agent claims Chrome but the TLS fingerprint says OpenSSL, that mismatch between the handshake and the declared device is itself a signal detection vendors act on.

How do I match a real browser TLS fingerprint?

Use a client that reproduces a browser handshake: curl-impersonate, its Python binding curl_cffi, or utls in Go. These offer the same cipher suites, extensions, and ALPN as Chrome, Firefox, or Safari, so the resulting JA3 or JA4 matches a real browser rather than a scripting library.

Is a matched TLS fingerprint enough on its own?

No. The TLS fingerprint is one signal among IP reputation, HTTP/2 fingerprint, header order, and behavior. A perfect browser handshake from a flagged datacenter IP still fails on IP reputation, and a clean mobile IP with a Python fingerprint still fails on TLS. Both layers have to agree.

Do Cloudflare and DataDome use TLS fingerprinting?

Yes. Cloudflare exposes JA3 and JA4 fingerprints to Enterprise Bot Management customers for rules and analytics, and DataDome's engineering team documents using TLS fingerprint features in its machine-learning models, including flagging an inconsistent combination of TLS fingerprint and device class.
