
How DataDome Bot Detection Works

DataDome is a real-time bot- and fraud-detection service common on e-commerce, travel, and classifieds sites. This is a neutral explainer of what signals it inspects and how its machine-learning model decides whether a request is human or automated.

8 min read · Last updated: May 2026

Quick Answer

DataDome makes a real-time decision on every request by feeding server-side and client-side signals — device fingerprint, behavior, and network attributes — into a multi-layered machine learning model. Most legitimate traffic passes silently; only requests it flags as risky are shown a device check or puzzle.

→DataDome states it reviews more than 5 trillion signals per day
→Documents a "Picasso" canvas/GPU technique for device-class fingerprinting
→Challenges are reserved for risky traffic, not shown to everyone

This guide describes how DataDome detects automation — the signal categories and the decision flow. It is not a bypass guide. Knowing what the model inspects explains why commodity automation gets flagged and why the system challenges some clients and not others.

Real-time machine learning

DataDome positions itself as a real-time bot and online-fraud detection layer. Each request is scored against machine-learning models as it arrives, rather than analyzed after the fact. DataDome states it reviews more than 5 trillion signals per day across its customer base — a vendor-stated figure that conveys the scale of the model's training and comparison data.

Because the decision is made inline, the model can act on the very first request from a new client, drawing on patterns already learned from traffic across DataDome's customer base.

Server-side and client-side signals

DataDome combines signals from multiple layers rather than relying on any single tell:

→Device fingerprinting: attributes of the browser and device environment
→Behavioral signals: how the client interacts with the page over time
→Network signals: server-side request and connection characteristics

Collecting both server-side and client-side data lets the model cross-check claims — a client that says it's a browser but doesn't produce browser-consistent client-side signals stands out.
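
The cross-check described above can be illustrated with a minimal sketch. Everything here is hypothetical: the `RequestSignals` fields, the `inconsistencies` helper, and the two rules are illustrative stand-ins, not DataDome's actual signals or logic.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RequestSignals:
    user_agent: str                    # server-side: what the client claims to be
    ran_javascript: bool               # client-side: did any page script report back?
    reported_platform: Optional[str]   # client-side: platform string seen by that script


def inconsistencies(sig: RequestSignals) -> List[str]:
    """Return human-readable reasons the two layers disagree (empty if none)."""
    found = []
    claims_browser = sig.user_agent.startswith("Mozilla/")
    if claims_browser and not sig.ran_javascript:
        found.append("claims to be a browser but produced no client-side signals")
    if sig.reported_platform is not None:
        claims_windows = "Windows" in sig.user_agent
        reports_windows = sig.reported_platform.startswith("Win")
        if claims_windows != reports_windows:
            found.append("User-Agent platform disagrees with client-side platform")
    return found
```

A production system would feed mismatches like these into a model as features rather than treat any one of them as a verdict.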

One of the server-side network signals is the connection's TLS fingerprint. Our deep dive on TLS fingerprinting with JA3 and JA4 explains how a mismatched handshake gives away common automation libraries.
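
To make the idea of a TLS fingerprint concrete, here is a short sketch of the publicly documented JA3 scheme: five ClientHello fields are joined into a string and hashed with MD5. The function names and the sample values are illustrative assumptions, and nothing here implies DataDome computes its fingerprint this exact way.

```python
import hashlib

# GREASE values (RFC 8701) are random placeholders that JA3 ignores:
# 0x0A0A, 0x1A1A, ..., 0xFAFA.
GREASE = {0x0A0A + 0x1010 * i for i in range(16)}


def ja3_string(version, ciphers, extensions, curves, point_formats):
    """Build the JA3 input: comma-separated fields, dash-separated decimal values."""
    def join(values):
        return "-".join(str(v) for v in values if v not in GREASE)

    return ",".join(
        [str(version), join(ciphers), join(extensions), join(curves), join(point_formats)]
    )


def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()
```

Because the cipher and extension order is part of the input, two clients that send the same User-Agent but use different TLS libraries generally produce different hashes.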

The "Picasso" fingerprint

DataDome has documented a technique it calls Picasso for device-class fingerprinting. It asks the client to render graphics via canvas/GPU operations; the precise output varies by the underlying hardware and graphics stack, which helps confirm that a client really is the device class it claims to be. A mismatch between the declared device and the rendered output is a useful inconsistency signal.
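
The server-side half of such a check might look like the sketch below, under loud assumptions: the `KNOWN_RENDERS` table, the device-class labels, the seed, and the placeholder byte strings are all hypothetical, and the real technique's challenge format is not described in this article. The only idea taken from the text is "compare the rendered output against what the claimed device class should produce."

```python
import hashlib


def digest(pixels: bytes) -> str:
    """Hash raw rendered output so it can be compared cheaply."""
    return hashlib.sha256(pixels).hexdigest()


# Hypothetical table: digests previously observed from trusted devices,
# keyed by (claimed device class, challenge seed).
KNOWN_RENDERS = {
    ("phone-class-a", 1234): {digest(b"placeholder pixels from a real phone")},
}


def render_matches_claim(device_class: str, seed: int, reported_digest: str) -> bool:
    """True if the reported render is one we expect from that device class."""
    expected = KNOWN_RENDERS.get((device_class, seed), set())
    return reported_digest in expected
```

A client that claims to be `phone-class-a` but returns a digest typical of a server GPU stack would fail this lookup, which is the inconsistency signal the paragraph describes.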

What the challenge looks like

DataDome reserves its visible device-check / CAPTCHA puzzle for traffic the model has already flagged as risky — most legitimate visitors never see it. DataDome states that one customer saw an 83% drop in CAPTCHA displays after deploying its risk-based approach (a vendor figure).
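
A risk-based flow of this kind can be sketched as a score followed by a threshold. This is a deliberately simple stand-in: the weighted average replaces the machine-learning model, and `WEIGHTS` and `CHALLENGE_THRESHOLD` are invented values, not DataDome parameters.

```python
# Hypothetical weights per signal category; a real model learns these.
WEIGHTS = {"device": 0.4, "behavior": 0.4, "network": 0.2}
CHALLENGE_THRESHOLD = 0.6  # assumed cut-off, not a DataDome value


def clamp(value: float) -> float:
    """Keep a value inside [0, 1]."""
    return min(max(value, 0.0), 1.0)


def risk_score(signals: dict) -> float:
    """Weighted average of per-category risk values; missing categories count as 0."""
    return clamp(sum(WEIGHTS[name] * clamp(signals.get(name, 0.0)) for name in WEIGHTS))


def decide(signals: dict) -> str:
    """Only traffic scored as risky sees a challenge; everything else passes silently."""
    return "challenge" if risk_score(signals) >= CHALLENGE_THRESHOLD else "allow"
```

Raising the threshold shows fewer challenges to humans at the cost of letting more borderline automation through, which is the trade-off a risk-based approach tunes.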

DataDome's published Dohop case study illustrates the scale on travel traffic: Dohop runs DataDome across 75+ airline partners, and DataDome reports it cut bot traffic by 70% during peak travel season and detected 2.6 million malicious requests in a 7-day January evaluation (vendor case-study figures).

2025: the AI traffic pivot

On September 30, 2025, DataDome published a 2025 Global Bot Security Report focused on what it calls the "AI traffic crisis," and has been shifting its messaging toward "Agent Trust" — distinguishing wanted AI-agent traffic from abusive automation. This mirrors a broader industry move; see our guide to the agentic web.

Why mobile / CGNAT IPs are treated differently

Network signals are one of DataDome's inputs — but the network layer is also where defenders face their hardest trade-off. Mobile carrier IPs sit behind Carrier-Grade NAT, so a single public address is shared by thousands of real subscribers. Blocking it harms a crowd of humans, not just one bot.

Cloudflare quantified this in its October 29, 2025 blog, "detecting CGN to reduce collateral damage." Cloudflare reported CGNAT IPs were being rate-limited around 3× more often than non-CGNAT IPs despite showing lower bot activity, and described detecting CGN specifically to avoid punishing the many people sharing those addresses.

That is a documented defender design constraint, not a bypass. It explains why carrier IPs earn higher default trust: the collateral-damage cost of over-blocking them is high. For more, see our guide to CGNAT and mobile proxies.
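
One way to picture that constraint is a per-IP rate limit that scales with the number of people estimated to share the address. This is an illustrative sketch only: the function names, the `max_multiplier` cap, and the assumption that a user estimate is available are all hypothetical, and it is not the method Cloudflare or DataDome describes.

```python
def per_ip_limit(base_limit: int, estimated_users: int, max_multiplier: int = 50) -> int:
    """Scale a per-user request limit by the estimated number of users behind an IP.

    The multiplier is capped so a huge shared pool cannot become unlimited.
    """
    if base_limit <= 0:
        raise ValueError("base_limit must be positive")
    multiplier = min(max(estimated_users, 1), max_multiplier)
    return base_limit * multiplier


def should_rate_limit(requests_last_minute: int, base_limit: int, estimated_users: int) -> bool:
    """True when an IP exceeds the limit adjusted for how many users share it."""
    return requests_last_minute > per_ip_limit(base_limit, estimated_users)
```

With a base limit of 60 requests per minute, 500 requests from a single-user IP would be limited, while the same volume from an address shared by ten subscribers would not.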
