Is Web Scraping Legal? What Courts Have Ruled

A factual overview of U.S. court rulings on web scraping, the Computer Fraud and Abuse Act, and where the legal boundaries stand today. Based on actual case law: hiQ v. LinkedIn, Van Buren v. United States, and Meta v. Bright Data.

Apr 8, 2026

Quick Answer

Scraping publicly available data (no login required) is generally not a CFAA violation under hiQ v. LinkedIn (Ninth Circuit, 2022), and Meta v. Bright Data (N.D. Cal., 2024) held that logged-out scraping of public data did not breach Meta's Terms of Service. However, scraping behind authentication, copyrighted content, or personal data may violate other laws (contract, copyright, GDPR, CCPA). robots.txt is a voluntary standard (RFC 9309) and is not legally binding.

→ Public data scraping: generally not a CFAA violation (hiQ v. LinkedIn, 2022)
→ Logged-in scraping: contract law applies, and CFAA liability becomes possible
→ robots.txt: voluntary standard; ignoring it does not by itself create criminal liability

Disclaimer: This is not legal advice. This article summarizes publicly available court rulings and legal analysis for informational purposes only. Consult a lawyer for your specific situation.

1. The Short Answer 2. Key Court Cases 3. What About robots.txt? 4. Where Scraping Gets Risky 5. How Mobile Proxies Help

The Short Answer

Scraping publicly available data — information visible to anyone without logging in — is generally not a violation of the Computer Fraud and Abuse Act (CFAA) based on recent U.S. court rulings. Multiple federal courts have held that accessing public websites does not constitute "unauthorized access" under the CFAA.

However, scraping is not governed by a single law. Even when CFAA liability is off the table, other legal frameworks can apply:

Generally permitted

Scraping publicly visible data
No login or authentication required
Factual, non-copyrighted information

May create liability under

Contract law (Terms of Service breach)
Privacy regulations (GDPR, CCPA, BIPA)
Copyright law (creative content)
State trespass to chattels claims

Key Court Cases

Three cases have shaped the current legal landscape for web scraping in the United States. The first two interpret the scope of the CFAA; the third addresses whether a platform's Terms of Service bind a scraper that never logs in.

hiQ Labs v. LinkedIn (2022)

Ninth Circuit Court of Appeals · Case No. 17-16783

hiQ Labs, a data analytics company, scraped publicly available LinkedIn profiles to build workforce analytics products. LinkedIn sent a cease-and-desist letter and blocked hiQ's access. hiQ sued for an injunction.

The Ninth Circuit held that scraping publicly available data does not violate the CFAA. The court reasoned that "without authorization" under the CFAA applies to private, access-controlled systems — not public websites that anyone can visit. A public website is analogous to an open store: you don't need "authorization" to walk through an open door.

After the Supreme Court vacated and remanded this case in light of Van Buren v. United States (2021), the Ninth Circuit reaffirmed its original holding in 2022, finding that Van Buren actually supported its reasoning — the CFAA's scope is narrower than LinkedIn argued.

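The CFAA holding was not the end of the dispute. On remand, the district court ruled in November 2022 that hiQ had breached LinkedIn's User Agreement, and in December 2022 the parties resolved the case through a consent judgment under which hiQ agreed to pay LinkedIn $500,000. The outcome illustrates that winning on the CFAA does not remove contract exposure.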

Van Buren v. United States (2021)

Supreme Court of the United States · 593 U.S. 374

A Georgia police officer, Nathan Van Buren, used his legitimate access to a law enforcement database to look up a license plate in exchange for money. The government charged him under the CFAA for "exceeding authorized access."

The Supreme Court, in a 6–3 decision authored by Justice Barrett, held that "exceeds authorized access" under the CFAA means accessing areas of a computer system that a person is not allowed to access at all. It does not cover misusing data that the person was otherwise authorized to retrieve. The Court adopted a "gates-up-or-down" model: the CFAA concerns whether you can access certain information, not what you do with information you're entitled to access.

This narrowing of the CFAA was significant for web scraping because it meant that accessing publicly available data — where no "gate" blocks access — is even less likely to create CFAA liability.

Meta Platforms v. Bright Data (January 2024)

U.S. District Court, Northern District of California · Judge Edward Chen

Meta sued Bright Data (a proxy and web scraping company) for scraping public Facebook and Instagram profiles. Judge Edward Chen ruled that Bright Data's scraping of logged-out, publicly accessible data did not violate Meta's Terms of Service.

The court's reasoning was precise: Meta's ToS binds "users" — meaning people who have Meta accounts. When Bright Data scraped public data without logging in, they were acting as visitors, not users, and were therefore not bound by the ToS. The court also noted that companies do not own publicly available data simply because it appears on their platform.

This ruling was significant because it indicated that website operators cannot rely on their Terms of Service alone to create a legal monopoly over publicly accessible information. As a district court decision it is persuasive rather than binding on other courts, and the court left open that scraping while logged in could produce different results.

What About robots.txt?

The Robots Exclusion Protocol, now formalized as RFC 9309 (published September 2022 by the IETF), is a voluntary technical standard. It tells crawlers which parts of a site the operator prefers not be accessed, but it is not a legal instrument.

Respecting robots.txt

Demonstrates good faith and responsible crawling behavior. Courts have considered robots.txt compliance as evidence of a scraper's intent. Respecting it can help your legal position if a dispute arises.

Ignoring robots.txt

Does not create CFAA liability on its own. No U.S. court has held that violating robots.txt constitutes "unauthorized access" under the CFAA. However, ignoring it may weaken your position in other claims (contract, trespass to chattels).

RFC 9309 formalized the robots.txt protocol after decades as an informal convention dating back to Martijn Koster's original 1994 specification. It defines the file syntax and the behavior crawlers are requested to follow, and it notes that the rules are not a form of access authorization.
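For crawlers that choose to honor robots.txt, the check can be automated. The following is a minimal sketch using Python's standard-library `urllib.robotparser`; the robots.txt content, the `example-bot` user agent, and the `is_allowed` helper are assumptions made up for illustration. The rules are inlined so the example runs without network access.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no HTTP request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/products/1",
        "https://example.com/private/report",
    ):
        print(url, is_allowed(ROBOTS_TXT, "example-bot", url))
```

In a real crawler you would typically call `set_url()` and `read()` on the parser to fetch the site's live file instead of inlining it. Note that the standard-library parser applies rules in file order, which may not match the most-specific-match behavior described in RFC 9309 for every file, so treat the result as a good-faith check rather than a guarantee of conformance.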

Where Scraping Gets Risky

While scraping public data has strong legal protection under the CFAA, several scenarios create genuine legal risk under other laws.

Behind authentication (logged-in scraping)

Scraping data that requires logging in with an account changes the legal analysis entirely. You are now a "user" bound by the platform's Terms of Service. The Meta v. Bright Data distinction between "users" and "visitors" cuts the other way: logged-in scrapers are users who agreed to the ToS. Violating those terms can support breach of contract claims, and depending on the facts, may even reach CFAA liability under Van Buren's gates-up-or-down framework.

Personal data collection

Scraping personal data can trigger privacy regulations regardless of whether the data is publicly visible. The EU's GDPR, California's CCPA/CPRA, and Illinois's Biometric Information Privacy Act (BIPA) all regulate the collection and processing of personal information. GDPR in particular can apply extraterritorially: if you collect and process personal data about people in the EU, its obligations may reach you regardless of where your servers are located.

Copyrighted content

Scraping and reproducing copyrighted material (articles, images, creative writing) creates intellectual property liability separate from the CFAA. Factual data (prices, product specifications, public stats) generally cannot be copyrighted under Feist Publications v. Rural Telephone Service (1991), but creative expression can. The distinction between uncopyrightable facts and copyrightable expression is case-specific.

Terms of Service violations

After Van Buren and hiQ, a ToS violation alone is generally not a CFAA crime. But it can still support a civil breach of contract claim. Courts evaluate whether the ToS was reasonably communicated, whether the scraper assented to it (e.g., by creating an account), and whether enforceable consideration existed. Browse-wrap ToS (those shown only via a link in the footer) have weaker enforceability than click-wrap agreements.
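The four scenarios above can be turned into a simple pre-flight checklist for a scraping project. The sketch below is one possible encoding, not a legal test: the `ScrapeJob` fields and the `frameworks_to_review` function are hypothetical names, and the mapping simply mirrors the categories this article describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScrapeJob:
    """Hypothetical description of a scraping job, using this article's categories."""
    requires_login: bool               # data sits behind authentication
    accepted_terms: bool               # e.g. click-wrap assent when creating an account
    collects_personal_data: bool       # names, emails, biometric data, and similar
    reproduces_creative_content: bool  # articles, images, creative writing


def frameworks_to_review(job: ScrapeJob) -> list[str]:
    """List the legal frameworks the article flags for this job.

    An empty list means none of the flags fired, not that the job is risk-free.
    """
    flags = []
    if job.requires_login:
        flags.append("CFAA (access behind an authentication gate)")
    if job.requires_login or job.accepted_terms:
        flags.append("contract law (Terms of Service)")
    if job.collects_personal_data:
        flags.append("privacy regulations (GDPR, CCPA/CPRA, BIPA)")
    if job.reproduces_creative_content:
        flags.append("copyright law")
    return flags


if __name__ == "__main__":
    job = ScrapeJob(
        requires_login=True,
        accepted_terms=True,
        collects_personal_data=True,
        reproduces_creative_content=False,
    )
    for item in frameworks_to_review(job):
        print("Review:", item)
```

A checklist like this is mainly useful for routing a project to legal review early; the case-specific questions (was the ToS reasonably communicated, is the content creative expression) still need a human answer.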

How Mobile Proxies Help

Mobile proxies are infrastructure tools that provide clean carrier IP addresses for web requests. They address the practical challenge of IP-based blocking during legitimate scraping operations.

Clean carrier IPs

Real 4G/5G IPs from mobile carriers avoid the IP reputation issues that plague datacenter proxies. Carrier IPs are shared by thousands of real users via CGNAT, so sites are reluctant to block them outright.

Natural rotation

IP rotation distributes requests across multiple addresses, mimicking natural mobile network behavior where IPs change as devices move between towers and reconnect.

Infrastructure only

Mobile proxies provide proper IP infrastructure for legitimate operations. They are not a tool for circumventing legal restrictions — they prevent technical blocking of otherwise lawful activity.

Proxies address the IP-level technical challenge. Legal compliance — respecting privacy laws, copyright, and contractual obligations — remains your responsibility regardless of what IP infrastructure you use.
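As an illustration of the rotation idea described above, here is a minimal client-side sketch that distributes requests across several proxy endpoints in round-robin order using only the Python standard library. The proxy URLs, `assign_proxies`, and `build_opener_for` are hypothetical and not any provider's actual API; many mobile proxy services rotate IPs on the gateway side instead, in which case a single endpoint is enough.

```python
import itertools
import urllib.request

# Hypothetical proxy endpoints; replace with the gateways your provider issues.
PROXIES = [
    "http://proxy-a.example.com:8000",
    "http://proxy-b.example.com:8000",
    "http://proxy-c.example.com:8000",
]


def assign_proxies(urls, proxies):
    """Pair each URL with the next proxy in a round-robin rotation."""
    rotation = itertools.cycle(proxies)
    return [(url, next(rotation)) for url in urls]


def build_opener_for(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that routes HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(5)]
    for url, proxy in assign_proxies(urls, PROXIES):
        opener = build_opener_for(proxy)
        # opener.open(url, timeout=10) would send the request through `proxy`;
        # it is left out so the sketch runs without network access.
        print(url, "->", proxy)
```

Round-robin is the simplest policy; a production crawler would usually add per-host rate limits and back-off on errors, which matter at least as much as the IP a request comes from.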

Get Started with Legitimate Scraping

Mobile proxies provide the IP infrastructure for scraping publicly available data without IP-level blocking. Clean carrier IPs, automatic rotation, and real mobile network traffic patterns.
