Court Rules Meta's Terms Do Not Prohibit Scraping of Public Data

California Court Rules that Facebook and Instagram’s Terms of Service Do Not Prohibit Scraping of Public Data

DISCLAIMER: This post is for information purposes only. The content is not legal advice and does not create an attorney-client relationship.

In 2023, Meta sued Bright Data for scraping data from Facebook and Instagram, alleging that the scraping violated Facebook and Instagram’s terms of service and therefore constituted a breach of contract.

Both parties filed motions for summary judgment (“MSJ”), which the court ruled on last week. A motion for summary judgment asks the court to decide a cause of action before trial on the ground that the undisputed facts require a decision in the moving party’s favor. The court looked at two main issues:

Meta’s MSJ seeking judgment in its favor on liability for its breach of contract claims, and
Bright Data’s MSJ seeking judgment in its favor on the breach of contract claim against it.

Both of these issues turn on whether Facebook and Instagram’s terms, which prohibit scraping, were binding on Bright Data. The court denied Meta’s MSJ and granted Bright Data’s.

So what does all this legal jargon mean? And what did the court actually say?

It means that the court has ruled that Bright Data did not violate Meta’s terms of service or breach any contract with Meta by scraping public Facebook and Instagram data. As a result, the court dismissed Meta’s breach of contract claims against Bright Data.

After reading the terms of service very closely, the court stated that Meta's terms apply only to a user who is logged in to an account and is using that account to scrape data. As such, Meta's terms do not apply to the scraping of public information while logged out of an account.

Background Facts

Before we dive into the court’s analysis, we need to understand the undisputed facts that the court was looking at in order to make its ruling:

Bright Data was scraping public Facebook and Instagram data,
in order to scrape that public data Bright Data was circumventing Meta’s anti-bot measures,
Facebook and Instagram’s terms of service both prohibit scraping,
in 2021 and 2022 Bright Data had Facebook and Instagram accounts, which required it to agree to the terms of service, but these were corporate accounts and were not used for scraping data, and
in December 2022 Bright Data disabled all its Meta accounts.

Based on these facts, it’s clear that Bright Data was in fact scraping public Facebook and Instagram data and that it had agreed to the terms of service at some point in time. As such, the court needed to look closely at the terms of service for each website to determine whether they applied to a party that was scraping public data while not logged in.

Data not behind a login is considered public data, even if the scraper is circumventing anti-bot measures and CAPTCHAs to obtain the data.

The court first looked at Meta’s claim that Bright Data’s use of tools to circumvent anti-bot measures shows that Bright Data collected data behind an authentication barrier and was not collecting only public data. The court found that there is a clear difference between defeating anti-scraping technology and piercing privacy walls like a login. Meta “left the gate open” by choosing not to place all its data behind a login. The court stated that circumventing CAPTCHAs and anti-bot technology is not equivalent to scraping behind a login. Since there was no evidence of logged-in scraping, the court concluded that Bright Data engaged only in the scraping of public data.

Meta’s terms do not prohibit scraping of public data while not logged in, even when you have active Facebook and Instagram accounts.

Meta then argued that Bright Data was bound by its website terms both before and after Bright Data terminated its accounts. Bright Data contended that the terms did not prohibit scraping of public data while not logged in and that, after it terminated its accounts, there was no valid contract in effect. Bright Data further contended that the terms govern use of Facebook and Instagram, and that scraping public data while logged out is not “use” as defined under the terms of service, because Bright Data is not a “user” of the services when conducting logged-out scraping.

The court conducted a full analysis of the Facebook and Instagram terms in order to determine whether they apply to scraping public data while logged out. It found that the terms state that they govern “your use” of Facebook and Instagram. It is reasonable to interpret that Bright Data did not “use” Facebook or Instagram in order to conduct logged-off public data scraping. Further, based on the terms and the fact that non-logged-in visitors are not asked to agree to the terms or even shown them, it’s reasonable to infer that a user is an account holder. Additionally, the services listed in the terms that a user agrees to are mostly services that would require someone to be logged in. As such, it’s reasonable to conclude that a “user” is a logged-in user of the service and that the terms do not apply to those who are not logged in.

Further, even when Bright Data did have accounts, there is no evidence that those accounts were related in any way to its scraping activity. When Bright Data scraped, it did so without logging in and it did not use its accounts in any way to conduct the scraping. So even though Bright Data was a user for other purposes during the time it had accounts, it was not using Facebook or Instagram as a part of its non-logged-in scraping of public data. There is further back and forth between the parties that parses language in the terms in order to support their respective theories of who the terms apply to. The court took all the arguments into consideration, but still concluded that the terms do not apply to the scraping of public data while not logged in. The court believed that Bright Data’s interpretation of a user was more consistent with the overall language and purpose of the user terms.

For all the above reasons, the court found that the terms do not apply to Bright Data’s scraping and denied Meta’s MSJ. The court then moved to Bright Data’s MSJ, where it reiterated that the Facebook and Instagram terms “do not bar logged-off scraping of public data” and granted Bright Data’s MSJ.

The court’s order is a great step forward, but it is not fully determinative of the contract issues facing web scrapers.

While this is a great ruling for ethical web scrapers, we need to be cautious about overstating its reach. The court’s ruling does not establish that all logged-off scraping of public data is OK. It’s a very fact-specific analysis of Meta’s terms and how they apply to non-logged-in scraping of public data. Meta could change its terms to specifically apply to non-logged-in users, and other websites’ terms will be interpreted based on their precise wording, so there is still room for websites to win breach of contract claims.

Additionally, the court does leave the door open for Meta to file an amended complaint to add a quasi-contract claim, which would be litigated along with Meta’s remaining cause of action for tortious interference with contract. So while some of the case has been decided, there is still more to be litigated between Meta and Bright Data.

Finally, and most importantly, the wider question of whether browsewrap terms can ever be enforceable when someone is scraping public data is still open. But for now this order gives good guidance and some hope for where the courts are headed.
