Meta announced a dramatic shift in its content moderation policies across its social media platforms in January 2025. Under the slogan "More Speech and Fewer Mistakes," the company decided to disable a significant portion of the automated systems it had used to proactively detect and remove harmful content, threats and harassment.
Now, a new and alarming report by the U.S.-based Center for Countering Digital Hate (CCDH), published earlier this month under the title "Safety Off," lays bare the severe — and, critics say, predictable — consequences of that decision.
According to the report, the changes have coincided with a troubling deterioration in online safety and democratic discourse, with elected officials in the United States facing what researchers describe as an unprecedented surge in violent rhetoric and hate speech.
And the threats are not merely rhetorical. The report highlights several recent incidents, including the April 2026 arrest of a man armed with a rifle and knives who allegedly attempted to infiltrate the White House Correspondents' Dinner with the intention of assassinating President Donald Trump, the arson attack on the residence of Pennsylvania Gov. Josh Shapiro, and the killing of Minnesota House Speaker Melissa Hortman and her husband at their home.
The numbers tell the story
Researchers collected and analyzed nearly 8 million Facebook comments directed at 100 members of the U.S. House of Representatives — 50 Republicans and 50 Democrats — during the six months before and the six months after the policy change.
The analysis was conducted using advanced artificial intelligence systems based on the GPT-4o-mini and GPT-5.1 language models. According to the report, the findings were manually validated and achieved an accuracy rate of approximately 79%.
The results show that the volume of abusive comments on the platform tripled following the policy shift.
Threats of physical violence increased more than fourfold, rising from 1,800 comments before the change to 7,600 afterward. These included explicit calls for lawmakers to be murdered, hanged or physically harmed.
Targeted expressions of hate also increased more than fourfold, jumping from 6,900 incidents to more than 30,000. The comments included racist, sexist and homophobic slurs.
Cases of bullying and personal harassment more than doubled, rising from 15,700 to 39,900 comments, including messages encouraging lawmakers to commit suicide.
Direct threats against President Trump also more than doubled after the new policy took effect, increasing from 800 to 1,900.
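The multipliers quoted above can be checked directly against the before-and-after counts. The short Python sketch below does only that arithmetic; the category labels and variable names are illustrative rather than taken from the report, and the figure for targeted hate uses 30,000 as a floor because the article gives it as "more than 30,000."

```python
# Before/after comment counts as quoted in the article (six months each side
# of the policy change). Category labels here are illustrative shorthand.
counts = {
    "violent threats": (1_800, 7_600),
    "targeted hate": (6_900, 30_000),  # floor: article says "more than 30,000"
    "bullying and harassment": (15_700, 39_900),
    "threats against Trump": (800, 1_900),
}


def fold_change(before: int, after: int) -> float:
    """Return how many times larger the 'after' count is than 'before'."""
    return after / before


# Compute the multiplier for every category, rounded for display.
ratios = {name: round(fold_change(b, a), 2) for name, (b, a) in counts.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}x")
```

On these figures, both violent threats and targeted hate come out slightly above four times their earlier level, while harassment and threats against Trump land between two and three times.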
A deliberate policy shift
According to the report, the increase in abusive content can be traced to Meta's deliberate decision to scale back what it calls "proactive enforcement."
Meta's own transparency reports support that assessment. The company roughly halved the amount of harmful content it removed automatically through algorithmic systems.
Automated removals fell from 35 million pieces of content in the six months preceding the reform to 17.2 million in the six months afterward.
Instead, Meta shifted responsibility to users by relying more heavily on manual reporting — a model that the report argues has proven ineffective. Public reports of harmful content increased only slightly, from 3 million to 3.5 million, and failed to stem the growth of abusive discourse across the company's platforms.
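The gap between the two trends is easier to see as percentages. The Python sketch below uses only the four figures quoted above; the variable names are illustrative, and the comparison is one of scale only, since a user report is not the same thing as a removal.

```python
# Figures quoted in the article, each covering a six-month window
# (before the policy change, after the policy change).
auto_removed = (35_000_000, 17_200_000)  # content removed automatically
user_reports = (3_000_000, 3_500_000)    # reports filed by users


def pct_change(before: int, after: int) -> float:
    """Percentage change from 'before' to 'after' (negative means a drop)."""
    return (after - before) / before * 100


auto_change = pct_change(*auto_removed)
report_change = pct_change(*user_reports)

# Absolute movement in each direction, for a sense of scale only:
# reports and removals are different units of enforcement.
removal_drop = auto_removed[0] - auto_removed[1]
report_rise = user_reports[1] - user_reports[0]

print(f"automated removals: {auto_change:.1f}%")
print(f"user reports: {report_change:+.1f}%")
print(f"removals fell by {removal_drop:,}; reports rose by {report_rise:,}")
```

Automated removals drop by about half, while user reports rise by roughly a sixth, a difference of 17.8 million against 0.5 million in absolute terms.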
Meta also reversed its previous policy of limiting the visibility of political content in users' feeds and began recommending such content more aggressively. As a result, the overall volume of comments in the study sample increased from 2.2 million to 5.6 million.
Researchers argue that the combination of greater exposure to politically charged content and the weakening of automated safety mechanisms created a "perfect storm" of online toxicity and violence.
Echoes of Musk's strategy at X
The move closely resembles the strategy pursued by Elon Musk at the rival social media platform X.
Since acquiring the platform, Musk has aggressively reduced safety and moderation teams in favor of a model centered on what he describes as unrestricted free speech and a user-driven Community Notes system. Meta has adopted a similar approach by replacing its independent fact-checking programs with community-based moderation mechanisms.
Then came Trump
Content moderation on social media began in the early 2000s with simple keyword-based systems, when platforms relied on lists of prohibited words. Those systems were easily circumvented through spelling variations or slang.
Between 2010 and 2016, amid growing regulatory and public pressure, platforms shifted toward large-scale human moderation and outsourced content review. As Facebook expanded, armies of content moderators were hired around the world and tasked with reviewing millions of disturbing posts every day, leading to criticism over the psychological toll on workers.
Because those methods also proved insufficient, the era of proactive moderation emerged between 2017 and 2024. Under mounting regulatory pressure and threats of advertiser boycotts, technology companies developed advanced machine-learning and artificial intelligence systems capable of scanning billions of posts, identifying violent content and removing it before any human moderator saw it.
That changed after Trump's return to the White House for a second term in 2025. Silicon Valley gained a powerful voice within the administration through Vice President J.D. Vance, whose selection as running mate was reportedly backed by technology investors Peter Thiel and Elon Musk.
The result, critics argue, has been a broader shift across the technology industry toward minimizing content-safety efforts. In some cases, companies have sought to reduce the substantial computing costs associated with AI moderation tools; in others, they have aimed to avoid accusations of political bias or censorship.
Meta pushes back
Meta has sought to counter the growing criticism.
In a statement provided to WIRED magazine, a company spokesperson said Meta remains committed to transparency and that its own data indicates that the prevalence of hate speech remained stable throughout 2025.
The company questioned the report's conclusions and said it could not comment fully without conducting a detailed review of the research.
However, WIRED reported that Meta had been provided in advance with specific examples of the threatening and abusive comments cited in the study but chose not to address them directly.
In a development that raised further questions, many of the offensive comments highlighted as evidence in the report were reportedly removed shortly before the study's publication.




