Aug 27 (Reuters) - Anthropic said on Wednesday it had
detected and blocked hackers attempting to misuse its Claude AI
system to write phishing emails, create malicious code and
circumvent safety filters.
The company's findings, published in a report, highlight
growing concerns that AI tools are increasingly being exploited
in cybercrime, intensifying calls for tech firms and regulators
to strengthen safeguards as the technology spreads.
Anthropic's report said its internal systems had stopped the
attacks and it was sharing the case studies - showing how
attackers had attempted to use Claude to produce harmful content
- to help others understand the risks.
The report cited attempts to use Claude to draft tailored
phishing emails, write or fix snippets of malicious code and
sidestep safeguards through repeated prompting.
It also described efforts to script influence campaigns by
generating persuasive posts at scale, and attempts to help
low-skill hackers with step-by-step instructions.
The company, backed by Amazon.com (AMZN) and Alphabet,
did not publish technical indicators such as IP addresses or
prompts, but said it had banned the accounts involved and
tightened its filters after detecting the activity.
Experts say criminals are increasingly turning to AI to make
scams more convincing and to speed up hacking attempts. These
tools can help write realistic phishing messages, automate parts
of malware development and potentially even assist in planning
attacks.
Security researchers warn that as AI models become more
powerful, the risk of misuse will grow unless companies and
governments act quickly.
Anthropic said it follows strict safety practices, including
regular testing and outside reviews, and plans to keep
publishing reports when it finds major threats.
OpenAI, backed by Microsoft (MSFT) and SoftBank, and
Google have faced similar scrutiny over fears their AI models
could be exploited for hacking or scams, prompting calls for
stronger safeguards.
Governments are also moving to regulate the technology, with
the European Union advancing its Artificial Intelligence Act
and the United States pushing for voluntary safety commitments
from major developers.