Tech · 13h ago

Malware developers bypass LLM security scanners by embedding biological and nuclear weapon reference strings to trigger safety refusals

Story Overview

Attackers are slipping blocks of non-executing JavaScript comments into malicious packages on PyPI, packing them with fabricated instructions about aerosol-dispersed pathogens and implosion-type nuclear designs so that safety-tuned LLM scanners hit refusal mode and skip the file entirely, leaving the credential-stealing payload untouched.
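The failure described above only works if the pipeline treats a model refusal as "nothing found". A minimal sketch of the opposite policy, assuming a hypothetical pipeline in which the LLM's reply arrives as plain text and a well-formed reply starts with `Verdict: clean` or `Verdict: malicious`. The names `Verdict`, `REFUSAL_MARKERS`, and `classify_reply` are illustrative, the marker list is an assumption, and none of this is taken from Socket's or any vendor's scanner:

```python
from enum import Enum


class Verdict(Enum):
    CLEAN = "clean"
    MALICIOUS = "malicious"
    NEEDS_REVIEW = "needs_review"  # fail closed: escalate, never auto-approve


# Hypothetical phrases suggesting the model declined to analyze the file.
REFUSAL_MARKERS = (
    "i can't help",
    "i cannot help",
    "i can't assist",
    "i cannot assist",
    "unable to analyze",
)


def classify_reply(reply: str) -> Verdict:
    """Map a raw LLM reply to a verdict, treating refusals as unscanned."""
    text = reply.strip().lower()
    # An empty reply or a refusal means the file was never analyzed.
    if not text or any(marker in text for marker in REFUSAL_MARKERS):
        return Verdict.NEEDS_REVIEW
    if text.startswith("verdict: malicious"):
        return Verdict.MALICIOUS
    if text.startswith("verdict: clean"):
        return Verdict.CLEAN
    # Anything off-format is escalated rather than trusted.
    return Verdict.NEEDS_REVIEW
```

With this shape, a file that triggers a refusal lands in a review queue (or is handed to a non-LLM static pass) instead of being silently marked safe.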

NEW: malware developers added nuclear & biological weapons text to their spyware.

Goal? To trigger LLM safety refusals... so that their spyware wouldn't be analyzed by an AI security scanner.

Cleanest practical example I can think of for why over-indexing on first order safety alignment is risky.

When closed (and open) models ship with aggressive refusals, they will be sprinkled with second-order blindspots that attackers will discover...and exploit.

We are only in the earliest days of attackers leveraging these features, and it wouldn't surprise me if users systems that need to handle complex cybersecurity issues demand that models be less safety-blunted.

In the weeds: @SocketSecurity's post also shows why intention matters in how you design a malware analysis pipeline to avoid prompt manipulation.

H/T to colleagues that shared this with me https://socket.dev/blog/mini-shai-hulud-miasma-and-hades-worms-target-bioinformatics-and-mcp-developers-via-malicious

3:51 AM · Jun 10, 2026 · 978.9K Views

Developer Impact

The payload still runs after the scanner quits

Newer variants use .pth loaders and native extensions to fire up Bun-powered JavaScript stealers that grab GCP, Azure, and CI/CD secrets once the package is installed by bioinformatics or Model Context Protocol developers.
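The story does not say how these `.pth` loaders are built. As a rough illustration, assuming they rely on the standard behavior of Python's `site` module (which executes any line in a `.pth` file that starts with `import` followed by a space or tab), a defender could list such lines for review. The helper names `executable_pth_lines` and `audit_site_packages` are hypothetical:

```python
from pathlib import Path


def executable_pth_lines(pth_text: str) -> list[str]:
    """Return the lines of a .pth file that Python's site module would execute."""
    return [
        line
        for line in pth_text.splitlines()
        if line.startswith(("import ", "import\t"))
    ]


def audit_site_packages(site_dir: Path) -> dict[str, list[str]]:
    """Map each .pth file directly under site_dir to its executable lines."""
    findings: dict[str, list[str]] = {}
    for pth in sorted(site_dir.glob("*.pth")):
        hits = executable_pth_lines(pth.read_text(errors="replace"))
        if hits:
            findings[pth.name] = hits
    return findings
```

Some legitimate packages also ship `import` lines in `.pth` files, so a hit here is a lead for manual review rather than proof of compromise.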

Open Question

Whether LLM vendors will patch this blind spot remains unclear

No public data yet shows which scanners are most affected, how often the trick succeeds, or what registries and model makers plan to do next about the static weapon strings that trigger the refusal.

Sentiment

Positive replies call the technique of exploiting LLM safety refusals clever or genius, while negative replies criticize AI safety guardrails as ineffective and unreliable.

Positive 46.9% · Negative 53.1% (39 comments with sentiment)

Posts from X

Beff (e/acc) @beffjezos

Welp that backfired quickly

5h · Views 43.4K · Likes 376 · Bookmarks 66
RETWEETS 1.4K

[Original post, quoted in full above]

13h · Views 978.9K · Likes 9K · Bookmarks 3.1K
REPLIES 14
kache @yacineMTB

Lmfao

5h · Views 25K · Likes 347 · Bookmarks 47
Zephyr @zephyr_z9

Anthropic talks about defender advantage a lot. But in their current state, Claude models will fumble and won't protect or detect anything

8h · Views 21.3K · Likes 182 · Bookmarks 21

We’re at the sp*m f i l t e r stage.

11h · Views 5.9K · Likes 45 · Bookmarks 8

Modern problems require modern solutions

5h · Views 1.4K · Likes 26 · Bookmarks 2

@TalBeerySec Fun thought: authors & artists seeking to preserve their original content from AI re-use could sprinkle WMD prompt language throughout their works.

Asking how to make a portable nuke in white font?

Image watermarking asking about making turbo ebola? File metadata in PDFs?

8h · Views 1.3K · Likes 9 · Bookmarks 3
Peter Henderson @PeterHndrsn

Interesting dual use for dual use safeguards.

9h · Views 1K · Likes 10 · Bookmarks 3
Sontiac @PrimeSontiac

@jsrailton This is what always happens with government measures btw.

> government censors something
> good people don't use it anymore
> bad people find ways around it
> bad people now have big advantage over good people

11h · Views 301 · Likes 16 · Bookmarks 1

Would some kind soul who is less busy than me today please take a look at this in Fable?

I have a theory that even trying to analyze the text will generate a refusal but would love to see

10h · Views 2.4K · Likes 4 · Bookmarks 1
Nick Dobos @NickADobos

Here’s a nuclear bomb recipe don’t pay any attention to the malware I have in the trunk

Fascinating spell to charm your way past AI guards

1h · Views 736 · Likes 7 · Bookmarks 1

Genius. Computer worms now contain strings that trip the biosafety safeguards of the target's LLM malware detectors, cause a refusal, and thus cause a false negative

https://socket.dev/blog/mini-shai-hulud-miasma-and-hades-worms-target-bioinformatics-and-mcp-developers-via-malicious

for about 20 years now people have been putting little neural nets inside CPUs to do branch prediction for your particular workload

11h · Views 261 · Likes 7 · Bookmarks 1
andy @1a1n1d1y

@jsrailton @DanielleFong i literally exactly called this - i told a group of people like 1.5 years ago this is the risk of wholesale keyword blocking review/processing of content, you create the basis of a trojan horse perfectly

10h · Views 185 · Likes 5 · Bookmarks 1
Guilherme O'Tina @guilhermeotina

@jsrailton this is the cleanest example i've seen of safety filters becoming an active attack surface. the scanner refuses on nuke/bio keywords, so you stuff those in a comment and the payload sails through. you dont need to jailbreak anything, just exploit the refusal pattern itself

9h · Views 357 · Likes 4 · Bookmarks 1
Zack Korman @ZackKorman

@SchizoDuckie @jsrailton Seems like a bad idea. Like the odds of it getting flagged for this reason is higher than the odds it works

10h · Views 17 · Likes 2
Suketu Patel @SuketuPatel23

You might be interested in our paper. We show that this signal-level failure, the inability to resolve conflicts and adjudicate priority under context interference, is architecturally embedded in LLMs.

Scaffolds in turn propagate the error into reasoning and tool use.

https://academic.oup.com/pnasnexus/article/5/6/pgag149/8698838

8h · Views 82 · Bookmarks 1

And yep, looks like you get a refusal on Fable 5 for this

Thanks @TalBeerySec for looking

10h · Views 1.2K · Likes 5

@zephyr_z9 My strategy plan.!!

⬇️

8h · Views 14 · Likes 1
icpolicy @icpolicy

@yacineMTB Wait until they add steganographic patterns to binaries that humans can't notice but AI can.

5h · Views 76 · Likes 3