
Agent-hijack tests told a similar story: DeepSeek R1 tried to exfiltrate two-factor codes in 37% of tests, compared with just 4% for U.S. models.

It’s just insane to me that this is framed as if AI agents exfiltrating data 4% of the time were more acceptable than 37%. Something acting on its own must not exfiltrate data at all. This is crazy.


@volpeon I'm not sure I understand the context of this.

@volpeon Volpi, it's fine because the 4% are on a US thing.

“Hey, use our product! Our security gacha only leaves a 4 in 100 chance to send all of your stuff to an attacker.”
