6 Comments
Priank Ravichandar

This is such a great analysis of the Anthropic‑DoD situation! I like Anthropic's products too, but I feel the company is not fundamentally different from the other major AI labs. It at least has some principles (for now), but the concentration of power makes it hard to put any meaningful controls in place, especially as its models become increasingly essential.

You're right that there's no guarantee they won't suddenly change their position under financial pressure. They're already diluting their safety commitments to be more “competitive,” so they may well abandon their current stance if circumstances demand it.

Privacat

Right. Plus, even the most well-meaning companies eventually fall; it's hard to compete on virtue and morals when we live in a race-to-the-bottom society. I appreciate that you're also looking at the problem from this angle, and I'm glad you enjoyed this!

Stephan Geering

Great and thoroughly researched article. I share your intuition about the outcome. Another fascinating aspect of this is who should define the persona and alignment of Claude when it's used in combat: the private company or the sovereign state? (A question raised in this Ezra Klein Show episode: https://youtu.be/xc97F2CFBOY?si=3X6CVZseoKB68IhC)

Privacat

That question worries me because, while I'm mostly aligned with Anthropic's stance right now, there's nothing guaranteeing Anthropic will remain on the side of the angels. Once shareholders start dictating the rules, things get murky fast.

That said, having this administration define anything would be catastrophic for humanity. Not just bad politically, but a genuine existential risk.

Personally, I would prefer that none of these entities have such power, but that isn't the world we live in, apparently.

Mahdi Assan

Really good, Carey! Two thoughts I had:

First, I think you captured something that many in this debate overlook: "Bluntly, Trump needs Anthropic (and by extension, the hyperscalers, VC firms, and sovereign wealth funds) far more than they need him." This power inversion is not new: many governments, including the US government, have grown increasingly reliant on private-sector companies for national security and military purposes. The private sector is where the advanced technology is being built and, just as crucially, where the rich datasets live. LLMs are simply another iteration of this growing dependency.

Second, another point I think gets overlooked in this debate: yes, it's nice for Amodei and Anthropic to say that their AI systems should not be used for a given purpose, but how do they imagine enforcing those principles in practice? We are talking about non-deterministic systems, after all, so even if you do all the fancy post-training to keep them from acting a certain way, that behavior can never really be guaranteed. Maybe the only real limiting factor is that, when it comes to national security and military use cases, the stakes are high enough that government agencies have less incentive to experiment with or jailbreak LLMs (in case things go catastrophically wrong), but the chances are not zero...

Privacat

A lot of the enforcement comes down to contractual terms and, arguably, some technical controls.

While jailbreaking LLMs is always possible, I'm skeptical that the people within the DoD capable of doing it will reliably and consistently be able to use Claude to launch autonomous murderbots or engage in targeted surveillance without Anthropic's support.

I'm not saying it's impossible, just that I suspect it's time-consuming and requires access to the kinds of people who probably aren't working for peanuts at the DoD.

I think if Palantir, for example, were to have some of its forward-deployed engineers try to jailbreak Claude into becoming Skynet, that fact would leak, and there would be a lot of consequences, including Anthropic potentially blocking access.

If it were easy, I don't think the Administration would bother pitching a fit; they would just silently route around what they see as a problem without bothering to tell Anthropic.