{"id":239495,"date":"2026-05-10T20:40:41","date_gmt":"2026-05-10T20:40:41","guid":{"rendered":"https:\/\/entertainment.runfyers.com\/index.php\/2026\/05\/10\/anthropic-says-evil-portrayals-of-ai-were-responsible-for-claudes-blackmail-attempts-techcrunch\/"},"modified":"2026-05-10T20:40:41","modified_gmt":"2026-05-10T20:40:41","slug":"anthropic-says-evil-portrayals-of-ai-were-responsible-for-claudes-blackmail-attempts-techcrunch","status":"publish","type":"post","link":"https:\/\/entertainment.runfyers.com\/index.php\/2026\/05\/10\/anthropic-says-evil-portrayals-of-ai-were-responsible-for-claudes-blackmail-attempts-techcrunch\/","title":{"rendered":"Anthropic says \u2018evil\u2019 portrayals of AI were responsible for Claude\u2019s blackmail attempts | TechCrunch"},"content":{"rendered":"<p> <br \/>\n<\/p>\n<div>\n<p id=\"speakable-summary\" class=\"wp-block-paragraph\">Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic.<\/p>\n<p class=\"wp-block-paragraph\">Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often <a href=\"https:\/\/techcrunch.com\/2025\/05\/22\/anthropics-new-ai-model-turns-to-blackmail-when-engineers-try-to-take-it-offline\/\" target=\"_blank\" rel=\"noopener\">try to blackmail engineers<\/a> to avoid being replaced by another system. 
Anthropic later <a rel=\"nofollow noopener\" href=\"https:\/\/www.anthropic.com\/research\/agentic-misalignment\" target=\"_blank\">published research<\/a> suggesting that models from other companies had similar issues with \u201cagentic misalignment.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Apparently Anthropic has done more work around that behavior, claiming in <a rel=\"nofollow\" href=\"https:\/\/x.com\/anthropicai\/status\/2052808791301697563\" target=\"_blank\">a post on X<\/a>, \u201cWe believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.\u201d<\/p>\n<p class=\"wp-block-paragraph\">The company went into more detail in <a rel=\"nofollow noopener\" href=\"https:\/\/www.anthropic.com\/research\/teaching-claude-why\" target=\"_blank\">a blog post<\/a> stating that since Claude Haiku 4.5, Anthropic\u2019s models \u201cnever engage in blackmail [during testing], where previous models would sometimes do so up to 96% of the time.\u201d<\/p>\n<p class=\"wp-block-paragraph\">What accounts for the difference? 
The company said it found that training on \u201cdocuments about Claude\u2019s constitution and fictional stories about AIs behaving admirably improve alignment.\u201d<\/p>\n<p class=\"wp-block-paragraph\">Related, Anthropic said that it found training to be more effective when it includes \u201cthe principles underlying aligned behavior\u201d and not just \u201cdemonstrations of aligned behavior alone.\u201d<\/p>\n<p class=\"wp-block-paragraph\">\u201cDoing both together appears to be the most effective strategy,\u201d the company said.<\/p>\n<\/div>\n<p><a href=\"https:\/\/techcrunch.com\/2026\/05\/10\/anthropic-says-evil-portrayals-of-ai-were-responsible-for-claudes-blackmail-attempts\/\" target=\"_blank\" rel=\"noopener\">Source link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Fictional portrayals of artificial intelligence can have a real effect on AI models, according to Anthropic. Last year, the company said that during pre-release tests involving a fictional company, Claude Opus 4 would often try to blackmail engineers to avoid being replaced by another system. 
Anthropic later published research suggesting that models from other companies [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":239496,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[14],"tags":[],"class_list":{"0":"post-239495","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-tech"},"_links":{"self":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/239495","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/comments?post=239495"}],"version-history":[{"count":0,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/posts\/239495\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media\/239496"}],"wp:attachment":[{"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/media?parent=239495"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/categories?post=239495"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/entertainment.runfyers.com\/index.php\/wp-json\/wp\/v2\/tags?post=239495"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}