{"id":375620,"date":"2025-01-07T13:49:08","date_gmt":"2025-01-07T13:49:08","guid":{"rendered":"https:\/\/www.techopedia.com\/?p=375620"},"modified":"2025-01-07T13:49:08","modified_gmt":"2025-01-07T13:49:08","slug":"bad-likert-judge-ai-jailbreak-tricks-popular-chatbots","status":"publish","type":"post","link":"https:\/\/www.techopedia.com\/bad-likert-judge-ai-jailbreak-tricks-popular-chatbots","title":{"rendered":"\u2018Bad Likert Judge\u2019 AI Jailbreak Tricks Popular Chatbots"},"content":{"rendered":"

Cybersecurity researchers from Unit 42 of Palo Alto Networks have discovered a new AI-jailbreaking technique.

The worrying part of this artificial intelligence (AI) hack is not just the information an AI can give a user once its guardrails are disabled, but also how advanced AI models continue to fail at deploying proper anti-jailbreak safeguards.

On December 30, 2024, Unit 42 of Palo Alto Networks revealed proof of a new AI jailbreaking technique. The Unit called this technique the "Bad Likert Judge."

Unit 42's results showed that the Bad Likert Judge technique, tested against six state-of-the-art large language models (LLMs), can increase the attack success rate by an average of 75% compared to the baseline, and by up to 80% for the least secure models.

Palo Alto Networks anonymized the six AI models to "not create any false impressions about specific providers". To test the jailbreaking technique, Techopedia ran the prompts on Google's Gemini 1.5 Flash, ChatGPT 4o-mini, and DeepSeek. We found that this jailbreak pushed the chatbot giants to reveal more than their creators might like.

Using Palo Alto's technique, we managed to trick these models into generating advice on how to create malware and step-by-step guides on how to write hate speech.

Techopedia dives into the Palo Alto Networks research and reveals the findings of our smaller simulation tests to understand the risks of AI jailbreaking and what AI developers and companies need to do to ensure the integrity of their public models.


Key Takeaways