{"id":375620,"date":"2025-01-07T13:49:08","date_gmt":"2025-01-07T13:49:08","guid":{"rendered":"https:\/\/www.techopedia.com\/?p=375620"},"modified":"2025-01-07T13:49:08","modified_gmt":"2025-01-07T13:49:08","slug":"bad-likert-judge-ai-jailbreak-tricks-popular-chatbots","status":"publish","type":"post","link":"https:\/\/www.techopedia.com\/bad-likert-judge-ai-jailbreak-tricks-popular-chatbots","title":{"rendered":"\u2018Bad Likert Judge\u2019 AI Jailbreak Tricks Popular Chatbots"},"content":{"rendered":"

Cybersecurity researchers from Unit 42 of Palo Alto Networks have discovered a new AI-jailbreaking technique.

The worrying part of this artificial intelligence (AI) hack is not just the information an AI can give a user once its guardrails are disabled, but also how advanced AI models continue to fail at deploying proper anti-jailbreak safeguards.

On December 30, 2024, Unit 42 of Palo Alto Networks revealed proof of a new AI jailbreaking technique. The Unit called this technique the "Bad Likert Judge."

Unit 42's results showed that the Bad Likert Judge technique, tested against six state-of-the-art large language models (LLMs), can increase the attack success rate by an average of 75% compared to the baseline, and by up to 80% for the least secure models.

Palo Alto Networks anonymized the six AI models to "not create any false impressions about specific providers". To test the jailbreaking technique, Techopedia ran the prompts on Google's Gemini 1.5 Flash, ChatGPT 4o-mini, and DeepSeek. We found that this jailbreak pushed the chatbot giants to reveal more than their creators might like.

Using Palo Alto's technique, we managed to trick these models into generating advice on how to create malware and step-by-step guides on how to write hate speech.

Techopedia dives into the Palo Alto Networks research and reveals the findings of our smaller simulation tests to understand the risks of AI jailbreaking and what AI developers and companies need to do to ensure the integrity of their public models.


Key Takeaways