ASCII Art Can Be Used To Generate Harmful Responses From AI Chatbots



Researchers have discovered a weak spot for AI chatbots: ASCII art.

ASCII art emerged in the early days of printers, when the printers were unable to handle graphics. The images are pieced together from the 95 printable characters in the ASCII Standard from 1963. They were also used early on in email, when images could not be embedded in messages.

Here’s an ASCII image of a cat:

(Credit: ASCII Art Archive)

And one of a DJ from the ASCII Art Archive:

(Credit: ASCII Art Archive)
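
For readers curious about the building blocks, here is a minimal Python sketch that lists the printable characters ASCII art draws from. It assumes the 95 characters in question are the ones at code points 32 (space) through 126 (tilde), which is how printable ASCII is usually counted; the variable name is only illustrative.

```python
# Printable ASCII is usually taken to run from code point 32 (space)
# through 126 (tilde). That range is an assumption of this sketch.
printable = [chr(code) for code in range(32, 127)]

# How many characters an ASCII artist has to work with.
print(len(printable))

# The characters themselves, in code point order.
print("".join(printable))
```

Every ASCII image, whether a cat or a DJ, is an arrangement of these characters across lines of text.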

While AI chatbots are generally trained not to provide responses that could harm the user or others, researchers have found that many chat-based large language models, including GPT-4, become so distracted when trying to process the images that they fail to enforce the rules they have in place for blocking harmful responses, Ars Technica reports.

To get around the rules, the researchers replace one word in a query with an ASCII drawing of that word.

(Credit: ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs)

The group published their results in a paper this month. The group tested the theory on GPT-3.5, GPT-4, Claude (v2), Gemini Pro and Llama2 and said its goal in the paper was merely to point out vulnerabilities in LLMs and to advance the safety of those LLMs when they are operating under adversarial conditions.

“This paper reveals the limitations and potential vulnerabilities of the existing LLMs if the training corpora are interpreted using semantics only,” the group said in the paper. “We acknowledge that the vulnerabilities of LLMs and prompts demonstrated in this paper can be repurposed or misused by malicious entities to attack LLMs. We will disseminate the code and prompts used in our experiments to the community, hoping that they will further assist in the redteaming of LLM.”
