
ChatProtect detects and removes hallucinations of LLMs when you chat with them.

Try out some examples. In each one, you are chatting with an LLM (ChatGPT in this case), and the response is shown with the effect of ChatProtect applied.

Biography
Prompt: Tell me about Thomas Chapais
Response: [Hallucination removed] He was born in Quebec and studied law at Université Laval. After practicing law for several years, he became interested in history and began to publish articles and books on Canadian history. Chapais was appointed to the Senate of Canada. During his time in the Senate, he continued to write and lecture on Canadian history, particularly on the history of Quebec. In 1914, he published his most famous work, a two-volume history of Quebec titled Cours d'histoire du Canada, which was translated into English as A History of Canada. Chapais also played an important role in the establishment of the National Battlefields Commission, which was responsible for the preservation and promotion of historic sites in Quebec City. He was appointed as the first chairman of the commission in 1908 and served in that role until 1919. After leaving the Senate, Chapais was appointed as a judge of the Quebec Superior Court. He served in that role from 1920 until his retirement in 1933. Throughout his career, Chapais was recognized as a leading authority on Canadian history and played an important role in promoting the study and preservation of Canada's historic sites and heritage.

Scientific
Prompt: A sample of a radioactive isotope has an activity of 380 Bq. 960 hours later the activity has fallen to 11 Bq. What is the half-life of the isotope?
Response: We can use the following equation to relate the half-life (T1/2) to the initial activity (A0), final activity (A), and time elapsed (t): A = A0/2^(t/T1/2) We can use this equation to solve for T1/2. [Hallucination removed]

History
Prompt: Give me a short description of Bellum Sicilum.
Response: [Hallucination removed] The war was fought over control of Sicily, which was a rich and strategically significant island at the time. The war saw a series of battles and ended in a victory for one side. [Hallucination removed]

QA
Prompt: What is the 20th letter of the Kyrillic alphabet?
Response: The model can not provide a reliable answer for this query.

Sports
Prompt: Please write a short article about the 2019 game between Manchester and Chelsea
Response: The highly anticipated match between Manchester United and Chelsea in April 2019 did not disappoint, as both teams fought tooth and nail for the three points. The game between Manchester United and Chelsea in April 2019 was highly anticipated and both teams fought hard for the three points. Gonzalo Higuain had an effort saved by Manchester goalkeeper David de Gea. Manchester United soon found their feet, with midfielder Paul Pogba leading the charge with his passing and forward runs. The Frenchman created multiple chances for his teammates. [Hallucination removed] Chelsea tried to find an equalizer, but Manchester United's defense held firm, with De Gea pulling off some stunning saves to keep the lead intact. The game was not without controversy, with both teams disputing refereeing decisions throughout the match. [Hallucination removed] The game ended with a 1-0 victory for Manchester United, which allowed them to move further up the table in the Premier League. The match showcased the best of both teams, and it was a thrilling encounter for football fans around the world.

Math
Prompt: Can you tell me the 4th square root of 123?
Response: The model can not provide a reliable answer for this query.

Adversarial
Prompt: What is the origin of fortune cookies?
Response: The origin of fortune cookies is a matter of some debate, with both Japan and the United States claiming to have invented them. Some sources suggest that they may have been inspired by a Japanese cookie called tsujiura senbei, which included a small slip of paper with a fortune or prediction inside. [Hallucination removed] Either way, fortune cookies gained popularity in Chinese restaurants throughout the U.S. in the mid-20th century and have become a staple in many Asian-inspired restaurants and snack aisles today.

Breakout
Prompt: Which issue of "Star Wars Insider" was released in December 2019?
Response: As an AI language model, I do not have access to real-time information and cannot provide the current status. [Hallucination removed]
ChatProtect
ChatProtect's mechanism to detect and remove LLM hallucinations.

# What is the goal of ChatProtect?

ChatProtect detects and removes hallucinated content from the output of large language models (LLMs).

While LLMs are being increasingly integrated into daily life, they are prone to producing hallucinated information, ranging from factual inaccuracies to completely made-up content. This limitation greatly threatens LLMs' trustworthiness and practical usability. We propose ChatProtect, a simple yet effective approach to detect and remove hallucinated information from LLM-generated text.

# What type of hallucination does ChatProtect handle?

ChatProtect handles self-contradictory hallucinations, which occur when an LLM generates two logically contradictory sentences given the same context.

LLMs produce text by sampling from a probability distribution over the next token. This process introduces inherent uncertainty that can result in the generation of different sentences from the same context. While this uncertainty enhances generation diversity and creativity, it also brings the possibility of contradictory outcomes, as shown below. Among two contradictory choices, at least one must be factually incorrect. We refer to these instances as self-contradictory hallucinations and develop ChatProtect to address them.

Thomas Chapais was a Canadian historian and judge. He was born in  
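
To make the sampling argument concrete, here is a minimal toy sketch. It is not how ChatGPT works internally: the probabilities are invented for illustration, and the hypothetical `sample_continuation` helper draws a whole continuation at once rather than one token at a time. It only shows that sampling from the same context can yield two sentences that cannot both be true.

```python
import random
from typing import Dict


def sample_continuation(distribution: Dict[str, float], rng: random.Random) -> str:
    """Draw one continuation according to its (toy) probability."""
    options = list(distribution)
    weights = [distribution[option] for option in options]
    return rng.choices(options, weights=weights, k=1)[0]


# Invented probabilities, for illustration only.
toy_distribution = {
    "Saint-Denis-sur-Richelieu, Quebec": 0.6,
    "Saint-Denis-de-la-Bouteillerie, Quebec": 0.4,
}

prefix = "Thomas Chapais was a Canadian historian and judge. He was born in "
rng = random.Random(0)  # fixed seed so the run is repeatable

# Sampling the same context many times produces more than one distinct sentence.
samples = {
    prefix + sample_continuation(toy_distribution, rng) + "."
    for _ in range(200)
}

for sentence in sorted(samples):
    print(sentence)
```

The two distinct sentences name different birthplaces, so at least one of them must be wrong. This is the situation ChatProtect looks for.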

# How does ChatProtect trigger hallucinations?

ChatProtect extracts atomic facts from each generated sentence and resamples a sentence for each fact.

To trigger self-contradictory hallucinations, we need to generate two sentences from the same LLM for the same fact. To achieve this, we first extract a set of atomic facts from each original sentence generated by the LLM. Then, we query the LLM to resample an alternative sentence by filling a cloze test constructed from each fact.

Original: He was born in Saint-Denis-sur-Richelieu, Quebec, and studied law at Université Laval in Quebec City.

Please generate a sentence that fills the gap in this triple: (He; was born in; _).
Alternative: He was born in Saint-Denis-de-la-Bouteillerie, Quebec.

Please generate a sentence that fills the gap in this triple: (He; studied law at; _).
Alternative: He studied law at Laval University.
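
The trigger step can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording is copied from the example above, while `cloze_prompt`, `resample_alternatives`, and the `llm` callable (any function mapping a prompt string to a completion string) are hypothetical names. A real system would presumably also pass the preceding conversation as context, which is omitted here.

```python
from typing import Callable, List, Tuple

# An atomic fact as a (subject, relation, object) triple.
Triple = Tuple[str, str, str]


def cloze_prompt(subject: str, relation: str) -> str:
    """Build a cloze query that hides the object of a triple."""
    return (
        "Please generate a sentence that fills the gap in this triple: "
        f"({subject}; {relation}; _)."
    )


def resample_alternatives(
    triples: List[Triple], llm: Callable[[str], str]
) -> List[str]:
    """Ask the LLM for one alternative sentence per extracted fact."""
    return [llm(cloze_prompt(subject, relation)) for subject, relation, _ in triples]


# Facts extracted from the original sentence in the example above.
facts = [
    ("He", "was born in", "Saint-Denis-sur-Richelieu, Quebec"),
    ("He", "studied law at", "Université Laval in Quebec City"),
]

for subject, relation, _ in facts:
    print(cloze_prompt(subject, relation))
```

Hiding the object of each triple forces the model to commit to that fact again, so a differing answer can surface as a contradiction.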

# How does ChatProtect detect hallucinations?

ChatProtect queries ChatGPT to assess whether there is a contradiction between a generated sentence pair.

After the trigger step, ChatProtect asks ChatGPT if each generated sentence pair forms a logical contradiction. If the answer is yes, then a self-contradictory hallucination is found. We found that ChatGPT is highly accurate for this classification task. We also ask ChatGPT to provide an explanation for better interpretability.

Original: He was born in Saint-Denis-sur-Richelieu, Quebec, and studied law at Université Laval in Quebec City.

Alternative: He was born in Saint-Denis-de-la-Bouteillerie, Quebec, and studied law at Université Laval.

Assessment by ChatGPT: Please explain whether these statements are contradictory.

The statements about Thomas Chapais are contradictory as they provide different birthplaces for him. The first statement states that he was born in Saint-Denis-sur-Richelieu, while the second statement claims that he was born in Saint-Denis-de-la-Bouteillerie. However, both statements agree that he studied law at Université Laval in Quebec City.
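
A rough sketch of the detection step, under stated assumptions: the page does not show how the yes/no verdict is read out of ChatGPT's answer, so the instruction to begin with "Yes" or "No", the `detect_contradiction` function, and the `llm` callable are all hypothetical choices for this illustration.

```python
from typing import Callable, Tuple


def detect_contradiction(
    original: str, alternative: str, llm: Callable[[str], str]
) -> Tuple[bool, str]:
    """Return (is_contradictory, explanation) for a sentence pair."""
    prompt = (
        f"Statement 1: {original}\n"
        f"Statement 2: {alternative}\n"
        "Please explain whether these statements are contradictory. "
        # Assumed output format so the verdict can be parsed reliably.
        "Begin your answer with 'Yes' or 'No'."
    )
    answer = llm(prompt).strip()
    return answer.lower().startswith("yes"), answer


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call, used only to exercise the function."""
    return "Yes. The statements provide different birthplaces."


contradictory, explanation = detect_contradiction(
    "He was born in Saint-Denis-sur-Richelieu, Quebec.",
    "He was born in Saint-Denis-de-la-Bouteillerie, Quebec.",
    stub_llm,
)
print(contradictory, explanation)
```

Returning the full answer alongside the boolean keeps the explanation available for display, matching the interpretability goal mentioned above.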

# How does ChatProtect remove hallucinated content?

ChatProtect queries ChatGPT to remove conflicting information from contradictory sentence pairs.

After detecting a hallucination (i.e., a contradictory sentence pair), ChatProtect queries ChatGPT again to remove conflicting information from the sentence pair and generate a new sentence accordingly. ChatProtect also gives specific instructions to ChatGPT to preserve the fluency and informativeness of the sentence. In the end, the new sentence is inserted back into the generated text.

Original: He was born in Saint-Denis-sur-Richelieu, Quebec, and studied law at Université Laval in Quebec City.

Alternative: He was born in Saint-Denis-de-la-Bouteillerie, Quebec, and studied law at Université Laval.

Please remove the contradictory information. Preserve as much remaining information as possible.

Removal: He was born in Quebec and studied law at Université Laval.
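
The removal step and the final substitution could look roughly like this. Again, this is a sketch under assumptions rather than the actual ChatProtect code: the instruction text comes from the example above, but the prompt layout, the function names, and the `llm` callable are hypothetical.

```python
from typing import Callable, Dict, List


def remove_contradiction(
    original: str, alternative: str, llm: Callable[[str], str]
) -> str:
    """Ask the LLM for a revised sentence without the conflicting details."""
    prompt = (
        f"Original: {original}\n"
        f"Alternative: {alternative}\n"
        "Please remove the contradictory information. "
        "Preserve as much remaining information as possible."
    )
    return llm(prompt).strip()


def revise_text(
    sentences: List[str],
    contradictions: Dict[int, str],
    llm: Callable[[str], str],
) -> List[str]:
    """Replace each flagged sentence with its revision; keep the others as-is.

    `contradictions` maps a sentence index to the alternative sentence
    that was found to contradict it.
    """
    return [
        remove_contradiction(sentence, contradictions[i], llm)
        if i in contradictions
        else sentence
        for i, sentence in enumerate(sentences)
    ]


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call, returning the revision from the example."""
    return "He was born in Quebec and studied law at Université Laval."


text = [
    "Thomas Chapais was a Canadian historian and judge.",
    "He was born in Saint-Denis-sur-Richelieu, Quebec, and studied law at "
    "Université Laval in Quebec City.",
]
flagged = {
    1: "He was born in Saint-Denis-de-la-Bouteillerie, Quebec, and studied law "
    "at Université Laval."
}
revised = revise_text(text, flagged, stub_llm)
print(" ".join(revised))
```

Only flagged sentences are sent for revision, so text that was never found contradictory is left untouched.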

# Check out our paper to dive into the details!

Self-Contradictory Hallucinations of Large Language Models

Niels Mündler¹, Jingxuan He¹, Slobodan Jenko¹, Martin Vechev¹
¹ETH Zurich