
4. Truthfulness (Accuracy) of Responses Produced by Generative AI

coding art 2024. 4. 30. 16:27

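The collection of tips and rules (the rubric) is provided as a Google Doc. Bookmark it in Chrome so you can look at it whenever you need to, and always open it before you start working on a task.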

4. Truthfulness

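Fact-checking a Response generated from a Prompt, that is, scoring its accuracy, is one of the two most heavily weighted scoring items, together with the Instruction Following covered earlier. During scoring, Writing Quality and Verbosity can be scored immediately after a single read-through, and Instruction Following can be scored by going back over the Prompt and thinking a little about whether the Response did what was asked.

Truthfulness (Accuracy), by contrast, must always be checked by searching Google or Naver to confirm that the content is actually supported.

In other words, you must confirm, within 10 minutes, whether what the Response says is true or false. Numerical data needs the same check: is the figure genuine, and how accurate is it in percentage terms? Training tasks very often have wrong figures or dates deliberately slipped in to test whether the rater actually performs the check, so you must never accept a claim just because it fits the flow of the text.

If you still cannot confirm a claim after spending more than 10 minutes on it, treat it as a minor issue or a major issue and deduct points accordingly.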

 

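The key points are covered in the summary above, but the training course works from the English text, so read through it and get to know what it contains.

All of the English text below is provided as a Google Doc, so be sure to bookmark it and open it whenever you need it.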

 

When training state-of-the-art AI models, it’s extremely important to address factual errors or misinformation. LLMs are known to "hallucinate" or make up information that isn't true. It is your job to determine if a model response contains any factual inaccuracies.

 

We think about Truthfulness in two ways:

  1. Verifiable Facts
  2. Misleading Information

1. Verifiable Facts 🚫📝

What’s a Verifiable Fact?

  • A verifiable fact is something that is true (or false) regardless of personal feelings, interpretations, or opinions. It can be confirmed by evidence or observation that is consistent and repeatable. Some examples are:
  • George Washington was the first president of the United States --> true
  • The Earth is flat --> false
  • Spain is a city in Europe --> false

How to Spot a Verifiable Fact

You can identify a verifiable fact by asking yourself: "Is this statement or assertion definitively true or false?" Opinions, perspectives, feelings, etc. are not verifiable facts.

 

To be considered verifiable, a fact generally needs to be:

  1. Objective: The fact should be independent of personal feelings or opinions.
  2. Observable: There should be a way to observe or measure the fact.
  3. Repeatable: The observations or measurements can be repeated by others with consistent results.
  4. Documentable: There should be records or documentation that can be reviewed or checked.

Here are two sentences to consider:

  • "The heart pumps blood through the body." This is a verifiable fact because it can be confirmed by medical science through observation, experiments, and medical imaging.
  • "Paris is the most beautiful city in the world." This is not a verifiable fact but an opinion, as "beauty" is subjective and cannot be objectively measured or verified. (Beauty can be judged, but it cannot be verified.)

How to Validate Whether a Verifiable Fact is True or False

  • Google it! Or use your search engine of choice. Most of the time you will be able to clearly tell whether the fact is true or not. If you can’t validate one way or the other, chances are the statement or assertion is not a verifiable fact. 
  • Check Your Sources: Make sure the information is coming from and confirmed by multiple sources.
  • Be wary of lists: Sometimes a response will contain many verifiable pieces of information. Make sure to check each of them.

2. Misleading Information 🔍💬

 

What is Misleading Information?

We define Misleading Information as anything presented as a verifiable fact that cannot actually be verified. Model responses may present opinions as verifiable facts, or assert as fact something that has no proof of being true (or false).

 

Consider the following examples:

  • "Nike has the best running shoes" --> this is an opinion, since the definition of "best" is subjective.
  • By contrast: “Nike sells the most running shoes” is a statement that could be verified. 
  • "The vegan diet is the healthiest" --> this asserts as a fact something that cannot be wholly verified.
  • By contrast: “Some people believe the vegan diet is healthiest” is a statement that could be verified.

 

How to Spot Them

  • Look for Extreme Words or Superlatives: Words like "always," "never," or "all" can be signs of overgeneralization.
  • Compare and Contrast: See if other reliable sources say the same thing. If they don’t, you might be dealing with misleading info.
  • If You Can't Verify the Statement Quickly: Spend at most 30 seconds checking if the statement is true or false. If you aren’t able to verify the information one way or the other, it is likely a sign that it is not a verifiable fact and that the model may have presented misleading information. 
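As a rough illustration of the first tip, a cue-word scan can be sketched in Python. The cue list and the function name `flag_absolute_claims` are assumptions made for this example and are not part of the rubric. A flagged sentence still needs a human check, since the scan cannot tell an overgeneralization from a legitimate use of the word.

```python
import re

# Hypothetical cue list: the rubric names "always", "never", and "all";
# the remaining words are illustrative additions for this sketch.
CUE_WORDS = {"always", "never", "all", "every", "none", "best", "worst"}


def flag_absolute_claims(text):
    """Return (sentence, cues) pairs for sentences that contain a cue word."""
    flagged = []
    # Naive sentence split on whitespace that follows ., !, or ?
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        # Lowercase whole-word tokens, so "overall" does not match "all"
        words = re.findall(r"[a-z']+", sentence.lower())
        cues = [word for word in words if word in CUE_WORDS]
        if cues:
            flagged.append((sentence, cues))
    return flagged


sample = (
    "Nike has the best running shoes. "
    "Nike sells shoes in many countries. "
    "Exercise always helps."
)
for sentence, cues in flag_absolute_claims(sample):
    print(cues, "->", sentence)
```

A sentence such as "The vegan diet is the healthiest" slips past this list, which is why a scan like this can only be a first pass before the manual check.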

 

Note:

Truthfulness vs. Other Evaluation Criteria

  • A mistake in Truthfulness is WORSE for our purposes than problems with Writing Quality and Verbosity. In general, you should weigh Truthfulness heavily when ranking two responses since Truthfulness mistakes are critical failures of the model. 

 

Truthfulness Rating Examples

1. Objective Fact

Prompt: "List the first five U.S. Presidents and their years in office."

Bad Response: 

  • George Washington (1789-1797)
  • John Adams (1797-1801)
  • Thomas Jefferson (1801-1809)
  • James Madison (1809-1817)
  • Andrew Jackson (1829-1837)
  • 🛑 Andrew Jackson was not the fifth U.S. President; it was James Monroe (1817-1825). This is a significant factual error.
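The item-by-item check that this example calls for can be sketched as follows. This is a minimal sketch, assuming the rater has already looked up a trusted reference list by hand; the names `REFERENCE`, `RESPONSE`, and `find_mismatches` are hypothetical and chosen only for this illustration.

```python
# Reference list compiled by the rater from trusted sources:
# (name, first year in office, last year in office).
REFERENCE = [
    ("George Washington", 1789, 1797),
    ("John Adams", 1797, 1801),
    ("Thomas Jefferson", 1801, 1809),
    ("James Madison", 1809, 1817),
    ("James Monroe", 1817, 1825),
]

# The list as it appears in the Bad Response above.
RESPONSE = [
    ("George Washington", 1789, 1797),
    ("John Adams", 1797, 1801),
    ("Thomas Jefferson", 1801, 1809),
    ("James Madison", 1809, 1817),
    ("Andrew Jackson", 1829, 1837),
]


def find_mismatches(response, reference):
    """Compare entries position by position.

    Returns (position, got, expected) for every entry that differs.
    Positions are 1-based. Entries beyond the shorter list are ignored,
    so a length difference has to be checked separately.
    """
    return [
        (position, got, expected)
        for position, (got, expected) in enumerate(zip(response, reference), start=1)
        if got != expected
    ]


for position, got, expected in find_mismatches(RESPONSE, REFERENCE):
    print(f"Item {position}: response says {got}, reference says {expected}")
```

Checking every entry, rather than sampling a few, matters here because the first four items are correct and only the last one is wrong.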

 

2. Misleading Information

Prompt: "What are the health benefits of consuming sugar?"

Bad Response: "Eating sugar is healthy for your overall health and provides necessary fuel for effective brain and muscle functioning" ❌

  • This statement is misleading because it fails to qualify the source and amount of sugar. While sugars are healthy when eaten in appropriate amounts, excess sugar intake is generally associated with negative health impacts, such as increased risk of obesity, diabetes, and heart disease.

 

Truthfulness Quiz

Rate the truthfulness of Response A according to the rubric.

---

Prompt:

Is there an actual rule that women cannot play in the NFL?

 

Response A:

Yes, in the NFL, there is a code prohibiting women from playing.

---

Article XVII, Section 9A of the National Football League Handbook:

"In the interest of maintaining the traditional structure and competitive standards of the National Football League (NFL), it is hereby decreed under Article XVII, Section 9A, that the participation as players in official league games shall be exclusively reserved for individuals of the male gender. This measure ensures the continuation of historical standards of physical competition and athletic performance benchmarks set forth since the league's inception.

 

Furthermore, the NFL reaffirms its commitment to gender equality and inclusivity in all non-competitive roles, including coaching, management, and administrative positions within the league and its associated teams."

---

The NFL has a specific clause that restricts player roles based on gender while also highlighting an attempt to balance this with inclusivity in other aspects of the sport.

 

 

Response B:

In the NFL, there is no rule that explicitly prohibits women from playing. According to information from a source focused on the NFL, women are not restricted from participating in the league by any specific rule. This means that, theoretically, women could compete in the NFL if they meet the same standards and criteria that apply to male players. However, it's important to note that, as of now, no woman has ever been listed on an NFL team's roster for regular season games. This lack of representation could be due to various factors, including physical differences and the historical development of the sport as predominantly male.

 

 

Truthfulness Quiz - Follow-Up

Below is an example of how someone might explain the rating/ranking of the two responses

---

Response B is much better than Response A, since the latter has an inaccurate, made-up policy. The clause cited does not exist and is not part of NFL guidelines. Response B is correct in that no explicit rule prevents women from competing in the NFL.