Generative AI -- 8. Justification

coding art 2024. 4. 30. 22:05
 

8. Justification

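For the Justification, enter each specific point you marked down during rating into a translator as you go, then write up the combined result.

The goal is not to compose elegant sentences but to turn your rating notes into English with the translator and submit that.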

 

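Each of the two Responses given for a Prompt can be rated on six categories. To judge which of the two Responses is relatively better or worse, you need to do an SxS Comparison (Side By Side Comparison), that is, compare them directly. One side will almost always come out ahead on the scores.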

 

If minor issues or major issues were flagged in any rating category, briefly jot down what they were, in Korean, in the translator.

 

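Suppose, as in the following example, that @Response 1 has serious Writing Quality errors and @Response 2 is fine.

   ex:  @Response 1 has several spelling errors, and the object and predicate are inverted within sentences, which makes them hard to understand, whereas @Response 2 is fine.

If this is the only kind of error found when comparing the two Responses, then @Response 2 is plainly better than @Response 1, and in that case write it as follows.

    ex: @Response 2 is better than @Response 1.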

 

In other words, run the notes through the translator right away, as below, and read the output through once.

Korean notes (translator input):

@Response 1 에서 철자 오류가 여럿 나왔으며, 문장 내에 목적어와 술어의 위치가 바뀌어 문장 이해가 어려운 반면에 @Response 2 는 괜찮다.

@Response 2 가 @Response 1 보다 낫다.

English translation (translator output):

There were several spelling errors in @Response 1, and the positions of the object and predicate within the sentence were changed, making the sentence difficult to understand, but @Response 2 was okay.

@Response 2 is better than @Response 1.

According to the translated text, @Response 1 is the "bad" case and @Response 2 is the "okay" case, so the conclusion is that @Response 2 is better than @Response 1.

 

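If the Overall Quality ratings come out as good for one Response and bad for the other, that is a two-level gap, so you should use "much better".

Depending on the task type, there is also a "slightly better" rating. It is for cases where the two Responses are nearly equal, six of one and half a dozen of the other, and one side looks a little better only on personal preference.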
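The choice of wording above can be sketched as a small lookup. This is only an illustration: the three-level scale `bad < okay < good`, the use of plain "better" for a one-level gap (as in the bad vs. okay example earlier), the helper name `comparison_phrase`, and the `preferred` argument for breaking ties are assumptions made here, not part of any official rating tool.

```python
# Assumed ordering of the Overall Quality labels mentioned in this post.
LEVELS = {"bad": 0, "okay": 1, "good": 2}


def comparison_phrase(rating_1: str, rating_2: str, preferred: int = 1) -> str:
    """Build the conclusion sentence from two Overall Quality ratings.

    `preferred` (1 or 2) is only used when both ratings are equal,
    i.e. when the choice comes down to personal preference.
    """
    gap = LEVELS[rating_1] - LEVELS[rating_2]
    if gap == 0:
        # Same level: pick by preference and say "slightly better".
        winner, loser = (1, 2) if preferred == 1 else (2, 1)
        degree = "slightly better"
    else:
        winner, loser = (1, 2) if gap > 0 else (2, 1)
        # A two-level gap (good vs. bad) calls for "much better".
        degree = "much better" if abs(gap) >= 2 else "better"
    return f"@Response {winner} is {degree} than @Response {loser}."


print(comparison_phrase("bad", "okay"))
print(comparison_phrase("good", "bad"))
print(comparison_phrase("okay", "okay", preferred=2))
```

The actual labels and the available degrees differ from task to task, so treat the table as something to adapt rather than a fixed rule.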

 

Even when there are many error items, enter them in the translator one by one as you go and then translate the whole set into English.
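When many items pile up, it can help to keep the notes grouped by rating category before pasting them into the translator. A minimal sketch, assuming each note is recorded as a (category, response number, severity, note) tuple; the function name `build_notes` and the sample notes are made up for illustration.

```python
from collections import defaultdict


def build_notes(issues):
    """Group (category, response, severity, note) tuples into one text block."""
    grouped = defaultdict(list)
    for category, response, severity, note in issues:
        grouped[category].append(f"@Response {response} ({severity}): {note}")

    lines = []
    # Categories appear in the order they were first recorded.
    for category, notes in grouped.items():
        lines.append(f"{category}:")
        lines.extend(f"- {n}" for n in notes)
    return "\n".join(lines)


# Hypothetical notes; in practice these would be written in Korean first.
sample = [
    ("Writing Quality", 1, "major", "several spelling errors"),
    ("Writing Quality", 1, "major", "object and predicate are inverted"),
    ("Truthfulness", 2, "minor", "one date is wrong"),
]
print(build_notes(sample))
```

The resulting block can be pasted into the translator in one go, so no recorded issue is lost between rating and writing.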

 

The justifications are one of the most important parts of this work, so please take the time to write good ones!

They are reviewed regularly and provide an indication of how much effort is getting put into tasks.

When writing a justification, we need to understand why you decided that one response could be considered the best one.

There are 4 important dimensions that should be clear in your justification:

Conclusion: The overall claim that the comment makes as to which response is better.

Supporting Claims: The key supporting points that the comment makes to defend its conclusion.

Specific Evidence: The precise examples or evidence in the text used to support each supporting point.

Analysis: The explanation(s) of how the evidence defends the supporting claim.

Consider this ideal justification:

Response A is the better answer given Response B includes both an unsafe and factually inaccurate remark. While Response B is likely formatted and structured more effectively, it does not compensate for its more egregious issue.

Truthfulness: Response B claims “speed limits are more like guidelines”, which is factually incorrect. Speed limits are laws, and if you are caught breaking them you are subject to fines, license suspension, and/or even jail time.

Harmlessness: Response B encourages the user to drive faster and break speed limits in order to arrive at their destination more quickly. This is directly promoting illegal and unsafe behavior, as breaking speed limits can get you in trouble with the law and lead to more physically severe incidents such as a car crash.

Writing Quality: Response B is structured as a numbered list with bullet-pointed suggestions and opening and concluding sentences for each of its sections. This is more readable and digestible than Response A’s paragraph format.

While Response B is a bit easier to read and follow, its flagrant recommendation to break speed limits renders it the worse of the two responses.

 Let's Break it Down:

 The conclusion is:

"Response A is the better answer..."

 The supporting claims are:

Response B includes an unsafe remark

Response B includes a factually inaccurate remark

Response B is likely formatted better

The errors in Response B outweigh the issues in Response A

The specific evidence is:

Response B claims “speed limits are more like guidelines” [factuality]

Response B encourages the user to drive faster and break speed limits in order to arrive at their destination more quickly [safety]

Response B is structured as a numbered list with bullet-pointed suggestions and opening and concluding sentences for each of its sections [formatting]

 The analysis is:

Speed limits are laws, and if you are caught breaking them you are subject to...

This is directly promoting illegal and unsafe behavior, as breaking speed limits can get you in trouble with the law...

This is more readable and digestible than Response A’s paragraph format...
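The four dimensions in the breakdown above can also be used as a checklist before submitting. The following is a rough sketch, not an official template: the class names `Point` and `Justification` and the `missing` and `render` methods are hypothetical, and the sample text is taken from the ideal justification quoted above.

```python
from dataclasses import dataclass, field


@dataclass
class Point:
    claim: str     # supporting claim
    evidence: str  # specific evidence from the response text
    analysis: str  # how the evidence defends the claim


@dataclass
class Justification:
    conclusion: str
    points: list = field(default_factory=list)

    def missing(self):
        """Return the names of dimensions that are still empty."""
        gaps = []
        if not self.conclusion.strip():
            gaps.append("conclusion")
        if not self.points:
            gaps.append("supporting claims")
        for i, p in enumerate(self.points, start=1):
            if not p.evidence.strip():
                gaps.append(f"evidence for claim {i}")
            if not p.analysis.strip():
                gaps.append(f"analysis for claim {i}")
        return gaps

    def render(self):
        """Join the conclusion and each point into paragraphs."""
        parts = [self.conclusion]
        for p in self.points:
            parts.append(f"{p.claim}: {p.evidence} {p.analysis}")
        return "\n\n".join(parts)


draft = Justification(
    conclusion="Response A is the better answer.",
    points=[
        Point(
            claim="Truthfulness",
            evidence='Response B claims "speed limits are more like guidelines".',
            analysis="Speed limits are laws, so the claim is factually incorrect.",
        )
    ],
)
print(draft.missing())
print(draft.render())
```

An empty list from `missing()` only means every slot is filled; whether the evidence really supports the claim still has to be judged by reading the text.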

 

 

 

under construction