Criteria Quality Framework

Use this framework to measure the quality of a given AI prompt. For each question type, the lists below state what the prompt should include. Use them to determine whether the prompt is high quality, mid quality, or low quality:

Yes/No Question Principles

  1. Does the prompt start by identifying the exact part of the conversation the AI should reference to provide the score? For example:
    • “In the greeting, did the user…”
    • “While explaining price, did the user…”
    • “After the AI provided their second objection, did the user…”
  2. After identifying which part of the conversation to reference for scoring, does the prompt ask a clear, well-defined, direct question that can be answered with a “Yes” or “No”?
  3. Does the prompt clearly define actions that would result in a negative score?
  4. Is every potentially ambiguous word or concept important to the scoring explicitly defined?
  5. Should a “Yes” result in a positive score? And should a “No” result in a negative score?
  6. Does the prompt clearly identify the role of the person being scored? Does the prompt also avoid general words for that person, such as “they” or “you”? Positive examples could include:
    • “Did the manager…”
    • “Did the rep…”
    • “Did the CSR…”
  7. Does the prompt provide examples of what would be considered a passing score? (Ignore this principle if the prompt asks the user to avoid doing something.)
  8. (If relevant) Does the prompt include examples of what would result in a negative score?
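
Principle 6 above lends itself to a quick automated check. The following is a minimal sketch, not part of the framework itself: the function name and the word list are assumptions for illustration, and the list contains only the two words the framework names explicitly (“they” and “you”).

```python
import re

# Hypothetical word list: the framework names only "they" and "you" explicitly.
GENERIC_WORDS = {"they", "you"}


def find_generic_references(prompt: str) -> list[str]:
    """Return the generic words found in the prompt, in order of appearance."""
    # Lowercase the prompt and split it into alphabetic words.
    words = re.findall(r"[a-z']+", prompt.lower())
    return [word for word in words if word in GENERIC_WORDS]


# A prompt that names the role passes the check (empty result).
print(find_generic_references("In the greeting, did the rep state the company name?"))
# A prompt that relies on generic words is flagged.
print(find_generic_references("In the greeting, did they introduce you?"))
```

An empty result only shows that the listed generic words are absent. Whether the prompt actually names a role such as “the manager” or “the rep” still needs a human read.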

Range Question Principles

  1. Does the prompt start by identifying the exact part of the conversation the AI should reference to provide the score? For example:
    • “In the greeting, did the user…”
    • “While explaining price, did the user…”
    • “After the AI provided their second objection, did the user…”
  2. After identifying which part of the conversation to reference for scoring, does the prompt ask a clear, well-defined question that can be answered with a number on a scale of 1-5?
  3. Does the prompt clearly define what would result in a score of “5”?
  4. Does the prompt give examples of what would result in a score of “5”?
  5. Does the prompt clearly define what would result in a score of “1”?
  6. Does the prompt give examples of what would result in a score of “1”?
  7. Is every potentially ambiguous word or concept important to the scoring explicitly defined?
  8. Should a score of “5” result in the user getting full points? And should a “1” result in the user getting zero points?
  9. Does the prompt clearly identify the role of the person being scored? Does the prompt also avoid general words for that person, such as “they” or “you”? Positive examples could include:
    • “Did the manager…”
    • “Did the rep…”
    • “Did the CSR…”
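
Principle 8 above fixes only the two endpoints of the scale: a “5” earns full points and a “1” earns zero points. As a rough sketch, the mapping could be expressed as below. The function name is hypothetical, and the linear spacing of the scores 2 through 4 is an assumption; the framework does not say how intermediate scores convert to points.

```python
def range_score_to_fraction(score: int) -> float:
    """Map a 1-5 score to a fraction of full points (1 -> 0.0, 5 -> 1.0)."""
    if not 1 <= score <= 5:
        raise ValueError(f"score must be between 1 and 5, got {score}")
    # Assumption: intermediate scores are spaced linearly between the endpoints.
    return (score - 1) / 4


# Print the fraction of full points earned for every score on the scale.
for score in range(1, 6):
    print(score, range_score_to_fraction(score))
```

Multiplying the returned fraction by the question's point value gives the points awarded under this assumed linear scheme.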