Peeking Behind the Curtain: How to Gauge GPT's Confidence
Discover how to gauge GPT's confidence using log probabilities. This post walks through a Python script that extracts and analyzes probabilities from GPT-4o's responses, helping you understand the certainty behind AI-generated answers.
So, why should you care about GPT's confidence? Think about it: wouldn't it be great to know whether GPT is pretty darn sure about its answer or just taking a wild guess? That kind of insight is especially helpful if you're using AI for important tasks like fact-checking or making decisions.
In this blog post, we'll explore how to peek behind the curtain and gauge OpenAI GPT-4o's confidence using log probabilities, with a step-by-step walkthrough of a Python script.
Introduction
Language models generate text by predicting the next word (or token) in a sequence. During this process, they assign probabilities to possible next tokens. By monitoring these probabilities, we can gauge the model's confidence in its choices.
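To make "assigning probabilities to possible next tokens" concrete, here is a minimal sketch of how raw scores (logits) can be turned into a probability distribution with softmax. The three candidate tokens and their logit values are invented for illustration and do not come from any real model.

```python
import math


def softmax(logits: dict[str, float]) -> dict[str, float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max to avoid overflow in exp()
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


# Hypothetical logits for three candidate next tokens
example_logits = {"true": 4.0, "false": 1.0, "maybe": -2.0}
probs = softmax(example_logits)

for tok, p in probs.items():
    print(f"{tok!r}: {p:.2%}")
```

The token with the highest logit ends up with the largest share of the probability mass, and that share is what we read as "confidence" in the rest of this post.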
In the context of fact-checking or binary decisions (e.g., "true" or "false"), extracting these probabilities can help us understand how confident the model is in its answer. This can be particularly useful when you need to make decisions based on the model's output or when you want to present users with not just an answer but also the confidence level behind it.
Understanding Log Probabilities
Log probabilities are the logarithm of probabilities. Models often work with log probabilities because they are more numerically stable: multiplying many very small probabilities quickly underflows to zero, whereas adding their logarithms does not.
- Probability (p): A value between 0 and 1.
- Log Probability (log(p)): The natural logarithm of the probability.
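The numerical-stability point can be seen in a small experiment. This is only a sketch: the 2,000 probabilities of 0.01 are made-up values chosen to force an underflow, not something a model produced.

```python
import math

# 2,000 hypothetical token probabilities of 0.01 each
token_probs = [0.01] * 2000

# Multiplying them directly underflows to 0.0 in 64-bit floats
product = 1.0
for p in token_probs:
    product *= p

# Summing log probabilities keeps the same information in a usable range
log_sum = sum(math.log(p) for p in token_probs)

print(product)
print(log_sum)
```

The product collapses to zero and all information is lost, while the sum of logs stays an ordinary negative number that can still be compared against other sequences.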
Converting log probabilities back to probabilities involves exponentiating the log probability.
By analyzing log probabilities, we can compare the confidence levels between different tokens or responses.
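As a quick illustration of the conversion, the sketch below exponentiates two log probabilities taken from the sample output later in this post. The helper name `logprob_to_percent` is hypothetical and is not part of the script.

```python
import math


def logprob_to_percent(logprob: float) -> float:
    """Convert a natural-log probability to a percentage, rounded to 2 places."""
    return round(math.exp(logprob) * 100, 2)


pct_true = logprob_to_percent(-0.05496985)
pct_false = logprob_to_percent(-2.9299698)

print(pct_true)   # 94.65
print(pct_false)  # 5.34
```

A log probability close to 0 means a probability close to 100%, and the more negative the value, the less likely the token.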
The Script Overview
The script we'll discuss performs the following actions:
- Sends OpenAI API requests to gpt-4o-2024-08-06 for fact-checking statements.
- Extracts log probabilities to determine the model's confidence in labeling statements as "true" or "false."
- Outputs the confidence percentages.
Key Components Explained
Let's walk through the crucial parts of the script.
Importing Modules and Configuration
import asyncio
import logging
import httpx
from openai import AsyncOpenAI
import random
import os
import tiktoken
import numpy as np
# Configure the logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
c_handler = logging.StreamHandler()
logger.addHandler(c_handler)
# Retrieve OpenAI API credentials
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
OPENAI_API_BASE = os.getenv("OPENAI_API_BASE") or "https://api.openai.com/v1"
# Initialize the OpenAI client
client = AsyncOpenAI(
    api_key=OPENAI_API_KEY,
    base_url=OPENAI_API_BASE,
    http_client=httpx.AsyncClient(http2=True),
    max_retries=0,
    timeout=3600,
)
- Logging: Sets up logging to track the script's progress.
- API Client: Initializes an asynchronous OpenAI API client with HTTP/2 support for better performance.
Token Biasing
# Specify the GPT model to use
model = "gpt-4o-2024-08-06"
# Define the expected JSON responses
preferred_answers = ['{"fact": true}', '{"fact": false}']
# Get the token encoding for the model
encoding = tiktoken.encoding_for_model(model)
# Encode preferred answers to tokens
preferred_tokens = encoding.encode_batch(preferred_answers)
# Extract unique tokens
unique_tokens = set(token for sublist in preferred_tokens for token in sublist)
# Set a positive bias value
logit_bias_value = 50
# Create a token bias dictionary
token_dict = {token: logit_bias_value for token in unique_tokens}
- Preferred Answers: Specifies the two target responses we expect from the model.
- Token Encoding: Uses tiktoken to encode the preferred answers into tokens.
- Logit Bias: Applies a positive bias to the tokens corresponding to the preferred answers to make them more likely.
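To get a feel for what a bias of 50 does, here is a toy sketch that adds the bias to a handful of logits before applying softmax. The token IDs and logit values are hypothetical (real IDs would come from tiktoken), and the sketch only assumes that the bias is added to the logits before sampling.

```python
import math


def softmax(logits: dict[int, float]) -> dict[int, float]:
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())
    exps = {tok: math.exp(value - peak) for tok, value in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical token IDs and logits for four candidate tokens
raw_logits = {101: 2.0, 102: 1.5, 103: 3.0, 104: 0.5}
token_dict = {101: 50, 102: 50}  # bias only the "preferred" tokens

# Add the bias to the matching logits, leave the others untouched
biased_logits = {tok: v + token_dict.get(tok, 0) for tok, v in raw_logits.items()}

before = softmax(raw_logits)
after = softmax(biased_logits)

preferred_mass_before = before[101] + before[102]
preferred_mass_after = after[101] + after[102]
print(f"{preferred_mass_before:.4f} -> {preferred_mass_after:.4f}")
```

In this toy model the preferred tokens absorb practically all of the probability mass, while the ratio between the two preferred tokens stays the same because both receive an identical offset.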
Making the API Request
# List of statements to be fact-checked
statements_to_check = [
    "SEZC means Special Economic Zone Company.",
    "SEZC means Special Economic Zone Corporation.",
]
# Process each statement
for statement_to_check in statements_to_check:
    logger.debug(f"--------")
    logger.debug(f"Checking statement: {statement_to_check}")
    # Request a completion from the OpenAI API
    response = await client.chat.completions.create(
        response_format={"type": "json_object"},
        model=model,
        max_tokens=20,
        logprobs=True,
        top_logprobs=20,
        logit_bias=token_dict,
        stop=["rue", "alse", "}"],
        seed=random.randint(1, 0x7FFFFFFF),
        messages=[
            {
                "role": "system",
                "content": 'Fact check the user\'s message and answer in a JSON object that follows {"fact": boolean}.',
            },
            {"role": "user", "content": statement_to_check},
        ],
        temperature=0.0,
        n=1,
    )
- Statements: A list of statements we want to fact-check.
- API Request: Sends a request to the GPT model with specific parameters:
  - logprobs=True: Requests log probabilities.
  - top_logprobs=20: Retrieves the top 20 token probabilities.
  - logit_bias: Applies the token bias.
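It may help to picture what comes back before reading the parsing code. The sketch below builds a hand-made stand-in for a single entry of choice.logprobs.content, using the attribute names the script reads (token, logprob, top_logprobs). The token strings are simplified and the numbers are copied from the sample output in this post, so treat it as an illustration of the shape rather than a captured API response.

```python
from types import SimpleNamespace as NS

# Hand-built stand-in for one entry of choice.logprobs.content
mock_position = NS(
    token="true",
    logprob=-0.05496985,
    top_logprobs=[
        NS(token="true", logprob=-0.05496985),
        NS(token="false", logprob=-2.9299698),
    ],
)

# Index the alternatives at this position by token text for easy lookup
alternatives = {alt.token: alt.logprob for alt in mock_position.top_logprobs}
print(alternatives)
```

Each generated token carries its own log probability plus a list of the most likely alternatives at that position, which is where the script finds the label the model did not choose.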
Processing the Response
results = []
# Analyze each choice in the API response
for idx, choice in enumerate(response.choices):
    odds = {"true": None, "false": None}
    # Skip choices without log probabilities
    if not choice.logprobs or not choice.logprobs.content:
        continue
    # Extract log probabilities for "true" and "false" tokens
    for content_logprob in choice.logprobs.content:
        token = content_logprob.token.lower()
        logprob = content_logprob.logprob
        for label in ["true", "false"]:
            if label in token and odds[label] is None:
                odds[label] = logprob
                # Find the probability for the opposite label
                other_label = "false" if label == "true" else "true"
                if odds[other_label] is None:
                    sorted_top_logprobs = sorted(
                        content_logprob.top_logprobs,
                        key=lambda x: x.logprob,
                        reverse=True,
                    )
                    for top_logprob in sorted_top_logprobs:
                        if other_label in top_logprob.token.lower():
                            odds[other_label] = top_logprob.logprob
                            break
    # Ensure both probabilities were found
    if odds["true"] is None or odds["false"] is None:
        continue
    # Convert log probabilities to percentages
    odds_of_true_as_percentage = np.round(np.exp(odds["true"]) * 100, 2)
    odds_of_false_as_percentage = np.round(np.exp(odds["false"]) * 100, 2)
    # Compile results
    results.append(
        {
            "choice_index": idx,
            "odds_of_true": odds["true"],
            "odds_of_false": odds["false"],
            "odds_of_true_as_percentage": odds_of_true_as_percentage,
            "odds_of_false_as_percentage": odds_of_false_as_percentage,
        }
    )
# Log the results
for result in results:
    logger.debug(f"Choice {result['choice_index']}:")
    logger.debug(f" odds_of_true: {result['odds_of_true']}")
    logger.debug(f" odds_of_false: {result['odds_of_false']}")
    logger.debug(
        f" odds_of_true_as_percentage: {result['odds_of_true_as_percentage']}%"
    )
    logger.debug(
        f" odds_of_false_as_percentage: {result['odds_of_false_as_percentage']}%"
    )
- Extracting Log Probabilities: Iterates over the response to find the top log probabilities associated with the tokens "true" and "false."
- Calculating Percentages: Converts the log probabilities to percentages to represent confidence levels.
- Output: Logs the confidence percentages for each choice.
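The two percentages the script reports do not have to add up to exactly 100%, because a little probability mass can sit on other tokens. One optional extension, which the script itself does not do, is to renormalize over just the two labels. The function name `normalized_confidence` is hypothetical, and the inputs are the log probabilities from the sample output in this post.

```python
import math


def normalized_confidence(logprob_true: float, logprob_false: float) -> dict[str, float]:
    """Rescale the two label probabilities so that they sum to 1."""
    p_true = math.exp(logprob_true)
    p_false = math.exp(logprob_false)
    total = p_true + p_false
    return {"true": p_true / total, "false": p_false / total}


conf = normalized_confidence(-0.05496985, -2.9299698)
print(f"true: {conf['true']:.2%}, false: {conf['false']:.2%}")
```

Whether to renormalize is a design choice: the raw percentages show how much mass leaked to other tokens, while the normalized ones are easier to present as a single true-versus-false confidence.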
Running the Script
Set API Credentials: Export your OpenAI API key as an environment variable:
export OPENAI_API_KEY='your-api-key'
Set Up Environment: Ensure you have Python 3.11 and install the required packages:
pip3.11 install httpx openai tiktoken numpy
To run the script:
python3.11 your_script_name.py
Sample Output:
--------
Checking statement: SEZC means Special Economic Zone Company.
Choice 0:
odds_of_true: -0.05496985
odds_of_false: -2.9299698
odds_of_true_as_percentage: 94.65%
odds_of_false_as_percentage: 5.34%
--------
Checking statement: SEZC means Special Economic Zone Corporation.
Choice 0:
odds_of_true: -1.8066396
odds_of_false: -0.18163955
odds_of_true_as_percentage: 16.42%
odds_of_false_as_percentage: 83.39%
Perfect! gpt-4o-2024-08-06 is correctly fact checking both of our statements. Now... what if we try to use the smaller gpt-4o-mini-2024-07-18 model?
--------
Checking statement: SEZC means Special Economic Zone Company.
Choice 0:
odds_of_true: -11.62527
odds_of_false: -0.0002702761
odds_of_true_as_percentage: 0.0%
odds_of_false_as_percentage: 99.97%
--------
Checking statement: SEZC means Special Economic Zone Corporation.
Choice 0:
odds_of_true: -10.750117
odds_of_false: -0.00011760922
odds_of_true_as_percentage: 0.0%
odds_of_false_as_percentage: 99.99%
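This shows that the smaller model labels both statements "false" with more than 99.9% confidence, which makes it confidently wrong about the first statement, while the larger model is reasonably confident in providing the correct answer for both.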
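To compare the two runs side by side, the sketch below turns each pair of log probabilities from the sample output into a verdict and a confidence. The `verdict` helper and the short "Company" / "Corporation" keys are hypothetical conveniences; only the numbers come from the output above.

```python
import math

# (model, statement) -> (logprob_true, logprob_false), copied from the sample output
sample_logprobs = {
    ("gpt-4o-2024-08-06", "Company"): (-0.05496985, -2.9299698),
    ("gpt-4o-2024-08-06", "Corporation"): (-1.8066396, -0.18163955),
    ("gpt-4o-mini-2024-07-18", "Company"): (-11.62527, -0.0002702761),
    ("gpt-4o-mini-2024-07-18", "Corporation"): (-10.750117, -0.00011760922),
}


def verdict(logprob_true: float, logprob_false: float) -> tuple[str, float]:
    """Pick the more likely label and return it with its probability."""
    label = "true" if logprob_true > logprob_false else "false"
    return label, math.exp(max(logprob_true, logprob_false))


verdicts = {key: verdict(*lps) for key, lps in sample_logprobs.items()}
for (model, statement), (label, p) in verdicts.items():
    print(f"{model:<24} {statement:<12} {label:<6} {p:.2%}")
```

Laid out this way, the disagreement between the models on the "Company" statement is easy to spot, and so is the fact that high confidence alone says nothing about correctness.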
Conclusion
By accessing the log probabilities from GPT's response, we can quantify the model's confidence in its outputs. This approach is particularly useful for applications like fact-checking, where understanding the certainty behind an answer is crucial.
Full Script Code
Below is the complete script for reference.
#!/usr/bin/env python3.11
# -*- coding: utf-8 -*-
# Author: David Manouchehri
# Import necessary modules
import asyncio # For asynchronous programming and coroutine management
import logging # For structured logging and debugging
import httpx # For making asynchronous HTTP requests with HTTP/2 support
from openai import AsyncOpenAI # Asynchronous client for OpenAI API
import random # For generating random seeds
import os # For accessing environment variables
import tiktoken # For tokenizing text according to OpenAI's encoding schemes
import numpy as np # For numerical computations and array operations
# Configure the logger
logger = logging.getLogger(__name__) # Create a logger instance for this module
logger.setLevel(logging.DEBUG) # Set logging level to capture all messages
# Set up console output for logs
c_handler = logging.StreamHandler() # Create a handler for console output
logger.addHandler(c_handler) # Attach the handler to the logger
# Retrieve OpenAI API credentials from environment variables
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY") # API key for authentication
OPENAI_API_BASE = (
    os.getenv("OPENAI_API_BASE") or "https://api.openai.com/v1"
)  # API base URL, defaulting to OpenAI's if not specified
# Initialize the OpenAI client with custom HTTP settings
client = AsyncOpenAI(
    api_key=OPENAI_API_KEY,
    base_url=OPENAI_API_BASE,
    http_client=httpx.AsyncClient(
        http2=True,  # Enable HTTP/2 for improved performance
        limits=httpx.Limits(
            max_connections=None,  # Allow unlimited concurrent connections
            max_keepalive_connections=None,  # Keep all connections alive
            keepalive_expiry=None,  # Connections never expire
        ),
    ),
    max_retries=0,  # Disable automatic retries
    timeout=3600,  # Set a generous timeout of 1 hour
)
async def main():
    # Specify the GPT model to use
    model = "gpt-4o-2024-08-06"
    # Define the expected JSON responses
    preferred_answers = ['{"fact": true}', '{"fact": false}']
    # Get the token encoding specific to the chosen model
    encoding = tiktoken.encoding_for_model(model)
    # Convert preferred answers to token representations
    preferred_tokens = encoding.encode_batch(preferred_answers)
    # Extract unique tokens from all preferred answers
    unique_tokens = set(token for sublist in preferred_tokens for token in sublist)
    # Set a positive bias value to increase likelihood of preferred tokens
    logit_bias_value = 50
    # Create a dictionary mapping each unique token to the bias value
    token_dict = {token: logit_bias_value for token in unique_tokens}
    # List of statements to be fact-checked
    statements_to_check = [
        "SEZC means Special Economic Zone Company.",
        "SEZC means Special Economic Zone Corporation.",
    ]
    # Process each statement
    for statement_to_check in statements_to_check:
        logger.debug(f"--------")  # Visual separator in logs
        logger.debug(f"Checking statement: {statement_to_check}")
        # Request a completion from the OpenAI API
        response = await client.chat.completions.create(
            response_format={"type": "json_object"},  # Enforce JSON response
            model=model,
            max_tokens=20,  # Limit response length
            logprobs=True,  # Request token probabilities
            top_logprobs=20,  # Get probabilities for top 20 alternative tokens
            logit_bias=token_dict,  # Apply token biases
            stop=["rue", "alse", "}"],  # Early stopping conditions
            seed=random.randint(1, 0x7FFFFFFF),  # Fresh random seed for every request
            messages=[
                {
                    "role": "system",
                    "content": 'Fact check the user\'s message and answer in a JSON object that follows {"fact": boolean}.',
                },
                {"role": "user", "content": statement_to_check},
            ],
            temperature=0.0,  # Use deterministic sampling
            n=1,  # Generate a single completion
        )
        results = []
        # Analyze each choice in the API response
        for idx, choice in enumerate(response.choices):
            odds = {"true": None, "false": None}
            # Skip choices without log probabilities
            if not choice.logprobs or not choice.logprobs.content:
                continue
            # Extract log probabilities for "true" and "false" tokens
            for content_logprob in choice.logprobs.content:
                token = content_logprob.token.lower()
                logprob = content_logprob.logprob
                for label in ["true", "false"]:
                    if label in token and odds[label] is None:
                        odds[label] = logprob
                        # Find the probability for the opposite label
                        other_label = "false" if label == "true" else "true"
                        if odds[other_label] is None:
                            sorted_top_logprobs = sorted(
                                content_logprob.top_logprobs,
                                key=lambda x: x.logprob,
                                reverse=True,
                            )
                            for top_logprob in sorted_top_logprobs:
                                if other_label in top_logprob.token.lower():
                                    odds[other_label] = top_logprob.logprob
                                    break
            # Ensure both probabilities were found
            if odds["true"] is None:
                raise ValueError("odds_of_true is None")
            if odds["false"] is None:
                raise ValueError("odds_of_false is None")
            # Convert log probabilities to percentages
            odds_of_true_as_percentage = np.round(np.exp(odds["true"]) * 100, 2)
            odds_of_false_as_percentage = np.round(np.exp(odds["false"]) * 100, 2)
            # Compile results for this choice
            results.append(
                {
                    "choice_index": idx,
                    "odds_of_true": odds["true"],
                    "odds_of_false": odds["false"],
                    "odds_of_true_as_percentage": odds_of_true_as_percentage,
                    "odds_of_false_as_percentage": odds_of_false_as_percentage,
                }
            )
        # Log the results for each choice
        for result in results:
            logger.debug(f"Choice {result['choice_index']}:")
            logger.debug(f" odds_of_true: {result['odds_of_true']}")
            logger.debug(f" odds_of_false: {result['odds_of_false']}")
            logger.debug(
                f" odds_of_true_as_percentage: {result['odds_of_true_as_percentage']}%"
            )
            logger.debug(
                f" odds_of_false_as_percentage: {result['odds_of_false_as_percentage']}%"
            )
# Script entry point
if __name__ == "__main__":
    asyncio.run(main())  # Execute the main coroutine