LLM Provider Issues
Solutions for AI model access, API errors, and response handling problems
Table of Contents
- Overview
- API Key Issues
- Rate Limiting
- Model Availability
- Response Errors
- Provider-Specific Issues
- Configuration Issues
- See Also
Overview
LLM provider issues typically fall into these categories:
| Category | Symptoms | Common Causes |
|---|---|---|
| Authentication | 401/403 errors | Invalid or missing API key |
| Rate Limits | 429 errors | Too many requests |
| Model | Model not found | Wrong model name, unavailable |
| Response | Parsing errors | Unexpected response format |
| Configuration | Connection errors | Wrong endpoint, provider settings |
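The table's HTTP status codes can be folded into a quick triage helper. A minimal sketch (the function name and category strings are illustrative, not from any provider SDK):

```python
def classify_llm_error(status_code: int) -> str:
    """Map an HTTP status code to the troubleshooting categories above."""
    if status_code in (401, 403):
        return "authentication"
    if status_code == 429:
        return "rate_limit"
    if status_code == 404:
        return "model"
    return "unknown"
```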
API Key Issues
Invalid API Key
Symptom:
AuthenticationError: Invalid API key provided
Solutions:
- Verify API key is set:
  ```bash
  # Check environment variable
  echo $OPENAI_API_KEY
  # Or for compatible APIs
  echo $OPENAI_COMPATIBLE_API_KEY
  ```
- Check key format:
  ```bash
  # OpenAI keys start with "sk-"; ensure no extra whitespace
  echo "$OPENAI_API_KEY" | cat -A
  ```
- Set in .env file:
  ```env
  OPENAI_API_KEY=sk-your-key-here
  ```
- Test API key:
  ```bash
  curl https://api.openai.com/v1/models \
    -H "Authorization: Bearer $OPENAI_API_KEY"
  ```
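The format checks above can also run in code before the first API call. A minimal sketch (heuristic only; `looks_like_openai_key` is a hypothetical helper, and exact key formats vary by provider):

```python
def looks_like_openai_key(key: str) -> bool:
    """Rough sanity check: OpenAI keys start with "sk-", have no
    embedded whitespace, and are longer than 20 characters."""
    key = key.strip()  # tolerate stray whitespace from copy/paste
    return key.startswith("sk-") and " " not in key and len(key) > 20
```

This catches the most common mistakes (placeholder values, truncated keys, pasted whitespace) without making a network call.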
API Key Not Found
Symptom:
ValueError: OPENAI_API_KEY environment variable not set
Solutions:
- Export in shell:
  ```bash
  export OPENAI_API_KEY="sk-your-key-here"
  ```
- Add to .env file:
  ```env
  # In project root .env
  OPENAI_API_KEY=sk-your-key-here
  ```
- Load .env in Python:
  ```python
  from dotenv import load_dotenv
  load_dotenv()
  ```
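To surface this error at startup rather than on the first request, a fail-fast lookup can help. A sketch (the `require_env` helper is illustrative):

```python
import os

def require_env(name: str) -> str:
    """Fail fast with a clear error if a required variable is missing or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise ValueError(f"{name} environment variable not set")
    return value
```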
Key Permissions
Symptom:
PermissionError: API key does not have access to this model
Solutions:
- Check API key permissions in the provider dashboard
- Verify model access:
  ```bash
  curl https://api.openai.com/v1/models \
    -H "Authorization: Bearer $OPENAI_API_KEY" | jq '.data[].id'
  ```
- Use a model you have access to:
  ```env
  # Instead of gpt-4
  OPENAI_MODEL=gpt-3.5-turbo
  ```
Rate Limiting
Too Many Requests
Symptom:
RateLimitError: Rate limit exceeded. Please retry after X seconds.
Solutions:
- Implement exponential backoff:
  ```python
  from tenacity import retry, wait_exponential, stop_after_attempt

  @retry(wait=wait_exponential(min=1, max=60), stop=stop_after_attempt(5))
  def call_llm(prompt):
      return llm.invoke(prompt)
  ```
- Reduce request frequency:
  ```python
  import time

  for item in items:
      result = llm.invoke(item)
      time.sleep(1)  # Add delay between requests
  ```
- Use batch processing:
  ```python
  # Process in batches with delays
  batch_size = 10
  for i in range(0, len(items), batch_size):
      batch = items[i:i + batch_size]
      results = process_batch(batch)
      time.sleep(5)  # Delay between batches
  ```
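If you prefer not to depend on tenacity, the same retry-with-backoff pattern can be written by hand. A sketch (the `with_backoff` helper is hypothetical; `fn` stands in for your LLM call):

```python
import random
import time

def with_backoff(fn, max_retries=5, base=1.0, cap=60.0):
    """Retry fn() on any exception, doubling the delay each attempt
    (capped at `cap` seconds) plus random jitter to avoid thundering herds."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise  # out of retries; propagate the last error
            delay = min(cap, base * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay))
```

Usage: `with_backoff(lambda: llm.invoke(prompt))`. In production you would catch only transient errors (429/5xx) rather than bare `Exception`.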
Token Limit Exceeded
Symptom:
InvalidRequestError: This model's maximum context length is 8192 tokens
Solutions:
- Truncate input:
  ```python
  def truncate_messages(messages, max_tokens=6000):
      # Keep system message and recent messages
      return messages[:1] + messages[-5:]
  ```
- Use a model with larger context:
  ```env
  # 128k context
  OPENAI_MODEL=gpt-4-turbo
  ```
- Summarize conversation history:
  ```python
  # Periodically summarize old messages
  if len(messages) > 20:
      summary = summarize(messages[:-5])
      messages = [summary] + messages[-5:]
  ```
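The fixed slice in step 1 can drop messages that would still fit, or keep ones that don't. A budget-based variant (a sketch; token counts are estimated at roughly 4 characters per token, so swap in a real tokenizer such as tiktoken for exact counts):

```python
def truncate_history(messages, max_tokens=6000, chars_per_token=4):
    """Keep the system message plus as many recent messages as fit the budget.

    Messages are dicts with a "content" string. Token counts are estimated
    as len(content) / chars_per_token, which is rough but dependency-free.
    """
    budget = max_tokens * chars_per_token  # budget in characters
    system, rest = messages[:1], messages[1:]
    used = sum(len(m["content"]) for m in system)
    kept = []
    for msg in reversed(rest):  # walk newest-first
        if used + len(msg["content"]) > budget:
            break
        kept.append(msg)
        used += len(msg["content"])
    return system + list(reversed(kept))  # restore chronological order
```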
Quota Exceeded
Symptom:
QuotaExceededError: You have exceeded your monthly quota
Solutions:
- Check usage in the provider dashboard
- Upgrade your plan or add credits
- Use a different provider temporarily:
  ```env
  OPENAI_COMPATIBLE_BASE_URL=https://alternative-api.com/v1
  OPENAI_COMPATIBLE_API_KEY=your-alternative-key
  ```
Model Availability
Model Not Found
Symptom:
NotFoundError: The model 'gpt-4' does not exist
Solutions:
- Check model name:
  ```bash
  # List available models
  curl https://api.openai.com/v1/models \
    -H "Authorization: Bearer $OPENAI_API_KEY" | jq '.data[].id'
  ```
- Use the correct model identifier:
  ```env
  # Correct names
  OPENAI_MODEL=gpt-4
  OPENAI_MODEL=gpt-4-turbo
  OPENAI_MODEL=gpt-3.5-turbo
  ```
- Check provider-specific model names:
  ```env
  # For Azure
  AZURE_DEPLOYMENT_NAME=your-deployment-name
  # For compatible APIs
  OPENAI_COMPATIBLE_MODEL=your-model-name
  ```
Model Deprecated
Symptom:
DeprecationWarning: Model 'gpt-3.5-turbo-0301' is deprecated
Solutions:
- Update to current model:
  ```env
  # Old
  OPENAI_MODEL=gpt-3.5-turbo-0301
  # New
  OPENAI_MODEL=gpt-3.5-turbo
  ```
- Check the deprecation schedule in the provider docs
Model Overloaded
Symptom:
ServiceUnavailableError: The model is currently overloaded
Solutions:
- Retry with backoff:
  ```python
  from tenacity import retry, wait_exponential, stop_after_attempt

  @retry(wait=wait_exponential(min=1, max=60), stop=stop_after_attempt(3))
  def call_llm(prompt):
      return llm.invoke(prompt)
  ```
- Use a fallback model:
  ```python
  try:
      result = gpt4.invoke(prompt)
  except ServiceUnavailableError:
      result = gpt35.invoke(prompt)
  ```
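The try/except fallback above generalizes to an ordered list of models. A sketch (models are plain callables here; with LangChain you would pass bound methods such as `gpt4.invoke`):

```python
def invoke_with_fallback(models, prompt):
    """Try each model callable in order; re-raise the last error if all fail."""
    last_error = None
    for model in models:
        try:
            return model(prompt)
        except Exception as e:
            last_error = e  # remember the failure and try the next model
    raise last_error
```

In production, catch only transient errors (e.g. `ServiceUnavailableError`) so genuine bugs still surface immediately.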
Response Errors
JSON Parsing Error
Symptom:
JSONDecodeError: Expecting value: line 1 column 1
Solutions:
- Check response format:
  ```python
  response = llm.invoke(prompt)
  print(f"Raw response: {response}")
  ```
- Handle non-JSON responses:
  ```python
  import json

  try:
      data = json.loads(response.content)
  except json.JSONDecodeError:
      # Fall back to treating it as plain text
      data = {"text": response.content}
  ```
- Use structured output:
  ```python
  from langchain_core.output_parsers import JsonOutputParser

  parser = JsonOutputParser()
  chain = llm | parser
  ```
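LLM replies often wrap JSON in markdown fences or surrounding prose, which breaks a bare `json.loads`. A tolerant parser sketch (the `parse_llm_json` name is illustrative):

```python
import json
import re

def parse_llm_json(text: str):
    """Parse JSON from an LLM reply, tolerating ```json fences and extra prose."""
    try:
        return json.loads(text)  # fast path: the reply is already clean JSON
    except json.JSONDecodeError:
        pass
    # Prefer the contents of a fenced code block, then fall back to
    # the first {...} span anywhere in the text.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    candidate = fenced.group(1) if fenced else None
    if candidate is None:
        brace = re.search(r"\{.*\}", text, re.DOTALL)
        candidate = brace.group(0) if brace else None
    if candidate is None:
        raise ValueError("no JSON object found in response")
    return json.loads(candidate)
```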
Structured Output Failure
Symptom:
OutputParserException: Could not parse LLM output
Solutions:
- Check if provider supports structured output:
  ```python
  # Only some providers (e.g. OpenAI and Azure) support with_structured_output
  if provider in ["openai", "azure"]:
      llm = llm.with_structured_output(MySchema)
  else:
      # Fall back to JSON parsing
      llm = llm | JsonOutputParser()
  ```
- Improve the prompt to enforce formatting:
  ```python
  prompt = """Return your response as valid JSON with this structure:
  {
      "field1": "value",
      "field2": "value"
  }
  """
  ```
Incomplete Response
Symptom: Response cuts off mid-sentence
Solutions:
- Increase max_tokens:
  ```python
  llm = ChatOpenAI(max_tokens=4096)
  ```
- Check the finish_reason:
  ```python
  response = llm.invoke(prompt)
  if response.response_metadata.get("finish_reason") == "length":
      # Response was truncated
      pass
  ```
Streaming Errors
Symptom: Streaming stops unexpectedly
Solutions:
- Handle stream interruptions:
  ```python
  import logging

  from langchain_core.messages import AIMessage

  logger = logging.getLogger(__name__)

  async def stream_with_recovery(prompt):
      try:
          async for chunk in llm.astream(prompt):
              yield chunk
      except Exception as e:
          logger.error(f"Stream error: {e}")
          yield AIMessage(content="[Stream interrupted]")
  ```
- Set appropriate timeouts:
  ```python
  llm = ChatOpenAI(request_timeout=120)
  ```
Provider-Specific Issues
OpenAI Issues
Common problems:
- Organization ID required:
  ```env
  OPENAI_ORGANIZATION=org-your-org-id
  ```
- API version mismatch:
  ```bash
  # Use the latest client
  pip install --upgrade openai
  ```
Azure OpenAI Issues
Common problems:
- Endpoint configuration:
  ```env
  AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
  AZURE_OPENAI_API_KEY=your-key
  AZURE_DEPLOYMENT_NAME=your-deployment
  ```
- API version:
  ```env
  AZURE_OPENAI_API_VERSION=2024-02-15-preview
  ```
OpenAI-Compatible APIs
Common problems:
- Base URL configuration:
  ```env
  OPENAI_COMPATIBLE_BASE_URL=https://api.provider.com/v1
  OPENAI_COMPATIBLE_API_KEY=your-key
  OPENAI_COMPATIBLE_MODEL=model-name
  ```
- Feature compatibility:
  ```python
  # Some features may not be supported
  # Disable structured output for incompatible APIs
  use_structured = provider in ["openai", "azure"]
  ```
Local Models (Ollama)
Common problems:
- Ollama not running:
  ```bash
  # Start Ollama
  ollama serve
  # Check status
  curl http://localhost:11434/api/tags
  ```
- Model not pulled:
  ```bash
  ollama pull llama2
  ```
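Both checks can be combined in code. A standard-library-only sketch (it assumes Ollama's `/api/tags` endpoint returns `{"models": [{"name": ...}]}`, which may change between Ollama versions):

```python
import json
import urllib.request

def ollama_models(base_url="http://localhost:11434"):
    """Return locally pulled Ollama model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=3) as resp:
            data = json.load(resp)
        return [m["name"] for m in data.get("models", [])]
    except OSError:
        # Connection refused / timeout: Ollama is likely not running
        return None
```

A `None` result means "run `ollama serve`"; an empty list means the server is up but no model has been pulled yet.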
Configuration Issues
Wrong Provider Settings
Symptom: Connection errors or unexpected behavior
Solutions:
- Check configuration in .env:
  ```env
  # For OpenAI
  LLM_PROVIDER=openai
  OPENAI_API_KEY=sk-...
  OPENAI_MODEL=gpt-4

  # For Azure
  LLM_PROVIDER=azure
  AZURE_OPENAI_ENDPOINT=https://...
  AZURE_OPENAI_API_KEY=...
  AZURE_DEPLOYMENT_NAME=...

  # For compatible APIs
  LLM_PROVIDER=openai_compatible
  OPENAI_COMPATIBLE_BASE_URL=https://...
  OPENAI_COMPATIBLE_API_KEY=...
  OPENAI_COMPATIBLE_MODEL=...
  ```
- Verify settings are loaded:
  ```python
  from OHMind_agent.config import get_settings

  settings = get_settings()
  print(settings.get_llm_config())
  ```
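A quick preflight can report exactly which variables are missing for the configured provider. A sketch using the variable names from this guide (OHMind's actual required settings may differ):

```python
# Required environment variables per provider, per this guide
REQUIRED_VARS = {
    "openai": ["OPENAI_API_KEY", "OPENAI_MODEL"],
    "azure": [
        "AZURE_OPENAI_ENDPOINT",
        "AZURE_OPENAI_API_KEY",
        "AZURE_DEPLOYMENT_NAME",
    ],
    "openai_compatible": [
        "OPENAI_COMPATIBLE_BASE_URL",
        "OPENAI_COMPATIBLE_API_KEY",
        "OPENAI_COMPATIBLE_MODEL",
    ],
}

def missing_vars(provider, env):
    """List required variables that are unset or empty for a provider."""
    return [v for v in REQUIRED_VARS.get(provider, []) if not env.get(v)]
```

Call it with `missing_vars(os.environ.get("LLM_PROVIDER", "openai"), os.environ)` before constructing any client.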
SSL/TLS Errors
Symptom:
SSLError: certificate verify failed
Solutions:
- Update certificates:
  ```bash
  pip install --upgrade certifi
  ```
- For development only (not recommended for production):
  ```python
  import ssl
  ssl._create_default_https_context = ssl._create_unverified_context
  ```
Proxy Issues
Symptom: Connection timeout behind corporate proxy
Solutions:
- Set proxy environment variables:
  ```bash
  export HTTP_PROXY=http://proxy:port
  export HTTPS_PROXY=http://proxy:port
  ```
- Configure in Python:
  ```python
  import os

  os.environ["HTTP_PROXY"] = "http://proxy:port"
  os.environ["HTTPS_PROXY"] = "http://proxy:port"
  ```
Diagnostic Script
```bash
#!/bin/bash
# llm_diagnostics.sh
echo "=== LLM Provider Diagnostics ==="

# Check environment variables
echo ""
echo "1. Environment Variables:"
echo "   LLM_PROVIDER: ${LLM_PROVIDER:-not set}"
echo "   OPENAI_API_KEY: ${OPENAI_API_KEY:+set (${#OPENAI_API_KEY} chars)}"
echo "   OPENAI_MODEL: ${OPENAI_MODEL:-not set}"

# Test OpenAI API
echo ""
echo "2. API Connectivity:"
if [ -n "$OPENAI_API_KEY" ]; then
    response=$(curl -s -o /dev/null -w "%{http_code}" \
        https://api.openai.com/v1/models \
        -H "Authorization: Bearer $OPENAI_API_KEY")
    if [ "$response" = "200" ]; then
        echo "   ✅ OpenAI API accessible"
    else
        echo "   ❌ OpenAI API returned HTTP $response"
    fi
else
    echo "   ⚠️ OPENAI_API_KEY not set"
fi

# Test model availability
echo ""
echo "3. Model Availability:"
if [ -n "$OPENAI_API_KEY" ]; then
    models=$(curl -s https://api.openai.com/v1/models \
        -H "Authorization: Bearer $OPENAI_API_KEY" | \
        jq -r '.data[].id' 2>/dev/null | grep -E "gpt-4|gpt-3.5" | head -5)
    if [ -n "$models" ]; then
        echo "   Available models:"
        echo "$models" | sed 's/^/   - /'
    else
        echo "   ❌ Could not list models"
    fi
fi

echo ""
echo "=== Diagnostics Complete ==="
```
See Also
- Troubleshooting Overview - Main troubleshooting guide
- Configuration Reference - Environment setup
- LLM Providers - Provider configuration
- Backend API - API documentation
*Last updated: 2025-12-23 | OHMind v0.1.0*