API Reference - AvaTar

Base URLs
Authentication
Core Endpoints
- Session Management
- Conversational AI
WebSocket Events
Utility Endpoints
Error Responses
SDK Examples
Best Practices

🔗 Base URLs

| Environment | API Base URL | WebSocket URL |
|---|---|---|
| Local Development | `http://localhost:8000/v1` | `ws://localhost:8001/ws` |
| Production | `https://api.yourdomain.com/v1` | `wss://ws.yourdomain.com/ws` |

🔐 Authentication

Note: Authentication is currently optional for development. Production deployments should implement proper authentication.

// Example: Bearer token authentication (when implemented)
const headers = {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json'
};
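
The same headers can be sketched in Python. This is a minimal illustration, not part of any SDK: the function name `build_headers` is hypothetical, and the Bearer scheme is taken from the example above, which applies only once authentication is implemented.

```python
from typing import Dict, Optional


def build_headers(api_key: Optional[str] = None) -> Dict[str, str]:
    """Return JSON request headers, adding a Bearer token when a key is given."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: the server accepts the Bearer scheme shown above.
        headers["Authorization"] = f"Bearer {api_key}"
    return headers
```

Calling `build_headers()` without a key matches the current development setup, where authentication is optional.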

📍 Core Endpoints

Session Management

POST /v1/streaming.new

Create a new avatar session

Request Body

{
    "quality": "medium",      // "low" | "medium" | "high"
    "avatar_id": "avatar_1",  // Optional, defaults to "avatar_1"
    "voice_id": "EXAVITQu4vr4xnSDxMaL"  // Optional ElevenLabs voice ID
}

Response

{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "ice_servers": [
        {
            "urls": "stun:stun.l.google.com:19302"
        }
    ],
    "access_token": "eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9..."
}
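
A request for this endpoint could be assembled as follows. This is a sketch using only the Python standard library; the helper name `build_new_session_request` is hypothetical, and the client-side check of `quality` is an assumption based on the three values listed in the request body above.

```python
import json
import urllib.request
from typing import Optional

VALID_QUALITIES = ("low", "medium", "high")


def build_new_session_request(
    base_url: str,
    quality: str = "medium",
    avatar_id: Optional[str] = None,
    voice_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /streaming.new."""
    if quality not in VALID_QUALITIES:
        raise ValueError(f"quality must be one of {VALID_QUALITIES}")
    payload = {"quality": quality}
    # Optional fields are omitted so the server applies its defaults.
    if avatar_id is not None:
        payload["avatar_id"] = avatar_id
    if voice_id is not None:
        payload["voice_id"] = voice_id
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/streaming.new",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request needs a running server, for example:
# with urllib.request.urlopen(build_new_session_request("http://localhost:8000/v1")) as resp:
#     session = json.load(resp)  # documented keys: session_id, ice_servers, access_token
```

Building the request separately from sending it keeps the payload easy to inspect and test without a server.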

POST /v1/streaming.start

Start avatar streaming for a session

Request Body

{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "sdp": {
        "type": "offer",
        "sdp": "v=0\r\no=- 46117..."
    }
}

Response

{
    "status": "started",
    "sdp": {
        "type": "answer",
        "sdp": "v=0\r\no=- 46117..."
    }
}

POST /v1/streaming.stop

Stop avatar streaming

Request Body

{
    "session_id": "550e8400-e29b-41d4-a716-446655440000"
}

Response

{
    "status": "stopped",
    "duration": 125.4  // Session duration in seconds
}

GET /v1/streaming.sessions

List all active sessions

Response

{
    "sessions": [
        {
            "session_id": "550e8400-e29b-41d4-a716-446655440000",
            "status": "active",
            "created_at": "2024-01-22T10:30:00Z",
            "duration": 45.2
        }
    ],
    "total": 1
}

Conversational AI

POST /v1/chat.completions

Send a message to the AI avatar

Request Body

{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "messages": [
        {
            "role": "user",
            "content": "Hello, how are you today?"
        }
    ],
    "stream": true,  // Enable streaming response
    "model": "gpt-4"  // Optional, defaults to configured model
}

Response (Streaming)

data: {"choices":[{"delta":{"content":"Hello!"}}]}
data: {"choices":[{"delta":{"content":" I'm"}}]}
data: {"choices":[{"delta":{"content":" doing"}}]}
data: [DONE]
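
The streamed lines above can be reassembled into the full reply with a small parser. This is a minimal sketch that assumes each chunk arrives as one `data:` line and that the stream ends with `data: [DONE]`, as in the sample; the function name `iter_deltas` is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield content fragments from `data:` lines until the [DONE] sentinel."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


sample = [
    'data: {"choices":[{"delta":{"content":"Hello!"}}]}',
    'data: {"choices":[{"delta":{"content":" I\'m"}}]}',
    'data: {"choices":[{"delta":{"content":" doing"}}]}',
    "data: [DONE]",
]
reply = "".join(iter_deltas(sample))  # "Hello! I'm doing"
```

In a real client, `lines` would be the decoded lines of the HTTP response body rather than a list.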

POST /v1/avatar.speak

Make the avatar speak specific text

Request Body

{
    "session_id": "550e8400-e29b-41d4-a716-446655440000",
    "text": "Welcome to our interactive AI avatar system!",
    "voice_id": "EXAVITQu4vr4xnSDxMaL",  // Optional
    "voice_settings": {  // Optional
        "stability": 0.75,
        "similarity_boost": 0.75
    }
}

Response

{
    "task_id": "task_123456",
    "status": "processing",
    "duration_estimate": 2.5  // Estimated speech duration
}

🔄 WebSocket Events

Connection

// Connect to WebSocket
const ws = new WebSocket('ws://localhost:8001/ws/SESSION_ID');

// Connection established
ws.onopen = () => {
    console.log('Connected to avatar stream');
};
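
The connection URL in the example above is the WebSocket base URL followed by the session ID. A sketch of building it in Python, assuming that path layout holds for every environment (the helper name `session_ws_url` is hypothetical):

```python
from urllib.parse import quote


def session_ws_url(ws_base: str, session_id: str) -> str:
    """Join the WebSocket base URL and a session ID into a connection URL."""
    if not ws_base.startswith(("ws://", "wss://")):
        raise ValueError("ws_base must start with ws:// or wss://")
    # Percent-encode the ID so unexpected characters cannot alter the path.
    return f"{ws_base.rstrip('/')}/{quote(session_id, safe='')}"
```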

Client → Server Events

audio_buffer_append

Send audio data to be spoken by the avatar

{
    "type": "audio_buffer_append",
    "data": {
        "audio": "base64_encoded_audio_data"
    }
}

audio_buffer_commit

Commit the audio buffer and start playback

{
    "type": "audio_buffer_commit"
}

audio_buffer_clear

Clear the audio buffer

{
    "type": "audio_buffer_clear"
}
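
The append and commit events above are typically used together: audio is sent in pieces, then committed once. The sketch below generates that message sequence. The 16 KiB chunk size and the function name `audio_messages` are assumptions; the expected audio encoding is not specified in this reference.

```python
import base64
import json
from typing import Iterator


def audio_messages(audio: bytes, chunk_size: int = 16_384) -> Iterator[str]:
    """Yield JSON text messages: one append per chunk, then a single commit."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(audio), chunk_size):
        chunk = audio[start:start + chunk_size]
        yield json.dumps({
            "type": "audio_buffer_append",
            "data": {"audio": base64.b64encode(chunk).decode("ascii")},
        })
    # Committing starts playback of everything appended so far.
    yield json.dumps({"type": "audio_buffer_commit"})
```

Each yielded string can be passed to the WebSocket client's send method in order.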

start_listening

Start listening for user speech

{
    "type": "start_listening"
}

stop_listening

Stop listening for user speech

{
    "type": "stop_listening"
}

interrupt

Interrupt current avatar speech

{
    "type": "interrupt"
}

Server → Client Events

frame

Video frame data

{
    "type": "frame",
    "data": "base64_encoded_jpeg",
    "frame_id": 12345,
    "timestamp": 1705920123.456
}

audio

Audio data to be played

{
    "type": "audio",
    "data": "base64_encoded_wav",
    "duration": 2.5,
    "sample_rate": 24000
}

chat

Chat messages (user/agent)

{
    "type": "chat",
    "role": "agent",
    "content": "Hello! How can I help you today?"
}

status

Status updates

{
    "type": "status",
    "status": "listening",  // "listening" | "speaking" | "idle"
    "message": "Avatar is now listening..."
}

error

Error messages

{
    "type": "error",
    "code": "AUDIO_PROCESSING_ERROR",
    "message": "Failed to process audio input"
}
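
A client usually routes incoming messages by their `type` field. The following sketch decodes the five server events documented above into simple Python values. The function name `decode_event` and the shape of the return values are illustrative choices, not part of the API.

```python
import base64
import json
from typing import Any, Dict, Tuple


def decode_event(message: str) -> Tuple[str, Any]:
    """Parse one server message and return (event_type, decoded_payload)."""
    event: Dict[str, Any] = json.loads(message)
    kind = event.get("type")
    if kind in ("frame", "audio"):
        # Binary payloads arrive base64-encoded in the "data" field.
        return kind, base64.b64decode(event["data"])
    if kind == "chat":
        return kind, (event["role"], event["content"])
    if kind == "status":
        return kind, event["status"]
    if kind == "error":
        return kind, (event["code"], event["message"])
    # Unrecognized types are passed through so callers can log them.
    return "unknown", event
```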

🛠️ Utility Endpoints

GET /health

Health check endpoint

Response

{
    "status": "healthy",
    "version": "1.0.0",
    "uptime": 3600,
    "services": {
        "api": "healthy",
        "websocket": "healthy",
        "redis": "healthy",
        "gpu": "available"
    }
}
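
A monitoring script might reduce this response to a list of failing services. The sketch below assumes that "healthy" and "available" are the only values that indicate a working service, which this reference does not state explicitly; `unhealthy_services` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Assumption: any other value means the service needs attention.
OK_STATES = {"healthy", "available"}


def unhealthy_services(health: Dict[str, Any]) -> List[str]:
    """Return the sorted names of services whose state is not in OK_STATES."""
    services = health.get("services", {})
    return sorted(name for name, state in services.items() if state not in OK_STATES)
```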

GET /metrics

System metrics

Response

{
    "active_sessions": 5,
    "total_sessions": 142,
    "cpu_usage": 45.2,
    "memory_usage": 62.8,
    "gpu_usage": 38.5,
    "websocket_connections": 5,
    "redis_memory": "245MB",
    "average_latency": 125  // milliseconds
}

❌ Error Responses

| Status Code | Error Type | Description |
|---|---|---|
| 400 | Bad Request | Invalid request parameters |
| 401 | Unauthorized | Missing or invalid authentication |
| 404 | Not Found | Session or resource not found |
| 429 | Too Many Requests | Rate limit exceeded |
| 500 | Internal Server Error | Server error, check logs |
| 503 | Service Unavailable | Service temporarily unavailable |

Error Response Format

{
    "error": {
        "code": "SESSION_NOT_FOUND",
        "message": "The specified session does not exist",
        "details": {
            "session_id": "550e8400-e29b-41d4-a716-446655440000"
        }
    }
}
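
On the client side, the error body above can be turned into an exception that keeps the `code` and `details` fields accessible. This is a sketch: the class `AvatarAPIError` and the `"UNKNOWN"` fallback code are hypothetical and exist only for illustration.

```python
import json
from typing import Any, Dict


class AvatarAPIError(Exception):
    """Hypothetical exception carrying the fields of an error response."""

    def __init__(self, status: int, code: str, message: str, details: Dict[str, Any]):
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.details = details


def parse_error(status: int, body: str) -> AvatarAPIError:
    """Build an AvatarAPIError from an HTTP status and a response body."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = None  # body was not JSON, e.g. a proxy error page
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        error = {}
    return AvatarAPIError(
        status=status,
        code=error.get("code", "UNKNOWN"),
        message=error.get("message", body[:200]),
        details=error.get("details", {}),
    )
```

Callers can then branch on `err.code` (for example `SESSION_NOT_FOUND`) instead of matching on message text.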

💻 SDK Examples

JavaScript/TypeScript

import { AvatarClient } from '@avatar/sdk';

// Initialize client
const client = new AvatarClient({
    apiUrl: 'http://localhost:8000/v1',
    wsUrl: 'ws://localhost:8001/ws'
});

// Create session
const session = await client.createSession({
    quality: 'high',
    voice_id: 'EXAVITQu4vr4xnSDxMaL'
});

// Connect to WebSocket
await session.connect();

// Handle events
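session.on('frame', (frameData) => {
    // Display the frame. Frames are base64 JPEG, so the target must be an <img> element.
    imageElement.src = `data:image/jpeg;base64,${frameData}`;
});

session.on('audio', (audioData) => {
    // Play audio. base64ToBlob is an application-provided helper (not shown).
    const audioBlob = base64ToBlob(audioData);
    const audioUrl = URL.createObjectURL(audioBlob);
    audioElement.src = audioUrl;
    // Release the object URL after playback to avoid leaking memory
    audioElement.onended = () => URL.revokeObjectURL(audioUrl);
    audioElement.play();
});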

// Send chat message
await session.chat('Hello, how are you?');

// Make avatar speak
await session.speak('Welcome to our demo!');

// Clean up
await session.disconnect();

Python

import asyncio
from avatar_sdk import AvatarClient

async def main():
    # Initialize client
    client = AvatarClient(
        api_url="http://localhost:8000/v1",
        ws_url="ws://localhost:8001/ws"
    )
    
    # Create session
    session = await client.create_session(
        quality="high",
        voice_id="EXAVITQu4vr4xnSDxMaL"
    )
    
    # Connect to WebSocket
    await session.connect()
    
    # Handle events
    @session.on("frame")
    async def on_frame(frame_data):
        # Process video frame
        pass
    
    @session.on("audio")
    async def on_audio(audio_data):
        # Process audio
        pass
    
    # Send chat message
    response = await session.chat("Hello, how are you?")
    print(f"Avatar: {response}")
    
    # Make avatar speak
    await session.speak("Welcome to our demo!")
    
    # Keep running
    await session.run_forever()

if __name__ == "__main__":
    asyncio.run(main())

cURL Examples

# Create session
curl -X POST http://localhost:8000/v1/streaming.new \
  -H "Content-Type: application/json" \
  -d '{
    "quality": "medium",
    "avatar_id": "avatar_1"
  }'

# Get session list
curl http://localhost:8000/v1/streaming.sessions

# Make avatar speak
curl -X POST http://localhost:8000/v1/avatar.speak \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "YOUR_SESSION_ID",
    "text": "Hello from cURL!"
  }'

# Health check
curl http://localhost:8000/health

✨ Best Practices

1. Session Management

- Always clean up sessions when done (for example, by calling POST /v1/streaming.stop)
- Implement reconnection logic for the WebSocket connection
- Handle session timeouts gracefully

2. Performance

- Use appropriate quality settings for your use case
- Implement frame dropping for slow connections
- Buffer audio data before committing
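
One simple way to drop frames on a slow connection is to keep only the newest undisplayed frame. The sketch below does that with a single-slot buffer. It assumes `frame_id` increases with each frame, which this reference does not guarantee; the class name `LatestFrameBuffer` is hypothetical.

```python
from typing import Any, Dict, Optional


class LatestFrameBuffer:
    """Hold only the newest frame; older undisplayed frames are dropped."""

    def __init__(self) -> None:
        self._frame: Optional[Dict[str, Any]] = None
        self._last_id = -1
        self.dropped = 0

    def push(self, frame: Dict[str, Any]) -> None:
        """Store a frame event, replacing any frame not yet displayed."""
        if frame["frame_id"] <= self._last_id:
            self.dropped += 1  # stale or out-of-order frame: ignore it
            return
        if self._frame is not None:
            self.dropped += 1  # pending frame is replaced before display
        self._frame = frame
        self._last_id = frame["frame_id"]

    def pop(self) -> Optional[Dict[str, Any]]:
        """Return the pending frame, if any, and clear the slot."""
        frame, self._frame = self._frame, None
        return frame
```

The receive loop calls `push` for every `frame` event, while the render loop calls `pop` at its own pace, so rendering never falls behind by more than one frame.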

3. Error Handling

- Implement exponential backoff for retries
- Log all errors with context
- Provide user-friendly error messages
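
Exponential backoff can be sketched as a generator of delays that double on each attempt up to a cap. The defaults below (5 retries, 0.5 s base, 30 s cap) are illustrative assumptions, not values prescribed by this API.

```python
import random
from typing import Iterator, Optional


def backoff_delays(
    retries: int = 5,
    base: float = 0.5,
    cap: float = 30.0,
    rng: Optional[random.Random] = None,
) -> Iterator[float]:
    """Yield one delay per retry: base * 2**attempt, capped, with optional jitter."""
    for attempt in range(retries):
        delay = min(cap, base * (2 ** attempt))
        if rng is not None:
            # Full jitter: pick uniformly in [0, delay] so clients do not retry in lockstep.
            delay = rng.uniform(0, delay)
        yield delay
```

A retry loop would sleep for each yielded delay between attempts, for example with `time.sleep(delay)`, and give up when the generator is exhausted.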

4. Security

- Use HTTPS/WSS in production
- Implement rate limiting
- Validate all input data
- Use authentication tokens

Rate Limits

| Endpoint | Rate Limit | Window |
|---|---|---|
| Session Creation | 10 requests | per minute |
| Avatar Speak | 30 requests | per minute |
| Chat Messages | 60 requests | per minute |
| WebSocket Messages | 100 messages | per second |
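
A client can stay under these limits by throttling itself before sending. The sketch below uses a sliding window; whether the server counts with a sliding or a fixed window is not stated here, so treat it as a conservative approximation. The class name and the dictionary keys are hypothetical.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any `window`-second span."""

    def __init__(self, limit: int, window: float, clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self._stamps: Deque[float] = deque()

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits in the window, else False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True


# Limits taken from the table above; the keys are illustrative names.
LIMITERS = {
    "session_creation": SlidingWindowLimiter(10, 60.0),
    "avatar_speak": SlidingWindowLimiter(30, 60.0),
    "chat_messages": SlidingWindowLimiter(60, 60.0),
    "websocket_messages": SlidingWindowLimiter(100, 1.0),
}
```

The `clock` parameter exists so the limiter can be tested with a fake time source.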
