Add Julia AI voice integration documentation

2026-01-18 20:17:30 -08:00
commit 20d5f42114
parent 059bc29b6b
1 changed file with 279 additions and 0 deletions
--- a/docs/JULIA-AI-VOICE-INTEGRATION.md
+++ b/docs/JULIA-AI-VOICE-INTEGRATION.md
@@ -0,0 +1,279 @@
 # Julia AI Voice Integration
 ## Architecture Overview
 ```
 ┌─────────────────────────────────────────────────────────────────┐
 │                    WellNuo Lite App (iOS)                       │
 │  ┌─────────────────────────────────────────────────────────┐   │
 │  │  Voice Call Screen (app/voice-call.tsx)                  │   │
 │  │  - useLiveKitRoom hook                                   │   │
 │  │  - Audio session management                              │   │
 │  │  - Microphone permission handling                        │   │
 │  └───────────────────────┬─────────────────────────────────┘   │
 │                          │ WebSocket + WebRTC                   │
 └──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
 ┌─────────────────────────────────────────────────────────────────┐
 │                   LiveKit Cloud                                  │
 │  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
 │  │  SFU Server     │  │  Room Mgmt      │  │  Agent Hosting  │  │
 │  │  (WebRTC)       │  │  (Token Auth)   │  │  (Python)       │  │
 │  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
 │           │                                          │          │
 │           └──────────────────────────────────────────┘          │
 │                          │ Audio Streams                        │
 └──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
 ┌─────────────────────────────────────────────────────────────────┐
 │                  Julia AI Agent (Python)                         │
 │  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
 │  │ Deepgram    │  │ Deepgram    │  │ WellNuo voice_ask API   │  │
 │  │ STT         │  │ TTS         │  │ (Custom LLM backend)    │  │
 │  │ (Nova-2)    │  │ (Aura)      │  │                         │  │
 │  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
 └─────────────────────────────────────────────────────────────────┘
 ```
 ## Components
 ### 1. React Native Client
 **Location:** `app/voice-call.tsx`, `hooks/useLiveKitRoom.ts`
 **Dependencies:**
 - `@livekit/react-native` - LiveKit React Native SDK
 - `@livekit/react-native-webrtc` - WebRTC for React Native
 - `expo-av` - Audio session management
 **Key Features:**
 - Connects to LiveKit room with JWT token
 - Manages audio session (activates speaker mode)
 - Handles microphone permissions
 - Displays connection state and transcription
 ### 2. LiveKit Cloud
 **Project:** `live-kit-demo-70txlh6a`
 **Agent ID:** `CA_Yd3qcuYEVKKE`
 **Configuration:**
 - Auto-scaling agent workers
 - Managed STT/TTS through inference endpoints
 - Built-in noise cancellation
 **Getting Tokens:**
 ```typescript
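 // Request a room token from the WellNuo backend.
 // React Native's fetch needs an absolute URL, so prefix this path with the backend's base URL.
 const response = await fetch('/api/livekit/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ roomName, userName })
 });
 if (!response.ok) {
  throw new Error(`Token request failed: ${response.status}`);
 }
 const { token, url } = await response.json();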
 ```
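The `/api/livekit/token` endpoint itself is not shown in this document. As a rough sketch of what it might do on the server, the function below signs a room-scoped JWT using only the Python standard library. The function name `mint_room_token`, the `ttl_seconds` default, and the claim layout (`iss`, `sub`, `video.roomJoin`) are assumptions based on LiveKit's access-token format, not code from the WellNuo backend; a real backend would more likely use the official LiveKit server SDK.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def mint_room_token(api_key: str, api_secret: str, room_name: str,
                    user_name: str, ttl_seconds: int = 600) -> str:
    """Build an HS256-signed JWT letting `user_name` join `room_name` (hypothetical helper)."""
    now = int(time.time())
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {
        "iss": api_key,            # the LiveKit API key identifies the issuer
        "sub": user_name,          # participant identity
        "nbf": now,                # not valid before now
        "exp": now + ttl_seconds,  # short-lived by design
        "video": {"room": room_name, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

Because the API secret is an input to the signature, this logic has to stay on the server; the app only ever receives the finished token and the LiveKit URL.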
 ### 3. Julia AI Agent (Python)
 **Location:** `julia-agent/julia-ai/src/agent.py`
 **Stack:**
 - LiveKit Agents SDK
 - Deepgram Nova-2 (STT)
 - Deepgram Aura Asteria (TTS - female voice)
 - Silero VAD (Voice Activity Detection)
 - Custom WellNuo LLM (voice_ask API)
 ## Setup & Deployment
 ### Prerequisites
 1. **LiveKit Cloud Account**
   - Sign up at https://cloud.livekit.io/
   - Create a project
   - Get API credentials
 2. **LiveKit CLI**
   ```bash
   # macOS
   brew install livekit-cli
   # Login
   lk cloud auth
   ```
 ### Agent Deployment
 1. **Navigate to agent directory:**
   ```bash
   cd julia-agent/julia-ai
   ```
 2. **Install dependencies:**
   ```bash
   uv sync
   ```
 3. **Configure environment:**
   ```bash
   cp .env.example .env.local
   # Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET,
   # plus WELLNUO_USER, WELLNUO_PASSWORD, DEPLOYMENT_ID (see Environment Variables)
   ```
 4. **Local development:**
   ```bash
   uv run python src/agent.py dev
   ```
 5. **Deploy to LiveKit Cloud:**
   ```bash
   lk agent deploy
   ```
 ### React Native Setup
 1. **Install packages:**
   ```bash
   npm install @livekit/react-native @livekit/react-native-webrtc
   ```
 2. **iOS permissions (Info.plist):**
   ```xml
   <key>NSMicrophoneUsageDescription</key>
   <string>WellNuo needs microphone access for voice calls with Julia AI</string>
   ```
 3. **Pod install:**
   ```bash
   cd ios && pod install
   ```
 ## Flow Diagram
 ```
 User opens Voice tab
        │
        ▼
 Request microphone permission
        │
        ├─ Denied → Show error
        │
        ▼
 Get LiveKit token from WellNuo API
        │
        ▼
 Connect to LiveKit room
        │
        ▼
 Agent joins automatically (LiveKit Cloud)
        │
        ▼
 Agent sends greeting (TTS)
        │
        ▼
 User speaks → STT → WellNuo API → Response → TTS
        │
        ▼
 User ends call → Disconnect from room
 ```
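The flow above can also be read as a small state machine, which makes the invalid paths explicit (for example, connecting before a token exists). The sketch below is a language-neutral model written in Python to match the agent; the state and event names are illustrative assumptions and do not come from `app/voice-call.tsx`.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()
    REQUESTING_PERMISSION = auto()
    PERMISSION_DENIED = auto()
    FETCHING_TOKEN = auto()
    CONNECTING = auto()
    WAITING_FOR_AGENT = auto()
    IN_CONVERSATION = auto()
    DISCONNECTED = auto()


# (state, event) -> next state; any pair not listed is an invalid transition.
TRANSITIONS = {
    (CallState.IDLE, "open_voice_tab"): CallState.REQUESTING_PERMISSION,
    (CallState.REQUESTING_PERMISSION, "denied"): CallState.PERMISSION_DENIED,
    (CallState.REQUESTING_PERMISSION, "granted"): CallState.FETCHING_TOKEN,
    (CallState.FETCHING_TOKEN, "token_received"): CallState.CONNECTING,
    (CallState.CONNECTING, "room_connected"): CallState.WAITING_FOR_AGENT,
    (CallState.WAITING_FOR_AGENT, "agent_greeted"): CallState.IN_CONVERSATION,
    # Each user turn (speech -> STT -> WellNuo API -> TTS) stays in the same state.
    (CallState.IN_CONVERSATION, "turn_completed"): CallState.IN_CONVERSATION,
    (CallState.IN_CONVERSATION, "user_ended_call"): CallState.DISCONNECTED,
}


def next_state(state: CallState, event: str) -> CallState:
    """Return the state that follows `event`, or raise on an invalid transition."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None
```

The diagram does not cover failures after the permission step (token request errors, dropped connections); those would need extra transitions once the real behavior is known.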
 ## API Integration
 ### WellNuo voice_ask API
 The agent uses WellNuo's `voice_ask` API to get contextual responses about the beneficiary.
 **Endpoint:** `https://eluxnetworks.net/function/well-api/api`
 **Authentication:**
 ```python
 import random

 # Login payload; `nonce` is a random value generated per request
 data = {
    "function": "credentials",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "ps": WELLNUO_PASSWORD,
    "nonce": str(random.randint(0, 999999)),
 }
 ```
 **Voice Ask:**
 ```python
 # `token` is the session token obtained from the credentials call above
 data = {
    "function": "voice_ask",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "token": token,
    "question": user_message,
    "deployment_id": DEPLOYMENT_ID,
 }
 ```
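The two payloads above can be wrapped in small helpers so the agent builds them in one place. This is a minimal sketch using only the standard library. The helper names are hypothetical, and two things are assumptions because this document does not state them: that the endpoint accepts form-encoded POST bodies, and that it replies with JSON. The name of the response field that carries the session token is not documented here either, so the sketch does not try to extract it.

```python
import json
import random
import urllib.parse
import urllib.request

WELL_API_URL = "https://eluxnetworks.net/function/well-api/api"


def build_credentials_payload(user: str, password: str) -> dict:
    """Payload for the `credentials` function, mirroring the fields shown above."""
    return {
        "function": "credentials",
        "clientId": "001",
        "user_name": user,
        "ps": password,
        "nonce": str(random.randint(0, 999999)),
    }


def build_voice_ask_payload(user: str, token: str, question: str,
                            deployment_id: str) -> dict:
    """Payload for the `voice_ask` function, mirroring the fields shown above."""
    return {
        "function": "voice_ask",
        "clientId": "001",
        "user_name": user,
        "token": token,
        "question": question,
        "deployment_id": deployment_id,
    }


def post_form(payload: dict, timeout: float = 10.0) -> dict:
    """POST `payload` and decode the reply.

    Assumption: form-encoded request body and a JSON response body.
    """
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(WELL_API_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the builders separate from the network call lets the payload shape be unit-tested without reaching the WellNuo API.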
 ## Troubleshooting
 ### Common Issues
 1. **No audio playback on iOS**
   - Check audio session configuration
   - Ensure `expo-av` is properly configured
   - Test on real device (simulator has audio limitations)
 2. **Microphone not working**
   - Verify permissions in Info.plist
   - Check if user granted permission
   - Real device required for full audio testing
 3. **Agent not responding**
   - Check agent logs: `lk agent logs`
   - Verify LIVEKIT credentials
   - Check WellNuo API connectivity
 4. **Connection fails**
   - Verify token is valid
   - Check network connectivity
   - Ensure LiveKit URL is correct
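For the "Connection fails" checks, two of the three items can be verified offline. The sketch below reads the `exp` claim from a token without verifying its signature (enough for diagnostics, not for authorization) and checks that the URL uses a WebSocket scheme like the `wss://` value in the environment section. Both function names are hypothetical helpers, not part of the LiveKit SDK.

```python
import base64
import json
import time
from typing import Optional
from urllib.parse import urlparse


def token_seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Return seconds until the JWT's `exp` claim; a negative value means it has expired.

    The signature is NOT verified; use this for debugging only.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    segment = parts[1]
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped base64 padding
    claims = json.loads(base64.urlsafe_b64decode(padded))
    current = time.time() if now is None else now
    return claims["exp"] - current


def looks_like_livekit_url(url: str) -> bool:
    """True when `url` has a ws/wss scheme and a host."""
    parsed = urlparse(url)
    return parsed.scheme in ("ws", "wss") and bool(parsed.netloc)
```

A token that is already expired, or that expires within a few seconds of being issued, points at the backend's token settings or at clock skew rather than at the network.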
 ### Debugging
 ```bash
 # View agent logs
 lk agent logs
 # View specific deployment logs
 lk agent logs --version v20260119031418
 # Check agent status
 lk agent list
 ```
 ## Environment Variables
 ### Agent (.env.local)
 ```
 LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
 LIVEKIT_API_KEY=your-api-key
 LIVEKIT_API_SECRET=your-api-secret
 WELLNUO_USER=your-wellnuo-username
 WELLNUO_PASSWORD=your-wellnuo-password
 DEPLOYMENT_ID=21
 ```
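A missing variable tends to surface late, for example as an "Agent not responding" symptom. As a sketch, the agent could check for all six variables at startup and fail with a clear message. `REQUIRED_VARS`, `missing_env_vars`, and `require_env` are hypothetical names, and how `.env.local` gets loaded into the process environment is outside this example.

```python
import os

REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "WELLNUO_USER",
    "WELLNUO_PASSWORD",
    "DEPLOYMENT_ID",
)


def missing_env_vars(environ=None) -> list:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_env(environ=None) -> None:
    """Raise early, naming every missing variable at once."""
    missing = missing_env_vars(environ)
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
```

Calling `require_env()` before the worker starts turns a silent misconfiguration into an immediate, readable error.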
 ### React Native (via WellNuo backend)
 The app needs no LiveKit credentials of its own. Token generation is handled server-side, so the LiveKit API secret never ships with the client.
 ## Status
 **Current State:** WIP - not yet tested on a real device
 **Working:**
 - Agent deploys to LiveKit Cloud
 - Agent connects to rooms
 - STT/TTS pipeline configured
 - WellNuo API integration
 - React Native UI
 **Needs Testing:**
 - Real device microphone capture
 - Audio playback on physical iOS device
 - Full conversation loop end-to-end
 - Token refresh/expiration handling