Add Julia AI voice integration documentation

2026-01-18 20:17:30 -08:00
commit 20d5f42114
parent 059bc29b6b
1 changed file with 279 additions and 0 deletions
--- a/docs/JULIA-AI-VOICE-INTEGRATION.md
+++ b/docs/JULIA-AI-VOICE-INTEGRATION.md
@@ -0,0 +1,279 @@
+# Julia AI Voice Integration
+
+## Architecture Overview
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                    WellNuo Lite App (iOS)                       │
+│  ┌─────────────────────────────────────────────────────────┐   │
+│  │  Voice Call Screen (app/voice-call.tsx)                  │   │
+│  │  - useLiveKitRoom hook                                   │   │
+│  │  - Audio session management                              │   │
+│  │  - Microphone permission handling                        │   │
+│  └───────────────────────┬─────────────────────────────────┘   │
+│                          │ WebSocket + WebRTC                   │
+└──────────────────────────┼──────────────────────────────────────┘
+                           │
+                           ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                   LiveKit Cloud                                  │
+│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
+│  │  SFU Server     │  │  Room Mgmt      │  │  Agent Hosting  │  │
+│  │  (WebRTC)       │  │  (Token Auth)   │  │  (Python)       │  │
+│  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
+│           │                                          │          │
+│           └──────────────────────────────────────────┘          │
+│                          │ Audio Streams                        │
+└──────────────────────────┼──────────────────────────────────────┘
+                           │
+                           ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                  Julia AI Agent (Python)                         │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
+│  │ Deepgram    │  │ Deepgram    │  │ WellNuo voice_ask API   │  │
+│  │ STT         │  │ TTS         │  │ (Custom LLM backend)    │  │
+│  │ (Nova-2)    │  │ (Aura)      │  │                         │  │
+│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+## Components
+
+### 1. React Native Client
+
+**Location:** `app/voice-call.tsx`, `hooks/useLiveKitRoom.ts`
+
+**Dependencies:**
+- `@livekit/react-native` - LiveKit React Native SDK
+- `@livekit/react-native-webrtc` - WebRTC for React Native
+- `expo-av` - Audio session management
+
+**Key Features:**
+- Connects to LiveKit room with JWT token
+- Manages audio session (activates speaker mode)
+- Handles microphone permissions
+- Displays connection state and transcription
+
+### 2. LiveKit Cloud
+
+**Project:** `live-kit-demo-70txlh6a`
+**Agent ID:** `CA_Yd3qcuYEVKKE`
+
+**Configuration:**
+- Auto-scaling agent workers
+- Managed STT/TTS through inference endpoints
+- Built-in noise cancellation
+
+**Getting Tokens:**
+```typescript
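+// Request a room token from the WellNuo backend
+const response = await fetch('/api/livekit/token', {
+  method: 'POST',
+  headers: { 'Content-Type': 'application/json' },
+  body: JSON.stringify({ roomName, userName })
+});
+if (!response.ok) {
+  throw new Error(`Token request failed: ${response.status}`);
+}
+const { token, url } = await response.json();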
+```
+
+### 3. Julia AI Agent (Python)
+
+**Location:** `julia-agent/julia-ai/src/agent.py`
+
+**Stack:**
+- LiveKit Agents SDK
+- Deepgram Nova-2 (STT)
+- Deepgram Aura Asteria (TTS - female voice)
+- Silero VAD (Voice Activity Detection)
+- Custom WellNuo LLM (voice_ask API)
+
+## Setup & Deployment
+
+### Prerequisites
+
+1. **LiveKit Cloud Account**
+   - Sign up at https://cloud.livekit.io/
+   - Create a project
+   - Get API credentials
+
+2. **LiveKit CLI**
+   ```bash
+   # macOS
+   brew install livekit-cli
+
+   # Login
+   lk cloud auth
+   ```
+
+### Agent Deployment
+
+1. **Navigate to agent directory:**
+   ```bash
+   cd julia-agent/julia-ai
+   ```
+
+2. **Install dependencies:**
+   ```bash
+   uv sync
+   ```
+
+3. **Configure environment:**
+   ```bash
+   cp .env.example .env.local
+   # Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET
+   ```
+
+4. **Local development:**
+   ```bash
+   uv run python src/agent.py dev
+   ```
+
+5. **Deploy to LiveKit Cloud:**
+   ```bash
+   lk agent deploy
+   ```
+
+### React Native Setup
+
+1. **Install packages:**
+   ```bash
+   npm install @livekit/react-native @livekit/react-native-webrtc
+   ```
+
+2. **iOS permissions (Info.plist):**
+   ```xml
+   <key>NSMicrophoneUsageDescription</key>
+   <string>WellNuo needs microphone access for voice calls with Julia AI</string>
+   ```
+
+3. **Pod install:**
+   ```bash
+   cd ios && pod install
+   ```
+
+## Flow Diagram
+
+```
+User opens Voice tab
+        │
+        ▼
+Request microphone permission
+        │
+        ├─ Denied → Show error
+        │
+        ▼
+Get LiveKit token from WellNuo API
+        │
+        ▼
+Connect to LiveKit room
+        │
+        ▼
+Agent joins automatically (LiveKit Cloud)
+        │
+        ▼
+Agent sends greeting (TTS)
+        │
+        ▼
+User speaks → STT → WellNuo API → Response → TTS
+        │
+        ▼
+User ends call → Disconnect from room
+```
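
The flow above can be read as a small state machine. The following is a minimal sketch of that reading, not code from the app: the state and event names are hypothetical, and routing every unexpected event to `ERROR` is an assumption (the diagram only shows an error path for a denied permission).

```python
from enum import Enum, auto


class CallState(Enum):
    """Hypothetical states, one per step of the flow diagram."""
    IDLE = auto()
    REQUESTING_PERMISSION = auto()
    FETCHING_TOKEN = auto()
    CONNECTING = auto()
    WAITING_FOR_AGENT = auto()
    IN_CALL = auto()
    ENDED = auto()
    ERROR = auto()


# Allowed transitions, keyed by (current state, event name).
TRANSITIONS = {
    (CallState.IDLE, "open_voice_tab"): CallState.REQUESTING_PERMISSION,
    (CallState.REQUESTING_PERMISSION, "permission_granted"): CallState.FETCHING_TOKEN,
    (CallState.REQUESTING_PERMISSION, "permission_denied"): CallState.ERROR,
    (CallState.FETCHING_TOKEN, "token_received"): CallState.CONNECTING,
    (CallState.CONNECTING, "room_connected"): CallState.WAITING_FOR_AGENT,
    (CallState.WAITING_FOR_AGENT, "agent_joined"): CallState.IN_CALL,
    (CallState.IN_CALL, "hang_up"): CallState.ENDED,
}


def next_state(state: CallState, event: str) -> CallState:
    """Return the state that follows `event`; unknown events map to ERROR (assumption)."""
    return TRANSITIONS.get((state, event), CallState.ERROR)


def run(events) -> CallState:
    """Feed a sequence of events through the machine, starting from IDLE."""
    state = CallState.IDLE
    for event in events:
        state = next_state(state, event)
    return state
```

Keeping the transitions in one table makes it easy to see which steps still lack an explicit failure path, for example a failed token request or a room connection that times out.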
+
+## API Integration
+
+### WellNuo voice_ask API
+
+The agent uses WellNuo's `voice_ask` API to get contextual responses about the beneficiary.
+
+**Endpoint:** `https://eluxnetworks.net/function/well-api/api`
+
+**Authentication:**
+```python
+import random
+
+# Credentials request: log in with username and password
+data = {
+    "function": "credentials",
+    "clientId": "001",
+    "user_name": WELLNUO_USER,
+    "ps": WELLNUO_PASSWORD,  # password field
+    "nonce": str(random.randint(0, 999999)),
+}
+```
+
+**Voice Ask:**
+```python
+data = {
+    "function": "voice_ask",
+    "clientId": "001",
+    "user_name": WELLNUO_USER,
+    "token": token,  # value obtained from the credentials request
+    "question": user_message,
+    "deployment_id": DEPLOYMENT_ID,
+}
+```
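
The two payloads can be wrapped in small helpers so the agent builds them in one place. This is a sketch under stated assumptions rather than the code in `agent.py`: the helper names are hypothetical, and sending the payload as a form-encoded POST body is an assumption, since the wire encoding is not documented here. The response format is not documented here either, so the sketch returns the raw bytes.

```python
import random
import urllib.parse
import urllib.request

WELLNUO_API_URL = "https://eluxnetworks.net/function/well-api/api"
CLIENT_ID = "001"


def build_credentials_payload(user: str, password: str) -> dict:
    """Payload for the `credentials` function (the login step)."""
    return {
        "function": "credentials",
        "clientId": CLIENT_ID,
        "user_name": user,
        "ps": password,
        "nonce": str(random.randint(0, 999999)),
    }


def build_voice_ask_payload(user: str, token: str, question: str, deployment_id: str) -> dict:
    """Payload for the `voice_ask` function; `token` comes from the login step."""
    return {
        "function": "voice_ask",
        "clientId": CLIENT_ID,
        "user_name": user,
        "token": token,
        "question": question,
        "deployment_id": deployment_id,
    }


def post_form(payload: dict, timeout: float = 10.0) -> bytes:
    """POST the payload form-encoded (assumed encoding) and return the raw response body."""
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(WELLNUO_API_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Separating payload construction from the network call keeps the builders testable without reaching the API, which also helps when checking WellNuo API connectivity in isolation.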
+
+## Troubleshooting
+
+### Common Issues
+
+1. **No audio playback on iOS**
+   - Check audio session configuration
+   - Ensure `expo-av` is properly configured
+   - Test on real device (simulator has audio limitations)
+
+2. **Microphone not working**
+   - Verify permissions in Info.plist
+   - Check if user granted permission
+   - Real device required for full audio testing
+
+3. **Agent not responding**
+   - Check agent logs: `lk agent logs`
+   - Verify LIVEKIT credentials
+   - Check WellNuo API connectivity
+
+4. **Connection fails**
+   - Verify token is valid
+   - Check network connectivity
+   - Ensure LiveKit URL is correct
+
+### Debugging
+
+```bash
+# View agent logs
+lk agent logs
+
+# View specific deployment logs
+lk agent logs --version v20260119031418
+
+# Check agent status
+lk agent list
+```
+
+## Environment Variables
+
+### Agent (.env.local)
+```
+LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
+LIVEKIT_API_KEY=your-api-key
+LIVEKIT_API_SECRET=your-api-secret
+WELLNUO_USER=your-wellnuo-username
+WELLNUO_PASSWORD=your-wellnuo-password
+DEPLOYMENT_ID=21
+```
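
A startup check can fail fast when one of these variables is missing. The sketch below is an illustration with hypothetical function names. It assumes the values are already in the process environment by the time it runs, however the agent loads `.env.local`.

```python
import os

# The variable names listed in .env.local above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "WELLNUO_USER",
    "WELLNUO_PASSWORD",
    "DEPLOYMENT_ID",
)


def missing_vars(env=None) -> list:
    """Return the required names that are unset or empty, in declaration order."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def check_env(env=None) -> None:
    """Raise a descriptive error if any required variable is missing."""
    missing = missing_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Calling `check_env()` before the agent connects turns a vague "agent not responding" symptom into an explicit error naming the missing variable.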
+
+### React Native (via WellNuo backend)
+Token generation is handled server-side, so the LiveKit API secret never ships in the app.
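
To make the server-side step concrete, here is a standard-library sketch of minting a room token and checking how long it has left. It is not the WellNuo backend's code. It assumes the usual LiveKit access-token layout: an HS256-signed JWT with the API key as issuer and a `video` grant naming the room. Check that claim layout against LiveKit's documentation, and use the official server SDK in production. The function names and the 10-minute default lifetime are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def mint_room_token(api_key: str, api_secret: str, room_name: str,
                    user_name: str, ttl_seconds: int = 600, now=None) -> str:
    """Build an HS256 JWT granting `user_name` permission to join `room_name`."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,            # assumed: API key is the issuer
        "sub": user_name,          # assumed: participant identity
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room_name, "roomJoin": True},  # assumed grant layout
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def seconds_until_expiry(token: str, now=None) -> int:
    """Read `exp` from the token payload (no signature check) and return time left."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(padded))
    current = int(time.time()) if now is None else now
    return claims["exp"] - current
```

A client-side equivalent of `seconds_until_expiry` could decide when to request a fresh token before reconnecting, which relates to the token refresh item listed under Status.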
+
+## Status
+
+**Current State:** WIP - Not tested on real device
+
+**Working:**
+- Agent deploys to LiveKit Cloud
+- Agent connects to rooms
+- STT/TTS pipeline configured
+- WellNuo API integration
+- React Native UI
+
+**Needs Testing:**
+- Real device microphone capture
+- Audio playback on physical iOS device
+- Full conversation loop end-to-end
+- Token refresh/expiration handling