From 20d5f421149666270e47c933529bbb22fa896785 Mon Sep 17 00:00:00 2001
From: Sergei
Date: Sun, 18 Jan 2026 20:17:30 -0800
Subject: [PATCH] Add Julia AI voice integration documentation

---
 docs/JULIA-AI-VOICE-INTEGRATION.md | 279 +++++++++++++++++++++++++++++
 1 file changed, 279 insertions(+)
 create mode 100644 docs/JULIA-AI-VOICE-INTEGRATION.md

diff --git a/docs/JULIA-AI-VOICE-INTEGRATION.md b/docs/JULIA-AI-VOICE-INTEGRATION.md
new file mode 100644
index 0000000..ba2bcb2
--- /dev/null
+++ b/docs/JULIA-AI-VOICE-INTEGRATION.md
@@ -0,0 +1,279 @@
+# Julia AI Voice Integration
+
+## Architecture Overview
+
+```
+┌─────────────────────────────────────────────────────────────────┐
+│                     WellNuo Lite App (iOS)                      │
+│  ┌─────────────────────────────────────────────────────────┐    │
+│  │  Voice Call Screen (app/voice-call.tsx)                 │    │
+│  │  - useLiveKitRoom hook                                  │    │
+│  │  - Audio session management                             │    │
+│  │  - Microphone permission handling                       │    │
+│  └───────────────────────┬─────────────────────────────────┘    │
+│                          │ WebSocket + WebRTC                   │
+└──────────────────────────┼──────────────────────────────────────┘
+                           │
+                           ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                          LiveKit Cloud                          │
+│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
+│  │   SFU Server    │  │    Room Mgmt    │  │  Agent Hosting  │  │
+│  │    (WebRTC)     │  │  (Token Auth)   │  │    (Python)     │  │
+│  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
+│           │                                         │           │
+│           └──────────────────────────────────────────┘           │
+│                          │ Audio Streams                        │
+└──────────────────────────┼──────────────────────────────────────┘
+                           │
+                           ▼
+┌─────────────────────────────────────────────────────────────────┐
+│                     Julia AI Agent (Python)                     │
+│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
+│  │  Deepgram   │  │  Deepgram   │  │  WellNuo voice_ask API  │  │
+│  │     STT     │  │     TTS     │  │  (Custom LLM backend)   │  │
+│  │  (Nova-2)   │  │   (Aura)    │  │                         │  │
+│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
+└─────────────────────────────────────────────────────────────────┘
+```
+
+## Components
+
+### 1. React Native Client
+
+**Location:** `app/voice-call.tsx`, `hooks/useLiveKitRoom.ts`
+
+**Dependencies:**
+- `@livekit/react-native` - LiveKit React Native SDK
+- `@livekit/react-native-webrtc` - WebRTC for React Native
+- `expo-av` - Audio session management
+
+**Key Features:**
+- Connects to LiveKit room with JWT token
+- Manages audio session (activates speaker mode)
+- Handles microphone permissions
+- Displays connection state and transcription
+
+### 2. LiveKit Cloud
+
+**Project:** `live-kit-demo-70txlh6a`
+**Agent ID:** `CA_Yd3qcuYEVKKE`
+
+**Configuration:**
+- Auto-scaling agent workers
+- Managed STT/TTS through inference endpoints
+- Built-in noise cancellation
+
+**Getting Tokens:**
+```typescript
+// From WellNuo backend
+const response = await fetch('/api/livekit/token', {
+  method: 'POST',
+  body: JSON.stringify({ roomName, userName })
+});
+const { token, url } = await response.json();
+```
+
+### 3. Julia AI Agent (Python)
+
+**Location:** `julia-agent/julia-ai/src/agent.py`
+
+**Stack:**
+- LiveKit Agents SDK
+- Deepgram Nova-2 (STT)
+- Deepgram Aura Asteria (TTS - female voice)
+- Silero VAD (Voice Activity Detection)
+- Custom WellNuo LLM (voice_ask API)
+
+## Setup & Deployment
+
+### Prerequisites
+
+1. **LiveKit Cloud Account**
+   - Sign up at https://cloud.livekit.io/
+   - Create a project
+   - Get API credentials
+
+2. **LiveKit CLI**
+   ```bash
+   # macOS
+   brew install livekit-cli
+
+   # Login
+   lk cloud auth
+   ```
+
+### Agent Deployment
+
+1. **Navigate to agent directory:**
+   ```bash
+   cd julia-agent/julia-ai
+   ```
+
+2. **Install dependencies:**
+   ```bash
+   uv sync
+   ```
+
+3. **Configure environment:**
+   ```bash
+   cp .env.example .env.local
+   # Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET
+   ```
+
+4. **Local development:**
+   ```bash
+   uv run python src/agent.py dev
+   ```
+
+5. **Deploy to LiveKit Cloud:**
+   ```bash
+   lk agent deploy
+   ```
+
+### React Native Setup
+
+1. **Install packages:**
+   ```bash
+   npm install @livekit/react-native @livekit/react-native-webrtc
+   ```
+
+2. **iOS permissions (Info.plist):**
+   ```xml
+   <key>NSMicrophoneUsageDescription</key>
+   <string>WellNuo needs microphone access for voice calls with Julia AI</string>
+   ```
+
+3. **Pod install:**
+   ```bash
+   cd ios && pod install
+   ```
+
+## Flow Diagram
+
+```
+User opens Voice tab
+        │
+        ▼
+Request microphone permission
+        │
+        ├─ Denied → Show error
+        │
+        ▼
+Get LiveKit token from WellNuo API
+        │
+        ▼
+Connect to LiveKit room
+        │
+        ▼
+Agent joins automatically (LiveKit Cloud)
+        │
+        ▼
+Agent sends greeting (TTS)
+        │
+        ▼
+User speaks → STT → WellNuo API → Response → TTS
+        │
+        ▼
+User ends call → Disconnect from room
+```
+
+## API Integration
+
+### WellNuo voice_ask API
+
+The agent uses WellNuo's `voice_ask` API to get contextual responses about the beneficiary.
+
+**Endpoint:** `https://eluxnetworks.net/function/well-api/api`
+
+**Authentication:**
+```python
+data = {
+    "function": "credentials",
+    "clientId": "001",
+    "user_name": WELLNUO_USER,
+    "ps": WELLNUO_PASSWORD,
+    "nonce": str(random.randint(0, 999999)),
+}
+```
+
+**Voice Ask:**
+```python
+data = {
+    "function": "voice_ask",
+    "clientId": "001",
+    "user_name": WELLNUO_USER,
+    "token": token,
+    "question": user_message,
+    "deployment_id": DEPLOYMENT_ID,
+}
+```
+
+## Troubleshooting
+
+### Common Issues
+
+1. **No audio playback on iOS**
+   - Check audio session configuration
+   - Ensure `expo-av` is properly configured
+   - Test on real device (simulator has audio limitations)
+
+2. **Microphone not working**
+   - Verify permissions in Info.plist
+   - Check if user granted permission
+   - Real device required for full audio testing
+
+3. **Agent not responding**
+   - Check agent logs: `lk agent logs`
+   - Verify LIVEKIT credentials
+   - Check WellNuo API connectivity
+
+4. **Connection fails**
+   - Verify token is valid
+   - Check network connectivity
+   - Ensure LiveKit URL is correct
+
+### Debugging
+
+```bash
+# View agent logs
+lk agent logs
+
+# View specific deployment logs
+lk agent logs --version v20260119031418
+
+# Check agent status
+lk agent list
+```
+
+## Environment Variables
+
+### Agent (.env.local)
+```
+LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
+LIVEKIT_API_KEY=your-api-key
+LIVEKIT_API_SECRET=your-api-secret
+WELLNUO_USER=anandk
+WELLNUO_PASSWORD=anandk_8
+DEPLOYMENT_ID=21
+```
+
+### React Native (via WellNuo backend)
+Token generation handled server-side for security.
+
+## Status
+
+**Current State:** WIP - Not tested on real device
+
+**Working:**
+- Agent deploys to LiveKit Cloud
+- Agent connects to rooms
+- STT/TTS pipeline configured
+- WellNuo API integration
+- React Native UI
+
+**Needs Testing:**
+- Real device microphone capture
+- Audio playback on physical iOS device
+- Full conversation loop end-to-end
+- Token refresh/expiration handling
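
The "Token refresh/expiration handling" item under Needs Testing could be exercised with a small expiry check. LiveKit access tokens are JWTs, so the `exp` claim can be read from the payload segment. The sketch below is written in Python to match the agent code and is only an illustration: the function names `token_seconds_remaining` and `needs_refresh` and the 60-second margin are assumptions, not part of this patch, and the signature is deliberately not verified.

```python
import base64
import json
import time
from typing import Optional


def token_seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Return seconds until the JWT `exp` claim; negative if already expired.

    Hypothetical helper. It only reads the payload and does NOT verify
    the signature, so it must not be used for authorization decisions.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload_b64 = parts[1]
    # JWT segments are unpadded base64url; restore padding before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return float(payload["exp"]) - current


def needs_refresh(token: str, margin_s: float = 60.0,
                  now: Optional[float] = None) -> bool:
    """True when the token expires within `margin_s` seconds (assumed margin)."""
    return token_seconds_remaining(token, now) <= margin_s
```

A caller would request a new token from the WellNuo backend whenever `needs_refresh(token)` returns true, ideally before reconnecting to the room. The same logic would have to be ported to TypeScript for use in the React Native client.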