Add Julia AI voice integration documentation

Sergei 2026-01-18 20:17:30 -08:00
parent 059bc29b6b
commit 20d5f42114

@@ -0,0 +1,279 @@
# Julia AI Voice Integration
## Architecture Overview
```
┌─────────────────────────────────────────────────────────────────┐
│                     WellNuo Lite App (iOS)                      │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │  Voice Call Screen (app/voice-call.tsx)                   │  │
│  │  - useLiveKitRoom hook                                    │  │
│  │  - Audio session management                               │  │
│  │  - Microphone permission handling                         │  │
│  └─────────────────────────────┬─────────────────────────────┘  │
│                                │ WebSocket + WebRTC             │
└────────────────────────────────┼────────────────────────────────┘
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                          LiveKit Cloud                          │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │   SFU Server    │  │    Room Mgmt    │  │  Agent Hosting  │  │
│  │    (WebRTC)     │  │  (Token Auth)   │  │    (Python)     │  │
│  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
│           │                                         │           │
│           └────────────────────┬────────────────────┘           │
│                                │ Audio Streams                  │
└────────────────────────────────┼────────────────────────────────┘
                                 ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Julia AI Agent (Python)                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │  Deepgram   │  │  Deepgram   │  │  WellNuo voice_ask API  │  │
│  │     STT     │  │     TTS     │  │  (Custom LLM backend)   │  │
│  │  (Nova-2)   │  │   (Aura)    │  │                         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```
## Components
### 1. React Native Client
**Location:** `app/voice-call.tsx`, `hooks/useLiveKitRoom.ts`

**Dependencies:**
- `@livekit/react-native` - LiveKit React Native SDK
- `@livekit/react-native-webrtc` - WebRTC for React Native
- `expo-av` - Audio session management
**Key Features:**
- Connects to LiveKit room with JWT token
- Manages audio session (activates speaker mode)
- Handles microphone permissions
- Displays connection state and transcription
### 2. LiveKit Cloud
**Project:** `live-kit-demo-70txlh6a`

**Agent ID:** `CA_Yd3qcuYEVKKE`

**Configuration:**
- Auto-scaling agent workers
- Managed STT/TTS through inference endpoints
- Built-in noise cancellation
**Getting Tokens:**
```typescript
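// Request a room token from the WellNuo backend
const response = await fetch('/api/livekit/token', {
  method: 'POST',
  // Declare the JSON body so the backend parses it as JSON
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ roomName, userName }),
});
if (!response.ok) {
  throw new Error(`Token request failed: ${response.status}`);
}
const { token, url } = await response.json();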
```
### 3. Julia AI Agent (Python)
**Location:** `julia-agent/julia-ai/src/agent.py`

**Stack:**
- LiveKit Agents SDK
- Deepgram Nova-2 (STT)
- Deepgram Aura Asteria (TTS - female voice)
- Silero VAD (Voice Activity Detection)
- Custom WellNuo LLM (voice_ask API)
## Setup & Deployment
### Prerequisites
1. **LiveKit Cloud Account**
   - Sign up at https://cloud.livekit.io/
   - Create a project
   - Get API credentials
2. **LiveKit CLI**
   ```bash
   # macOS
   brew install livekit-cli
   # Login
   lk cloud auth
   ```
### Agent Deployment
1. **Navigate to agent directory:**
   ```bash
   cd julia-agent/julia-ai
   ```
2. **Install dependencies:**
   ```bash
   uv sync
   ```
3. **Configure environment:**
   ```bash
   cp .env.example .env.local
   # Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET
   ```
4. **Local development:**
   ```bash
   uv run python src/agent.py dev
   ```
5. **Deploy to LiveKit Cloud:**
   ```bash
   lk agent deploy
   ```
### React Native Setup
1. **Install packages:**
   ```bash
   npm install @livekit/react-native @livekit/react-native-webrtc
   ```
2. **iOS permissions (Info.plist):**
   ```xml
   <key>NSMicrophoneUsageDescription</key>
   <string>WellNuo needs microphone access for voice calls with Julia AI</string>
   ```
3. **Pod install:**
   ```bash
   cd ios && pod install
   ```
## Flow Diagram
```
User opens Voice tab
    ↓
Request microphone permission
    ├─ Denied → Show error
    ↓ Granted
Get LiveKit token from WellNuo API
    ↓
Connect to LiveKit room
    ↓
Agent joins automatically (LiveKit Cloud)
    ↓
Agent sends greeting (TTS)
    ↓
User speaks → STT → WellNuo API → Response → TTS
    ↓
User ends call → Disconnect from room
```
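
The flow above can be read as a small state machine. The sketch below is a language-neutral model written in Python, not the client code in `app/voice-call.tsx`. The state and event names are hypothetical and were chosen only to mirror the steps in the diagram.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()
    REQUESTING_PERMISSION = auto()
    FETCHING_TOKEN = auto()
    CONNECTING = auto()
    WAITING_FOR_AGENT = auto()
    IN_CALL = auto()
    ENDED = auto()
    ERROR = auto()


# (current state, event) -> next state; event names are illustrative only
TRANSITIONS = {
    (CallState.IDLE, "open_voice_tab"): CallState.REQUESTING_PERMISSION,
    (CallState.REQUESTING_PERMISSION, "permission_granted"): CallState.FETCHING_TOKEN,
    (CallState.REQUESTING_PERMISSION, "permission_denied"): CallState.ERROR,
    (CallState.FETCHING_TOKEN, "token_received"): CallState.CONNECTING,
    (CallState.CONNECTING, "room_connected"): CallState.WAITING_FOR_AGENT,
    (CallState.WAITING_FOR_AGENT, "agent_greeted"): CallState.IN_CALL,
    (CallState.IN_CALL, "user_hangs_up"): CallState.ENDED,
}


def advance(state, event):
    """Return the next state, or raise ValueError for an out-of-order event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None


def run(events, state=CallState.IDLE):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = advance(state, event)
    return state
```

Modelling the flow this way makes the untested paths explicit: any event that arrives out of order, such as a room connection before a token was received, raises instead of being silently accepted.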
## API Integration
### WellNuo voice_ask API
The agent uses WellNuo's `voice_ask` API to get contextual responses about the beneficiary.

**Endpoint:** `https://eluxnetworks.net/function/well-api/api`

**Authentication:**
```python
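import random  # needed for the nonce below

# Login request: exchanges the user name and password for a session token
data = {
    "function": "credentials",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "ps": WELLNUO_PASSWORD,
    "nonce": str(random.randint(0, 999999)),
}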
```
**Voice Ask:**
```python
# Question request: `token` is the value returned by the login call
data = {
    "function": "voice_ask",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "token": token,
    "question": user_message,
    "deployment_id": DEPLOYMENT_ID,
}
```
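
Taken together, the two payloads suggest a login-then-ask sequence. The following is a minimal sketch of that sequence, not the code in `agent.py`. It assumes the API accepts form-encoded POST bodies and replies with JSON, and that the login reply carries the token in a field named `access_token`. Neither the encoding nor the field name is confirmed by this document, so the transport is injectable and the field name is a parameter. All helper names are hypothetical.

```python
import json
import random
import urllib.parse
import urllib.request

API_URL = "https://eluxnetworks.net/function/well-api/api"


def credentials_payload(user, password):
    """Build the login fields shown in the Authentication block."""
    return {
        "function": "credentials",
        "clientId": "001",
        "user_name": user,
        "ps": password,
        "nonce": str(random.randint(0, 999999)),
    }


def voice_ask_payload(user, token, question, deployment_id):
    """Build the question fields shown in the Voice Ask block."""
    return {
        "function": "voice_ask",
        "clientId": "001",
        "user_name": user,
        "token": token,
        "question": question,
        "deployment_id": deployment_id,
    }


def post_form(payload, url=API_URL, timeout=10.0):
    """POST form-encoded fields and decode a JSON reply (both are assumptions)."""
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def ask(question, user, password, deployment_id,
        post=post_form, token_field="access_token"):
    """Log in, then send the question; returns the decoded voice_ask reply."""
    login = post(credentials_payload(user, password))
    token = login[token_field]  # hypothetical field name, see lead-in
    return post(voice_ask_payload(user, token, question, deployment_id))
```

Passing the transport in as `post` keeps the sequence testable without network access. A real agent would likely cache the token between questions rather than log in on every turn.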
## Troubleshooting
### Common Issues
1. **No audio playback on iOS**
   - Check the audio session configuration
   - Ensure `expo-av` is properly configured
   - Test on a real device (the simulator has audio limitations)
2. **Microphone not working**
   - Verify that `NSMicrophoneUsageDescription` is set in Info.plist
   - Check whether the user granted permission
   - Use a real device for full audio testing
3. **Agent not responding**
   - Check agent logs: `lk agent logs`
   - Verify the `LIVEKIT_*` credentials
   - Check WellNuo API connectivity
4. **Connection fails**
   - Verify the token is valid
   - Check network connectivity
   - Ensure the LiveKit URL is correct
### Debugging
```bash
# View agent logs
lk agent logs
# View specific deployment logs
lk agent logs --version v20260119031418
# Check agent status
lk agent list
```
## Environment Variables
### Agent (.env.local)
```
LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret
WELLNUO_USER=anandk
WELLNUO_PASSWORD=anandk_8
DEPLOYMENT_ID=21
```
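
A missing variable tends to surface late, as a failed connection or login. The sketch below shows a small startup check that fails fast instead. It assumes the values from `.env.local` have already been loaded into the process environment. `REQUIRED_VARS` and `load_settings` are illustrative names, not part of the agent.

```python
import os

# Variable names taken from the .env.local listing above
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "WELLNUO_USER",
    "WELLNUO_PASSWORD",
    "DEPLOYMENT_ID",
)


def load_settings(environ=None):
    """Return the required settings as a dict, or raise if any is missing or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` once at startup reports every missing name in a single error message.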
### React Native (via WellNuo backend)
Token generation is handled server-side, so the LiveKit API secret never ships with the app.
## Status
**Current State:** WIP - Not tested on real device

**Working:**
- Agent deploys to LiveKit Cloud
- Agent connects to rooms
- STT/TTS pipeline configured
- WellNuo API integration
- React Native UI
**Needs Testing:**
- Real device microphone capture
- Audio playback on physical iOS device
- Full conversation loop end-to-end
- Token refresh/expiration handling