# Julia AI Voice Integration

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                     WellNuo Lite App (iOS)                      │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  Voice Call Screen (app/voice-call.tsx)                 │    │
│  │  - useLiveKitRoom hook                                  │    │
│  │  - Audio session management                             │    │
│  │  - Microphone permission handling                       │    │
│  └───────────────────────┬─────────────────────────────────┘    │
│                          │ WebSocket + WebRTC                   │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                          LiveKit Cloud                          │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │   SFU Server    │  │    Room Mgmt    │  │  Agent Hosting  │  │
│  │    (WebRTC)     │  │  (Token Auth)   │  │    (Python)     │  │
│  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
│           │                                         │           │
│           └─────────────────────────────────────────┘           │
│                          │ Audio Streams                        │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Julia AI Agent (Python)                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │  Deepgram   │  │  Deepgram   │  │  WellNuo voice_ask API  │  │
│  │     STT     │  │     TTS     │  │  (Custom LLM backend)   │  │
│  │  (Nova-2)   │  │   (Aura)    │  │                         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

## Components

### 1. React Native Client

**Location:** `app/voice-call.tsx`, `hooks/useLiveKitRoom.ts`

**Dependencies:**
- `@livekit/react-native` - LiveKit React Native SDK
- `@livekit/react-native-webrtc` - WebRTC for React Native
- `expo-av` - Audio session management

**Key Features:**
- Connects to LiveKit room with JWT token
- Manages audio session (activates speaker mode)
- Handles microphone permissions
- Displays connection state and transcription

### 2. LiveKit Cloud

**Project:** `live-kit-demo-70txlh6a`
**Agent ID:** `CA_Yd3qcuYEVKKE`

**Configuration:**
- Auto-scaling agent workers
- Managed STT/TTS through inference endpoints
- Built-in noise cancellation

**Getting Tokens:**
```typescript
// Request a LiveKit token from the WellNuo backend
const response = await fetch('/api/livekit/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ roomName, userName })
});
if (!response.ok) {
  throw new Error(`Token request failed: ${response.status}`);
}
const { token, url } = await response.json();
```

### 3. Julia AI Agent (Python)

**Location:** `julia-agent/julia-ai/src/agent.py`

**Stack:**
- LiveKit Agents SDK
- Deepgram Nova-2 (STT)
- Deepgram Aura Asteria (TTS - female voice)
- Silero VAD (Voice Activity Detection)
- Custom WellNuo LLM (voice_ask API)

## Setup & Deployment

### Prerequisites

1. **LiveKit Cloud Account**
   - Sign up at https://cloud.livekit.io/
   - Create a project
   - Get API credentials

2. **LiveKit CLI**
   ```bash
   # macOS
   brew install livekit-cli

   # Login
   lk cloud auth
   ```

### Agent Deployment

1. **Navigate to agent directory:**
   ```bash
   cd julia-agent/julia-ai
   ```

2. **Install dependencies:**
   ```bash
   uv sync
   ```

3. **Configure environment:**
   ```bash
   cp .env.example .env.local
   # Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET
   ```

4. **Local development:**
   ```bash
   uv run python src/agent.py dev
   ```

5. **Deploy to LiveKit Cloud:**
   ```bash
   lk agent deploy
   ```

### React Native Setup

1. **Install packages:**
   ```bash
   npm install @livekit/react-native @livekit/react-native-webrtc
   ```

2. **iOS permissions (Info.plist):**
   ```xml
   <key>NSMicrophoneUsageDescription</key>
   <string>WellNuo needs microphone access for voice calls with Julia AI</string>
   ```

3. **Pod install:**
   ```bash
   cd ios && pod install
   ```

## Flow Diagram

```
User opens Voice tab
        │
        ▼
Request microphone permission
        │
        ├─ Denied → Show error
        │
        ▼
Get LiveKit token from WellNuo API
        │
        ▼
Connect to LiveKit room
        │
        ▼
Agent joins automatically (LiveKit Cloud)
        │
        ▼
Agent sends greeting (TTS)
        │
        ▼
User speaks → STT → WellNuo API → Response → TTS
        │
        ▼
User ends call → Disconnect from room
```
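The "User speaks → STT → WellNuo API → Response → TTS" step can be sketched as a single turn function. This is a minimal illustration, not the agent's actual code: `transcribe`, `ask` and `synthesize` are hypothetical callables standing in for Deepgram STT, the WellNuo `voice_ask` call and Deepgram TTS, and the fallback phrase is an assumption.

```python
from typing import Callable

# Hypothetical fallback phrase, spoken when the backend returns nothing.
FALLBACK_REPLY = "Sorry, I could not get an answer. Please try again."


def run_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    ask: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one turn: user audio -> text -> backend answer -> reply audio."""
    question = transcribe(audio).strip()
    if not question:
        # Nothing intelligible was said, so there is nothing to send or speak.
        return b""
    answer = ask(question).strip() or FALLBACK_REPLY
    return synthesize(answer)
```

Passing the three stages in as callables keeps the turn logic testable without LiveKit, Deepgram or network access.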

## API Integration

### WellNuo voice_ask API

The agent uses WellNuo's `voice_ask` API to get contextual responses about the beneficiary.

**Endpoint:** `https://eluxnetworks.net/function/well-api/api`

**Authentication:**
```python
import random

# Login payload; the nonce is a random number sent as a string
data = {
    "function": "credentials",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "ps": WELLNUO_PASSWORD,
    "nonce": str(random.randint(0, 999999)),
}
```

**Voice Ask:**
```python
# Question payload; `token` is the session token obtained via the credentials call above
data = {
    "function": "voice_ask",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "token": token,
    "question": user_message,
    "deployment_id": DEPLOYMENT_ID,
}
```
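The two payloads above can be wrapped in small helpers and sent with the standard library. This is a sketch under stated assumptions: the helper names are hypothetical, and `post_form` assumes the endpoint accepts form-encoded POST bodies and replies with JSON, neither of which is specified here.

```python
import json
import random
import urllib.parse
import urllib.request

API_URL = "https://eluxnetworks.net/function/well-api/api"
CLIENT_ID = "001"


def credentials_payload(user: str, password: str) -> dict:
    """Build the login payload shown under Authentication."""
    return {
        "function": "credentials",
        "clientId": CLIENT_ID,
        "user_name": user,
        "ps": password,
        "nonce": str(random.randint(0, 999999)),
    }


def voice_ask_payload(user: str, token: str, question: str, deployment_id: str) -> dict:
    """Build the question payload shown under Voice Ask."""
    return {
        "function": "voice_ask",
        "clientId": CLIENT_ID,
        "user_name": user,
        "token": token,
        "question": question,
        "deployment_id": deployment_id,
    }


def post_form(payload: dict, timeout: float = 10.0) -> dict:
    """POST the payload and parse the reply.

    Assumption: form-encoded request, JSON response.
    """
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(API_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The name of the token field in the login reply is not documented here, so the caller has to read it from the parsed response before building the `voice_ask` payload.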

## Troubleshooting

### Common Issues

1. **No audio playback on iOS**
   - Check audio session configuration
   - Ensure `expo-av` is properly configured
   - Test on real device (simulator has audio limitations)

2. **Microphone not working**
   - Verify permissions in Info.plist
   - Check if user granted permission
   - Real device required for full audio testing

3. **Agent not responding**
   - Check agent logs: `lk agent logs`
   - Verify LIVEKIT credentials
   - Check WellNuo API connectivity

4. **Connection fails**
   - Verify token is valid
   - Check network connectivity
   - Ensure LiveKit URL is correct
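When a connection fails, one quick check is whether the token has already expired. Since the room token is a JWT, its `exp` claim can be read without the signing secret. The helper below is a minimal debugging sketch (the function names and the 30-second leeway are assumptions); it does not verify the signature, so it only answers "is this token stale?", not "is this token genuine?".

```python
import base64
import json
import time
from typing import Optional


def token_expiry(token: str) -> int:
    """Return the `exp` claim (Unix seconds) of a JWT without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated parts")
    payload = parts[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return int(claims["exp"])


def is_expired(token: str, now: Optional[float] = None, leeway: int = 30) -> bool:
    """True if the token has expired or will expire within `leeway` seconds."""
    current = time.time() if now is None else now
    return token_expiry(token) - leeway <= current
```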

### Debugging

```bash
# View agent logs
lk agent logs

# View specific deployment logs
lk agent logs --version v20260119031418

# Check agent status
lk agent list
```

## Environment Variables

### Agent (.env.local)
```
LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret
WELLNUO_USER=anandk
WELLNUO_PASSWORD=anandk_8
DEPLOYMENT_ID=21
```
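A missing variable tends to surface later as an "agent not responding" symptom, so it can help to validate the settings at startup. The loader below is a sketch, not code from `agent.py`: `AgentConfig` and `load_config` are hypothetical names, and it assumes the variables are already present in the process environment (for example, loaded from `.env.local` by the runner).

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "WELLNUO_USER",
    "WELLNUO_PASSWORD",
    "DEPLOYMENT_ID",
)


@dataclass(frozen=True)
class AgentConfig:
    livekit_url: str
    livekit_api_key: str
    livekit_api_secret: str
    wellnuo_user: str
    wellnuo_password: str
    deployment_id: str


def load_config(env: Mapping[str, str] = os.environ) -> AgentConfig:
    """Read the agent settings, failing fast on anything missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Field names are the lower-cased variable names.
    return AgentConfig(**{name.lower(): env[name] for name in REQUIRED_VARS})
```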

### React Native (via WellNuo backend)
Token generation is handled server-side for security, so the app needs no LiveKit credentials of its own.
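To illustrate what the server-side step produces, here is a standard-library sketch of minting a room token. It is not the WellNuo backend's implementation, whose language and code are not shown here; a LiveKit server SDK would normally do this. The claim layout (`iss` as the API key, `sub` as the participant identity, a `video` grant with `room` and `roomJoin`, signed with HS256) reflects LiveKit's access token format as understood at the time of writing and should be checked against the LiveKit documentation. The function name, the 10-minute lifetime and the use of `userName` as the identity are assumptions.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    """base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def mint_room_token(
    api_key: str,
    api_secret: str,
    room_name: str,
    user_name: str,
    ttl_seconds: int = 600,
    now: Optional[float] = None,
) -> str:
    """Build an HS256-signed JWT that lets `user_name` join `room_name`."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,
        "sub": user_name,
        "name": user_name,
        "nbf": issued,
        "exp": issued + ttl_seconds,
        "video": {"room": room_name, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + _b64url(signature)
```

Keeping the lifetime short limits the damage if a token leaks, at the cost of needing a refresh path for long calls.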

## Status

**Current State:** WIP, not yet tested on a real device

**Working:**
- Agent deploys to LiveKit Cloud
- Agent connects to rooms
- STT/TTS pipeline configured
- WellNuo API integration
- React Native UI

**Needs Testing:**
- Real device microphone capture
- Audio playback on physical iOS device
- Full conversation loop end-to-end
- Token refresh/expiration handling