Add Julia AI voice integration documentation
Commit 20d5f42114 (parent 059bc29b6b)

docs/JULIA-AI-VOICE-INTEGRATION.md (new file, 279 lines)

# Julia AI Voice Integration

## Architecture Overview

```
┌─────────────────────────────────────────────────────────────────┐
│                     WellNuo Lite App (iOS)                      │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  Voice Call Screen (app/voice-call.tsx)                 │    │
│  │  - useLiveKitRoom hook                                  │    │
│  │  - Audio session management                             │    │
│  │  - Microphone permission handling                       │    │
│  └───────────────────────┬─────────────────────────────────┘    │
│                          │ WebSocket + WebRTC                   │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                          LiveKit Cloud                          │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │  SFU Server     │  │  Room Mgmt      │  │  Agent Hosting  │  │
│  │  (WebRTC)       │  │  (Token Auth)   │  │  (Python)       │  │
│  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
│           │                                         │           │
│           └──────────────┬──────────────────────────┘           │
│                          │ Audio Streams                        │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Julia AI Agent (Python)                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │  Deepgram   │  │  Deepgram   │  │  WellNuo voice_ask API  │  │
│  │  STT        │  │  TTS        │  │  (Custom LLM backend)   │  │
│  │  (Nova-2)   │  │  (Aura)     │  │                         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
```

## Components

### 1. React Native Client

**Location:** `app/voice-call.tsx`, `hooks/useLiveKitRoom.ts`

**Dependencies:**
- `@livekit/react-native` - LiveKit React Native SDK
- `@livekit/react-native-webrtc` - WebRTC for React Native
- `expo-av` - Audio session management

**Key Features:**
- Connects to LiveKit room with JWT token
- Manages audio session (activates speaker mode)
- Handles microphone permissions
- Displays connection state and transcription

### 2. LiveKit Cloud

**Project:** `live-kit-demo-70txlh6a`
**Agent ID:** `CA_Yd3qcuYEVKKE`

**Configuration:**
- Auto-scaling agent workers
- Managed STT/TTS through inference endpoints
- Built-in noise cancellation

**Getting Tokens:**
```typescript
// Request a LiveKit access token from the WellNuo backend
const response = await fetch('/api/livekit/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ roomName, userName })
});
if (!response.ok) {
  throw new Error(`Token request failed: ${response.status}`);
}
const { token, url } = await response.json();
```

### 3. Julia AI Agent (Python)

**Location:** `julia-agent/julia-ai/src/agent.py`

**Stack:**
- LiveKit Agents SDK
- Deepgram Nova-2 (STT)
- Deepgram Aura Asteria (TTS - female voice)
- Silero VAD (Voice Activity Detection)
- Custom WellNuo LLM (voice_ask API)

## Setup & Deployment

### Prerequisites

1. **LiveKit Cloud Account**
   - Sign up at https://cloud.livekit.io/
   - Create a project
   - Get API credentials

2. **LiveKit CLI**
   ```bash
   # macOS
   brew install livekit-cli

   # Login
   lk cloud auth
   ```

### Agent Deployment

1. **Navigate to agent directory:**
   ```bash
   cd julia-agent/julia-ai
   ```

2. **Install dependencies:**
   ```bash
   uv sync
   ```

3. **Configure environment:**
   ```bash
   cp .env.example .env.local
   # Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET
   ```

4. **Local development:**
   ```bash
   uv run python src/agent.py dev
   ```

5. **Deploy to LiveKit Cloud:**
   ```bash
   lk agent deploy
   ```

### React Native Setup

1. **Install packages:**
   ```bash
   npm install @livekit/react-native @livekit/react-native-webrtc
   ```

2. **iOS permissions (Info.plist):**
   ```xml
   <key>NSMicrophoneUsageDescription</key>
   <string>WellNuo needs microphone access for voice calls with Julia AI</string>
   ```

3. **Pod install:**
   ```bash
   cd ios && pod install
   ```

## Flow Diagram

```
User opens Voice tab
        │
        ▼
Request microphone permission
        │
        ├─ Denied → Show error
        │
        ▼
Get LiveKit token from WellNuo API
        │
        ▼
Connect to LiveKit room
        │
        ▼
Agent joins automatically (LiveKit Cloud)
        │
        ▼
Agent sends greeting (TTS)
        │
        ▼
User speaks → STT → WellNuo API → Response → TTS
        │
        ▼
User ends call → Disconnect from room
```
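
The flow above can be read as a small state machine. The sketch below is illustrative only: the state and event names are hypothetical labels for the boxes and arrows in the diagram, not identifiers from `app/voice-call.tsx`, and Python is used simply to keep the example self-contained.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()
    REQUESTING_PERMISSION = auto()
    FETCHING_TOKEN = auto()
    CONNECTING = auto()
    IN_CALL = auto()
    ENDED = auto()
    ERROR = auto()


# Allowed transitions, keyed by (current state, event name).
# The event names are hypothetical; they mirror the arrows in the diagram.
TRANSITIONS = {
    (CallState.IDLE, "open_voice_tab"): CallState.REQUESTING_PERMISSION,
    (CallState.REQUESTING_PERMISSION, "permission_granted"): CallState.FETCHING_TOKEN,
    (CallState.REQUESTING_PERMISSION, "permission_denied"): CallState.ERROR,
    (CallState.FETCHING_TOKEN, "token_received"): CallState.CONNECTING,
    (CallState.CONNECTING, "agent_joined"): CallState.IN_CALL,
    (CallState.IN_CALL, "user_hung_up"): CallState.ENDED,
}


def next_state(state: CallState, event: str) -> CallState:
    """Return the next state, or raise ValueError if the diagram has no such arrow."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None


def run(events) -> CallState:
    """Apply a sequence of events in order, starting from IDLE."""
    state = CallState.IDLE
    for event in events:
        state = next_state(state, event)
    return state
```

Keeping the transitions in one table makes it easy to spot paths the diagram does not cover yet, such as a failed token request or a dropped connection.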

## API Integration

### WellNuo voice_ask API

The agent uses WellNuo's `voice_ask` API to get contextual responses about the beneficiary.

**Endpoint:** `https://eluxnetworks.net/function/well-api/api`

**Authentication:**
```python
import random

data = {
    "function": "credentials",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "ps": WELLNUO_PASSWORD,
    "nonce": str(random.randint(0, 999999)),
}
```

**Voice Ask:**
```python
data = {
    "function": "voice_ask",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "token": token,
    "question": user_message,
    "deployment_id": DEPLOYMENT_ID,
}
```
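
The two payloads above can be wrapped in small helpers. This is a minimal sketch, not the code in `agent.py`: the function names are hypothetical, and it assumes the endpoint accepts a form-encoded POST and replies with a JSON body. The response schema is not documented here, so `post_form` returns the parsed body and leaves field extraction (including the session token) to the caller.

```python
import json
import random
import urllib.parse
import urllib.request

WELLNUO_API_URL = "https://eluxnetworks.net/function/well-api/api"


def build_credentials_payload(user: str, password: str) -> dict:
    """Fields for the `credentials` call, as listed in this document."""
    return {
        "function": "credentials",
        "clientId": "001",
        "user_name": user,
        "ps": password,
        "nonce": str(random.randint(0, 999999)),
    }


def build_voice_ask_payload(user: str, token: str, question: str, deployment_id: str) -> dict:
    """Fields for the `voice_ask` call, as listed in this document."""
    return {
        "function": "voice_ask",
        "clientId": "001",
        "user_name": user,
        "token": token,
        "question": question,
        "deployment_id": deployment_id,
    }


def post_form(payload: dict, url: str = WELLNUO_API_URL, timeout: float = 10.0) -> dict:
    """POST the payload and parse the reply.

    Assumptions: form encoding on the way in, JSON on the way out.
    """
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(url, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating payload construction from the network call keeps the request shape easy to unit-test without reaching the WellNuo backend.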

## Troubleshooting

### Common Issues

1. **No audio playback on iOS**
   - Check audio session configuration
   - Ensure `expo-av` is properly configured
   - Test on real device (simulator has audio limitations)

2. **Microphone not working**
   - Verify permissions in Info.plist
   - Check if user granted permission
   - Real device required for full audio testing

3. **Agent not responding**
   - Check agent logs: `lk agent logs`
   - Verify LIVEKIT credentials
   - Check WellNuo API connectivity

4. **Connection fails**
   - Verify token is valid
   - Check network connectivity
   - Ensure LiveKit URL is correct
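
For the "verify token is valid" step, one quick check is whether the token has already expired. LiveKit access tokens are JWTs, so the claims can be read by base64-decoding the middle segment. The helper below is a debugging sketch with hypothetical function names; it deliberately does not verify the signature, so it must not be used for authorization decisions.

```python
import base64
import json
import time
from typing import Optional


def decode_jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the base64 padding JWTs strip
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Seconds left before the `exp` claim; a negative value means already expired."""
    claims = decode_jwt_claims(token)
    if "exp" not in claims:
        raise ValueError("token has no exp claim")
    current = time.time() if now is None else now
    return claims["exp"] - current
```

A negative result points at an expired token (or a skewed device clock) rather than a network problem.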

### Debugging

```bash
# View agent logs
lk agent logs

# View specific deployment logs
lk agent logs --version v20260119031418

# Check agent status
lk agent list
```

## Environment Variables

### Agent (.env.local)
```
LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret
WELLNUO_USER=your-wellnuo-username
WELLNUO_PASSWORD=your-wellnuo-password
DEPLOYMENT_ID=21
```
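
A missing variable tends to surface late, as a failed call in the middle of a conversation. One way to fail fast is to validate the environment at startup. The helper below is a hypothetical sketch (it is not taken from `agent.py`) and assumes the variables have already been loaded into the process environment.

```python
import os

# Variable names as listed in the .env.local example above.
REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "WELLNUO_USER",
    "WELLNUO_PASSWORD",
    "DEPLOYMENT_ID",
)


def load_agent_config(environ=None) -> dict:
    """Return the required settings, or raise RuntimeError naming every missing one."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dict as `environ` makes the check easy to exercise in tests without touching the real environment.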

### React Native (via WellNuo backend)
Token generation is handled server-side, so the LiveKit API secret never ships in the app.
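
The backend endpoint itself is not shown in this document. As a rough illustration of what the server does, a LiveKit access token is a JWT signed with the API secret using HS256. The claim layout below (`iss`, `sub`, `nbf`, `exp`, and a `video` grant with `room` and `roomJoin`) follows the publicly documented token format as best understood here, not the WellNuo code, and the function and parameter names are hypothetical. A production backend would more likely use an official LiveKit server SDK.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def mint_room_token(
    api_key: str,
    api_secret: str,
    room_name: str,
    identity: str,
    ttl_seconds: int = 600,
    now: Optional[float] = None,
) -> str:
    """Build an HS256-signed JWT that grants `identity` permission to join `room_name`."""
    issued = int(time.time() if now is None else now)
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,                # the API key identifies the issuer
        "sub": identity,               # participant identity
        "nbf": issued,
        "exp": issued + ttl_seconds,   # short-lived by default
        "video": {"room": room_name, "roomJoin": True},  # assumed grant layout
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(
        api_secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + _b64url(signature)
```

Because the signature depends on `LIVEKIT_API_SECRET`, this step has to stay on the server; the app only ever sees the resulting short-lived token.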

## Status

**Current State:** WIP - Not tested on real device

**Working:**
- Agent deploys to LiveKit Cloud
- Agent connects to rooms
- STT/TTS pipeline configured
- WellNuo API integration
- React Native UI

**Needs Testing:**
- Real device microphone capture
- Audio playback on physical iOS device
- Full conversation loop end-to-end
- Token refresh/expiration handling