Sergei 20d5f42114 Add Julia AI voice integration documentation

2026-01-18 20:17:30 -08:00

Julia AI Voice Integration

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    WellNuo Lite App (iOS)                       │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │  Voice Call Screen (app/voice-call.tsx)                  │   │
│  │  - useLiveKitRoom hook                                   │   │
│  │  - Audio session management                              │   │
│  │  - Microphone permission handling                        │   │
│  └───────────────────────┬─────────────────────────────────┘   │
│                          │ WebSocket + WebRTC                   │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                   LiveKit Cloud                                  │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │  SFU Server     │  │  Room Mgmt      │  │  Agent Hosting  │  │
│  │  (WebRTC)       │  │  (Token Auth)   │  │  (Python)       │  │
│  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
│           │                                          │          │
│           └──────────────────────────────────────────┘          │
│                          │ Audio Streams                        │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                  Julia AI Agent (Python)                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │ Deepgram    │  │ Deepgram    │  │ WellNuo voice_ask API   │  │
│  │ STT         │  │ TTS         │  │ (Custom LLM backend)    │  │
│  │ (Nova-2)    │  │ (Aura)      │  │                         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Components

1. React Native Client

Location: app/voice-call.tsx, hooks/useLiveKitRoom.ts

Dependencies:

@livekit/react-native - LiveKit React Native SDK
@livekit/react-native-webrtc - WebRTC for React Native
expo-av - Audio session management

Key Features:

Connects to LiveKit room with JWT token
Manages audio session (activates speaker mode)
Handles microphone permissions
Displays connection state and transcription

2. LiveKit Cloud

Project: live-kit-demo-70txlh6a

Agent ID: CA_Yd3qcuYEVKKE

Configuration:

Auto-scaling agent workers
Managed STT/TTS through inference endpoints
Built-in noise cancellation

Getting Tokens:

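// Request a room token from the WellNuo backend
const response = await fetch('/api/livekit/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ roomName, userName })
});
if (!response.ok) {
  throw new Error(`Token request failed: ${response.status}`);
}
const { token, url } = await response.json();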
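
For reference, the backend side of this exchange could look roughly like the sketch below. It is not the WellNuo implementation, which this document does not show; it only illustrates the general shape of a LiveKit access token: an HS256 JWT whose issuer is the API key, whose subject is the participant identity, and which carries a `video` grant for the room. `mint_room_token` and its parameters are hypothetical names, and a production backend would normally use LiveKit's official server SDK rather than hand-rolling the JWT.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as the JWT format requires."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def mint_room_token(api_key: str, api_secret: str, room_name: str,
                    user_name: str, ttl_seconds: int = 600,
                    now: Optional[int] = None) -> str:
    """Build an HS256 JWT that lets `user_name` join `room_name` (hypothetical helper)."""
    issued = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    claims = {
        "iss": api_key,                # the LiveKit API key identifies the issuer
        "sub": user_name,              # participant identity
        "nbf": issued,                 # not valid before this time
        "exp": issued + ttl_seconds,   # expiry; keep client tokens short-lived
        "video": {"room": room_name, "roomJoin": True},
    }
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(api_secret.encode("utf-8"),
                         signing_input.encode("ascii"),
                         hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"
```

The API secret is only ever used inside this function, which is why the token is minted on the server and only the finished token plus the LiveKit URL are returned to the app.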

3. Julia AI Agent (Python)

Location: julia-agent/julia-ai/src/agent.py

Stack:

LiveKit Agents SDK
Deepgram Nova-2 (STT)
Deepgram Aura Asteria (TTS - female voice)
Silero VAD (Voice Activity Detection)
Custom WellNuo LLM (voice_ask API)

Setup & Deployment

Prerequisites

LiveKit Cloud Account
- Sign up at https://cloud.livekit.io/
- Create a project
- Get API credentials

LiveKit CLI

# macOS
brew install livekit-cli

# Login
lk cloud auth

Agent Deployment

Navigate to agent directory:
```
cd julia-agent/julia-ai
```
Install dependencies:
```
uv sync
```

Configure environment:

cp .env.example .env.local
# Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET

Local development:
```
uv run python src/agent.py dev
```
Deploy to LiveKit Cloud:
```
lk agent deploy
```

React Native Setup

Install packages:

npm install @livekit/react-native @livekit/react-native-webrtc

iOS permissions (Info.plist):

<key>NSMicrophoneUsageDescription</key>
<string>WellNuo needs microphone access for voice calls with Julia AI</string>

Pod install:
```
cd ios && pod install
```

Flow Diagram

User opens Voice tab
        │
        ▼
Request microphone permission
        │
        ├─ Denied → Show error
        │
        ▼
Get LiveKit token from WellNuo API
        │
        ▼
Connect to LiveKit room
        │
        ▼
Agent joins automatically (LiveKit Cloud)
        │
        ▼
Agent sends greeting (TTS)
        │
        ▼
User speaks → STT → WellNuo API → Response → TTS
        │
        ▼
User ends call → Disconnect from room
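
The flow above can also be read as a small state machine. The sketch below is illustrative only: the state and event names are made up for this example and do not come from the app's code, but the transitions follow the diagram step by step, including the "Denied" branch.

```python
from enum import Enum, auto


class CallState(Enum):
    IDLE = auto()
    REQUESTING_PERMISSION = auto()
    PERMISSION_DENIED = auto()
    FETCHING_TOKEN = auto()
    CONNECTING = auto()
    WAITING_FOR_AGENT = auto()
    IN_CONVERSATION = auto()
    DISCONNECTED = auto()


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    (CallState.IDLE, "open_voice_tab"): CallState.REQUESTING_PERMISSION,
    (CallState.REQUESTING_PERMISSION, "permission_denied"): CallState.PERMISSION_DENIED,
    (CallState.REQUESTING_PERMISSION, "permission_granted"): CallState.FETCHING_TOKEN,
    (CallState.FETCHING_TOKEN, "token_received"): CallState.CONNECTING,
    (CallState.CONNECTING, "room_connected"): CallState.WAITING_FOR_AGENT,
    (CallState.WAITING_FOR_AGENT, "agent_greeted"): CallState.IN_CONVERSATION,
    # Each user turn (speech -> STT -> WellNuo API -> TTS) stays in the same state.
    (CallState.IN_CONVERSATION, "turn_completed"): CallState.IN_CONVERSATION,
    (CallState.IN_CONVERSATION, "user_hung_up"): CallState.DISCONNECTED,
}


def advance(state: CallState, event: str) -> CallState:
    """Return the next state, or raise if the event is not allowed here."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} is not valid in state {state.name}") from None


def run_call(events) -> CallState:
    """Replay a sequence of events starting from IDLE and return the final state."""
    state = CallState.IDLE
    for event in events:
        state = advance(state, event)
    return state
```

Making the transitions explicit like this is one way to see which failure paths the diagram does not cover yet, for example a failed token request or a dropped connection mid-call.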

API Integration

WellNuo voice_ask API

The agent uses WellNuo's voice_ask API to get contextual responses about the beneficiary.

Endpoint: https://eluxnetworks.net/function/well-api/api

Authentication:

import random

# Credentials request; the nonce is a random integer sent as a string
data = {
    "function": "credentials",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "ps": WELLNUO_PASSWORD,
    "nonce": str(random.randint(0, 999999)),
}

Voice Ask:

data = {
    "function": "voice_ask",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "token": token,
    "question": user_message,
    "deployment_id": DEPLOYMENT_ID,
}
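
Putting the two payloads together, a minimal client might look like the sketch below. Several details are assumptions, since this document does not specify them: that the endpoint accepts a form-encoded POST, that it answers with JSON, and the helper names themselves (`credentials_payload`, `voice_ask_payload`, `post_form`). The name of the response field that carries the token is not documented here either, so extracting it is left to the caller.

```python
import json
import random
import urllib.parse
import urllib.request

WELLNUO_API_URL = "https://eluxnetworks.net/function/well-api/api"
CLIENT_ID = "001"


def credentials_payload(user_name: str, password: str) -> dict:
    """Build the login request body shown under "Authentication"."""
    return {
        "function": "credentials",
        "clientId": CLIENT_ID,
        "user_name": user_name,
        "ps": password,
        "nonce": str(random.randint(0, 999999)),
    }


def voice_ask_payload(user_name: str, token: str, question: str,
                      deployment_id: str) -> dict:
    """Build the question request body shown under "Voice Ask"."""
    return {
        "function": "voice_ask",
        "clientId": CLIENT_ID,
        "user_name": user_name,
        "token": token,
        "question": question,
        "deployment_id": deployment_id,
    }


def post_form(payload: dict, timeout: float = 10.0) -> dict:
    """POST the payload form-encoded and parse the reply as JSON.

    Both the encoding and the JSON reply are assumptions about the API.
    """
    body = urllib.parse.urlencode(payload).encode("utf-8")
    request = urllib.request.Request(WELLNUO_API_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the payload builders separate from the network call makes them easy to unit-test without touching the real API, and the explicit timeout keeps a slow backend from stalling the voice turn indefinitely.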

Troubleshooting

Common Issues

No audio playback on iOS
- Check audio session configuration
- Ensure expo-av is properly configured
- Test on real device (simulator has audio limitations)

Microphone not working
- Verify permissions in Info.plist
- Check if user granted permission
- Real device required for full audio testing

Agent not responding
- Check agent logs: lk agent logs
- Verify LIVEKIT credentials
- Check WellNuo API connectivity

Connection fails
- Verify token is valid
- Check network connectivity
- Ensure LiveKit URL is correct
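
When checking whether a token is valid, it often helps to look at its claims directly. Because the room token is a JWT, its payload can be decoded without the secret. The sketch below does exactly that for debugging purposes; it does not verify the signature, and the function names are hypothetical.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    segment = parts[1]
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Return how long the token remains valid; negative means it has expired."""
    current = time.time() if now is None else now
    return jwt_claims(token)["exp"] - current
```

A negative result from `seconds_until_expiry` points at an expired token (or a clock mismatch between backend and device) rather than a network problem.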

Debugging

# View agent logs
lk agent logs

# View specific deployment logs
lk agent logs --version v20260119031418

# Check agent status
lk agent list

Environment Variables

Agent (.env.local)

LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret
WELLNUO_USER=your-wellnuo-username
WELLNUO_PASSWORD=your-wellnuo-password
DEPLOYMENT_ID=21
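
A missing variable tends to surface late, as an "Agent not responding" symptom, so it can be worth validating the configuration at startup. The following is a minimal sketch, assuming the variables have already been loaded into the process environment; `load_agent_config` is a hypothetical helper, not part of agent.py as documented here.

```python
import os

REQUIRED_VARS = (
    "LIVEKIT_URL",
    "LIVEKIT_API_KEY",
    "LIVEKIT_API_SECRET",
    "WELLNUO_USER",
    "WELLNUO_PASSWORD",
    "DEPLOYMENT_ID",
)


def load_agent_config(environ=None) -> dict:
    """Return the required settings, failing fast with a list of what is missing."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED_VARS}
    # LiveKit connections run over WebSocket, so the URL scheme should be ws or wss.
    if not config["LIVEKIT_URL"].startswith(("ws://", "wss://")):
        raise RuntimeError("LIVEKIT_URL must start with ws:// or wss://")
    return config
```

Passing a plain dictionary as `environ` makes the check easy to exercise in tests without touching the real environment.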

React Native (via WellNuo backend)

Token generation is handled server-side so that the LiveKit API key and secret never ship in the app.

Status

Current State: WIP, not yet tested on a real device

Working:

Agent deploys to LiveKit Cloud
Agent connects to rooms
STT/TTS pipeline configured
WellNuo API integration
React Native UI

Needs Testing:

Real device microphone capture
Audio playback on physical iOS device
Full conversation loop end-to-end
Token refresh/expiration handling
