Julia AI Voice Integration

Architecture Overview

┌─────────────────────────────────────────────────────────────────┐
│                    WellNuo Lite App (iOS)                       │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │  Voice Call Screen (app/voice-call.tsx)                  │   │
│  │  - useLiveKitRoom hook                                   │   │
│  │  - Audio session management                              │   │
│  │  - Microphone permission handling                        │   │
│  └───────────────────────┬─────────────────────────────────┘   │
│                          │ WebSocket + WebRTC                   │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                   LiveKit Cloud                                  │
│  ┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐  │
│  │  SFU Server     │  │  Room Mgmt      │  │  Agent Hosting  │  │
│  │  (WebRTC)       │  │  (Token Auth)   │  │  (Python)       │  │
│  └────────┬────────┘  └─────────────────┘  └────────┬────────┘  │
│           │                                          │          │
│           └──────────────────────────────────────────┘          │
│                          │ Audio Streams                        │
└──────────────────────────┼──────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                  Julia AI Agent (Python)                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │ Deepgram    │  │ Deepgram    │  │ WellNuo voice_ask API   │  │
│  │ STT         │  │ TTS         │  │ (Custom LLM backend)    │  │
│  │ (Nova-2)    │  │ (Aura)      │  │                         │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Components

1. React Native Client

Location: app/voice-call.tsx, hooks/useLiveKitRoom.ts

Dependencies:

  • @livekit/react-native - LiveKit React Native SDK
  • @livekit/react-native-webrtc - WebRTC for React Native
  • expo-av - Audio session management

Key Features:

  • Connects to LiveKit room with JWT token
  • Manages audio session (activates speaker mode)
  • Handles microphone permissions
  • Displays connection state and transcription

2. LiveKit Cloud

Project: live-kit-demo-70txlh6a
Agent ID: CA_Yd3qcuYEVKKE

Configuration:

  • Auto-scaling agent workers
  • Managed STT/TTS through inference endpoints
  • Built-in noise cancellation

Getting Tokens:

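// Request a room token from the WellNuo backend
const response = await fetch('/api/livekit/token', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ roomName, userName }),
});
if (!response.ok) {
  throw new Error(`Token request failed: ${response.status}`);
}
const { token, url } = await response.json();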

3. Julia AI Agent (Python)

Location: julia-agent/julia-ai/src/agent.py

Stack:

  • LiveKit Agents SDK
  • Deepgram Nova-2 (STT)
  • Deepgram Aura Asteria (TTS - female voice)
  • Silero VAD (Voice Activity Detection)
  • Custom WellNuo LLM (voice_ask API)

Setup & Deployment

Prerequisites

  1. LiveKit Cloud Account

  2. LiveKit CLI

    # macOS
    brew install livekit-cli
    
    # Login
    lk cloud auth
    

Agent Deployment

  1. Navigate to agent directory:

    cd julia-agent/julia-ai
    
  2. Install dependencies:

    uv sync
    
  3. Configure environment:

    cp .env.example .env.local
    # Add LIVEKIT_URL, LIVEKIT_API_KEY, LIVEKIT_API_SECRET,
    # plus WELLNUO_USER, WELLNUO_PASSWORD, DEPLOYMENT_ID (see Environment Variables)
    
  4. Local development:

    uv run python src/agent.py dev
    
  5. Deploy to LiveKit Cloud:

    lk agent deploy
    

React Native Setup

  1. Install packages:

    npm install @livekit/react-native @livekit/react-native-webrtc
    
  2. iOS permissions (Info.plist):

    <key>NSMicrophoneUsageDescription</key>
    <string>WellNuo needs microphone access for voice calls with Julia AI</string>
    
  3. Pod install:

    cd ios && pod install
    

Flow Diagram

User opens Voice tab
        │
        ▼
Request microphone permission
        │
        ├─ Denied → Show error
        │
        ▼
Get LiveKit token from WellNuo API
        │
        ▼
Connect to LiveKit room
        │
        ▼
Agent joins automatically (LiveKit Cloud)
        │
        ▼
Agent sends greeting (TTS)
        │
        ▼
User speaks → STT → WellNuo API → Response → TTS
        │
        ▼
User ends call → Disconnect from room
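
The "User speaks → STT → WellNuo API → Response → TTS" step can be sketched as a single function. This is only an illustration of the data flow. In the deployed agent the LiveKit Agents SDK drives this loop, and `run_turn` with its three callables is a hypothetical name, not part of the agent code.

```python
from typing import Callable


def run_turn(audio: bytes,
             transcribe: Callable[[bytes], str],
             ask: Callable[[str], str],
             synthesize: Callable[[str], bytes]) -> bytes:
    """One conversation turn: user audio -> text -> answer text -> reply audio."""
    question = transcribe(audio).strip()
    if not question:
        # Nothing was recognised, so skip the API call and stay silent.
        return b""
    answer = ask(question)
    return synthesize(answer)


# Usage with stand-in stages (real ones would be Deepgram STT/TTS and voice_ask):
reply = run_turn(
    b"raw-audio",
    transcribe=lambda audio: "How did she sleep?",
    ask=lambda question: f"You asked: {question}",
    synthesize=lambda text: text.encode("utf-8"),
)
```

Passing the stages in as callables keeps each one replaceable in tests without touching the network.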

API Integration

WellNuo voice_ask API

The agent uses WellNuo's voice_ask API to get contextual responses about the beneficiary.

Endpoint: https://eluxnetworks.net/function/well-api/api

Authentication:

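import os
import random

# Credentials come from the agent's environment (see Environment Variables)
WELLNUO_USER = os.environ["WELLNUO_USER"]
WELLNUO_PASSWORD = os.environ["WELLNUO_PASSWORD"]

# Login payload; the nonce is a random number sent as a string
data = {
    "function": "credentials",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "ps": WELLNUO_PASSWORD,
    "nonce": str(random.randint(0, 999999)),
}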

Voice Ask:

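# token is obtained from the credentials call above;
# user_message is the text transcribed from the user's speech
data = {
    "function": "voice_ask",
    "clientId": "001",
    "user_name": WELLNUO_USER,
    "token": token,
    "question": user_message,
    "deployment_id": DEPLOYMENT_ID,
}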
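
The two payloads above can be chained into one round trip: authenticate, then ask. The sketch below makes several assumptions this document does not confirm: that the endpoint accepts form-encoded POST bodies, that it replies with JSON, and that the login reply carries the token in a field named `access_token`. `post_form` and `ask_julia` are hypothetical helper names.

```python
import json
import random
import urllib.parse
import urllib.request
from typing import Callable, Dict

API_URL = "https://eluxnetworks.net/function/well-api/api"


def post_form(data: Dict[str, str]) -> dict:
    """POST form-encoded fields and parse the reply as JSON (ASSUMED wire format)."""
    body = urllib.parse.urlencode(data).encode("utf-8")
    request = urllib.request.Request(API_URL, data=body, method="POST")
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def ask_julia(question: str, user: str, password: str, deployment_id: str,
              post: Callable[[Dict[str, str]], dict] = post_form) -> dict:
    """Log in, then send one voice_ask question; returns the raw reply."""
    login = post({
        "function": "credentials",
        "clientId": "001",
        "user_name": user,
        "ps": password,
        "nonce": str(random.randint(0, 999999)),
    })
    token = login["access_token"]  # ASSUMPTION: field name is not given in this document
    return post({
        "function": "voice_ask",
        "clientId": "001",
        "user_name": user,
        "token": token,
        "question": question,
        "deployment_id": deployment_id,
    })
```

The reply is returned unparsed because its shape is not documented here. The `post` parameter allows the flow to be exercised with a fake transport before pointing it at the real endpoint.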

Troubleshooting

Common Issues

  1. No audio playback on iOS

    • Check audio session configuration
    • Ensure expo-av is properly configured
    • Test on real device (simulator has audio limitations)
  2. Microphone not working

    • Verify permissions in Info.plist
    • Check if user granted permission
    • Real device required for full audio testing
  3. Agent not responding

    • Check agent logs: lk agent logs
    • Verify LIVEKIT credentials
    • Check WellNuo API connectivity
  4. Connection fails

    • Verify token is valid
    • Check network connectivity
    • Ensure LiveKit URL is correct
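
For the "Verify token is valid" check, one quick test is whether the token has already expired. LiveKit tokens are JWTs, so the payload can be read without the secret. The sketch below decodes the claims without verifying the signature, which is suitable for debugging only. The helper names are hypothetical.

```python
import base64
import json
import time
from typing import Optional


def decode_claims(token: str) -> dict:
    """Return the JWT payload WITHOUT verifying the signature (debugging only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated parts")
    payload = parts[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_until_expiry(token: str, now: Optional[float] = None) -> float:
    """Seconds left before the token's exp claim; negative means already expired."""
    current = time.time() if now is None else now
    return decode_claims(token)["exp"] - current
```

A negative result points at an expired token or clock skew between the backend and the device rather than a network problem.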

Debugging

# View agent logs
lk agent logs

# View specific deployment logs
lk agent logs --version v20260119031418

# Check agent status
lk agent list

Environment Variables

Agent (.env.local)

LIVEKIT_URL=wss://live-kit-demo-70txlh6a.livekit.cloud
LIVEKIT_API_KEY=your-api-key
LIVEKIT_API_SECRET=your-api-secret
WELLNUO_USER=your-wellnuo-username
WELLNUO_PASSWORD=your-wellnuo-password
DEPLOYMENT_ID=21
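
A missing variable tends to surface later as an opaque connection or authentication error, so it can help to validate the environment at startup. The following is a minimal sketch, assuming the agent reads exactly the six variables listed above. `AgentSettings` and `load_settings` are hypothetical names and do not come from agent.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = (
    "LIVEKIT_URL", "LIVEKIT_API_KEY", "LIVEKIT_API_SECRET",
    "WELLNUO_USER", "WELLNUO_PASSWORD", "DEPLOYMENT_ID",
)


@dataclass(frozen=True)
class AgentSettings:
    livekit_url: str
    livekit_api_key: str
    livekit_api_secret: str
    wellnuo_user: str
    wellnuo_password: str
    deployment_id: str


def load_settings(env: Mapping[str, str] = os.environ) -> AgentSettings:
    """Collect the required variables, failing fast with the names that are missing."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Field names are the lower-cased variable names.
    return AgentSettings(**{name.lower(): env[name] for name in REQUIRED})
```

Note that this reads the process environment only; loading `.env.local` into the environment is left to whatever the agent already uses for that.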

React Native (via WellNuo backend)

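The app needs no LiveKit credentials of its own. Token generation is handled server-side by the WellNuo backend, so the API secret never ships in the client.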

Status

Current State: WIP - Not tested on real device

Working:

  • Agent deploys to LiveKit Cloud
  • Agent connects to rooms
  • STT/TTS pipeline configured
  • WellNuo API integration
  • React Native UI

Needs Testing:

  • Real device microphone capture
  • Audio playback on physical iOS device
  • Full conversation loop end-to-end
  • Token refresh/expiration handling