16 Commits

Author SHA1 Message Date
9f12830850 Improve STT quality and add session/chat management
- Switch Android STT from on-device to cloud recognition for better accuracy
- Add lastMessageWasVoiceRef to prevent TTS for text-typed messages
- Stop voice session and clear chat when changing Deployment or Voice API
- Ensures clean state when switching between beneficiaries/models
2026-01-29 18:29:00 -08:00
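The TTS gating and state reset described in 9f12830850 could look roughly like the sketch below. Only the `lastMessageWasVoice` idea comes from the commit message; `ChatState`, `addUserMessage`, `shouldSpeakReply` and `onSettingsChanged` are hypothetical names used for illustration, and the real code presumably keeps this state in React refs and context rather than in a plain object.

```typescript
// Hypothetical sketch; only the lastMessageWasVoice flag is taken from the commit.
type InputSource = "voice" | "text";

interface ChatState {
  messages: string[];
  sessionActive: boolean;
  lastMessageWasVoice: boolean; // mirrors the lastMessageWasVoiceRef idea
}

// Record a user message and remember how it was entered.
function addUserMessage(state: ChatState, text: string, source: InputSource): ChatState {
  return {
    ...state,
    messages: [...state.messages, text],
    lastMessageWasVoice: source === "voice",
  };
}

// Speak the reply only if the message that triggered it was spoken.
function shouldSpeakReply(state: ChatState): boolean {
  return state.lastMessageWasVoice;
}

// Changing Deployment or Voice API stops the session and clears the chat.
function onSettingsChanged(): ChatState {
  return { messages: [], sessionActive: false, lastMessageWasVoice: false };
}
```

Keeping the flag next to the message list makes it easy to reset everything in one place when the deployment or voice API changes.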
a1ff324a5a Increase Android STT silence timeout from 2s to 4s
Fix premature speech cutoff during natural pauses:
- EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: 4000ms (was 2000ms)
- EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS: 3000ms (was 1500ms)

This allows users to pause between sentences without being cut off.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-29 16:34:53 -08:00
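As an illustration of a1ff324a5a, the new timeouts could be assembled into the recognizer start options as sketched here. The two key names and values are from the commit message. The `SttStartOptions` shape, the `androidIntentOptions` field name, the `lang` default and `buildStartOptions` are assumptions made for this example, not a confirmed copy of the project's code.

```typescript
type Platform = "android" | "ios";

// Assumed shape of the options passed when starting recognition.
interface SttStartOptions {
  lang: string;
  interimResults: boolean;
  androidIntentOptions?: Record<string, number>;
}

// Timeout values from the commit message.
const ANDROID_SILENCE_OPTIONS: Record<string, number> = {
  EXTRA_SPEECH_INPUT_COMPLETE_SILENCE_LENGTH_MILLIS: 4000, // was 2000
  EXTRA_SPEECH_INPUT_POSSIBLY_COMPLETE_SILENCE_LENGTH_MILLIS: 3000, // was 1500
};

// Attach the Android-only intent extras; other platforms get the base options.
function buildStartOptions(platform: Platform): SttStartOptions {
  const base: SttStartOptions = { lang: "en-US", interimResults: true };
  return platform === "android"
    ? { ...base, androidIntentOptions: { ...ANDROID_SILENCE_OPTIONS } }
    : base;
}
```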
d6353c8533 2026-01-29: Stable version with voice debug and iOS STT fix
Added:
- Voice Debug tab - real-time logs for STT/API/TTS/Timer
- iOS STT fix - send the last partial result as final on onEnd
- iOS auto-stop - automatically stop STT after 2s of silence
- Voice API selector in Profile (voice_ask / ask_wellnuo_ai)

Fixed:
- iOS never sent isFinal:true - it is now sent via onEnd
- STT did not stop after silence - added an auto-stop timer
- Profile Voice API selector restored after rollback

Known issues:
- TypeScript errors (setTimeout type) - not critical
- updateVoiceApiType is missing from VoiceContext - needs to be added

Stable version for testing on iPhone.
2026-01-28 19:45:40 -08:00
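The iOS workaround in d6353c8533 (promote the last partial result to a final one when recognition ends, and stop after 2s of silence) might be structured as below. `PartialToFinalBridge` and `shouldAutoStop` are hypothetical names. The sketch passes timestamps in explicitly instead of using a real timer so that it stays deterministic.

```typescript
type FinalHandler = (transcript: string) => void;

// Remembers the latest partial transcript and emits it as final on end
// if the recognizer never delivered a real final result.
class PartialToFinalBridge {
  private lastPartial = "";
  private finalSent = false;
  private readonly onFinal: FinalHandler;

  constructor(onFinal: FinalHandler) {
    this.onFinal = onFinal;
  }

  handleResult(transcript: string, isFinal: boolean): void {
    if (isFinal) {
      this.finalSent = true;
      this.lastPartial = "";
      this.onFinal(transcript);
    } else {
      this.lastPartial = transcript;
    }
  }

  // Call from the recognizer's end event.
  handleEnd(): void {
    if (!this.finalSent && this.lastPartial.trim() !== "") {
      this.onFinal(this.lastPartial);
    }
    this.lastPartial = "";
    this.finalSent = false;
  }
}

// Auto-stop check: true once no result has arrived for silenceMs (2s in the commit).
function shouldAutoStop(lastResultAtMs: number, nowMs: number, silenceMs = 2000): boolean {
  return nowMs - lastResultAtMs >= silenceMs;
}
```

If a real timer drives the auto-stop, typing its handle as `ReturnType<typeof setTimeout>` is one common way to avoid the kind of `setTimeout` type error listed under known issues.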
05f872d067 fix: voice session improvements - FAB stop, echo prevention, chat TTS
- FAB button now correctly stops session during speaking/processing states
- Echo prevention: STT stopped during TTS playback, results ignored during speaking
- Chat TTS only speaks when voice session is active (no auto-speak for text chat)
- Session stop now aborts in-flight API requests and prevents race conditions
- STT restarts after TTS with 800ms delay for audio focus release
- Pending interrupt transcript processed after TTS completion
- ChatContext added for message persistence across tab navigation
- VoiceFAB redesigned with state-based animations
- console.error replaced with console.warn across voice pipeline
- no-speech STT errors silenced (normal silence behavior)
2026-01-27 22:59:55 -08:00
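A minimal sketch of the session rules listed in 05f872d067: STT results are accepted only while listening, and stopping the session aborts any in-flight request. `VoiceSession`, `VoiceStatus` and the method names are hypothetical. The 800ms value comes from the commit message. `AbortController` is the standard web/Node API.

```typescript
type VoiceStatus = "idle" | "listening" | "processing" | "speaking";

// Delay before restarting STT after TTS, from the commit message.
const STT_RESTART_DELAY_MS = 800;

class VoiceSession {
  status: VoiceStatus = "idle";
  private controller: AbortController | null = null;

  start(): void {
    this.status = "listening";
  }

  // Begin an API request; the returned signal is aborted if the session stops.
  beginRequest(): AbortSignal {
    this.controller = new AbortController();
    this.status = "processing";
    return this.controller.signal;
  }

  beginSpeaking(): void {
    this.status = "speaking";
  }

  // Echo prevention: ignore recognizer output unless we are listening.
  acceptsSttResult(): boolean {
    return this.status === "listening";
  }

  // Stop from any state (the FAB may be pressed during speaking or processing).
  stop(): void {
    if (this.controller !== null) {
      this.controller.abort();
      this.controller = null;
    }
    this.status = "idle";
  }
}

// When STT may be restarted after TTS finished at ttsFinishedAtMs.
function sttRestartTime(ttsFinishedAtMs: number): number {
  return ttsFinishedAtMs + STT_RESTART_DELAY_MS;
}
```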
f2803ca5db fix(stt): graceful degradation for Expo Go
Handle missing native module @jamsch/expo-speech-recognition gracefully.
In Expo Go the native module is not available, which was causing the entire
_layout.tsx to fail to export, breaking tab navigation.

- Use dynamic require() with try/catch instead of static import
- Initialize ExpoSpeechRecognitionModule and useSpeechRecognitionEvent as no-ops
- Check module availability before calling any native methods
- isAvailable state properly reflects module presence

Tab navigation now works in Expo Go (with STT disabled).
Full STT functionality requires a development build.
2026-01-27 17:03:56 -08:00
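The graceful-degradation pattern from f2803ca5db can be sketched as follows. The loader is injected so the example runs without the native package. In the app it would presumably be a thunk wrapping `require("@jamsch/expo-speech-recognition")`. `SpeechModule`, `noopModule` and `loadSpeechModule` are hypothetical names.

```typescript
// Assumed minimal surface of the native module used by the app.
interface SpeechModule {
  start(options: { lang: string }): void;
  stop(): void;
}

// Fallback used when the native module is missing (for example in Expo Go).
const noopModule: SpeechModule = {
  start() {},
  stop() {},
};

interface LoadedSpeechModule {
  module: SpeechModule;
  isAvailable: boolean;
}

// Try to load the module; on failure fall back to no-ops instead of throwing,
// so the file that imports this still exports normally.
function loadSpeechModule(loader: () => SpeechModule): LoadedSpeechModule {
  try {
    return { module: loader(), isAvailable: true };
  } catch {
    return { module: noopModule, isAvailable: false };
  }
}
```

Callers can then check `isAvailable` before starting recognition and hide or disable voice controls when it is false.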
3c7a48df5b Integrate TTS interruption in VoiceFAB when voice detected
- Add onVoiceDetected callback to useSpeechRecognition hook
  - Triggered on first interim result (voice activity detected)
  - Uses voiceDetectedRef to ensure callback fires only once per session
  - Reset flag on session start/end

- Connect STT to VoiceContext in _layout.tsx
  - Use useSpeechRecognition with onVoiceDetected callback
  - Call interruptIfSpeaking() when voice detected during 'speaking' state
  - Forward STT results to VoiceContext (setTranscript, sendTranscript)
  - Start/stop STT based on isListening state

- Export interruptIfSpeaking from VoiceContext provider

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-27 16:34:07 -08:00
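The once-per-session behaviour of `onVoiceDetected` in 3c7a48df5b could be implemented along these lines. `VoiceDetector` is a hypothetical class standing in for the `voiceDetectedRef` logic inside the hook. The callback would typically call `interruptIfSpeaking()`.

```typescript
// Fires onVoiceDetected on the first interim result of each session only.
class VoiceDetector {
  private detected = false;
  private readonly onVoiceDetected: () => void;

  constructor(onVoiceDetected: () => void) {
    this.onVoiceDetected = onVoiceDetected;
  }

  // Reset the flag when a session starts.
  startSession(): void {
    this.detected = false;
  }

  // Call for every interim result; only the first one triggers the callback.
  handleInterimResult(): void {
    if (this.detected) return;
    this.detected = true;
    this.onVoiceDetected();
  }

  // Reset the flag when a session ends.
  endSession(): void {
    this.detected = false;
  }
}
```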
62eb7c4de0 Add useTextToSpeech hook for TTS operations
Create a reusable hook wrapping expo-speech that provides:
- speak/stop controls
- isSpeaking state tracking
- Voice listing support
- Promise-based API for async flows
- Proper cleanup on unmount

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-27 16:20:51 -08:00
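One way to get the Promise-based API mentioned in 62eb7c4de0 is to wrap a callback-style speech engine, as sketched below. `SpeechEngine`, `SpeakCallbacks` and `speakAsync` are hypothetical. The callback names are modelled on the done/stopped/error style that callback-based TTS libraries commonly expose, and are not copied from the hook itself.

```typescript
interface SpeakCallbacks {
  onDone: () => void;
  onStopped: () => void;
  onError: (error: Error) => void;
}

// Assumed minimal engine interface; the real hook wraps expo-speech.
interface SpeechEngine {
  speak(text: string, callbacks: SpeakCallbacks): void;
  stop(): void;
}

// Resolve when playback finishes or is stopped; reject on engine errors.
function speakAsync(engine: SpeechEngine, text: string): Promise<"done" | "stopped"> {
  return new Promise((resolve, reject) => {
    engine.speak(text, {
      onDone: () => resolve("done"),
      onStopped: () => resolve("stopped"),
      onError: (error) => reject(error),
    });
  });
}
```

Resolving with `"stopped"` rather than rejecting lets callers treat a user interruption as a normal outcome in `async` flows.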
54bff8d9d5 Add useSpeechRecognition hook for voice input
Implements speech recognition hook wrapping @jamsch/expo-speech-recognition:
- startListening/stopListening/abortListening controls
- Real-time transcript updates with interim results
- Permission handling with user feedback
- Platform-specific options (Android silence timeout, iOS punctuation)
- Error handling with graceful degradation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-27 16:19:31 -08:00
260a722cd9 Remove unused useLiveKitRoom hook
This LiveKit hook was no longer used after switching to speech recognition.
Also removed the outdated comment referencing it in _layout.tsx.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-27 16:08:41 -08:00
Sergei
f2e633df99 Fix audio playback: add room.startAudio() call
Root cause: Audio from remote participant (Julia AI) was not playing
because room.startAudio() was never called after connecting.

This is REQUIRED by LiveKit WebRTC to enable audio playback.
The fix matches the working implementation in debug.tsx (Robert version).

Changes:
- Add room.startAudio() call after room.connect()
- Add canPlayAudio state tracking
- Add proper error handling for startAudio

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-25 18:03:56 -08:00
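The fix in f2e633df99 amounts to calling `startAudio()` after `connect()` and tracking whether it succeeded. A hedged sketch of that flow is shown here against a minimal `RoomLike` interface, which is an assumption introduced so the example runs without the LiveKit SDK. `connectWithAudio` and `AudioState` are likewise hypothetical.

```typescript
// Minimal stand-in for the parts of a LiveKit room used here.
interface RoomLike {
  connect(url: string, token: string): Promise<void>;
  startAudio(): Promise<void>;
}

interface AudioState {
  connected: boolean;
  canPlayAudio: boolean;
  error: string | null;
}

// Connect, then explicitly enable audio playback and record the outcome.
async function connectWithAudio(room: RoomLike, url: string, token: string): Promise<AudioState> {
  await room.connect(url, token);
  try {
    await room.startAudio();
    return { connected: true, canPlayAudio: true, error: null };
  } catch (e) {
    // Stay connected but report that remote audio will not play.
    const message = e instanceof Error ? e.message : String(e);
    return { connected: true, canPlayAudio: false, error: message };
  }
}
```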
Sergei
d9fff44fc9 Remove unused expo-speech packages to avoid AudioSession conflicts
- Remove expo-speech (TTS) - not used
- Remove expo-speech-recognition (STT) - not used
- Delete dead code: hooks/useSpeechRecognition.ts

These packages add native audio modules that can conflict with
LiveKit's AudioSession management on iOS.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-22 09:41:35 -08:00
Sergei
906213e620 Add beneficiary_names_dict support for voice assistant
- Voice agent now extracts deploymentId and beneficiaryNamesDict from
  participant metadata passed via LiveKit token
- WellNuoLLM class accepts dynamic deployment_id and beneficiary_names_dict
- API calls now include personalized beneficiary names for better responses
- Text chat already has this functionality (verified)
- Updated LiveKit agent deployed to cloud

Also includes:
- Speaker toggle button in voice call UI
- Keyboard controller integration for chat
- Various UI improvements
2026-01-20 14:41:33 -08:00
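For 906213e620, reading `deploymentId` and `beneficiaryNamesDict` out of participant metadata might look like the sketch below. The commit's agent is written in Python. TypeScript is used here only to match the client code, and the JSON layout, `VoiceMetadata` and `parseParticipantMetadata` are assumptions.

```typescript
interface VoiceMetadata {
  deploymentId: string | null;
  beneficiaryNamesDict: Record<string, string>;
}

// Parse metadata defensively: missing or malformed input yields empty values.
function parseParticipantMetadata(raw: string | undefined): VoiceMetadata {
  const empty: VoiceMetadata = { deploymentId: null, beneficiaryNamesDict: {} };
  if (!raw) return empty;
  try {
    const data: unknown = JSON.parse(raw);
    if (typeof data !== "object" || data === null) return empty;
    const record = data as Record<string, unknown>;
    const id = record["deploymentId"];
    const names = record["beneficiaryNamesDict"];
    const dict: Record<string, string> = {};
    if (typeof names === "object" && names !== null) {
      const namesRecord = names as Record<string, unknown>;
      for (const key of Object.keys(namesRecord)) {
        const value = namesRecord[key];
        if (typeof value === "string") dict[key] = value;
      }
    }
    return {
      deploymentId: typeof id === "string" || typeof id === "number" ? String(id) : null,
      beneficiaryNamesDict: dict,
    };
  } catch {
    return empty;
  }
}
```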
Sergei
e3192ead12 Voice call improvements: single call limit, hide debug tab, remove speaker toggle
Changes:
- Add CallManager singleton to ensure only 1 call per device at a time
- Hide Debug tab from production (href: null)
- Remove speaker/earpiece toggle button (always use speaker)
- Agent uses voice_ask API (fast ~1 sec latency)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-01-19 23:55:27 -08:00
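The single-call limit from e3192ead12 can be illustrated with a small singleton. Only the name `CallManager` comes from the commit message. The methods `tryStartCall`, `endCall` and `hasActiveCall` are hypothetical.

```typescript
// Singleton that allows at most one active call per device.
class CallManager {
  private static instance: CallManager | null = null;
  private activeCallId: string | null = null;

  private constructor() {}

  static getInstance(): CallManager {
    if (CallManager.instance === null) {
      CallManager.instance = new CallManager();
    }
    return CallManager.instance;
  }

  // Returns false if another call is already active.
  tryStartCall(callId: string): boolean {
    if (this.activeCallId !== null) return false;
    this.activeCallId = callId;
    return true;
  }

  // Only the call that holds the slot can release it.
  endCall(callId: string): void {
    if (this.activeCallId === callId) {
      this.activeCallId = null;
    }
  }

  hasActiveCall(): boolean {
    return this.activeCallId !== null;
  }
}
```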
Sergei
059bc29b6b WIP: LiveKit voice call integration with Julia AI agent
NOT TESTED ON REAL DEVICE - simulator only verification

Components:
- LiveKit Cloud agent deployment (julia-agent/julia-ai/)
- React Native LiveKit client (hooks/useLiveKitRoom.ts)
- Voice call screen with audio session management
- WellNuo voice_ask API integration in Python agent

Tech stack:
- LiveKit Cloud for agent hosting
- @livekit/react-native SDK
- Deepgram STT/TTS (via LiveKit Cloud)
- Silero VAD for voice activity detection

Known issues:
- Microphone permissions may need manual testing on real device
- LiveKit audio playback not verified on physical hardware
- Agent greeting audio not confirmed working end-to-end

Next steps:
- Test on physical iOS device
- Verify microphone capture works
- Confirm TTS audio playback
- Test full conversation loop
2026-01-18 20:16:25 -08:00
Sergei
b2639dd540 Add Sherpa TTS voice synthesis system
Core TTS infrastructure:
- sherpaTTS.ts: Sherpa ONNX integration for offline TTS
- TTSErrorBoundary.tsx: Error boundary for TTS failures
- ErrorBoundary.tsx: Generic error boundary component
- VoiceIndicator.tsx: Visual indicator for voice activity
- useSpeechRecognition.ts: Speech-to-text hook
- DebugLogger.ts: Debug logging utility

Features:
- Offline voice synthesis (no internet needed)
- Multiple voices support
- Real-time voice activity indication
- Error recovery and fallback
- Debug logging for troubleshooting

Tech stack:
- Sherpa ONNX runtime
- React Native Audio
- Expo modules
2026-01-14 19:09:27 -08:00
Sergei
8bc9649146 WellNuo Lite v1.0.0 - simplified version for App Store review
- Removed voice input features
- Simplified profile page (only legal links and logout)
- Chat with AI context working
- Auto-select first beneficiary
- Dashboard WebView intact

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-12-24 17:13:13 -08:00