# provider-elevenlabs/alignment (ElevenLabs Forced Alignment) Generate time-aligned subtitles (SRT/VTT) from audio and transcripts using ElevenLabs forced alignment. ## Quick Start ```bash npx promptfoo@latest init --example provider-elevenlabs/alignment cd provider-elevenlabs/alignment export ELEVENLABS_API_KEY=your_api_key_here npx promptfoo@latest eval ``` ## What this tests - **Subtitle generation**: Create SRT and VTT subtitle files - **Word-level alignment**: Precise timestamp data for each word - **Multiple formats**: JSON (raw data), SRT (video players), VTT (web players) - **Accuracy**: Verify alignment matches audio timing ## How it works Forced alignment takes two inputs: 1. **Audio file**: Speech recording (MP3, WAV, etc.) 2. **Transcript**: Text of what was spoken It returns precise timestamps showing when each word was spoken, formatted as subtitles. ## Use Cases - **Video subtitles**: Generate SRT files for video editing software - **Web captions**: Create VTT files for HTML5 video players - **Karaoke apps**: Word-level timing for synchronized highlighting - **Accessibility**: Auto-generate captions for spoken content - **Translation sync**: Time-align translations to original audio ## Output Formats ### JSON (Raw alignment data) ```json { "alignment": [ { "char": "T", "start": 0.0, "end": 0.1 }, { "char": "h", "start": 0.1, "end": 0.15 } ], "characters": "That's one small step..." } ``` ### SRT (Standard video subtitles) ```text 1 00:00:00,000 --> 00:00:02,500 That's one small step for man 2 00:00:02,500 --> 00:00:05,000 one giant leap for mankind ``` ### VTT (WebVTT for web players) ```text WEBVTT 1 00:00:00.000 --> 00:00:02.500 That's one small step for man 2 00:00:02.500 --> 00:00:05.000 one giant leap for mankind ``` ## Configuration ### Basic alignment (JSON output) ```yaml providers: - id: elevenlabs:alignment:json label: Alignment (JSON) tests: - vars: audioFile: path/to/audio.mp3 transcript: 'Your transcript text here' format: json ``` ### SRT subtitles ```yaml providers: - id: elevenlabs:alignment:srt label: Alignment (SRT Subtitles) tests: - vars: audioFile: path/to/audio.mp3 transcript: 'Your transcript text here' format: srt ``` ### VTT subtitles ```yaml providers: - id: elevenlabs:alignment:vtt label: Alignment (VTT Subtitles) tests: - vars: audioFile: path/to/audio.mp3 transcript: 'Your transcript text here' format: vtt ``` ## Testing Assertions ```yaml tests: # Verify alignment succeeds - assert: - type: javascript value: output.includes('words') # JSON format - type: not-contains value: error # Verify SRT format - assert: - type: javascript value: output.includes('-->') && output.includes('small step') ``` ## Best Practices 1. **Transcript accuracy**: Ensure transcript exactly matches spoken audio 2. **Include punctuation**: Better subtitle chunking and timing 3. **Audio quality**: Clear audio produces more accurate timestamps 4. **Format selection**: - Use SRT for video editing (Premiere, Final Cut, DaVinci) - Use VTT for web players (HTML5 `