
How We Achieved 116ms Response Time: AI Performance Benchmark

November 29, 2025

Speed matters in real-time interview assistance. At 500ms, answers feel delayed. At 200ms, they feel fast. At 116ms to first token, they feel instant: answers begin to appear before you've finished processing the question yourself. Here's how we achieved it.

Pipeline Optimization Strategies

1. Edge-First Audio Processing

Audio processing runs locally on your machine, not in the cloud, which eliminates ~80ms of network round-trip time. We use voice activity detection (VAD) and speech-to-text (STT) models compiled to WebAssembly (WASM) for near-native performance.
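The post does not describe the VAD itself. As a rough illustration of what a voice activity detector decides on-device, here is a minimal energy-threshold sketch. The function names, the 10 ms frame size, and the threshold are all assumptions for illustration; production VADs are usually model-based rather than a simple energy gate.

```python
import math
from typing import List, Sequence


def frame_rms(frame: Sequence[float]) -> float:
    """Root-mean-square energy of one frame (samples assumed in [-1.0, 1.0])."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(samples: Sequence[float], frame_size: int = 160,
                  threshold: float = 0.02) -> List[bool]:
    """Return one speech/non-speech flag per full frame.

    frame_size=160 corresponds to 10 ms at 16 kHz; both defaults are
    illustrative assumptions, not values from the original post.
    """
    flags = []
    # Trailing samples that do not fill a whole frame are ignored.
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        flags.append(frame_rms(frame) >= threshold)
    return flags
```

Only frames flagged as speech would need to be passed on to the STT stage, which keeps silent audio out of the rest of the pipeline.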

2. Streaming Everything

Nothing waits for full completion. Audio streams to the STT model, STT output streams to the LLM, and LLM output streams to the overlay. Each component processes chunks as they arrive rather than waiting for buffered batches.
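The chunk-by-chunk hand-off can be sketched with Python generators, where each stage pulls one item from the previous stage and emits its own output immediately. The stage functions here (`audio_chunks`, `fake_stt`, `fake_llm`) are hypothetical stand-ins, not the real components.

```python
from typing import Iterable, Iterator, List


def audio_chunks(samples: List[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield fixed-size audio chunks one at a time."""
    for i in range(0, len(samples), chunk_size):
        yield samples[i:i + chunk_size]


def fake_stt(chunks: Iterable[List[int]]) -> Iterator[str]:
    """Stand-in STT: emits one placeholder word per incoming chunk."""
    for index, _chunk in enumerate(chunks):
        yield f"word{index}"


def fake_llm(words: Iterable[str]) -> Iterator[str]:
    """Stand-in LLM: emits a token per word without waiting for the rest."""
    for word in words:
        yield word.upper()


def run_pipeline(samples: List[int], chunk_size: int = 4) -> Iterator[str]:
    """Chain the stages lazily; nothing is buffered between them."""
    return fake_llm(fake_stt(audio_chunks(samples, chunk_size)))
```

Calling `next(run_pipeline(samples))` produces the first output token after only the first audio chunk has been read, which is the property that matters for first-token latency.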

3. Speculative Inference

As transcription streams in, we begin inference on the partial input. If the question turns out to differ from the initial guess, we restart; for 90% of questions, though, the first few words indicate the topic accurately enough to start generating.
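A minimal sketch of this start-early, restart-on-change logic follows. It assumes a topic classifier and a draft generator are supplied by the caller; `SpeculativeRunner`, `keyword_topic`, and the `min_words` cutoff are hypothetical names and values, not part of the described system.

```python
from typing import Callable, Optional


class SpeculativeRunner:
    """Drafts an answer from partial transcripts; restarts if the topic guess changes."""

    def __init__(self, classify: Callable[[str], str],
                 generate: Callable[[str], str], min_words: int = 3) -> None:
        self.classify = classify    # hypothetical topic classifier
        self.generate = generate    # hypothetical draft generator
        self.min_words = min_words  # words required before speculating (assumed)
        self.topic: Optional[str] = None
        self.draft: Optional[str] = None
        self.restarts = 0

    def on_partial(self, transcript: str) -> Optional[str]:
        """Feed the transcript so far; return the current draft, if any."""
        if len(transcript.split()) < self.min_words:
            return self.draft
        topic = self.classify(transcript)
        if topic != self.topic:
            if self.topic is not None:
                self.restarts += 1  # the earlier guess was wrong: discard and redo
            self.topic = topic
            self.draft = self.generate(topic)
        return self.draft


def keyword_topic(text: str) -> str:
    """Toy classifier used only for this example."""
    lowered = text.lower()
    if "conflict" in lowered:
        return "behavioral"
    if "design" in lowered:
        return "system-design"
    return "general"
```

In this sketch the draft is kept as long as new words do not change the topic guess, so the cost of a wrong guess is a single restart rather than a delayed start for every question.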

4. Model Selection & Quantization

We maintain multiple model variants: lightweight models for quick pattern detection, quantized models for fast inference, and full-precision models for complex questions. A router selects among them based on question complexity.
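One way such a router might look is a complexity score mapped onto tiers. The scoring heuristic, keyword set, and thresholds below are invented for illustration; the post does not say how complexity is actually estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelVariant:
    name: str
    max_complexity: float  # highest score this tier handles (assumed thresholds)


VARIANTS = (
    ModelVariant("lightweight", 0.3),
    ModelVariant("quantized", 0.7),
    ModelVariant("full-precision", 1.0),
)

# Hypothetical keywords that suggest an open-ended, harder question.
HARD_KEYWORDS = {"design", "architecture", "trade-offs", "scale"}


def complexity_score(question: str) -> float:
    """Toy heuristic in [0, 1]: longer questions and hard keywords score higher."""
    words = [w.strip("?,.") for w in question.lower().split()]
    score = min(len(words) / 40.0, 0.6)
    if any(w in HARD_KEYWORDS for w in words):
        score += 0.4
    return min(score, 1.0)


def route(question: str) -> ModelVariant:
    """Pick the cheapest variant whose threshold covers the score."""
    score = complexity_score(question)
    for variant in VARIANTS:
        if score <= variant.max_complexity:
            return variant
    return VARIANTS[-1]
```

Ordering the tiers from cheapest to most expensive means simple questions never pay for the full-precision model.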

Benchmark Comparison

| Tool | First Token Latency | Full Answer |
| --- | --- | --- |
| AissenceAI | 116ms | 1.2s |
| Final Round AI | ~500ms | ~3-5s |
| LockedIn AI | ~300ms | ~2-4s |
| ChatGPT (manual) | ~1000ms | ~5-10s |
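The post does not say how these figures were measured. For readers who want to take comparable measurements, here is a minimal sketch that times first-token latency and full-answer time for any token stream; `measure_stream` is a hypothetical helper, and it assumes the timer starts when the stream is requested.

```python
import time
from typing import Iterable, Tuple


def measure_stream(tokens: Iterable[str]) -> Tuple[float, float, int]:
    """Return (first_token_ms, total_ms, token_count) for a token stream.

    first_token_ms is NaN if the stream yields nothing.
    """
    start = time.perf_counter()
    first_token_ms = float("nan")
    count = 0
    for _token in tokens:
        if count == 0:
            # Time until the first token arrives.
            first_token_ms = (time.perf_counter() - start) * 1000.0
        count += 1
    # Time until the stream is exhausted.
    total_ms = (time.perf_counter() - start) * 1000.0
    return first_token_ms, total_ms, count
```

A single run is noisy, so a fair comparison would repeat the measurement many times per tool and report a median or percentile rather than one sample.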

Why This Matters

In a live interview, you have about 3-5 seconds to start responding to a question. With tools whose first-token latency is 500ms or more, AI suggestions arrive after you've already started talking, which is too late. At 116ms, suggestions arrive while you're still hearing the question, giving you time to plan your response. This is the core advantage of AissenceAI.
