Agents can't see or hear video. loom-watch pulls a Loom's transcript and screen frames and aligns them, so what was said sits next to what was on screen.
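
To make the alignment idea concrete, here is a minimal sketch of pairing frames with transcript segments by timestamp. It is not loom-watch's actual code or API: the `Segment` type, the `align` function, and the `(timestamp, path)` frame tuples are all hypothetical names chosen for illustration, and it assumes both inputs carry timestamps in seconds.

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical transcript segment: when it starts and what was said."""
    start: float  # seconds from the beginning of the recording
    text: str
    frames: list[str] = field(default_factory=list)


def align(segments: list[Segment], frames: list[tuple[float, str]]) -> list[Segment]:
    """Attach each (timestamp, path) frame to the segment being spoken at that time.

    A frame belongs to the last segment whose start is <= the frame timestamp.
    Frames captured before the first segment starts are skipped (an assumption
    of this sketch, not a documented behavior of the tool).
    """
    ordered = sorted(segments, key=lambda s: s.start)
    starts = [s.start for s in ordered]
    for timestamp, path in sorted(frames):
        index = bisect_right(starts, timestamp) - 1
        if index >= 0:
            ordered[index].frames.append(path)
    return ordered


if __name__ == "__main__":
    transcript = [Segment(0.0, "Intro"), Segment(5.0, "Open the settings page")]
    captured = [(1.0, "frame_001.png"), (5.0, "frame_002.png"), (7.5, "frame_003.png")]
    for segment in align(transcript, captured):
        print(f"{segment.start:>5.1f}s  {segment.text}  {segment.frames}")
```

The binary search keeps the pairing cheap even for long recordings, and the result gives an agent one record per spoken segment with the matching screen frames attached.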