Paper

StoryVideoQA: Scaling Deep Video Understanding with a Large-Scale, Multi-Genre and Auto-Generated Dataset

arXiv:2606.06338v1 Announce Type: new Abstract: Video question answering (VideoQA) aims to answer questions about given videos. While existing approaches excel on factoid VideoQA, they struggle with deep video understanding (DVU), which requires the comprehension of complex storylines. This challenge arises from the inherent long-range video content, multi-faceted question types, and instance-level story elements, all of which constrain the scale and diversity of manually constructed DVU datasets.These difficulties constrain the scale and diversity of manually-constructed DVU dataset. To addr…

arXiv cs.CVPublished 2026-06-05Paper link

Authors:

Topics

Relevant entities

People

Linked people will appear here.

Related coverage

Linked coverage will appear here.

Related events

Linked events will appear here.

Related discussions

Related discussion nodes will appear here.