Resource
How2: A Large-scale Dataset For Multimodal Language Understanding
In this paper, we introduce How2, a multimodal collection of instructional videos with English subtitles and crowdsourced Portuguese translations. We also present integrated sequence-to-sequence baselines for machine translation, automatic speech …