Department of Computer Science at North Carolina State University

Seminars & Colloquia

Raymond Mooney

University of Texas at Austin

"Generating Natural-Language Video Descriptions using LSTM Recurrent Neural Networks"

Monday October 17, 2016 04:00 PM
Location: 3211, EB2 NCSU Centennial Campus
(Visitor parking instructions)

This talk is part of the Triangle Computer Science Distinguished Lecturer Series

Abstract:

We present a method for automatically generating English sentences describing short videos using deep neural networks. Specifically, we apply convolutional and Long Short-Term Memory (LSTM) recurrent networks to translate videos to English descriptions using an encoder/decoder framework. A sequence of image frames (represented using deep visual features) is first mapped to a vector encoding the full video, and then this encoding is mapped to a sequence of words. We have also explored how statistical linguistic knowledge mined from large text corpora, specifically LSTM language models and lexical embeddings, can improve the descriptions. Experimental evaluation on a corpus of short YouTube videos and movie clips annotated by Descriptive Video Service demonstrates the capabilities of the technique by comparing its output to human-generated descriptions.

Short Bio:

Raymond J. Mooney is a Professor in the Department of Computer Science at the University of Texas at Austin. He received his Ph.D. in 1988 from the University of Illinois at Urbana/Champaign. He is an author of over 160 published research papers, primarily in the areas of machine learning and natural language processing. He was the President of the International Machine Learning Society from 2008 to 2011, program co-chair for AAAI 2006, general chair for HLT-EMNLP 2005, and co-chair for ICML 1990. He is a Fellow of the American Association for Artificial Intelligence, the Association for Computing Machinery, and the Association for Computational Linguistics, and the recipient of best paper awards from AAAI-96, KDD-04, ICML-05, and ACL-07.

Host: Mohit Bansal, UNC

A video of this talk is available.

