Jointly Modeling Embedding and Translation to Bridge Video and Language

US-201514946988-A

Video description generation using neural network training based on relevance and coherence is described. In some examples, long short-term memory with visual-semantic embedding (LSTM-E) can maximize the probability of generating the next word given the previous words and the visual content, and can create a visual-semantic embedding space that enforces the relationship between the semantics of an entire sentence and the visual content. LSTM-E can include 2-D and/or 3-D deep convolutional neural networks for learning a powerful video representation, a deep recurrent neural network for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics.
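The joint objective described above can be sketched in a few lines. The following is a minimal, untrained toy illustration, not the patented implementation: all dimensions, weight matrices (`Wv`, `Ws`, `Wg`, `Wo`), and the `alpha` mixing weight are assumptions for illustration. It combines a coherence term (negative log-likelihood of each next word under the LSTM) with a relevance term (squared distance between the video and sentence representations in a shared embedding space).

```python
import numpy as np

rng = np.random.default_rng(0)
D_VID, D_EMB, D_WORD, VOCAB = 8, 6, 5, 10   # toy sizes (assumed, not from the patent)

# Randomly initialized parameters; training is out of scope for this sketch.
Wv = rng.normal(size=(D_EMB, D_VID)) * 0.1            # video features -> embedding space
Ws = rng.normal(size=(D_EMB, D_EMB)) * 0.1            # LSTM state -> embedding space
Wg = rng.normal(size=(4 * D_EMB, D_WORD + D_EMB)) * 0.1  # packed LSTM gate weights
Wo = rng.normal(size=(VOCAB, D_EMB)) * 0.1            # LSTM state -> vocabulary logits

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c):
    """One LSTM step; Wg packs the input/forget/output/candidate gates."""
    z = Wg @ np.concatenate([x, h])
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

def lstm_e_loss(video_feat, words, targets, alpha=0.5):
    """Coherence (next-word NLL) plus relevance (video-sentence distance
    in the shared embedding space), mixed by a hypothetical alpha."""
    h, c = np.zeros(D_EMB), np.zeros(D_EMB)
    nll = 0.0
    for x, t in zip(words, targets):
        h, c = lstm_step(x, h, c)
        logits = Wo @ h
        logp = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
        nll -= logp[t]
    v_emb = Wv @ video_feat   # video embedding (e.g. from pooled CNN features)
    s_emb = Ws @ h            # sentence embedding from the final LSTM state
    relevance = np.sum((v_emb - s_emb) ** 2)
    return (1 - alpha) * nll / len(words) + alpha * relevance

video = rng.normal(size=D_VID)                       # stand-in for CNN video features
sentence = [rng.normal(size=D_WORD) for _ in range(4)]  # stand-in word vectors
loss = lstm_e_loss(video, sentence, targets=[1, 3, 0, 2])
```

In the described system, minimizing this combined loss jointly trains the sentence generator and the embedding, so that generated sentences are both fluent and semantically tied to the video content.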
