Cogent Labs Research Scientist Max gave a talk at NVIDIA GTC Japan 2018: Variational Autoencoders for NLP



Variational Autoencoders for NLP – Particular Difficulties, Recent Solutions, and Practical Applications

According to a Forbes study in May 2018, every single minute humanity as a whole is sending 156 Million emails (not including the countless millions of bot-generated spam emails), performing 3.9 million Google searches, posting 800,000 comments on Facebook, and edits 600 Wikipedia pages. And by the time you are reading this, these numbers are likely to have increased even further.

Language is at the core of how we as humans make sense of the world around us and how we interact with each other. Hence if we are striving for an AI that can both make use of human-generated data, as well as interact and communicate with us humans in a way that is not only understandable but also feels natural and comfortable, we need to consider language understanding as one of the core capabilities.

The field of Natural Language Processing (NLP) is currently undergoing a dramatic transformation. While the field has existed for many decades, only the recent advent and rapid advance of deep learning have lead to breakthroughs that could actually foster widespread practical adoption of these techniques. Whereas traditional NLP heavily relied on complex and painstakingly constructed rule-based systems, deep learning approaches, given enough training data, are much more flexible and have rapidly achieved state of the art results in many disciplines, from human-like translation, to convincing chat bots, accurate speech recognition, and beyond.



One class of models that has been particularly successful, and responsible for much of the progress in deep learning, are so called Generative Models. The rough intuition behind this is that a system that can generate realistic data must have gained at least some limited “understanding” of how the real world (and in this specific case language) works.

Two of the most prominent types of generative models are General Adversarial Networks (GANs) and Variational Autoencoders (VAEs). While GANs have been responsible for many breakthrough results in image processing, they have so far (with certain exceptions) not proven to be particularly successful at NLP related tasks. VAEs are a much more common choice when trying to solve NLP problems.

I recently was invited by NVIDIA to speak about this topic at their GPU Technology Conference (GTC) in Tokyo. My talk “Variational Autoencoders for NLP: Particular Difficulties, Recent Solutions, and Practical Applications” introduced the general idea of generative models and VAEs and discussed how they are applied in NLP. It also touched on how recent advances have allowed us to overcome some difficulties faced by the earlier models.

Besides these more technical aspects, I wanted to do something that is often overlooked in academic settings, namely bridging the gap between research and practical applications, solving real problems with real data. For this reason, despited being invited as a speaker at a technical session, I decided to spend the final third of my talk outlining how we at Cogent Labs are using some of the general ideas introduced in the first part of the talk and apply them in our natural language processing solution Kaidoku. In addition to a general introduction to Kaidoku, I further demonstrated its wide applicability with two real use cases from the domains of finance and policy making.

Photos provided by NVIDIA