Variational Inference and Deep Generative Models, ACL 2018 Tutorial, Melbourne, Australia
NLP has seen a surge in neural network models in recent years. These models provide state-of-the-art performance on many supervised tasks, but unsupervised and semi-supervised learning have so far received much less attention. Deep Generative Models (DGMs) make it possible to integrate neural networks with probabilistic graphical models. Using DGMs, one can easily design latent variable models that account for missing observations and thereby enable unsupervised and semi-supervised learning with neural networks. The method of choice for training these models is variational inference. This tutorial offers a general introduction to variational inference, followed by a thorough and example-driven discussion of how to use variational methods for training DGMs. It provides both the mathematical background necessary for deriving the learning algorithms and practical implementation guidelines. Moreover, we discuss common pitfalls that one may encounter when using DGMs for NLP applications, such as the latent variable being ignored by the model, and present potential solutions from both a theoretical and a practical perspective. Importantly, the tutorial covers models with both continuous and discrete latent variables.
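To make the training recipe concrete, here is a minimal sketch (a hypothetical PyTorch implementation, not code from the tutorial) of a Gaussian VAE: an inference network predicts q(z|x), a single reparameterised sample gives a differentiable ELBO estimate, and the KL term is computed analytically. The layer sizes and the Bernoulli likelihood are illustrative assumptions.

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16, h_dim=128):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.Tanh())
        self.mu = nn.Linear(h_dim, z_dim)
        self.log_var = nn.Linear(h_dim, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.Tanh(),
                                 nn.Linear(h_dim, x_dim))

    def elbo(self, x):
        h = self.enc(x)
        mu, log_var = self.mu(h), self.log_var(h)
        # Reparameterisation: z = mu + sigma * eps with eps ~ N(0, I),
        # so gradients flow through mu and log_var.
        z = mu + torch.exp(0.5 * log_var) * torch.randn_like(mu)
        logits = self.dec(z)
        # Bernoulli log-likelihood log p(x|z)
        ll = -nn.functional.binary_cross_entropy_with_logits(
            logits, x, reduction='none').sum(-1)
        # Analytic KL( q(z|x) || N(0, I) )
        kl = 0.5 * (mu ** 2 + log_var.exp() - log_var - 1).sum(-1)
        return (ll - kl).mean()
```

Maximising this ELBO with any stochastic optimiser trains the generative and inference networks jointly; when the decoder is very powerful, the KL term can collapse to zero, which is exactly the "ignored latent variable" pitfall mentioned above.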
Probabilistic modelling for NLP powered by deep learning
Deep generative models (DGMs) are probabilistic models parametrised by neural networks (NNs). DGMs combine the power of NNs with the generality of the probabilistic learning framework, allowing a modeller to be more explicit about her statistical assumptions. To unlock this power, however, one must consider efficient ways to approach probabilistic inference. Variational inference (VI) has surfaced as the method of choice, but efficient and effective VI for DGMs requires low-variance gradient estimation for stochastic computation graphs (Kingma and Welling, 2013; Rezende et al., 2014; Titsias and Lázaro-Gredilla, 2014). In this talk I will present an overview of deep generative modelling, amortised variational inference, and the mathematics behind low-variance reparameterised gradients.
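To see why reparameterisation matters, consider a toy example (my own illustration, not material from the talk): both estimators below are unbiased for d/d(mu) E_{z~N(mu,1)}[z^2] = 2*mu, but the score-function estimator has far higher variance.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n = 1.5, 10_000
eps = rng.standard_normal(n)
z = mu + eps  # z ~ N(mu, 1)

# Score-function (REINFORCE) estimator: f(z) * d/d(mu) log N(z; mu, 1)
score = (z ** 2) * (z - mu)
# Reparameterised estimator: d/d(mu) f(mu + eps) = 2 * (mu + eps)
reparam = 2.0 * (mu + eps)

for name, g in [("score", score), ("reparam", reparam)]:
    # means agree (~3.0); the reparameterised std is far smaller
    print(f"{name}: mean={g.mean():.3f}, std={g.std():.3f}")
```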
I will also outline a few deep generative models applied to the induction of word alignments, proposing a fully probabilistic account of unsupervised problems such as word embedding, alignment, and segmentation. This is in contrast with deterministic representations obtained as a byproduct of training neural networks as fully supervised classifiers. If you care about NLP applications such as machine translation, and if you are interested in the synergy between probabilistic graphical modelling and neural networks, then do come to my talk.
Directed graphical models (the case of lexical alignment models), 08/12/2016, NLP1-UvA
Invited lecture for NLP1 at UvA. I cover basic notions of directed graphical models and present IBM1 as an application of latent variable modelling, together with maximum likelihood estimation via expectation maximisation for categorical distributions.
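A compact toy implementation of the EM updates for IBM1 could look as follows; the helper name ibm1_em and the two-sentence corpus are illustrative assumptions, not lecture material.

```python
from collections import defaultdict

def ibm1_em(corpus, iterations=10):
    """corpus: list of (source_tokens, target_tokens) pairs."""
    # t[f][e] approximates the categorical translation probability p(f | e)
    t = defaultdict(lambda: defaultdict(lambda: 1.0))  # uniform (unnormalised) init
    for _ in range(iterations):
        count = defaultdict(lambda: defaultdict(float))
        total = defaultdict(float)
        for src, tgt in corpus:
            src = ['NULL'] + src  # NULL absorbs unaligned target words
            for f in tgt:
                # E-step: posterior over alignment links for target word f
                norm = sum(t[f][e] for e in src)
                for e in src:
                    p = t[f][e] / norm
                    count[f][e] += p
                    total[e] += p
        # M-step: renormalise expected counts into categorical parameters
        for f in count:
            for e in count[f]:
                t[f][e] = count[f][e] / total[e]
    return t

corpus = [(['das', 'haus'], ['the', 'house']),
          (['das', 'buch'], ['the', 'book'])]
t = ibm1_em(corpus)
print(t['the']['das'])  # should dominate the alternatives for 'das'
```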
Sampling (for unsupervised language learning), 23/02/2015, ULL-UvA
Guest lecture for Jelle Zuidema’s course on Unsupervised Language Learning at UvA. In this one-hour talk, I present sampling methods and their role in performing MLE and/or posterior inference, with special attention to the case of parsing.
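As a minimal illustration of the kind of sampler covered (a sketch of my own, not the lecture's code), the Metropolis-Hastings algorithm below draws from a posterior known only up to a normalising constant, here the bias of a coin under a uniform prior.

```python
import math
import random

def log_post(theta, heads=7, flips=10):
    # Unnormalised log posterior: uniform prior times binomial likelihood.
    if not 0.0 < theta < 1.0:
        return -math.inf
    return heads * math.log(theta) + (flips - heads) * math.log(1.0 - theta)

def metropolis(n_samples=5000, step=0.1):
    theta, samples = 0.5, []
    for _ in range(n_samples):
        prop = theta + random.gauss(0.0, step)  # symmetric random-walk proposal
        # Accept with probability min(1, p(prop)/p(theta)); constants cancel.
        delta = log_post(prop) - log_post(theta)
        if delta >= 0.0 or random.random() < math.exp(delta):
            theta = prop
        samples.append(theta)
    return samples

samples = metropolis()
print(sum(samples) / len(samples))  # posterior mean, roughly 8/12 = 0.667
```

The same accept/reject scheme carries over to structured latent variables such as parse trees, where proposals make local changes to the current tree.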