The workshop will be held in person on August 5th, 2022 in Eindhoven, the Netherlands.
The online poster session will take place via GatherTown, accessible through the UAI Zoom event.
| Time | Event |
| --- | --- |
| 9:00 – 9:15 am | Welcome & Best Paper Awards |
| 9:15 – 10:00 am | Spotlight Presentations |
| 10:00 – 11:00 am | Poster Session I & Coffee Break |
| 11:00 – 11:40 am | Encoding spatial priors with VAEs for geospatial modelling (Speaker: Elizaveta Semenova) |
| 11:40 am – 12:20 pm | Cancelled: Generative models for discrete random variables and lossless source compression (Speaker: Rianne van den Berg) |
| 12:20 – 1:50 pm | Lunch Break |
| 1:50 – 2:30 pm | An alternate scaling route for AI via probabilistic programs (Speaker: Vikash Mansinghka) |
| 2:30 – 3:10 pm | Are Tractable Probabilistic Generative Models More Diverse? (Speaker: Adji Bousso Dieng) |
| 3:10 – 4:20 pm | Poster Session II & Coffee Break |
| 4:20 – 5:00 pm | Algorithms for Solving the Constrained Most Probable Explanation Problem (Speaker: Vibhav Gogate) |
| 5:00 – 5:40 pm | Efficient and Robust Learning from Massive Datasets (Speaker: Baharan Mirzasoleiman) |
| 5:40 – 6:00 pm | Panel Discussion & Closing Remarks |
Rianne van den Berg (Microsoft Research)
Abstract: In this talk I will discuss how different classes of generative models can be adapted to handle discrete random variables, and how this can be used to connect generative models to downstream tasks such as lossless compression. I will start by discussing normalizing flow models, and the challenges that arise when converting these models that are typically designed for real-valued random variables to discrete random variables. Next, I will demonstrate how denoising diffusion models with discrete state spaces have a rich design space in terms of the noising process, and how this influences the performance of the learned denoising model. Finally, I will show how denoising diffusion models can be connected to autoregressive models, and introduce an autoregressive model with a random generation order.
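As a minimal illustration of the kind of discrete-state noising process the abstract refers to (the uniform-transition choice and all parameter values here are our own assumptions, not details from the talk), the forward corruption step of a discrete denoising diffusion model can be sketched as:

```python
# Illustrative sketch (not the speaker's implementation): the forward
# "noising" process of a discrete-state denoising diffusion model, using a
# uniform transition matrix as one point in the design space of noising
# processes mentioned in the abstract.
import numpy as np

def uniform_transition(K, beta):
    """Q = (1 - beta) * I + beta * (1/K): each step keeps a token with
    probability (1 - beta), otherwise resamples it uniformly over K states."""
    return (1.0 - beta) * np.eye(K) + beta * np.ones((K, K)) / K

def marginal_after_t_steps(x0_onehot, K, beta, t):
    """q(x_t | x_0) follows by multiplying the one-hot start state by the
    t-step transition matrix Q^t."""
    Qt = np.linalg.matrix_power(uniform_transition(K, beta), t)
    return x0_onehot @ Qt

K = 5
x0 = np.eye(K)[2]                                # token value 2, one-hot
p = marginal_after_t_steps(x0, K, beta=0.2, t=50)
# After many steps the marginal approaches the uniform distribution 1/K,
# i.e. all information about x_0 is destroyed.
```

The denoising model is then trained to invert these steps; different transition matrices (uniform, absorbing-state, discretized-Gaussian) yield different corruption behaviour and, as the abstract notes, different downstream performance.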
Bio: Rianne is a Principal Researcher at Microsoft Research Amsterdam, where she works at the intersection of deep learning and computational chemistry and physics for molecular simulation. Her research has spanned topics ranging from generative modeling, variational inference, source compression, and graph-structured learning to condensed-matter physics. Before joining MSR she was a Research Scientist at Google Brain. She received her PhD in theoretical condensed-matter physics in 2016 at the University of Amsterdam, where she also worked as a postdoctoral researcher in the Amsterdam Machine Learning Lab (AMLAB). In 2019 she won the Faculty of Science Lecturer of the Year award at the University of Amsterdam for teaching a machine learning course in the Master of AI programme.
Adji Bousso Dieng (Princeton, USA)
Bio: Adji Bousso Dieng is an Assistant Professor of Computer Science at Princeton University, where she leads Vertaix on research at the intersection of artificial intelligence and the natural sciences. She is also a Research Scientist at Google AI and the founder and President of the nonprofit The Africa I Know. She was recently named the Annie T. Randall Innovator of 2022 for her research and advocacy. She received her Ph.D. from Columbia University, where she was advised by David Blei and John Paisley. Her doctoral work received many recognitions, including a Google Ph.D. Fellowship in Machine Learning, a Rising Star in Machine Learning nomination, and the Savage Award.
Vibhav Gogate (UT Dallas, USA)
Abstract: Recently there has been growing interest in developing probabilistic models that are robust to minor perturbations in input and are explainable in that they are able to explain why they made a particular decision to a user. In this talk, I will describe a new unifying optimization task called constrained most probable explanation (CMPE), and show that the two aforementioned tasks, making models robust and explainable, can be reduced to CMPE. I will show that CMPE is strongly NP-hard in general on arbitrary probabilistic models, but only weakly NP-hard on probabilistic models having small k-separators (a sub-class of tractable models that admit poly-time marginal inference). The main virtue of this weakly NP-hard property is that we can leverage it to derive efficient cutset sampling and local search approximations for lower bounding the optimal value of CMPE. For upper bounding, I will present efficient approaches that combine graph-based partitioning techniques with approximations developed in the literature on knapsack problems. These upper bounding techniques are guaranteed to be better than linear programming based bounds on probabilistic models that admit tractable marginal inference and have much smaller computational complexity. I will end my talk by presenting experimental results and applications as well as avenues for future work. (Joint work with Sara Rouhani, Rohith Peddi and Tahrima Rahman)
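To make the CMPE task concrete, here is a toy brute-force version on independent binary variables with a single linear "knapsack" constraint. The potentials, weights, and budget are invented for illustration; real solvers use the cutset-sampling, local-search, and knapsack-based bounding schemes the abstract describes rather than enumeration.

```python
# Hypothetical illustration of the CMPE problem statement (not the authors'
# algorithm): maximise the joint log-score of binary variables subject to a
# linear constraint, by exhaustive enumeration on a tiny instance.
import itertools
import math

# Toy log-potentials (value when x_i = 0, value when x_i = 1) for four
# independent binary variables; any graphical model's log-score could be
# plugged in here instead.
log_phi = [(0.0, 1.5), (0.0, -0.5), (0.0, 2.0), (0.0, 0.3)]
weights = [3, 1, 4, 2]   # constraint coefficients
budget = 5               # total weight of variables set to 1 must be <= 5

def cmpe_bruteforce():
    best_score, best_x = -math.inf, None
    for x in itertools.product([0, 1], repeat=len(log_phi)):
        # Discard assignments that violate the constraint.
        if sum(w for w, xi in zip(weights, x) if xi) > budget:
            continue
        score = sum(lp[xi] for lp, xi in zip(log_phi, x))
        if score > best_score:
            best_score, best_x = score, x
    return best_x, best_score

x_star, score = cmpe_bruteforce()
# The unconstrained MPE would set variables 0, 2, and 3 to 1 (total weight 9),
# but the budget forces a different optimum.
```

Enumeration is exponential in the number of variables, which is exactly why the weakly NP-hard structure on models with small k-separators, and the resulting bounding schemes, matter in practice.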
Bio: Vibhav Gogate is an Associate Professor in the Computer Science Department at the University of Texas at Dallas. He received his Ph.D. from the University of California, Irvine in 2009 and then completed a two-year postdoc at the University of Washington. His research interests are in AI, machine learning, and data mining. His ongoing focus is on probabilistic graphical models; their first-order-logic-based extensions such as Markov logic and probabilistic programming; tractable probabilistic models; and explainable AI. He is a recipient of the National Science Foundation CAREER Award and co-winner of the 2010 and 2012 UAI inference competitions.
Vikash K. Mansinghka (MIT, USA)
Abstract: A great deal of enthusiasm has been focused on building increasingly large neural models. This talk will review progress along an alternate scaling route based on probabilistic programs whose source code is partly written by AI engineers and partly learned from data. This approach integrates key ideas from large-scale generative modeling and deep learning with probabilistic inference and symbolic programming. Unlike neural networks, generative probabilistic programs can report what they know, what they don’t, and why; they can be modularly designed, debugged, trained, and tested; and they can learn new symbolic code rapidly and accurately from sparse data.
This talk will review probabilistic programs, written in Gen, that outperform machine-learning baselines at perceiving the 3D structure of cluttered tabletop scenes and at forecasting macroeconomic time series. It will also present results from the SPPL language (based on probabilistic circuits) showing how to analyze decision tree fairness ~1000x faster than previous approaches.
This talk will highlight fundamental tradeoffs between tractability and expressiveness, and discuss roles for PPL implementations of tractable probabilistic models (and of alternative notions of tractability) in scaling AI systems engineering.
Bio: Vikash Mansinghka is a Principal Research Scientist at MIT, where he leads the Probabilistic Computing Project. Vikash holds S.B. degrees in Mathematics and in Computer Science from MIT, as well as an M.Eng. in Computer Science and a PhD in Computation. He also held graduate fellowships from the National Science Foundation and MIT's Lincoln Laboratory. His PhD dissertation on natively probabilistic computation won the MIT George M. Sprowls dissertation award in computer science, and his research on the Picture probabilistic programming language won an award at CVPR. He co-founded three VC-backed startups: Prior Knowledge (acquired by Salesforce in 2012), Empirical Systems (acquired by Tableau in 2018), and Common Sense Machines (funded in 2020). He served on DARPA's Information Science and Technology advisory board from 2010 to 2012, currently serves on the editorial board of the Journal of Machine Learning Research, and co-founded the International Conference on Probabilistic Programming.
Baharan Mirzasoleiman (UCLA, USA)
Abstract: Large datasets have been crucial to the success of modern machine learning models. However, training on massive data has two major limitations. First, it is contingent on exceptionally large and expensive computational resources, and incurs a substantial cost due to the significant energy consumption. Second, in many real-world applications such as medical diagnosis, self-driving cars, and fraud detection, big data contains highly imbalanced classes and noisy labels. In such cases, training on the entire data does not result in a high-quality model. In this talk, I will argue that we can address the above limitations by developing techniques that can identify and extract the most informative subsets for learning from massive datasets. Training on such subsets not only reduces the substantial costs of learning from big data, but also improves the resulting models' accuracy and robustness against noisy labels. I will discuss how we can develop effective and theoretically rigorous techniques that provide strong guarantees for the learned models' quality and robustness against noisy labels.
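As a generic illustration of subset selection for coverage (this greedy k-center heuristic is a textbook baseline we chose for brevity, not the speaker's method, which comes with its own guarantees), one can pick a small subset of points that covers the dataset and train on the subset alone:

```python
# Illustrative greedy k-center subset selection: repeatedly add the point
# farthest from the current subset. A classic 2-approximation for the
# k-center coverage objective; stands in here for the informative-subset
# idea in the abstract.
import numpy as np

def greedy_kcenter(X, k, seed=0):
    """Return indices of k points chosen to cover X."""
    rng = np.random.default_rng(seed)
    idx = [int(rng.integers(len(X)))]          # arbitrary starting point
    d = np.linalg.norm(X - X[idx[0]], axis=1)  # distance to nearest chosen point
    for _ in range(k - 1):
        nxt = int(np.argmax(d))                # farthest point from the subset
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(X - X[nxt], axis=1))
    return idx

# Two well-separated clusters of 50 points each.
X = np.vstack([np.zeros((50, 2)), np.ones((50, 2)) * 10.0])
subset = greedy_kcenter(X, k=2)
# With k=2, the greedy rule selects one representative from each cluster.
```

The point of the illustration is the workflow: select a small representative subset, then train only on it, trading a cheap selection step for large savings in training cost.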
Bio: Baharan Mirzasoleiman is an Assistant Professor in the Computer Science Department at the University of California, Los Angeles. Baharan's research focuses on developing new methods that enable efficient and robust learning from massive datasets. She received her PhD from ETH Zurich and was a postdoc at Stanford University. She was awarded an ETH medal for Outstanding Doctoral Dissertation and a Google Anita Borg Memorial Scholarship, and was selected as a Rising Star in EECS by MIT.
Elizaveta Semenova (University of Oxford, UK)
Abstract: Gaussian processes (GPs), implemented through multivariate Gaussian distributions for a finite collection of data, are the most popular approach in spatial statistical modelling. In this context they are used to encode correlation structures over space and can generalise well in interpolation tasks. Despite their flexibility, off-the-shelf GPs present serious computational challenges which limit their scalability and practical usefulness in applied settings. I will present a deep generative modelling approach to tackle this challenge: for a particular spatial setting, a class of GP priors is approximated through prior sampling and subsequent training of a variational autoencoder (VAE). Given a trained VAE, the resultant decoder makes spatial inference highly efficient, since inference operates in the VAE's low-dimensional latent space of independent Gaussians. Once trained, inference using the VAE decoder replaces the GP within a Bayesian sampling framework. This approach provides a tractable and easy-to-implement means of approximately encoding spatial priors and facilitates efficient statistical inference.
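The first step of the pipeline described above, drawing samples from the GP prior to serve as VAE training data, can be sketched as follows (the RBF kernel, 1-D grid, and all hyperparameters are our assumptions for illustration; the spatial setting in the talk may differ):

```python
# Sketch of GP prior sampling for VAE training data (illustrative only):
# each Cholesky-based draw from the GP prior becomes one training example
# for the VAE, whose decoder later stands in for the GP during inference.
import numpy as np

def rbf_kernel(x, lengthscale=0.2, var=1.0, jitter=1e-8):
    """Squared-exponential covariance on a 1-D grid, with jitter for
    numerical stability of the Cholesky factorisation."""
    d = x[:, None] - x[None, :]
    return var * np.exp(-0.5 * (d / lengthscale) ** 2) + jitter * np.eye(len(x))

def sample_gp_prior(n_draws, n_points=64, seed=0):
    """Draw n_draws functions from the GP prior evaluated on a fixed grid."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    L = np.linalg.cholesky(rbf_kernel(x))          # K = L @ L.T
    # Each column of L @ z is one prior draw; transpose to (draws, points).
    return (L @ rng.standard_normal((n_points, n_draws))).T

draws = sample_gp_prior(n_draws=1000)  # shape (1000, 64): VAE training set
```

A VAE trained on `draws` yields a decoder mapping a low-dimensional standard-normal latent to approximate GP prior samples, so MCMC can then run over the latent variables instead of the full GP.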
Bio: Liza is currently a postdoctoral research associate at the University of Oxford, where she works on scalable methods and flexible models for spatiotemporal data. More broadly, her interests lie at the intersection of Bayesian statistics and applied fields such as epidemiology, public policy, and drug discovery.