Perception Through Structured Generative Models

August 28, at ECCV 2020


The highly structured nature of the visual world has inspired many computer vision researchers to pursue an analysis-by-synthesis approach: to understand an image, one should be able to reproduce it with a model. A good model should also be able to extrapolate into unseen space or time: given a 2D or 2.5D image of a partially occluded object, what is the full 3D extent? Given a fragment of a video, how does the remainder play out? Generative models of images, video, and 3D data have made great strides in recent years, but their utility as causal or interpretable models has not always advanced in step. For example, while GANs can currently generate beautiful images, they do not necessarily learn a latent space of graphics-like or semantically interpretable elements. In this workshop, we aim to explore how generative models can facilitate perception, and in particular, how to design and use structured generative models (of images, video, and 3D data) for computer vision inference applications.

Recordings of Invited Talks

Opening remarks
Max Welling
University of Amsterdam
J. Kevin O'Regan
University Paris Descartes
Sanja Fidler
University of Toronto
Ruslan Salakhutdinov
Carnegie Mellon University
Carl Vondrick
Columbia University

Accepted papers

  • Vadim Sushko, Edgar Schönfeld, Dan Zhang, Juergen Gall, Bernt Schiele, Anna Khoreva. "3D Noise and Adversarial Supervision Is All You Need for Multi-Modal Semantic Image Synthesis". PDF link.
  • Sarthak Bhagat, Vishaal Udandarao, Shagun Uppal, Saket Anand. "DisCont: Self-Supervised Visual Attribute Disentanglement using Context Vectors". PDF link.
  • Weiyu Du, Oleh Rybkin, Lingzhi Zhang, Jianbo Shi. "Toward Continuous-Time Representations of Human Motion". PDF link.

Call for papers

We are soliciting original contributions in computer vision, robotics, and machine learning relating to the following topics:

  • Inverse graphics
  • Generative models for images, video, 3D data
  • Reconstruction or prediction as objectives for representation learning
  • Learning disentangled and/or interpretable representations
  • Novel methods for structured generative modelling
  • Generation for prediction, anomaly detection, compression, search, etc.
  • Managing and leveraging visual stochasticity
  • Incorporating hierarchy and graphics-like elements into machine learning
  • Causal and forward models of visual data

Submission deadline: August 20. We encourage authors to submit earlier, anytime between August 1 and August 20, to help us spread out the reviewing work.

Submit your paper to our OpenReview site, using the ECCV 2020 "final copy" LaTeX kit.

Submissions should be 4 pages long, including references. The 4-page limit helps eliminate dual-submission conflicts with ECCV and other conferences. (E.g., even papers accepted to ECCV may be dual-submitted here, provided that they are shortened to 4 pages.)

The workshop organizers will review the papers in a single-blind fashion. All accepted papers will be included in a poster presentation session. Accepted papers will be published in the proceedings.

Program schedule

Morning/afternoon session

09:00 Pittsburgh (EDT) / 14:00 London (UTC+1)
Max Welling
Combining Generative and Discriminative Models
09:40 Pittsburgh (EDT) / 14:40 London (UTC+1)
J. Kevin O'Regan
Thinking about vision in a different way: the world as an outside memory
10:20 Pittsburgh (EDT) / 15:20 London (UTC+1)
Peter Battaglia
Structured understanding and interaction with the world
11:00 Pittsburgh (EDT) / 16:00 London (UTC+1)
Poster session 1

Afternoon/evening session

17:00 Pittsburgh (EDT) / 22:00 London (UTC+1)
Sanja Fidler
AI for 3D Content Generation
17:40 Pittsburgh (EDT) / 22:40 London (UTC+1)
Ruslan Salakhutdinov
Geometric Capsule Autoencoders for 3D Point Clouds
18:20 Pittsburgh (EDT) / 23:20 London (UTC+1)
Carl Vondrick
Data and Task Generalization
19:00 Pittsburgh (EDT) / 00:00 London (UTC+1)
Poster session 2

Access the Zoom links through the ECCV virtual platform.


Organizers

Adam Harley
Carnegie Mellon University
Katerina Fragkiadaki
Carnegie Mellon University

Contact: Adam Harley
