The highly structured nature of the visual world has inspired many computer vision researchers to pursue an analysis-by-synthesis approach: to understand an image, one should be able to reproduce it with a model. A good model should also be able to extrapolate into unseen space or time: given a 2D or 2.5D image of a partially occluded object, what is its full 3D extent? Given a fragment of a video, how does the remainder play out? Generative models of images, video, and 3D data have made great strides in recent years, but their utility as causal or interpretable models has not always kept pace. For example, while GANs can currently generate beautiful images, they do not necessarily learn a latent space of graphics-like or semantically interpretable elements. In this workshop, we aim to explore how generative models can facilitate perception, and in particular, how to design and use structured generative models (of images, video, and 3D data) for computer vision inference applications.
Submissions should be at most 4 pages long, including references. The 4-page limit helps eliminate dual-submission conflicts with ECCV and other conferences: for example, papers accepted to ECCV may also be submitted here, provided they are shortened to 4 pages.
The workshop organizers will review the papers in a single-blind fashion. All accepted papers will be included in a poster presentation session. Accepted papers will be published in the proceedings.
09:00 Pittsburgh (EDT) / 14:00 London (UTC+1)
Combining Generative and Discriminative Models
09:40 Pittsburgh (EDT) / 14:40 London (UTC+1)
J. Kevin O'Regan
Thinking about vision in a different way: the world as an outside memory
10:20 Pittsburgh (EDT) / 15:20 London (UTC+1)
Structured understanding and interaction with the world
11:00 Pittsburgh (EDT) / 16:00 London (UTC+1)
Poster session 1
17:00 Pittsburgh (EDT) / 22:00 London (UTC+1)
AI for 3D Content Generation
17:40 Pittsburgh (EDT) / 22:40 London (UTC+1)
Geometric Capsule Autoencoders for 3D Point Clouds