A new data set reveals just how bad AI is at reasoning—and suggests that a new hybrid approach might be the best way forward.
Questions, questions: Known as CLEVRER, the data set consists of 20,000 short synthetic video clips and more than 300,000 question-and-answer pairs about the events in the videos. Each video shows a simple world of toy objects that collide with one another following simulated physics. In one, a red rubber ball hits a blue rubber cylinder, which continues on to hit a metal cylinder.
The questions fall into four categories: descriptive (e.g., “What shape is the object that collides with the cyan cylinder?”), explanatory (“What is responsible for the gray cylinder’s collision with the cube?”), predictive (“Which event will happen next?”), and counterfactual (“Without the gray object, which event will not happen?”)...
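To make that structure concrete, here is a minimal sketch in Python of how such question-answer records might be represented. The field names and the answers shown are hypothetical illustrations, not CLEVRER's actual annotation schema or ground truth; only the four category names and the example questions come from the article.

```python
# Sketch of CLEVRER-style question-answer records.
# Field names and answers are hypothetical; the real CLEVRER
# annotation files may use a different schema.
from dataclasses import dataclass

@dataclass
class QAPair:
    video_id: int   # which of the 20,000 clips the question refers to
    category: str   # descriptive, explanatory, predictive, or counterfactual
    question: str
    answer: str     # illustrative placeholder, not ground truth

qa_pairs = [
    QAPair(0, "descriptive",
           "What shape is the object that collides with the cyan cylinder?",
           "cube"),
    QAPair(0, "explanatory",
           "What is responsible for the gray cylinder's collision with the cube?",
           "the red rubber ball"),
    QAPair(0, "predictive",
           "Which event will happen next?",
           "the blue cylinder hits the metal cylinder"),
    QAPair(0, "counterfactual",
           "Without the gray object, which event will not happen?",
           "the collision with the cube"),
]

# One question from each of the four categories.
categories = sorted({qa.category for qa in qa_pairs})
print(categories)
```

The descriptive category tests perception, while the other three probe causal and physical reasoning, which is where current models struggle.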
Continue reading Karen Hao's article in MIT Technology Review, March 6, 2020.
Also see the research article: Kexin Yi et al., "CLEVRER: Collision Events for Video Representation and Reasoning," arXiv:1910.01442.