2021, arXiv (Cornell University)

One-shot Visual Reasoning on RPMs with an Application to Video Frame Prediction

He, Wentao, Ren, Jianfeng, and Bai, Ruibin

Abstract

Raven's Progressive Matrices (RPMs) are frequently used to evaluate human visual reasoning ability. Researchers have made considerable effort to develop systems that automatically solve RPM problems, often through a black-box end-to-end Convolutional Neural Network (CNN) that handles both visual recognition and logical reasoning. Towards the objective of developing a highly explainable solution, we propose a One-shot Human-Understandable ReaSoner (Os-HURS), a two-step framework comprising a perception module and a reasoning module, which tackle the challenges of real-world visual recognition and subsequent logical reasoning, respectively. For the reasoning module, we propose a 2+1 formulation that is easier for humans to understand and significantly reduces model complexity. As a result, a precise reasoning rule can be deduced from a single RPM sample, which is not feasible for existing methods. The proposed reasoning module is also capable of yielding a set of reasoning rules that precisely model the human knowledge used in solving RPM problems. To validate the proposed method on real-world applications, we construct an RPM-like One-shot Frame-prediction (ROF) dataset, in which visual reasoning is conducted on RPMs built from real-world video frames instead of synthetic images. Experimental results on various RPM-like datasets demonstrate that the proposed Os-HURS achieves a significant and consistent performance gain over state-of-the-art models.

Keywords

Visual reasoning, Semantic reasoner, Computer science, Artificial intelligence, Frame (networking), Set (abstract data type), Shot (pellet), Convolutional neural network, Reasoning system, Qualitative reasoning, Black box, Deductive reasoning, Machine learning, Computer vision

Authors from this organization

Ruibin Bai

Director of Lab

Computer Science and Operations Research