TEACh Two-Agent Task Completion (TATC) Challenge
Code, precomputed features, and the AI2THOR simulator are all available on GitHub for a quick start.
This challenge focuses on producing instructions for, asking and answering questions about, and carrying out embodied visual tasks in a shared virtual environment. We challenge researchers to consider partner models of the world, the pragmatics of instruction and question generation, barge-in communication, and instruction following from verbal and nonverbal cues. These challenges are not captured by existing embodied AI datasets [1, 2, 3, 4, 5].
Guidelines
Participants are required to upload their model to our evaluation server (coming soon!) with [EAI22] in the submission title, e.g., [EAI22] Seq2seq Model. The evaluation server automatically evaluates the models on an unseen test set. Final numbers for the prize challenge will be frozen on Jun 12. Winning submissions will be required to submit a brief (private) report of technical details for validity checking. We will also conduct a quick code inspection to ensure that the challenge rules were not violated (e.g., peeking at information in test scenes that is unavailable to either the Commander or the Follower agent).
Dataset
The challenge is based on the TEACh Dataset, which contains over 3,000 episodes of human-human dialogues for guiding a human-controlled agent to complete household chores in the AI2THOR simulator. Agents interact with environments through discrete actions with end effector click positions, and with one another via a text-chat interface.
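To make this interaction interface concrete, the sketch below shows the shape of a single Follower decision: a discrete action name, plus a relative (x, y) click position on the current frame when the action manipulates an object. All class and method names here are illustrative assumptions, not the actual TEACh API; see the GitHub repository for the real interface.

```python
# Illustrative sketch only: names and signatures are assumptions,
# not the actual TEACh API.
from typing import List, Optional, Tuple

class FollowerAgent:
    """Hypothetical Follower mapping an egocentric frame and the
    text-chat history to a discrete action."""

    def get_next_action(
        self,
        frame,                            # egocentric RGB image
        dialogue: List[Tuple[str, str]],  # (speaker, utterance) chat turns
    ) -> Tuple[str, Optional[Tuple[float, float]]]:
        # Navigation actions (e.g., "Forward", "Turn Left") need no click;
        # object interactions (e.g., "Pickup", "Place") also return a
        # relative (x, y) end-effector click position on the frame.
        if self._should_interact(frame, dialogue):
            return "Pickup", (0.43, 0.61)
        return "Forward", None

    def _should_interact(self, frame, dialogue) -> bool:
        # Placeholder decision; a real submission would use a learned policy.
        return False
```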
Timeline

| Milestone | Date |
| --- | --- |
| Challenge opens | Feb 14 |
| Leaderboard closes | Jun 12 |
| Winner announcement | Jun 17 |
Evaluation (Links coming soon!)
You will likely submit your pre-trained Commander and Follower agents, which will then be run on an evaluation server against unseen data.
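As a rough mental model of how the two submitted agents might be exercised, here is a hedged sketch of an episode loop that alternates Commander and Follower turns. Every name (`env`, `commander.act`, the observation keys) is an assumption for illustration, not the real evaluation harness.

```python
# Hedged sketch of an evaluation episode; all names and observation
# keys are assumptions, not the real evaluation harness.
def run_episode(env, commander, follower, max_steps=1000):
    obs = env.reset()  # unseen test scene
    for _ in range(max_steps):
        # The Commander sees privileged task info and may send a chat message.
        utterance = commander.act(obs["commander_view"], obs["task_info"])
        if utterance:
            obs["dialogue"].append(("Commander", utterance))

        # The Follower acts on its egocentric view plus the dialogue so far.
        action, click = follower.get_next_action(obs["follower_view"], obs["dialogue"])
        obs, done = env.step(action, click)
        if done:
            break
    return env.goal_conditions_met()  # feeds the success-rate metric
```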
Metric
The submissions will be ranked by Unseen Success Rate.
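As defined in the TEACh benchmark, an episode counts as a success only if all task goal conditions are satisfied by the end of the episode; Unseen Success Rate averages this binary outcome over episodes in scenes unseen during training. A minimal sketch of the computation, with assumed field names:

```python
# Minimal sketch: Unseen Success Rate as the fraction of unseen-scene
# episodes whose goal conditions were all satisfied (field names assumed).
def unseen_success_rate(episode_results):
    unseen = [ep for ep in episode_results if ep["split"] == "unseen"]
    successes = sum(1 for ep in unseen if ep["all_goal_conditions_met"])
    return successes / len(unseen) if unseen else 0.0
```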
Rules
Participants must include [EAI22] in the submission title, e.g., [EAI22] Seq2seq Model.