EPIC-KITCHENS: Largest-ever Computer Vision Dataset from wearable cameras

Computer Science researchers at the University of Bristol have released EPIC-KITCHENS, a dataset filmed in 32 kitchens across four cities. The films, which include 11.5 million images, have been annotated with 40,000 action examples and half a million objects. This ground-breaking dataset will help machines to learn and advance first-person vision, enabling improvements in robotics, healthcare and augmented reality.

EPIC-KITCHENS is the largest-ever video dataset recorded with wearable cameras and made available to the academic research community for automatic understanding of object interactions in daily living. It aims to advance the field of first-person vision, which perceives the world from the wearer’s perspective and seeks to understand the wearer’s intentions and interactions. Wearable vision is believed to be the next step beyond hand-held (e.g. mobile) computer vision.

“First-person vision has been hindered for years by the unavailability of big data,” said Dr Dima Damen, Senior Lecturer at the Department of Computer Science. “EPIC-KITCHENS will allow training of data-intensive machine learning algorithms. It offers a bed of interesting challenges, from typical object detection (locating objects in the video) to behaviour analysis and action anticipation.”

EPIC-KITCHENS consists of 11.5 million images, recorded by 32 individuals in their own homes over several consecutive days. The dataset is fully annotated for the actions and objects in these videos: around 40,000 action examples and half a million objects have been annotated. The annotation is unique in that it is based on the participants narrating their own videos, thus reflecting true intention. The ground truth was then crowd-sourced based on these narrations.

EPIC-KITCHENS is the outcome of a 12-month collaboration with the University of Toronto (Canada), a leading research lab in deep learning and computer vision, and the University of Catania (Italy), a highly active research group in first-person vision. Dr Antonino Furnari, a senior research fellow at the University of Catania, expressed his excitement at today’s release: “I believe EPIC-KITCHENS will accelerate research in the field by providing realistic data useful to study and understand human-object interactions.” The collaborators invite research groups worldwide to compete on the available challenges and to introduce new ones. Dr Sanja Fidler, Assistant Professor at the University of Toronto, noted that the plan is to “track the community’s progress on established challenges, with held-out test ground truth, via online leaderboards.”

The effort to collect, annotate, benchmark and release EPIC-KITCHENS required the dedication of 11 researchers across the three universities for a full year. Will Price, a PhD student at the University of Bristol, noted, “As a first-year PhD student, I have found participating in the collection and annotation of a dataset enlightening. Working with a dataset of this size introduced me to the necessity of high-performance computing to process data in a timely fashion, and challenged my skills in training and modifying neural networks including architectural improvements.” EPIC-KITCHENS has made use of Bristol’s BlueCrystal4 as well as GW4’s JADE high-performance computers to process such a large-scale dataset. “This is the largest dataset to be released by Bristol to date, in terms of size. The professional effort of the Data.Bris team has been instrumental in today’s release,” Dr Damen stated.

The annotation of EPIC-KITCHENS was made possible by a charitable donation from Nokia Technologies to Dr Dima Damen, as well as seed funding from Bristol’s Jean-Golding Institute for data-intensive research.