
StandardSim: A Synthetic Dataset For Retail Environments

We’re proud to present StandardSim: a first-of-its-kind synthetic dataset for retail environments. Our pioneering work has recently been accepted as a research paper to ICIAP 2021. Our contributions include (1) the release of a large-scale, photorealistic dataset for use in non-commercial research, (2) the introduction of a computer vision task we call change detection, and (3) a codebase to train deep learning model benchmarks on this novel dataset.


Change Detection Task: Note the objects appearing or disappearing on the shelves, as well as slight variations in lighting.


Autonomous checkout systems rely on visual and sensory inputs to carry out fine-grained scene understanding in retail environments. Retail environments present unique challenges compared to typical indoor scenes, owing to the vast number of densely packed, unique and yet similar objects. The problem becomes even more pronounced when only RGB input is available, especially for data-hungry tasks such as instance segmentation.

To address the lack of datasets for retail, we present StandardSim: a large-scale, photorealistic synthetic dataset featuring annotations for semantic segmentation, instance segmentation, depth estimation, and object detection. Our dataset provides multiple views per scene and enables multi-view representation learning. Further, we introduce a novel task central to autonomous checkout called change detection—requiring pixel-level classification of takes, puts, and shifts in objects over time. We benchmark widely-used models for segmentation and depth estimation on our dataset, show that our test set constitutes a difficult benchmark compared to current smaller-scale datasets, and show that our training set provides models with crucial information for autonomous checkout tasks.
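To make the change detection task more concrete, here is a minimal sketch of how a predicted change mask could be scored against a ground-truth mask using per-class intersection-over-union. It is an illustration only, not the evaluation code from our codebase: the class IDs in `CHANGE_CLASSES`, the list-of-lists mask format, and the `per_class_iou` helper are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical label encoding for illustration; the dataset's actual class IDs may differ.
CHANGE_CLASSES = {0: "no_change", 1: "take", 2: "put", 3: "shift"}

Mask = List[List[int]]  # one class ID per pixel, row by row


def per_class_iou(pred: Mask, target: Mask) -> Dict[str, float]:
    """Return intersection-over-union per change class for two same-sized masks."""
    if len(pred) != len(target) or any(len(p) != len(t) for p, t in zip(pred, target)):
        raise ValueError("pred and target must have the same shape")

    inter = {c: 0 for c in CHANGE_CLASSES}
    union = {c: 0 for c in CHANGE_CLASSES}
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts toward both intersection and union of its class.
                inter[p] += 1
                union[p] += 1
            else:
                # Disagreement grows the union of both classes involved.
                union[p] += 1
                union[t] += 1

    # Classes absent from both masks are skipped rather than scored.
    return {CHANGE_CLASSES[c]: inter[c] / union[c] for c in CHANGE_CLASSES if union[c] > 0}


# Toy 2x4 masks: the prediction misses one "take" pixel and confuses one "put" with "shift".
target = [[0, 0, 1, 1],
          [0, 2, 2, 3]]
pred = [[0, 0, 1, 0],
        [0, 2, 3, 3]]

scores = per_class_iou(pred, target)
print(scores)  # {'no_change': 0.75, 'take': 0.5, 'put': 0.5, 'shift': 0.5}
```

Scoring each class separately keeps the large "no change" background from hiding errors on the much rarer take, put, and shift pixels.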

StandardSim is the result of work from Dan Fiscetti, Kenny Kihara, and Mohammed Azeem Sheikh. It also includes significant contributions from Cristina Mata and Nick Locascio, who are no longer at Standard AI but are co-authors on the technical paper.

See here an example of data augmentation via domain texture randomization.

See here our various Data Modalities, including Before Image, After Image, Depth Image, and Normals.

See here StandardSim’s size and feature set in comparison to other datasets.

We’re looking for talented engineers to work with us and help solve complex real-world problems at scale. If you’d like to join us, please apply here:
