About this dataset


To advance research on leveraging semantic information and multi-sensor data to improve the performance of SLAM and 3D reconstruction in complex indoor scenes, we propose a novel and complex indoor dataset named CID-SIMS, which provides semantically annotated RGBD images, inertial measurement unit (IMU) measurements and wheel odometer data from the viewpoint of a ground wheeled robot. The dataset consists of 22 challenging sequences captured in 9 different scenes, covering office building and apartment environments. Notably, our dataset achieves two significant breakthroughs. First, semantic information and multi-sensor data are provided together for the first time. Second, GeoSLAM is used for the first time to generate ground truth trajectories and 3D point clouds with an accuracy within 2 cm. With spatially and temporally synchronized ground truth trajectories and 3D point clouds, our dataset supports the evaluation of SLAM and 3D reconstruction algorithms in a unified global coordinate system.

Sequences


Sequences in this dataset are mainly recorded from a ground wheeled robot with two camera views, a high view approximately 0.26 m above the ground and a low view approximately 0.12 m above the ground. The exceptions are two sequences that include handheld segments recorded while going downstairs. The dataset covers a large variety of scenes and motion modes, and includes challenging conditions such as motion blur, illumination changes, reflective objects and weak textures.

Statistics


Office Building

We collect 13 sequences in a typical office building, which cover 6 different scenes. Sequences in floor scenes are challenging because of insufficient rotation and weak textures.

  • Office: Three sequences are captured inside a small office room (about 6 m × 7 m), where all the objects are static.
  • Floor14: Three sequences are captured in the open office area on the 14th floor of a building, including illumination changes and dynamic pedestrians.
  • Downstairs: One sequence is collected in the stairwell by carrying the robot down the stairs by hand. The sequence involves 6-DoF motion, and wheel odometer data is missing.
  • Two floors: One long sequence is recorded by first moving on the 14th floor, then going downstairs, then moving on the 13th floor and finally returning to the 14th floor. Wheel odometer data is unavailable while the robot passes through the stairs.
  • Floor3: Three sequences are captured along corridors (about 70 m) with straight-line motion, dynamic pedestrians, reflective objects and an uphill movement.
  • Floor13: Two long sequences are recorded in a long corridor (about 65 m).

Apartment

We record 9 sequences in real living environments, which cover 3 different apartments. Each apartment is about 10 m × 10 m and includes a living room, two or three bedrooms, a bathroom and a kitchen. The rooms are cluttered with small and unstructured obstacles. Sequences in apartment scenes contain rich loops and rotational motions. There are also challenging conditions such as motion blur, weak textures and reflective objects.

  • Apartment1: Three sequences are captured in an apartment during the day, where the tiled floor is reflective.
  • Apartment2: Three sequences are collected in an apartment during the day. The apartment is messy, and dust and litter make the floor uneven.
  • Apartment3: Three sequences are acquired in an apartment at night with the lights turned on.
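
As a quick consistency check on the counts above, the per-scene sequence numbers can be tallied in a few lines of Python. This is only a minimal sketch: the dictionary layout and the names SCENES, WHEEL_ODOMETRY_GAPS and summarize are illustrative assumptions, not part of any official CID-SIMS tooling or file index. The numbers themselves are transcribed from the scene lists in this section.

```python
# Sequence counts per scene, transcribed from the scene lists above.
# The structure and names here are hypothetical, chosen for illustration only.
SCENES = {
    "Office Building": {
        "Office": 3,
        "Floor14": 3,
        "Downstairs": 1,
        "Two floors": 1,
        "Floor3": 3,
        "Floor13": 2,
    },
    "Apartment": {
        "Apartment1": 3,
        "Apartment2": 3,
        "Apartment3": 3,
    },
}

# Scenes described above as having missing or partially unavailable
# wheel odometer data (the handheld stair segments).
WHEEL_ODOMETRY_GAPS = {"Downstairs", "Two floors"}


def summarize(scenes):
    """Return (sequences per environment, total sequences, total scenes)."""
    per_env = {env: sum(counts.values()) for env, counts in scenes.items()}
    total_sequences = sum(per_env.values())
    total_scenes = sum(len(counts) for counts in scenes.values())
    return per_env, total_sequences, total_scenes


per_env, total_sequences, total_scenes = summarize(SCENES)
print(per_env)                        # {'Office Building': 13, 'Apartment': 9}
print(total_sequences, total_scenes)  # 22 9
```

The totals agree with the 22 sequences and 9 scenes stated in the overview. When selecting sequences for wheel-odometry-based methods, the two scenes in WHEEL_ODOMETRY_GAPS are the ones to treat separately.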