Advancing VLN: O3D-SIM's Role in Human-Robot Collaboration and Simulation

16 Dec 2025

O3D-SIM introduces an open-set 3D representation that significantly improves the success rate of natural language queries.

O3D-SIM Visualization: Accurately Identifying Unseen Objects and Instances

16 Dec 2025

Qualitative analysis showcases O3D-SIM's open-set capability, identifying objects like wheelchairs and mannequins unseen by closed-set methods.

Quantitative Evaluation of O3D-SIM: Success Rate on Matterport3D VLN Tasks

16 Dec 2025

Quantitatively evaluates O3D-SIM on the Matterport3D dataset in the Habitat simulator, using Success Rate as the metric.

Evaluating Novel 3D Semantic Instance Map for Vision-Language Navigation

16 Dec 2025

The experimental section details the evaluation of the O3D-SIM representation and its integration with ChatGPT for Vision-Language Navigation (VLN).

VLN: LLM and CLIP for Instance-Specific Navigation on 3D Maps

16 Dec 2025

The Language-Guided Navigation module leverages an LLM (like ChatGPT) and the open-set O3D-SIM.

3D Mapping Initialization: Using RGB-D Images and Camera Parameters

14 Dec 2025

O3D-SIM creation starts with capturing posed RGB-D images and camera parameters.

Building Open-Set 3D Representation: Feature Fusion and Geometric-Semantic Merging

14 Dec 2025

O3D-SIM is built by projecting 2D masks and embeddings to 3D, using DBSCAN for initial refinement.

Open-Set Semantic Extraction: Grounded-SAM, CLIP, and DINOv2 Pipeline

13 Dec 2025

Describes how open-set semantic instance information is extracted using Grounded-SAM, CLIP, and DINOv2.

Semantic Instance Extraction: CLIP and DINO Features for 3D Mapping

10 Dec 2025

Details the O3D-SIM pipeline for VLN, which extracts open-set semantic instance information (masks, CLIP/DINO features) from RGB-D images.