1. arXiv
    D3Net: A Speaker-Listener Architecture for Semi-supervised Dense Captioning and Visual Grounding in RGB-D Scans
    Chen, Dave Zhenyu, Wu, Qirui, Niessner, Matthias, and Chang, Angel X.
  2. CVPR
    Scan2Cap: Context-aware Dense Captioning in RGB-D Scans
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021


  1. ECCV
    ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
    Chen, Dave Zhenyu, Chang, Angel X., and Niessner, Matthias
    In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XX, 2020