About Me


I am a research scientist at Adobe Research. My current focus is video generation.

Prior to that, I completed my Ph.D. (Dec 2023) at the College of Engineering, Computing, and Cybernetics (CECC) at the Australian National University (ANU), where I was also a research student at the Australian Centre for Robotic Vision (ACRV@ANU). I was advised by Prof. Stephen Gould (ANU), Prof. Qi Wu (UoA), and Prof. Lexing Xie (ANU).

I love Embodied AI and AIGC! – “What I cannot create I do not understand.”

My latest work includes Video/3D Generation and Training LLMs/VLMs for Navigation.


News

2024.02.12   Finally! My first day at Adobe Research as a full-time research scientist! So happy to be back and working with everyone! 😊🔥🔥

2024.01.20   Thrilled to share that our LRM: Large Reconstruction Model for Single Image to 3D and Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model have been accepted to ICLR 2024 as Oral and Poster papers! 😆🎉🎉 A wonderful ending to my PhD @ANUCECC and a fantastic start to my new journey @AdobeResearch! Thanks @HaoTan for your recognition and your great advice ⭐! Thanks Team 🙌! Thanks Adobe! 😊❤️❤️

2023.12.09   Congrats to @GengzeZhou on his paper's acceptance to AAAI 2024 – NavGPT: Explicit Reasoning in Vision-and-Language Navigation with Large Language Models! An amazing piece of work at the start of his PhD and a wonderful attempt to integrate LLMs and Embodied Agents! 😀🔥🔥

2023.07.14   Our papers Scaling Data Generation in Vision-and-Language Navigation (Oral) and Learning Navigational Visual Representations with Semantic Map Supervision have been accepted to ICCV 2023! The projects were completed or initiated during my first internship at Adobe! It was my great pleasure to work on them with my friends around the world (@ZunWang, @JialuLi, @HaoTan)! 😀😊❤️ Thanks heaps to OpenGVLab@Shanghai AI Laboratory for the great support! ⭐🙌

2023.02.19   Joined Adobe Research again as an intern! Working on Text-to-3D Generation and Single-Image-to-3D Reconstruction, totally unfamiliar topics to me! 😊🔥🔥

2022.12.29   Paper HOP+: History-Enhanced and Order-Aware Pre-Training for Vision-and-Language Navigation by Yanyuan Qiao, Yuankai Qi, Zheng Yu, Peng Wang, Qi Wu, and myself has been accepted to TPAMI! Congrats Yanyuan!!! 😀😀😀

2022.06.19

  • Attending CVPR 2022 in person!!! Finally meeting so many great researchers! I have learned so much!!! ❤️❤️❤️
  • Congrats to Zun Wang, Dong An and Team JoyBoy for winning the 1st Place in the Room-Across-Room (RxR) Habitat Challenge 2022!!! 😆⚡⚡

2022.06.15   Visiting Professor Xin Eric Wang and the ERIC Lab at the University of California, Santa Cruz! It was amazing to learn from so many young researchers! 😄

2022.05.10   Gave an invited talk at the NLP Lab at Fudan University, really enjoyed chatting with everyone! 😄

2022.03.28   My VLN project has been selected for the NVIDIA Academic Hardware Grant Program! 😆 Thank you so much, NVIDIA, for the A100 GPU grant!!! 😭😭😭

2022.03.14   I have started a research internship at the Creative Intelligence Lab at Adobe Research in San Jose, California, USA!!! 😆😆😆

2022.03.02

  • Our paper Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation has been accepted to CVPR 2022! 😊 I am so happy to share lots of thoughts about VLN in this paper! See you guys in New Orleans! ❤️
  • Paper HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation by Yanyuan Qiao, Yuankai Qi, Peng Wang, Qi Wu, and myself has been accepted to CVPR 2022! Congrats Yanyuan on the first paper of her PhD! 😀

2021.08.17   Paper The Road To Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation by Yuankai Qi, Zizheng Pan, Ming-Hsuan Yang, Anton van den Hengel, Qi Wu, and myself has been accepted to ICCV 2021! 😀

2021.04.10   Paper Learning Structure-Aware Semantic Segmentation with Image-Level Supervision by Jiawei Liu, Dr. Jing Zhang, Prof. Nick Barnes, and myself has been accepted to IJCNN 2021! Congrats Jiawei on his first paper in computer vision! 😀

2021.03.16   Our Thinking-VLN repo is online! Come enjoy our immature ideas and share your thoughts! Just for fun thinking!

2021.03.06   Our paper A Recurrent Vision-and-Language BERT for Navigation has been accepted to CVPR 2021 as an Oral paper with 3 strong accepts! 😆😆😆

2020.10.05   I gave a guest lecture in the Deep Learning Course at ANU (ENGN8536) about Vision and Language Research! My first lecture at Uni! Nervous and Fun! 😀

2020.09.26   Our paper Language and Visual Entity Relationship Graph for Agent Navigation has been accepted to NeurIPS 2020! 😀

2020.09.15   Our paper Sub-Instruction Aware Vision-and-Language Navigation has been accepted to EMNLP 2020! My first paper! 😊


Research

Scaling Data Generation in Vision-and-Language Navigation
Zun Wang, Jialu Li, Yicong Hong, Yi Wang, Qi Wu, Mohit Bansal, Stephen Gould, Hao Tan, Yu Qiao
International Conference on Computer Vision (ICCV), 2023

Learning Navigational Visual Representations with Semantic Map Supervision
Yicong Hong, Yang Zhou, Ruiyi Zhang, Franck Dernoncourt, Trung Bui, Stephen Gould, Hao Tan
International Conference on Computer Vision (ICCV), 2023

Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation
Yicong Hong, Zun Wang, Qi Wu, Stephen Gould
Conference on Computer Vision and Pattern Recognition (CVPR), 2022

A Recurrent Vision-and-Language BERT for Navigation
Yicong Hong, Qi Wu, Yuankai Qi, Cristian Rodriguez-Opazo, Stephen Gould
Conference on Computer Vision and Pattern Recognition (CVPR), 2021

Language and Visual Entity Relationship Graph for Agent Navigation
Yicong Hong, Cristian Rodriguez-Opazo, Yuankai Qi, Qi Wu, Stephen Gould
Conference on Neural Information Processing Systems (NeurIPS), 2020

Sub-Instruction Aware Vision-and-Language Navigation
Yicong Hong, Cristian Rodriguez-Opazo, Qi Wu, Stephen Gould
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2020

1st Place Solutions for RxR-Habitat Vision-and-Language Navigation Competition (CVPR 2022)
Dong An, Zun Wang, Yangguang Li, Yi Wang, Yicong Hong, Yan Huang, Liang Wang, Jing Shao
Room-Across-Room (RxR) Habitat Challenge (CVPR Embodied AI Workshop), 2022

HOP: History-and-Order Aware Pre-training for Vision-and-Language Navigation
Yanyuan Qiao, Yuankai Qi, Yicong Hong, Zheng Yu, Peng Wang, Qi Wu
Conference on Computer Vision and Pattern Recognition (CVPR), 2022

The Road to Know-Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation
Yuankai Qi, Zizheng Pan, Yicong Hong, Ming-Hsuan Yang, Anton van den Hengel, Qi Wu
International Conference on Computer Vision (ICCV), 2021