Haoran Geng | 耿浩然

I am a senior student in the Turing Class at the School of Electronic Engineering and Computer Science (EECS), Peking University, with a GPA ranking of 1/95. I am also a research visitor at Stanford University and a research intern at the Beijing Institute for General Artificial Intelligence (BIGAI). I am honored to be advised by Prof. He Wang, Prof. Leonidas J. Guibas, and Dr. Siyuan Huang. In addition, I am privileged to work closely with Prof. Li Yi, Prof. Hao Dong, Prof. Yaodong Yang, and Prof. Baoquan Chen. I am also grateful to have grown up and studied with my twin brother Yiran Geng, which has been a truly unique and special experience for me.

Email / Google Scholar / GitHub / Twitter / WeChat


News

  • [2024/03] I am honored to receive the Berkeley Fellowship Award and Stanford Graduate Fellowship Award.
  • [2023/12] Excited to announce Simulately🤖, a go-to toolkit for robotics researchers navigating diverse simulators!
  • [2023/12] I'm honored to be selected as a Person of the Year at Peking University.
  • [2023/10] I gave an Oral Presentation on UniDexGrasp++ at ICCV 2023.
  • [2023/10] 🎉 UniDexGrasp++ is selected as Best Paper Finalist at ICCV 2023.
  • [2023/08] 🎉 UniDexGrasp++ is selected as Oral Presentation at ICCV 2023.
  • [2023/07] 🎉 Two papers are accepted to ICCV 2023, with UniDexGrasp++ receiving final reviews of all strong accepts (the highest rating).
  • [2023/07] 🎉 One paper is accepted to the Machine Learning journal.
  • [2023/07] We will present UniDexGrasp++ at RSS 2023 @ Learning Dexterous Manipulation on July 14, 2023.
  • [2023/06] Check out our posters for GAPartNet, PartManip, and UniDexGrasp at CVPR 2023 on June 20, 2023.
  • [2023/06] I will present GAPartNet at CVPR 2023 @ StruCo3D on June 18, 2023.
  • [2023/06] I will present Learning Part-Aware Visual Actionable Affordance for 3D Articulated Object Manipulation at CVPR 2023 @ 3DVR on June 18, 2023.
  • [2023/03] 🎉 GAPartNet is selected as a highlight at CVPR 2023 (Top 10% of accepted papers, top 2.5% of submissions).
  • [2023/02] 🎉 Three papers are accepted to CVPR 2023, with GAPartNet receiving final reviews of all accepts (the highest rating).
  • [2023/01] 🎉 One paper is accepted to ICRA 2023.

    Research

    My research interests lie broadly in 3D Computer Vision and Robotics, with a current focus on generalizable object perception, understanding, and manipulation. My research objective is to build an intelligent agent with the robust, generalizable ability to perceive and interact with complex real-world environments. Representative works are highlighted.


    Simulately: Handy information and resources for physics simulators for robot learning research.

    Haoran Geng, Yuyang Li, Yuzhe Qin, Ran Gong, Wensi Ai, Yuanpei Chen, Puhao Li, Junfeng Ni, Zhou Xian, Songlin Wei, Yang You, Yufei Ding, Jialiang Zhang
    Website / GitHub
    Open-source Project
    Featured in CMU 16-831

    Simulately is a project that gathers useful information and resources about robotics & physics simulators for cutting-edge robot learning research.


    SAGE🌿: Bridging Semantic and Actionable Parts for Generalizable Articulated-Object Manipulation under Language Instructions

    Haoran Geng*, Songlin Wei*, Congyue Deng, Bokui Shen, He Wang, Leonidas Guibas
    ArXiv / Project Page / Video / Bibtex
    ICLR 2024 @ LLMAgents

    We present SAGE, a framework that bridges the understanding of semantic and actionable parts for generalizable manipulation of articulated objects, leveraging Large Language Models (LLMs) and Vision-Language Models (VLMs).


    ShapeLLM: Universal 3D Object Understanding for Embodied Interaction

    Zekun Qi, Runpei Dong, Shaochen Zhang, Haoran Geng, Chunrui Han, Zheng Ge, He Wang, Li Yi, Kaisheng Ma
    ArXiv / Project Page / Bibtex
    Under Review

    We present ShapeLLM, the first 3D Multimodal Large Language Model (LLM) designed for embodied interaction, exploring universal 3D object understanding with 3D point clouds and language.


    ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

    Xiaoqi Li, Mingxu Zhang, Yiran Geng, Haoran Geng, Yuxing Long, Yan Shen, Renrui Zhang, Jiaming Liu, Hao Dong
    ArXiv / Project Page / Bibtex
    CVPR 2024

    We present ManipLLM, an innovative approach to robot manipulation that leverages the robust reasoning capabilities of Multimodal Large Language Models (MLLMs) to enhance the stability and generalization of manipulation.



    Make a Donut🍩: Language-guided Hierarchical EMD-Space Planning for Zero-shot Deformable Object Manipulation
    Yang You, Bokui Shen, Congyue Deng, Haoran Geng, He Wang, Leonidas Guibas
    ArXiv / Project Page / Bibtex
    Under Review

    In this work, we introduce a demonstration-free hierarchical planning approach capable of tackling intricate long-horizon tasks without requiring any training.



    UniDexGrasp++: Improving Dexterous Grasping Policy Learning via Geometry-aware Curriculum and Iterative Generalist-Specialist Learning
    Weikang Wan*, Haoran Geng*, Yun Liu, Zikang Shan, Yaodong Yang, Li Yi, He Wang
    (*equal contribution)
    ArXiv / Project Page / Code / Media(CFCS) / Bibtex
    ICCV 2023, Oral Presentation with all top ratings (strong accept)
    ICCV 2023, Best Paper Finalist

    We propose a novel, object-agnostic method for learning a universal policy for dexterous object grasping from realistic point cloud observations and proprioceptive information under a table-top setting.



    ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes
    Ran Gong*, Jiangyong Huang*, Yizhou Zhao, Haoran Geng, Xiaofeng Gao, Qingyang Wu, Wensi Ai, Ziheng Zhou, Demetri Terzopoulos, Song-Chun Zhu, Baoxiong Jia, Siyuan Huang
    ArXiv / Project Page / Code / Bibtex
    ICCV 2023
    CoRL 2022 @ LangRob, Spotlight Presentation

    We present ARNOLD, a benchmark that evaluates language-grounded task learning with continuous states in realistic 3D scenes.




    GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts

    Haoran Geng*, Helin Xu*, Chengyang Zhao*, Chao Xu, Li Yi, Siyuan Huang, He Wang
    ArXiv / Project Page / Code / Dataset / Poster / CVPR Page / Media(CFCS) / Bibtex
    CVPR 2023, Highlight (Top 2.5% of submissions) with all top ratings

    We propose to learn cross-category generalizable object perception and manipulation skills via Generalizable and Actionable Parts (GAParts), and present GAPartNet, a large-scale interactive dataset with rich part annotations.

    PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations
    Haoran Geng*, Ziming Li*, Yiran Geng, Jiayi Chen, Hao Dong, He Wang
    ArXiv / Project Page / Code / Dataset / Poster / CVPR Page / Bibtex
    CVPR 2023

    We introduce a large-scale, cross-category part-based object manipulation benchmark with tasks in realistic, vision-based settings and design a novel augmented state-to-vision distillation method for these challenging tasks.




    UniDexGrasp: Universal Robotic Dexterous Grasping via Learning Diverse Proposal Generation and Goal-Conditioned Policy

    Yinzhen Xu*, Weikang Wan*, Jialiang Zhang*, Haoran Liu*, Zikang Shan, Hao Shen, Ruicheng Wang, Haoran Geng, Yijia Weng, Jiayi Chen, Tengyu Liu, Li Yi, He Wang
    ArXiv / Project Page / Code / CVPR Page / Bibtex
    CVPR 2023

    We tackle the problem of learning universal robotic dexterous grasping from a point cloud observation under a table-top setting.

    Learning Part-Aware Visual Actionable Affordance for 3D Articulated Object Manipulation
    Yuanchen Ju*, Haoran Geng*, Ming Yang*, Yiran Geng, Yaroslav Ponomarenko, Taewhan Kim, He Wang, Hao Dong
    Paper / Video / Workshop
    CVPR 2023 @ 3DVR, Spotlight Presentation

    We introduce a Part-aware Affordance Learning method. Our approach first learns a part prior and subsequently generates an affordance map. We further enhance precision by introducing a part-level scoring system designed to identify the best part for manipulation.




    RLAfford: End-to-End Affordance Learning for Robotic Manipulation

    Yiran Geng*, Boshi An*, Haoran Geng, Yuanpei Chen, Yaodong Yang, Hao Dong
    ArXiv / Project Page / Video / Code / Media (CFCS) / Bibtex
    ICRA 2023

    In this study, we take advantage of visual affordance by using the contact information generated during the RL training process to predict contact maps of interest.


    Before 2022


    Ministry of Education Talent Program Thesis (Physics Track)
    Haoran Geng, Yiran Geng, Xintian Dong, Yue Meng, Xujv Sun, Houpu Niu
    PDF

    I was selected for the Ministry of Education Talent Program and conducted physics research at Nankai University during my high school years.


    Recent Talks

  • I gave an Oral Presentation on UniDexGrasp++ at ICCV 2023 on October 4, 2023.
  • I presented Towards Generalizable Object Perception and Manipulation at Shenlan Open Courses on July 23, 2023.
    Poster / Website / Video
  • I presented Towards Generalizable Object Perception and Manipulation at Stanford UGVR Research Salon on July 9, 2023.
    Slides
  • I presented GAPartNet at CVPR 2023 @ StruCo3D on June 18, 2023.
    Video / Poster
  • I presented Learning Part-Aware Visual Actionable Affordance for 3D Articulated Object Manipulation at CVPR 2023 @ 3DVR on June 18, 2023.
    Video
  • I presented Generalizable Cross-Category Object Perception and Manipulation at Noah's Ark Lab, Huawei on June 2, 2023.
  • I presented PartManip: Learning Cross-Category Generalizable Part Manipulation Policy from Point Cloud Observations at the Turing Student Research Forum on May 27, 2023, and was honored to receive the Best Presentation Award and the Best Poster Award.
    Media / Poster
  • I presented Generalizable Part-based Cross-Category Object Perception and Manipulation at CVPR2023 Seminar (hosted by SenseTime) on May 28, 2023.
    Media / Poster
  • I presented GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts at the Scientific Research Exhibition (hosted by the School of EECS, PKU) on May 13, 2023, and was honored to receive the Best Presentation Award.
    Media / Poster
  • I presented GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts at China3DV 2023 on April 21, 2023, which was selected as a Fastforward Report (11/75).
    Media / Poster / Video
  • I presented Learning Generalizable and Actionable Parts for Cross-Category Object Perception and Manipulation at the Turing Student Research Forum at CFCS, Peking University on June 8, 2022, and was honored to receive the Outstanding Presentation Award.
    Video / Media

    Services

  • Reviewer: CVPR, ICCV, NeurIPS, RSS
  • I served as the chair of the Turing Student Research Forum 2023.
  • One of the leaders of the Linux Club of Peking University (LCPU).
  • We organized the High Performance Computing Integrated Competitiveness Competition, where I served as the manager of the AI track.

    Experience

    Stanford University
    2023.02 - Present
    Visiting Research Student through the UGVR Program
    Research Advisor: Prof. Leonidas J. Guibas
    Beijing Institute for General Artificial Intelligence (BIGAI)
    2021.12 - Present
    Research Intern
    Research Advisor: Dr. Siyuan Huang
    Academic Advisor: Prof. Song-Chun Zhu
    Visual Computing and Learning Lab (VCL)
    2022.06 - 2022.09
    Summer Research Intern
    Research Advisor: Prof. He Wang
    Academic Advisor: Prof. Baoquan Chen
    Peking University (PKU)
    2020.09 - Present
    Undergraduate Student, Turing Class
    GPA ranking: 1/95
    Research Advisor: Prof. He Wang

    Selected Awards and Honors

  • 2024: Stanford Graduate Fellowship Award, Stanford
  • 2024: Berkeley Fellowship Award, UC Berkeley
  • 2023: ICCV Best Paper Award (Marr Prize) Finalist
  • 2023: Person of the Year (10 people/year), Peking University
  • 2023: Research Rising Star Award (First Prize), BIGAI
  • 2023: Outstanding Overseas Exchange Scholarship
  • 2023: Academic Innovation Award of Peking University
  • 2023: May Fourth Scholarship (the highest-level scholarship at Peking University, 125/65k+)
  • 2021-2023: Merit Student of Peking University
  • 2023: Turing Student Research Forum: Best Presentation Award & Best Poster Award
  • 2023: School of EECS Research Exhibition: Best Presentation Award
  • 2022: Center on Frontiers of Computing Studies (CFCS) Outstanding Student
  • 2022: Arawana Scholarship
  • 2021-2023: Zhongying Moral Education Scholarship
  • 2022: Turing Student Research Forum: Outstanding Presentation Award
  • 2021: National Scholarship (Highest Honor for undergraduates in China)
  • 2021: SenseTime Scholarship (youngest winner; 30 recipients per year in China)
  • 2021: Ministry of Education Top Talent Program Scholarship

    This homepage is designed based on Jon Barron's website and deployed on GitHub Pages. Last updated: Aug. 29, 2023.
    © 2024 Haoran Geng