
Lei Ke
Postdoc at CMU
I am a Postdoctoral Research Associate in Computer Science at Carnegie Mellon University, working with Katerina Fragkiadaki. Previously, I was a postdoctoral researcher at the Computer Vision Lab of ETH Zurich, working with Martin Danelljan and Fisher Yu. I obtained my Ph.D. from the CSE Department at HKUST in mid-2023, supervised by Chi-Keung Tang and Yu-Wing Tai. During my Ph.D., I also spent two years as a visiting scholar at ETH Zurich. My research goal is to enable machines to achieve 4D and multi-modality scene understanding from videos and images. I received my B.E. degree from the School of Computer Science at Wuhan University.
More info: [Email], [Google Scholar], [GitHub], where my open-source projects have received over 8K GitHub stars.
Updates
- 2024.11: Invited guest lecture at Texas A&M University on Vision Foundation Models.
- 2024.04: Joined CMU as a postdoc working with Prof. Katerina Fragkiadaki. If you're passionate about vision/robotics and have extensive research experience, feel free to reach out for research collaboration!
- 2023.11: We released the Gaussian Grouping project.
- 2023.11: Talks on Scene Understanding with Vision Foundation Models at Stanford SVL and MARVL.
- 2023.06: We released the HQ-SAM and SAM-PT projects.
- 2023.06: Technical committee member of the VOTS 2023 Challenge. The workshop will take place on October 3rd at ICCV 2023.
- 2023.05: Passed the PhD thesis defense and became a Dr.!
- 2023.03: Mask-free VIS accepted by CVPR 2023.
- 2023.01: BCNet accepted by TPAMI 2023.
- 2022.07: Video Mask Transfiner on high-quality VIS is accepted by ECCV 2022! We released the HQ-YTVIS dataset.
- 2022.03: PCAN serves as the baseline in BDD100K MOTS challenge at CVPR 2022 Workshop on Autonomous Driving.
- 2022.03: Mask Transfiner on high-quality instance segmentation is accepted by CVPR 2022!
- 2022.03: Invited talk on PCAN at AI Time Seminar, Tsinghua Univ (Virtual).
- 2022.02: Invited talk on Multiple Object Tracking & Segmentation in Autonomous Driving at TechBeat.
- 2021.12: Invited spotlight talk for PCAN at SwissTech Convention Center, EPFL.
- 2021.10: PCAN for Multiple Object Tracking and Segmentation is accepted by NeurIPS 2021 as spotlight.
- 2021.10: Passed the PhD Qualifying Exam.
- 2021.07: Our paper on occlusion-aware video inpainting accepted by ICCV 2021.
- 2021.03: Our paper BCNet on occlusion-aware instance segmentation accepted by CVPR 2021!
- 2021.01: I joined CVL at ETHz as a visiting PhD student, supervised by Prof. Fisher Yu and Dr. Martin Danelljan.
- 2020.07: Two papers (GSNet and CPMask) accepted by ECCV 2020.
- 2020.02: Our paper on 3D human pose estimation has been accepted by CVPR 2020 for oral presentation.
- 2019.07: Our paper on image captioning accepted by ICCV 2019.
- 2019.05: I will start my Ph.D. study at CSE, HKUST this autumn.
- 2019.02: Our paper on video captioning accepted by CVPR 2019.
Recent Publications
Video Depth without Video Models
arXiv 2024
Bingxin Ke, Dominik Narnhofer, Shengyu Huang, Lei Ke, Torben Peters, Katerina Fragkiadaki, Anton Obukhov, Konrad Schindler
RollingDepth: A universal monocular depth estimator for arbitrarily long videos.
DreamScene4D: Dynamic Multi-Object Scene Generation from Monocular Videos
NeurIPS 2024
Wen-Hsuan Chu*, Lei Ke*, Katerina Fragkiadaki
DreamScene4D for generating 3D dynamic scenes of multiple objects from monocular videos.
Segment Anything in High Quality
NeurIPS 2023
Lei Ke*, Mingqiao Ye*, Martin Danelljan, Yifan Liu, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
(* denotes equal contribution) We propose HQ-SAM to upgrade SAM for high-quality zero-shot segmentation. HQ-SAM received 2000+ GitHub stars in one month.
BiMatting: Efficient Video Matting via Binarization
NeurIPS 2023
Haotong Qin*, Lei Ke*, Xudong Ma, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Xianglong Liu, Fisher Yu
(* denotes equal contribution) An accurate and efficient video matting model using binarization.
Mask Transfiner for High-Quality Instance Segmentation
CVPR 2022
Lei Ke, Martin Danelljan, Xia Li, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
An efficient transformer-based method for highly accurate instance segmentation. Transfiner received 300+ GitHub stars in 3 months.
Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
NeurIPS 2021
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu
Efficient cross-attention on space-time memory for video instance segmentation. Spotlight (3% acceptance rate). PCAN received 200+ GitHub stars in one month.
Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers
CVPR 2021 & TPAMI 2023
Lei Ke, Yu-Wing Tai, Chi-Keung Tang
Instance segmentation with a bilayer decoupling structure for occluder and occludee. BCNet received 300+ GitHub stars in 6 months.
Experiences
- 2024.04—Now: Postdoc at MLD, CMU
- 2023.07—2024.03: Postdoc at CVL, ETHz
- 2021.01—2023.03: Visiting PhD student at CVL, ETHz
- 2019.05—2023.05: Computer Vision Research Assistant at HKUST
- 2017.11—2019.11: Computer Vision Research Intern at Tencent Youtu X-lab, working closely with Wenjie Pei
- 2017.05—2017.10: Engineering Intern at Alibaba
- 2016.05—2017.02: Undergraduate Research Assistant at Wuhan University
Awards
Professional Activities
- Conference Reviewer: CVPR 2020/2021/2022/2023, ICCV 2021, ECCV 2022, NeurIPS 2020/2021/2022/2023, ICML 2021/2022/2023, ICLR 2022/2023, ICRA 2022.
- Journal Reviewer: TPAMI, IJCV, RA-L.
- Teaching Assistant: Computer Graphics (COMP4411), Spring 2019-2020; Introduction to Object-oriented Programming (COMP2011), Fall 2020-2021.
Misc
- Some photographs of natural scenery and delicate architecture from my daily life and travels.