Sijie Zhu
Research Scientist
ByteDance Inc.
Email: zhusijie006 at gmail.com
Feel free to reach out with any questions or potential collaborations.
Google Scholar | GitHub
Sijie Zhu is a Research Scientist at ByteDance Inc. He received his PhD from UCF, supervised by Prof. Chen Chen, and won the award for Excellence in Outstanding Dissertation (College of Engineering & Computer Science, 2022). He received his master's degree from the University of Chinese Academy of Sciences and his bachelor's degree from the University of Science and Technology of China.
During summer 2021, he was a research intern at Adobe Research, working with Zhe Lin, Scott Cohen, Zhifei Zhang, and Jason Kuen.
During summer 2022, he was a research intern at ByteDance, working with Heng Wang, Linjie Yang, Xiaohui Shen, and Quan Wang.
Research Interests
My research interests include:
Multimodal LLM for Image/Video Understanding
Intelligent/Generative Image/Video Editing
Geo-localization / Metric Localization
Metric Learning / Image Retrieval
Selected Publications
-
Multi-Reward as Condition for Instruction-based Image Editing
Xin Gu, Ming Li, Libo Zhang, Fan Chen, Longyin Wen, Tiejian Luo, Sijie Zhu
arXiv:2411.04713  
-
Beyond Raw Videos: Understanding Edited Videos with Large Multimodal Model
Lu Xu, Sijie Zhu, Chunyuan Li, Chia-Wen Kuo, Fan Chen, Xinyao Wang, Guang Chen, Dawei Du, Ye Yuan, Longyin Wen
arXiv:2406.10484  
[Dataset]  
-
CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts
Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin Wen
arXiv:2405.05949  
[Code]  
-
Edit3K: Universal Representation Learning for Video Editing Components
Xin Gu, Libo Zhang, Fan Chen, Longyin Wen, Yufei Wang, Tiejian Luo, Sijie Zhu
arXiv:2403.16048  
-
R2Former: Unified Retrieval and Reranking Transformer for Place Recognition
Sijie Zhu, Linjie Yang, Chen Chen, Mubarak Shah, Xiaohui Shen, Heng Wang
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023 (Highlight, top 10% of accepted papers)  
[Code]  
(The first place at the MSLS Place Recognition Challenge.)
-
TopNet: Transformer-based Object Placement Network for Image Compositing
Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023  
-
TransGeo: Transformer Is All You Need for Cross-view Image Geo-localization
Sijie Zhu, Mubarak Shah, Chen Chen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022  
[Code]  
-
GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing
Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen
European Conference on Computer Vision (ECCV), 2022  
-
Deep Learning-Based Human Pose Estimation: A Survey
Ce Zheng, Wenhan Wu, Chen Chen, Taojiannan Yang, Sijie Zhu, Ju Shen, Nasser Kehtarnavaz, Mubarak Shah
ACM Computing Surveys  
[Project Page]  
[Versions: 1, 2, 3, 4]
-
3D Human Pose Estimation with Spatial and Temporal Transformers
Ce Zheng, Sijie Zhu, Matias Mendieta, Taojiannan Yang, Chen Chen, Zhengming Ding
IEEE International Conference on Computer Vision (ICCV), 2021  
[Code]  
-
VIGOR: Cross-View Image Geo-localization beyond One-to-one Retrieval
Sijie Zhu, Taojiannan Yang, Chen Chen
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021  
[Dataset and Code]  
[Poster]
-
A3D: Adaptive 3D Networks for Video Action Recognition
Sijie Zhu, Taojiannan Yang, Matias Mendieta, Chen Chen
arXiv:2011.12384
-
GradAug: A New Regularization Method for Deep Neural Networks
Taojiannan Yang, Sijie Zhu, Chen Chen
Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS), 2020  
[Poster]  
[Code]
-
Visual Explanation for Deep Metric Learning
Sijie Zhu, Taojiannan Yang, Chen Chen
IEEE Transactions on Image Processing (TIP), 2021  
[Code]  
[Versions: 1, 2, 3, 4]
-
MutualNet: Adaptive ConvNet via Mutual Learning from Network Width and Resolution
Taojiannan Yang, Sijie Zhu, Chen Chen, Yan Shen, Mi Zhang, Andrew Willis
European Conference on Computer Vision (ECCV), 2020 (Oral)  
[Code]  
[Video Presentation]
-
Revisiting Street-to-Aerial View Image Geo-localization and Orientation Estimation
Sijie Zhu, Taojiannan Yang, Chen Chen
Winter Conference on Applications of Computer Vision (WACV), 2021
-
Density Map Guided Object Detection in Aerial Images
Changlin Li, Taojiannan Yang, Sijie Zhu, Chen Chen, Shanyue Guan
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (EarthVision Workshop), 2020  
[Code]
-
Video Anomaly Detection for Smart Surveillance
Sijie Zhu, Chen Chen, Waqas Sultani
Book Chapter of Computer Vision: A Reference Guide, 2020  
[arXiv]
→ Full list of publications