Xuri Ge's Homepage (葛旭日的个人主页)

Xuri Ge (葛旭日)[CV]

Best way to reach me → Email: xuri.ge@sdu.edu.cn or xurigexmu@gmail.com

I (Xuri Ge) am currently an Assistant Professor at the School of Artificial Intelligence, Shandong University (SDU), China. I received my PhD from the School of Computing Science, University of Glasgow, Scotland, UK, in 2024, where I was a member of the GAIR-Lab in the Information, Data and Analysis (IDA) group. My principal supervisor was Prof. Joemon M Jose and my second supervisor was Dr. Gerardo Aragon Camarasa. I received my master's degree from Xiamen University in 2020, advised by Prof. Rongrong Ji and Minghui Shi. During my master's studies, I worked in the Laboratory of MAC, Artificial Intelligence Department, School of Informatics, Xiamen University, China.
My current research focuses on multimodal information retrieval, computer vision, natural language processing, and multimedia sentiment analysis, including cross-modal retrieval, multimodal recommendation, facial action unit detection, image captioning, and medical image analysis.

I am actively seeking self-motivated master's students (MSc) and research assistants (RA) with a strong background in AI.
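I recruit master's students on a long-term basis, through both recommended (exam-exempt) admission and the postgraduate entrance exam; prospective PhD applicants are also welcome to get in touch.
I have long-term collaborations with several research groups in China and abroad and can recommend students who wish to pursue a PhD or master's degree overseas. I also collaborate with several companies in China and can recommend internship and job opportunities, for example at Huawei, Tencent, and Shanghai AI Lab.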
The research topics mainly include but are not limited to the following:

Multi-modal Representation Learning

Multi-modal Large Language Models

Computer Vision (CV) and Natural Language Processing (NLP)

Multimodal Information Retrieval/Recommendation

Multimedia Sentiment Analysis (Facial Action Unit Recognition/Sentiment Recognition)
If you’re interested, please don’t hesitate to reach out via email (xuri.ge@sdu.edu.cn).

--Latest News--

[Pin to Top]March 23, 2026
We will organize the 2nd EReL@MIR Workshop on Efficient Representation Learning for Multimodal Information Retrieval at ACM Multimedia 2026 (Rio de Janeiro, Brazil, 10–14 November 2026). All submissions about multimodal IR are welcome!

[Pin to Top]Apr 5, 2026
One paper is accepted by Automation in Construction (Q1 Top, Impact Factor: 11.5)!!! Congratulations to Tianhao

[Pin to Top]Apr 5, 2026
One paper is accepted by ACM ICMR 2026 (CCF-B)!!! Congratulations to Kaiwen

[Pin to Top]Apr 5, 2026
One paper is accepted by ACM SIGIR 2026 (CCF-A)!!! Congratulations to Junchen

[Pin to Top]March 17, 2026
Three papers are accepted by IEEE ICME 2026 (CCF-B)!!! Congratulations to all authors.

[Pin to Top]Jan 12, 2026
Two papers are accepted by The Web Conference 2026 (WWW) (CCF-A)!!! Congratulations to Chunhao, Yuanzi.

[Pin to Top]Jan 12, 2026
One paper is accepted by Information Processing and Management (IP&M) (SCI Zone 1, CCF-B)!!! Congratulations to Hui Ye.

[Pin to Top]Jan 12, 2026
One paper is accepted by AAAI2026 (CCF-A)!!! Congratulations to Feng Zhang.

[Pin to Top]Aug 21, 2025
We will organize R3AG 2025: The Second Workshop on Refined and Reliable Retrieval-Augmented Generation at SIGIR-AP 2025 (Xi'an, China) on December 10, 2025. All submissions about IR are welcome!

[Pin to Top]Sept 8, 2025
One paper is accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE) (CCF-A, JCR Q1)!!! Congratulations to Junchen.

[Pin to Top]Aug 21, 2025
One paper is accepted by Information Fusion (CCF-A,IF=15.5)!!! Congratulations to Linqing.

[Pin to Top]July 1, 2025
One paper is accepted by ACM MM2025 (CCF-A)!!! Congratulations to Songpei.

[Pin to Top]May 1, 2025
One paper is accepted by ICML2025 (CCF-A)!!! Congratulations to Yaoqin and Junchen.

[Pin to Top]Jun 20, 2025
One paper is accepted by Transactions on Knowledge and Data Engineering (TKDE) (SCI Zone 1)!!! Congratulations to Jie Wang.

[Pin to Top]Mar 21, 2025
We will organize the 1st NIP@IR Workshop at SIGIR 2025 (Padua, Italy) on 17 July 2025. All submissions about IR are welcome!

Apr 29, 2025
We organized The 1st EReL@MIR Workshop at The Web Conference 2025, WWW2025 (Sydney, Australia) on 28-29 April, 2025. All submissions about efficient MIR are welcome!

Publications

Published:

Xuri Ge, Chunhao Wang, Xindi Wang, Zheyun Qin, Zhumin Chen, Xin Xin^✉.
MCoT-MVS: Multi-level Vision Selection by Multi-modal Chain-of-Thought Reasoning for Composed Image Retrieval. [PDF]
WWW-2026 (CCF-A)

Hui Ye, Xuri Ge^✉, Junqi Wang, Junchen Fu, Xin Xin, ..., Pengjie Ren, Zhumin Chen.
Beyond efficient fine-tuning: Efficient hybrid fine-tuning of CLIP models guided by explainable ViT attention. [PDF]
IP&M (SCI Zone 1, CCF-B)

Fuhai Chen, Feng Zhang, Xiaoguang Ma, Yiyi Zhou, Jiarong Liu, Xuri Ge^✉
Vision-language Incremental Learning with Dual Class-individual Memory. [PDF]
AAAI-2026 (CCF-A)

Junchen Fu, Xuri Ge^✉, Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, Kaiwen Zheng, Yongxin Ni, Joemon M Jose
Efficient and Effective Adaptation of Multimodal Foundation Models in Sequential Recommendation. [PDF]
IEEE Transactions on Knowledge and Data Engineering (TKDE) (CCF-A, IF=10.4)

Linqing Li, Zhifeng Wang, Joemon M. Jose, Xuri Ge^✉
LLM Supporting Knowledge Tracing Leveraging Global Subject and Student Specific Knowledge Graphs. [PDF]
Information Fusion (CCF-A, IF=15.5)

Songpei Xu, Xuri Ge^✉, Chaitanya Kaul, Roderick Murray-Smith
HandSolo: A Mid-Air Hand Pose Interaction Method Based on Disentangled Degrees-of-Hand-Freedom. [Arxiv]
ACM Multimedia 2025 (ACM MM2025) (CCF-A)

Yaoqin He, Junchen Fu, Kaiwen Zheng, Songpei Xu, Fuhai Chen, Jie Li, Joemon M. Jose, Xuri Ge^✉
Double-Filter: Efficient Fine-tuning of Pre-trained Vision-Language Models via Patch&Layer Filtering. [Arxiv]
The 42nd International Conference on Machine Learning (ICML2025) (CCF-A)

Xuri Ge, Linqing Li, Songpei Xu, Kaiwen Zheng, Yaoqin He, Junchen Fu, Joemon M. Jose
The DenseCap-Guided Attention Network For Image-Text Matching. [Arxiv]
Companion Proceedings of the ACM on Web Conference 2025. (CCF-A)

Jie Wang, Alexandros Karatzoglou, Ioannis Arapakis, Joemon M. Jose, Xuri Ge
Beyond Accuracy: Decision Transformers for Reward-driven Multi-objective Recommendations. [Arxiv]
Transactions on Knowledge and Data Engineering(TKDE) (IF=10.4, JCR Q1)

Kaiwen Zheng, Xuri Ge^✉, Junchen Fu, Jun Peng, Joemon M. Jose
Multimodal Representation Learning Techniques for Comprehensive Facial State Analysis. [Arxiv]
IEEE International Conference on Multimedia&Expo 2025 (ICME), 2025. (CCF-B)

Xuri Ge, Fuhai Chen, Songpei Xu, Fuxiang Tao, Jie Wang and Joemon M. Jose
Hire: Hybrid-modal Interaction with Multiple Relational Enhancements for Image-Text Matching. [Arxiv]
ACM Transactions on Intelligent Systems and Technology (TIST), 2025. (IF=7.2, JCR Q1)

Yidan Wang, Xuri Ge, Xin Chen, Ruobing Xie, Yan Su, Xu Zhang, Zhumin Chen, Jun Ma, Xin Xin
Exploration and Exploitation of Hard Negative Samples for Cross-Domain Sequential Recommendation. [Coming soon]
The Eighteenth International Conference on Web Search and Data Mining(WSDM), 2025. (CCF-B)

Xuri Ge, Junchen Fu, Fuhai Chen, Shan An, Nicu Sebe, Joemon M Jose
Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning. [pdf]
The 32nd ACM Multimedia Conference (ACM MM24), 2024. (Core Rank A*, CCF-A)

Junchen Fu, Xuri Ge^✉, Xin Xin, Alexandros Karatzoglou, Ioannis Arapakis, Jie Wang, Joemon M. Jose
IISAN: Efficiently Adapting Multimodal Representation for Sequential Recommendation with Decoupled PEFT. [pdf] [code]
the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024. (Core Rank A*, CCF-A)

Zijun Long, Xuri Ge^✉, Richard Mccreadie, Joemon M. Jose
CFIR: Fast and Effective Long-Text To Image Retrieval for Large Corpora. [pdf] [code]
the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024. (Core Rank A*, CCF-A)

Fuxiang Tao, Xuri Ge^✉, Wei Ma, Anna Esposito, Alessandro Vinciarelli.
Cross-Data Multilevel Attention for Depression Detection: Analyzing the Interplay Between Read and Spontaneous Speech. [pdf]
IEEE International Conference on Bioinformatics and Biomedicine(IEEE BIBM 2024), 2024.[CCF B]

Xuri Ge, Songpei Xu, Fuhai Chen^✉, Jie Wang, Guoxin Wang, Shan An, Joemon M. Jose^✉
3SHNet: Boosting Image-Sentence Retrieval via Visual Semantic-Spatial Self-Highlighting. [J] [pdf] [code]
Information Processing and Management (IP&M), 2024. (IF=8.6, JCR Q1)

Jie Wang, Alexandros Karatzoglou, Ioannis Arapakis, Xin Xin, Xuri Ge, Joemon M. Jose
Sparks of Surprise: Multi-objective Recommendations with Hierarchical Decision Transformers for Diversity, Novelty, and Serendipity. [pdf]
33rd ACM International Conference on Information and Knowledge Management (CIKM), 2024. (Core Rank A)

Tong Shi, Xuri Ge, Joemon M Jose, Nicolas Pugeault, Paul Henderson
Detail-Enhanced Intra-and Inter-modal Interaction for Audio-Visual Emotion Recognition. [Arxiv]
27th International Conference on Pattern Recognition (ICPR), 2024.

Songpei Xu, Xuri Ge^✉, Chaitanya Kaul, Roderick Murray-Smith.
HpEIS: Learning Hand Pose Embeddings for Multimedia Interactive Systems. [pdf coming]
IEEE Conference on Multimedia Expo (ICME), 2024. [CORE Rank A]

Xuri Ge^✉, Joemon M. Jose, Songpei Xu, Xiao Liu, Hu Han
MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection. [J] [pdf]
ACM Transactions on Intelligent Systems and Technology (TIST), 2024. (IF=7.2, JCR Q1)

Jie Wang, Kanha Bansal, Ioannis Arapakis, Xuri Ge, Joemon M. Jose
Empowering Legal Citation Recommendation via Efficient Instruction-Tuning of Pre-trained Language Models. [pdf]
The 46th European Conference on Information Retrieval (ECIR), 2024. [CORE Rank A]

Xuri Ge^✉, Joemon M. Jose, Pengcheng Wang, Arunachalam Iyer, Xiao Liu, Hu Han
ALGRNet: Multi-Relational Adaptive Facial Action Unit Modelling for Face Representation and Relevant Recognitions. [J] [pdf]
IEEE Transactions on Biometrics, Behavior, and Identity Science (TBIOM), 2023.

Xuri Ge, Fuhai Chen^✉, Songpei Xu, Fuxiang Tao, Joemon M. Jose.
Cross-modal Semantic Enhanced Interaction for Image-Sentence Retrieval. [pdf]
Winter Conference on Applications of Computer Vision (WACV2023), 2023.[CORE Rank A]

Fuxiang Tao, Xuri Ge^✉, Wei Ma, Anna Esposito, Alessandro Vinciarelli.
Multi-Local Attention for Speech-based Depression Detection. [pdf]
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.[CCF B]

Songpei Xu, Chaitanya Kaul, Xuri Ge, Roderick Murray-Smith.
Continuous Interaction with a Smart Speaker via Low-dimensional Embeddings of Dynamic Hand Pose. [pdf]
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.[CCF B]

Xuri Ge^✉, Fuhai Chen, Joemon M. Jose, Zhilong Ji, Zhongqin Wu, Xiao Liu.
Structured Multi-modal Feature Embedding and Alignment for Image-Sentence Retrieval. [pdf]
ACM International Conference on Multimedia (ACM MM), 2021. [CORE Rank A*]

Xuri Ge^*✉, Pengcheng Wang^*, Hu Han, Joemon M. Jose, Zhilong Ji, Zhongqin Wu, Xiao Liu.
Local Global Relational Network for Facial Action Units Recognition. (Long-paper, Full Oral Report) [pdf]
IEEE International Conference on Automatic Face and Gesture Recognition (FG), 2021. [CORE Rank B] [TH-CPL Rank B]

Fuhai Chen, Rongrong Ji^✉, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang.
Variational Structured Semantic Inference for Diverse Image Captioning. [pdf] [BibTex]
The 33rd Conference on Neural Information Processing Systems (NeurIPS), 2019. [CORE Rank A*]

Xuri Ge , Fuhai Chen, Chen Shen, Rongrong Ji^✉
Colloquial Image Captioning. (Oral Report) [pdf] [BibTex]
IEEE International Conference on Multimedia and Expo (ICME), 2019. [CORE Rank A]

Pre-print:

Fuhai Chen#, Xuri Ge#, Xiaoshuai Sun, Yue Gao, Jianzhuang Liu, Fufeng Chen^✉, Wenjie Li^✉
Differentiated Relevances Embedding for Group-based Referring Expression Comprehension. [Arxiv]
Under review, 2023.

Fuxiang Tao, Wei Ma, Xuri Ge^✉, Anna Esposito, Alessandro Vinciarelli
The Relationship Between Speech Features Changes When You Get Depressed: Feature Correlations for Improving Speed and Performance of Depression Detection. [Arxiv]
Under review, 2023.

Fuhai Chen, Rongrong Ji^✉, Chengpeng Dai, Xuri Ge, Shengchuang Zhang, Xiaojing Ma, Yue Gao.
Factored Attention and Embedding for Unstructured-view Topic-related Ultrasound Report Generation. [Arxiv]

Xuri Ge^✉, Linqing Li, Songpei Xu, Kaiwen Zheng, Yaoqin He, Junchen Fu, Joemon M. Jose
The DenseCap-Guided Attention Network For Image-Text Matching. [pdf coming]
Under review.

Patent:

Xuri Ge^✉, Zhilong Ji, Xiao Liu
Retrieval method, electronic device, and computer-readable medium (检索方法、电子设备及计算机可读介质).
Published: CN 112287159 A; Num: 202011506349.4

Xuri Ge^✉, Zhilong Ji, Xiao Liu
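Bullet-comment (danmaku) generation method, apparatus, electronic device, and computer storage medium (弹幕生成方法、装置、电子设备及计算机存储介质).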
Published: CN 112016573 A; Num: 202011112941.6

Activities

Conference PC and Reviewer: ICML, NeurIPS, ICLR, ACM Multimedia, IJCAI, WWW, CIKM, WACV, AISTATS, ICME, BMVC, ECIR, FG, ICASSP, etc.

Journal Reviewer: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), International Journal of Computer Vision(IJCV), IEEE TRANSACTIONS ON MULTIMEDIA (TMM), Transactions on Knowledge and Data Engineering(TKDE), IEEE Transactions on Circuits and Systems for Video Technology(TCSVT), Pattern Recognition(PR), The ACM Transactions on Information Systems (TOIS), Information Processing and Management(IP&M), Multimedia Systems, etc.

Member of the China Society of Image and Graphics (CSIG).

Organizing committee of 3D Multimedia Analytics, Search and Generation (3DMM 2024) in ICME 2024 workshop (Link).

Organizing committee of AutoGen-CDR19 challenge in MICCAI 2019 (Link).

1st Prize (team name: MAC-Group), awarded at the Workshop of Automatic Generation of Cardiovascular Diagnostic Report, the 22nd Medical Image Computing and Computer Assisted Intervention (MICCAI 2019), 2019.

5th Prize (team name: SenseTime, method name: GraphLayout), awarded at the ICDAR 2019 Robust Reading Challenge.

Working Experiences
MSc Supervision, School of Computing Science, University of Glasgow

2023.07 - 2023.09, Kaiwen Zheng, "Facial Micro-expression Recognition". Now at University of Glasgow (PhD).

2023.07 - 2023.09, Qinglin Yang, "Facial Paralysis Estimation". Now at

2023.07 - 2023.09, Shilong Meng, "Facial Action Unit Detection". Now at

2022.07 - 2022.09, Quang Trung Tran, "Facial Micro-expression Recognition". Now at Scottish Water.

Tutor, School of Computing Science, University of Glasgow, UK

Summer 2021, Teaching assistant of “Text as Data (Master)”, University of Glasgow.

Summer 2021, Teaching assistant of “Web Science (M.)”, University of Glasgow.

Spring 2022, Tutor of “Text as Data (M.)”, University of Glasgow.

Spring 2022, Tutor of “Web Science (M.)”, University of Glasgow.

Spring 2022, Tutor of “Information Visualisation (M.)”, University of Glasgow.

Winter 2022, Tutor of “Machine Learning (M.)”, University of Glasgow.

Winter 2022, Tutor of “Computer Vision (High-level)”, University of Glasgow.

Spring 2023, Tutor of “Text as Data (M.)”, University of Glasgow.

Spring 2023, Tutor of “Web Science (M.)”, University of Glasgow.

Spring 2024, Tutor of “Web Science (M.)”, University of Glasgow.

Company Researcher

2020.07 - 2021.04, Computer Vision Researcher, TAL.

2019.03 - 2019.07, Research Intern, SenseTime.

Teaching assistant, School of Informatics, Xiamen University, China

Spring 2018, Teaching assistant of “Introduction to Artificial Intelligence”, Xiamen University.

Awards

China Scholarship Council (CSC) Scholarships, 2021.01-2025.01

Xiamen University Scholarship, 2017-2020