Yesian Rohn | Xingsong Ye
I am a Master's student in Computer Science at Fudan
University, under the supervision of Prof. Zhineng Chen in the FVL Group.
Before that, nothing had happened.
I am currently focused on projects involving OCR, MultiModal, AIGC (C means
Comedy). Please connect and collaborate with me to explore the potential of these fields and leverage them to make a positive impact on exciting computer science projects!
Email  / 
CV  / 
Google Scholar  / 
OpenReview  / 
Github  / 
Bilibili  / 
Zhihu
|
|
Publications
|
TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition
Xingsong Ye, Yongkun Du, Yunbo Tao, Zhineng Chen
arXiv, 2024
project page
/
arXiv
/
code
/
demo
We introduce TextSSR: a novel framework for Synthesizing Scene Text Recognition data via a diffusion-based universal text region synthesis model. It ensures accuracy by focusing on generating text within a specified image region and leveraging rich glyph and position information to create the less complex text region compared to the entire image. Furthermore, we utilize neighboring text within the region as a prompt to capture real-world font styles and layout patterns, guiding the generated text to resemble actual scenes. Finally, due to its prompt-free nature and capability for character-level synthesis, TextSSR enjoys a wonderful scalability.
|
|
Cross-Modality Safety Alignment
Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, Shimin Li, Jinlan Fu, Xipeng Qiu, Xuanjing Huang
arXiv, 2024
project page
/
arXiv
/
code
/
dataset
We introduce a novel safety alignment challenge called Safe Inputs but Unsafe Output (SIUO) to evaluate cross-modality safety alignment. Specifically, it considers cases where single modalities are safe independently but could potentially lead to unsafe or unethical outputs when combined.
|
|
Rethinking the Elementary function fusion for Single-Image Dehazing
Yesian Rohn
Course Project of DIP, 2024
/
arXiv, 2024
arXiv
/
code
We introduce CL2S, an innovative image dehazing network that overcomes limitations of DM2F (baseline) by trying sine functions, as validated through systematic ablation experiments.
|
|
DuanzAI: Slang-Enhanced LLM with Prompt for Humor Understanding
Yesian Rohn
Xiyuan Project of FDUROP, 2023 - 2024
/
arXiv, 2024
project page
/
arXiv
/
code
We enhance LLMs' understanding of Chinese slang using curated datasets and advanced techniques, introducing DuanzAI and its application in the ChatDAI chatbot.
|
Honor
- Fudan Excellent League Member, Spring 2024.
- Third-class in Fudan Guanghua Social and Academic Practice Competition, Winter 2023.
- Fudan Outstanding Student, Autumn 2023.
- Alibaba Cloud Annual Geek of Greencode, Winter 2022.
|
Awards
- Jinrirencai Scholarship, Spring 2024.
- First-class Scholarship of Fudan University (Graduate Scholarship), Spring 2024.
- Duckbill Scholarship (Endowed Scholarship of Fudan University), Autumn 2023.
- Pujiang Innovation Forum Volunteer Nickname Bid-Winning Award, Summer 2023.
- Third-class Scholarship of Fudan University, Autumn 2022.
|
Service
Teaching / TA
Referee / Reviewer
|
|