NYCU-NLP at SemEval-2025 Task 11: Assembling Small Language Models for Multilabel Emotion Detection and Intensity Prediction

Zhe-Yu Xu, Yu-Hsin Wu, and Lung-Hao Lee*.

In Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025), pages 1129–1135.


Abstract

This study describes the design of the NYCU-NLP system for SemEval-2025 Task 11, which focuses on multilingual text-based emotion analysis. We instruction-tuned three small language models: Gemma-2 (27B), Mistral-small-3 (22B), and Phi-4 (14B), and then assembled them as our main system architecture. Our NYCU-NLP system participated in English Track A for multilabel emotion detection and English Track B for emotion intensity prediction. Experimental results show that our best-performing submission achieved a macro-averaged F1 score of 0.8225, ranking second among 74 participating teams in Track A, and a Pearson correlation coefficient of 0.8373, ranking second among 36 teams in Track B in the official task rankings.
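The abstract states that the three instruction-tuned small language models were assembled into the final system but does not describe the combination scheme. Below is a minimal, illustrative sketch of one common way such an ensemble can be realized for multilabel detection and intensity prediction: averaging per-label scores across models, then thresholding (Track A) or rounding (Track B). All names, the label set, and the threshold are hypothetical and not taken from the paper.

```python
# Hypothetical ensembling sketch: combine per-label predictions from three
# models (e.g., Gemma-2, Mistral-small-3, Phi-4). The paper does not specify
# the actual combination scheme; this shows simple score averaging.
import numpy as np

EMOTIONS = ["anger", "fear", "joy", "sadness", "surprise"]  # illustrative label set
THRESHOLD = 0.5  # assumed decision threshold, not taken from the paper


def ensemble_multilabel(prob_matrices):
    """Track A sketch: prob_matrices is a list of (n_samples, n_labels) arrays,
    one per model, holding per-label probabilities in [0, 1]."""
    avg = np.mean(np.stack(prob_matrices, axis=0), axis=0)  # average over models
    return (avg >= THRESHOLD).astype(int)                   # binary multilabel decisions


def ensemble_intensity(score_matrices):
    """Track B sketch: average ordinal intensity predictions and round
    to the nearest integer class."""
    avg = np.mean(np.stack(score_matrices, axis=0), axis=0)
    return np.rint(avg).astype(int)


# Example with random stand-in predictions from three models on 4 samples.
rng = np.random.default_rng(0)
preds = [rng.random((4, len(EMOTIONS))) for _ in range(3)]
print(ensemble_multilabel(preds))
```

Other combination strategies (e.g., majority voting over each model's binary decisions, or weighting models by validation performance) would fit the same interface; the averaging variant above is shown only because it is the simplest to illustrate.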