SMILE-Next

Abstract

Laughter is a complex social signal that conveys communicative intent beyond amusement. While prior work has focused on isolated laughter analysis tasks, a comprehensive understanding of laughter in real-world scenarios remains underexplored. We introduce SMILE-Next, a dataset for real-world laughter understanding with multimodal textual representations and question–answer annotations across three tasks: laughter detection, laughter type classification, and laughter reasoning. Building on this dataset, we propose a laughter expert LLM that leverages disentangled multimodal textual cues, together with a Mixture-of-Laugh-Experts framework and laughter-specific self-instruction for task-adaptive specialization. Experimental results show that the combination of our proposed components substantially outperforms multimodal LLM baselines, advancing robust real-world laughter understanding.

Laughter-tailored Self-Instruct

Figure 2. Task frequency distribution of the SMILE-Next dataset across three laughter understanding tasks.

Examples of Generated Self-Instruction Instances

Evaluating task: Rate each laugh in the scene based on intensity and context, and determine whether it was genuine or forced.
Input: During a tense board meeting at her office, Sarah tries to lighten the mood with a faint chuckle after the boss makes a dry joke. Her co-workers don't seem to respond much.
Answer: Forced, low intensity

Correlation task: Derive the relationship between the acoustic features and the intensity or type of laugh.
Input: Acoustic feature: irregular pace, variation in pitch (Laughter)
Answer: This could suggest a nervous laughter or possibly a fake laugh.

Table 1. Examples of laughter-specific self-instruction instances generated for the SMILE-Next dataset.
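The instances in Table 1 share a task / input / answer shape. As a minimal sketch, assuming each generated instance is stored as such a triple and turned into a prompt-completion pair for fine-tuning, the record could look like the following. The class name `SelfInstructInstance`, the helper names, and the prompt template are hypothetical and are not taken from the SMILE-Next release.

```python
from dataclasses import dataclass


@dataclass
class SelfInstructInstance:
    """One generated instance (hypothetical container, not the authors' schema)."""
    task: str    # generated task instruction
    input: str   # scene or feature description the task is applied to
    answer: str  # target answer


def to_prompt(inst: SelfInstructInstance) -> str:
    # Assumed template: the source does not specify how prompts are laid out.
    return f"Task: {inst.task}\nInput: {inst.input}\nAnswer:"


def to_training_pair(inst: SelfInstructInstance) -> dict:
    # The completion starts with a space so it follows "Answer:" cleanly.
    return {"prompt": to_prompt(inst), "completion": " " + inst.answer}


# The correlation-task example from Table 1.
example = SelfInstructInstance(
    task=("Derive the relationship between the acoustic features "
          "and the intensity or type of laugh."),
    input="Acoustic feature: irregular pace, variation in pitch (Laughter)",
    answer="This could suggest a nervous laughter or possibly a fake laugh.",
)
pair = to_training_pair(example)
```

Keeping the three fields separate, rather than storing one pre-formatted string, would let the same instance be re-rendered under a different template without regenerating it.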

Mixture-of-Laugh-Experts

Figure 4. Router activation weights across laughter understanding tasks, showing task-adaptive expert selection.
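To make "router activation weights" concrete, here is a minimal sketch of generic softmax gating in a mixture-of-experts layer: the router scores each expert, the scores are normalized into weights, and the expert outputs are averaged under those weights. This is the textbook formulation, offered as an assumption. The page does not describe the actual router design, the number of experts, or whether routing is dense or top-k, so `route` and the three-expert setup below are illustrative only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(router_logits, expert_outputs):
    """Dense softmax gating (assumed, not the authors' implementation).

    router_logits:  one score per expert
    expert_outputs: one output vector per expert, all of the same length
    Returns the activation weights and the weighted mix of expert outputs.
    """
    weights = softmax(router_logits)
    dim = len(expert_outputs[0])
    mixed = [
        sum(w * out[i] for w, out in zip(weights, expert_outputs))
        for i in range(dim)
    ]
    return weights, mixed


# Three hypothetical experts with two-dimensional outputs.
outputs = [[3.0, 0.0], [0.0, 3.0], [0.0, 0.0]]

# Equal router scores give uniform weights.
uniform_weights, uniform_mix = route([0.0, 0.0, 0.0], outputs)

# A much larger score for the first expert concentrates weight on it.
peaked_weights, peaked_mix = route([10.0, 0.0, 0.0], outputs)
```

Under this reading, a plot of activation weights per task shows how strongly the router favors each expert for that task's inputs.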

Results

Quantitative Results

Table 2. Quantitative comparison of SMILE-Next against multimodal LLM baselines across laughter detection, classification, and reasoning tasks.

Qualitative Results

Figure 5. Qualitative examples of SMILE-Next predictions across laughter detection, classification, and reasoning tasks.

😆 SMILE-Next: Teaching Large Language Models to Detect, Classify, and Reason about Laughter

SMILE-Next defines three laughter-related tasks: Laughter Detection, Laughter Classification, and Laughter Reasoning.

