Jailbreaking audio-language models with benign-sounding adversarial inputs
Frontier AI evaluation benchmark with 3,000+ expert-level questions across disciplines
Conversational red-teaming to elicit misalignment through narrative immersion and emotional pressure
VLMs misclassify 31–96% of safe situations as dangerous in visual emergency recognition
A unified framework for evaluating the Korean-language capabilities of language models
Framework for evaluating LLM compliance with enterprise-specific allow/deny policies
HAERAE-Vision benchmark revealing VLM failures on ambiguous and incomplete queries
First systematic black-box jailbreak for text-to-video (T2V) models, achieving 70–84% attack success rates
A comprehensive benchmark for evaluating object hallucination across multiple images in vision-language models
Motion-aware approach to referring video segmentation with improved temporal understanding
Systematic security analysis of autonomous AI agent vulnerabilities · Industry Track
Compressing multi-turn jailbreak strategies into single-turn attacks
Manipulating LLM internal representations to reduce harmful outputs
Benchmark for evaluating toxicity across language-image multimodal models
Safety evaluation framework for patient-facing medical AI systems
Benchmarking whether LLM judges can recover hidden objectives in jailbreak transcripts
LLM-guided evolution for automated multi-turn jailbreak template discovery
Federated learning for heterogeneous resource-constrained clients
* Under review / target venue
DB25-0017-KR0
DB25-0016-KR0
DB25-0015-KR0
10-2025-0038904
10-2024-0124354
10-2024-0116863
From joint publications to enterprise security assessments, let's advance AI safety together.