Research

15+ publications at top-tier AI conferences. 6 patents filed. Built with leading academic and industry partners — and every finding ships into product.
Publications
arXiv*

When Good Sounds Go Adversarial: Audio Jailbreak

Jailbreaking audio-language models with benign-sounding adversarial inputs

Nature

Humanity's Last Exam

Frontier AI evaluation benchmark with 3,000+ expert-level questions across disciplines

arXiv

Eliciting and Analyzing Emergent Misalignment

Conversational red-teaming to elicit misalignment through narrative immersion and emotional pressure

WACV 2026

Better Safe Than Sorry? Overreaction of VLMs

VLMs misclassify 31–96% of safe situations as dangerous in visual emergency recognition

LREC 2026

Redefining Evaluation Standards: A Unified Framework for Evaluating the Korean Capabilities of Language Models

Standardizing how the Korean-language capabilities of language models are evaluated under a single unified framework

ACL 2026

COMPASS: Evaluating Organization-Specific Policy Alignment in LLMs

Framework for evaluating LLM compliance with enterprise-specific allow/deny policies

ACL 2026

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

HAERAE-Vision benchmark revealing VLM failures on ambiguous and incomplete queries

ICLR 2026

Jailbreaking on Text-to-Video Models via Scene Splitting

First systematic black-box jailbreak for T2V models, achieving 70–84% success rates

CVPR 2026

Fine-Grained Multi-Image Object Hallucination Benchmark

A comprehensive benchmark for evaluating object hallucination across multiple images in vision-language models

AISTATS 2026

Towards Motion-aware Referring Image Segmentation

Motion-aware approach for referring image segmentation with improved temporal understanding

ACL 2025

sudo rm -rf agentic_security

Systematic security analysis of autonomous AI agent vulnerabilities · Industry Track

ACL WS 2025

M2S: Multi-turn to Single-turn Jailbreak (One-Shot is Enough)

Compressing multi-turn jailbreak strategies into single-turn attacks

ACL 2025

Representation Bending for Large Language Model Safety

Manipulating LLM internal representations to reduce harmful outputs

ICML 2025

ELITE: Enhanced Language-Image Toxicity Evaluation

Benchmark for evaluating toxicity across language-image multimodal models

IEEE-EMBS 2025

PatientSafeBench: Evaluating Safety of Medical LLMs

Safety evaluation framework for patient-facing medical AI systems

NeurIPS WS 2025

ObjexMT: Objective Extraction & Metacognitive Calibration

Benchmarking whether LLM judges can recover hidden objectives in jailbreak transcripts

NeurIPS WS 2025

X-Teaming Evolutionary M2S

LLM-guided evolution for automated multi-turn jailbreak template discovery

ICLR 2023

DepthFL: Depthwise Federated Learning

Federated learning for heterogeneous resource-constrained clients

* Under review / target venue

Patents
2025.12

AI Security Testing Integrated Framework and Visual Flow-Based Execution System

DB25-0017-KR0

2025.09

Multimodal Content Policy Violation Detection System

DB25-0016-KR0

2025.09

AI-Based NER Detection Guardrail

DB25-0015-KR0

2025.03

Domain-Specific AI Guardrail System

10-2025-0038904

2024.09

AI Guardrail System

10-2024-0124354

2024.08

Red Teaming Method for Security Assessment

10-2024-0116863

Open Source
GitHub

Video2Robot

540+ GitHub stars · 200K+ views on LinkedIn & X

Hugging Face

AIM Intelligence

Public models & datasets for AI safety research

Collaborate with us on AI safety research.

From joint publications to enterprise security assessments — let's advance AI safety together.

TALK TO AN EXPERT
VIEW ON HUGGING FACE

Ready to secure your AI?

Consult with AIM Intelligence's security experts and request a free red teaming demo optimized for your system.

EXPLORE PLATFORM