Trustworthy Research for Understandable, Safe, Technology

The TRUST Lab at Duke University conducts research in applied AI explainability, technology evaluation, and adversarial alignment to ensure AI systems are transparent, safe, and beneficial for society.

Research Areas

Our interdisciplinary team tackles challenges in developing trustworthy technology.

Applied Explainability

Applying AI explainability methods to real-world problems.

Technology Evaluation

We focus both on the technical evaluation/benchmarking of AI systems and on the assessment of the societal impact of emerging technologies.

Adversarial Alignment

Using adversarial techniques to better explain AI systems.

Current Projects

Explainability in Conservation

Applied Explainability

In collaboration with:Duke Marine Robotics and Remote Sensing Lab (Nicholas School of the Environment)

Researchers:Jiayi Zhou, Gunel Aghakishiyeva

We are applying state of the art computer vision approaches and explainable machine learning techniques to support wildlife conservation efforts. This interdisciplinary project bridges machine learning with ecological science to create transparent decision-making tools.

Exploring Geolingual and Temporal Components of AI Embeddings

Applied Explainability

In collaboration with:The Emergence Lab (Computational Media, Arts, & Culture)

Researchers:Bochu Ding, Junyu Zhang, Vivienne Foley, Alexis Golart, Neha Shukla, James Sohigian

Funding:Duke Bass Connections

This project investigates how large embedding models encode geographical and temporal information, with implications for understanding cultural biases and historical shifts in AI systems.

Consilience: AI in Interdisciplinary Research Augmentation

Technology Evaluation

In collaboration with:Society-Centered AI Initiative

Researchers:Vishnu Mukundan TM, Vihaan Nama, Tiffany Degbotse, Jiayi Zhou

Funding:Duke Deep Tech, OpenAI

This study explores how voice-based, conversational LLM agents can function as “research translators” in interdisciplinary collaborations.

Aligned Machine

Technology Evaluation

Researchers:Jiechen Li, Hannah Groos

The Aligned Machine aims to builds a benchmark of human-aligned similarity by comparing AI model outputs with human judgments of meaning, using an interactive platform designed to support public engagement with AI research.

Explainable and Adversarially Robust Sleep Monitoring

Adversarial AlignmentApplied Explainability

In collaboration with:Masters of Interdisciplinary Data Science

Researchers:Jenny Chen, Jenny Wu, Rishika Randev, Eric Ortega Rodriguez

This project addresses gaps in responsible AI for digital health by developing explainable and adversarially robust machine learning models for sleep monitoring.

Adversarial Alignment in Large Language Models

Adversarial AlignmentTechnology Evaluation

Researchers:Gunel Aghakishiyeva

We aim to turn the “bug” of adversarial attacks into a feature for improving AI transparency, trustworthiness, and alignment with human goals. In this project, we are developing an open-source adversarial probing platform for LLMs.

Recent Publications

The Term 'Agent' Has Been Diluted Beyond Utility and Requires Redefinition

Brinnae Bent

AAAI/ACM Conference on AI, Ethics, and Society (accepted) • 2025

Technology Evaluation

Semantic Approach to Quantifying the Consistency of Diffusion Model Image Generation

Brinnae Bent

CVPR Explainable AI for Computer Vision Workshop • 2024

Technology Evaluation

Get in Touch

Interested in our research? We welcome collaborations, inquiries from prospective students, and partnerships with industry and academia.

brinnae.bent@duke.edu

Duke University, Durham, NC

Trustworthy Research for Understandable, Safe, Technology

Research Areas

Applied Explainability

Technology Evaluation

Adversarial Alignment

Current Projects

Explainability in Conservation

Exploring Geolingual and Temporal Components of AI Embeddings

Consilience: AI in Interdisciplinary Research Augmentation

Aligned Machine

Explainable and Adversarially Robust Sleep Monitoring

Adversarial Alignment in Large Language Models

Recent Publications

The Term 'Agent' Has Been Diluted Beyond Utility and Requires Redefinition

Semantic Approach to Quantifying the Consistency of Diffusion Model Image Generation

Featured Videos

Responsible AI Symposium

Adversarial Alignment

Get in Touch