Talks and presentations

See a map of all the places I've given a talk!

When AI Fails: Technical Robustness and Reliability - Bridging Technical Guarantees and Legal Expectations

April 27, 2026

Invited Talk, Law School, University of São Paulo, São Paulo, Brazil

As artificial intelligence systems are increasingly deployed in safety‑critical and legally regulated domains, ensuring their robustness and reliability has become both a technical necessity and a legal requirement. This talk examines AI robustness and reliability from a technical perspective and connects these concepts to emerging legal frameworks for AI certification and compliance. I first introduce robustness evaluation using adversarial attacks, with a particular focus on perception systems in autonomous driving. Adversarial testing reveals failure modes that are often invisible to standard benchmarking and provides a principled way to assess model behavior under worst‑case perturbations. Building on this, I discuss strategies to improve system reliability through enhanced interpretability, continuous monitoring, and human‑in‑the‑loop intervention mechanisms. Interpretable models and explanations enable better detection of anomalous behavior, support real‑time risk mitigation, and provide auditable evidence for certification and liability assessment. By bridging technical evaluation methods with legal expectations for transparency, accountability, and risk management, this talk highlights how robustness testing and interpretability can serve as foundational tools for trustworthy AI deployment and AI certification in regulated applications such as self‑driving cars.

Trustworthy AI for Intelligent Systems: From Robustness and Interpretability to Traceability

March 27, 2026

Talk, Shanghai Jiao Tong University, SJTU-SCS, Shanghai, China

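As AI technologies are widely deployed in intelligent systems such as autonomous driving, ensuring their reliability and trustworthiness in safety-critical scenarios has become a core problem that urgently needs solving. Although deep learning models have made remarkable progress in perception and decision-making tasks, their vulnerability to adversarial perturbations, the opacity of their internal mechanisms, and their lack of auditability severely limit their safe deployment in real systems. Centered on the theme of trustworthy AI, this talk gives a systematic overview of the speaker's research on robustness, interpretability, and traceability. First, from the perspective of adversarial attacks, it analyzes the instability of deep models in complex environments and introduces adversarial attack and defense methods designed for real-world scenarios. Second, addressing the black-box problem of deep models, it examines the limitations of existing explanation methods and introduces an interpretable modeling approach based on semantic concepts, which improves the understandability and consistency of model decisions. Building on this, it further discusses AI traceability and compliance frameworks for safety-critical systems, which support auditing of the model's decision process and the attribution of responsibility. Finally, drawing on typical application scenarios in intelligent systems, the talk discusses the key role of trustworthy AI in automated systems and looks ahead to future research directions and potential collaboration opportunities in system-level trustworthy AI and cross-layer safety mechanisms.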

Adversarial Robustness and Interpretability: Where Empirical ML Meets Formal Guarantees

February 12, 2026

Talk, KASTEL-KIT (Karlsruher Institut für Technologie), Karlsruhe, Germany

In this talk, I will present my research on adversarial robustness and interpretability from an empirical machine learning perspective. Adversarial examples reveal systematic failure modes of neural networks and highlight the limitations of current evaluation practices, both for robustness and for explanation methods such as saliency maps. Across image classification, 3D perception, and physically grounded representations, adversarial analysis can be understood as a form of stress testing that produces concrete counterexamples to implicitly assumed system properties. Building on these observations, I will argue that many difficulties in assessing robustness, interpretability, and accountability stem from the absence of explicit specifications and formal guarantees. While empirical adversarial methods are effective at discovering violations, they do not by themselves clarify which properties should be satisfied or how they can be enforced. I will therefore discuss why formal methods, such as specification, verification, and counterexample-guided refinement, are a natural complement to adversarial machine learning, and outline potential directions for bridging these communities. The aim of the talk is to invite discussion on how empirical failure analysis can be integrated into workflows that support verifiable and accountable AI systems.

Building Trustworthy AI from the view of Adversarial Robustness and Explainability

June 04, 2025

Talk, ORAILIX, École Polytechnique, Paris, France

As artificial intelligence (AI) systems become increasingly integrated into safety-critical and high-stakes domains, ensuring their trustworthiness has emerged as a central research challenge. Two foundational pillars of trustworthy AI are adversarial robustness and explainability. Adversarial robustness addresses the vulnerability of machine learning models to carefully crafted perturbations that can cause erroneous or manipulated outputs, exposing critical weaknesses in reliability and security. Explainability, on the other hand, seeks to make AI systems transparent and interpretable, enabling stakeholders to understand, validate, and contest model decisions. In this presentation, I will introduce my research framed around these two core pillars.

Unveiling AI: Exploring Neural Networks and the Journey Towards Interpretability

September 20, 2024

Talk, Universität des Saarlandes, Saarbrücken, Germany

In this invited talk, we introduced the fundamentals of interpretability in neural networks, aiming to make the topic accessible to university students new to the field. We explored why interpretability is essential, discussed key methods for analyzing neural networks, and highlighted how these insights can pave the way for impactful research. The session aimed to inspire and equip students with the knowledge to embark on their AI research journey.

Adversarial Attack in 3D Representation

March 22, 2024

Talk, Tianjin University of Technology, Tianjin, China

I was invited to give a talk on adversarial attacks in 3D representation for master’s students. In this session, I introduced the background, motivation, and fundamental concepts of the problem before presenting our work on adversarial attacks targeting 3D point clouds and Neural Radiance Fields (NeRF).

Adversarial Robustness in Deep Learning

September 22, 2023

Talk, Chongqing Jiaotong University, Chongqing, China

I was invited to give a talk on adversarial attacks in deep learning for bachelor’s students. In this session, I introduced the background, motivation, and fundamental concepts of the problem before discussing classic work on adversarial attacks and presenting state-of-the-art advancements, including our work on adversarial attacks on images.

Interpretability in Deep Learning: Optimizing Saliency Maps for Improved Interpretability

April 20, 2023

Talk, LinkMedia, Inria, Rennes, France

I was invited to give a talk on the interpretability problem in deep learning to the entire research team. In this session, I introduced the background, motivation, and fundamental concepts of the problem before presenting our work on optimizing saliency maps for improved interpretability.

A Quick Tour: Deep Learning in Adversarial Context

November 27, 2020

Tutorial, Workshop I: Dependable Deep Learning at the Symposium on Dependable Software Engineering Theories, Tools and Applications (SETTA), Guangzhou, China

I was invited to give a tutorial on adversarial attacks in deep learning for researchers attending the workshop. In this session, I introduced the background, motivation, and fundamental concepts of the problem before discussing classic work on adversarial attacks and presenting state-of-the-art advancements, including our work on adversarial attacks on images. I also provided an overview of existing tools for running attacks.

Hanwei ZHANG
