Portfolio item number 1
Short description of portfolio item number 1
Read more
Short description of portfolio item number 2
Read more
Published in IEEE Transactions on Information Forensics and Security, 2020
This paper is about how to generate low-distortion adversarial perturbations efficiently. Read more
Recommended citation: Zhang, H., Avrithis, Y., Furon, T., & Amsaleg, L. (2020). Walking on the edge: Fast, low-distortion adversarial examples. IEEE Transactions on Information Forensics and Security, 16, 701-713. https://ieeexplore.ieee.org/abstract/document/9186644
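To make the underlying idea concrete for readers outside the area, here is a minimal sketch of a single-step gradient-sign (FGSM-style) attack in PyTorch. It is a generic baseline shown for illustration only, not the boundary-walking method proposed in this paper; the model, input, and label below are placeholder assumptions.

```python
# A minimal, generic FGSM-style sketch (not the method of the paper):
# perturb an input in the direction of the sign of the loss gradient.
# The model and input are random placeholders for illustration only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder classifier
model.eval()

x = torch.rand(1, 3, 32, 32)   # placeholder image in [0, 1]
y = torch.tensor([3])          # assumed true label
epsilon = 8 / 255              # perturbation budget (a common choice)

x_adv = x.clone().requires_grad_(True)
loss = nn.functional.cross_entropy(model(x_adv), y)
loss.backward()

# One signed-gradient step, then clip back to the valid image range.
x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()
print("max perturbation:", (x_adv - x).abs().max().item())
```

The sign of the loss gradient gives the direction in which a small, budget-limited perturbation most increases the classification loss; the paper's contribution is in making such perturbations both fast to compute and low in distortion.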
Published in EURASIP Journal on Information Security, 2020
This paper is about how to generate smooth adversarial perturbations. Read more
Recommended citation: Zhang, H., Avrithis, Y., Furon, T., & Amsaleg, L. (2020). Smooth adversarial examples. EURASIP Journal on Information Security, 2020, 1-12. https://link.springer.com/article/10.1186/s13635-020-00112-z
Published in Proceedings of the 38th AAAI Conference on Artificial Intelligence, 2024
This paper is about multiview adversarial attacks on Neural Radiance Fields (NeRF). Read more
Recommended citation: Jiang, W., Zhang, H., Wang, X., Guo, Z., & Wang, H. (2024, March). Nerfail: Neural radiance fields-based multiview adversarial attack. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, No. 19, pp. 21197-21205). https://ojs.aaai.org/index.php/AAAI/article/view/30113
Published in Proceedings of ECAI 2024, 2024
This paper is about an illusory poisoning attack against Neural Radiance Fields (NeRF). Read more
Recommended citation: Jiang, W., Zhang, H., Zhao, S., Guo, Z., & Wang, H. (2024). Ipa-nerf: Illusory poisoning attack against neural radiance fields. arXiv preprint arXiv:2407.11921. https://arxiv.org/pdf/2407.11921
Published in International Symposium on Leveraging Applications of Formal Methods, 2024
This paper is about how to build an accountable and traceable AI system through cryptographic signatures. Read more
Recommended citation: Wenzel, J., Köhl, M. A., Sterz, S., Zhang, H., Schmidt, A., Fetzer, C., & Hermanns, H. (2024, October). Traceability and accountability by construction. In International Symposium on Leveraging Applications of Formal Methods (pp. 258-280). Cham: Springer Nature Switzerland. https://link.springer.com/chapter/10.1007/978-3-031-75387-9_16
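As a rough illustration of how signing can make model decisions traceable, the sketch below signs an inference record so that later tampering is detectable. This is a simplified example using a symmetric HMAC from Python's standard library; the paper itself develops a by-construction framework based on cryptographic signatures, and the record fields and key handling shown here are hypothetical assumptions.

```python
# A minimal sketch of making an inference record traceable by signing it.
# Illustrative only; the record fields and the use of HMAC-SHA256 are assumptions.
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-managed-signing-key"  # placeholder key

def sign_record(record: dict) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

record = {
    "model_id": "classifier-v1",  # which model produced the decision
    "input_hash": hashlib.sha256(b"raw input bytes").hexdigest(),
    "output": "class_3",
    "timestamp": "2024-10-01T12:00:00Z",
}
record["signature"] = sign_record(dict(record))

# Later, an auditor recomputes the tag to check the record was not altered.
claimed = record.pop("signature")
assert hmac.compare_digest(claimed, sign_record(record))
print("record verified")
```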
Published in Computer Vision and Image Understanding, 2024
This paper is about how to optimize saliency maps for interpretability. Read more
Recommended citation: Zhang, H., Torres, F., Sicre, R., Avrithis, Y., & Ayache, S. (2024). Opti-CAM: Optimizing saliency maps for interpretability. Computer Vision and Image Understanding, 248, 104101. https://www.sciencedirect.com/science/article/pii/S1077314224001826
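The general idea of optimizing a saliency map can be sketched as learning a mask that, when applied to the image, maximizes the classifier's score for the class of interest. The snippet below is a simplified, hedged illustration of that idea rather than the exact Opti-CAM formulation; the placeholder model, image, and class index are assumptions.

```python
# Simplified illustration: optimize a coarse mask so that the masked image
# maximizes the classifier's score for a target class. Not the exact Opti-CAM
# method; model, image, and class index are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1000))  # placeholder classifier
model.eval()

image = torch.rand(1, 3, 224, 224)  # placeholder input image
target_class = 7                    # assumed class of interest

mask_logits = torch.zeros(1, 1, 7, 7, requires_grad=True)  # coarse saliency parameters
optimizer = torch.optim.Adam([mask_logits], lr=0.1)

for _ in range(50):
    saliency = torch.sigmoid(
        F.interpolate(mask_logits, size=image.shape[-2:], mode="bilinear")
    )
    score = model(image * saliency)[0, target_class]  # score of the masked image
    loss = -score                                     # maximize the class score
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("saliency range:", saliency.min().item(), saliency.max().item())
```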
Published in International Symposium on Dependable Software Engineering: Theories, Tools, and Applications (SETTA 2024), 2024
This paper is about efficient, imperceptible adversarial attacks on 3D point clouds. Read more
Recommended citation: Zhang, H., Cheng, L., He, Q., Huang, W., Li, R., Sicre, R., ... & Zhang, L. (2024, November). Eidos: Efficient, imperceptible adversarial 3d point clouds. In International Symposium on Dependable Software Engineering: Theories, Tools, and Applications (pp. 310-326). Singapore: Springer Nature Singapore. https://link.springer.com/chapter/10.1007/978-981-96-0602-3_17
Published in The 16th Asian Conference on Machine Learning (Conference Track), 2024
This paper presents an empirical evaluation of saliency methods and the corresponding explainability metrics. Read more
Recommended citation: Zhang, H., Figueroa, F. T., & Hermanns, H. (2024). Saliency Maps Give a False Sense of Explanability to Image Classifiers: An Empirical Evaluation across Methods and Metrics. In The 16th Asian Conference on Machine Learning (Conference Track). https://raw.githubusercontent.com/mlresearch/v260/main/assets/zhang25a/zhang25a.pdf
Published in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
This paper is about a systematized evaluation of transferable adversarial examples in image classification. Read more
Recommended citation: Zhao, Z., Zhang, H., Li, R., Sicre, R., Amsaleg, L., Backes, M., ... & Shen, C. (2025). Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://ieeexplore.ieee.org/abstract/document/11164808/
Published:
I was invited to give a tutorial on adversarial attacks in deep learning for researchers attending the workshop. In this session, I introduced the background, motivation, and fundamental concepts of the problem before discussing classic work on adversarial attacks and presenting state-of-the-art advancements, including our own work on adversarial attacks on images. I also provided an overview of existing tools for running attacks. Read more
Published:
I was invited to give a talk on the interpretability problem in deep learning to the entire research team. In this session, I introduced the background, motivation, and fundamental concepts of the problem before presenting our work on optimizing saliency maps for improved interpretability. Read more
Published:
I was invited to give a talk on adversarial attacks in deep learning for bachelor’s students. In this session, I introduced the background, motivation, and fundamental concepts of the problem before discussing classic work on adversarial attacks and presenting state-of-the-art advancements, including our own work on adversarial attacks on images. Read more
Published:
I was invited to give a talk on adversarial attacks in 3D representation for master’s students. In this session, I introduced the background, motivation, and fundamental concepts of the problem before presenting our work on adversarial attacks targeting 3D point clouds and Neural Radiance Fields (NeRF). Read more
Published:
In this invited talk, we introduced the fundamentals of interpretability in neural networks, aiming to make the topic accessible to university students new to the field. We explored why interpretability is essential, discussed key methods for analyzing neural networks, and highlighted how these insights can pave the way for impactful research. The session aimed to inspire and equip students with the knowledge to embark on their AI research journey. Read more
Published:
As artificial intelligence (AI) systems become increasingly integrated into safety-critical and high-stakes domains, ensuring their trustworthiness has emerged as a central research challenge. Two foundational pillars of trustworthy AI are adversarial robustness and explainability. Adversarial robustness addresses the vulnerability of machine learning models to carefully crafted perturbations that can cause erroneous or manipulated outputs, exposing critical weaknesses in reliability and security. Explainability, on the other hand, seeks to make AI systems transparent and interpretable, enabling stakeholders to understand, validate, and contest model decisions. In this presentation, I will introduce my research framed around these two core pillars. Read more
Published:
In this talk, I will present my research on adversarial robustness and interpretability from an empirical machine learning perspective. Adversarial examples reveal systematic failure modes of neural networks and highlight the limitations of current evaluation practices, both for robustness and for explanation methods such as saliency maps. Across image classification, 3D perception, and physically grounded representations, adversarial analysis can be understood as a form of stress testing that produces concrete counterexamples to implicitly assumed system properties. Building on these observations, I will argue that many difficulties in assessing robustness, interpretability, and accountability stem from the absence of explicit specifications and formal guarantees. While empirical adversarial methods are effective at discovering violations, they do not by themselves clarify which properties should be satisfied or how they can be enforced. I will therefore discuss why formal methods, such as specification, verification, and counterexample-guided refinement, are a natural complement to adversarial machine learning, and outline potential directions for bridging these communities. The aim of the talk is to invite discussion on how empirical failure analysis can be integrated into workflows that support verifiable and accountable AI systems. Read more
Published:
As artificial intelligence is increasingly deployed in intelligent systems such as autonomous driving, ensuring its reliability and trustworthiness in safety-critical scenarios has become a pressing core problem. Although deep learning models have made remarkable progress in perception and decision-making tasks, their vulnerability to adversarial perturbations, the opacity of their internal mechanisms, and their lack of auditability severely constrain their safe deployment in real systems. Centered on the theme of trustworthy AI, this talk systematically presents the speaker's research progress on robustness, interpretability, and traceability. First, from the perspective of adversarial attacks, it analyzes the instability of deep models in complex environments and introduces adversarial attack and defense methods oriented toward real-world scenarios. Second, addressing the "black box" problem of deep models, it examines the limitations of existing explanation methods and introduces interpretable modeling based on semantic concepts to improve the understandability and consistency of model decisions. Building on this, it further discusses an AI traceability and compliance framework for safety-critical systems that supports auditing of model decision processes and the assignment of responsibility. Finally, drawing on typical application scenarios in intelligent systems, the talk discusses the key role of trustworthy AI in automated systems and looks ahead to future research directions and potential collaboration opportunities in system-level trustworthy AI and cross-layer security mechanisms. Read more
Published:
As artificial intelligence systems are increasingly deployed in safety‑critical and legally regulated domains, ensuring their robustness and reliability has become both a technical necessity and a legal requirement. This talk examines AI robustness and reliability from a technical perspective and connects these concepts to emerging legal frameworks for AI certification and compliance. I first introduce robustness evaluation using adversarial attacks, with a particular focus on perception systems in autonomous driving. Adversarial testing reveals failure modes that are often invisible to standard benchmarking and provides a principled way to assess model behavior under worst‑case perturbations. Building on this, I discuss strategies to improve system reliability through enhanced interpretability, continuous monitoring, and human‑in‑the‑loop intervention mechanisms. Interpretable models and explanations enable better detection of anomalous behavior, support real‑time risk mitigation, and provide auditable evidence for certification and liability assessment. By bridging technical evaluation methods with legal expectations for transparency, accountability, and risk management, this talk highlights how robustness testing and interpretability can serve as foundational tools for trustworthy AI deployment and AI certification in regulated applications such as self‑driving cars. Read more
Summer School, Southwest University, 2023
Open Course on Trusted Intelligent Algorithms in Intelligent Vehicles: Planning and Control. Read more
Master Seminar, Saarland University, 2024
This seminar course delves into the crucial and evolving field of explainability in machine learning (ML). As ML models become increasingly complex and integral to various domains, understanding how these models make decisions is essential. This course will explore different methodologies for interpreting ML models, including rule-based, attribution-based, example-based, prototype-based, hidden semantics-based, and counterfactual-based approaches. Through a combination of paper readings, discussions, and presentations, students will gain a comprehensive understanding of the challenges and advancements in making ML models transparent and interpretable. Read more