Portfolio item number 1
Short description of portfolio item number 1
Read more
Short description of portfolio item number 2
Read more
Published in IEEE Transactions on Information Forensics and Security, 2020
This paper is about how to generate low-distortion adversarial perturbations efficiently. Read more
Recommended citation: Zhang, H., Avrithis, Y., Furon, T., & Amsaleg, L. (2020). Walking on the edge: Fast, low-distortion adversarial examples. IEEE Transactions on Information Forensics and Security, 16, 701-713. https://ieeexplore.ieee.org/abstract/document/9186644
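To make the underlying idea concrete for readers outside the area, here is a minimal sketch of a single-step gradient-sign (FGSM-style) attack in PyTorch. It is a generic baseline shown for illustration only, not the boundary-walking method proposed in this paper; the model, input, and label below are placeholder assumptions.

```python
# A minimal, generic FGSM-style sketch (not the method of the paper):
# perturb an input in the direction of the sign of the loss gradient.
# The model and input are random placeholders for illustration only.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # placeholder classifier
model.eval()

x = torch.rand(1, 3, 32, 32)   # placeholder image in [0, 1]
y = torch.tensor([3])          # assumed true label
epsilon = 8 / 255              # perturbation budget (a common choice)

x_adv = x.clone().requires_grad_(True)
loss = nn.functional.cross_entropy(model(x_adv), y)
loss.backward()

# One signed-gradient step, then clip back to the valid image range.
x_adv = (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()
print("max perturbation:", (x_adv - x).abs().max().item())
```

The sign of the loss gradient gives the direction in which a small, budget-limited perturbation most increases the classification loss; the paper's contribution is in making such perturbations both fast to compute and low in distortion.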
Published in EURASIP Journal on Information Security, 2020
This paper is about how to generate smooth adversarial perturbations. Read more
Recommended citation: Zhang, H., Avrithis, Y., Furon, T., & Amsaleg, L. (2020). Smooth adversarial examples. EURASIP Journal on Information Security, 2020, 1-12. https://link.springer.com/article/10.1186/s13635-020-00112-z
Published in Proceedings of the 38th AAAI Conference on Artificial Intelligence, 2024
This paper is about multiview adversarial attacks on Neural Radiance Fields (NeRF). Read more
Recommended citation: Jiang, W., Zhang, H., Wang, X., Guo, Z., & Wang, H. (2024, March). Nerfail: Neural radiance fields-based multiview adversarial attack. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 38, No. 19, pp. 21197-21205). https://ojs.aaai.org/index.php/AAAI/article/view/30113
Published in Proceedings of ECAI 2024, 2024
This paper is about an illusory poisoning attack against Neural Radiance Fields (NeRF). Read more
Recommended citation: Jiang, W., Zhang, H., Zhao, S., Guo, Z., & Wang, H. (2024). Ipa-nerf: Illusory poisoning attack against neural radiance fields. arXiv preprint arXiv:2407.11921. https://arxiv.org/pdf/2407.11921
Published in International Symposium on Leveraging Applications of Formal Methods, 2024
This paper is about how to build an accountable and traceable AI system through cryptographic signatures. Read more
Recommended citation: Wenzel, J., Köhl, M. A., Sterz, S., Zhang, H., Schmidt, A., Fetzer, C., & Hermanns, H. (2024, October). Traceability and accountability by construction. In International Symposium on Leveraging Applications of Formal Methods (pp. 258-280). Cham: Springer Nature Switzerland. https://link.springer.com/chapter/10.1007/978-3-031-75387-9_16
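As a rough illustration of how signing can make model decisions traceable, the sketch below signs an inference record so that later tampering is detectable. This is a simplified example using a symmetric HMAC from Python's standard library; the paper itself develops a by-construction framework based on cryptographic signatures, and the record fields and key handling shown here are hypothetical assumptions.

```python
# A minimal sketch of making an inference record traceable by signing it.
# Illustrative only; the record fields and the use of HMAC-SHA256 are assumptions.
import hashlib
import hmac
import json

SECRET_KEY = b"replace-with-a-managed-signing-key"  # placeholder key

def sign_record(record: dict) -> str:
    """Return an HMAC-SHA256 tag over a canonical JSON encoding of the record."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()

record = {
    "model_id": "classifier-v1",  # which model produced the decision
    "input_hash": hashlib.sha256(b"raw input bytes").hexdigest(),
    "output": "class_3",
    "timestamp": "2024-10-01T12:00:00Z",
}
record["signature"] = sign_record(dict(record))

# Later, an auditor recomputes the tag to check the record was not altered.
claimed = record.pop("signature")
assert hmac.compare_digest(claimed, sign_record(record))
print("record verified")
```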
Published in Computer Vision and Image Understanding, 2024
This paper is about how to optimize saliency maps for interpretability. Read more
Recommended citation: Zhang, H., Torres, F., Sicre, R., Avrithis, Y., & Ayache, S. (2024). Opti-CAM: Optimizing saliency maps for interpretability. Computer Vision and Image Understanding, 248, 104101. https://www.sciencedirect.com/science/article/pii/S1077314224001826
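The general idea of optimizing a saliency map can be sketched as learning a mask that, when applied to the image, maximizes the classifier's score for the class of interest. The snippet below is a simplified, hedged illustration of that idea rather than the exact Opti-CAM formulation; the placeholder model, image, and class index are assumptions.

```python
# Simplified illustration: optimize a coarse mask so that the masked image
# maximizes the classifier's score for a target class. Not the exact Opti-CAM
# method; model, image, and class index are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 1000))  # placeholder classifier
model.eval()

image = torch.rand(1, 3, 224, 224)  # placeholder input image
target_class = 7                    # assumed class of interest

mask_logits = torch.zeros(1, 1, 7, 7, requires_grad=True)  # coarse saliency parameters
optimizer = torch.optim.Adam([mask_logits], lr=0.1)

for _ in range(50):
    saliency = torch.sigmoid(
        F.interpolate(mask_logits, size=image.shape[-2:], mode="bilinear")
    )
    score = model(image * saliency)[0, target_class]  # score of the masked image
    loss = -score                                     # maximize the class score
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("saliency range:", saliency.min().item(), saliency.max().item())
```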
Published in International Symposium on Dependable Software Engineering: Theories, Tools, and Applications (SETTA 2024), 2024
This paper is about efficient, imperceptible adversarial attacks on 3D point clouds. Read more
Recommended citation: Zhang, H., Cheng, L., He, Q., Huang, W., Li, R., Sicre, R., ... & Zhang, L. (2024, November). Eidos: Efficient, imperceptible adversarial 3d point clouds. In International Symposium on Dependable Software Engineering: Theories, Tools, and Applications (pp. 310-326). Singapore: Springer Nature Singapore. https://link.springer.com/chapter/10.1007/978-981-96-0602-3_17
Published in The 16th Asian Conference on Machine Learning (Conference Track), 2024
This paper presents an empirical evaluation of saliency methods and the corresponding explainability metrics. Read more
Recommended citation: Zhang, H., Figueroa, F. T., & Hermanns, H. (2024). Saliency Maps Give a False Sense of Explanability to Image Classifiers: An Empirical Evaluation across Methods and Metrics. In The 16th Asian Conference on Machine Learning (Conference Track). https://raw.githubusercontent.com/mlresearch/v260/main/assets/zhang25a/zhang25a.pdf
Published in IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025
This paper is about a systematized evaluation of transferable adversarial examples in image classification. Read more
Recommended citation: Zhao, Z., Zhang, H., Li, R., Sicre, R., Amsaleg, L., Backes, M., ... & Shen, C. (2025). Revisiting Transferable Adversarial Images: Systemization, Evaluation, and New Insights. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://ieeexplore.ieee.org/abstract/document/11164808/
Published:
I was invited to give a tutorial on adversarial attacks in deep learning for researchers attending the workshop. In this session, I introduced the background, motivation, and fundamental concepts of the problem before discussing classic work on adversarial attacks and presenting state-of-the-art advancements, including our own work on adversarial attacks on images. I also provided an overview of existing tools for running attacks. Read more
Published:
I was invited to give a talk on the interpretability problem in deep learning to the entire research team. In this session, I introduced the background, motivation, and fundamental concepts of the problem before presenting our work on optimizing saliency maps for improved interpretability. Read more
Published:
I was invited to give a talk on adversarial attacks in deep learning for bachelor’s students. In this session, I introduced the background, motivation, and fundamental concepts of the problem before discussing classic work on adversarial attacks and presenting state-of-the-art advancements, including our own work on adversarial attacks on images. Read more
Published:
I was invited to give a talk on adversarial attacks in 3D representation for master’s students. In this session, I introduced the background, motivation, and fundamental concepts of the problem before presenting our work on adversarial attacks targeting 3D point clouds and Neural Radiance Fields (NeRF). Read more
Published:
In this invited talk, we introduced the fundamentals of interpretability in neural networks, aiming to make the topic accessible to university students new to the field. We explored why interpretability is essential, discussed key methods for analyzing neural networks, and highlighted how these insights can pave the way for impactful research. The session aimed to inspire and equip students with the knowledge to embark on their AI research journey. Read more
Published:
As artificial intelligence (AI) systems become increasingly integrated into safety-critical and high-stakes domains, ensuring their trustworthiness has emerged as a central research challenge. Two foundational pillars of trustworthy AI are adversarial robustness and explainability. Adversarial robustness addresses the vulnerability of machine learning models to carefully crafted perturbations that can cause erroneous or manipulated outputs, exposing critical weaknesses in reliability and security. Explainability, on the other hand, seeks to make AI systems transparent and interpretable, enabling stakeholders to understand, validate, and contest model decisions. In this presentation, I will introduce my research framed around these two core pillars. Read more
Published:
In this talk, I will present my research on adversarial robustness and interpretability from an empirical machine learning perspective. Adversarial examples reveal systematic failure modes of neural networks and highlight the limitations of current evaluation practices, both for robustness and for explanation methods such as saliency maps. Across image classification, 3D perception, and physically grounded representations, adversarial analysis can be understood as a form of stress testing that produces concrete counterexamples to implicitly assumed system properties. Building on these observations, I will argue that many difficulties in assessing robustness, interpretability, and accountability stem from the absence of explicit specifications and formal guarantees. While empirical adversarial methods are effective at discovering violations, they do not by themselves clarify which properties should be satisfied or how they can be enforced. I will therefore discuss why formal methods, such as specification, verification, and counterexample-guided refinement, are a natural complement to adversarial machine learning, and outline potential directions for bridging these communities. The aim of the talk is to invite discussion on how empirical failure analysis can be integrated into workflows that support verifiable and accountable AI systems. Read more
Published:
As artificial intelligence is increasingly deployed in intelligent systems such as autonomous driving, ensuring its reliability and trustworthiness in safety-critical scenarios has become a pressing core problem. Although deep learning models have made remarkable progress in perception and decision-making tasks, their vulnerability to adversarial perturbations, the opacity of their internal mechanisms, and their lack of auditability severely constrain their safe deployment in real systems. Centered on the theme of trustworthy AI, this talk systematically presents the speaker's research progress on robustness, interpretability, and traceability. First, from the perspective of adversarial attacks, it analyzes the instability of deep models in complex environments and introduces adversarial attack and defense methods oriented toward real-world scenarios. Second, addressing the "black box" problem of deep models, it examines the limitations of existing explanation methods and introduces interpretable modeling based on semantic concepts to improve the understandability and consistency of model decisions. Building on this, it further discusses an AI traceability and compliance framework for safety-critical systems that supports auditing of model decision processes and the assignment of responsibility. Finally, drawing on typical application scenarios in intelligent systems, the talk discusses the key role of trustworthy AI in automated systems and looks ahead to future research directions and potential collaboration opportunities in system-level trustworthy AI and cross-layer security mechanisms. Read more
Published:
As artificial intelligence systems are increasingly deployed in safety‑critical and legally regulated domains, ensuring their robustness and reliability has become both a technical necessity and a legal requirement. This talk examines AI robustness and reliability from a technical perspective and connects these concepts to emerging legal frameworks for AI certification and compliance. I first introduce robustness evaluation using adversarial attacks, with a particular focus on perception systems in autonomous driving. Adversarial testing reveals failure modes that are often invisible to standard benchmarking and provides a principled way to assess model behavior under worst‑case perturbations. Building on this, I discuss strategies to improve system reliability through enhanced interpretability, continuous monitoring, and human‑in‑the‑loop intervention mechanisms. Interpretable models and explanations enable better detection of anomalous behavior, support real‑time risk mitigation, and provide auditable evidence for certification and liability assessment. By bridging technical evaluation methods with legal expectations for transparency, accountability, and risk management, this talk highlights how robustness testing and interpretability can serve as foundational tools for trustworthy AI deployment and AI certification in regulated applications such as self‑driving cars. Read more
Summer School, Southwest University, 2023
Open Course on Trusted Intelligent Algorithms in Intelligent Vehicles: Planning and Control. Read more
Master Seminar, Saarland University, 2024
This seminar course delves into the crucial and evolving field of explainability in machine learning (ML). As ML models become increasingly complex and integral to various domains, understanding how these models make decisions is essential. This course will explore different methodologies for interpreting ML models, including rule-based, attribution-based, example-based, prototype-based, hidden semantics-based, and counterfactual-based approaches. Through a combination of paper readings, discussions, and presentations, students will gain a comprehensive understanding of the challenges and advancements in making ML models transparent and interpretable. Read more