Seminar in CS - Robust and Verifiable AI

Fall 2025 - CS 260-002

Overview

This course covers current research topics in computer science, specifically focusing on trustworthy, robust, and verifiable AI. We will begin with the basics of adversarial machine learning (AdvML), which investigates the robustness of AI models. We will then discuss various current AdvML problems, such as the safety and security of recent AI models, particularly large language models. Beyond making AI models robust against known manipulations, we will also cover topics on making AI models verifiable, to foster trust and prevent unknown failures.
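As a small taste of the adversarial-robustness topic above, here is a minimal sketch of an FGSM-style adversarial perturbation on a toy linear classifier. The model, weights, and the `fgsm_perturb` helper are purely illustrative assumptions for this sketch, not part of the course materials:

```python
import numpy as np

def fgsm_perturb(x, w, b, y, eps):
    """One FGSM-style step: nudge input x in the direction that
    increases the logistic loss of a linear model w @ x + b."""
    z = w @ x + b                      # model logit
    p = 1.0 / (1.0 + np.exp(-z))      # predicted probability of class 1
    grad_x = (p - y) * w              # d(loss)/dx for the logistic loss
    return x + eps * np.sign(grad_x)  # signed, bounded perturbation

w = np.array([1.0, -2.0])
b = 0.0
x = np.array([0.5, 0.5])                        # clean input; logit = -0.5 (class 0)
x_adv = fgsm_perturb(x, w, b, y=0.0, eps=0.3)   # attack a class-0 input
# The adversarial logit (w @ x_adv + b) is pushed toward the wrong class.
```

Even this toy example shows the core idea studied in the course: a small, bounded change to the input can move the model's decision, and robustness/verification methods aim to bound or certify against such changes.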

By taking this course, you will: gain comprehensive knowledge of the robustness and verification of AI models for achieving trustworthy AI; develop essential research skills, including effective reading and critical analysis of the literature, oral presentation and constructive participation in scholarly discussions, and hands-on skills in a research-oriented course project; and be able to apply these techniques in your future work.

Course details

Course format:

Instructor: Zhouxing Shi

TA: None.

Prerequisite:

Textbook: None.

Grading Overview

Presentation

Sign up sheet

Everyone will lead a paper discussion session (35~45 min) covering 2 to 3 research papers. The presenter will present the papers and prepare several questions to discuss with classmates during or after the presentation (ideally, about half of the time is spent on interactive discussion).

Participation and discussion

To ensure that each session has at least some discussion, you will need to sign up for two slots as a scheduled participant. Scheduled participants are expected to: 1) ask questions; 2) answer questions and express opinions on the presented papers; and 3) provide constructive feedback to the presenter. While we have scheduled participants, everyone (not only the scheduled participants) is highly encouraged to participate actively in the discussion.

Final project

You will work on a research-oriented project in a team. Each team should have no more than 3 students. You are encouraged to form a team of 2~3 students, but individual projects are also acceptable.

Sign up on Canvas (People -> Final project)

Checkpoints:

Proposal

You will need to submit a writeup of 1~2 pages for your proposed project with the following parts:

  1. Introduction and background
    • What problem are you addressing?
    • Why is it meaningful and important?
    • What have other people done, and what are their limitations?
    • What is new in your project?
  2. Proposed method
    • Initially proposed approach
    • How do you plan to implement your approach?
    • What experimental settings will you start with?
    • How will you evaluate your approach?
  3. Collaboration plan
    • How will your team members collaborate?

Grading criteria: whether the proposal is reasonably complete.

Milestone presentation

Each team will deliver a 5-minute presentation followed by up to 2 minutes of Q&A. You are encouraged to cover the project background, problem setting, method design, preliminary results, next steps, etc.

Grading criteria: whether the presentation is reasonably done.

Final presentation

Given the number of teams, we will again have a 5-min presentation + 2-min Q&A for each team. You may include a brief recap of the problem and background, and then focus on what you have achieved since the milestone presentation, including any newly designed or implemented methodology, newly obtained results, and conclusions. You do not have to complete everything by this presentation, as you will have about ten more days to finalize your report and include any further results.

Grading criteria: whether the presentation is reasonably done.

Final submission (report and code)

Please submit a final report documenting your work and findings. The report should have a structure and quality similar to the papers we have read during the quarter, although it will likely be much shorter. It should be self-contained and understandable without reading additional papers. Finally, also include a brief statement on how each group member contributed.

Page limit: 4 pages, not including references (use a commonly used, legible font of at least 11 points, with at least 1-inch margins on all sides).

Suggested sections:

By default, your submission for the final project will be kept private. However, you may optionally indicate if you would like to have your report shared with the class.

Grading criteria:

Please also submit your code, including a README.md file documenting how to use it to reproduce the results, as well as any necessary dependencies that cannot be pulled from existing websites. Grading for the final project is based mainly on the final report; the submitted code is used only for validation if needed.

Schedule and Topics

Week Date Topic
1 9/29 [Lecture] Introduction
  10/1 [Lecture] Fundamentals of ML robustness (adversarial robustness)
2 10/6 [Lecture] Fundamentals of ML robustness (out-of-distribution robustness)
  10/8 [Lecture] Verification for ML (MILP, LP, and bound propagation for MLP)
3 10/13 [Seminar] Adversarial attack for ML
  10/13 [Seminar] Adversarial defense for ML
  10/15 [Seminar] OOD robustness for ML
  10/15 [Seminar] OOD robustness for pre-trained models
  10/17 [CSE Colloquium] Verification for ML (general methods) (Friday 11AM, outside class hours, optional)
4 10/20 [Seminar] Adversarial robustness in NLP
  10/20 [Seminar] NLP robustness for AIGC detection
  10/22 [Seminar] NLP robustness for code generation
  10/22 [Seminar] NLP robustness for math
5 10/27 [Lecture] Jailbreaking and prompt injection for LLM
  10/27 [Seminar] Verification for ML models (tightening incomplete verification)
  10/29 [Seminar] Verification for ML models (complete verification)
  10/29 [Seminar] Training for ML verification
6 11/3 [Seminar] Verification beyond classification
  11/3 Project proposal discussion (optional; informal meetings)
  11/5 [Seminar] Jailbreaking attacks for LLM (input-level)
  11/5 [Seminar] Jailbreaking attacks for LLM (model-level)
7 11/10 [Seminar] Prompt injection games
  11/10 [Seminar] Prompt injection applications
  11/12 [Seminar] Defenses for LLM safety
  11/12 [Seminar] Defenses for LLM security
8 11/17 Project milestone presentation & discussion
  11/19 [Seminar] Multi-modal safety & security (VLM)
  11/19 [Seminar] Multi-modal safety & security (image generation)
9 11/24 [Seminar] Prompt injection to agents
  11/24 [Seminar] LLM security beyond model-level defenses
  11/26 [Seminar] Verification for LLM (formal math reasoning)
  11/26 [Seminar] Verification for LLM (formal methods for LLM)
10 12/1 Final project presentation
  12/3 No class: finish up final project report

Conduct and Academic Integrity

See https://conduct.ucr.edu for details.

Here at UCR we are committed to upholding and promoting the values of the Tartan Soul: Integrity, Accountability, Excellence, and Respect. As a student in this class, it is your responsibility to act in accordance with these values by completing all assignments in the manner described, and by informing the instructor of suspected acts of academic misconduct by your peers. By so doing, you will not only affirm your own integrity, but also the integrity of the intellectual work of this University, and the degree which it represents. Should you choose to commit academic misconduct in this class, you will be held accountable according to the policies set forth by the University, and will incur appropriate consequences both in this class and from Student Conduct and Academic Integrity Programs. For more information regarding University policy and its enforcement, please visit: conduct.ucr.edu.

Resources