Machine Learning for Physics#

Physics 450   Spring 2026

Calendar#

Note: This schedule will evolve throughout the semester.

Overview#

Course Overview#

Welcome! This course presents an introduction to modern data science, artificial intelligence (AI) and machine learning (ML) from a physics perspective. Students will learn the basic concepts, tools, and methods of AI/ML applied to scientific challenges. Students will study how physics knowledge can be incorporated into AI/ML models to improve their learning efficiency, performance, and interpretability. Topics covered include artificial neural networks (NNs), AI/ML-enhanced modeling/simulation, deep generative models, simulation-free inference, variational inference, convolutional NNs, recursive NNs, geometric deep learning, attention mechanism and transformers, auto-encoders, and anomaly detection. Students will also explore the different types of learning from data, including supervised, semi-supervised and unsupervised learning. Applications to physics will be emphasized.

You can find more detail in the Calendar section on the specific topics that will be covered in this course.

Learning Objectives#

Upon completion of the course students will be able to:

  1. Understand the basic concepts and tools of modern data science, artificial intelligence (AI), and machine learning (ML)

  2. AI/ML for Physics: Apply modern AI/ML methods to address scientific challenges using open data

  3. Physics for AI/ML: Learn how to incorporate physics knowledge into AI/ML models to improve their learning efficiency, performance, and interpretability

Course Logistics#

Format#

Instructor#

TAs#

Online Tools#

There are several online tools you will need to use as part of this course.

Campuswire#

We will use Campuswire as a class forum, a way to message the course staff and each other, and a means to submit your attendance question.

Google Colab#

Using Google Colab, you will write your code in a Jupyter notebook and submit it for us to grade.

Gradescope#

On Gradescope, you will submit your assignments and find your graded assignments.

Coursework#

Quizzes#

Short in-class quizzes will be given ~weekly. These are designed to test your conceptual knowledge about the topics discussed in the course.

Homework Assignments#

You will be assigned weekly homework assignments that will put into practice what you learned in lecture for the week.

  • You will work on the assignments both during the in-class session on Thursdays and as homework.

  • You will submit your executed (i.e. with “RunAll”) homework notebook via Gradescope.

  • Each assignment is due at the beginning of the next class unless otherwise noted. You may turn an assignment in up to one week late for 50% credit (except that all assignments are strictly due the day before Reading Day).

  • Solutions to the homeworks will not be given.

  • You may collaborate on assignments but must submit your own work.

  • Graded homework will be available through Gradescope.

Projects#

At appropriate times throughout the course, you will select from a list of projects that involve demonstrating and extending your work in class by doing something cool and interesting in data analysis. You must work alone on this (i.e. without collaboration).

For projects you will put together a Jupyter notebook that demonstrates your project. The notebook should have code and demonstrate the task but also be written in an expository way that other students could, in principle, read and learn from. It is submitted in the same way as the regular course assignments.

Each project notebook must be submitted via Gradescope for grading. No late submissions are allowed for the projects. If you do not submit in Gradescope by the deadline, you will receive a zero grade for that project. There are no exceptions to this policy.

Class Attendance#

In-person attendance for each lecture is mandatory (unless otherwise informed by the instructor, e.g. due to weather-related issues). Your attendance is automatically recorded by joining the lecture live room on Campuswire. The algorithm used by Campuswire is the following:

  1. Present:
     • Participated for more than 75% of the event duration
     • Arrived within 10 minutes of the start time
     • Did not leave more than 10 minutes before the end time

  2. Late:
     • Participated for at least the event duration minus 25 minutes (15 minutes late arrival + 10 minutes early exit)
     • Arrived within 15 minutes of the start time

  3. Absent:
     • Does not meet the criteria for Present or Late
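The attendance rules above can be read as a small decision procedure. The sketch below is only an illustration of those rules, not Campuswire's actual implementation: the function name, its arguments, and the assumption that each student is present for one continuous interval are all hypothetical.

```python
def classify_attendance(duration_min, arrival_min, departure_min):
    """Classify one attendee; all times are minutes after the scheduled start.

    Assumes a single continuous stay from arrival_min to departure_min.
    """
    # Time actually spent inside the scheduled event window.
    participated = max(0.0, min(departure_min, duration_min) - max(arrival_min, 0.0))
    # How many minutes before the scheduled end the attendee left.
    left_early_by = max(0.0, duration_min - departure_min)

    # Present: more than 75% participation, at most 10 min late, at most 10 min early exit.
    if (participated > 0.75 * duration_min
            and arrival_min <= 10
            and left_early_by <= 10):
        return "Present"
    # Late: at least (duration - 25 min) participation and at most 15 min late.
    if participated >= duration_min - 25 and arrival_min <= 15:
        return "Late"
    # Everything else is Absent.
    return "Absent"


# Example for an 80-minute lecture (the lecture length is an assumed value).
print(classify_attendance(80, 5, 80))   # on time, stayed to the end
print(classify_attendance(80, 12, 80))  # arrived 12 minutes after the start
print(classify_attendance(80, 30, 80))  # arrived 30 minutes after the start
```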

Grading#

  • Class attendance and participation: 5%

  • In-class quizzes: 10%

  • Homework: 40%

  • Projects: 45%

Official grades for the course assignments will be maintained at the online MyPhysics Gradebook.

Letter grades will be assigned as follows:

  • A+   [97.0 - 100.0]

  • A     [93.0 - 96.9]

  • A-   [90.0 - 92.9]

  • B+   [87.0 - 89.9]

  • B     [83.0 - 86.9]

  • B-   [80.0 - 82.9]

  • C+   [77.0 - 79.9]

  • C     [73.0 - 76.9]

  • C-   [70.0 - 72.9]

  • D+   [67.0 - 69.9]

  • D     [63.0 - 66.9]

  • D-   [60.0 - 62.9]

  • F     [00.0 - 59.9]
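As an illustration of how the weights and letter-grade bands combine, here is a minimal sketch. It assumes each component is scored on a 0-100 scale and that a score falling between two listed bands (e.g. 96.95) is assigned by the lower edge of each band; the function and variable names are hypothetical, and the official record remains the gradebook.

```python
# Component weights from the Grading section (fractions of the final score).
WEIGHTS = {"attendance": 0.05, "quizzes": 0.10, "homework": 0.40, "projects": 0.45}

# Lower edge of each letter-grade band, highest first.
LETTER_FLOORS = [
    (97.0, "A+"), (93.0, "A"), (90.0, "A-"),
    (87.0, "B+"), (83.0, "B"), (80.0, "B-"),
    (77.0, "C+"), (73.0, "C"), (70.0, "C-"),
    (67.0, "D+"), (63.0, "D"), (60.0, "D-"),
]


def course_score(components):
    """Weighted average of component scores, each given on a 0-100 scale."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def letter_grade(score):
    """Return the first band whose lower edge the score reaches, else F."""
    for floor, letter in LETTER_FLOORS:
        if score >= floor:
            return letter
    return "F"


example = {"attendance": 100, "quizzes": 80, "homework": 90, "projects": 85}
score = course_score(example)
print(round(score, 2), letter_grade(score))
```

For the example scores, the weighted sum is 5 + 8 + 36 + 38.25 = 87.25, which falls in the B+ band.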

Datasets#

In this section we describe the datasets used in the lectures and homeworks. There are additional scientific datasets used for the projects that are described in the projects area of the course page.

Line#

A simple line with errors. Columns are x, y and dy. The reported errors are systematically too large by a constant factor, and are set to NaN for a fraction of the samples. Target is y_true.

Applications:

  • Reading CSV into a Pandas dataframe.

  • Straight line regression.

  • Handling missing values.

  • Handling (overestimated) input errors.
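A minimal sketch of these applications follows. It uses synthetic stand-in data with the same column names (x, y, dy) rather than the course file; the slope, intercept, noise level, and overestimation factor are made-up values chosen only for illustration.

```python
import numpy as np

# Synthetic stand-in for the Line dataset (values below are illustrative assumptions).
rng = np.random.default_rng(0)
n = 200
x = np.linspace(0.0, 1.0, n)
true_sigma = 0.1
y = 1.5 * x - 0.5 + rng.normal(0.0, true_sigma, n)
dy = np.full(n, 2.0 * true_sigma)      # reported errors too large by a factor of 2
dy[rng.random(n) < 0.1] = np.nan       # a fraction of the errors are missing

# Handle missing values: keep only samples with a finite reported error.
valid = np.isfinite(dy)

# Weighted straight-line fit; np.polyfit expects weights of 1/sigma.
slope, intercept = np.polyfit(x[valid], y[valid], deg=1, w=1.0 / dy[valid])

# Check the errors: chi-square per degree of freedom should be ~1 if dy is correct.
pulls = (y[valid] - (slope * x[valid] + intercept)) / dy[valid]
chi2_per_dof = np.sum(pulls**2) / (valid.sum() - 2)
error_scale = np.sqrt(chi2_per_dof)    # < 1 suggests the errors are overestimated

print(f"slope={slope:.2f}, intercept={intercept:.2f}, error scale={error_scale:.2f}")
```

Multiplying dy by the estimated error scale is one simple way to correct a constant overestimation factor, under the assumption that the straight-line model is adequate.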

Pong#

Each sample is a 2D trajectory of a ping-pong ball launched with different initial conditions. Trajectories are calculated with an analytic model that includes a linear drag term. There are three clusters of trajectories with similar initial conditions, identified by target grp. Target th0 gives the true initial launch angle in degrees. Target hit identifies trajectories that pass through a fixed “hoop” at \(x = 0.5\).

Applications:

  • Reading HDF5 into a Pandas dataframe.

  • Dimensionality reduction (20D points lie on a 2D manifold).

  • Nonlinear regression (target th0).

  • Clustering (target grp).

  • Classification (target hit).
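To make the phrase "analytic model that includes a linear drag term" concrete, here is a sketch of the textbook solution for projectile motion with drag force proportional to velocity. It is not the code used to generate the Pong dataset: the launch speed, drag constant, time range, and the layout of 10 (x, y) pairs per sample are all assumptions made for illustration.

```python
import numpy as np


def trajectory(theta_deg, v0=3.0, k=1.0, g=9.81, n_steps=10, t_max=0.5):
    """Closed-form trajectory for dv/dt = -g*yhat - k*v, launched from the origin.

    theta_deg is the launch angle in degrees; v0, k, g, n_steps and t_max
    are illustrative values, not the ones used for the course dataset.
    """
    theta = np.radians(theta_deg)
    vx0, vy0 = v0 * np.cos(theta), v0 * np.sin(theta)
    t = np.linspace(0.0, t_max, n_steps)
    decay = -np.expm1(-k * t)                    # 1 - exp(-k t), accurate for small k
    x = vx0 / k * decay                          # horizontal position
    y = (vy0 + g / k) / k * decay - g * t / k    # vertical position
    return t, x, y


t, x, y = trajectory(45.0)
features = np.concatenate([x, y])   # one sample as a flat feature vector
print(features.shape)
```

Because every trajectory in such a model is fixed by a small number of initial conditions, the flattened feature vectors occupy a low-dimensional surface, which is the property the dimensionality-reduction exercise relies on.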

Cosmo#

Each sample is an LCDM cosmology defined by input parameters omega_b, omega_cdm, ln10^{10}A_s and H0. Corresponding targets are values of sigma8, rd, DA(0.57)/rd, DH(0.57)/rd, DA(2.34)/rd, and DH(2.34)/rd calculated with CLASS. The CLASS calculations are relatively slow (~1 hr per 1K), so the goal of this dataset is to train a faster emulator. Input values are uniformly distributed on a grid centered on the Planck2015 best fit result and spanning ±10 sigma.

Applications:

  • Dimensionality reduction.

  • Approximately linear regression.
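The idea of an emulator can be sketched with ordinary least squares: fit a cheap map from the 4 input parameters to the 6 targets, then evaluate that map instead of the slow calculation. The data below are random stand-ins with the right shapes (they are not CLASS outputs), and the function name emulate is hypothetical.

```python
import numpy as np

# Random stand-in data: 4 rescaled input parameters, 6 nearly linear targets.
rng = np.random.default_rng(1)
n_samples, n_inputs, n_targets = 500, 4, 6
X = rng.uniform(-1.0, 1.0, size=(n_samples, n_inputs))
true_coef = rng.normal(size=(n_inputs, n_targets))
Y = 2.0 + X @ true_coef + 0.05 * X[:, :1] ** 2   # small nonlinear term added on purpose

# Linear emulator: least-squares fit with an intercept column.
design = np.column_stack([np.ones(n_samples), X])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)


def emulate(params):
    """Predict all targets for one or more parameter vectors."""
    params = np.atleast_2d(params)
    return np.column_stack([np.ones(len(params)), params]) @ coef


max_err = np.max(np.abs(emulate(X) - Y))
print(f"largest absolute emulation error on the training set: {max_err:.3f}")
```

The residual left by the small quadratic term shows where a linear emulator stops being adequate and a more flexible regressor would be needed.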

Higgs#

Data from the 2014 Higgs Challenge which is now archived here.

This file is too large to include in the repo, so instead the Pandas notebook provides a function to generate higgs_data.hf5 and higgs_target.hf5 from the downloaded .csv.gz file and copy them into the installed data path.

Applications:

  • Dimensionality reduction.

  • Train/test split.

  • Classification.
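A train/test split simply partitions the sample indices at random so that model performance is measured on data not used for fitting. The helper below is a hypothetical NumPy-only sketch; the test fraction and seed are arbitrary choices.

```python
import numpy as np


def train_test_split_indices(n_samples, test_fraction=0.25, seed=0):
    """Return (train_idx, test_idx) as disjoint random index arrays."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)            # shuffle 0..n_samples-1
    n_test = int(round(test_fraction * n_samples))
    return order[n_test:], order[:n_test]         # first n_test shuffled indices are the test set


train_idx, test_idx = train_test_split_indices(1000, test_fraction=0.2)
print(len(train_idx), len(test_idx))
```

The same index arrays can be applied to both the feature table and the target column so that rows stay aligned.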

Clusters#

Demo files for clustering: 4 in 2D with 2 clusters, and 1 in 3D with 3 clusters. Data features are x0, x1 (x2) and target is y.

Applications:

  • Clustering.

Spectra#

Spectra containing two peaks with variable flux and fixed locations and widths, over a constant background, with Poisson noise added. Data features are fluxes in wavelength bins (with un-named columns). Targets are the true fluxes in each peak (‘flux1’, ‘flux2’).

Applications:

  • Dimensionality reduction.

  • Clustering.

  • Regression.
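The regression task can be illustrated with a synthetic spectrum built as described above: two peaks at fixed locations and widths, a constant background, and Poisson noise. All numerical values (bin count, peak positions, widths, fluxes, background level) are assumptions for this sketch, and Gaussian peak shapes are assumed as well.

```python
import numpy as np

rng = np.random.default_rng(2)
wavelength = np.linspace(0.0, 1.0, 200)   # bin centers, arbitrary units


def peak_profile(center, width):
    """Gaussian peak shape normalised to unit sum, so its coefficient is the flux."""
    profile = np.exp(-0.5 * ((wavelength - center) / width) ** 2)
    return profile / profile.sum()


template1 = peak_profile(0.3, 0.03)
template2 = peak_profile(0.7, 0.05)
background = 5.0                           # expected counts per bin

# One noisy spectrum with known true fluxes.
flux1_true, flux2_true = 2000.0, 1000.0
expected = flux1_true * template1 + flux2_true * template2 + background
counts = rng.poisson(expected).astype(float)

# Linear regression: counts ~ flux1*template1 + flux2*template2 + background.
design = np.column_stack([template1, template2, np.ones_like(wavelength)])
(flux1_fit, flux2_fit, bkg_fit), *_ = np.linalg.lstsq(design, counts, rcond=None)

print(f"flux1={flux1_fit:.0f}, flux2={flux2_fit:.0f}, background={bkg_fit:.1f}")
```

Because the peak locations and widths are fixed, the expected spectrum is linear in the two fluxes, which is why a plain least-squares fit is enough here.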

Circles#

The circles files contain 500 2D points on two concentric circles with feature names x0, x1 and target integer y = 0,1 indicating which circle they belong to.

Applications:

  • Linear clustering in higher dimensions.

  • Kernel trick.

  • Kernel PCA.
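Two concentric circles cannot be separated by a straight line in the (x0, x1) plane, but they can be after adding a suitable third feature. The sketch below generates stand-in data (the radii and noise level are assumptions) and applies the explicit feature map x2 = x0² + x1²; kernel methods achieve the same effect implicitly without constructing the extra feature.

```python
import numpy as np

# Stand-in for the circles data: 500 points on two noisy concentric circles.
rng = np.random.default_rng(3)
n = 500
y = rng.integers(0, 2, size=n)                        # which circle: 0 (inner) or 1 (outer)
radius = np.where(y == 0, 1.0, 2.0) + rng.normal(0.0, 0.05, n)
angle = rng.uniform(0.0, 2.0 * np.pi, n)
x0, x1 = radius * np.cos(angle), radius * np.sin(angle)

# Lift to 3D with a squared-radius feature.
x2 = x0**2 + x1**2

# In the lifted space a single plane (x2 = constant) separates the classes.
threshold = 1.5**2
y_pred = (x2 > threshold).astype(int)
accuracy = np.mean(y_pred == y)
print(f"accuracy of the linear cut in the lifted space: {accuracy:.3f}")
```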

Ess#

The ess files contain 500 3D points on a 2D sheet bent into an S-shape with features named x0, x1, x2 and target value y from 0-1 giving the coordinate along the sheet.

Applications:

  • Manifold learning.

  • Locally linear embedding (LLE).

Blobs#

The blobs files contain 2K 3D points sampled from 3 Gaussian blobs with features named x0, x1, x2 and target value y = 0, 1, 2 giving their generated group membership.

Applications:

  • Clustering.

  • Density estimation.

Policies#

Covid#

  • Policies related to COVID-19 can be found at https://covid19.illinois.edu

  • If you feel ill or are unable to come to class or complete class assignments due to issues related to COVID-19, including but not limited to testing positive yourself, feeling ill, caring for a family member with COVID-19, or having unexpected child-care obligations, you should contact your instructor immediately, and you are encouraged to copy your academic advisor.

About using generative AI for homework and projects#

Generative AI systems, such as ChatGPT, can be valuable tools for learning and idea refinement in this course. You are encouraged to use AI as a tutor to clarify programming concepts, debug code, or explore ideas through iterative conversations—similar to working with a peer, TA, or instructor. However, AI should not be used to directly copy-paste solutions or complete homework problems.

If you use generative AI, you must credit the source by including a comment with the original source of any code or information you incorporate into your work. Additionally, provide a brief description of how the AI was used, such as for debugging a function, refining the methodology, or improving the code efficiency. This helps ensure transparency regarding the use of AI in your work.

The goal of this course is to help you develop the skills to solve problems independently. While AI can extend your capabilities, it should be used as a tool for learning, not as a substitute for the problem-solving process. Relying on AI-generated answers or code without engaging in the problem-solving process can hinder your intellectual growth and is considered academically dishonest. As with all academic tools, AI should be used responsibly to support, not replace, your learning.

Academic Integrity#

You must never submit the work of someone else as your own. We understand that many of you will find it helpful to work with other students to master the course. But when you collaborate with your study group on homework assignments, you must be a full, active participant in developing the solutions that you submit for credit. Unlike the homework, your project assignments are to be done on your own without collaboration.

It is cheating to receive answers from another student and then use them as your own. It is cheating to submit as your own work solutions that you find by searching on the internet or using online generative AI tools such as ChatGPT, or by subscribing to an online service that suborns cheating. It is cheating—and a violation of U.S. copyright law—to give (or sell) course material to someone else who intends to redistribute and/or sell it.

All activities in this course are subject to the Academic Integrity rules as described in Article 1, Part 4, Academic Integrity, of the Student Code.

Sexual Misconduct Reporting Obligation#

The University of Illinois is committed to combating sexual misconduct. Faculty and staff members are required to report any instances of sexual misconduct to the University’s Title IX Office. In turn, an individual with the Title IX Office will provide information about rights and options, including accommodations, support services, the campus disciplinary process, and law enforcement options.

A list of the designated University employees who, as counselors, confidential advisors, and medical professionals, do not have this reporting responsibility and can maintain confidentiality, can be found here: wecare.illinois.edu/resources/students/#confidential.

Other information about resources and reporting is available here: https://wecare.illinois.edu and https://wellness.illinois.edu.

Mental Health Services#

Significant stress, mood changes, excessive worry, substance/alcohol misuse or interferences in eating or sleep can have an impact on academic performance, social development, and emotional wellbeing. The University of Illinois offers a variety of confidential services including individual and group counseling, crisis intervention, psychiatric services, and specialized screenings which are covered through the Student Health Fee. If you or someone you know experiences any of the above mental health concerns, you are strongly encouraged to contact or visit any of the University’s resources provided below. Getting help is a smart and courageous thing to do for yourself and for those who care about you.

  • Counseling Center (217) 333-3704

  • McKinley Health Center (217) 333-2700

  • National Suicide Prevention Lifeline (800) 273-8255

  • Rosecrance Crisis Line (217) 359-4141 (available 24/7, 365 days a year)

If you are in immediate danger, call 911.

*This statement is approved by the University of Illinois Counseling Center.

Students with Disabilities#

To obtain disability-related academic adjustments and/or auxiliary aids, students with disabilities must contact the course instructor and the Disability Resources and Educational Services (DRES) as soon as possible. To contact DRES, you may visit 1207 S. Oak St., Champaign, call 333-4603, e-mail disability@illinois.edu or go to https://www.disability.illinois.edu. If you are concerned you have a disability-related condition that is impacting your academic progress, there are academic screening appointments available that can help diagnose a previously undiagnosed disability. You may access these by visiting the DRES website and selecting “Request an Academic Screening” at the bottom of the page.

Resources#

Useful references#

Quick guides#

Tools#

Git and GitHub#

Project Jupyter#

Acknowledgements#

This course was developed by Mark Neubauer. It was first taught by Mark Neubauer during the Spring 2024 semester.