Markus Schweighofer - Aktuelle Lehrveranstaltungen
WS 2025/2026 Linear Algebra in Data Science
9 = 4.5 + 4.5 ECTS credits, 4V+2Ü+P (P stands for programming projects in Julia)
Content.
This course offers a mathematically rigorous exploration of modern data science and machine learning, framed through the lens of linear algebra. Part I lays the groundwork by introducing fundamental machine learning algorithms such as Linear Regression, Support Vector Machines, and various clustering techniques. We will then delve into the core role of matrix factorizations, focusing on the Singular Value Decomposition (SVD) and its application to Principal Component Analysis (PCA) for dimensionality reduction and data compression.
Part II builds on this foundation to investigate advanced topics, starting with graph-based learning: you will explore concepts such as the graph Laplacian, spectral clustering, and diffusion maps to analyze network data. The course culminates in an introduction to deep learning, covering the architecture of neural networks, the backpropagation algorithm, and powerful models such as Convolutional Neural Networks (CNNs) and Transformers. Throughout both parts, theoretical concepts will be paired with hands-on implementation in the Julia programming language, applying the methods to solve problems with real-world datasets.
Target audience.
Students of mathematics, mathematics education (Lehramt), mathematical finance, or physics with a firm background in linear algebra who enjoy computer programming and working with real datasets. Since this is the first edition of the course, even students of the Master's program in mathematics can take it for credit.
It is, however, primarily designed for students of mathematics in their second year of study.
Credit eligibility.
The lecture is divided into two parts, each of which can be taken for 4.5 credits. The second part builds on the first.
Either part can be taken as an elective module (Wahlmodul) for students of mathematics (Bachelor's and exceptionally Master's program).
Part II can alternatively
be taken as compulsory module Practical Mathematics II (Pflichtmodul Praktische Mathematik II) for students of mathematics in the Bachelor's program.
Either part of the lecture is admissible for the 4.5 "special" (out of 9) ECTS credits in the sense of Appendix IV of the study regulations
("spezielles Gebiet") of the program in mathematics education (Lehramt).
Either part of the lecture is admissible in Block 2 (application) of the Advanced Data and Information Literacy Track (ADILT).
Students of mathematical finance or physics are highly welcome but should verify in which way this lecture can be validated for their study programs.
Literature.
This lecture is mainly based on the recently published book Linear Algebra, Data Science and Machine Learning by Jeff Calder and Peter Olver (electronically available through the library; connect via EduVPN to access it). Please do not try to borrow the
physical copy of the book from the library, as I need it for the whole semester (it is in my course reserve, and I will decline borrowing requests).
We currently plan to cover Section 5.7, Chapters 7 and 8, Sections 9.1 to 9.7, Section 9.10 and Sections 10.1 to 10.6 of the book.
It is important to know that, unlike in many other mathematics courses, we expect the participants to actively read and digest more than 300 pages of the book.
However, instead of Python, we will use Julia as the programming language.
Prerequisites.
We will assume that students are basically familiar with (or will quickly get acquainted with on their own) Chapters 1 to 4 and most of Chapter 5
of this book. For students from Konstanz this usually means that they should ideally have passed the courses Linear Algebra I and II for mathematicians,
or that they are hard-working and talented and prepare in advance by working through the unfamiliar parts of the book.
Plan of the course.
This plan could still change substantially but currently is as follows:
Part I (4.5 ECTS credits, can be attended without Part II)
Chapter 1: Machine Learning and Data
Lecture 01, Oct 22: Review of Linear Algebra and the Julia Programming Language
Lecture 02, Oct 24: Basics of Machine Learning and Data
Lecture 03, Oct 29: Linear Regression
Lecture 04, Oct 31: Support Vector Machines
Lecture 05, Nov 05: Nearest Neighbor Classification
Lecture 06, Nov 07: k-Means Clustering
Lecture 07, Nov 12: Kernel Methods
Chapter 2: Singular Values and Principal Component Analysis
Lecture 08, Nov 14: The Singular Value Decomposition
Lecture 09, Nov 19: The Principal Components
Lecture 10, Nov 21: The Best Approximating Subspace
Lecture 11, Nov 26: PCA-based Compression
Lecture 12, Nov 28: PCA-based Compressive Sensing
Lecture 13, Dec 03: Linear Discriminant Analysis
Lecture 14, Dec 05: Multidimensional Scaling
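To give a flavor of the hands-on Julia programming accompanying Part I, here is a minimal, self-contained sketch (purely illustrative, not official course material; the toy data is made up) of PCA computed via the SVD:

```julia
using LinearAlgebra, Statistics

# Toy data: 5 observations (rows) of 3 features (columns), nearly rank one.
X = [1.0  2.0  3.0;
     2.0  4.0  6.1;
     3.0  6.1  9.0;
     4.0  8.0 12.2;
     5.0 10.1 15.0]

Xc = X .- mean(X, dims=1)   # center each feature (column)
U, S, V = svd(Xc)           # thin SVD of the centered data matrix
scores = Xc * V[:, 1:2]     # project onto the first two principal directions
```

The columns of V are the principal directions of the centered data; projecting onto the first few of them gives the low-dimensional representation underlying PCA-based compression and visualization.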
Part II (4.5 ECTS credits; in practice cannot be taken without Part I; can be credited as the compulsory module "Practical Mathematics II")
Chapter 3: Graph Theory and Graph-Based Learning
Lecture 15, Dec 10: Graphs and Digraphs
Lecture 16, Dec 12: The Incidence Matrix
Lecture 17, Dec 17: The Graph Laplacian
Lecture 18, Dec 19: Binary Spectral Clustering
Lecture 19, Jan 07: Distances of Graphs
Lecture 20, Jan 09: Diffusion in Graphs and Digraphs
Lecture 21, Jan 14: Diffusion Maps and Spectral Embeddings
Lecture 22, Jan 16: The Discrete Fourier Transform
Chapter 4: Neural Networks and Deep Learning
Lecture 23, Jan 21: Neural Networks and Deep Learning
Lecture 24, Jan 23: Fully Connected Networks
Lecture 25, Jan 28: Backpropagation and Automatic Differentiation
Lecture 26, Jan 30: Convolutional Neural Networks
Lecture 27, Feb 04: Graph Convolutional Neural Networks
Lecture 28, Feb 06: Transformers and Large Language Models
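Likewise for Part II, here is a minimal Julia sketch (again purely illustrative; the example graph is made up) of binary spectral clustering with the graph Laplacian:

```julia
using LinearAlgebra

# Adjacency matrix of a small graph: two triangles joined by the edge 3–4.
A = [0 1 1 0 0 0;
     1 0 1 0 0 0;
     1 1 0 1 0 0;
     0 0 1 0 1 1;
     0 0 0 1 0 1;
     0 0 0 1 1 0]

D = Diagonal(vec(sum(A, dims=2)))       # degree matrix
L = D - A                               # graph Laplacian L = D - A
λ, Φ = eigen(Symmetric(Float64.(L)))    # eigenvalues in ascending order
fiedler = Φ[:, 2]                       # eigenvector of the second-smallest eigenvalue
cluster = fiedler .> 0                  # its sign pattern separates the two triangles
```

The smallest eigenvalue of L is 0 (with a constant eigenvector); the sign pattern of the second eigenvector, the Fiedler vector, yields the binary spectral clustering of the graph (up to the arbitrary overall sign of the eigenvector).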
Assessment Methods.
Active and frequent participation in the weekly tutorial session ("Übung") will be a basic requirement for passing the module.
The exam for Part I will be oral or written, depending on the number of participants. Instead of an exam, Part II will probably be concluded by individually assigned
programming projects that apply the learned methods to real datasets. This includes a presentation of the implementation and outcome of the project. This
presentation will be scheduled at an agreed date outside of the lecture period. It is important that the presenter is able to answer questions appropriately. The
presentation, together with the answers to the questions, will be graded.
Assignments.
Weekly written homework that has to be submitted electronically. It is important to be able to comment on the submitted solutions during the exercise class.
The homework includes programming exercises in Julia that have to be submitted electronically.
A small computer project replacing the exam for Part II of the course.
Weekly Schedule.
We will meet twice per week for the lectures and once for the tutorial session. The lecture is scheduled for Wednesdays and Fridays from 11:45 to 13:15 in room F426. If this is inconvenient, please write to me immediately. In the first session, we will agree on a weekly time slot for the exercise class.
Communication.
You will receive announcements concerning the course if and only if you are subscribed to the course on ILIAS. I strongly recommend subscribing to the course on ZEUS, which probably (?) entails automatic enrollment on ILIAS.