Advanced Research Computing 2024/25
This year’s course will broadly focus on aspects of machine learning and its interactions with AGQ.
Logistics
Please email Tudor Dimofte to register for the course, either for credit or to audit.
The course meets for three hours a week, in person near the Bayes Centre (University of Edinburgh central campus) and online. Our meeting rooms/times are:
Date | Time | Session | Location
--- | --- | --- | ---
24 Jan | 10am-1pm | Lec 1 (Rob) | G.02, 16-20 George Square (access at 19GSq)
31 Jan | 10am-1pm | Lec 2 (Rob) | LG.10 (40 George Square Lower Teaching Hub)
7 Feb | 10am-1pm | Lec 3 (Rob) | LG.10 (40 George Square Lower Teaching Hub)
14 Feb | 10am-1pm | Lec 4 (Rob) | LG.10 (40 George Square Lower Teaching Hub)
24 Feb | 2pm-5pm | Lec 5 (Christos) | M2 – Teaching Studio (Appleton Tower)
3 Mar | 2pm-5pm | Lec 6 (Christos) | M2 – Teaching Studio (Appleton Tower)
10 Mar | 2pm-5pm | Lec 7 (Davide) | M2 – Teaching Studio (Appleton Tower)
17 Mar | 2pm-5pm | Lec 8 (Davide) | M2 – Teaching Studio (Appleton Tower)
24 Mar | 4pm-5pm | Guest lecture: James Sully (Anthropic) | online / view together in TBA
28 Mar | 10am-1pm | Lec 9 (Sjoerd) | LG.10 (40 George Square Lower Teaching Hub)
4 Apr | 10am-1pm | Lec 10 (Sjoerd) | G.02, 16-20 George Square (access at 19GSq)
11 Apr | TBA | TBA – guest? | TBA
Zoom links for virtual participation are sent out via email. For Glasgow-based students: 414 East Quad (Geography – opposite the cloisters in the main building) has been booked to join remotely.
Assignments: A homework coding exercise will be given every two lectures (five in total). Students taking the course for credit must score at least 60% on at least four of the five homeworks to pass the course.
Course Material
This course will begin with four lectures, each followed by a workshop, introducing modern AI/ML technologies.
To be able to participate in the workshops, students will need to install Anaconda and set up an environment for running the AGQ Jupyter notebooks.
Installing Anaconda
First you need to install Anaconda on your laptop:
https://docs.anaconda.com/anaconda/install/
Import an Anaconda environment
The environment file for the start of this course is found here:
https://github.com/rob-c/agq-uploads/blob/master/AGQenv.yaml
Importing an environment into Anaconda:
https://docs.anaconda.com/navigator/tutorials/manage-environments/#importing-an-environment
Anaconda cheat sheet:
https://github.com/jumdc/cheat-sheets/blob/main/cs/conda.md
Testing your Setup
After installing Anaconda and setting up the AGQenv, I would recommend testing your install with this Jupyter notebook:
https://github.com/rob-c/agq-uploads/blob/master/TestPlayBookAGQ.ipynb
Running this notebook should display the line `AGQ Tests Passed!` at the very bottom.
If you see a bunch of red text then something has gone wrong and you may need to reach out to an expert for help. (If you don’t have access to an expert before the first workshop, don’t worry — we’ll help you out then.)
Familiarity with Python3
The first lecture of this course will require some familiarity with the Python3 programming language. Some useful online resources for becoming familiar with Python are below. If you’ve never used Python before, please take a look. We’ll help during the first few workshops as well.
Scientific programming with Python: https://git.ecdf.ed.ac.uk/pclark3/sciprog2024
Python for Beginners: https://python.land/python-tutorial
Intro to Python: https://python-course.eu/python-tutorial/
Online Python tutorials from Microsoft: https://learn.microsoft.com/en-us/shows/intro-to-python-development/
Familiarity with Git
The first workshop will go through the basics of using git to fork, clone and submit changes to a repo, as well as installing and editing some Python3 code.
Some other online resources for mastering git are:
Online Git and GitHub training materials from Microsoft: https://learn.microsoft.com/en-us/training/paths/github-foundations/
Learning how to work with Git branches: https://learngitbranching.js.org/
W3Schools online Git training: https://www.w3schools.com/git/
Atlassian Git training: https://www.atlassian.com/git
Familiarity with Jupyter-Notebooks
During this course, JupyterLab and Jupyter notebooks will be introduced to manage, run and store the results of machine learning problems.
For anyone wanting to learn more about Jupyter-notebooks there are these online resources:
Dataquest’s intro to Jupyter-Notebooks: https://www.dataquest.io/blog/jupyter-notebook-tutorial/
Geeks4Geeks: https://www.geeksforgeeks.org/how-to-use-jupyter-notebook-an-ultimate-guide/
A good tutorial found on GitHub: https://gist.github.com/rob-c/b0541f3a51f0cfb518dd5ddc648a79f7
Anaconda’s introduction to Jupyter-Notebooks: https://learning.anaconda.cloud/jupyter-notebook-basics-course
Lecture 1
This lecture will begin by describing some of the more advanced features in Python programming as well as giving an introduction to the git version control system.
For the Workshop:
Python & Git
The example playbook for this workshop is: https://github.com/rob-c/farm
This repo provides a demonstration of using classes, inheritance, callbacks and basic exception handling.
The goals for working with this would be:
0) Fork the repo on GitHub and clone locally
1) Install and run the package in a Python virtualenv
2) Switch to a development branch
3) Add a new crop, animal and equipment class to the farm
4) Fix and break different pieces of equipment
5) After adding new ‘features’ commit this to a new branch
6) Push features back to GitHub
7) Make a PR detailing changes
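The farm repo defines its own classes; as a rough illustration of the concepts it demonstrates (classes, inheritance, callbacks and basic exception handling), here is a minimal sketch with hypothetical names, not the repo's actual API:

```python
# Illustrative sketch only -- class and function names are made up,
# not taken from the farm repo.

class Equipment:
    def __init__(self, name, on_break=None):
        self.name = name
        self.working = True
        self._on_break = on_break  # optional callback invoked on breakage

    def break_down(self):
        self.working = False
        if self._on_break is not None:
            self._on_break(self)  # fire the callback

    def use(self):
        if not self.working:
            raise RuntimeError(f"{self.name} is broken!")
        return f"Using {self.name}"

class Tractor(Equipment):  # inheritance: a Tractor is a kind of Equipment
    def use(self):
        return super().use() + " to plough the field"

def log_breakage(item):  # a simple callback
    print(f"ALERT: {item.name} needs repair")

tractor = Tractor("tractor", on_break=log_breakage)
print(tractor.use())
tractor.break_down()
try:
    tractor.use()
except RuntimeError as err:  # basic exception handling
    print(f"Caught: {err}")
```

The repo's actual classes are richer, but the same patterns (subclassing a base class, passing callables, raising and catching exceptions) are what the workshop exercises build on.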
Numerical Minimization
There will also be some examples of using numerical minimization to tune parameters within a PDF to some simulated data in the workshop.
This should help familiarize you with different types of machinery designed to extract information from finite datasets, as well as with some of their potential shortcomings.
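As a flavour of what this looks like (a minimal sketch, not the workshop notebook itself), one can fit the mean and width of a Gaussian PDF to simulated data by minimizing the negative log-likelihood with SciPy:

```python
# Sketch: tune the parameters of a Gaussian PDF to simulated data
# by numerical minimization of the negative log-likelihood.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
data = rng.normal(loc=2.0, scale=0.5, size=1000)  # simulated dataset

def nll(params):
    """Negative log-likelihood of a Gaussian (constants dropped)."""
    mu, sigma = params
    if sigma <= 0:
        return np.inf  # keep the minimizer in the valid region
    return 0.5 * np.sum(((data - mu) / sigma) ** 2) + len(data) * np.log(sigma)

result = minimize(nll, x0=[0.0, 1.0], method="Nelder-Mead")
mu_hat, sigma_hat = result.x
print(f"fitted mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
```

The fitted values should land close to the true parameters (2.0 and 0.5); the workshop explores where and why such fits can go wrong.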
Slides:
https://github.com/rob-c/agq-uploads/blob/master/Lecture1/Lecture1.pdf
Workshop-Notebooks:
https://github.com/rob-c/agq-uploads/blob/master/Lecture1/Minimization_Problems.ipynb
https://github.com/rob-c/agq-uploads/blob/master/Lecture1/data-science-tools.ipynb
Lecture 2
This lecture and workshop will go through the fundamentals of how Deep Neural Networks are constructed.
We will work through an example of building a DNN in NumPy and training it on some input data.
After doing this, we will introduce the PyTorch framework and show how it can be used to create models to perform the same task but much quicker.
The workshop will finish with describing, building and training a DNN classifier using the PyTorch framework.
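To give a feel for the NumPy part (a deliberately tiny sketch, not the workshop code), here is a two-layer network trained by hand-written gradient descent on XOR; the workshop builds a fuller version and then repeats the task in PyTorch:

```python
# Minimal NumPy-only DNN: forward pass, backward pass, gradient descent.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)   # 2 -> 8 hidden layer
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)   # 8 -> 1 output layer
lr = 1.0

for _ in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # backward pass (mean squared error loss, chain rule by hand)
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

print(np.round(out.ravel(), 2))  # should approach [0, 1, 1, 0]
```

PyTorch automates exactly the backward-pass bookkeeping done by hand here, which is why the same model is far quicker to write (and to train) in that framework.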
Slides:
https://github.com/rob-c/agq-uploads/blob/master/Lecture2/AGQ2.pdf
Workshop-Notebook:
https://github.com/rob-c/agq-uploads/blob/master/Lecture2/DNN-Simplified-Questions.ipynb
Lecture 3
This lecture and workshop will dive deeper into different Neural Network designs.
First we will recap what is involved in building a classifier using DNNs. Then we will move on to discuss different neuron types and how they can be used to achieve the same results.
I will introduce (Variational) AutoEncoders, U-Net and other network designs which are able to perform image analysis.
In the workshop I will go through AutoEncoders, their uses, advantages and pitfalls as well as how to use such models to perform anomaly detection.
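The core idea of autoencoder-based anomaly detection can be sketched very compactly (this is an illustration, not the workshop's VAE): a linear autoencoder, fitted here in closed form via SVD rather than by training, flags points with large reconstruction error as anomalies.

```python
# Anomaly detection by reconstruction error, using a linear autoencoder
# (equivalent to PCA) fitted in closed form via SVD.
import numpy as np

rng = np.random.default_rng(1)
# "normal" data lives near a 2D plane embedded in 10D space
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
normal_data = latent @ mixing + 0.05 * rng.normal(size=(500, 10))

# encoder/decoder = projection onto the top-2 principal directions
mean = normal_data.mean(axis=0)
_, _, Vt = np.linalg.svd(normal_data - mean, full_matrices=False)
V = Vt[:2].T  # 10 -> 2 encoder; its transpose decodes 2 -> 10

def reconstruction_error(x):
    z = (x - mean) @ V           # encode
    x_hat = z @ V.T + mean       # decode
    return np.linalg.norm(x - x_hat, axis=-1)

# threshold set from the errors the model makes on normal data
threshold = np.quantile(reconstruction_error(normal_data), 0.99)
anomaly = rng.normal(size=10) * 3.0  # a point far from the learned plane
print(reconstruction_error(anomaly) > threshold)  # flagged as anomalous
```

A trained nonlinear (variational) autoencoder plays the same role as the SVD projection here, but can capture curved data manifolds; the pitfalls discussed in the workshop largely concern when that learned manifold is trusted too much.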
Slides:
https://github.com/rob-c/agq-uploads/blob/master/Lecture3/AGQ3.pdf
Workshop material:
https://github.com/rob-c/agq-uploads/blob/master/Lecture3/img.png
https://github.com/rob-c/agq-uploads/blob/master/Lecture3/AGQ3_Questions.ipynb
https://github.com/rob-c/agq-uploads/blob/master/Lecture3/trained_vae_model.pth
Lecture 4
This lecture and workshop will touch on some modern Neural Network designs.
I will introduce the concept of attention in ML, its uses, as well as its advantages and disadvantages.
I will also discuss the impact of numerical precision on ML model design, and the relative importance of model performance in evaluation versus training.
In the workshop at the end of this lecture I will go through constructing a multi-head attention network to perform classification, and training an attention-based network on a waveform.
I will also discuss and demonstrate some of the technical issues surrounding training and evaluating a 1-bit neural network compared to one using full floating-point precision.
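The building block of the multi-head networks covered in the workshop is scaled dot-product attention, which can be written in a few lines of NumPy (an illustrative sketch, not the workshop code):

```python
# Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))  # stable softmax
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # similarity of each query to each key
    weights = softmax(scores, axis=-1)  # each row is a distribution over keys
    return weights @ V, weights         # weighted sum of values, plus weights

rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))    # 4 query positions, key dimension d_k = 8
K = rng.normal(size=(6, 8))    # 6 key positions
V = rng.normal(size=(6, 16))   # values carried by each key position
out, w = attention(Q, K, V)
print(out.shape, w.shape)  # (4, 16) (4, 6)
```

A multi-head layer simply runs several independent copies of this operation on learned linear projections of the input and concatenates the results; frameworks such as PyTorch provide this as a single module.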
More advanced (but related) things that have been recommended:
Machine learning for the working mathematician – seminar series at SMRI
LLMs and optimizing performance, from the point of view of a processor – lectures by Rafi Witten