differences for PR #27
actions-user committed Mar 28, 2024
1 parent f86be25 commit fba9a80
Showing 18 changed files with 108 additions and 58 deletions.
2 changes: 2 additions & 0 deletions config.yaml
@@ -61,10 +61,12 @@ contact: '[email protected]'
episodes:
- profiling-introduction.md
- profiling-functions.md
- short-break1.md
- profiling-lines.md
- profiling-conclusion.md
- optimisation-introduction.md
- optimisation-data-structures-algorithms.md
- long-break1.md
- optimisation-minimise-python.md
- optimisation-use-latest.md
- optimisation-memory.md
15 changes: 1 addition & 14 deletions files/line_profiler-worked-example/fizzbuzz.py
@@ -1,23 +1,10 @@
n = 100
a=0
b=0
c=0
d=0
for i in range(1, n + 1):
    if i % 3 == 0 and i % 5 == 0:
        a+=1
        print("FizzBuzz")
    elif i % 3 == 0:
        b+=1
        print("Fizz")
    elif i % 5 == 0:
        c+=1
        print("Buzz")
    else:
        d+=1
        print(i)

print(a)
print(b)
print(c)
print(d)
        print(i)
Binary file added files/pred-prey/predprey.py.lprof
Binary file not shown.
Binary file added files/pred-prey/predprey_out.prof
Binary file not shown.
8 changes: 8 additions & 0 deletions long-break1.md
@@ -0,0 +1,8 @@
---
title: Break
teaching: 0
exercises: 0
break: 60
---

Take a break. If you can, move around and look at something away from your screen to give your eyes a rest.
26 changes: 14 additions & 12 deletions md5sum.txt
@@ -1,20 +1,22 @@
"file" "checksum" "built" "date"
"CODE_OF_CONDUCT.md" "c93c83c630db2fe2462240bf72552548" "site/built/CODE_OF_CONDUCT.md" "2024-03-20"
"LICENSE.md" "b24ebbb41b14ca25cf6b8216dda83e5f" "site/built/LICENSE.md" "2024-03-20"
"config.yaml" "b413b2dfbce4f70e178cae4d6d2d6311" "site/built/config.yaml" "2024-03-20"
"config.yaml" "6d4d0c5a03448624b21b0eeaf3dd183a" "site/built/config.yaml" "2024-03-28"
"index.md" "3a6d3683998a6b866c134a818f1bb46e" "site/built/index.md" "2024-03-20"
"links.md" "8184cf4149eafbf03ce8da8ff0778c14" "site/built/links.md" "2024-03-20"
"episodes/profiling-introduction.md" "86c0d8691aae9c0aa4b6f23e216a2978" "site/built/profiling-introduction.md" "2024-03-20"
"episodes/profiling-functions.md" "afdff3941ebb6f5f9121a57eaf6dbf88" "site/built/profiling-functions.md" "2024-03-22"
"episodes/profiling-lines.md" "c00ccba171a20837aeffc9f8079d9e1d" "site/built/profiling-lines.md" "2024-03-22"
"episodes/profiling-conclusion.md" "a3c2deb1bc4efaaf4a2a70f966734b71" "site/built/profiling-conclusion.md" "2024-03-20"
"episodes/optimisation-introduction.md" "2c2bbafab97d4db78aa5735839516c81" "site/built/optimisation-introduction.md" "2024-03-20"
"episodes/optimisation-data-structures-algorithms.md" "029ce8cc24d92f2d819fadadc06b5999" "site/built/optimisation-data-structures-algorithms.md" "2024-03-20"
"episodes/optimisation-minimise-python.md" "fa4f1c8cd55a8b4ac2870d5bbc4d23d1" "site/built/optimisation-minimise-python.md" "2024-03-20"
"episodes/optimisation-use-latest.md" "4c939e9dbde33a1f47fefe5e757ff256" "site/built/optimisation-use-latest.md" "2024-03-20"
"episodes/optimisation-memory.md" "69eb84dfc419083ff12856a80750a618" "site/built/optimisation-memory.md" "2024-03-20"
"episodes/optimisation-conclusion.md" "ccd780c447f0b0ce97b8da1b2572b9c1" "site/built/optimisation-conclusion.md" "2024-03-20"
"episodes/profiling-introduction.md" "e3678e7f2f3ea9bd44e7fc0604cc650b" "site/built/profiling-introduction.md" "2024-03-28"
"episodes/profiling-functions.md" "94be5829062d7d0691bf192de13708ad" "site/built/profiling-functions.md" "2024-03-28"
"episodes/short-break1.md" "c7d9988cade5cc12dfbbf6c2a29ff2e9" "site/built/short-break1.md" "2024-03-28"
"episodes/profiling-lines.md" "7ad4ed55b19e0c874b9474912b8dd532" "site/built/profiling-lines.md" "2024-03-28"
"episodes/profiling-conclusion.md" "b5687e26387b353ef23c6292f295ca02" "site/built/profiling-conclusion.md" "2024-03-28"
"episodes/optimisation-introduction.md" "c875757a1d4df74c654322de2261fbab" "site/built/optimisation-introduction.md" "2024-03-28"
"episodes/optimisation-data-structures-algorithms.md" "ae2d5f70c4649800f35794b24c3cef2b" "site/built/optimisation-data-structures-algorithms.md" "2024-03-28"
"episodes/long-break1.md" "19a5c42e45032003c36ad8f413f44528" "site/built/long-break1.md" "2024-03-28"
"episodes/optimisation-minimise-python.md" "37e3a2643171a7e2a432902b420929be" "site/built/optimisation-minimise-python.md" "2024-03-28"
"episodes/optimisation-use-latest.md" "5948276773890e97b7898292fddbcb39" "site/built/optimisation-use-latest.md" "2024-03-28"
"episodes/optimisation-memory.md" "d819f88003ffd721bc6b1ec783a23390" "site/built/optimisation-memory.md" "2024-03-28"
"episodes/optimisation-conclusion.md" "99b7eb75c9c513b54bcf01c321aa0020" "site/built/optimisation-conclusion.md" "2024-03-28"
"instructors/instructor-notes.md" "cae72b6712578d74a49fea7513099f8c" "site/built/instructor-notes.md" "2024-03-20"
"learners/setup.md" "3465b1c09e7527d085eb32f647227dc6" "site/built/setup.md" "2024-03-20"
"learners/setup.md" "4998f740cb34f70b024206224cdbcaf6" "site/built/setup.md" "2024-03-28"
"learners/acknowledgements.md" "c4064263d442f147d3796cb3dfa7b351" "site/built/acknowledgements.md" "2024-03-20"
"profiles/learner-profiles.md" "60b93493cf1da06dfd63255d73854461" "site/built/learner-profiles.md" "2024-03-20"
2 changes: 1 addition & 1 deletion optimisation-conclusion.md
@@ -1,6 +1,6 @@
---
title: "Optimisation Conclusion"
teaching: 0
teaching: 5
exercises: 0
---

4 changes: 2 additions & 2 deletions optimisation-data-structures-algorithms.md
@@ -1,7 +1,7 @@
---
title: "Data Structures & Algorithms"
teaching: 0
exercises: 0
teaching: 30
exercises: 5
---

:::::::::::::::::::::::::::::::::::::: questions
2 changes: 1 addition & 1 deletion optimisation-introduction.md
@@ -1,6 +1,6 @@
---
title: "Introduction to Optimisation"
teaching: 0
teaching: 10
exercises: 0
---

2 changes: 1 addition & 1 deletion optimisation-memory.md
@@ -1,6 +1,6 @@
---
title: "Understanding Memory"
teaching: 0
teaching: 30
exercises: 0
---

47 changes: 32 additions & 15 deletions optimisation-minimise-python.md
@@ -1,6 +1,6 @@
---
title: "Understanding Python (NumPy/Pandas)"
teaching: 0
teaching: 30
exercises: 0
---

@@ -202,18 +202,18 @@ dis.dis(searchListC)

## Scope

When Python executes your code, it has to find the variables that you're using.
When Python executes your code, it has to find the variables and functions that you're using.

This adds an additional cost to accessing variables in Python, which isn't typically seen in compiled languages.
This adds an additional cost to accessing variables and calling functions in Python, which isn't typically seen in compiled languages.

In particular, it will first check whether the variable has been declared within the current function (local scope), if it can't find it there it will check whether it has been declared in the file (global scope) after which it may even check whether it's from an imported package.
In particular, it will first check whether the variable or function has been declared within the current function (local scope); if it can't find it there, it will check whether it has been declared in the file (global scope), after which it may even check whether it's from an imported package.

Repeated accesses to variables, will repeat these checks.
These are not implicitly cached, therefore repeated accesses to variables and functions will repeat these checks.

The implication, is that as local scope variables are checked first, they will be faster to access.
The implication is that, as local scope variables and functions are checked first, they will be faster to use.

If you're only accessing a variable once or twice that's nothing to worry about, this is a relatively small cost.
But if a variable is being accessed regularly, such as within a loop, the impact may become visible.
But if a variable or function is being accessed regularly, such as within a loop, the impact may become visible.

The below example provides a small demonstration of this in practice.

@@ -242,7 +242,7 @@ print(f"Global Scope: {timeit(test_list_global, number=repeats):.5f}ms")
print(f"Local Scope: {timeit(test_list_local, number=repeats):.5f}ms")
```

This is only a trivial example, but local scope is about 20% faster than global scope!
This is only a trivial example, whereby `N` has been copied to the local scope `N_local`, but local scope is about 20% faster than global scope!

```output
Global Scope: 0.06416ms
@@ -251,6 +251,18 @@ Local Scope: 0.05391ms

Consider copying highly accessed variables into local scope; you can always copy them back to global scope before you return from a function.

Copying functions to local scope works much the same as variables, e.g.

```py
import numpy as np

def my_function():
    uniform_local = np.random.uniform

    for i in range(10000):
        t = uniform_local()
```
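
As a rough sketch of how one might measure this, the snippet below times a function called through its module attribute against the same function bound to a local name first. It uses `math.sqrt` from the standard library rather than `np.random.uniform` so that it runs without NumPy; the names `sqrt_global_lookup` and `sqrt_local_alias` are illustrative and not part of the lesson, and the size of any difference will vary with hardware and Python version.

```python
import math
from timeit import timeit

def sqrt_global_lookup():
    total = 0.0
    for i in range(10000):
        # Each iteration looks up `math` in global scope, then its `sqrt` attribute
        total += math.sqrt(i)
    return total

def sqrt_local_alias():
    # Bind the function to a local name once, before the loop
    sqrt_local = math.sqrt
    total = 0.0
    for i in range(10000):
        total += sqrt_local(i)
    return total

repeats = 100
# timeit returns the total time in seconds for `repeats` calls
t_global = timeit(sqrt_global_lookup, number=repeats)
t_local = timeit(sqrt_local_alias, number=repeats)
print(f"Module attribute lookup: {t_global:.5f}s")
print(f"Local alias: {t_local:.5f}s")
```

Both functions compute the same result, so any difference in timing comes from how the function is looked up inside the loop.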

## Built-in Functions Operators

In order to take advantage of offloading computation to the CPython back-end it's necessary to be aware of what functionality is present. Those available without importing packages are considered [built-in](https://docs.python.org/3/library/functions.html) functions.
@@ -584,26 +596,31 @@ import pandas as pandas

N = 100000 # Number of rows in DataFrame

def genInput():
    s = pandas.Series({'a' : 1, 'b' : 2})
    d = {'a' : 1, 'b' : 2}
    return s, d

def series():
    x = pandas.Series({'a' : 1, 'b' : 2})
    s, _ = genInput()
    for i in range(N):
        y = x['a'] * x['b']
        y = s['a'] * s['b']

def dictionary():
    x = {'a' : 1, 'b' : 2}
    _, d = genInput()
    for i in range(N):
        y = x['a'] * x['b']
        y = d['a'] * d['b']

repeats = 1000
print(f"series: {timeit(series, number=repeats):.2f}ms")
print(f"dictionary: {timeit(dictionary, number=repeats):.2f}ms")
```

69x slower!
65x slower!

```output
series: 236.16ms
dictionary: 3.42ms
series: 237.25ms
dictionary: 3.63ms
```

## Filter Early
2 changes: 1 addition & 1 deletion optimisation-use-latest.md
@@ -1,6 +1,6 @@
---
title: "Keep Python & Packages up to Date"
teaching: 0
teaching: 10
exercises: 0
---

2 changes: 1 addition & 1 deletion profiling-conclusion.md
@@ -1,6 +1,6 @@
---
title: "Profiling Conclusion"
teaching: 0
teaching: 5
exercises: 0
---

19 changes: 14 additions & 5 deletions profiling-functions.md
@@ -1,7 +1,7 @@
---
title: "Function Level Profiling"
teaching: 0
exercises: 0
teaching: 20
exercises: 20
---

:::::::::::::::::::::::::::::::::::::: questions
@@ -150,9 +150,9 @@ The columns have the following definitions:
|---------|---------------------------------------------------|
| `ncalls` | The number of times the given function was called. |
| `tottime` | The total time spent in the given function, excluding child function calls. |
| `percall` | The average tottime per function call (`tottime`/`percall`). |
| `percall` | The average tottime per function call (`tottime`/`ncalls`). |
| `cumtime` | The total time spent in the given function, including child function calls. |
| `percall` | The average cumtime per function call (`cumtime`/`percall`). |
| `percall` | The average cumtime per function call (`cumtime`/`ncalls`). |
| `filename:lineno(function)` | The location of the given function's definition and its name. |

This output can often exceed the terminal's buffer length for large programs and can be unwieldy to parse, so the package `snakeviz` is often utilised to provide an interactive visualisation of the data when exported to file.
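
The columns above can be reproduced on a toy program with the standard library alone. The following is a minimal sketch, assuming two hypothetical functions `parent` and `child` (they are not part of the lesson's exercises); it collects a profile with `cProfile` and prints the table via `pstats`, sorted by `cumtime`.

```python
import cProfile
import io
import pstats

def child(n):
    # Leaf function: it calls nothing else, so its tottime and cumtime should be close
    total = 0
    for i in range(n):
        total += i * i
    return total

def parent():
    # Time spent inside child counts towards parent's cumtime, not its tottime
    return [child(10_000) for _ in range(50)]

profiler = cProfile.Profile()
result = profiler.runcall(parent)

# Write the table to a string buffer so it can be printed or inspected
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats()
report = buffer.getvalue()
print(report)
```

In the printed table, `child` should show an `ncalls` of 50, and dividing its `tottime` by that count gives the first `percall` value.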
@@ -394,7 +394,7 @@ The value of `cities` should be a positive integer, this algorithm has poor scal
:::::::::::::::::::::::: hint

- If a hotspot isn't visible with the argument `1`, try increasing the value.
- If you think you identified the hotspot with your first profile, try investigating how the value of `int` affects the hotspot within the profile.
- If you think you identified the hotspot with your first profile, try investigating how the value of `cities` affects the hotspot within the profile.

:::::::::::::::::::::::::::::::::

@@ -415,6 +415,15 @@ Other boxes within the diagram correspond to the initialisation of imports, or i

The default configuration of the Predator Prey model takes around 10 seconds to run, it may be slower on other hardware.

Download the pre-generated `cProfile` output, this can be opened with `snakeviz` to save waiting for the profiler.


* <a href="files/pred-prey/predprey_out.prof" download>files/pred-prey/predprey_out.prof</a>

```sh
python -m snakeviz predprey_out.prof
```
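
If `snakeviz` is not to hand, a profile file written by `cProfile` can also be summarised as text with `pstats`. The sketch below is only an illustration: it writes its own small profile to a temporary file (the name `example_out.prof` and the function `busy_work` are made up for this example) and then loads it back, which is the same step that would be applied to a downloaded `.prof` file, assuming it was produced by `cProfile` in the usual way.

```python
import cProfile
import io
import os
import pstats
import tempfile

def busy_work():
    # Hypothetical workload, standing in for a real model run
    return sorted(range(200_000), key=lambda value: -value)

with tempfile.TemporaryDirectory() as tmp_dir:
    prof_path = os.path.join(tmp_dir, "example_out.prof")

    # Profile the workload and save the raw statistics to file
    profiler = cProfile.Profile()
    profiler.runcall(busy_work)
    profiler.dump_stats(prof_path)

    # Load the saved file and print the ten entries with the highest tottime
    buffer = io.StringIO()
    loaded = pstats.Stats(prof_path, stream=buffer)
    loaded.strip_dirs().sort_stats("tottime").print_stats(10)
    summary = buffer.getvalue()

print(summary)
```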

:::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::: challenge

4 changes: 2 additions & 2 deletions profiling-introduction.md
@@ -1,7 +1,7 @@
---
title: "Introduction to Profiling"
teaching: 0
exercises: 0
teaching: 15
exercises: 10
---

:::::::::::::::::::::::::::::::::::::: questions
16 changes: 14 additions & 2 deletions profiling-lines.md
@@ -1,7 +1,7 @@
---
title: "Line Level Profiling"
teaching: 0
exercises: 0
teaching: 20
exercises: 30
---

:::::::::::::::::::::::::::::::::::::: questions
@@ -400,6 +400,18 @@ As this is a reference implementation of a classic sorting algorithm we are unli
:::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::: instructor

Download the pre-generated `line_profiler` output; this can be opened to save waiting for the profiler.


* <a href="files/pred-prey/predprey.py.lprof" download>files/pred-prey/predprey.py.lprof</a>

```sh
python -m line_profiler -rm predprey.py.lprof
```

:::::::::::::::::::::::::::::::::::::::::::::
::::::::::::::::::::::::::::::::::::: challenge

## Exercise 2: Predator Prey
7 changes: 6 additions & 1 deletion setup.md
@@ -20,12 +20,17 @@ Download the [data zip file](https://example.com/FIXME) and unzip it to your Des

This course uses Python and was developed using Python 3.11, therefore it is recommended that you have a Python 3.11 or newer environment.

<!-- Todo suggest using a venv?-->
You may want to create a new Python virtual environment for the course. This can be done with your preferred Python environment manager (e.g. `conda`, `pipenv`); the required packages can all be installed via `pip`.

<!-- conda create -n pando python
conda activate pando -->

The non-core Python packages required by the course are `pytest`, `snakeviz`, `line_profiler`, `numpy` and `matplotlib` which can be installed via `pip`.

```sh
pip install pytest snakeviz line_profiler[all] numpy matplotlib
```
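
One way to confirm the environment is ready is a short check that each package can be found by the interpreter. This is a sketch rather than part of the course materials; the helper `missing_packages` is a hypothetical name, and the list simply mirrors the packages named above.

```python
import importlib.util
import sys

# Top-level import names of the packages listed in the setup instructions
REQUIRED = ["pytest", "snakeviz", "line_profiler", "numpy", "matplotlib"]

def missing_packages(names):
    # find_spec returns None when a top-level module cannot be located
    return [name for name in names if importlib.util.find_spec(name) is None]

if sys.version_info < (3, 11):
    print("Python 3.11 or newer is recommended for this course.")

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```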

To complete some of the exercises you will need to use a text-editor or Python IDE, so make sure you have your favourite available.

:::::::::::::::::::::::::::::::::::::::::::::::::::
8 changes: 8 additions & 0 deletions short-break1.md
@@ -0,0 +1,8 @@
---
title: Break
teaching: 0
exercises: 0
break: 15
---

Take a break. If you can, move around and look at something away from your screen to give your eyes a rest and a chance to absorb the content covered so far.
