Commit
Showing 51 changed files with 1,893 additions and 89 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,7 +13,7 @@ jobs: | |
steps: | ||
- name: Dependabot metadata | ||
id: metadata | ||
uses: dependabot/[email protected].4 | ||
uses: dependabot/[email protected].5 | ||
with: | ||
github-token: "${{ secrets.GITHUB_TOKEN }}" | ||
- name: Approve | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -19,12 +19,12 @@ jobs: | |
uses: actions/checkout@v3 | ||
|
||
- name: Upload or update source files to Crowdin | ||
uses: crowdin/[email protected].0 | ||
uses: crowdin/[email protected].2 | ||
with: | ||
upload_sources: true | ||
|
||
- name: Download German translations | ||
uses: crowdin/[email protected].0 | ||
uses: crowdin/[email protected].2 | ||
with: | ||
upload_sources: false | ||
download_translations: true | ||
|
@@ -42,7 +42,7 @@ jobs: | |
config: crowdin.yaml | ||
|
||
- name: Download Spanish translations | ||
uses: crowdin/[email protected].0 | ||
uses: crowdin/[email protected].2 | ||
with: | ||
upload_sources: false | ||
download_translations: true | ||
|
62 changes: 62 additions & 0 deletions
...eldguide/chapters/content/en/coding-compression/sections/shannons-experiment.md
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
# Shannon's experiment | ||
|
||
It turns out that there are limits to how small we can compress a file, and to explore this we’re going to look at multi-million dollar frauds, and a fun game that exposes the limits of compression. | ||
|
||
Every so often someone claims to have invented an amazing {glossary-link term="lossless"}lossless{glossary-link end} compression method that can compress *any* file, including compressed files. If that were true, it would mean that you could apply the method repeatedly to compress files down to just a few {glossary-link term="byte"}bytes{glossary-link end}. Any file could be downloaded in a fraction of a second, and computers could store billions of huge video files. It would revolutionise computing! But is there a limit on how small a file can be compressed? | ||
|
||
{panel type="curiosity"} | ||
|
||
# Fraud in data compression | ||
|
||
A number of fake systems have been produced that claim to compress any file as small as you want. | ||
They have even been demonstrated! But they all turn out to be fake. The common trick is to have a “compression” program that actually just hides the file being compressed somewhere else on the computer and replaces it with a tiny file. The decompression program copies the hidden file back. It’s a very simple program to write, and it looks very impressive because files are replaced with tiny ones and then reproduced exactly. | ||
|
||
You can find a few examples of these if you search for “Pixelon”, “Adam’s platform”, “Near Zero”, or “Madison Priest” (add the terms “compression” and “fraud” if you are searching for these, as there are other legitimate organisations with similar names). Several of these organisations have taken millions of dollars from investors who didn’t understand the limits of compression, and all ended up failing. | ||
|
||
{panel end} | ||
|
||
## How small can we compress a file? | ||
|
||
With {glossary-link term="lossy"}lossy{glossary-link end} compression, there isn’t a limit to how small you can compress a file, since it’s just a matter of giving up quality to make the file smaller. You could compress a 10-megapixel photo down to just one pixel (perhaps the average colour of the whole photo). It wouldn’t be much use to anyone, but technically it’s a lossy version of the original photo. | ||
|
||
But with lossless compression, the original file needs to be able to be restored to exactly its original form. | ||
|
||
We’ve seen that compression works by taking advantage of patterns in the data being compressed. | ||
In the 1950s an interesting experiment was developed by a scientist called Claude Shannon, in which he asked humans to predict English text, and he measured how good the compression would be using their ability to make predictions. | ||
The idea is that if a computer was as good at English as a human, then that might be near the limit of what is possible. | ||
|
||
Shannon’s game is easy to play. | ||
Just click on the letter that you think is coming up next in the sentence (you’ll need to start by guessing the first letter). The number of guesses you make gives an indication of how predictable the letter is. | ||
These guesses are used to estimate how small the data could be compressed -- you can see this estimate by clicking on the “Show statistics” button. | ||
The “bits per character” is the estimate of how many bits would be needed on average to represent each character. | ||
Plain English text is often stored in 7 or 8 bits for each character (using Unicode or ASCII), and you should find that using your predictions the experiment can do better than that, usually around 2 bits per character. | ||
That’s equivalent to compressing a normal file (8 bits per character) to a quarter of its size. | ||
|
||
Try it here: | ||
|
||
{interactive slug="shannon-experiment" type="whole-page" alt="Shannon's experiment"} | ||
|
||
But it’s very hard to get smaller than 1 bit per character (one eighth of the normal size). | ||
Shannon found that this seems to be a limit for how much we can compress English text. | ||
And this is one reason that we should be suspicious of any system that claims to compress English text to much smaller than one eighth of its original size. | ||
|
||
{panel type="teacher-note"} | ||
|
||
# Creating your own experiment | ||
|
||
This interactive contains an option to create your own experiment. | ||
For example, this could be used to tailor the sentence set to use words that are more familiar to your students. | ||
|
||
Additionally we have support for multiple different languages and sentence sets. | ||
Currently, we have a sentence set for Te Reo Māori, and the original sentences used by Shannon in 1951. | ||
|
||
If you would like to use another language with a different set of characters and/or accents, this also works! | ||
When creating a custom sentence, any characters that aren't already on the keyboard get added automatically. | ||
|
||
Lastly, you could also consider using a pattern that students can easily guess once they realise what is happening, such as "AAAAAAAAAAAAAAAA", "ABABABABABAB", or "blah blah blah blah blah blah blah blah blah blah". | ||
These have very close to zero information content as they are very predictable. | ||
At the other extreme, a (fake) password such as "P6dQKg#S58dw66p" could be used to explore how hard it is to guess random characters. | ||
|
||
{panel end} | ||
|
||
{comment - could add more about Shannon, model at sender and received, movie about him} |
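The “bits per character” estimate described in the shannons-experiment.md section above can be illustrated with a small sketch. This commit does not show how the interactive computes its statistic, so the approach here is an assumption: it treats the number of guesses needed for each character as a symbol and takes the entropy of those guess counts, which is one common reading of Shannon's method. The function names `bits_per_character` and `compressed_fraction` are hypothetical and do not come from the repository.

```python
import math
from collections import Counter


def bits_per_character(guess_counts):
    """Estimate bits per character from per-character guess counts.

    guess_counts[i] is the number of guesses the player needed before
    getting character i right (1 means the first guess was correct).
    Assumption: the estimate is the entropy of the guess-count
    distribution, not necessarily what the interactive itself reports.
    """
    if not guess_counts:
        raise ValueError("need at least one guess count")
    total = len(guess_counts)
    frequencies = Counter(guess_counts)
    # Entropy in bits: -sum(p * log2(p)) over the observed guess counts.
    return -sum(
        (n / total) * math.log2(n / total) for n in frequencies.values()
    )


def compressed_fraction(bits, baseline_bits=8):
    """Fraction of the original size, relative to a fixed-width encoding."""
    return bits / baseline_bits
```

If every character is guessed on the first try, the estimate is 0 bits per character, which matches the teacher note's point that patterns like "AAAAAAAAAAAAAAAA" carry almost no information. At 2 bits per character, `compressed_fraction(2.0)` gives 0.25, the "quarter of its size" figure quoted in the text.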
18 changes: 18 additions & 0 deletions
csfieldguide/chapters/migrations/0037_chapter_slug_deferred.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Generated by Django 3.2.16 on 2022-11-24 02:49 | ||
|
||
from django.db import migrations, models | ||
import django.db.models.constraints | ||
|
||
|
||
class Migration(migrations.Migration): | ||
|
||
dependencies = [ | ||
('chapters', '0036_alter_chaptersection_options'), | ||
] | ||
|
||
operations = [ | ||
migrations.AddConstraint( | ||
model_name='chapter', | ||
constraint=models.UniqueConstraint(deferrable=django.db.models.constraints.Deferrable['DEFERRED'], fields=('slug',), name='slug_deferred'), | ||
), | ||
] |
18 changes: 18 additions & 0 deletions
csfieldguide/chapters/migrations/0038_alter_chapter_slug.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Generated by Django 3.2.16 on 2022-11-24 03:18 | ||
|
||
from django.db import migrations, models | ||
|
||
|
||
class Migration(migrations.Migration): | ||
|
||
dependencies = [ | ||
('chapters', '0037_chapter_slug_deferred'), | ||
] | ||
|
||
operations = [ | ||
migrations.AlterField( | ||
model_name='chapter', | ||
name='slug', | ||
field=models.SlugField(), | ||
), | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
"""Module for Django system configuration.""" | ||
|
||
__version__ = "3.12.6" | ||
__version__ = "3.13.0" |