# Compression

Compression means encoding [data](data.md) (such as images or texts) in a different way so that the data takes less space (memory) while keeping all the important [information](information.md), or, in plain terms, it usually means "making files smaller". Compression is pretty important as it helps us utilize memory or [bandwidth](bandwidth.md) well -- without it our hard drives would be able to store just a handful of videos, internet would be slow as hell due to the gigantic amount of transferred data and our [RAM](ram.md) wouldn't suffice for things we normally do. There are many [algorithms](algorithm.md) for compressing various kinds of data, differing by their complexity, performance, efficiency of compression etc. The reverse process to compression (getting the original data back from the compressed data) is called **decompression**. The ratio of the original data size to compressed data size is called **compression ratio** (the higher, the better). The science of data compression is truly huge and complicated AF, here we'll just mention some very basics. Also watch out: compression algorithms are often a [patent](patent.md) mine field.

{ CORRECTION NOTE: Allow me to interject -- I used to have compression ratio defined here as compressed to original, then noticed it's usually defined as a reciprocal of that, corrected it now. There seems to be some general confusion though, some actually define it as "space saved", i.e. *1 - compressed / original*. Doesn't matter that much anyway, but it's probably better to stick to an established convention. ~drummyfish }
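The two conventions mentioned in the note can be written down as a quick sketch. The function names here are made up for illustration, and it is assumed both sizes are nonzero:

```c
#include <stddef.h>

/* Compression ratio as defined in this article: original size divided by
   compressed size (higher is better). Assumes compressedSize != 0. */
double compressionRatio(size_t originalSize, size_t compressedSize)
{
  return ((double) originalSize) / ((double) compressedSize);
}

/* The alternative "space saved" metric: 1 - compressed / original.
   Assumes originalSize != 0. */
double spaceSaved(size_t originalSize, size_t compressedSize)
{
  return 1.0 - ((double) compressedSize) / ((double) originalSize);
}
```

For example a 1000 byte file compressed to 250 bytes has a compression ratio of 4 and space saved of 0.75 (75%).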

{ I've now written a tiny LRS compression library/utility called [shitpress](shitpress.md). It's fewer than 200 LOC, so plain it can nicely serve educational purposes. The principle is simple, kind of a dictionary method, where the dictionary is simply the latest output 64 characters; if we find a long word that occurred recently, we simply reference it with mere 2 bytes. It works relatively well for most data! ~drummyfish }

{ There is a cool compressing competition known as Hutter Prize that offers prize money from a fund of 500000 euros to anyone who can break the current record for compressing [Wikipedia](wikipedia.md). Currently the record is at compressing 1GB down to 115MB. See http://prize.hutter1.net for more. ~drummyfish }

{ [LMAO](lmao.md) retard [patents](patent.md) are being granted on impossible compression algorithms, see e.g. http://gailly.net/05533051.html. See also [Sloot Digital Coding System](sloot.md), a miraculous compression algorithm that "could store a whole movie in 8 KB" lol. ~drummyfish }

We should also mention compression is not applied just to files on hard drives, it may just as well be used let's say in [RAM](ram.md) to utilize it more efficiently. [OpenGL](opengl.md) for instance offers the option to compress textures uploaded to the [GPU](gpu.md) to save space.

As for computational [complexity](complexity.md), it is mostly safe to assume that compression will be more demanding than decompression in terms of resources, and sometimes it's possible to dedicate more resources (time, memory, electricity, ...) to achieve a better compression ratio, i.e. we can "try harder" to "compress the file more". Whereas a compressed file can always be decoded only in one way (to obtain the original file) and the decompression process is normally quite fast and straightforward (e.g. "replace symbols with words from a dictionary"), it happens often that a file can be compressed in many different ways, some of which are better (smaller), and seldom there is another way but [brute force](brute_force.md) to find the best one. This asymmetry in cost of compression and decompression can be advantageous though, considering typical scenarios such as distributing compressed video over the Internet: we have to dedicate a lot of [CPU](cpu.md) time to compress the video well, but only once. The video will then be distributed to many clients and we are benefiting from saved bandwidth on every single copy we transfer, and thanks to simplicity of decompression the clients (of which there are many) aren't bothered nearly as much -- the total cost we're collectively paying is much smaller than if compression was cheap and decompression expensive.

Why don't we compress everything? Firstly because compressed data is slow to work with, it requires significant [CPU](cpu.md) time to compress and decompress data, it's a kind of a space-time tradeoff (we gain more storage space for the cost of CPU time). Compression also [obscures](obscurity.md) data, for example compressed text file will typically no longer be human readable, any code wanting to work with such data will have to include the nontrivial decompression code. Compressed data is also more prone to [corruption](corruption.md) because redundant information (which can help restoring corrupted data) is removed from it -- in fact we sometimes purposefully do the opposite of compression and make our data bigger to protect it from corruption (see e.g. [error correcting](error_correction.md) codes, [RAID](raid.md) etc.). And last but not least, many data can hardly be compressed or are so small it's not even worth it.

The basic division of compression methods is into:

- **lossless**: No information contained in the original data will be lost in the compressed data, i.e. the original file can be restored in its entirety from the compressed file.
- **lossy**: Some information contained in the original data is lost during compression, i.e. for example a compressed image will be of slightly worse quality. This usually allows for much greater compression. Lossy compressors usually also apply lossless compression on top of this.

Furthermore we may divide compression for example into offline (compresses a whole file, may take long) and streaming (compressing a stream of input data on-the-go and in real-time), by the type of input data (binary, text, audio, ...), basic principle ([RLE](rle.md), dictionary, "[AI](ai.md)", ...) etc.

What follows is an example of how well different types of compression work for an image (screenshot of main page of Wikimedia Commons, 1280x800):

{ Though the website screenshot contained also real life photos, it still contained a lot of constant color areas which can be compressed very well, hence quite good compression ratios here. A general photo won't be compressed as much. ~drummyfish }

| compression                                         | ~size (KB)| ratio  |
| --------------------------------------------------- | --------- | ------ |
| none                                                | 3000      | 1      |
| general lossless (lz4)                              | 396       | 7.57   |
| general lossless (gzip)                             | 308       | 9.74   |
| image lossless (PNG)                                | 288       | 10.41  |
| image lossless (WEBP)                               | 176       | 17.04  |
| image lossy (JPG), good quality (75%)               | 148       | 20.27  |
| image lossy (JPG), ugly but readable (15%)          | 60        | 50     |

Mathematically there cannot exist a lossless compression algorithm that would always reduce the size of any input data -- if it existed, we could just repeatedly apply it and compress ANY data to zero bytes. And not only that -- **every lossless compression will inevitably enlarge some input files**. This is also mathematically given -- we can see compression as simply mapping input binary sequences to output (compressed) binary sequences, while such mapping has to be one-to-one ([bijective](bijection.md)); it can be simply shown that if we make any such mapping that reduces the size of some input (maps a longer sequence to a shorter one, i.e. compresses it), we will also have to map some short code to a longer one. However we can make it so that our compression algorithm enlarges a file at most by 1 bit: we can say that the first bit in the compressed data says whether the following data is compressed or not; if our algorithm fails to reduce the size of the input, it simply sets the bit to say so and leaves the original file uncompressed (in practice many algorithms don't do this though as they try to work as streaming filters, without random access to data, which would be needed here).
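The "flag" trick can be sketched roughly like this. For simplicity the sketch spends a whole byte on the flag rather than a single bit, and it assumes some compressor (not shown) has already produced its output in the `packed` buffer; all names are hypothetical:

```c
#include <stddef.h>
#include <string.h>

#define FLAG_STORED 0     /* payload is the raw input, untouched */
#define FLAG_COMPRESSED 1 /* payload is the compressor's output  */

/* Writes a 1 byte flag followed by whichever of the two buffers is smaller.
   out must have room for rawLen + 1 bytes. Returns the number of bytes
   written, which is never more than rawLen + 1. */
size_t wrapWithFlag(const unsigned char *raw, size_t rawLen,
  const unsigned char *packed, size_t packedLen, unsigned char *out)
{
  if (packedLen < rawLen)
  {
    out[0] = FLAG_COMPRESSED;
    memcpy(out + 1,packed,packedLen);
    return packedLen + 1;
  }

  out[0] = FLAG_STORED; /* compression didn't help, keep the original */
  memcpy(out + 1,raw,rawLen);
  return rawLen + 1;
}
```

Notice the function needs both complete buffers before it can decide, which is exactly why this doesn't play well with streaming.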

**Dude, how does compression really work tho?** The basic principle of lossless compression is **removing [redundancy](redundancy.md)** ([correlations](correlation.md) in the data), i.e. that which is explicitly stored in the original data but doesn't really have to be there because it can be reasoned out from the remaining data. This is why a completely random [noise](noise.md) can't be compressed -- there is no correlated data in it, nothing to reason out from other parts of the data. However human language for example contains many redundancies. Imagine we are trying to compress English text and have a word such as "computer" on the input -- we can really just shorten it to "computr" and it's still pretty clear the word is meant to be "computer" as there is no other similar English word (we also see that compression algorithm is always specific to the type of data we expect on the input -- we have to know what nature of the input data we can expect). Another way to remove redundancy is to e.g. convert a string such as "HELLOHELLOHELLOHELLOHELLO" to "5xHELLO". Lossy compression on the other hand tries to decide what information is of low importance and can be dropped -- for example a lossy compression of text might discard information about case (upper vs lower case) to be able to store each character with fewer bits; an all caps text is still readable, though less comfortably. A deeper view of compression oftentimes leads to a realization that compression is really a problem of [artificial intelligence](ai.md), for compression is really about prediction and prediction is about understanding -- this is where state-of-the-art view stands.
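To make the "5xHELLO" idea a bit more concrete, here is a minimal sketch of a function (hypothetical name) that finds the shortest block whose repetition makes up the whole string. It only handles the case where the whole input is a clean repetition; a real compressor has to find repetitions anywhere in the data:

```c
#include <string.h>

/* Returns the length of the shortest prefix that, repeated a whole number
   of times, gives the whole string. If there is no such repetition, the
   length of the whole string is returned. */
size_t shortestPeriod(const char *s)
{
  size_t len = strlen(s);

  for (size_t p = 1; p < len; ++p)
  {
    if (len % p != 0) /* the prefix must fit a whole number of times */
      continue;

    size_t i = p;

    while (i < len && s[i] == s[i - p])
      i++;

    if (i == len) /* every character equals the one p places before it */
      return p;
  }

  return len;
}
```

For "HELLOHELLOHELLOHELLOHELLO" this returns 5, so the 25 characters can be stored as the count 25 / 5 = 5 plus the block "HELLO".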

{ A quick intuitive example: [encyclopedias](encyclopedia.md) almost always have at the beginning a list of abbreviations they will use in the definition of terms (e.g. "m.a. -> middle ages", ...), this is so that the book gets shorter and they save money on printing. They compress the text. ~drummyfish }

**OK, but how much can we really compress?** Well, as stated above, there can never be anything such as a universal uber compression algorithm that just makes any input file super small -- everything really depends on the nature of the data we are trying to compress. The more we implicitly know about the nature of the compressed data, the more we can compress it, and this makes very good sense -- that which we already know we don't have to encode and thus the more we know, the less data there has to be (the smaller the compressed file), but also the more we become limited in what we can compress. So a general compression program will compress only a little while an image-specialized compression program will compress better (but will only work with images). If we specifically focus only on compressing English text for instance, we can assume it will only consist of words in the English language and so the compressed text doesn't have to come with English dictionary, but we also won't be able to compress [Chinese](chinese.md) text as a result. For an extreme example consider that **in theory we can make an algorithm that compresses one specific 100GB video down to 1 bit** (we just define that a bit "1" decompresses to this specific video), but it will only work for that one single video, not for video in general -- i.e. we made an extremely specialized compression and got an extremely good compression ratio, however due to such extreme specialization we can almost never use it. As said, we just cannot compress completely random data at all (as we don't know anything about the nature of such data). On the other hand data with a lot of redundancy, such as video, can be compressed A LOT. Similarly video compression algorithms used in practice work only for videos that appear in the real world which exhibit certain patterns, such as two consecutive frames being very similar -- if we try to compress e.g. static (white noise), video codecs just shit themselves trying to compress it (look up e.g. videos of confetti and see how blocky they get). All in all, some compression [benchmarks](benchmark.md) can be found e.g. at https://web.archive.org/web/20110203152015/http://www.maximumcompression.com/index.html -- the following are some approximate typical compression ratios: English text 8.33, image (lossy) 10, executable 4.

Related to the above is a cool [real life](irl.md) example from the movie streaming industry. Part of the cinematic look of many movies is the "movie grain" -- a sort of visually pleasant [analog](analog.md) [noise](noise.md) caused mainly by the physical film the movie is recorded on (in principle similar to the pleasant noise and cracks heard on vinyl records). All kino enjoyers absolutely love this noise, but like we've said, noise cannot be compressed, and so with traditional video codecs the noise is either highly degraded and even lost in encoding (reducing viewing pleasure) OR it's somewhat preserved but for the price of high bitrates and large file sizes. And so engineers have designed a new video codec, [AV1](av1.md), that addresses this issue. The codec analyzes the input video and measures parameters of the "film grain" noise. Then it compresses the video WITHOUT the noise and stores the measured noise parameters along with it (these are just a few numbers that take practically no space). And when the video is to be played back later on, the decoder actually SYNTHESIZES its own noise based on the recorded parameters and overlays it on the video. This is of course brilliant because the result looks visually very similar to the original, even though under a microscope the noise is actually completely different from the original noise. The point is no one is going to be looking at the noise under a microscope, people just want to enjoy the movie, but again, these are now new ASSUMPTIONS we adopted about our input data: the additional assumptions are that the input video has a noise which is important to preserve but it's fine to only preserve its statistical properties and not exact values. This further specializes the codec on this specific kind of movies and using it for other purposes may be ineffective -- it has actually been shown that using AV1 for some animated movies (not a typical kind of "kino" movie) gives WORSE results (in terms of compression ratio) than traditional codecs.

## Methods

The following is an overview of some most common compression techniques.

### Lossless

**[RLE](rle.md) (run length encoding)** is a simple method that stores repeated sequences just as one element of the sequence and number of repetitions, i.e. for example *"abcabcabc"* as *"3abc"*.

**[Entropy](entropy.md) coding** is another common technique which counts the frequencies ([probabilities](probability.md)) of symbols on the input and then assigns the shortest codes to the most frequent symbols, leaving longer codes to the less frequent. The most common such codings are **[Huffman coding](huffman_coding.md)** and **[Arithmetic coding](arithmetic_coding.md)**.
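As a rough illustration of the "frequent symbols get short codes" idea, the following sketch computes only the code lengths Huffman coding would assign to each byte value (not the actual bit codes). It is a deliberately naive version that repeatedly merges the two lightest tree nodes by linear search; the function name and the array layout are just choices made for this example:

```c
#include <stddef.h>

#define SYMBOLS 256

/* Computes Huffman code lengths (in bits) for each byte value occurring in
   data; lengths[b] stays 0 for byte values that don't occur. */
void huffmanCodeLengths(const unsigned char *data, size_t n,
  unsigned char lengths[SYMBOLS])
{
  size_t weight[2 * SYMBOLS]; /* leaves first, then internal nodes */
  int parent[2 * SYMBOLS];
  int alive[2 * SYMBOLS];     /* 1 if the node still waits to be merged */
  int nodeCount = SYMBOLS;
  int remaining = 0;

  for (int i = 0; i < 2 * SYMBOLS; ++i)
  {
    weight[i] = 0;
    parent[i] = -1;
    alive[i] = 0;
  }

  for (size_t i = 0; i < n; ++i) /* count symbol frequencies */
    weight[data[i]]++;

  for (int i = 0; i < SYMBOLS; ++i)
  {
    lengths[i] = 0;

    if (weight[i] > 0)
    {
      alive[i] = 1;
      remaining++;
    }
  }

  if (remaining == 1) /* special case: a single symbol still needs 1 bit */
    for (int i = 0; i < SYMBOLS; ++i)
      if (alive[i])
        lengths[i] = 1;

  while (remaining > 1)
  {
    int a = -1, b = -1; /* the two alive nodes with the smallest weights */

    for (int i = 0; i < nodeCount; ++i)
    {
      if (!alive[i])
        continue;

      if (a < 0 || weight[i] < weight[a])
      {
        b = a;
        a = i;
      }
      else if (b < 0 || weight[i] < weight[b])
        b = i;
    }

    weight[nodeCount] = weight[a] + weight[b]; /* merge them into a new node */
    parent[a] = parent[b] = nodeCount;
    alive[a] = alive[b] = 0;
    alive[nodeCount] = 1;
    nodeCount++;
    remaining--;
  }

  for (int i = 0; i < SYMBOLS; ++i) /* code length = depth of the leaf */
    for (int j = parent[i]; j >= 0; j = parent[j])
      lengths[i]++;
}
```

For the input "aaaabbc" the most frequent symbol 'a' gets a 1 bit code while 'b' and 'c' get 2 bits each, i.e. the text would take 4 * 1 + 2 * 2 + 1 * 2 = 10 bits instead of 7 * 8 = 56 (not counting the code table, which has to be stored too).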

**Dictionary (substitutional) methods** try to construct a dictionary of relatively long symbols appearing in the input and then only store short references to these symbols. The format may for example choose to first store the dictionary and then the actual data with pointers to this dictionary, or it may just store the data in which pointers are stored to previously appearing sequences.

**[Predictor](predictor.md) compression** is based on making a *predictor* that tries to guess following data from previous values (which can be done e.g. in case of pictures, sound or text) and then only storing the difference against such a predicted result. If the predictor is good, we may only store the small amount of the errors it makes.
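The simplest possible predictor just guesses "the next value will be the same as the previous one", which gives so called delta encoding. A minimal sketch (function names are made up for this example) may look like this:

```c
#include <stddef.h>

/* Replaces each sample with its difference against the previous sample
   (the prediction is simply: next value = previous value). The arithmetic
   wraps around modulo 256, so the transform is exactly reversible. */
void deltaEncode(unsigned char *samples, size_t n)
{
  unsigned char prev = 0;

  for (size_t i = 0; i < n; ++i)
  {
    unsigned char current = samples[i];
    samples[i] = (unsigned char) (current - prev);
    prev = current;
  }
}

/* Inverse of deltaEncode: adds the stored differences back up. */
void deltaDecode(unsigned char *samples, size_t n)
{
  unsigned char prev = 0;

  for (size_t i = 0; i < n; ++i)
  {
    samples[i] = (unsigned char) (samples[i] + prev);
    prev = samples[i];
  }
}
```

Note that this step alone doesn't make the data any smaller, it only turns slowly changing values into many small numbers clustered around zero, which a following entropy coder can then store with short codes.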

A famous family of dictionary compression algorithms is **Lempel-Ziv (LZ)** -- these two guys first proposed [LZ77](lz77.md) (1977, sliding window) and [LZ78](lz78.md) (explicitly stored dictionary, 1978). These methods provided a basis for numerous improved/remixed algorithms, most notably [LZW](lzw.md) (1984, Welch). Additionally these algorithms are used and combined in other ones, most notably [gif](gif.md) and [DEFLATE](deflate.md) (used e.g. in gzip and png).
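To give an idea of how the sliding window approach works, here is a toy sketch in the spirit of LZ77. It is NOT the original format nor what DEFLATE does; the 3 byte token layout, the window size and the function names are all simplifications assumed for this example, and the match search is a slow brute force one:

```c
#include <stddef.h>

#define WINDOW 255    /* how far back a reference may point */
#define MAX_MATCH 255 /* longest sequence a single token may copy */

/* Encodes in as a series of 3 byte tokens (offset, length, literal), each
   meaning "copy length bytes starting offset bytes back, then output
   literal". out needs room for 3 * n bytes. Returns bytes written. */
size_t lz77Encode(const unsigned char *in, size_t n, unsigned char *out)
{
  size_t pos = 0, outLen = 0;

  while (pos < n)
  {
    size_t bestLen = 0, bestOff = 0;
    size_t start = pos > WINDOW ? pos - WINDOW : 0;

    for (size_t cand = start; cand < pos; ++cand)
    {
      size_t len = 0;

      /* pos + len < n - 1 keeps one byte left to serve as the literal */
      while (len < MAX_MATCH && pos + len < n - 1 &&
        in[cand + len] == in[pos + len])
        len++;

      if (len > bestLen)
      {
        bestLen = len;
        bestOff = pos - cand;
      }
    }

    out[outLen++] = (unsigned char) bestOff;
    out[outLen++] = (unsigned char) bestLen;
    out[outLen++] = in[pos + bestLen];
    pos += bestLen + 1;
  }

  return outLen;
}

/* Decodes tokens made by lz77Encode, returns the number of bytes written
   (out must be big enough for the whole original data). */
size_t lz77Decode(const unsigned char *in, size_t n, unsigned char *out)
{
  size_t outLen = 0;

  for (size_t i = 0; i + 2 < n; i += 3)
  {
    size_t off = in[i], len = in[i + 1];

    for (size_t j = 0; j < len; ++j, ++outLen) /* byte by byte: may overlap */
      out[outLen] = out[outLen - off];

    out[outLen++] = in[i + 2];
  }

  return outLen;
}
```

With this toy format the 19 byte string "abcabcabcabcabcabcx" encodes to 4 tokens (12 bytes): three plain literals and one reference copying 15 bytes from 3 bytes back. On data without repetitions it will however triple the size, which is one of the things real formats handle much better.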

An approach similar to predictor is searching for a **mathematical [model](model.md) of the data** and storing only the model parameters (which should be a relatively few numbers, compared to storing the data explicitly). For example this can mean [vectorizing](vector_graphics.md) a bitmap image, i.e. finding geometric shapes (such as lines and circles) in the image (a grid of pixels) and then storing the shape parameters rather than pixel values -- this may apparently not be 100% accurate due to noise and more complex shapes, but again if we desire to preserve the data without losses, additional error correction may be applied by storing the small remaining error, which will allow for restoring the image precisely (of course, the error must really be small, otherwise we might fail to actually compress the data, and this all depends on how well our model predicts and "fits"). Similar approach is used in [vocoders](vocoder.md) used in cellphones that attempt to mathematically model human speech (however here the compression is lossy), or in [fractal](fractal.md) compression of images. A nice feature we gain here is the ability to actually "increase the resolution" (or rather generate detail) of the original data -- once we fit a model onto our data, we may use it to tell us values that are not actually present in the data (i.e. we get a fancy [interpolation](interpolation.md)/[extrapolation](extrapolation.md)).

Another property of data to exploit may be its **sparsity** -- if for example we were to compress a gigantic image which prevalently consists of large white areas, we could say that white is the implicit color and we'll only explicitly store pixels of other colors.
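A minimal sketch of exploiting sparsity might look as follows. It assumes a grayscale image with one byte per pixel and white (255) as the implicit background; the struct and function names are hypothetical:

```c
#include <stddef.h>

#define BACKGROUND 255 /* assumed implicit value, i.e. white */

typedef struct
{
  size_t index;        /* position of the pixel in the image */
  unsigned char value; /* its (non-background) value */
} SparsePixel;

/* Stores only the non-background pixels into out (which needs room for n
   entries in the worst case); returns how many were stored. */
size_t sparseEncode(const unsigned char *pixels, size_t n, SparsePixel *out)
{
  size_t count = 0;

  for (size_t i = 0; i < n; ++i)
    if (pixels[i] != BACKGROUND)
    {
      out[count].index = i;
      out[count].value = pixels[i];
      count++;
    }

  return count;
}

/* Rebuilds the full image: fill with background, then place stored pixels. */
void sparseDecode(const SparsePixel *in, size_t count,
  unsigned char *pixels, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    pixels[i] = BACKGROUND;

  for (size_t i = 0; i < count; ++i)
    pixels[in[i].index] = in[i].value;
}
```

Since each stored entry is several times bigger than a plain pixel, this only pays off when the non-background pixels are really rare.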

Some more wild techniques may include [genetic programming](genetic_programming.md) that tries to evolve a small program that reproduces the input data, or using "[AI](ai.md)" in whatever way to compress the data (in fact compression is an essential part of many [neural networks](neural_network.md) as it forces the network to "understand", make sense of the data -- many neural networks therefore internally compress and decompress the data so as to filter out the unimportant information; [large language models](llm.md) are now starting to beat traditional compression algorithms at compression ratios).

Note that many of these methods may be **combined or applied repeatedly** as long as we are getting smaller results.

Furthermore also take a look at [procedural generation](procgen.md), a related technique that allows to embed a practically infinite amount of content with only quite small amount of code.

### Lossy

In lossy compression we generally try to discard information that is not very important and/or to which we aren't very sensitive, typically by dropping precision by [quantization](quantization.md), i.e. basically lowering the number of bits we use to store the "not so important" information -- in some cases we may just drop some information altogether (decrease precision to zero). Furthermore we finally also apply lossless compression to make the result even smaller.

For **images** we usually exploit the fact that human sight is less sensitive to certain visual information, such as specific frequencies, colors, brightness etc. Common methods used here are:

- Convert image from [RGB](rgb.md) to [YUV](yuv.md), leave the Y channel (brightness) as is and reduce resolution of the U and V (color) channels. This works because human eye is less sensitive to color than brightness.
- Convert the image to frequency domain (e.g. with [DCT](dct.md) or some [wavelet transform](wavelet_transform.md)) and quantize (allocate fewer bits to) higher frequencies. This exploits the fact that human eye is less sensitive to higher frequencies. This is the basis of e.g. [jpeg](jpg.md).
- Reduce the number of possible colors -- traditional RGB uses 8 bits for each R, G and B component and so each pixel takes 3 bytes, which allows for about 16 million colors. However using just 2 bytes (65 thousand colors) many times [suffices](good_enough.md) and saves 1/3rd of the size -- see [RGB565](rgb565.md). We may also utilize an image-specific [palette](palette.md) and save the image in indexed mode, i.e. compute a palette of let's say 256 most common colors in the image, then encode the image as the palette plus pixels, of which each will only take one byte! This saves almost 2/3rds of the size. The drop of quality can further be made less noticeable with [dithering](dithering.md).
- Reduce resolution -- plain simple. However this can be made smarter by e.g. trying to detect areas with few details and only reducing the resolution there.
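The color reduction from 3 bytes to 2 bytes per pixel mentioned above can be sketched like this. The bit layout used (red in the highest 5 bits, then 6 bits of green, then 5 bits of blue) is the common RGB565 arrangement, and the function names are made up for the example:

```c
#include <stdint.h>

/* Packs 8 bit R, G and B components into 16 bits: 5 bits of red, 6 of green
   and 5 of blue. The lowest bits of each component are simply thrown away,
   which is what makes this lossy. */
uint16_t rgbTo565(uint8_t r, uint8_t g, uint8_t b)
{
  return (uint16_t) (((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Approximate inverse: the dropped bits are gone for good, so we just fill
   them with zeros. */
void rgbFrom565(uint16_t c, uint8_t *r, uint8_t *g, uint8_t *b)
{
  *r = (uint8_t) (((c >> 11) & 0x1f) << 3);
  *g = (uint8_t) (((c >> 5) & 0x3f) << 2);
  *b = (uint8_t) ((c & 0x1f) << 3);
}
```

For example the color (200,100,50) comes back as (200,100,48) after a round trip: the blue component lost its lowest bits, which will hardly be noticed by the eye.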

In **video** compression we may reuse the ideas from image compression and further employ exploiting temporal redundancy, i.e. the fact that consecutive video frames look similar, so we may only encode some kind of delta (change) against the previous (or even next) frame. The most common way is to fully record only one key frame in some time span (so called I-frame, further compressed with image compression methods), then divide it into small blocks and estimate the movement of those blocks so that they approximately make up the following frames -- we then record only the motion vectors of the blocks. This is why videos look "blocky". In the past [interlacing](interlacing.md) was also used -- only half of each frame was recorded, every other row was dropped; when playing, the frame was interlaced with the previous frame. Another cool idea is keyframe [superresolution](superresolution.md): you store only some keyframes in full resolution and store the rest of the frames in smaller size; during decoding you can use the nearby full scale keyframes to upscale the low res frames (search for matching subblocks in the low res image and match them to those in the big res image).

In **audio** we usually straight remove frequencies that humans can't hear (usually said to be above 20 kHz), for this we again convert the audio from time to frequency domain (using e.g. [Fourier transform](fourier_transform.md)). Furthermore it is very inefficient to store sample values directly -- we rather use so called *differential PCM*, a lossless compression that e.g. stores each sample as a difference against the previous sample (which is usually small and doesn't use up many bits). This can be improved by a predictor, which tries to predict the next values from previous values and then we only save the difference against this prediction. *Joint stereo coding* exploits the fact that human hearing is not so sensitive to the direction of the sound and so e.g. instead of recording both left and right stereo channels in full quality rather records the sum of both and a ratio between them (which can get away with fewer bits). *Psychoacoustics* studies how humans perceive sound, for example so called *masking*: certain frequencies may for example mask nearby (both in frequency and time) frequencies (make them unhearable for humans) so we can drop them. See also [vocoders](vocoder.md). For specific kinds of audio we may further employ more detailed knowledge, for example with instrumental [music](music.md) we can just store the notes that are being played plus instruments that play them, for example with [MIDI](midi.md) -- this format was not made for compression per se, but it does allow us to store music in much smaller size than directly storing audio.

Some **general techniques** can be applied to different types of media, for example denoising. By definition [noise](noise.md) is that which is NOT a useful information and a lot of noise has very random nature, so it cannot be compressed and inflates the data a lot. If we can detect the noise in our data and then remove it, or at least suppress it, we will typically achieve a much better compression ratio. Denoising techniques often involve spectral filtering, i.e. knowing roughly the spectral characteristics of our noise, we can simply remove certain frequencies to remove the noise. Sometimes this may be very easy to implement, for example a quite trivial [median](median.md) filter works very well for removing noise from photographs. Blurring in general removes high frequency noise, but obviously it also blurs everything else, so more advanced techniques involve for example detecting edges and excluding them from the blur effect. It may also happen that we want to preserve the noise for [artistic](art.md) purposes, for example the "movie grain" noise is something we WANT to see in good movies, but here we can still make use of denoising to achieve better compression and preserve the noise as well -- a good example of this is the AV1 video codec that denoises the input video, compresses it AND stores the noise parameters along with the compressed data; the decoder then synthesizes a new noise when playing the video back and overlays it on the video.
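To show how simple the median filter mentioned above can be, here is a sketch of its one dimensional version with a window of 3 samples (for photographs one would normally use a 2D window, e.g. 3x3 pixels; the 1D version and the function names are simplifications for this example):

```c
#include <stddef.h>

/* Returns the median (middle value) of three values. */
unsigned char median3(unsigned char a, unsigned char b, unsigned char c)
{
  if (a > b) /* swap so that a <= b */
  {
    unsigned char t = a;
    a = b;
    b = t;
  }

  if (b > c) /* b = min(b,c) */
    b = c;

  return a > b ? a : b; /* median is max(a,min(b,c)) */
}

/* 1D median filter with a window of 3 samples: each output sample is the
   median of the input sample and its two neighbors. The two border samples
   are just copied. in and out must be different arrays. */
void medianFilter3(const unsigned char *in, unsigned char *out, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    out[i] = (i == 0 || i + 1 == n) ?
      in[i] : median3(in[i - 1],in[i],in[i + 1]);
}
```

E.g. the samples 10 10 200 10 10 12 11 become 10 10 10 10 10 11 11: the lone spike (200) disappears completely and the result contains longer runs of equal values, which is exactly what a following lossless compressor likes.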

TODO: LZW, DEFLATE etc.

## Compression Programs/Utils/Standards

Here is a list of some common compression programs/utilities/standards/formats/etc:

| util/format       | extensions | free? | media         | lossless? | notes                                        |
| ----------------- | ---------- | ----- | ------------- | --------- | -------------------------------------------- |
|[bzip2](bzip2.md)  | .bz2       | yes   | general       | yes       | Burrows-Wheeler alg.                         |
|[flac](flac.md)    | .flac      | yes   | audio         | yes       | super free lossless audio format             |
|[gif](gif.md)      | .gif       |now yes| image/anim.   | no        | limited color palette, patents expired       |
|[gzexe](gzexe.md)  |            | yes   |executable bin.| yes       | makes self-extracting executable             |
|[gzip](gzip.md)    | .gz        | yes   | general       | yes       | by GNU, DEFLATE, LZ77, mostly used by Unices |
|[jpeg](jpeg.md)    | .jpg, .jpeg| yes?  | raster image  | no        | common lossy format, under patent fire       |
|[lz4](lz4.md)      | .lz4       | yes   | general       | yes       | high compression/decompression speed, LZ77   |
|[mp3](mp3.md)      | .mp3       |now yes| audio         | no        | popular audio format, patents expired        |
|[png](png.md)      | .png       | yes   | raster image  | yes       | popular lossless image format, transparency  |
|[rar](rar.md)      | .rar       | NO    | general       | yes       | popular among normies, PROPRIETARY           |
|[vorbis](vorbis.md)| .ogg       | yes   | audio         | no        | was a free alternative to mp3, used with ogg |
|[zip](zip.md)      | .zip       | yes?  | general       | yes       | along with encryption may be patented        |
|[7-zip](7zip.md)   | .7z        | yes   | general       | yes       | more complex archiver                        |

## Code Example

Let's write a simple lossless compression utility in [C](c.md). It will work on binary files and we will use the simplest RLE method, i.e. our program will just shorten continuous sequences of repeating bytes to a short sequence saying "repeat this byte N times". Note that this is very primitive (a small improvement might be actually done by looking for sequences of longer words, not just single bytes), but it somewhat works for many files and demonstrates the basics.

The compression will work like this:

- We will choose some random, hopefully not very frequent byte value, as our special "marker value". Let's say this will be the value 0xF3.
- We will read the input file and whenever we encounter a sequence of 4 or more same bytes in a row, we will output these 3 bytes:
  - the marker value
  - byte whose value is the length of the sequence minus 4
  - the byte to repeat
- If the marker value is encountered in input, we output 2 bytes:
  - the marker value
  - value 0xFF (which we won't be able to use for the length of the sequence)
- Otherwise we just output the byte we read from the input.

Decompression is then quite simple -- we simply output what we read, unless we read the marker value; in such case we look whether the following value is 0xFF (then we output the marker value), else we know we have to repeat the next character this many times plus 4.

For example given input bytes

```
0x11 0x00 0x00 0xAA 0xBB 0xBB 0xBB 0xBB 0xBB 0xBB 0x10 0xF3 0x00
                    \___________________________/      \__/
                       long repeating sequence        marker!
```

Our algorithm will output a compressed sequence

```
0x11 0x00 0x00 0xAA 0xF3 0x02 0xBB 0x10 0xF3 0xFF 0x00
                    \____________/      \_______/
                    compressed seq.   encoded marker
```

Notice that, as stated above in the article, there inevitably exists a "danger" of actually enlarging some files. This can happen if the file contains no sequences that we can compress and at the same time there appear the marker values which actually get expanded (from 1 byte to 2).

The nice property of our algorithm is that both compression and decompression can be streaming, i.e. both can be done in a single pass as a filter, without having to load the file into memory or randomly access bytes in files. Also the memory complexity of this algorithm is constant (RAM usage will be the same for any size of the file) and time complexity is linear (i.e. the algorithm is "very fast").

Here is the actual code of this utility (it reads from stdin and outputs to stdout, a flag `-x` is used to set decompression mode, otherwise it is compressing):

```
#include <stdio.h>

#define SPECIAL_VAL 0xf3 // random value, hopefully not very common

void compress(void)
{
  unsigned char prevChar = 0;
  unsigned int  seqLen = 0;
  unsigned char end = 0;

  while (!end)
  {
    int c = getchar();

    if (c == EOF)
      end = 1;

    if (c != prevChar || c == SPECIAL_VAL || end || seqLen > 200)
    { // dump the sequence
      if (seqLen > 3)
        printf("%c%c%c",SPECIAL_VAL,seqLen - 4,prevChar);
      else
        for (int i = 0; i < seqLen; ++i)
          putchar(prevChar);

      seqLen = 0;
    }

    prevChar = c;
    seqLen++;

    if (c == SPECIAL_VAL)
    {
      // this is how we encode the special value appearing in the input
      putchar(SPECIAL_VAL);
      putchar(0xff);
      seqLen = 0;
    }
  }
}

void decompress(void)
{
  unsigned char end = 0;

  while (1)
  {
    int c = getchar();

    if (c == EOF)
      break;

    if (c == SPECIAL_VAL)
    {
      unsigned int seqLen = getchar();

      if (seqLen == 0xff)
        putchar(SPECIAL_VAL);
      else
      {
        c = getchar();

        for (int i = 0; i < seqLen + 4; ++i)
          putchar(c);
      }
    }
    else
      putchar(c);
  }
}

int main(int argc, char **argv)
{
  if (argc > 1 && argv[1][0] == '-' && argv[1][1] == 'x' && argv[1][2] == 0)
    decompress();
  else
    compress();

  return 0;
}
```

How well does this perform? If we try to let the utility compress its own source code, we get to 1242 bytes from the original 1344, which is not so great -- the compression ratio is only about 1.08 here. We can see why: the only repeating bytes in the source code are the space characters used for indentation -- this is the only thing our primitive algorithm manages to compress. However if we let the program compress its own binary version, we get much better results (at least on the computer this was tested on): the original binary has 16768 bytes while the compressed one has 5084 bytes, which is an EXCELLENT compression ratio of about 3.3! Yay :-)

## See Also

- [procedural generation](procgen.md)
- [minification](minification.md)
