Welcome to Doom9's Forum, THE in-place to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. |
|
26th February 2016, 11:06 | #1 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
(further) lossless compression with pi?
so π (the ratio of a circle's circumference to its diameter) is (very likely to be) normal (mathematically "normal", not "normal" normal) like, a common sense..
which equals to any combo of 0-9 will eventually appear somewhere within the infinite digits of π.. well, after typical lossless compression, the statistic redundancy of the compressed data will be like, none and makes further (lossless) compression almost impossible but I guess π could help here, since any random base 10 data will appear somewhere within the digits of π like eventually, an extra compression could be done by the following steps: 1. convert the binary to many base 10 data blocks 2. search the data block in the digits of π, if any match found, mark the matching digit position a1 and the length of data block a2 3. if the size of a1+a2 >= the data block itself, abort the rest steps and start the next data block matching, if not, replace the whole data block with a1 and a2 4. repeat step 1-3 till all data blocks are matched what do you think? |
26th February 2016, 18:54 | #3 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
Why doesn't step3 work..?
I mean, it's totally possible that the matching position of some data block is so far far away from the decimal mark and it takes even more space to store the pointer than to store the data block directly (something like uint8_t * takes 8 bytes and uint8_t only takes one) I think step3 is required to make sure the process is actually compressing the data not blowing it even fatter |
26th February 2016, 18:55 | #4 | Link | ||
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,496
|
It won't work.
Quote:
Quote:
Last edited by wonkey_monkey; 26th February 2016 at 19:01. |
||
26th February 2016, 19:44 | #5 | Link |
I'm Siri
Join Date: Oct 2012
Location: void
Posts: 2,633
|
Well, that's a shame..
How about using it as some kind of data encryption? Like, only take the first 1000 digits of pi Find the closest (not perfect) match in this 1000 digits and convert the data block to position, length and error.. Sounds funny to me |
29th February 2016, 02:18 | #6 | Link |
Registered User
Join Date: Jul 2007
Posts: 16
|
You can't losslessly compress random data (in average). A simplified argument goes like this:
Take all 2^n possible different files of size n. Compress them with your random compressor by 1-bit. Now you have 2^n files of size n-1, but there can be only 2^(n-1) different files of size n-1, then you can't decompress correctly half of your files. |
29th February 2016, 22:30 | #7 | Link | ||
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,248
|
Quote:
I think the most important difference between your idea and how dictionary compression is implemented in practice is that you are using a fixed dictionary (i.e. the digits of π), while practical implementations use a dynamic dictionary. The latter means that the dictionary that is being built/updated while the data is being compressed/decompressed. Most often, a "sliding window" buffer, containing the last N symbols that have been compressed/decompressed, is used here. Another difference is that your dictionary has an infinite size (i.e. the digits of π), while piratical application use a bounded size dictionary (e.g. the last N symbols that have been processed). Overall, I think your idea performs worse than the algorithms use in practice, because (a) your dictionary does not adapt to the input data and (b) because of that you will need a VERY large "a1" to find a match - which means "a1+a2" will almost always be bigger than the data block itself. Quote:
If a "perfect" compressor (which makes every input smaller by at least one byte) did exist, you could apply it recursively again and again, until eventually every input gets compress to a single byte (or even zero byte?). Clearly makes no sense
__________________
Go to https://standforukraine.com/ to find legitimate Ukrainian Charities 🇺🇦✊ Last edited by LoRd_MuldeR; 29th February 2016 at 22:51. |
||
Thread Tools | Search this Thread |
Display Modes | |
|
|