Welcome to Doom9's Forum, THE inplace to be for everyone interested in DVD conversion. Before you start posting please read the forum rules. By posting to this forum you agree to abide by the rules. 
26th February 2016, 11:06  #1  Link 
I'm Siri
Join Date: Oct 2012
Location: Providence, RI
Posts: 2,505

(further) lossless compression with pi?
so π (the ratio of a circle's circumference to its diameter) is (very likely to be) normal (mathematically "normal", not "normal" normal) like, a common sense..
which equals to any combo of 09 will eventually appear somewhere within the infinite digits of π.. well, after typical lossless compression, the statistic redundancy of the compressed data will be like, none and makes further (lossless) compression almost impossible but I guess π could help here, since any random base 10 data will appear somewhere within the digits of π like eventually, an extra compression could be done by the following steps: 1. convert the binary to many base 10 data blocks 2. search the data block in the digits of π, if any match found, mark the matching digit position a1 and the length of data block a2 3. if the size of a1+a2 >= the data block itself, abort the rest steps and start the next data block matching, if not, replace the whole data block with a1 and a2 4. repeat step 13 till all data blocks are matched what do you think?
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated 
26th February 2016, 18:54  #3  Link 
I'm Siri
Join Date: Oct 2012
Location: Providence, RI
Posts: 2,505

Why doesn't step3 work..?
I mean, it's totally possible that the matching position of some data block is so far far away from the decimal mark and it takes even more space to store the pointer than to store the data block directly (something like uint8_t * takes 8 bytes and uint8_t only takes one) I think step3 is required to make sure the process is actually compressing the data not blowing it even fatter
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated 
26th February 2016, 18:55  #4  Link  
Formerly davidh*****
Join Date: Jan 2004
Posts: 2,061

It won't work.
Quote:
Quote:
Last edited by wonkey_monkey; 26th February 2016 at 19:01. 

26th February 2016, 19:44  #5  Link 
I'm Siri
Join Date: Oct 2012
Location: Providence, RI
Posts: 2,505

Well, that's a shame..
How about using it as some kind of data encryption? Like, only take the first 1000 digits of pi Find the closest (not perfect) match in this 1000 digits and convert the data block to position, length and error.. Sounds funny to me
__________________
If I got new ideas, will post here: https://github.com/IFeelBloated 
29th February 2016, 02:18  #6  Link 
Registered User
Join Date: Jul 2007
Posts: 17

You can't losslessly compress random data (in average). A simplified argument goes like this:
Take all 2^n possible different files of size n. Compress them with your random compressor by 1bit. Now you have 2^n files of size n1, but there can be only 2^(n1) different files of size n1, then you can't decompress correctly half of your files. 
29th February 2016, 22:30  #7  Link  
Software Developer
Join Date: Jun 2005
Location: Last House on Slunk Street
Posts: 13,196

Quote:
I think the most important difference between your idea and how dictionary compression is implemented in practice is that you are using a fixed dictionary (i.e. the digits of π), while practical implementations use a dynamic dictionary. The latter means that the dictionary that is being built/updated while the data is being compressed/decompressed. Most often, a "sliding window" buffer, containing the last N symbols that have been compressed/decompressed, is used here. Another difference is that your dictionary has an infinite size (i.e. the digits of π), while piratical application use a bounded size dictionary (e.g. the last N symbols that have been processed). Overall, I think your idea performs worse than the algorithms use in practice, because (a) your dictionary does not adapt to the input data and (b) because of that you will need a VERY large "a1" to find a match  which means "a1+a2" will almost always be bigger than the data block itself. Quote:
If a "perfect" compressor (which makes every input smaller by at least one byte) did exist, you could apply it recursively again and again, until eventually every input gets compress to a single byte (or even zero byte?). Clearly makes no sense
__________________
There was of course no way of knowing whether you were being watched at any given moment. How often, or on what system, the Thought Police plugged in on any individual wire was guesswork. Last edited by LoRd_MuldeR; 29th February 2016 at 22:51. 

Thread Tools  Search this Thread 
Display Modes  

