palette sorting

cédric (dev) said on 03/16/2018

regarding palette sorting, i saw that pingo had a limit on image dimensions. i probably added it to do some tests and simply forgot to remove it in the release. i do not know exactly since which version, probably an old one. so yes, sorting was limited to small images

during this review, i made a very small change in 0.96k that has a small impact on filesize (it *should* be a bit better). the nice point is that it does not have *any* impact on speed, since it is not another trial

i took another look at it because of this picture. it does not seem to be well optimized (there are two IDAT chunks, which suggests it was never optimized), but pingo 0.96g -s8 was still not able to save any bytes, because the rewriting process was skipped

ZephyrPrime_Keyart.png
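
the IDAT hint above can be checked without any image library. the sketch below is only an illustration, not pingo's code: it walks the chunk list of a PNG and counts the IDAT chunks. `iter_chunks`, `count_idat`, `make_chunk` and `tiny_png` are hypothetical helper names, and the 2x2 grayscale test image is made up for the demo.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def iter_chunks(data):
    """Yield (type, payload) for each chunk of a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos + 8 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        yield ctype, data[pos + 8:pos + 8 + length]
        pos += 12 + length  # 4 length + 4 type + payload + 4 CRC


def count_idat(data):
    """Number of IDAT chunks in the file."""
    return sum(1 for ctype, _ in iter_chunks(data) if ctype == b"IDAT")


def make_chunk(ctype, payload):
    """Serialize one chunk: length, type, payload, CRC over type + payload."""
    crc = zlib.crc32(ctype + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)


def tiny_png(split_idat):
    """Build a 2x2 8-bit grayscale PNG, optionally splitting IDAT in two."""
    ihdr = struct.pack(">IIBBBBB", 2, 2, 8, 0, 0, 0, 0)
    raw = b"\x00\x00\xff" + b"\x00\xff\x00"  # filter byte + 2 pixels per row
    stream = zlib.compress(raw, 9)
    parts = [stream[:4], stream[4:]] if split_idat else [stream]
    body = b"".join(make_chunk(b"IDAT", part) for part in parts)
    return PNG_SIGNATURE + make_chunk(b"IHDR", ihdr) + body + make_chunk(b"IEND", b"")


print(count_idat(tiny_png(split_idat=False)))
print(count_idat(tiny_png(split_idat=True)))
```

several IDAT chunks are perfectly valid PNG, so the count is only a hint: it suggests the file came from an encoder that writes its output in pieces and was perhaps never rewritten by an optimizer.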

room for improvement

this sample is interesting because it shows that pingo is not always better in this field and there is probably room for improvement here

palette sorting
zephyrprime_keyart.png - G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool | option | time | savings
ECT 0.8.2 (d14dc93) | -1 | 0.774s | 0
ECT 0.8.2 (d14dc93) | -2 --mt-deflate=4 | 0.961s | 0
ECT 0.8.2 (d14dc93) | -3 --mt-deflate=4 | 1.115s | 0
pingo 0.96m | -s9 | 3.556s | 92 293 bytes
ECT 0.8.2 (d14dc93) | -9 --mt-deflate=4 | 7.339s | 958 bytes
TruePNG 0.6.2.5 | - | 10.185s | 95 566 bytes
ECT 0.8.2 (d14dc93) | -5 --pal_sort=120 --mt-deflate=4 | 43.663s | 0
ECT 0.8.2 (d14dc93) | -5 --pal_sort=120 --allfilters --mt-deflate=4 | 957.482s | 86 955 bytes

at first sight, pingo does not look that bad, but its reduction is far from perfect. it is actually the worst one: if you recompress the *best* previous result of each tool, here is what you should get

recompressed reductions
recompressed content
tool | savings
TruePNG 0.6.2.5 | 124 268 bytes
ECT 0.8.2 (d14dc93) | 103 015 bytes
pingo 0.96m | 95 164 bytes

this is not very surprising: pingo is *very* lazy and makes a minimal effort, unlike ECT, which tries a lot of combinations to find, in this case, a better reduction. an even better one is picked by TruePNG 0.6.2.5, but this is sample related: no tool is better than the others on all samples

palette sorting
72xpaletted - FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit
tool | option | time | savings
ECT 0.8.2 (d14dc93) | -9 -s --allfilters --pal_sort=120 --mt-deflate=4 | 1535.516s | 27 695
TruePNG 0.6.2.5 | -o4 | 42.104s | 25 243
pingo 0.96m | -s8 | 1.33s | 45 822
ECT 0.8.2 (d14dc93) (recompressed) | - | - | 27 695 (same)
TruePNG 0.6.2.5 (recompressed) | - | - | 43 404
pingo (recompressed) | - | - | 46 313

pingo should find a good compromise between how efficient the order will be and how much time it spends finding it. an algorithm, perhaps the genetic one used in pngwolf, could probably do that and select a far better solution than those tools. a better filter heuristic could probably be used too
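
as a rough illustration of why the order matters and why every extra trial costs time, here is a minimal sketch. it is not pingo's, ECT's or pngwolf's algorithm. it assumes an 8-bit paletted image, the PNG Sub filter on every row, and zlib as the size estimator; the three candidate orders, the synthetic test image and all function names are assumptions chosen for the example.

```python
import random
import zlib


def remap(pixels, palette, order):
    """Reorder `palette` by `order` (a list of old indices) and rewrite pixels."""
    new_index = {old: new for new, old in enumerate(order)}
    new_palette = [palette[old] for old in order]
    new_pixels = bytes(new_index[p] for p in pixels)
    return new_pixels, new_palette


def sub_filtered_size(pixels, width):
    """Deflate size of the index stream with PNG filter 1 (Sub) on every row."""
    out = bytearray()
    for start in range(0, len(pixels), width):
        row = pixels[start:start + width]
        out.append(1)        # filter type byte
        out.append(row[0])   # left neighbour of the first pixel is 0
        out.extend((row[i] - row[i - 1]) & 0xFF for i in range(1, len(row)))
    return len(zlib.compress(bytes(out), 9))


def luminance(rgb):
    r, g, b = rgb
    return 299 * r + 587 * g + 114 * b


def candidate_orders(pixels, palette):
    """A few cheap orders to try; each one costs a full compression trial."""
    n = len(palette)
    counts = [0] * n
    for p in pixels:
        counts[p] += 1
    yield "original", list(range(n))
    yield "luminance", sorted(range(n), key=lambda i: luminance(palette[i]))
    yield "popularity", sorted(range(n), key=lambda i: -counts[i])


def best_order(pixels, palette, width):
    """Return the name of the smallest candidate and all measured sizes."""
    sizes = {}
    for name, order in candidate_orders(pixels, palette):
        new_pixels, _ = remap(pixels, palette, order)
        sizes[name] = sub_filtered_size(new_pixels, width)
    return min(sizes, key=sizes.get), sizes


def demo():
    rng = random.Random(1)
    levels = 64
    # Made-up test image: a clamped random walk over 64 gray levels.
    level, walk = 32, []
    for _ in range(64 * 64):
        level = min(levels - 1, max(0, level + rng.choice((-1, 1))))
        walk.append(level)
    # Palette stored in a shuffled order: slots[k] is the gray level at index k.
    slots = list(range(levels))
    rng.shuffle(slots)
    palette = [(4 * g, 4 * g, 4 * g) for g in slots]
    index_of = {g: k for k, g in enumerate(slots)}
    pixels = bytes(index_of[g] for g in walk)
    return best_order(pixels, palette, width=64)


name, sizes = demo()
print(name, sizes)
```

on this synthetic input the luminance order should come out well ahead of the shuffled one, because neighbouring pixels then differ by one index and the Sub filter turns them into a tiny alphabet. the sketch uses Sub on purpose: with filter None on 8-bit indices, a reorder is only a byte substitution and deflate barely notices it. a real tool has far more candidates than three, which is exactly where the time/efficiency compromise comes from.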

further optimization

it is possible to find better orders for this specific image. the result of this check depends on many factors: what the initial order is, how strong the estimation is (stronger does not always mean better), how the fully transparent pixels are transformed if there are any, etc.

optimization example - zephyrprime_keyart.png
example | options | time | filesize | savings
WebP (0.6.1) | cwebp -lossless -q 100 -m 6 in.png -o out.webp | 32.977s | 787.60 KB | 81 890 bytes
PNG | pingo with another sorting, not implemented in release | 3.420s | 736.44 KB | 134 276 bytes

as you can see here, PNG still has some potential. the result is even smaller than the lossless WebP thanks to the better reduction (sorting), but the compression is probably still not optimal

cédric (dev) said on 03/19/2018

modifications since 0.96n: a bit slower, but should provide slightly better results

palette optimization, FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit

-s0 profile
corpus | 40x PALETTED | 72x PALETTED | 76x PALETTED | 192x PALETTED
pingo 0.93 | 28.10KB 0.41s | 22.08KB 0.34s | 68.00KB 0.63s | 310.69KB 0.43s
pingo 0.94 | 33.44KB 0.47s | 23.78KB 0.33s | 68.43KB 0.75s | 307.82KB 0.62s
pingo 0.96q | 33.68KB 0.44s | 24.96KB 0.37s | 68.43KB 0.73s | 307.82KB 0.58s

-s1 profile
corpus | 40x PALETTED | 72x PALETTED | 76x PALETTED | 192x PALETTED
pingo 0.93 | 32.12KB 0.58s | 34.05KB 0.54s | 80.05KB 0.99s | 311.18KB 0.45s
pingo 0.94 | 39.50KB 0.59s | 30.16KB 0.40s | 74.97KB 0.89s | 318.45KB 0.73s
pingo 0.96q | 42.42KB 0.62s | 33.44KB 0.42s | 80.77KB 1.11s | 323.21KB 0.87s

-s2 profile
corpus | 40x PALETTED | 72x PALETTED | 76x PALETTED | 192x PALETTED
pingo 0.93 | 35.61KB 1.29s | 39.37KB 0.87s | 88.48KB 2.31s | 289.63KB 0.49s
pingo 0.94 | 45.18KB 1.15s | 36.38KB 0.62s | 90.34KB 1.97s | 327.62KB 0.92s
pingo 0.96q | 46.38KB 1.26s | 38.77KB 0.69s | 90.79KB 2.23s | 329.02KB 1.12s

-s3 profile
corpus | 40x PALETTED | 72x PALETTED | 76x PALETTED | 192x PALETTED
pingo 0.93 | 36.94KB 1.36s | 40.70KB 0.91s | 90.18KB 2.38s | 322.81KB 1.27s
pingo 0.94 | 50.37KB 1.59s | 41.37KB 0.80s | 92.23KB 2.54s | 329.25KB 1.30s
pingo 0.96q | 49.97KB 1.51s | 42.32KB 0.83s | 92.53KB 2.56s | 329.38KB 1.39s

cédric (dev) said on 06/06/2018

i started to make pingo a bit stronger in higher profiles from 0.97b to 0.97d. i improved results a bit at a tiny speed cost for paletted images, but the real change is done in 0.97e. given the structure of this image, pingo 0.97 is not efficient enough here. however, i could not add this approach just to beat other programs

indeed, pingo's estimations are not strong enough to pick the best result between the default and this one. it works on this specific sample, but gives worse results on other sets. however, after some research, i finally found another solution that seems to perform well on most of my sets

why image reduction matters

palette sorting in pingo
zephyrprime_keyart.png - G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
profile | pingo 0.97 | pingo 0.97e
-s0 | 22.71 KB 1.26 s | 66.98 KB 1.56 s
-s1 | 48.22 KB 1.51 s | 87.58 KB 1.82 s
-s2 | 41.35 KB 1.58 s | 87.58 KB 1.82 s
-s3 | 60.10 KB 2.06 s | 95.17 KB 2.67 s
-s4 | 66.93 KB 2.07 s | 94.95 KB 2.66 s
-s5 | 75.30 KB 2.45 s | 98.71 KB 3.08 s
-s6 | 75.30 KB 2.63 s | 106.15 KB 3.57 s
-s7 | 82.42 KB 2.78 s | 110.06 KB 3.75 s
-s8 | 82.42 KB 2.64 s | 116.83 KB 3.91 s
-s9 | 82.42 KB 3.70 s | 122.96 KB 5.22 s
pingo vs other programs
zephyrprime_keyart.png - G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool | option | time | savings
oxipng 1.0.4 | -o 6 --zopfli | 120.111s | 12 bytes
TruePNG 0.6.2.5 | - | 10.185s | 95 566 bytes
pingo 0.97e | -s3 | 2.67s | 97 449 bytes
TruePNG 0.6.2.5 (recompressed) | - | 10.185s + 6.201s | 124 268 bytes
TruePNG 0.6.2.5 | -fe | 35.438s | 98 055 bytes
pingo 0.97e | -s5 | 3.08s | 101 080 bytes
TruePNG 0.6.2.5 (recompressed) | -fe | 35.438s + 6.082s | 125 150 bytes
ECT 0.8.2 (926aa35) | -5 --pal_sort=120 --allfilters --mt-deflate=4 | 957.482s | 86 955 bytes
ECT 0.8.2 (926aa35) (recompressed) | -5 --pal_sort=120 --allfilters --mt-deflate=4 | 957.482s + 6.057s | 103 015 bytes
pingo 0.97e | -s6 | 3.57s | 108 696 bytes
pingo 0.97e | -s9 | 5.22s | 125 910 bytes

even if pingo has better results on this specific sample, that does not mean it performs better on every file. here are results on the sets you can find in this small benchmark:

0.97/0.97e on FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit
profile | 0.97 | 0.97e
-s1 | 479.84 KB 3.23 s | 480.50 KB 3.27 s
-s2 | 508.23 KB 5.99 s | 508.84 KB 6.09 s
-s3 | 512.89 KB 6.40 s | 514.31 KB 6.78 s
-s4 | 522.86 KB 9.22 s | 523.10 KB 9.43 s
-s5 | 524.52 KB 9.54 s | 524.53 KB 9.81 s
-s6 | 525.84 KB 10.53 s | 525.70 KB 10.40 s
-s7 | 526.27 KB 10.77 s | 526.86 KB 11.26 s
-s8 | 526.63 KB 11.19 s | 526.89 KB 11.44 s
-s9 | 527.67 KB 11.48 s | 528.94 KB 12.37 s
corpus (380 files):
72xpaletted - 40xpaletted - 76xpaletted - 192xpaletted

cédric (dev) said on 07/06/2018

i changed something in 0.97.2 which should impact global re-ordering in all profiles. this may have only a minor impact (better or worse) on files, but it should help pingo determine whether the sorting is worth it

paletted samples
sample A sample B

consider those files. if you evaluate them with a metric, it should return zero difference, they have the same filesize, etc. so at this stage, we could think that they are simply identical, but they are not. let's compare pingo -s8

how pingo optimizes palette
version | options | sample A | sample B
0.85 | -s8 | 5.71 KB saved (1.26s) | 6.46 KB saved (1.26s)
0.90 | -s8 | 5.67 KB saved (0.86s) | 6.44 KB saved (0.81s)
0.95 | -s8 | 5.15 KB saved (0.47s) | 0 KB saved (0.41s)
0.97.1 | -s8 | 3.89 KB saved (0.56s) | 0 KB saved (0.42s)
0.97.2 | -s8 | 6.65 KB saved (0.61s) | 6.65 KB saved (0.60s)

0.97.1 is not able to optimize sample B. this passive bug comes from 0.94: it seems that 0.93 (or lower) gives better results here. this ability is restored (and improved) in 0.97.2
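
one way two paletted files can show zero difference with a metric, have the same amount of data, and still not be identical is a different palette order. the post does not say how sample A and sample B actually differ, so the sketch below is only an assumption: it builds a made-up 3-colour image twice, with the palette stored in two orders, and compares the decoded pixels and the stored indices. `decode` and the sample data are hypothetical.

```python
def decode(pixels, palette):
    """Turn palette indices back into RGB triples."""
    return [palette[i] for i in pixels]


# sample A: made-up 3-colour palette and six pixels
palette_a = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
pixels_a = bytes([0, 1, 2, 2, 1, 0])

# sample B: same colours, palette stored in reverse order, indices remapped
order = [2, 1, 0]
palette_b = [palette_a[i] for i in order]
pixels_b = bytes(order.index(p) for p in pixels_a)

same_picture = decode(pixels_a, palette_a) == decode(pixels_b, palette_b)
same_indices = pixels_a == pixels_b
print(same_picture, same_indices)
```

a metric only sees the decoded pixels, so it reports no difference, while an optimizer starts from different index data and a different initial order, which is one of the factors that can change what its sorting finds.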

Lucas Valieri said on 12/06/2018

A little regression

zephyrprime_keyart.png
 
 pingo 0.98.41  -s9     762.482 bytes
 pingo 0.98.41  -s8     768.760 bytes
 -----
 pingo 0.98.42  -s9     768.760 bytes

cédric (dev) said on 12/06/2018

try 0.98.43

pingo
zephyrprime_keyart.png - G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
version | option | time | savings
0.97e | -s9 | 5.22s | 122.96 KB
0.98.41 | -s9 | 5.12s | 122.96 KB
0.98.42 | -s9 | 6.19s | 116.83 KB
0.98.43 | -s9 | 3.93s | 122.96 KB
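
the filesizes in the regression report and the savings in this table can be tied together with a little arithmetic. the original size of zephyrprime_keyart.png is not stated in the thread; the sketch below infers it from the WebP/PNG comparison earlier (filesize plus savings), assumes 1 KB = 1024 bytes, and reads "762.482 bytes" as 762 482 bytes. all names are made up for the example.

```python
KB = 1024  # assumption: the KB figures in the thread use 1024 bytes


def original_size(filesize_kb, savings_bytes):
    """Original size implied by one row of the WebP/PNG comparison."""
    return filesize_kb * KB + savings_bytes


# Both rows should imply the same original file, up to rounding of the KB value.
estimates = [original_size(787.60, 81890), original_size(736.44, 134276)]
original = sum(estimates) / len(estimates)


def savings_kb(filesize_bytes):
    """Savings in KB for a reported output filesize in bytes."""
    return round((original - filesize_bytes) / KB, 2)


print(savings_kb(762482))  # 122.96, the 0.98.41 -s9 result
print(savings_kb(768760))  # 116.83, the 0.98.42 -s9 result
```

under those assumptions, the 0.98.42 -s9 output matches the 116.83 KB that 0.97e reached at -s8, which is consistent with the regression reported above.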
