better PNG filtering

cédric (dev) said on 09/29/2017

pingo 0.91c comes with better guessing for filter selection. this improvement lets pingo be more accurate at the cost of a bit of speed. the good news is that it affects profiles from -s6 upward
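
for readers unfamiliar with filter guessing: a common approach is the "minimum sum of absolute differences" heuristic suggested in the PNG specification. the sketch below only illustrates that generic heuristic; it is not pingo's code, and the function names are made up for the example.

```python
def paeth(a: int, b: int, c: int) -> int:
    """Paeth predictor as defined in the PNG specification."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    if pb <= pc:
        return b
    return c


def filter_row(ftype: int, row: bytes, prev: bytes, bpp: int) -> bytes:
    """Apply PNG filter type 0..4 to one scanline; prev is the row above."""
    out = bytearray(len(row))
    for i, x in enumerate(row):
        a = row[i - bpp] if i >= bpp else 0    # pixel to the left
        b = prev[i]                            # pixel above
        c = prev[i - bpp] if i >= bpp else 0   # pixel above-left
        if ftype == 0:      # None
            pred = 0
        elif ftype == 1:    # Sub
            pred = a
        elif ftype == 2:    # Up
            pred = b
        elif ftype == 3:    # Average
            pred = (a + b) // 2
        else:               # Paeth
            pred = paeth(a, b, c)
        out[i] = (x - pred) & 0xFF
    return bytes(out)


def cost(filtered: bytes) -> int:
    """Sum of absolute values, reading each byte as a signed 8-bit number."""
    return sum(v if v < 128 else 256 - v for v in filtered)


def choose_filters(rows: list[bytes], bpp: int) -> list[int]:
    """Pick, for every scanline, the filter type with the lowest cost."""
    prev = bytes(len(rows[0]))  # the row above the first row is all zeros
    chosen = []
    for row in rows:
        best = min(range(5), key=lambda f: cost(filter_row(f, row, prev, bpp)))
        chosen.append(best)
        prev = row
    return chosen


gradient = bytes(range(16))
print(choose_filters([gradient, gradient], bpp=1))
```

the heuristic is cheap because it never runs deflate; a more accurate guess has to do more work per row, which is where a speed cost like the one described above comes from.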

pingo 0.91f
original.png, FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit
tool     version  binary            options                              time    saved
ECT      df17d6e  64-bit (SSE4)     -9 -s --mt-deflate=4                 2.79s   28.45 KB
ECT      df17d6e  64-bit (SSE4)     -9 -s --mt-deflate=4 --allfilters-c  4.01s   30.42 KB
ECT      df17d6e  64-bit (SSE4)     -9 -s --mt-deflate=4 --allfilters    36.32s  36.50 KB
ECT      df17d6e  64-bit (SSE4)     -5 -s --mt-deflate=4 --allfilters    6.99s   36.26 KB
Leanify  3460f4f  64-bit (default)                                       13.26s  32.62 KB
pingo    0.91     64-bit (default)  -s8                                  1.37s   28.40 KB
pingo    0.91f    64-bit (default)  -s6                                  1.21s   36.26 KB
pingo    0.91f    64-bit (default)  -s7                                  1.44s   36.52 KB
pingo    0.91f    64-bit (default)  -s8                                  1.73s   36.54 KB
pingo    0.92     64-bit (default)  -s6                                  0.92s   36.24 KB
pingo    0.92     64-bit (default)  -s7                                  1.09s   36.51 KB
pingo    0.92     64-bit (default)  -s8                                  1.41s   36.58 KB

it affects some samples more than others, most of the time with minor benefits (or none at all). a brute-force approach like the one in ECT could probably go further (and/or ECT should compress more on other samples)
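
since results vary this much per sample, it helps to measure time and saved size on your own files. below is a rough sketch of such a measurement; it is not the harness used for the tables here. it assumes the optimizer takes the file path as its last argument and rewrites the file in place, and `benchmark` is a hypothetical helper name.

```python
import os
import shutil
import subprocess
import tempfile
import time


def benchmark(command: list[str], source_png: str) -> tuple[float, float]:
    """Run `command <copy of source_png>` and return (seconds, KB saved).

    Assumption: the tool overwrites the file it is given. The original
    file is never touched because the tool only sees a temporary copy.
    """
    with tempfile.TemporaryDirectory() as tmp:
        work = os.path.join(tmp, os.path.basename(source_png))
        shutil.copyfile(source_png, work)
        before = os.path.getsize(work)
        start = time.perf_counter()
        subprocess.run(command + [work], check=True, stdout=subprocess.DEVNULL)
        elapsed = time.perf_counter() - start
        after = os.path.getsize(work)
    return elapsed, (before - after) / 1024
```

a call could look like `benchmark(["pingo", "-s8"], "original.png")`, provided pingo is on the PATH. a single run is noisy, so repeating it a few times and keeping the best time gives steadier numbers.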

cédric (dev) said on 10/09/2017

pingo 0.92 uses an improved solution that offers a better speed/size ratio than 0.91r. regarding filesize, the result may be slightly smaller or slightly bigger, but it should be faster in most cases. note that the following test was done on another computer (slower and 32-bit)

pingo 0.91r / 0.92
original.png, G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool   version              binary            options  time    saved
pingo  0.91f (up to 0.91r)  32-bit (default)  -s8      2.51s   36.54 KB
pingo  0.92                 32-bit (default)  -s8      1.85s   36.58 KB
pingo  0.92d                32-bit (default)  -s8      1.65s   36.58 KB
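
to put the speed/size trade-off into numbers, the three rows of the 32-bit test on original.png can be compared against the 0.91f baseline. this is just arithmetic on the figures listed in this post; `compare` is a helper name invented for the example.

```python
def compare(old: tuple[float, float], new: tuple[float, float]) -> tuple[float, float]:
    """Return (time change in %, change in KB saved) going from old to new.

    Each run is a (seconds, KB saved) pair.
    """
    old_time, old_saved = old
    new_time, new_saved = new
    return (new_time - old_time) / old_time * 100, new_saved - old_saved


# (seconds, KB saved) for original.png at -s8 on the 32-bit machine
runs = {"0.91f": (2.51, 36.54), "0.92": (1.85, 36.58), "0.92d": (1.65, 36.58)}

for version in ("0.92", "0.92d"):
    time_pct, saved_kb = compare(runs["0.91f"], runs[version])
    print(f"{version}: {time_pct:+.1f}% time, {saved_kb:+.2f} KB saved")
# 0.92: -26.3% time, +0.04 KB saved
# 0.92d: -34.3% time, +0.04 KB saved
```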

new filter, sample 1

pingo 0.92, new filter: worse compression, faster speed
girl-blue-original.png, G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool     version  binary            options                              time    saved
ECT      9290a5c  32-bit (SSE2)     -1                                   0.28s   2.61 KB
TruePNG  0.6.2.2  32-bit (default)  -a0                                  1.03s   3.39 KB
pingo    0.92     32-bit (default)  -s0                                  0.19s   5.40 KB
TruePNG  0.6.2.2  32-bit (default)  -a1                                  4.30s   6.53 KB
TruePNG  0.6.2.2  32-bit (default)  -a1 -fe                              14.66s  6.56 KB
pingo    0.92     32-bit (default)  -s1                                  0.31s   8.72 KB
ECT      9290a5c  32-bit (SSE2)     -5 -s --allfilters-c --mt-deflate=4  2.37s   11.97 KB
pingo    0.92     32-bit (default)  -s8                                  1.62s   12.14 KB
pingo    0.92d    32-bit (default)  -s8                                  1.50s   12.14 KB
pingo    0.91     32-bit (default)  -s8                                  2.40s   12.53 KB
ECT      9290a5c  32-bit (SSE2)     -9 -s --allfilters-c --mt-deflate=4  5.03s   12.62 KB
ECT      9290a5c  32-bit (SSE2)     -5 -s --allfilters --mt-deflate=4    8.42s   12.76 KB

new filter, sample 2

pingo 0.92, new filter: better compression, faster speed
tweet-original.png, G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool     version  binary            options                              time    saved
ECT      9290a5c  32-bit (SSE2)     -1                                   0.19s   1.27 KB
TruePNG  0.6.2.2  32-bit (default)  -a0                                  0.44s   1.61 KB
TruePNG  0.6.2.2  32-bit (default)  -a1                                  1.59s   1.85 KB
TruePNG  0.6.2.2  32-bit (default)  -a1 -fe                              5.37s   1.85 KB
pingo    0.92     32-bit (default)  -s0                                  0.11s   2.96 KB
pingo    0.92     32-bit (default)  -s1                                  0.14s   3.84 KB
ECT      9290a5c  32-bit (SSE2)     -5 -s --allfilters-c --mt-deflate=4  1.36s   4.84 KB
pingo    0.91     32-bit (default)  -s8                                  0.97s   4.91 KB
ECT      9290a5c  32-bit (SSE2)     -5 -s --allfilters --mt-deflate=4    3.72s   4.93 KB
ECT      9290a5c  32-bit (SSE2)     -9 -s --allfilters-c --mt-deflate=4  2.22s   4.96 KB
pingo    0.92     32-bit (default)  -s8                                  0.92s   5.20 KB
pingo    0.92d    32-bit (default)  -s8                                  0.58s   5.20 KB

cédric (dev) said on 10/20/2017

i am currently testing another trick to run this even faster. this *should* not affect filesize (or not by much, at least) because it is just a speed optimization

pingo 0.92d, new filter: faster speed, same results
chrome-original.png, G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool   version  binary            options  time    saved
pingo  0.92c    32-bit (default)  -s6      1.44s   19.72 KB
pingo  0.92d    32-bit (default)  -s6      1.20s   19.72 KB
pingo  0.92c    32-bit (default)  -s7      1.65s   20.02 KB
pingo  0.92d    32-bit (default)  -s7      1.40s   20.02 KB
pingo  0.92c    32-bit (default)  -s8      1.90s   20.07 KB
pingo  0.92d    32-bit (default)  -s8      1.68s   20.07 KB

pingo 0.92d, new filter: faster speed, slightly worse results with -s6
72x RGBA, G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool   version  binary            options  time    saved
pingo  0.92c    32-bit (default)  -s6      7.00s   218.59 KB
pingo  0.92d    32-bit (default)  -s6      5.68s   218.57 KB
pingo  0.92c    32-bit (default)  -s7      8.39s   221.03 KB
pingo  0.92d    32-bit (default)  -s7      7.13s   221.03 KB
pingo  0.92c    32-bit (default)  -s8      9.25s   221.49 KB
pingo  0.92d    32-bit (default)  -s8      8.14s   221.49 KB

pingo 0.92d, new filter: faster speed, slightly better results with -s6, -s7, -s8
10x H-RGBA, G1820 @ 2.7 GHz - 2 GB RAM - Windows 7 32-bit
tool   version  binary            options  time    saved
pingo  0.92c    32-bit (default)  -s6      9.67s   187.65 KB
pingo  0.92d    32-bit (default)  -s6      8.28s   187.80 KB
pingo  0.92c    32-bit (default)  -s7      11.54s  196.33 KB
pingo  0.92d    32-bit (default)  -s7      10.22s  196.50 KB
pingo  0.92c    32-bit (default)  -s8      14.59s  201.40 KB
pingo  0.92d    32-bit (default)  -s8      13.26s  201.44 KB

64-bit

pingo 0.92d (better than original zopfli on this sample)
chrome-original.png, FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit
tool          version  binary            options                time    saved
PNGOptimizer  2d68934  64-bit (default)                         0.57s   no savings
ECT           df17d6e  64-bit (SSE4)     -1                     0.35s   2.94 KB
Leanify       3460f4f  64-bit (default)                         16.70s  13.29 KB
ZopfliPNG     64c6f36  64-bit (default)  -m -lossy_transparent  17.60s  13.39 KB
pingo         0.92d    64-bit (default)  -s0                    0.20s   13.67 KB

cédric (dev) said on 02/02/2018

improvements in 0.95m for lower profiles (-s1, -s2 and -s3)

pingo 0.95k / 0.95m, FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit (each cell: saved, time)
-s1 profile
corpus       12x RGBA          72x RGBA          10x H-RGBA        10x RGBA
pingo 0.95k  356.54 KB  1.82s  193.60 KB  0.87s  157.29 KB  1.90s  25.44 KB  0.50s
pingo 0.95m  356.54 KB  1.69s  193.83 KB  0.83s  157.29 KB  1.75s  25.44 KB  0.45s
-s2 profile
corpus       12x RGBA          72x RGBA          10x H-RGBA        10x RGBA
pingo 0.95k  365.60 KB  1.92s  200.60 KB  1.03s  169.75 KB  2.09s  29.49 KB  0.54s
pingo 0.95m  365.60 KB  1.86s  200.91 KB  0.98s  169.75 KB  1.98s  29.49 KB  0.55s
-s3 profile
corpus       12x RGBA          72x RGBA          10x H-RGBA        10x RGBA
pingo 0.95k  370.79 KB  2.63s  211.90 KB  2.21s  176.59 KB  2.80s  32.08 KB  1.15s
pingo 0.95m  370.79 KB  2.52s  212.52 KB  2.14s  176.59 KB  2.60s  32.08 KB  1.12s

more accurate detection

pingo 0.95k / 0.95m
original.png, FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit
version  detection  options  time    saved
0.95k    wrong      -s1      0.21s   27.17 KB
0.95m    good       -s1      0.35s   35.17 KB
0.95k    wrong      -s2      0.30s   27.61 KB
0.95m    good       -s2      0.46s   35.64 KB
0.95k    wrong      -s3      0.57s   27.86 KB
0.95m    good       -s3      0.77s   35.98 KB

pingo 0.95k / 0.95m
big-twitter.png, FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit
version  detection  options  time    saved
0.95k    wrong      -s1      0.30s   0.00 KB
0.95m    good       -s1      0.56s   17.00 KB
0.95k    wrong      -s2      0.59s   0.00 KB
0.95m    good       -s2      0.97s   24.39 KB
0.95k    wrong      -s3      0.59s   0.00 KB
0.95m    good       -s3      0.96s   24.39 KB

cédric (dev) said on 02/03/2018

improvements in 0.95p for -s0: it auto-selects fast RGB transformations

pingo 0.95m / 0.95p, FX-4100 @ 3.6 GHz - 8 GB RAM - Windows 7 64-bit (each cell: saved, time)
-s0 profile
corpus       12x RGBA          72x RGBA          10x H-RGBA        10x RGBA
pingo 0.95m  302.02 KB  1.23s  173.83 KB  0.58s  108.55 KB  1.12s  14.15 KB  0.32s
pingo 0.95p  303.43 KB  1.21s  175.16 KB  0.63s  113.60 KB  1.05s  14.26 KB  0.34s
