ImageMagick Resampling

When doing it with mpv isn't enough...

Introduction

Some of you may be familiar with the blog post I made about mpv's scaling filters a while ago, but it was never really meant to be shared as widely as it was. That page started as an assignment I had during my undergrad, when I was formally studying digital image processing for the first time. Naturally, it was full of mistakes and the results weren't particularly scientific. I got a lot of feedback and the page ended up evolving in an organic way, but it still has some fundamental issues that cannot be fixed without a major change in the methodology.

Off the top of my head, I can list the following problems:

  1. The resampling tests were only done for 1 or 2 test images.
  2. I kept adding meme metrics since people requested them.
  3. I also kept adding meme shaders since people requested them.
  4. I created a bunch of different test cases but didn't really have the motivation to keep all of them up to date.
  5. I originally used Matlab to compute the metrics and Excel to plot the tables, which means updating the page was a major pain in the ass.
  6. I used catmull-rom in gamma light to downscale the test images.

So, to address these issues:

  1. I chose to use the entire Manga109 dataset this time, which is probably the best widely-used dataset we have for line-art.
  2. I chose to stick to the standard distortion metrics. I'll think about adding perception metrics to the mix, but only if they have decent Python implementations and actually compute at a decent speed.
  3. I chose to exclude all meme shaders. I'll stick to resampling filters this time. A follow-up with meme shaders isn't out of the question, but that is harder to automate.
  4. I chose to only test a few things. Don't ask me to add your niche use cases because I won't do it.
  5. I chose to automate everything with a Python script so updating the numbers should be trivial (if this is needed at all).
  6. I chose to use the box filter in linear light to downsample this time. Since I'm downsampling to 0.5x this should be a simple average of 4 pixels, which is the best case scenario and doesn't introduce any blurriness or ringing.

With that out of the way, we can proceed with the real introduction.

Resampling

Resampling is the process of changing the number of samples of a discrete signal to obtain a new discrete representation of the underlying continuous signal. This definition comes from the idea of a sensor of some kind producing an analog, continuous-time voltage or current that is periodically sampled and quantised into predefined amplitude levels so we can store it as bits/bytes.

The easiest and most classic way of resampling to a higher sample rate is linear interpolation: if you want to find a value between two points, you can simply draw a line between them. Linear interpolation can also be done on a Cartesian plane, along both axes, which gives us what we call "bilinear" interpolation. Bilinear interpolation is the simplest interpolation algorithm, the easiest to compute and probably the most widespread one.
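
As a minimal sketch of the idea (the function names are hypothetical, not taken from any library), linear interpolation between two samples and its 2-D extension could look like this:

```python
def lerp(a, b, t):
    """Linearly interpolate between a and b; t=0 gives a, t=1 gives b."""
    return a + (b - a) * t


def bilinear(p00, p10, p01, p11, tx, ty):
    """Interpolate inside a 2x2 cell: along x on both rows, then along y."""
    top = lerp(p00, p10, tx)
    bottom = lerp(p01, p11, tx)
    return lerp(top, bottom, ty)


print(lerp(10.0, 20.0, 0.25))                        # 12.5
print(bilinear(0.0, 100.0, 100.0, 200.0, 0.5, 0.5))  # 100.0
```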

But can something as simple as just drawing a line between 2 points give us good results? Sometimes it does, sometimes it doesn't. It really depends on the signal. Instead of taking 2 points and drawing a line, we could take more than 2 points and draw a higher-degree curve. The shape of the curve depends on the weights used in the calculation, and these weights depend on the chosen filter. The number of input samples in the calculation depends on the length/radius/support of the filter. If you want to understand how this is actually done, I suggest simply reading this explanation.

In short, the most common way of resampling images is to treat each row/column as an independent 1-D signal and simply go over all of them until the entire image has been resampled. This means you have to choose a dimension to resample first, but this is pretty much inconsequential to the end result. The resampling algorithm itself is pretty simple: for each output sample you centre the filter on top of it and, after computing the equivalent positions of the input samples, see which of them end up inside the window. Then you multiply these inputs by their corresponding weights, which depend on their distance to the centre, and sum the results.
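
The loop described above can be sketched in plain Python. This is only an illustration under a few assumptions of mine: a triangle filter stands in for the kernel, edges are handled by clamping, weights are renormalised per output sample, and the filter is stretched when downsampling. ImageMagick and mpv may handle each of these details differently.

```python
import math


def triangle(x):
    """Triangle (linear) filter with a support radius of 1."""
    return max(0.0, 1.0 - abs(x))


def resample_1d(samples, out_len, kernel=triangle, radius=1.0):
    """Resample a 1-D signal to out_len samples with the given filter."""
    in_len = len(samples)
    scale = in_len / out_len
    # When downsampling, stretch the filter so it covers enough inputs.
    stretch = max(1.0, scale)
    support = radius * stretch
    out = []
    for i in range(out_len):
        # Position of this output sample in input coordinates (pixel centres).
        centre = (i + 0.5) * scale - 0.5
        lo = math.ceil(centre - support)
        hi = math.floor(centre + support)
        total = weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = kernel((j - centre) / stretch)
            # Clamp out-of-range taps to the nearest edge sample (assumption).
            total += w * samples[min(max(j, 0), in_len - 1)]
            weight_sum += w
        out.append(total / weight_sum)
    return out


def resample_2d(image, out_w, out_h):
    """Orthogonal resampling: every row first, then every column."""
    rows = [resample_1d(row, out_w) for row in image]
    cols = [resample_1d(list(col), out_h) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


print(resample_1d([0.0, 10.0], 4))  # [0.0, 2.5, 7.5, 10.0]
```

Swapping `triangle` for another kernel (and adjusting `radius`) is all it takes to try a different filter in this sketch.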

There's a different method, usually called polar/cylindrical/elliptical resampling, that does the operation in the 2-D domain directly. The only difference here is that all samples that fit inside the 2-D filter are now weighted simultaneously, which may drastically change how some filters behave since we're only calling the filter once per output pixel rather than twice as before.
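
To illustrate the "once rather than twice" point, here is a small sketch that compares the weight a single input sample receives under both schemes. The triangle filter and the function names are assumptions chosen for brevity, not what any particular implementation uses.

```python
import math


def triangle(x):
    """Triangle (linear) filter with a support radius of 1."""
    return max(0.0, 1.0 - abs(x))


def orthogonal_weight(dx, dy, kernel=triangle):
    """Separable weighting: one 1-D filter call per axis, multiplied."""
    return kernel(dx) * kernel(dy)


def polar_weight(dx, dy, kernel=triangle):
    """Polar weighting: a single filter call on the radial distance."""
    return kernel(math.hypot(dx, dy))


# A sample half a pixel away on both axes, i.e. on the diagonal.
print(orthogonal_weight(0.5, 0.5))  # 0.25
print(polar_weight(0.5, 0.5))       # about 0.2929
# On an axis both schemes agree.
print(orthogonal_weight(0.5, 0.0), polar_weight(0.5, 0.0))  # 0.5 0.5
```

The two schemes agree along the axes and diverge on the diagonals, which is one way to see why the same named filter can behave so differently in the two modes.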

Ortho_polar_comparison

On this page I'll include results for both orthogonal and polar resampling. It's important to note that polar resampling implementations are generally slower and most filters weren't really designed to be used in this way (the most "famous" exception being polar lanczos, since we replace the sinc function with its 2-D "equivalent", the jinc/sombrero function).

Upsampling Methodology

As stated before the entire Manga109 dataset will be used in this comparison. This dataset has manga covers that look like this:

Manga109

The dataset is downscaled with:

magick mogrify -colorspace rgb -filter box -resize 50% -colorspace srgb -path low_res inputs/*.png
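
To make the "simple average of 4 pixels" idea concrete, here is a rough single-channel sketch of a 2x box downscale in linear light. It assumes the standard piecewise sRGB transfer function, values in [0, 1] and even image dimensions; it is an illustration, not a reimplementation of what ImageMagick does internally.

```python
def srgb_to_linear(v):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    return v / 12.92 if v <= 0.04045 else ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v):
    """Encode a linear-light value in [0, 1] back to sRGB."""
    return v * 12.92 if v <= 0.0031308 else 1.055 * v ** (1 / 2.4) - 0.055


def box_downscale_2x(image):
    """Halve an image by averaging each 2x2 block in linear light."""
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            block = (image[y][x], image[y][x + 1],
                     image[y + 1][x], image[y + 1][x + 1])
            mean = sum(srgb_to_linear(v) for v in block) / 4.0
            row.append(linear_to_srgb(mean))
        out.append(row)
    return out


# A black/white checkerboard averages to about 0.735 in sRGB,
# not the 0.5 you would get by averaging in gamma light.
print(box_downscale_2x([[0.0, 1.0], [1.0, 0.0]]))
```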

The dataset is then brought back up with the following command for orthogonal resampling:

magick mogrify -colorspace RGB +sigmoidal-contrast 7.5 -filter {resampling_filter} -resize 200% -sigmoidal-contrast 7.5 -colorspace sRGB -path high_res low_res/*.png

And the following command for polar resampling:

magick mogrify -colorspace RGB +sigmoidal-contrast 7.5 -filter {resampling_filter} -distort Resize 200% -sigmoidal-contrast 7.5 -colorspace sRGB -path high_res low_res/*.png

The {resampling_filter} argument is replaced by all available resampling filters: ['Bartlett', 'Blackman', 'Bohman', 'Box', 'Catrom', 'Cosine', 'Cubic', 'Gaussian', 'Hamming', 'Hann', 'Hermite', 'Jinc', 'Kaiser', 'Lagrange', 'Lanczos', 'Lanczos2', 'Lanczos2Sharp', 'LanczosRadius', 'LanczosSharp', 'Mitchell', 'Parzen', 'Point', 'Quadratic', 'Robidoux', 'RobidouxSharp', 'Sinc', 'SincFast', 'Spline', 'CubicSpline', 'Triangle', 'Welch']

For a much better explanation of these filters, please check this page from ImageMagick.
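
The script that drives these commands isn't reproduced here, so the following is only a hypothetical sketch of how the {resampling_filter} substitution could be automated. The function names, the `out_dir` and `in_dir` parameters and the use of `glob` to expand the input list are my assumptions; the flags themselves are copied from the commands above.

```python
import glob
import subprocess


def upscale_command(filter_name, inputs, polar=False, out_dir="high_res"):
    """Build the argv for one upscaling run; mirrors the commands above."""
    resize = ["-distort", "Resize", "200%"] if polar else ["-resize", "200%"]
    return [
        "magick", "mogrify",
        "-colorspace", "RGB", "+sigmoidal-contrast", "7.5",
        "-filter", filter_name, *resize,
        "-sigmoidal-contrast", "7.5", "-colorspace", "sRGB",
        "-path", out_dir, *inputs,
    ]


def run_all(filters, in_dir="low_res"):
    """Run every filter in both modes (requires ImageMagick on PATH)."""
    inputs = sorted(glob.glob(f"{in_dir}/*.png"))
    for name in filters:
        for polar in (False, True):
            subprocess.run(upscale_command(name, inputs, polar), check=True)
            # Compute the metrics for this run here, before the next run
            # overwrites the files in the output directory.


print(" ".join(upscale_command("Lanczos", ["low_res/example.png"], polar=True)))
```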

The result is then evaluated with MAE, PSNR, SSIM and MS-SSIM. All calculations are done with all three RGB channels and in double-precision floating-point for accuracy. Each metric is normalised to the [0, 1] range across all filters (as the tables show, MAE is inverted so that the lowest error scores 1) and the four normalised scores are then averaged together with a normal arithmetic mean.
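
As a sketch of the normalisation step, assuming plain min-max scaling per metric with MAE flipped so that lower is better (this is inferred from the numbers in the tables, not copied from the actual script), the "(N)" and "Mean" columns could be computed like this. The four rows used in the example are taken from the upsampling results and include the best and worst value of every metric.

```python
def normalise(values, lower_is_better=False):
    """Min-max scale values to [0, 1]; optionally flip so the minimum scores 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0] * len(values)
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if lower_is_better else scaled


def mean_scores(table):
    """table maps filter -> (MAE, PSNR, SSIM, MS-SSIM); returns filter -> mean."""
    names = list(table)
    columns = list(zip(*(table[n] for n in names)))
    # Column 0 is MAE, the only metric where lower is better.
    normed = [normalise(col, lower_is_better=(i == 0))
              for i, col in enumerate(columns)]
    return {n: sum(col[k] for col in normed) / len(normed)
            for k, n in enumerate(names)}


rows = {
    "LanczosSharp": (0.0163, 30.3992, 0.9391, 0.9952),
    "Polar_Catrom": (0.0164, 30.2800, 0.9411, 0.9953),
    "Hamming": (0.0163, 30.4113, 0.9384, 0.9950),
    "Cubic": (0.0260, 25.9843, 0.8708, 0.9782),
}
for name, score in mean_scores(rows).items():
    print(f"{name}: {score:.4f}")
```

Because the inputs here are already rounded to four decimals, the printed means only match the table approximately.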

Please note that some of those filters are just aliases to other filters or aliases to sinc/jinc with different windows.

LanczosRadius is supposed to be a jinc-windowed jinc sharpened to have its third zero crossing at 3 instead of 3.2383154841662362.

LanczosSharp and Lanczos2Sharp are both supposed to be sharper jinc-windowed jincs since the original filters ended up too soft when compared to their sinc-windowed sinc orthogonal equivalents.

Upsampling Results

Filter MAE PSNR SSIM MS-SSIM MAE (N) PSNR (N) SSIM (N) MS-SSIM (N) Mean
LanczosSharp 0.0163 30.3992 0.9391 0.9952 1.0000 0.9973 0.9716 0.9944 0.9908
Polar_Catrom 0.0164 30.2800 0.9411 0.9953 0.9860 0.9704 1.0000 1.0000 0.9891
Hamming 0.0163 30.4113 0.9384 0.9950 0.9932 1.0000 0.9621 0.9826 0.9845
Cosine 0.0164 30.3786 0.9378 0.9950 0.9859 0.9926 0.9536 0.9796 0.9779
Polar_Lagrange 0.0166 30.3581 0.9396 0.9949 0.9686 0.9880 0.9784 0.9766 0.9779
Welch 0.0164 30.3915 0.9376 0.9949 0.9834 0.9955 0.9497 0.9781 0.9767
Lanczos 0.0164 30.3221 0.9379 0.9949 0.9837 0.9799 0.9550 0.9791 0.9744
LanczosRadius 0.0164 30.3221 0.9379 0.9949 0.9837 0.9799 0.9550 0.9791 0.9744
Hann 0.0165 30.2987 0.9377 0.9948 0.9779 0.9746 0.9515 0.9716 0.9689
Kaiser 0.0165 30.2733 0.9378 0.9948 0.9784 0.9688 0.9533 0.9731 0.9684
Bartlett 0.0165 30.2282 0.9377 0.9949 0.9764 0.9586 0.9520 0.9752 0.9656
Blackman 0.0166 30.1354 0.9372 0.9947 0.9636 0.9377 0.9440 0.9667 0.9530
Bohman 0.0166 30.1085 0.9371 0.9947 0.9613 0.9316 0.9425 0.9667 0.9505
Polar_LanczosRadius 0.0167 30.1668 0.9355 0.9946 0.9536 0.9448 0.9205 0.9570 0.9440
Parzen 0.0168 29.9988 0.9364 0.9946 0.9473 0.9068 0.9329 0.9612 0.9371
Polar_RobidouxSharp 0.0172 29.7304 0.9337 0.9946 0.9070 0.8462 0.8951 0.9598 0.9020
Lanczos2Sharp 0.0172 29.6361 0.9344 0.9947 0.8993 0.8249 0.9048 0.9644 0.8984
Polar_CubicSpline 0.0176 29.7738 0.9369 0.9933 0.8587 0.8560 0.9408 0.8811 0.8842
Sinc 0.0179 30.2249 0.9297 0.9937 0.8327 0.9579 0.8377 0.9046 0.8832
SincFast 0.0179 30.2249 0.9297 0.9937 0.8327 0.9579 0.8377 0.9046 0.8832
Polar_LanczosSharp 0.0173 29.8697 0.9308 0.9936 0.8888 0.8777 0.8534 0.8982 0.8795
CubicSpline 0.0175 29.6101 0.9318 0.9938 0.8765 0.8190 0.8676 0.9129 0.8690
Lanczos2 0.0175 29.5422 0.9322 0.9940 0.8708 0.8037 0.8741 0.9248 0.8683
Catrom 0.0175 29.4788 0.9320 0.9940 0.8687 0.7894 0.8705 0.9261 0.8637
Polar_Cosine 0.0176 29.7521 0.9289 0.9932 0.8627 0.8511 0.8266 0.8749 0.8538
Polar_Lanczos 0.0176 29.7521 0.9289 0.9932 0.8627 0.8511 0.8266 0.8749 0.8538
Polar_Welch 0.0176 29.7521 0.9289 0.9932 0.8627 0.8511 0.8266 0.8749 0.8538
Polar_Hann 0.0177 29.6535 0.9285 0.9931 0.8511 0.8288 0.8214 0.8713 0.8432
Polar_Bartlett 0.0177 29.6043 0.9287 0.9931 0.8513 0.8177 0.8231 0.8712 0.8408
Polar_Mitchell 0.0179 29.2988 0.9290 0.9935 0.8295 0.7487 0.8283 0.8956 0.8255
Polar_Hamming 0.0179 29.5229 0.9275 0.9928 0.8322 0.7993 0.8062 0.8509 0.8222
Polar_Kaiser 0.0180 29.4493 0.9272 0.9927 0.8248 0.7827 0.8022 0.8506 0.8151
Polar_Lanczos2Sharp 0.0182 29.2163 0.9269 0.9929 0.8015 0.7301 0.7981 0.8571 0.7967
Lagrange 0.0182 29.1906 0.9267 0.9926 0.7976 0.7243 0.7959 0.8413 0.7898
Polar_Blackman 0.0184 29.1789 0.9251 0.9923 0.7852 0.7216 0.7728 0.8260 0.7764
Polar_Robidoux 0.0184 29.0213 0.9258 0.9927 0.7766 0.6860 0.7818 0.8489 0.7733
Polar_Bohman 0.0184 29.1470 0.9249 0.9923 0.7808 0.7144 0.7698 0.8252 0.7725
Polar_Lanczos2 0.0187 28.9992 0.9232 0.9919 0.7511 0.6810 0.7450 0.8007 0.7445
Polar_Parzen 0.0188 28.8898 0.9227 0.9918 0.7397 0.6563 0.7383 0.7960 0.7326
RobidouxSharp 0.0193 28.5601 0.9204 0.9912 0.6857 0.5818 0.7054 0.7614 0.6836
Polar_Hermite 0.0202 27.7627 0.9180 0.9933 0.5937 0.4017 0.6713 0.8806 0.6369
Mitchell 0.0199 28.3003 0.9166 0.9903 0.6288 0.5232 0.6517 0.7060 0.6274
Hermite 0.0202 27.8860 0.9170 0.9923 0.5953 0.4296 0.6571 0.8217 0.6259
Robidoux 0.0202 28.1364 0.9141 0.9896 0.5916 0.4861 0.6161 0.6689 0.5907
Polar_Triangle 0.0207 27.8375 0.9122 0.9900 0.5479 0.4186 0.5888 0.6903 0.5614
Box 0.0220 26.5260 0.9105 0.9950 0.4101 0.1224 0.5649 0.9801 0.5194
Point 0.0220 26.5260 0.9105 0.9950 0.4101 0.1224 0.5649 0.9801 0.5194
Polar_Box 0.0220 26.5260 0.9105 0.9950 0.4101 0.1224 0.5649 0.9801 0.5194
Polar_Point 0.0213 27.6369 0.9069 0.9880 0.4762 0.3733 0.5140 0.5715 0.4838
Polar_Sinc 0.0213 27.6369 0.9069 0.9880 0.4762 0.3733 0.5140 0.5715 0.4838
Polar_SincFast 0.0213 27.6369 0.9069 0.9880 0.4762 0.3733 0.5140 0.5715 0.4838
Triangle 0.0213 27.6369 0.9069 0.9880 0.4762 0.3733 0.5140 0.5715 0.4838
Jinc 0.0214 27.8115 0.9049 0.9868 0.4681 0.4127 0.4854 0.5026 0.4672
Polar_Quadratic 0.0228 27.0915 0.8960 0.9850 0.3285 0.2501 0.3577 0.3992 0.3339
Polar_Gaussian 0.0231 26.9436 0.8941 0.9849 0.2939 0.2167 0.3308 0.3926 0.3085
Gaussian 0.0231 26.9424 0.8940 0.9849 0.2936 0.2164 0.3305 0.3923 0.3082
Quadratic 0.0233 26.8872 0.8920 0.9840 0.2718 0.2040 0.3009 0.3367 0.2783
Polar_Jinc 0.0247 26.9115 0.8953 0.9820 0.1285 0.2095 0.3489 0.2231 0.2275
Polar_Cubic 0.0255 26.1276 0.8741 0.9792 0.0446 0.0324 0.0472 0.0553 0.0449
Polar_Spline 0.0255 26.1276 0.8741 0.9792 0.0446 0.0324 0.0472 0.0553 0.0449
Cubic 0.0260 25.9843 0.8708 0.9782 0.0000 0.0000 0.0000 0.0000 0.0000
Spline 0.0260 25.9843 0.8708 0.9782 0.0000 0.0000 0.0000 0.0000 0.0000

Upsampling Commentary

We started this little adventure to make the numbers a bit more robust than in my previous comparison done using mpv, and there are a few surprises in the results. The first one is that orthogonal LanczosSharp ended up at the top of the table, an option that shouldn't even be in the comparison. If you want to give it a try on mpv, you can use --scale=lanczos --scale-blur=0.9812505644269356.

The second surprise is that Polar_Catrom is in second place, with the best scores for both SSIM and MS-SSIM. Polar_Catrom is an extremely sharp filter, albeit with too much aliasing and ringing. Since it only rings once (when it overshoots), it's trivial to remove all ringing if we just clip the output of each pixel to the limits set by the surrounding input pixels. I have actually written a shader for it, but you can also just use libplacebo's native AR now. If you want to give the filter a try on mpv, you can use --scale=ewa_robidoux --scale-param1=0 --scale-param2=0.5.

The third surprise is that some Lanczos variants rank above the baseline Lanczos: Hamming, Cosine and Welch. For the last two all you gotta do is set --scale=lanczos --scale-window=window, but for the first you also have to set --scale-radius=4.

Finally, it's also a little bit funny to see Lanczos2 beating Catrom in the "orthogonal filters with radius = 2" category. And it's even funnier that Lanczos2Sharp, another filter that shouldn't even be here, beat them both. To use Lanczos2 on mpv you can simply do --scale=lanczos --scale-radius=2, and for Lanczos2Sharp add --scale-blur=0.9549963639785485.

You might have also noticed that Polar_LanczosRadius ranks above Polar_LanczosSharp. Polar_LanczosRadius is sharpened to have its third zero crossing exactly at 3, which corresponds to a blur factor of 0.9264075766146068 (3.0/3.2383154841662362). To try it on mpv you can use --scale=ewa_lanczos --scale-blur=0.9264075766146068. On --vo=gpu you also have to specify --scale-radius=3.2383154841662362.
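
For the curious, the two constants can be reproduced with nothing but the standard library. The sketch below relies on the fact that the zeros of jinc sit at the zeros of the Bessel function J1 divided by pi; J1 is evaluated numerically from its integral form and the third zero is located by bisection. The bracket [9.5, 10.5] and the step counts are choices I made for the illustration.

```python
import math


def bessel_j1(x, steps=400):
    """J1(x) from its integral form (1/pi) * int_0^pi cos(t - x*sin(t)) dt."""
    h = math.pi / steps
    # Trapezoid rule; at both endpoints sin(t) is 0, so they are cos(0), cos(pi).
    total = 0.5 * (math.cos(0.0) + math.cos(math.pi))
    for k in range(1, steps):
        t = k * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi


def find_zero(f, lo, hi, iterations=80):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    f_lo = f(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Third zero of jinc = third zero of J1 divided by pi.
third_zero = find_zero(bessel_j1, 9.5, 10.5) / math.pi
blur = 3.0 / third_zero
print(third_zero)  # close to 3.2383154841662362
print(blur)        # close to 0.9264075766146068
```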

Overall it looks like the filters were more or less ranked from sharpest to softest. This makes perfect sense, as the main problem with image upsampling is unintended blurriness. PSNR is calculated from the MSE, so it's overly sensitive to outliers, which naturally end up being the sharp transitions. SSIM was also literally designed to give good scores to distorted images that have similar-ish structures, which also favours sharpness. MAE is a little more neutral than PSNR, which is why we generally use it as our target when training distortion-based CNNs, so it should punish filters that achieve their sharpness at the cost of ringing, aliasing and blocking. That's the main reason why most of those CNNs end up generating images that look like oil paintings. MS-SSIM was added to neutralise stupidly minor differences, as we wouldn't be able to see them anyway from normal viewing distances.
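
The difference between MAE and PSNR is easy to see with a toy example. The sketch below uses two made-up error patterns with the same MAE: one spreads a small error over every pixel, the other concentrates it in a single outlier. The signals and function names are purely illustrative and have nothing to do with the actual dataset.

```python
import math


def mae(ref, test):
    """Mean absolute error between two equally sized signals."""
    return sum(abs(r - t) for r, t in zip(ref, test)) / len(ref)


def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB, computed from the MSE."""
    mse = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


ref = [0.0] * 100
spread = [0.01] * 100           # every pixel is off by 0.01
outlier = [0.0] * 99 + [1.0]    # a single pixel is off by 1.0

print(mae(ref, spread), mae(ref, outlier))    # both about 0.01
print(psnr(ref, spread), psnr(ref, outlier))  # about 40 dB vs about 20 dB
```

Same MAE, but the squared error makes PSNR drop by roughly 20 dB for the outlier case.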

As usual, the filters are ranked based on full-reference distortion metrics that may not always correlate with the human perception of quality, and your personal preference is entirely subjective.

Please also keep in mind that only the named filters were "benchmarked", but you can make "custom" filters that score even higher using the expert controls.

Downsampling Methodology

The problem with downsampling evaluation is that we do not have a "ground truth" image, making it impossible for us to use the standard full-reference quality metrics. However, in the previous section I already made the concession that, at a 0.5x scaling ratio, the box filter is as good as it gets without producing any artifacts.

If we make the leap of faith that resampling filters will more or less keep their character regardless of scaling factor, we can use the output of box as the reference and compare other filters against it. So, in short, we want to find a filter that, at any scaling factor, will behave similarly to box at 0.5x. I'm calling this a leap of faith because this hypothesis most likely falls apart with extreme scaling factors, but as long as you keep it close to 0.5x it probably makes sense.

With that said, now we can proceed to the actual methodology. The reference is created with:

magick mogrify -colorspace RGB -filter box -resize 50% -colorspace sRGB -path box_ref inputs/*.png

The dataset is then downscaled with the following command for orthogonal resampling:

magick mogrify -colorspace RGB -filter {resampling_filter} -resize 50% -colorspace sRGB -path low_res inputs/*.png

And the following command for polar resampling:

magick mogrify -colorspace RGB -filter {resampling_filter} -distort Resize 50% -colorspace sRGB -path low_res inputs/*.png

The {resampling_filter} argument is replaced by all available resampling filters. Everything else about the process stays the same, so we can proceed to the results.

Downsampling Results

Filter MAE PSNR SSIM MS-SSIM MAE (N) PSNR (N) SSIM (N) MS-SSIM (N) Mean
Lanczos2Sharp 0.0052 38.1382 0.9941 0.9993 1.0000 1.0000 1.0000 1.0000 1.0000
Polar_Hermite 0.0058 37.9258 0.9937 0.9992 0.9632 0.9819 0.9936 0.9917 0.9826
Lanczos2 0.0062 37.1522 0.9924 0.9992 0.9443 0.9162 0.9718 0.9844 0.9542
Catrom 0.0062 37.1418 0.9923 0.9992 0.9403 0.9153 0.9704 0.9859 0.9530
Polar_Mitchell 0.0067 36.8261 0.9913 0.9992 0.9109 0.8885 0.9544 0.9830 0.9342
Polar_Robidoux 0.0071 36.7730 0.9905 0.9992 0.8899 0.8839 0.9428 0.9815 0.9246
Polar_RobidouxSharp 0.0068 35.8553 0.9911 0.9989 0.9084 0.8059 0.9514 0.9572 0.9057
Hermite 0.0072 36.1529 0.9909 0.9989 0.8826 0.8312 0.9487 0.9525 0.9037
Polar_Lanczos2Sharp 0.0075 36.0829 0.9895 0.9990 0.8669 0.8253 0.9267 0.9674 0.8966
Parzen 0.0070 35.6504 0.9901 0.9989 0.8945 0.7885 0.9355 0.9541 0.8931
CubicSpline 0.0075 35.5080 0.9895 0.9989 0.8693 0.7764 0.9271 0.9495 0.8806
RobidouxSharp 0.0081 35.7346 0.9883 0.9989 0.8346 0.7957 0.9074 0.9515 0.8723
Lagrange 0.0079 35.5718 0.9885 0.9989 0.8424 0.7818 0.9097 0.9545 0.8721
Bohman 0.0075 35.1306 0.9889 0.9988 0.8693 0.7443 0.9169 0.9404 0.8677
Polar_Parzen 0.0083 35.5274 0.9875 0.9989 0.8209 0.7781 0.8953 0.9557 0.8625
Blackman 0.0076 34.9680 0.9885 0.9987 0.8603 0.7305 0.9104 0.9359 0.8593
Polar_Bohman 0.0085 35.0970 0.9870 0.9988 0.8098 0.7415 0.8860 0.9451 0.8456
Polar_Lanczos2 0.0086 35.1477 0.9868 0.9988 0.8051 0.7458 0.8837 0.9454 0.8450
Polar_Blackman 0.0085 35.0236 0.9867 0.9988 0.8059 0.7352 0.8822 0.9434 0.8417
Mitchell 0.0089 34.8539 0.9862 0.9986 0.7878 0.7208 0.8735 0.9231 0.8263
Bartlett 0.0083 34.4112 0.9863 0.9986 0.8201 0.6832 0.8749 0.9213 0.8249
LanczosSharp 0.0082 34.0812 0.9871 0.9985 0.8260 0.6551 0.8888 0.9062 0.8190
Kaiser 0.0083 34.2164 0.9865 0.9985 0.8174 0.6666 0.8785 0.9132 0.8189
Lanczos 0.0085 33.9740 0.9864 0.9984 0.8108 0.6460 0.8771 0.9019 0.8090
LanczosRadius 0.0085 33.9740 0.9864 0.9984 0.8108 0.6460 0.8771 0.9019 0.8090
Polar_Kaiser 0.0091 34.3405 0.9850 0.9986 0.7763 0.6772 0.8552 0.9209 0.8074
Polar_LanczosRadius 0.0086 33.9796 0.9860 0.9985 0.8017 0.6465 0.8706 0.9045 0.8058
Hann 0.0087 33.9521 0.9855 0.9985 0.7991 0.6442 0.8628 0.9039 0.8025
Polar_Triangle 0.0092 34.2415 0.9855 0.9983 0.7667 0.6688 0.8620 0.8905 0.7970
Robidoux 0.0094 34.2906 0.9846 0.9984 0.7560 0.6729 0.8488 0.9015 0.7948
Polar_Bartlett 0.0094 34.0368 0.9834 0.9985 0.7544 0.6514 0.8287 0.9081 0.7857
Polar_Hamming 0.0095 34.0253 0.9833 0.9985 0.7505 0.6504 0.8279 0.9084 0.7843
Cosine 0.0090 33.5286 0.9848 0.9983 0.7793 0.6082 0.8513 0.8854 0.7810
Hamming 0.0091 33.4712 0.9845 0.9983 0.7719 0.6033 0.8461 0.8848 0.7765
Polar_Hann 0.0096 33.5662 0.9837 0.9983 0.7458 0.6114 0.8343 0.8869 0.7696
Welch 0.0093 33.2882 0.9838 0.9982 0.7603 0.5877 0.8351 0.8757 0.7647
Polar_LanczosSharp 0.0097 33.4770 0.9829 0.9983 0.7408 0.6038 0.8220 0.8849 0.7629
Polar_Cosine 0.0101 33.2418 0.9816 0.9982 0.7157 0.5838 0.8010 0.8751 0.7439
Polar_Lanczos 0.0101 33.2418 0.9816 0.9982 0.7157 0.5838 0.8010 0.8751 0.7439
Polar_Welch 0.0101 33.2418 0.9816 0.9982 0.7157 0.5838 0.8010 0.8751 0.7439
Triangle 0.0109 32.7732 0.9802 0.9977 0.6654 0.5440 0.7781 0.8165 0.7010
Jinc 0.0130 31.9518 0.9720 0.9972 0.5428 0.4741 0.6486 0.7583 0.6060
Polar_Quadratic 0.0134 31.1154 0.9713 0.9966 0.5229 0.4030 0.6371 0.6968 0.5649
Polar_Gaussian 0.0139 30.7778 0.9699 0.9963 0.4927 0.3743 0.6139 0.6610 0.5355
Gaussian 0.0139 30.7743 0.9698 0.9963 0.4923 0.3740 0.6136 0.6606 0.5352
Quadratic 0.0143 30.5742 0.9680 0.9961 0.4710 0.3570 0.5837 0.6441 0.5139
Polar_Catrom 0.0152 28.6343 0.9718 0.9945 0.4166 0.1921 0.6454 0.4575 0.4279
Polar_Lagrange 0.0155 28.6092 0.9706 0.9942 0.3998 0.1900 0.6252 0.4303 0.4113
Polar_Cubic 0.0176 28.8798 0.9528 0.9942 0.2794 0.2130 0.3432 0.4332 0.3172
Polar_Spline 0.0176 28.8798 0.9528 0.9942 0.2794 0.2130 0.3432 0.4332 0.3172
Cubic 0.0182 28.5882 0.9499 0.9938 0.2426 0.1882 0.2965 0.3843 0.2779
Spline 0.0182 28.5882 0.9499 0.9938 0.2426 0.1882 0.2965 0.3843 0.2779
Sinc 0.0191 27.6882 0.9545 0.9922 0.1896 0.1117 0.3702 0.2120 0.2209
SincFast 0.0191 27.6882 0.9545 0.9922 0.1896 0.1117 0.3702 0.2120 0.2209
Polar_Jinc 0.0196 28.1480 0.9518 0.9910 0.1603 0.1508 0.3266 0.0684 0.1765
Polar_CubicSpline 0.0205 26.4606 0.9557 0.9908 0.1098 0.0074 0.3895 0.0525 0.1398
Point 0.0223 26.3740 0.9313 0.9903 0.0000 0.0000 0.0000 0.0000 0.0000

Downsampling Commentary

The results were relatively predictable, but there are a few things worth mentioning. The first one is that this methodology seems to really like filters with only 1 or 2 lobes. Lanczos2Sharp (--dscale=lanczos --dscale-radius=2 --dscale-blur=0.9549963639785485) takes the top spot, but the BC-Splines and the normal Lanczos2 aren't far behind.

It's a bit funny to see Polar_Hermite (--dscale=ewa_robidoux --dscale-param1=0 --dscale-param2=0) so high in the list, but I actually think it makes sense. Hermite is a short (radius 1) filter that does not have any negative sections, so using it to downscale makes the operation very close to an "area average". The only difference is that pixels are weighted based on distance. Hermite is usually described as a "smooth" triangle, but it scores much higher than the "normal" triangle.
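
The "no negative sections" claim is easy to check with the usual Mitchell-Netravali (BC-spline) kernel, where the two parameters play the role of B and C. The sketch below uses the (B, C) pairs implied by the mpv options quoted on this page: (0, 0) for Hermite and (0, 0.5) for Catrom. The function name and the sampling grid are my own choices.

```python
def bc_spline(x, b, c):
    """Mitchell-Netravali cubic kernel with parameters B and C."""
    x = abs(x)
    if x < 1.0:
        return ((12 - 9 * b - 6 * c) * x ** 3
                + (-18 + 12 * b + 6 * c) * x ** 2
                + (6 - 2 * b)) / 6.0
    if x < 2.0:
        return ((-b - 6 * c) * x ** 3
                + (6 * b + 30 * c) * x ** 2
                + (-12 * b - 48 * c) * x
                + (8 * b + 24 * c)) / 6.0
    return 0.0


# Sample both kernels on [0, 2] in steps of 0.01.
grid = [i / 100 for i in range(0, 201)]
hermite_min = min(bc_spline(x, 0.0, 0.0) for x in grid)
catrom_min = min(bc_spline(x, 0.0, 0.5) for x in grid)

print(hermite_min)                 # 0.0: Hermite never goes negative
print(bc_spline(1.5, 0.0, 0.0))    # 0.0: nothing beyond radius 1
print(catrom_min < 0.0)            # True: Catrom has a negative lobe
print(bc_spline(1.5, 0.0, 0.5))    # -0.0625
```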

It's also interesting to see that Polar_Robidoux wasn't the best polar BC-Spline, considering the filter was specifically designed to be good at -distort. In any case, it's only slightly behind Polar_Mitchell. Please remember that these BC-splines are all sharper when used in polar resampling, so Polar_Mitchell is actually a pretty sharp filter and a relatively good alternative to Catrom if you prefer the look of polar resampling.

Both Lanczos (orthogonal sinc-sinc) and Polar_Lanczos (polar jinc-jinc) were pretty mediocre at best. That's a good thing because it shows the methodology can actually punish filters that are too sharp. Polar_Catrom, the sharpest filter we have in the comparison, actually scored as one of the worst filters.

I think this test solidifies what was already common knowledge. BC-Splines are very good at downsampling, and pretty much all of them are near the top.

Outro

Nicolas Robidoux has a personal page with some recommendations, if you want a more qualitative approach.