This page is meant to be treated as a follow-up to this scaler comparison done with ImageMagick. I'm only going to talk about niche topics here, so just refer to that other page if you only want to read about scalers.
The mpv community has written several meme shaders you can use to upsample video in real time, and the rationale is that they can potentially achieve higher quality than the built-in methods.
I still do not have a good way of automating mpv tests and therefore I'll have to stick to a single test image, which is still going to be violet.png just because I'm familiarised with it. If this is your first time reading these posts, this is what violet.png looks like:
The small number of samples under test makes this very unscientific, but it is what it is.
The following shaders were "benchmarked":
I've also added orthogonal and polar lanczossharp to the mix just to have a reference point, but feel free to send me any other shaders you would like to get scored. For Anime4K I'm using the "C" mode which is defined as
The test image is downsampled with:
magick convert violet.png -colorspace rgb -filter box -resize 50% -colorspace srgb downscaled.png
It is then converted to grayscale with:
magick convert downscaled.png -colorspace gray downscaled.png
The original image is also converted to grayscale using the same command to create the reference:
magick convert violet.png -colorspace gray reference.png
We need to convert them to grayscale because some of these shaders do not have RGB support.
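To make the downsampling step a bit more concrete, here is a minimal Python sketch of a 2x box downsample done in linear light on a single grayscale channel. It assumes pixel values normalised to [0, 1] and the standard sRGB transfer function; the function names are made up for illustration and this is not what ImageMagick runs internally.

```python
def srgb_to_linear(v: float) -> float:
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    if v <= 0.04045:
        return v / 12.92
    return ((v + 0.055) / 1.055) ** 2.4


def linear_to_srgb(v: float) -> float:
    """Encode a linear-light value in [0, 1] back to sRGB."""
    if v <= 0.0031308:
        return v * 12.92
    return 1.055 * v ** (1 / 2.4) - 0.055


def box_downsample_2x(image: list[list[float]]) -> list[list[float]]:
    """Halve an image with even dimensions by averaging 2x2 blocks in linear light."""
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[0]), 2):
            block = [image[y + dy][x + dx] for dy in (0, 1) for dx in (0, 1)]
            mean_linear = sum(srgb_to_linear(p) for p in block) / 4
            row.append(linear_to_srgb(mean_linear))
        out.append(row)
    return out


# Two black and two white pixels average to 0.5 in linear light,
# which encodes to a value brighter than 0.5 in sRGB.
checker = [[0.0, 1.0], [1.0, 0.0]]
print(box_downsample_2x(checker))
```

Averaging the encoded values directly would give 0.5 for the same block, which is why the colorspace round trip matters.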
The images were then upsampled back with:
mpv --no-config --vo=gpu-next --no-hidpi-window-scale --window-scale=2.0 --pause=yes --screenshot-format=png --sigmoid-upscaling --deband=no --dither-depth=no --screenshot-high-bit-depth=no --glsl-shader="path/to/meme/shader" downscaled.png
Shader/Filter | MAE | PSNR | SSIM | MS-SSIM | MAE (N) | PSNR (N) | SSIM (N) | MS-SSIM (N) | Mean | ||
ravu-lite-ar-r4 | 3.46E-03 | 41.3618 | 0.9887 | 0.9987 | 0.9925 | 1.0000 | 0.9876 | 0.9371 | 0.9793 | ||
ravu-zoom-ar-r3 | 3.44E-03 | 41.2143 | 0.9889 | 0.9987 | 1.0000 | 0.9762 | 1.0000 | 0.9369 | 0.9783 | ||
ravu-lite-ar-r3 | 3.55E-03 | 41.0825 | 0.9883 | 0.9988 | 0.9614 | 0.9549 | 0.9573 | 0.9446 | 0.9546 | ||
ravu-lite-r4 | 3.72E-03 | 41.2500 | 0.9876 | 0.9989 | 0.9002 | 0.9819 | 0.9122 | 0.9861 | 0.9451 | ||
ravu-zoom-r3 | 3.79E-03 | 41.0404 | 0.9872 | 0.9990 | 0.8743 | 0.9481 | 0.8825 | 1.0000 | 0.9262 | ||
FSRCNNX_x2_8-0-4-1 | 3.77E-03 | 40.9503 | 0.9882 | 0.9987 | 0.8803 | 0.9336 | 0.9549 | 0.9171 | 0.9215 | ||
ravu-zoom-ar-r2 | 3.63E-03 | 40.5841 | 0.9877 | 0.9987 | 0.9295 | 0.8745 | 0.9196 | 0.9264 | 0.9125 | ||
ravu-lite-r3 | 3.81E-03 | 40.9251 | 0.9870 | 0.9989 | 0.8649 | 0.9295 | 0.8716 | 0.9785 | 0.9111 | ||
ravu-lite-ar-r2 | 3.73E-03 | 40.2182 | 0.9872 | 0.9988 | 0.8951 | 0.8154 | 0.8867 | 0.9513 | 0.8871 | ||
ravu-zoom-r2 | 3.94E-03 | 40.4724 | 0.9866 | 0.9989 | 0.8192 | 0.8564 | 0.8432 | 0.9757 | 0.8736 | ||
ravu-lite-r2 | 4.03E-03 | 40.0375 | 0.9856 | 0.9988 | 0.7873 | 0.7862 | 0.7780 | 0.9626 | 0.8285 | ||
FSRCNNX_x2_16-0-4-1 | 4.03E-03 | 39.9168 | 0.9873 | 0.9984 | 0.7864 | 0.7667 | 0.8952 | 0.8479 | 0.8240 | ||
EASU | 4.33E-03 | 38.8482 | 0.9841 | 0.9981 | 0.6785 | 0.5943 | 0.6789 | 0.7718 | 0.6809 | ||
lanczos | 4.67E-03 | 38.6816 | 0.9826 | 0.9984 | 0.5566 | 0.5674 | 0.5795 | 0.8360 | 0.6349 | ||
nnedi3-nns64-win8x4 | 4.41E-03 | 38.6337 | 0.9839 | 0.9973 | 0.6516 | 0.5596 | 0.6686 | 0.5421 | 0.6055 | ||
nnedi3-nns32-win8x4 | 4.46E-03 | 38.5214 | 0.9835 | 0.9973 | 0.6342 | 0.5415 | 0.6428 | 0.5334 | 0.5880 | ||
polar_lanczossharp | 4.79E-03 | 38.3683 | 0.9818 | 0.9982 | 0.5140 | 0.5168 | 0.5286 | 0.7725 | 0.5830 | ||
FSR | 5.10E-03 | 39.0008 | 0.9817 | 0.9981 | 0.4036 | 0.6189 | 0.5228 | 0.7581 | 0.5759 | ||
Anime4K_C_S | 5.60E-03 | 36.3773 | 0.9783 | 0.9972 | 0.2242 | 0.1954 | 0.2920 | 0.5063 | 0.3045 | ||
Anime4K_C_L | 5.83E-03 | 35.9398 | 0.9767 | 0.9968 | 0.1417 | 0.1248 | 0.1878 | 0.4026 | 0.2142 | ||
Anime4K_C_VL | 5.94E-03 | 35.6782 | 0.9766 | 0.9963 | 0.1015 | 0.0826 | 0.1818 | 0.2603 | 0.1565 | ||
Anime4K_C_M | 6.23E-03 | 35.3010 | 0.9744 | 0.9962 | 0.0000 | 0.0217 | 0.0375 | 0.2409 | 0.0750 | ||
bilinear | 5.78E-03 | 35.9907 | 0.9739 | 0.9954 | 0.1585 | 0.1330 | 0.0000 | 0.0000 | 0.0729 | ||
Anime4K_C_UL | 6.15E-03 | 35.1667 | 0.9750 | 0.9959 | 0.0283 | 0.0000 | 0.0777 | 0.1493 | 0.0638 |
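Judging by the numbers, the "(N)" columns look like min-max normalised versions of each metric (best entry = 1, worst entry = 0) and "Mean" looks like the plain average of the four normalised scores. That is an inference from the table rather than something stated in the methodology, so treat the following Python sketch as an assumption:

```python
def normalize(values: list[float], lower_is_better: bool = False) -> list[float]:
    """Min-max normalise scores so the best entry maps to 1 and the worst to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    if lower_is_better:
        return [(hi - v) / (hi - lo) for v in values]
    return [(v - lo) / (hi - lo) for v in values]


def mean_score(normalized: list[float]) -> float:
    """Average the normalised metrics of a single shader."""
    return sum(normalized) / len(normalized)


# PSNR of the best entry, ravu-zoom-ar-r3 and the worst entry in the 2x table.
psnr = [41.3618, 41.2143, 35.1667]
print([round(s, 4) for s in normalize(psnr)])

# Normalised MAE, PSNR, SSIM and MS-SSIM of ravu-lite-ar-r4, copied from the table.
print(round(mean_score([0.9925, 1.0000, 0.9876, 0.9371]), 4))
```

MAE is the only metric here where lower is better, hence the `lower_is_better` switch. Small differences against the table are expected because the printed metrics are rounded.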
The funniest thing about these results is that, according to distortion metrics, Anime4K is actually worse at "recreating the ground truth" than simple resampling filters. The problem seems to be that the shader is too sharp: it doesn't just ring, it also makes the lines thinner. This isn't necessarily bad depending on how blurry your source is, but my test image here isn't blurry at all.
The slower variant of FSRCNNX has also scored worse than its smaller sibling, presumably for the same reason. I don't think igv trained these networks with images that had been downsampled in linear light, so that's probably part of the problem.
The overly-sharp shaders would probably do much better on perceptual metrics. I'll evaluate whether adding a few of them to the benchmarking script makes sense at a later date.
EASU (which is FSR without RCAS) actually scores way better than FSR, and I think it's probably a good option if you're looking for an efficient shader.
It's nice to see that bjin came back from the dead with gold. The new ravu variants perform much better than their previous incarnations and they're also free of the half-pixel shift problem.
Ravu-lite-ar-r4 is the best luma-doubler according to the metrics, and it's probably a safe choice if you're looking for a doubler. If you're not only doubling though, ravu-zoom-ar-r3 might be a better option. You can also edit their trigger conditions or use auto-profiles.lua to switch between them as needed.
While doing these tests at 2x makes sense since we can use a box filter to downsample, most people aren't always doubling their content. Scaling "performance" at lower scaling factors, such as 1.5x or 1.33x, is usually more interesting because these correspond to 720p->1080p and 1080p->1440p, respectively. This section is supposed to better represent a real world scenario in which you don't have a 4K display or don't watch comically low-resolution content.
Unfortunately, some of my choices here are only explained in the following sections. You'll have to bear with me and trust this makes (some) sense, as the alternative would be placing the fractional upsampling section at the end of the page instead of here, its most logical position.
In short, the entire methodology stays the same with a few modifications:
Shader/Filter | MAE | PSNR | SSIM | MS-SSIM | MAE (N) | PSNR (N) | SSIM (N) | MS-SSIM (N) | Mean | ||
ravu-zoom-ar-r3 | 2.88E-03 | 43.8869 | 0.9924 | 0.9991 | 1.0000 | 1.0000 | 1.0000 | 0.9763 | 0.9941 | ||
ravu-zoom-r3 | 3.09E-03 | 43.5967 | 0.9918 | 0.9991 | 0.8988 | 0.9434 | 0.9196 | 1.0000 | 0.9404 | ||
ravu-zoom-ar-r2 | 2.97E-03 | 43.3608 | 0.9918 | 0.9990 | 0.9556 | 0.8974 | 0.9236 | 0.9350 | 0.9279 | ||
FSRCNNX_x2_8-0-4-1 | 3.06E-03 | 43.0042 | 0.9919 | 0.9989 | 0.9148 | 0.8278 | 0.9341 | 0.8707 | 0.8869 | ||
ravu-zoom-r2 | 3.19E-03 | 43.0888 | 0.9913 | 0.9990 | 0.8516 | 0.8443 | 0.8511 | 0.9486 | 0.8739 | ||
FSRCNNX_x2_16-0-4-1 | 3.19E-03 | 43.0352 | 0.9920 | 0.9989 | 0.8555 | 0.8339 | 0.9420 | 0.8577 | 0.8723 | ||
ravu-lite-r4 | 3.16E-03 | 42.6660 | 0.9914 | 0.9990 | 0.8652 | 0.7618 | 0.8656 | 0.9146 | 0.8518 | ||
ravu-lite-r3 | 3.21E-03 | 42.5309 | 0.9912 | 0.9990 | 0.8433 | 0.7355 | 0.8391 | 0.9025 | 0.8301 | ||
ravu-lite-r2 | 3.27E-03 | 42.2439 | 0.9908 | 0.9989 | 0.8145 | 0.6795 | 0.7887 | 0.8763 | 0.7897 | ||
lanczos | 3.46E-03 | 41.9607 | 0.9901 | 0.9989 | 0.7282 | 0.6242 | 0.6903 | 0.8382 | 0.7203 | ||
EASU | 3.35E-03 | 41.7479 | 0.9901 | 0.9987 | 0.7810 | 0.5827 | 0.6968 | 0.7592 | 0.7049 | ||
polar_lanczossharp | 3.44E-03 | 41.7621 | 0.9898 | 0.9988 | 0.7390 | 0.5855 | 0.6573 | 0.8253 | 0.7018 | ||
ravu-lite-ar-r4 | 3.41E-03 | 41.7120 | 0.9902 | 0.9986 | 0.7522 | 0.5757 | 0.7113 | 0.6697 | 0.6772 | ||
ravu-lite-ar-r3 | 3.43E-03 | 41.6404 | 0.9901 | 0.9986 | 0.7392 | 0.5617 | 0.6951 | 0.6684 | 0.6661 | ||
nnedi3-nns32-win8x4 | 3.47E-03 | 41.5189 | 0.9901 | 0.9986 | 0.7211 | 0.5380 | 0.6953 | 0.6926 | 0.6618 | ||
nnedi3-nns64-win8x4 | 3.47E-03 | 41.5101 | 0.9901 | 0.9986 | 0.7210 | 0.5363 | 0.6971 | 0.6896 | 0.6610 | ||
ravu-lite-ar-r2 | 3.48E-03 | 41.4288 | 0.9898 | 0.9985 | 0.7178 | 0.5205 | 0.6496 | 0.6602 | 0.6370 | ||
FSR | 5.02E-03 | 40.4762 | 0.9856 | 0.9981 | 0.0000 | 0.3346 | 0.0969 | 0.4210 | 0.2131 | ||
bilinear | 4.33E-03 | 38.7610 | 0.9848 | 0.9974 | 0.3224 | 0.0000 | 0.0000 | 0.0000 | 0.0806 |
The numbers are all closer together at 1.5x, which shows that your choice of scaler becomes less relevant as you decrease the scaling factor.
It's a bit funny to see FSR get worse here, since a lower scaling factor corresponds to a higher quality preset in AMD's terms, but I think the problem is just that the shader is too sharp.
It's pretty easy to see that the slightly blurry downsampling step is hurting the doublers. This section previously used catrom+PC to downsample them (which is now only used to generate the "input"), but honestly that was a bit overkill. Using mpv's default downsampling filter makes more sense as that's what most users are probably using.
The conclusion is that at lower scaling factors the benefits these shaders bring become smaller, but the new ravu-zoom variants are all pretty damn good. I personally recommend sticking to the r3 variants, with or without AR.
Using shaders to upsample chroma is an old meme, as igv ported Shiandow's KrigBilateral to mpv years ago. KrigBilateral implements kriging interpolation, which is commonly used in geostatistics. You can read more about it here.
bjin has also released chroma variants of ravu in the past, and I've written a few shaders myself in an attempt to learn GLSL (ended up being too fun and now I can't spend a week without editing these shaders...).
Anyway... My first (not very successful) chroma meme was called jointbilateral and it pretty much implemented this paper. In short, chromatic information from pixels with similar luminosities is preferred (simply uses luma-deltas as an extra weighting factor). The inherent flaw with this idea is that it can't overshoot, so it's literally incapable of reaching some intensity levels depending on how chroma subsampling is done.
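As a rough illustration of why this approach can't overshoot, here is a minimal Python sketch of a joint-bilateral style estimate for a single chroma value. The Gaussian range weight, the `sigma` parameter and the function name are assumptions made for the example, not the actual shader code.

```python
import math


def joint_bilateral_chroma(
    luma_target: float,
    neighbors: list[tuple[float, float, float]],
    sigma: float = 0.1,
) -> float:
    """Estimate chroma from (luma, chroma, spatial_weight) neighbours.

    Each spatial weight is multiplied by a Gaussian of the luma delta, so
    neighbours with a similar luminosity contribute more to the result.
    """
    total = 0.0
    total_weight = 0.0
    for luma, chroma, spatial_weight in neighbors:
        range_weight = math.exp(-((luma - luma_target) ** 2) / (2 * sigma ** 2))
        weight = spatial_weight * range_weight
        total += weight * chroma
        total_weight += weight
    if total_weight == 0.0:
        # Degenerate case: fall back to a plain average of the neighbours.
        return sum(c for _, c, _ in neighbors) / len(neighbors)
    return total / total_weight


# The target luma is close to the first neighbour, so its chroma dominates.
neighbors = [(0.2, 0.30, 1.0), (0.8, 0.70, 1.0)]
print(joint_bilateral_chroma(0.25, neighbors))
```

Because every weight is non-negative, the output is a weighted average and always stays between the smallest and largest input chroma values, which is exactly the limitation described above.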
My second attempt at this is what's generally known as "CfL", which was written taking inspiration from this paper. In short, Chroma from Luma (Prediction) is a technique that applies a linear regression to map the missing chromaticities based on known luminosities. This is paired with a sharp spatial filter that gets dynamically mixed-in based on the local correlation between chroma and luma.
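The regression idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the least-squares fit is the textbook one, while the squared-correlation mixing weight and the `spatial_estimate` input (standing in for the output of the spatial filter) are placeholders, not what the shader actually does.

```python
import math


def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float, float]:
    """Least-squares fit ys ~ a * xs + b, also returning the correlation r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0.0:
        return 0.0, mean_y, 0.0  # flat luma: nothing to regress on
    a = sxy / sxx
    b = mean_y - a * mean_x
    r = sxy / math.sqrt(sxx * syy) if syy > 0.0 else 0.0
    return a, b, r


def cfl_predict(
    luma_hi: float,
    lumas_lo: list[float],
    chromas_lo: list[float],
    spatial_estimate: float,
) -> float:
    """Predict chroma from luma and blend with a spatial estimate."""
    a, b, r = fit_line(lumas_lo, chromas_lo)
    predicted = a * luma_hi + b
    mix = min(1.0, r * r)  # hypothetical weight: trust the fit when correlation is high
    return mix * predicted + (1.0 - mix) * spatial_estimate


# Perfectly correlated samples: the prediction extrapolates past the largest
# known chroma value (0.60).
print(cfl_predict(0.9, [0.2, 0.4, 0.6, 0.8], [0.30, 0.40, 0.50, 0.60], 0.55))
```

Since the prediction follows the fitted line, it can go beyond the range of the known chroma samples whenever the luma does.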
This test is a little different and it requires a slightly more elaborate process. The first problem with this is that, with real video content, we rarely get a 4:4:4 copy. This means that we generally don't have a reference point to work with, so we have to create one on our own.
I've semi-arbitrarily decided to use a downsampled version of this key visual:
Two versions of the image were created using FFmpeg:
ffmpeg -i original.png -pix_fmt yuv444p10le -vf libplacebo=colorspace=bt709:color_primaries=bt709:color_trc=bt709:range=tv:format=yuv444p10le -c:v libx265 -preset placebo -x265-params lossless=1 444.mp4
ffmpeg -i original.png -pix_fmt yuv420p10le -vf libplacebo=colorspace=bt709:color_primaries=bt709:color_trc=bt709:range=tv:format=yuv420p10le -c:v libx265 -preset placebo -x265-params lossless=1 420.mp4
The idea here was creating (nearly) lossless variants with and without chroma subsampling.
The "reference" image was obtained by screenshotting "444.mp4" on mpv.
The images under test were obtained by screenshotting "420.mp4" on mpv using the various chroma resampling filters/shaders.
The mpv options remain the same, with the exception that we don't need
Shader/Filter | MAE | PSNR | SSIM | MS-SSIM | MAE (N) | PSNR (N) | SSIM (N) | MS-SSIM (N) | Mean | ||
cfl_12+4 | 2.54E-03 | 41.7586 | 0.9920 | 0.9993 | 1.0000 | 1.0000 | 0.9959 | 0.9965 | 0.9981 | ||
cfl_12 | 2.56E-03 | 41.7166 | 0.9921 | 0.9993 | 0.9893 | 0.9887 | 1.0000 | 1.0000 | 0.9945 | ||
cfl_4 | 2.59E-03 | 41.5668 | 0.9917 | 0.9993 | 0.9607 | 0.9483 | 0.9490 | 0.9574 | 0.9539 | ||
krigbilateral | 2.65E-03 | 40.7990 | 0.9913 | 0.9991 | 0.9167 | 0.7413 | 0.9004 | 0.8249 | 0.8458 | ||
ravu-zoom-r3-chroma | 3.03E-03 | 40.4487 | 0.9887 | 0.9990 | 0.6087 | 0.6469 | 0.5806 | 0.7343 | 0.6426 | ||
ravu-zoom-r2-chroma | 3.10E-03 | 40.3055 | 0.9884 | 0.9990 | 0.5502 | 0.6083 | 0.5413 | 0.7185 | 0.6046 | ||
jointbilateral | 3.09E-03 | 39.2373 | 0.9883 | 0.9988 | 0.5563 | 0.3203 | 0.5193 | 0.5127 | 0.4771 | ||
fastbilateral | 3.10E-03 | 39.1749 | 0.9881 | 0.9988 | 0.5527 | 0.3035 | 0.5012 | 0.5258 | 0.4708 | ||
lanczos | 3.31E-03 | 39.4604 | 0.9869 | 0.9988 | 0.3781 | 0.3805 | 0.3500 | 0.5152 | 0.4059 | ||
polar_lanczossharp | 3.37E-03 | 39.2526 | 0.9865 | 0.9987 | 0.3362 | 0.3244 | 0.2931 | 0.4526 | 0.3516 | ||
bilinear | 3.78E-03 | 38.0491 | 0.9841 | 0.9982 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
Chroma upscaling is often seen as less important since the human visual system is less sensitive to chromatic variations, but all shaders managed to score higher overall than the built-in solutions. The numeric deltas obviously aren't as big this time, but the top scoring shader is still significantly ahead of Lanczos (main reference point since it's the default).
If we look at the numbers, CfL is the best shader on all metrics. The default 12-tap variant scores higher than all other shaders and only loses to itself on MAE and PSNR when the 4-tap regression is turned on. I've been chasing minor improvements for a while now and the shader became very good.
Krigbilateral comes right after that, and the reason it scores so low is that its PSNR score sucks. This shader is extremely hit or miss, and sometimes the pixels are just terribly off-target (easily seen in the hair ornament).
Ravu is also an option if you really dislike what the regression-based shaders do, as it's analogous to a very good polar filter and relatively free of weird artifacts.
You should probably skip Joint/FastBilateral unless you really can't run any of the other shaders due to performance constraints (and in that case you're better off with built-in filters anyway). This shader is kinda hit or miss just like Krig, but since it can't overshoot at all it tends to be worse than even the built-in solutions on very complex scenes.
To conclude, if you want to use a chroma shader you should probably use CfL. My opinion on this matter may be biased since I'm the author of the shader though, so feel free to do your own tests.
Antiringing is a topic that I hadn't covered in the previous iteration of this page, but now that we have more than a single option we can also compare the available solutions.
In short, antiringing filters attempt to remove overshoots generated by sharp resampling filters when they meet a sharp intensity delta. What is commonly referred to as ringing is simply consequential to the filter's impulse response.
The following image shows this very well:
The negative weights in the filter are there for it to be able to quickly respond to high-frequency transitions, but they make the filter overshoot a little bit before reaching its final destination. The "intensity" of the ringing is directly related to the magnitude of the secondary lobes. The second lobe, which is almost always negative, is responsible for the overshooting you can see in this example, but filters with more lobes ring once per lobe, and the ringing can be "positive" as well (within the range set by the original pixels) with positive lobes. The "length" of the rings is directly related to the length of the lobes, which is why filters like polar lanczos have "longer" rings (the zero crossings don't fall exactly at the integers, but rather slightly after them).
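The overshoot is easy to reproduce numerically. The following Python sketch upsamples a 1D step edge at 2x with an orthogonal 3-lobe lanczos kernel and then applies a very naive antiringing clamp. Everything here is an assumption made for illustration: the clamp to the two nearest source pixels and the linear `antiring` strength are not how libplacebo's AR or Pixel Clipper work internally.

```python
import math


def lanczos(x: float, radius: int = 3) -> float:
    """Lanczos kernel: sinc(x) * sinc(x / radius) inside the window."""
    if x == 0.0:
        return 1.0
    if abs(x) >= radius:
        return 0.0
    px = math.pi * x
    return radius * math.sin(px) * math.sin(px / radius) / (px * px)


def sample(signal: list[float], pos: float, radius: int = 3, antiring: float = 0.0) -> float:
    """Resample the signal at a fractional position, optionally clamping overshoots."""
    last = len(signal) - 1
    base = math.floor(pos)
    acc = 0.0
    weight_sum = 0.0
    for k in range(base - radius + 1, base + radius + 1):
        w = lanczos(pos - k, radius)
        acc += w * signal[min(max(k, 0), last)]  # repeat edge pixels
        weight_sum += w
    value = acc / weight_sum
    # Naive AR: limit the result to the range of the two nearest source pixels.
    left = signal[min(max(base, 0), last)]
    right = signal[min(max(base + 1, 0), last)]
    clamped = min(max(value, min(left, right)), max(left, right))
    return (1.0 - antiring) * value + antiring * clamped


def upsample_2x(signal: list[float], antiring: float = 0.0) -> list[float]:
    """Double the sample count while keeping pixel centers aligned."""
    return [sample(signal, i / 2 - 0.25, antiring=antiring) for i in range(2 * len(signal))]


step = [0.0] * 8 + [1.0] * 8
plain = upsample_2x(step)
fixed = upsample_2x(step, antiring=1.0)
print(min(plain), max(plain))  # falls outside the original [0, 1] range
print(min(fixed), max(fixed))  # stays inside the original range
```

Intermediate `antiring` values blend between the two outputs, which is one simple way to think about the strength settings compared below.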
The methodology here is almost identical to the one used for upsampling, with the only difference being that we have to include
AR is only really necessary when you're using sharp filters. It makes no sense alongside blurry filters because they don't ring hard enough for it to be noticeable. There are a few sharp memes that are worth trying with AR if you feel adventurous, but generally speaking I think polar lanczossharp is pretty well balanced (a bit blurry even).
I'm including my AR shader in this comparison because I think it's better at keeping everything but the overshoots intact. This does sometimes create weird artifacts on sharp transitions, especially if you use it at ludicrous scaling factors, but the output is generally good enough on real content.
Filter | MAE | PSNR | SSIM | MS-SSIM | MAE (N) | PSNR (N) | SSIM (N) | MS-SSIM (N) | Mean | ||
polar_lanczossharp_ar_060 | 4.62E-03 | 38.3812 | 0.9826 | 0.9984 | 0.8805 | 0.8303 | 0.9474 | 0.5981 | 0.8141 | ||
polar_lanczossharp_pc_080 | 4.65E-03 | 38.3944 | 0.9825 | 0.9984 | 0.7577 | 1.0000 | 0.8078 | 0.6900 | 0.8139 | ||
polar_lanczossharp_ar_055 | 4.63E-03 | 38.3829 | 0.9826 | 0.9984 | 0.8318 | 0.8527 | 0.9138 | 0.6534 | 0.8129 | ||
polar_lanczossharp_pc_085 | 4.64E-03 | 38.3931 | 0.9825 | 0.9984 | 0.7828 | 0.9829 | 0.8244 | 0.6539 | 0.8110 | ||
polar_lanczossharp_ar_065 | 4.61E-03 | 38.3782 | 0.9827 | 0.9983 | 0.9228 | 0.7924 | 0.9731 | 0.5380 | 0.8066 | ||
polar_lanczossharp_ar_050 | 4.64E-03 | 38.3834 | 0.9826 | 0.9984 | 0.7784 | 0.8586 | 0.8732 | 0.7045 | 0.8037 | ||
polar_lanczossharp_pc_075 | 4.66E-03 | 38.3939 | 0.9825 | 0.9984 | 0.7191 | 0.9934 | 0.7781 | 0.7222 | 0.8032 | ||
polar_lanczossharp_pc_090 | 4.64E-03 | 38.3904 | 0.9825 | 0.9984 | 0.7989 | 0.9483 | 0.8331 | 0.6221 | 0.8006 | ||
polar_lanczossharp_ar_070 | 4.60E-03 | 38.3741 | 0.9827 | 0.9983 | 0.9563 | 0.7400 | 0.9906 | 0.4747 | 0.7904 | ||
polar_lanczossharp_pc_070 | 4.67E-03 | 38.3934 | 0.9824 | 0.9984 | 0.6775 | 0.9875 | 0.7445 | 0.7500 | 0.7899 | ||
polar_lanczossharp_ar_045 | 4.66E-03 | 38.3826 | 0.9825 | 0.9984 | 0.7183 | 0.8491 | 0.8231 | 0.7515 | 0.7855 | ||
polar_lanczossharp_pc_095 | 4.64E-03 | 38.3850 | 0.9825 | 0.9984 | 0.8039 | 0.8794 | 0.8319 | 0.5998 | 0.7787 | ||
polar_lanczossharp_pc_065 | 4.68E-03 | 38.3921 | 0.9824 | 0.9984 | 0.6305 | 0.9702 | 0.7055 | 0.7747 | 0.7702 | ||
polar_lanczossharp_ar_075 | 4.59E-03 | 38.3685 | 0.9827 | 0.9983 | 0.9827 | 0.6682 | 1.0000 | 0.4073 | 0.7646 | ||
polar_lanczossharp_pc_100 | 4.64E-03 | 38.3808 | 0.9825 | 0.9984 | 0.8014 | 0.8251 | 0.8272 | 0.5919 | 0.7614 | ||
polar_lanczossharp_ar_040 | 4.68E-03 | 38.3803 | 0.9824 | 0.9984 | 0.6523 | 0.8197 | 0.7634 | 0.7948 | 0.7575 | ||
polar_lanczossharp_pc_060 | 4.69E-03 | 38.3898 | 0.9823 | 0.9984 | 0.5876 | 0.9415 | 0.6656 | 0.7990 | 0.7484 | ||
polar_lanczossharp_ar_080 | 4.59E-03 | 38.3615 | 0.9827 | 0.9983 | 0.9960 | 0.5779 | 0.9981 | 0.3335 | 0.7264 | ||
polar_lanczossharp_pc_055 | 4.70E-03 | 38.3878 | 0.9823 | 0.9984 | 0.5422 | 0.9158 | 0.6240 | 0.8203 | 0.7256 | ||
polar_lanczossharp_ar_035 | 4.69E-03 | 38.3765 | 0.9824 | 0.9984 | 0.5820 | 0.7701 | 0.6947 | 0.8333 | 0.7200 | ||
polar_lanczossharp_pc_050 | 4.72E-03 | 38.3835 | 0.9822 | 0.9984 | 0.4902 | 0.8598 | 0.5697 | 0.8427 | 0.6906 | ||
polar_lanczossharp_ar_085 | 4.59E-03 | 38.3533 | 0.9827 | 0.9983 | 1.0000 | 0.4731 | 0.9879 | 0.2573 | 0.6796 | ||
polar_lanczossharp_ar_030 | 4.71E-03 | 38.3716 | 0.9823 | 0.9984 | 0.5073 | 0.7070 | 0.6182 | 0.8690 | 0.6753 | ||
polar_lanczossharp_pc_045 | 4.73E-03 | 38.3806 | 0.9822 | 0.9984 | 0.4285 | 0.8230 | 0.5117 | 0.8494 | 0.6532 | ||
polar_lanczossharp_ar_025 | 4.73E-03 | 38.3656 | 0.9822 | 0.9984 | 0.4296 | 0.6301 | 0.5353 | 0.9001 | 0.6238 | ||
polar_lanczossharp_ar_090 | 4.59E-03 | 38.3439 | 0.9827 | 0.9982 | 0.9922 | 0.3529 | 0.9688 | 0.1754 | 0.6223 | ||
polar_lanczossharp_pc_040 | 4.75E-03 | 38.3763 | 0.9821 | 0.9984 | 0.3813 | 0.7680 | 0.4600 | 0.8662 | 0.6189 | ||
polar_lanczossharp_pc_035 | 4.76E-03 | 38.3720 | 0.9821 | 0.9984 | 0.3320 | 0.7127 | 0.4066 | 0.8836 | 0.5837 | ||
polar_lanczossharp_ar_020 | 4.75E-03 | 38.3581 | 0.9821 | 0.9985 | 0.3488 | 0.5339 | 0.4432 | 0.9280 | 0.5635 | ||
polar_lanczossharp_ar_095 | 4.59E-03 | 38.3329 | 0.9826 | 0.9982 | 0.9704 | 0.2117 | 0.9398 | 0.0888 | 0.5527 | ||
polar_lanczossharp_pc_030 | 4.77E-03 | 38.3667 | 0.9820 | 0.9984 | 0.2725 | 0.6445 | 0.3413 | 0.8941 | 0.5381 | ||
polar_lanczossharp_ar_015 | 4.78E-03 | 38.3496 | 0.9820 | 0.9985 | 0.2660 | 0.4253 | 0.3446 | 0.9518 | 0.4969 | ||
polar_lanczossharp_pc_025 | 4.79E-03 | 38.3606 | 0.9819 | 0.9984 | 0.2203 | 0.5660 | 0.2776 | 0.9081 | 0.4930 | ||
polar_lanczossharp_ar_100 | 4.60E-03 | 38.3206 | 0.9826 | 0.9982 | 0.9369 | 0.0539 | 0.9039 | 0.0000 | 0.4737 | ||
polar_lanczossharp_pc_020 | 4.80E-03 | 38.3541 | 0.9819 | 0.9985 | 0.1672 | 0.4828 | 0.2127 | 0.9175 | 0.4451 | ||
polar_lanczossharp_ar_010 | 4.80E-03 | 38.3398 | 0.9819 | 0.9985 | 0.1803 | 0.2999 | 0.2376 | 0.9717 | 0.4224 | ||
polar_lanczossharp_pc_015 | 4.82E-03 | 38.3473 | 0.9818 | 0.9985 | 0.1135 | 0.3962 | 0.1430 | 0.9272 | 0.3950 | ||
polar_lanczossharp_pc_010 | 4.83E-03 | 38.3407 | 0.9817 | 0.9985 | 0.0691 | 0.3107 | 0.0826 | 0.9356 | 0.3495 | ||
polar_lanczossharp_ar_005 | 4.82E-03 | 38.3288 | 0.9818 | 0.9985 | 0.0916 | 0.1580 | 0.1233 | 0.9884 | 0.3403 | ||
polar_lanczossharp_pc_005 | 4.84E-03 | 38.3330 | 0.9817 | 0.9985 | 0.0240 | 0.2130 | 0.0178 | 0.9423 | 0.2993 | ||
polar_lanczossharp | 4.84E-03 | 38.3164 | 0.9817 | 0.9985 | 0.0000 | 0.0000 | 0.0000 | 1.0000 | 0.2500 |
Previously, this section talked about the different strengths and weaknesses of Pixel Clipper and libplacebo's AR, but that's not really necessary anymore. Libplacebo's AR has been recently updated and it looks great now. It's actually shocking how similar the solutions look at 2x considering how different the mechanisms are.
The only caveat is that libplacebo's AR is still a bit blurrier at similar strengths, so keep that in mind.
As you can see in the table above, the sweetspot for libplacebo's AR seems to be around 0.6 taking all the metrics into account. I could end the commentary here but if you pay attention you can also see that 0.6 isn't at the top of any of the individual metrics, but it does score above average on all of them. This is actually pretty interesting as these metrics tell us different things:
If we choose to focus on MAE, the sweetspot seems to be around 0.85. MAE is very good at telling us what is numerically closer to the reference without biasing this towards any specific type of artifacts (regardless of how humans perceive them).
If you only want to know how bad the worst pixels are you can focus on PSNR instead, and in this case Pixel Clipper seems to be much better than libplacebo's AR (presumably because the latter is smoother). The sweetspot seems to be around 0.8 here for PC, and 0.5 for libplacebo's AR.
SSIM is good at telling us about sharpness and the whole structural profile of the picture. It doesn't care too much about brightness or contrast variations, but it'll heavily punish blurriness or sharp transitions that shouldn't exist. We can see that libplacebo's AR is better than PC here (again because it's smoother), with 0.75 strength at the top.
MS-SSIM is just SSIM at different scales to emulate a viewer looking at the picture from different distances. This metric tends to correlate better to the human perception of quality than the other 3, and if you're not pixel peeping to catch the ringing this is probably a better representation of reality. For this metric, the picture with no AR at all is the top scorer. FIR filters need to overshoot a little bit to reach some intensity levels and in this case (since polar lanczossharp isn't that sharp to begin with), the ringing doesn't seem to be a big enough problem for it to hurt the score. This isn't true for all filters and it might not be true for all scaling factors either, but for polar lanczossharp at 2x you could easily make an argument that AR is probably unnecessary.
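To see why MAE and PSNR can disagree, consider two images with the same total error: one where it's spread evenly and one where it's concentrated in a single pixel. The Python sketch below uses the textbook definitions with values normalised to [0, 1]; the tool that actually produced the tables may differ in details, and the sample data is made up.

```python
import math


def mae(reference: list[float], test: list[float]) -> float:
    """Mean absolute error between two equally sized pixel lists."""
    return sum(abs(r - t) for r, t in zip(reference, test)) / len(reference)


def psnr(reference: list[float], test: list[float], peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB, based on the mean squared error."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0.0:
        return math.inf
    return 10 * math.log10(peak ** 2 / mse)


reference = [0.5] * 100
spread = [0.504] * 100               # every pixel is slightly off
concentrated = [0.5] * 99 + [0.9]    # a single pixel is badly off

print(mae(reference, spread), mae(reference, concentrated))
print(psnr(reference, spread), psnr(reference, concentrated))
```

Both test images have practically the same MAE, but squaring the errors makes PSNR drop sharply for the one with the single bad pixel, which is why it's the metric to watch if you care about the worst offenders.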
To conclude, my opinion is that you can probably skip AR unless you're using a very sharp filter. If you want to use AR though, the ideal strength will depend on the choice of filter, scaling factor and content type, but something around 0.6 should be pretty safe (if you want to use PC for the higher PSNR you should probably use it at higher strengths like 0.8).
Pixel Clipper actually has a downsampling variant as well, which I believe is probably less broken than mpv's native
The following image will be used for downsampling AR tests:
My current method to evaluate downsampling filters is to concede that at 0.5x linear light box is as good as it gets, since in this case we're just averaging 4 pixels together. We can't use box itself for other scaling factors though, so the aim is to find a filter that produces a similar result but is also usable at any other arbitrary scaling factor.
To eliminate any differences caused by how differently mpv and ImageMagick perform linear light conversion, the box reference was also obtained using mpv.
The images were basically generated using the following commands:
mpv --no-config --vo=gpu-next --no-hidpi-window-scale --window-scale=0.5 --pause=yes --screenshot-format=png --linear-downscaling --correct-downscaling --deband=no --dither-depth=no --screenshot-high-bit-depth=no --dscale=filter --glsl-shader="Pixel Clipper_downscaling.glsl" higres.png
mpv --no-config --vo=gpu-next --no-hidpi-window-scale --window-scale=0.5 --pause=yes --screenshot-format=png --linear-downscaling --correct-downscaling --deband=no --dither-depth=no --screenshot-high-bit-depth=no --dscale=filter --scale-antiring=1 higres.png
Please note that you have to set
Filter | MAE | PSNR | SSIM | MS-SSIM | MAE (N) | PSNR (N) | SSIM (N) | MS-SSIM (N) | Mean | ||
lanczos_pc_100 | 2.73E-03 | 42.0482 | 0.9950 | 0.9994 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | ||
lanczos_pc_75 | 3.04E-03 | 41.3289 | 0.9945 | 0.9994 | 0.8532 | 0.8748 | 0.9223 | 0.9429 | 0.8983 | ||
lanczos_pc_50 | 3.35E-03 | 40.1205 | 0.9938 | 0.9993 | 0.7074 | 0.6644 | 0.8048 | 0.8127 | 0.7473 | ||
polar_lanczossharp_ar_100 | 3.28E-03 | 40.3036 | 0.9930 | 0.9993 | 0.7381 | 0.6962 | 0.6898 | 0.7798 | 0.7260 | ||
polar_lanczossharp_pc_100 | 3.45E-03 | 40.1487 | 0.9927 | 0.9993 | 0.6562 | 0.6693 | 0.6369 | 0.7703 | 0.6832 | ||
polar_lanczossharp_ar_75 | 3.56E-03 | 39.8009 | 0.9926 | 0.9993 | 0.6036 | 0.6087 | 0.6265 | 0.7744 | 0.6533 | ||
polar_lanczossharp_pc_75 | 3.82E-03 | 39.5368 | 0.9921 | 0.9992 | 0.4812 | 0.5627 | 0.5379 | 0.6968 | 0.5697 | ||
polar_lanczossharp_ar_50 | 3.97E-03 | 38.7195 | 0.9917 | 0.9992 | 0.4127 | 0.4204 | 0.4776 | 0.6237 | 0.4836 | ||
polar_lanczossharp_pc_50 | 4.17E-03 | 38.4965 | 0.9911 | 0.9991 | 0.3134 | 0.3816 | 0.3921 | 0.5300 | 0.4043 | ||
lanczos | 3.93E-03 | 37.7231 | 0.9917 | 0.9990 | 0.4286 | 0.2469 | 0.4812 | 0.4032 | 0.3900 | ||
polar_lanczossharp | 4.83E-03 | 36.3049 | 0.9886 | 0.9988 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
The first thing we can see is that AR seems to be a net-positive in general, and the score differences we see here are much larger than their upsampling counterparts.
It's also good to see that libplacebo's AR seems to work even better when downsampling, as it can remove ringing without introducing any significant blur. This reinforces the idea that there's little reason to use Pixel Clipper over libplacebo's native AR if you're using polar filters.
It's also worth mentioning that downsampling AR only makes sense when you're using a sharp filter. Blurry filters such as mitchell do not benefit nearly as much from this (since there's almost no ringing to remove), and the remark that some mild ringing can be perceived as a positive thing by the human eye is also valid here. Personally I'd say you probably need some AR for anything sharper than catrom, but catrom itself is still perfectly usable without AR.
Benchmarking mpv performance is a bit tricky due to a few things:
In any case, the following numbers were produced with these mpv settings on a 720p anime episode (~24 minutes long) using a 5600X+6600XT+Windows 11 machine:
Measure-Command { mpv --no-config --vo=gpu-next --gpu-api=vulkan --audio=no --untimed=yes --video-sync=display-desync --vulkan-swap-mode=immediate --window-scale=2.0 --fullscreen=yes }
Preset/Shader | FPS | Time | Relative |
fast | 2057 | 41.71 | 1.0000 |
default | 1615 | 53.13 | 0.7851 |
fastbilateral | 1605 | 53.46 | 0.7802 |
jointbilateral | 1544 | 55.57 | 0.7506 |
cfl_12 | 1529 | 56.10 | 0.7434 |
cfl_4 | 1526 | 56.21 | 0.7420 |
cfl_12+4 | 1525 | 56.27 | 0.7413 |
ravu-lite-r2 | 1453 | 59.04 | 0.7065 |
EASU | 1447 | 59.29 | 0.7035 |
krigbilateral | 1417 | 60.55 | 0.6888 |
ravu-lite-ar-r2 | 1411 | 60.81 | 0.6859 |
ravu-lite-r3 | 1384 | 61.98 | 0.6730 |
ravu-lite-ar-r3 | 1369 | 62.68 | 0.6654 |
ravu-zoom-r2 | 1364 | 62.89 | 0.6632 |
high-quality | 1351 | 63.50 | 0.6568 |
ravu-lite-r4 | 1292 | 66.40 | 0.6281 |
ravu-lite-ar-r4 | 1287 | 66.66 | 0.6257 |
ravu-zoom-ar-r2 | 1278 | 67.16 | 0.6210 |
FSR | 1255 | 68.35 | 0.6102 |
high-quality+AR | 1178 | 72.86 | 0.5724 |
pixelclipper | 1127 | 76.13 | 0.5479 |
ravu-zoom-r3 | 1088 | 78.83 | 0.5291 |
nnedi3-nns32-win8x4 | 1008 | 85.13 | 0.4900 |
ravu-zoom-ar-r3 | 1006 | 85.33 | 0.4888 |
FSRCNNX_x2_8-0-4-1 | 912 | 94.08 | 0.4433 |
nnedi3-nns64-win8x4 | 640 | 134.07 | 0.3111 |
FSRCNNX_x2_16-0-4-1 | 359 | 238.75 | 0.1747 |
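The "Relative" column appears to be the fastest run's time divided by each entry's time, which can be checked with a few rows from the table. The sketch below also converts FPS into an average per-frame cost. The unit of the "Time" column isn't stated here (seconds is an assumption), but the ratio doesn't depend on it.

```python
def relative_speed(times: dict[str, float]) -> dict[str, float]:
    """Express each run as a fraction of the fastest run's speed."""
    fastest = min(times.values())
    return {name: fastest / t for name, t in times.items()}


def frame_time_ms(fps: float) -> float:
    """Average time spent per frame, in milliseconds."""
    return 1000.0 / fps


# A few "Time" values copied from the table.
times = {
    "fast": 41.71,
    "default": 53.13,
    "ravu-zoom-ar-r3": 85.33,
    "FSRCNNX_x2_16-0-4-1": 238.75,
}
for name, rel in relative_speed(times).items():
    print(f"{name}: {rel:.4f}")

# FPS of ravu-zoom-ar-r3 from the table.
print(f"{frame_time_ms(1006):.3f} ms per frame")
```

Looking at per-frame cost rather than FPS makes it easier to compare against the frame budget of your display.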
Please keep in mind that the choice of built-in filter doesn't matter that much as the weights get stored in a LUT, so something like lanczos and spline36 have virtually identical performance. In short, filters with larger radii will obviously be slower, polar filters are slower than their orthogonal counterparts and AR obviously also makes things slower.
For the shaders, the numbers are more or less what I expected, but it's nice to see that in the grand scheme of things these shaders don't really slow the playback down that much. Good old polar lanczossharp with AR is already as slow as most shaders.
Still though, the conclusion here seems to be that if you're running mpv on a semi-decent computer, these shaders will never be a problem.
Please note that you can make it go even faster with hardware decoding, but
I want to make it clear though that you shouldn't take the results as gospel. Mathematical image quality metrics do not always correlate perfectly with how humans perceive image quality, and your personal preference is entirely subjective. You should take this page for what it is, a piece of research that produces numbers, but you should not take these numbers at face value before understanding what they actually mean.