Compressed Video Quality Assessment Challenge 2024
Participate
Here you can register your team and submit your own VQA or IQA method scores to take part in this challenge. The results will appear on this page as well as on the leaderboard.
You can download the dataset via the links in the Download Data section. You may also use any public datasets to train your method.
Please note that the login form is located at the bottom of this page. We kindly ask you to scroll down to access the form and log in.
MATLAB solutions are not allowed; this will be checked at the end of the challenge.
Before the final evaluation on the hidden test data (which will be shared a week before the competition ends), you need to choose a single final submission on the test data and send it as an archive containing the corresponding method launch script, the weights checkpoint (if needed), and a Dockerfile to the challenge email (compressed-vqa-challenge-2024@videoprocessing.ai).
Submitting
- Compute your method scores for all the distorted videos of the public test (validation) data (210 videos). We kindly suggest converting the challenge dataset videos from MP4 to raw YUV format before feeding them to your method, as this may improve correlation with the subjective scores:
ffmpeg -i {video name}.mp4 -pix_fmt yuv420p -vcodec rawvideo -f rawvideo {video name}.yuv
- Create a JSON file with the following nested structure (a sketch that assembles such a file appears after this list):
{ ... "<Original sequence>": { "<Preset>": { "<Codec>": { "<CRF>": <Method Value> } } }, ... }
Example
{ "crowd-run-2019": { "fast": { "x264": { "1000": 0.5, "2000": 0.7 }, "vvenc": {...} }, "slow": {...} } }
- Send this file to the submission acceptance form by clicking on the “Upload File” button. You should have a total of 210 metric scores (all validation set videos).
- Fill in the “Method Type” field using the following format: “<NR/FR>_<Image/Video>”.
  - NR/FR - No-Reference method (does not need GT frames for evaluation) or Full-Reference method (needs GT frames for evaluation)
  - Image/Video - whether the method requires image input or video input
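For illustration, here is a minimal Python sketch of how a submission file with this structure could be assembled. The compute_score function and the directory layout are hypothetical placeholders, not part of the challenge kit; adapt them to however you store the validation videos.

import json
from pathlib import Path

def compute_score(video_path: Path) -> float:
    # Hypothetical placeholder: run your VQA/IQA method on one distorted video.
    raise NotImplementedError

# Hypothetical layout: validation/<sequence>/<preset>/<codec>/<crf>.mp4
root = Path("validation")
submission = {}
for video in sorted(root.glob("*/*/*/*.mp4")):
    sequence, preset, codec = video.parts[-4], video.parts[-3], video.parts[-2]
    crf = video.stem
    score = compute_score(video)
    (submission.setdefault(sequence, {})
               .setdefault(preset, {})
               .setdefault(codec, {}))[crf] = score

# The public test (validation) set contains 210 distorted videos in total.
total = sum(len(crfs) for presets in submission.values()
            for codecs in presets.values() for crfs in codecs.values())
assert total == 210, f"expected 210 scores, got {total}"

with open("submission.json", "w") as f:
    json.dump(submission, f, indent=2)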
Evaluation
The evaluation compares the predictions with the reference ground-truth subjective scores, which were obtained by pairwise crowdsourced subjective comparison. To learn more about the subjective quality evaluation procedure we used, see the FAQ section at Subjectify.us.
We use the Spearman rank-order correlation coefficient (SROCC), the Pearson linear correlation coefficient (PLCC), and the Kendall rank-order correlation coefficient (KROCC).
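As a reference, the three coefficients for a single group can be computed with SciPy as in the following sketch (the score arrays are purely illustrative):

from scipy.stats import spearmanr, pearsonr, kendalltau

# Method scores and subjective scores for one group
# (one original video + one encoding preset); values are illustrative only.
predicted  = [0.51, 0.62, 0.70, 0.44, 0.83, 0.58]
subjective = [2.10, 2.80, 3.10, 1.90, 3.60, 2.50]

srocc, _ = spearmanr(predicted, subjective)
plcc, _ = pearsonr(predicted, subjective)
krocc, _ = kendalltau(predicted, subjective)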
Please pay attention: we calculate each correlation coefficient (SROCC, KROCC, PLCC) SEPARATELY for every original (pristine) video and encoding preset. Therefore, to obtain a single correlation for the whole dataset, we use the Fisher Z-transform to average the per-group correlations, weighted proportionally to group size, as follows:
1) Iterate through all the original videos; for each one, calculate the correlation coefficients once per unique preset of that video (i.e. for park-joy-2022 with 3 presets fast, medium, and slow we obtain 3 correlations)
2) Apply the inverse hyperbolic tangent (artanh) to each correlation value, replacing possible infinities with artanh(0.99999)
3) Apply a weighted arithmetic mean to the obtained values (for SROCC we use only groups of size >= 15; for PLCC and KROCC, groups of size >= 6)
4) Calculate the hyperbolic tangent (tanh) of the weighted mean, take its absolute value, and replace 0.99999 with 1
5) The obtained value represents the correlation between your method scores and the subjective scores on our dataset
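A minimal sketch of this aggregation, assuming the per-group correlations and group sizes have already been computed (function and argument names are illustrative; the authoritative code is in the GitHub repository mentioned below):

import numpy as np

def aggregate(correlations, group_sizes, min_size):
    # correlations: per-group correlation values (one per original video x preset)
    # group_sizes:  number of videos in each group
    # min_size:     groups smaller than this are skipped (15 for SROCC, 6 for PLCC/KROCC)
    r = np.asarray(correlations, dtype=float)
    w = np.asarray(group_sizes, dtype=float)
    keep = w >= min_size
    r, w = r[keep], w[keep]
    # Step 2: inverse hyperbolic tangent; clipping avoids infinities at |r| = 1.
    z = np.arctanh(np.clip(r, -0.99999, 0.99999))
    # Step 3: weighted arithmetic mean with weights proportional to group size.
    z_mean = np.average(z, weights=w)
    # Step 4: back-transform, take the absolute value, snap 0.99999 to 1.
    result = abs(np.tanh(z_mean))
    return 1.0 if result >= 0.99999 else result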
You can find an example of how we calculate the VQA and IQA method correlations in the GitHub repository.
The final score is the post-processed average of the above correlation coefficients: (SROCC + PLCC + KROCC) / 3. As post-processing, we subtract 0.8 from this average, divide the result by 0.2, and cube it. This post-processing is motivated by the chosen subjective evaluation methodology, which differs from the traditional MOS and brings the correlation values closer to 1.
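In code form, the post-processing described above amounts to the following:

def final_score(srocc, plcc, krocc):
    # Average the three aggregated correlations, then rescale and cube.
    avg = (srocc + plcc + krocc) / 3
    return ((avg - 0.8) / 0.2) ** 3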