Andrey Norkin's page

AFGS1 specification

AOMedia has recently finalized the AFGS1 specification, which defines a standalone film grain synthesis model. The specification makes it possible to apply the film grain synthesis process used in AV1 to other video codecs. It uses an ITU-T T.35 SEI message to transport the model parameters. More details about the new specification and the rationale for its design choices are available here.

AV1 decoder model

Most modern video codecs have some form of a decoder model, sometimes called a video buffering verifier (VBV) or a hypothetical reference decoder (HRD). Decoder models improve interoperability and may also tell a decoder when to start decoding a frame so that it can be displayed on time. The AV1 video codec also defines a decoder model, coupled with a system of profiles and levels. This post describes some basic AV1 concepts and the AV1 decoder model, and explains the reasoning behind the decisions made when developing the model. More details here.
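
To give a feel for what a buffer-based decoder model checks, here is a minimal sketch of a generic leaky-bucket simulation of the kind VBV/HRD-style models build on. It is not the AV1 decoder model itself: the function name, the parameters, and the simplification that bits arrive at a constant rate and each frame is removed instantaneously at its decode time are all assumptions made for illustration.

```python
def simulate_buffer(frame_bits, bitrate, buffer_size, frame_rate, initial_delay):
    """Simulate a generic leaky-bucket buffer (hypothetical helper, not AV1's model).

    Bits arrive at `bitrate` bits per second. Frame i is removed from the
    buffer all at once at time initial_delay + i / frame_rate.
    Returns (ok, trace), where trace holds the fullness after each removal.
    """
    # Bits accumulated before the first frame is decoded, capped by the buffer size.
    fullness = min(buffer_size, bitrate * initial_delay)
    fill_per_frame = bitrate / frame_rate
    trace = []
    for bits in frame_bits:
        if bits > fullness:
            # Underflow: the frame has not fully arrived by its decode time.
            return False, trace
        fullness -= bits
        trace.append(fullness)
        # Bits that arrive before the next decode time; arrival stalls when full.
        fullness = min(buffer_size, fullness + fill_per_frame)
    return True, trace


if __name__ == "__main__":
    ok, trace = simulate_buffer(
        frame_bits=[400, 100, 100],
        bitrate=3000,
        buffer_size=1000,
        frame_rate=30,
        initial_delay=0.25,
    )
    print(ok, trace)
```

A stream that passes such a check for the buffer size and bitrate of a given level can be decoded on time by any decoder that meets that level, which is the interoperability guarantee these models are meant to provide.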

Deep learning for speeding up a video encoder

Deep learning can actually make video encoding faster. In this work, we used a CNN to speed up VP9 intra-frame encoding by more than three times at a cost of 1.71% higher bitrate at the same quality. A fully convolutional network is used in the video encoder to predict how to partition a superblock without performing the rate-distortion optimization (RDO) search recursively. More details here.

Film grain synthesis in AV1

Film grain is widely present in motion picture and TV material and is often a part of the creative intent. Since film grain shows a high degree of randomness, it is difficult to compress efficiently. The AV1 video codec has mandatory support for film grain synthesis, which makes it possible to model film grain or sensor noise in the encoder, denoise the video before encoding, and add the synthesized grain after decoding. This enables significant bitrate savings on grainy content. More details here.
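
The decoder-side part of this idea can be sketched in a few lines. The example below is a simplified illustration, not the normative AV1 process: it generates a grain field with a causal autoregressive filter driven by Gaussian noise and adds it to a decoded frame with an intensity-dependent, piecewise-linear strength. All function names, the coefficient layout, and the use of Python's `random` module are assumptions made for this sketch. The actual codec uses its own pseudo-random generator, integer arithmetic, and block-based grain templates.

```python
import random


def synthesize_grain(height, width, ar_coeffs, seed=0, sigma=1.0):
    """Generate grain with a causal AR filter (illustrative, floating point).

    ar_coeffs maps (dy, dx) offsets of already generated samples
    (dy < 0, or dy == 0 and dx < 0) to filter weights.
    """
    rng = random.Random(seed)
    grain = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            value = rng.gauss(0.0, sigma)
            for (dy, dx), coeff in ar_coeffs.items():
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    value += coeff * grain[yy][xx]
            grain[y][x] = value
    return grain


def scaling_strength(luma, points):
    """Piecewise-linear grain strength; points is a sorted list of (luma, strength)."""
    if luma <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if luma <= x1:
            return y0 + (y1 - y0) * (luma - x0) / (x1 - x0)
    return points[-1][1]


def add_grain(frame, grain, points, bit_depth=8):
    """Add scaled grain to a decoded luma frame and clip to the valid range."""
    max_value = (1 << bit_depth) - 1
    out = []
    for frame_row, grain_row in zip(frame, grain):
        out.append([
            min(max(round(p + scaling_strength(p, points) * g), 0), max_value)
            for p, g in zip(frame_row, grain_row)
        ])
    return out


if __name__ == "__main__":
    decoded = [[128] * 8 for _ in range(4)]
    grain = synthesize_grain(4, 8, {(0, -1): 0.3, (-1, 0): 0.3}, seed=7)
    print(add_grain(decoded, grain, [(0, 2.0), (255, 6.0)]))
```

Only the filter coefficients, the scaling points, and a seed need to be transmitted, which is why signaling the grain model costs far fewer bits than coding the grain itself.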

HDR video pre-processing

One of the options for streaming high dynamic range (HDR) video is HDR10, which uses HEVC Main10 encoding, the ST 2084 transfer function, the BT.2020 color space, and the Y'CbCr 4:2:0 non-constant luminance color format. The combination of non-constant luminance Y'CbCr 4:2:0 with a highly non-linear transfer function can sometimes cause artifacts in saturated colors at the boundaries of the color gamut. A fast algorithm that helps alleviate this problem via a closed-form solution has been proposed. More details here.
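
The source of the problem can be reproduced numerically. The sketch below illustrates the artifact only and does not implement the proposed closed-form solution. It converts two neighbouring saturated pixels to PQ-coded non-constant luminance Y'CbCr with BT.2020 coefficients, lets them share averaged chroma, and compares the reconstructed linear luminance with the original. Averaging two pixels is an assumed stand-in for real 4:2:0 resampling filters, and all function names are hypothetical.

```python
# BT.2020 luma coefficients and ST 2084 (PQ) constants.
KR, KG, KB = 0.2627, 0.6780, 0.0593
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_encode(nits):
    """Inverse EOTF: linear light in cd/m^2 -> non-linear value in [0, 1]."""
    ym = (max(nits, 0.0) / 10000.0) ** M1
    return ((C1 + C2 * ym) / (1.0 + C3 * ym)) ** M2


def pq_decode(value):
    """EOTF: non-linear value in [0, 1] -> linear light in cd/m^2."""
    v = min(max(value, 0.0), 1.0) ** (1.0 / M2)
    return 10000.0 * (max(v - C1, 0.0) / (C2 - C3 * v)) ** (1.0 / M1)


def luminance(rgb):
    """Linear luminance of a linear BT.2020 RGB triple."""
    r, g, b = rgb
    return KR * r + KG * g + KB * b


def to_ycbcr(rgb):
    """Linear RGB -> non-constant luminance Y'CbCr (luma from non-linear R'G'B')."""
    rp, gp, bp = (pq_encode(c) for c in rgb)
    yp = KR * rp + KG * gp + KB * bp
    return yp, (bp - yp) / 1.8814, (rp - yp) / 1.4746


def to_rgb(yp, cb, cr):
    """Y'CbCr -> linear RGB, clipping out-of-range R'G'B' inside pq_decode."""
    rp = yp + 1.4746 * cr
    bp = yp + 1.8814 * cb
    gp = (yp - KR * rp - KB * bp) / KG
    return tuple(pq_decode(c) for c in (rp, gp, bp))


def luminance_after_chroma_sharing(pixel, neighbour):
    """Luminance of `pixel` once it shares averaged chroma with `neighbour`."""
    yp, cb_a, cr_a = to_ycbcr(pixel)
    _, cb_b, cr_b = to_ycbcr(neighbour)
    return luminance(to_rgb(yp, (cb_a + cb_b) / 2, (cr_a + cr_b) / 2))


if __name__ == "__main__":
    red = (100.0, 0.1, 0.1)    # saturated red, linear light in cd/m^2
    blue = (0.1, 0.1, 100.0)   # saturated blue neighbour
    print("original luminance:", luminance(red))
    print("after chroma sharing:", luminance_after_chroma_sharing(red, blue))
```

Because Y' is a weighted sum of non-linear components, it does not carry the true luminance for saturated colors, so a change in the shared chroma shifts the displayed luminance even though the luma sample itself is untouched.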

Deblocking filter in HEVC/H.265

Like many other video codecs, HEVC uses block-based prediction and transform coding. In such a coding scheme, discontinuities, called block artifacts, can occur in the reconstructed signal at the block boundaries. The HEVC/H.265 standard defines an in-loop deblocking filter that can significantly reduce these visual artifacts, improving perceptual video quality and contributing to more efficient compression. Various aspects of the HEVC deblocking filter design are discussed in this article.
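
As a small illustration of the kind of operation involved, the sketch below applies a simplified version of the normal (weak) luma filtering step to the two samples nearest a block edge. It is an incomplete, assumption-laden sketch rather than a conforming implementation: the boundary-strength derivation, the β-based on/off decisions, the strong filter, and the optional modification of `p1` and `q1` are all omitted, and the clipping threshold `tc` is simply passed in instead of being derived from the quantization parameter.

```python
def clip3(low, high, value):
    """Clamp value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def weak_filter_pair(p1, p0, q0, q1, tc, bit_depth=8):
    """Filter the samples p0 | q0 on either side of a block edge.

    p1, p0 lie on one side of the edge and q0, q1 on the other.
    Returns the (possibly modified) pair (p0, q0).
    """
    # Offset estimated from the step across the edge and the local slopes.
    delta = (9 * (q0 - p0) - 3 * (q1 - p1) + 8) >> 4
    # A large step is more likely a real image edge than a coding artifact.
    if abs(delta) >= 10 * tc:
        return p0, q0
    delta = clip3(-tc, tc, delta)
    max_value = (1 << bit_depth) - 1
    return clip3(0, max_value, p0 + delta), clip3(0, max_value, q0 - delta)


if __name__ == "__main__":
    # A small step across the edge is smoothed.
    print(weak_filter_pair(100, 100, 110, 110, tc=4))
    # A large step is left untouched.
    print(weak_filter_pair(50, 50, 200, 200, tc=4))
```

The clipping to `tc` limits how far the filter may move a sample, which helps preserve genuine detail when the quantization is fine and the expected blocking is small.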

Keeping perceived 3D scene proportions by adjusting virtual camera parameters

Stereo perception is a common factor in 3D displays, 360-degree video, and AR/VR applications. 3D and stereo material is typically optimized for a particular display size and viewing distance. When the content is shown on a different display size and/or from a different viewing distance, the perceived proportions of objects are distorted compared to the intended ones. This can make the scene look unnatural and even lead to eye strain and fatigue when viewing the content. This article proposes a solution for adapting 3D scene rendering parameters to new viewing conditions such that the perceived proportions of the objects in the 3D scene are the same as for the reference viewing conditions.
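
The distortion follows from basic stereo geometry, which the sketch below illustrates. It does not implement the adaptation method from the article. It only uses the standard similar-triangles relation between screen parallax p, viewing distance D, and eye separation e, Z = e * D / (e - p), to show how perceived depth changes when the same content is shown on a scaled display. The function names and the default eye separation of 0.065 m are assumptions made for this example.

```python
def perceived_distance(parallax, viewing_distance, eye_separation=0.065):
    """Distance from the viewer at which a point with the given screen parallax appears.

    parallax is the horizontal offset on the screen between the right-eye and
    left-eye images of the point, in metres (positive means behind the screen).
    """
    if parallax >= eye_separation:
        raise ValueError("parallax must be smaller than the eye separation")
    return eye_separation * viewing_distance / (eye_separation - parallax)


def relative_depth(parallax, display_scale=1.0, eye_separation=0.065):
    """Perceived distance divided by viewing distance after scaling the display.

    Scaling the display scales every on-screen parallax by the same factor.
    """
    return perceived_distance(parallax * display_scale, 1.0, eye_separation)


if __name__ == "__main__":
    authored_parallax = 0.03  # metres on the reference display
    print("reference display:", relative_depth(authored_parallax))
    print("display at 10% size:", relative_depth(authored_parallax, 0.1))
```

On-screen widths scale linearly with the display, while perceived depth depends on parallax through the non-linear term e / (e - p), so depth and width no longer change in the same proportion. This is why the rendering parameters need to be adapted to the new viewing conditions.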