This is an implementation of "Video Enhancement Using Per-Pixel Virtual Exposures" (Bennett and McMillan, SIGGRAPH 2005): https://csbio.unc.edu/mcmillan/pubs/sig05_bennett.pdf

This paper develops a way to clarify underexposed video footage by applying
both a temporal bilateral filter and a spatial bilateral filter to it and tone
mapping the result.

First, input pixels are put through a tone mapping function run on the luminance
channel. The tone mapped result is discarded except for the ratio of the output
pixel value to the input pixel value (the gain ratio). This ratio is how many
pixels we aim to combine in the temporal and spatial phases of bilateral filtering.
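
For illustration, a minimal sketch of this step, assuming a simple power-law
curve as the global tone mapping function (the function names and the gamma
value here are my own, not taken from the paper or from this codebase):

    import numpy as np

    def tone_map(luminance, gamma=0.5):
        # Hypothetical global tone curve: a power law that brightens shadows.
        return np.power(luminance, gamma)

    def gain_ratio(luminance, gamma=0.5):
        # Ratio of the tone mapped value to the input value. This ratio is
        # the only thing kept from the tone mapping pass: it sets the target
        # number of pixels to combine in the bilateral filtering phases.
        eps = 1e-6  # avoid division by zero in fully black regions
        return tone_map(luminance, gamma) / (luminance + eps)

    # A pixel at 10% brightness is mapped up to ~32%, giving a gain ratio of
    # ~3.2: we aim to combine roughly three pixels' worth of signal.
    print(gain_ratio(0.1))  # ~3.16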

The temporal filter is run first. It combines pixels at the same location across
time, drawing on frames both before and after the current one. Its result is
calculated by weighting each neighboring pixel by the product of two gaussian
distributions: one over the neighboring pixel's distance in luminance value and
one over its distance in number of intervening frames. The sigma value for the
luminance gaussian is fixed. The sigma value for the gaussian over distance in
frames is a function of the result of the tone mapping function, namely the
target number of pixels we want to combine, defined as:

    target = 2 * gain_ratio * gaussianLuminance(perfect_match, sigma1)
                            * gaussianTemporalDistance(perfect_match, sigma2)

    where gain_ratio == the result of our tone mapping phase, and perfect_match == 0

Therefore, we dynamically choose our sigma2 based on how many pixels we need
to combine. If we need to combine a lot of pixels, we use a wider temporal
window of neighboring frames.
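
Continuing the sketch above, here is roughly what the temporal pass might look
like for a single pixel location. It assumes the temporal sigma can be derived
from the target via the total mass of a gaussian (a gaussian of sigma s sums to
roughly s * sqrt(2 * pi) over integer frame offsets); that heuristic and all
the names are mine, not the repository's:

    def gaussian(x, sigma):
        return np.exp(-(x * x) / (2.0 * sigma * sigma))

    def temporal_bilateral(frames, t, y, x, target, sigma_lum=0.1):
        # Filter pixel (y, x) of frame t against the same location in
        # neighboring frames. Returns the filtered value and the summed
        # weight (the normalizing denominator), which measures how many
        # pixels were effectively combined.
        sigma_t = max(target / np.sqrt(2.0 * np.pi), 0.5)  # wider window for larger targets
        radius = int(np.ceil(3.0 * sigma_t))
        center = frames[t][y, x]
        num, denom = 0.0, 0.0
        for dt in range(-radius, radius + 1):
            if 0 <= t + dt < len(frames):
                neighbor = frames[t + dt][y, x]
                w = gaussian(neighbor - center, sigma_lum) * gaussian(dt, sigma_t)
                num += w * neighbor
                denom += w
        return num / denom, denom

The denominator returned here is what the next step compares against the target
to decide how much spatial filtering is still needed.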

We sum all neighboring pixels, and the normalizing denominator of that sum (the
total bilateral weight) tells us how many pixels we have combined in the temporal
phase. If we are far short of the target, we apply a lot of spatial bilateral
filtering; if we are just short of the target, we apply some; if we have reached
the target, we apply none.
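
A rough sketch of that three-way decision plus a minimal spatial pass,
continuing the code above; the thresholds and sigma values are illustrative
placeholders, not the values this repository uses:

    def spatial_sigma(combined, target):
        # Map the temporal shortfall to a spatial kernel width.
        shortfall = target - combined
        if shortfall <= 0.0:
            return 0.0   # target reached: no spatial filtering
        elif shortfall < 0.5 * target:
            return 1.0   # just short: some spatial filtering
        else:
            return 3.0   # far short: a lot of spatial filtering

    def spatial_bilateral(frame, y, x, sigma_s, sigma_lum=0.1):
        # Minimal single-pixel spatial bilateral filter around (y, x).
        radius = int(np.ceil(3.0 * sigma_s))
        center = frame[y, x]
        num, denom = 0.0, 0.0
        for dy in range(-radius, radius + 1):
            for dx in range(-radius, radius + 1):
                yy, xx = y + dy, x + dx
                if 0 <= yy < frame.shape[0] and 0 <= xx < frame.shape[1]:
                    neighbor = frame[yy, xx]
                    w = (gaussian(neighbor - center, sigma_lum)
                         * gaussian(np.hypot(dy, dx), sigma_s))
                    num += w * neighbor
                    denom += w
        return num / denom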

Finally, we apply the tone mapping function to the filtered value, and the
output pixel value is the result.
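
Putting the stages together, the per-pixel pipeline amounts to something like
this (built on the hypothetical helpers sketched above, so the same caveats
apply):

    def enhance_pixel(frames, t, y, x):
        lum = float(frames[t][y, x])
        target = gain_ratio(lum)                 # how many pixels to combine
        value, combined = temporal_bilateral(frames, t, y, x, target)
        sigma_s = spatial_sigma(combined, target)
        if sigma_s > 0.0:                        # temporal pass fell short
            value = spatial_bilateral(frames[t], y, x, sigma_s)
        return tone_map(value)                   # final tone mapped output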

The current state of this project simplifies the paper in a few ways. Rather
than applying a locally variable tone mapping function, I use a global tone
mapping function. Instead of applying an exact amount of spatial bilateral
filtering, I apply only none, some, or a lot. Also, the paper was written with
RAW video footage in mind, with the hope of a real-time implementation on
cameras. Because noise patterns differ between compressed and raw footage, this
program would probably be more effective on less compressed formats.




