A framework for inpainting missing parts of a video sequence recorded with a
moving or stationary camera is presented in this work. The region to be
inpainted is general: it may be still or moving, in the background or in the
foreground; it may occlude one object and be occluded by another.
The algorithm consists of a simple pre-processing stage and two steps of
video inpainting. In the pre-processing stage we roughly segment each frame
into foreground and background. We use this segmentation to build three
image mosaics that help to produce time consistent results and also improve
the performance of the algorithm by reducing the search space. In the first
video inpainting step we reconstruct moving objects in the foreground that
are "occluded" by the region to be inpainted. To this end we fill the gap as
much as possible by copying information from the moving foreground in other
frames, using a priority-based scheme. In the second step, we inpaint the
remaining hole with the background. To accomplish this, we first align the
frames and directly copy when possible. The remaining pixels are filled-in
by extending spatial texture synthesis techniques to the spatio-temporal
domain. The proposed framework has several advantages over state-of-the-art
algorithms that deal with similar types of data and constraints. It permits
some camera motion, is simple to implement, is fast, does not require
statistical models of either the background or the foreground, works well in
the presence of rich and cluttered backgrounds, and produces results free of
visible blurring and motion artifacts. A number of real examples, captured
with a consumer hand-held camera, support these findings.
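The pre-processing stage described above (a rough split of each frame into moving foreground and static background) can be illustrated with a minimal sketch. This is only an assumed stand-in for the paper's segmentation step: it builds a background model as the per-pixel temporal median of the clip and thresholds the deviation from it; the threshold value and helper name are hypothetical.

```python
import numpy as np

def rough_segmentation(frames, thresh=25):
    """Roughly split each frame into moving foreground and static
    background by comparing it against a temporal-median background
    model. Illustrative stand-in for the pre-processing stage;
    `thresh` is an assumed tuning parameter."""
    frames = np.asarray(frames, dtype=np.float64)   # shape (T, H, W)
    background = np.median(frames, axis=0)          # per-pixel median over time
    masks = np.abs(frames - background) > thresh    # True = moving foreground
    return masks, background

# Tiny synthetic clip: a bright 2x2 block moving across a dark scene.
T, H, W = 5, 8, 8
clip = np.zeros((T, H, W))
for t in range(T):
    clip[t, 3:5, t:t+2] = 255.0

masks, bg = rough_segmentation(clip)
print(int(masks[0].sum()))  # the moving block covers 4 pixels in frame 0
```

Because the block occupies each pixel for only a minority of the frames, the median recovers the empty background, and the mask isolates the mover in every frame.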
The result below illustrates the first case mentioned above. The person with the orange jacket, the phone box in the center, and the lamp post to the left of center are filled in with the stationary background. The inpainted sequence is shown on the right. Notice that there are no inconsistencies from one frame to another in the filled-in areas.
The more important and more difficult problem is filling in an occluded moving person. To address this, we assume we are given a "motion confidence image" Mc, which indicates whether a pixel belongs to the moving foreground or the stationary background. The following synthetic example shows that, given a perfect Mc, we can perfectly inpaint the moving object.
The application of our algorithm to a real-life video sequence (50 frames) is shown below. We use a crude optical-flow based "motion confidence image" Mc. The sequence on the right shows the completed foreground, and the one below shows the completely inpainted sequence. Observe that the motion of the completed person and of the filled-in background is globally consistent.
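A crude Mc of the kind mentioned above can be sketched as a per-pixel binary map. The snippet below is an assumed simplification: it uses plain frame differencing as a stand-in for the optical-flow magnitude, and the threshold is hypothetical.

```python
import numpy as np

def motion_confidence(prev, curr, thresh=20):
    """Crude motion-confidence image Mc: mark a pixel as moving
    foreground when its intensity changes noticeably between two
    consecutive frames. Frame differencing is an assumed stand-in
    for the optical-flow magnitude used in the real system."""
    diff = np.abs(curr.astype(np.float64) - prev.astype(np.float64))
    return (diff > thresh).astype(np.uint8)  # 1 = moving, 0 = static

# A small block shifts one pixel to the right between the two frames.
prev = np.zeros((6, 6)); prev[2:4, 1:3] = 200.0
curr = np.zeros((6, 6)); curr[2:4, 2:4] = 200.0
mc = motion_confidence(prev, curr)
print(int(mc.sum()))  # pixels that appeared or disappeared: 4
```

The four confident pixels are the trailing column the object vacated and the leading column it entered; a real Mc would be cleaner and denser, which is why the synthetic "perfect Mc" experiment above is informative.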
The sequence below illustrates the application of our technique to removing large moving objects from videos. Notice how well the moving person is synthesized in the region of object removal. The orange basketball visible in the completely inpainted sequence appears for only a few frames of the original video (60 frames); our background completion ensures that all available temporal information is preserved. (Frame size has been reduced from 640x480 to 320x240 for faster viewing.)
Note: during background completion of a damaged frame at a location p, we find a matching background patch and copy it not only to Ψp (see the paper for details) but also to all damaged frames that have data loss at that location, thus achieving temporal consistency. When inpainting a moving person (as above), it is therefore important to FIRST complete the moving person and THEN complete the static background, so that we do not synthesize background at a location where part of the completed moving object should be.
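The copy-to-all-damaged-frames rule above can be sketched directly. This is a hedged illustration, not the paper's implementation: the patch is assumed to have already been found by patch matching (not shown), and the function and parameter names are hypothetical.

```python
import numpy as np

def fill_background_hole(frames, holes, p, patch, half=1):
    """Copy one matched background patch into EVERY frame whose hole
    mask still covers location p, so the filled background is
    identical across time. `patch` has shape (2*half+1, 2*half+1)
    and is assumed to come from a patch-matching step (not shown)."""
    r, c = p
    for t in range(len(frames)):
        if holes[t, r, c]:  # this frame is damaged at p
            frames[t, r-half:r+half+1, c-half:c+half+1] = patch
            holes[t, r-half:r+half+1, c-half:c+half+1] = False
    return frames, holes

T, H, W = 4, 7, 7
frames = np.zeros((T, H, W))
holes = np.zeros((T, H, W), dtype=bool)
holes[1:3, 2:5, 2:5] = True                 # frames 1 and 2 are damaged
patch = np.full((3, 3), 7.0)                # pretend this was matched
frames, holes = fill_background_hole(frames, holes, (3, 3), patch)
print(frames[1, 3, 3], frames[2, 3, 3])     # both frames get the same value
```

Both damaged frames receive the identical patch at p, while the undamaged frames are left untouched, which is exactly what keeps the completed background flicker-free over time.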
The following is a synthetic example where a part of the video is manually cut out to simulate a damaged camera sensor, a speckle on the lens, or film damage as in old movies. Observe the background-independent nature of our moving-person inpainting.
The inpainting scheme proposed here also works fairly well for camera motions that do not adhere to our constraints (Section 2A of the paper). The following simulates a 'home-video' situation where a person of interest is occluded as he moves along his trajectory.
The following result illustrates that our algorithm can deal with very large, moving occlusions in the presence of camera motion. The original videos are of 640x480 resolution and can be obtained here.
Copyright Notice
The purpose of all the material presented here is to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by the authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.