A moving 3D image reconstructed from video is shown in the following YouTube video.
The proposed algorithm acquires, in real time, a distance scene in the direction of forward motion from video captured by a single camera mounted on a forward-moving body such as a car or a drone. The aim is to provide an image sensor applicable to the automated driving of cars and drones without relying on a large number of expensive sensors of various kinds.
In recent years, sensors that capture the outside world have been mounted on moving bodies such as cars and drones; examples include the initiatives of Stanford University, Carnegie Mellon University, and the Google car at the DARPA Urban Challenge.
The purpose of these sensors is to acquire distance information for automatic driving of cars and drones. Taking automatic driving of a car as an example, the main distance sensors are laser, ultrasonic, infrared, and stereo cameras; about ten laser sensors may be installed in a single car, some of them extremely expensive. Their performance is also limited: an ultrasonic sensor covers only the range close to the car, and even a laser sensor that reaches places far from the car has difficulty identifying which object a measured distance belongs to. Likewise, with a stereo camera it is difficult to decide which external object each point in the measured distance point cloud corresponds to.
This technique reconstructs a dynamic distance landscape by tracking pixels across the frames of a video. Our method can also be used to build a 3D map for automatic car driving. Moreover, from video taken while the car is stopped, the distance of moving objects in the scene can be extracted frame by frame.
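To illustrate why a single forward-moving camera can yield distance at all, the following is a minimal sketch of the standard geometric relation for pure forward translation, not the authors' patented algorithm: a point at radius r (in pixels) from the focus of expansion flows outward at rate rdot, and its depth is Z = Vz * r / rdot for forward speed Vz. The function name and the numbers are illustrative only.

```python
# Depth from radial optical flow under pure forward camera translation.
# Standard relation (textbook epipolar geometry), NOT the proposed method:
#   Z = Vz * r / rdot
# where r is the pixel's distance from the focus of expansion (FOE),
# rdot is its outward (radial) flow, and Vz is the forward camera speed.

def depth_from_radial_flow(r, rdot, vz):
    """Depth of a scene point from its radial image motion.

    r: distance of the pixel from the FOE (pixels)
    rdot: radial optical flow (pixels per frame)
    vz: forward camera speed (metres per frame)
    """
    return vz * r / rdot

# Example: camera advancing 0.5 m per frame; a pixel 100 px from the FOE
# flows outward by 2 px per frame.
print(depth_from_radial_flow(100.0, 2.0, 0.5))  # 25.0 metres
```

Pixels that flow slowly for their radius are far away; fast-flowing pixels are close, which is why dense pixel tracking gives a dense distance landscape.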
On the other hand, a related technology is object tracking in video. Conventional tracking technology simply tracks the movement of a pedestrian or the like.
Tracking methods fall into two categories: those using an image region template and those using feature templates that describe image features. Representative region-template methods are the particle filter and mean-shift. Although not mentioned in the documents above, deep learning has also recently been applied to region-based tracking.
Generally, in tracking by region template, the tracking target must first be extracted from the image. The particle filter and mean-shift methods determine the destination of the tracked region in the video from a single region template, evaluated either by the likelihood of its pixel-value histogram (particle filter) or by a normalized histogram (mean-shift). Both process each tracked object individually.
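The mean-shift step above can be sketched as follows. This is a minimal, self-contained illustration, assuming the pixel-value histogram has already been back-projected into a per-pixel likelihood map; the window size and the synthetic blob are invented for the example, not taken from the cited methods.

```python
import numpy as np

def mean_shift(weights, cx, cy, half=8, iters=20):
    """Shift a (2*half+1)-wide window to the centroid of the likelihood
    weights it covers, repeating until the window stops moving."""
    H, W = weights.shape
    for _ in range(iters):
        x0, x1 = max(cx - half, 0), min(cx + half + 1, W)
        y0, y1 = max(cy - half, 0), min(cy + half + 1, H)
        win = weights[y0:y1, x0:x1]
        total = win.sum()
        if total == 0:
            break
        ys, xs = np.mgrid[y0:y1, x0:x1]
        nx = int(round((xs * win).sum() / total))  # weighted centroid, x
        ny = int(round((ys * win).sum() / total))  # weighted centroid, y
        if (nx, ny) == (cx, cy):
            break  # converged
        cx, cy = nx, ny
    return cx, cy

# Synthetic likelihood map: a Gaussian blob at (40, 30).
yy, xx = np.mgrid[0:64, 0:64]
blob = np.exp(-((xx - 40) ** 2 + (yy - 30) ** 2) / 50.0)

# Start the window away from the target; it climbs onto the blob.
print(mean_shift(blob, 25, 25))  # (40, 30)
```

In a real tracker the likelihood map would be recomputed for every frame from the template's histogram, and the converged window centre becomes the object's new position.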
Next, conventional feature-based tracking methods are described. The SIFT and SURF (Speeded-Up Robust Features) methods extract image features, define points called keypoints, describe the SIFT or SURF features at those points, and estimate the corresponding locations in subsequent images. Tracking is impossible in places where such features do not exist.
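The matching step shared by SIFT- and SURF-style tracking can be sketched as follows. Each keypoint carries a descriptor vector, and a correspondence is accepted only when the nearest descriptor in the other frame is clearly closer than the second nearest (Lowe's ratio test). The 4-D descriptors below are toy vectors standing in for real 128-D SIFT output.

```python
import numpy as np

def match_descriptors(desc_a, desc_b, ratio=0.75):
    """Return (i, j) index pairs of descriptors passing the ratio test."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # L2 to every candidate
        j, k = np.argsort(dists)[:2]                # nearest, second nearest
        if dists[j] < ratio * dists[k]:             # unambiguous match only
            matches.append((i, int(j)))
    return matches

# Toy descriptors: frame1's keypoints reappear slightly perturbed in frame2,
# plus one unrelated descriptor that should match nothing.
frame1 = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0]])
frame2 = np.array([[0.0, 1.0, 0.1, 0.0],
                   [0.9, 0.1, 0.0, 0.0],
                   [5.0, 5.0, 5.0, 5.0]])
print(match_descriptors(frame1, frame2))  # [(0, 1), (1, 0)]
```

As the text notes, this only works where distinctive features exist; in textureless regions no keypoints are detected and the ratio test rejects ambiguous candidates.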
As mentioned above, these methods perform object tracking itself; they do not compute the distance between the camera and the object.
A patent application for the proposed method is pending.
The following paper explains the outline of the proposed method:
Ryuichi Oka, Keisuke Hata: "Reconstructing a moving 3D image from video captured by a forward-moving camera", MIRU 2018, PS3-1, 8 August 2018.