Project Summary
We create a 3D model of an object with flat (planar) faces from a single view. The method is based on the paper by Criminisi et al. [1].
Acknowledgement
This project was part of the Computer Vision course (COMP 5421) in spring 2014. The project partners were Manohar Kuse (myself) and Sunil Jaiswal. The course instructor was Prof. C. K. Tang.
I further acknowledge that parts of the text and images presented here are borrowed from the course webpage.
Downloads
Source Code for the Project : [link] (Uses Qt & OpenCV2)
Code read me : [readme]
Original Image : box.png
View 3d rendering : [YouTube]
X3d model file : [box.x3d]
Original Images : microwave_front.jpg, microwave_back.jpg
View 3d rendering : [YouTube]
Algorithm Description
Step – 1
Select an input image. I selected this image of a box, from which I want to build a 3D model. Note that this method requires a 3-point perspective image as input.
Step – 2
The next step is to calculate the vanishing points. A vanishing point is the point on the image plane at which parallel lines in the scene appear to meet. It is calculated from manual markings of several parallel lines in the image.
Note that lines that are parallel in the real world are not necessarily parallel in the image plane.

After the parallel lines have been marked, the vanishing point can be calculated by finding the intersection point of these lines. At least 2 lines need to be marked in each direction. Marking more than 3 lines makes the estimate more robust to marking errors.
With more than 2 lines, the vanishing point can be calculated in a least-squares sense, as described by Bob Collins. View the description of Bob Collins’ method here.
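To make the intersection step concrete, here is a minimal Python sketch. It is not the project's Qt/OpenCV code and it is not Bob Collins' exact formulation: as a simpler stand-in, it minimises the sum of squared point-to-line distances, which assumes the vanishing point is finite. The function names and the sample markings are hypothetical.

```python
import math

def line_through(p, q):
    """Homogeneous line (a, b, c) through two image points, i.e. cross(p, q) with w = 1."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

def vanishing_point(segments):
    """Least-squares intersection (x, y) of two or more marked segments.

    Assumption: the vanishing point is finite, i.e. the marked lines are
    not parallel in the image.
    """
    sxx = sxy = syy = rx = ry = 0.0
    for p, q in segments:
        a, b, c = line_through(p, q)
        n = math.hypot(a, b)  # normalise so a*x + b*y + c is a distance in pixels
        a, b, c = a / n, b / n, c / n
        # Accumulate the 2x2 normal equations of sum (a*x + b*y + c)^2.
        sxx += a * a
        sxy += a * b
        syy += b * b
        rx -= a * c
        ry -= b * c
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("lines are (nearly) parallel in the image")
    return ((rx * syy - sxy * ry) / det, (sxx * ry - sxy * rx) / det)

# Hypothetical markings: three segments whose lines all pass through (100, 50).
marked = [((0, 0), (50, 25)), ((0, 100), (50, 75)), ((0, 50), (10, 50))]
vp = vanishing_point(marked)
```

With exactly two segments this reduces to the ordinary intersection of two lines. With noisy markings the result is the point closest, in the squared-distance sense, to all marked lines.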
Step – 3
The next step is to calculate the projection matrix (P), a 3×4 matrix that maps 3D world co-ordinates onto the 2D image plane. Note that we use the homogeneous co-ordinate system.
The projection matrix is formed from the vanishing points. The first three columns of P are the vanishing points in the x, y and z directions, respectively. The last (4th) column is the image of the world origin. However, each column is known only up to a scaling constant, so we need reference points to evaluate the scaling constants. See the slide below (borrowed from lecture notes by Prof. Tang) for details. Also view the Matlab code that evaluates the scaling constants and the projection matrix, here.

Step – 4
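Once we have the projection matrix, we can calculate the homography matrix (H) corresponding to each of the co-ordinate planes (i.e. XY, YZ, XZ). For Hxy, this is done by taking the 1st, 2nd and 4th columns of the projection matrix (P) calculated in the previous step, since points on the XY plane have Z = 0. Hyz and Hxz are calculated similarly, from the 2nd, 3rd and 4th columns and the 1st, 3rd and 4th columns respectively. These homographies are used to transform the image and create the texture maps. Reverse warping is used: each pixel of the texture map is mapped through H into the original image, and the color is sampled there. Shown below are the texture maps for the example image.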
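The column selection and the reverse warping can be sketched in a few lines of Python. This is an illustration only, not the project's Qt/OpenCV implementation: the image is a plain nested list, sampling is nearest-neighbor (bilinear interpolation would give smoother textures), and the function names, the scale parameter (world units per texture pixel) and the toy data are all assumptions.

```python
def plane_homography(P, plane="xy"):
    """Pick the three columns of the 3x4 matrix P that form the plane's homography."""
    cols = {"xy": (0, 1, 3), "yz": (1, 2, 3), "xz": (0, 2, 3)}[plane]
    return [[row[c] for c in cols] for row in P]

def apply_h(H, x, y):
    """Map plane co-ordinates (x, y) to image pixel co-ordinates (u, v)."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w

def reverse_warp(image, H, width, height, scale=1.0):
    """Fill a width x height texture by looking up each texture pixel in the image."""
    rows, cols = len(image), len(image[0])
    texture = [[None] * width for _ in range(height)]
    for ty in range(height):
        for tx in range(width):
            u, v = apply_h(H, tx * scale, ty * scale)
            iu, iv = int(round(u)), int(round(v))
            if 0 <= iv < rows and 0 <= iu < cols:  # pixels outside the image stay None
                texture[ty][tx] = image[iv][iu]
    return texture

# Toy example: a 6x6 "image" whose pixel value encodes its (row, column),
# and a P whose XY homography is a pure translation by (2, 3).
image = [[10 * r + c for c in range(6)] for r in range(6)]
P_toy = [[1.0, 0.0, 0.0, 2.0],
         [0.0, 1.0, 0.0, 3.0],
         [0.0, 0.0, 0.0, 1.0]]
texture = reverse_warp(image, plane_homography(P_toy, "xy"), 2, 2)
```

Looping over texture pixels rather than image pixels is what makes this reverse warping: every texture pixel receives a value, so the texture has no holes.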



Step – 5
Once we have texture maps for the planes of the object, we can crop the required portion of each texture and create a 3D model using a tool like Blender.
References
[1] Criminisi, Antonio, Ian Reid, and Andrew Zisserman. “Single view metrology.” International Journal of Computer Vision 40.2 (2000): 123-148.
[2] “Single View Metrology”, Project Page, Oxford University. http://www.robots.ox.ac.uk/~vgg/projects/SingleView/
[3] Additional resources on Projective Geometry and Homogeneous Co-ordinates. http://www.cs.cmu.edu/~ph/869/www/misc.html
[4] Desmond Tsoi Yau Chat and Stanley Ng Ho Lun (former TAs for course COMP621 at HKUST). Report titled “Single View Metrology”. PDF
[5] Criminisi, Antonio. Accurate visual metrology from single and multiple uncalibrated images. Springer, 2001. Thesis.