Friday, May 30, 2014

Matching Features with ORB and Brute Force using OpenCV (Python code)

Today I will explain how to detect and match feature points using OpenCV. I will be using OpenCV 2.4.9.

Functions we will be using:

- cv2.VideoCapture()
   - .read()
- cv2.ORB()
   - .detect()
   - .compute()
- cv2.BFMatcher()
   - .match()
- cv2.imread()
- cv2.cvtColor()
- cv2.line()

The Algorithm:

1. First of all we must import the functions that we are going to use. For this we write:


import numpy as np
import cv2

2. We need to declare the ORB detector that we will use and the brute-force matcher:


orb = cv2.ORB()
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

The parameter cv2.NORM_HAMMING specifies the distance measure to be used, in this case the Hamming distance. For ORB with its default WTA_K=2 this is the distance to use (if ORB were created with WTA_K set to 3 or 4, cv2.NORM_HAMMING2 would be the appropriate choice).

The second parameter is a boolean: if it is true, the matcher returns only those matches (i, j) such that the i-th descriptor in set A has the j-th descriptor in set B as its best match and vice versa.
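
If you leave crossCheck at its default of False, a common alternative filter (not used in this post) is knnMatch with Lowe's ratio test. A minimal sketch, where des1 and des2 stand for the descriptor arrays we will compute in step 5:

# Sketch: ratio-test filtering instead of crossCheck.
bf_knn = cv2.BFMatcher(cv2.NORM_HAMMING)   # crossCheck left at its default, False
pairs = bf_knn.knnMatch(des1, des2, k=2)   # two best candidates per descriptor
good = []
for pair in pairs:
    # keep a match only if it is clearly better than the runner-up
    if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
        good.append(pair[0])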

3. Obtain control of the video capturing device (camera).

camera = cv2.VideoCapture(0)

The parameter of VideoCapture is the id of the device we are using to capture video. If there is only one camera connected, pass 0 to the function.
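
It is also worth verifying, with a couple of extra lines (my addition, not in the original post), that the device actually opened:

camera = cv2.VideoCapture(0)
if not camera.isOpened():
    # the device id may be wrong; try 1, 2, ... for additional cameras
    raise IOError('could not open the capture device')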

4. Load the image to match.


imgTrainColor = cv2.imread('train.png')
imgTrainGray = cv2.cvtColor(imgTrainColor, cv2.COLOR_BGR2GRAY)

The first function loads the image we are trying to match against the video; the parameter is the file name, and the image should be located in the same directory as the project.

The second function converts the color space from BGR to grayscale; we will do all the processing of the image on its grayscale version. The parameters are the image we want to convert and the type of conversion we want to apply.
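
One pitfall worth guarding against (a small addition of mine): cv2.imread does not raise an error on a missing file, it silently returns None, and the failure only shows up later inside cvtColor. A minimal check:

imgTrainColor = cv2.imread('train.png')
if imgTrainColor is None:
    # imread returns None instead of raising when the file cannot be read
    raise IOError('train.png not found next to the script')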

5. We get the keypoints and descriptors of our image.


kpTrain = orb.detect(imgTrainGray,None)
kpTrain, desTrain = orb.compute(imgTrainGray, kpTrain)

The first function finds the keypoints in the image that we pass as the first argument; the second argument is an optional mask, which we leave as None here.

The second function computes the descriptors of the image from those keypoints.

This image shows the keypoints obtained by ORB:
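
If you want to reproduce a visualization like this one, cv2.drawKeypoints can render the keypoints; a sketch based on the OpenCV 2.4 tutorial linked at the end:

# draw the detected keypoints on the training image (OpenCV 2.4 signature;
# OpenCV 3+ additionally requires an explicit output image argument, e.g. None)
imgKp = cv2.drawKeypoints(imgTrainGray, kpTrain, color=(0, 255, 0), flags=0)
cv2.imshow('ORB keypoints', imgKp)
cv2.waitKey(0)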



6. Obtain and process the video frame.


ret, imgCamColor = camera.read()
imgCamGray=cv2.cvtColor(imgCamColor,cv2.COLOR_BGR2GRAY)
kpCam = orb.detect(imgCamGray,None)
kpCam, desCam = orb.compute(imgCamGray, kpCam)

First we grab a frame with camera.read(), then we convert it to grayscale and get its keypoints and descriptors as before.
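
Note that read() can fail (camera busy or unplugged), in which case imgCamColor is None and cvtColor would crash. A defensive sketch for the loop body, my addition:

ret, imgCamColor = camera.read()
if not ret:
    # no frame was grabbed; skip this iteration (or break out of the loop)
    continue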

7. We find matches between the training image and the video frame:

matches = bf.match(desCam, desTrain)
dist = [m.distance for m in matches]
thres_dist = (sum(dist) / len(dist)) * 0.5
matches = [m for m in matches if m.distance < thres_dist]

We need to discard the matches that correspond to errors. We do this by setting a distance threshold and keeping only the matches whose distances are below it. Here I used half of the mean distance over all matches as the threshold.
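
An alternative filter you will often see in the OpenCV tutorials (not what this post uses) is to simply keep the N best matches by distance; a one-line sketch, with N=30 chosen arbitrarily:

# keep only the 30 matches with the smallest Hamming distance
matches = sorted(matches, key=lambda m: m.distance)[:30]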

8. Plot the results.

for i in range(len(matches)):
    pt_a = (int(kpTrain[matches[i].trainIdx].pt[0]), int(kpTrain[matches[i].trainIdx].pt[1] + hdif))
    pt_b = (int(kpCam[matches[i].queryIdx].pt[0] + w2), int(kpCam[matches[i].queryIdx].pt[1]))
    cv2.line(result, pt_a, pt_b, (255, 0, 0))

For every match we get the corresponding point in both images and draw a line between them. Prior to this we must create the result image that holds both images side by side (see the full code below for how the canvas, w2 and hdif are built).
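
As an aside: this manual canvas is needed because the Python bindings of OpenCV 2.4 do not expose cv2.drawMatches. If you are on OpenCV 3 or later, the same picture can be sketched in one call (untested here):

# OpenCV 3+ only: camera image/keypoints go first (the queryIdx side),
# training image/keypoints second (the trainIdx side)
result = cv2.drawMatches(imgCamColor, kpCam, imgTrainColor, kpTrain,
                         matches, None, flags=2)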

9. Destroy the windows and release the camera.

cv2.destroyAllWindows()
camera.release()

THE CODE:
# -*- coding: utf-8 -*-
"""
@author: Javier Perez
@email: javier_e_perez21@hotmail.com

"""
import numpy as np
import cv2
  
ESC=27   
camera = cv2.VideoCapture(0)  # 0 = first camera; use 1, 2, ... for additional devices
orb = cv2.ORB()
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)

imgTrainColor=cv2.imread('train.png')
imgTrainGray = cv2.cvtColor(imgTrainColor, cv2.COLOR_BGR2GRAY)

kpTrain = orb.detect(imgTrainGray,None)
kpTrain, desTrain = orb.compute(imgTrainGray, kpTrain)

firsttime=True

while True:
   
    ret, imgCamColor = camera.read()
    imgCamGray = cv2.cvtColor(imgCamColor, cv2.COLOR_BGR2GRAY)
    kpCam = orb.detect(imgCamGray,None)
    kpCam, desCam = orb.compute(imgCamGray, kpCam)
    matches = bf.match(desCam,desTrain)
    dist = [m.distance for m in matches]
    thres_dist = (sum(dist) / len(dist)) * 0.5
    matches = [m for m in matches if m.distance < thres_dist]   

    if firsttime==True:
        h1, w1 = imgCamColor.shape[:2]
        h2, w2 = imgTrainColor.shape[:2]
        nWidth = w1+w2
        nHeight = max(h1, h2)
        hdif = (h1 - h2) // 2  # assumes the camera frame is at least as tall as the training image
        firsttime=False
       
    result = np.zeros((nHeight, nWidth, 3), np.uint8)
    result[hdif:hdif+h2, :w2] = imgTrainColor
    result[:h1, w2:w1+w2] = imgCamColor

    for i in range(len(matches)):
        pt_a = (int(kpTrain[matches[i].trainIdx].pt[0]), int(kpTrain[matches[i].trainIdx].pt[1] + hdif))
        pt_b = (int(kpCam[matches[i].queryIdx].pt[0] + w2), int(kpCam[matches[i].queryIdx].pt[1]))
        cv2.line(result, pt_a, pt_b, (255, 0, 0))

    cv2.imshow('Camera', result)
  
    key = cv2.waitKey(20)                                 
    if key == ESC:
        break

cv2.destroyAllWindows()
camera.release()

RESULTS:






DOCUMENTATION:

Capturing video:
http://docs.opencv.org/modules/highgui/doc/reading_and_writing_images_and_video.html

ORB implementation:
http://docs.opencv.org/trunk/doc/py_tutorials/py_feature2d/py_orb/py_orb.html

Comments:

1. result[hdif:hdif+h2, :w2] = trainImg

   ValueError: could not broadcast input array from shape (2936,64,3) into shape (0,64,3)

   Reply: your training image resolution needs to be smaller than or similar to the stream resolution.