| @ -0,0 +1,416 @@ | |||
| This blog post goes over the basic image manipulation techniques you | |||
| can use with OpenCV. [OpenCV](https://opencv.org/) is an open-source | |||
| library of computer vision tools. It is often used in conjunction | |||
| with deep learning frameworks like | |||
| [TensorFlow](https://www.tensorflow.org/). This tutorial uses | |||
| Python 3, although you can also use OpenCV with C++, Java, and | |||
| [MATLAB](https://www.mathworks.com/products/matlab.html). | |||
| # Reading and Displaying Images | |||
| The first thing to do when you start playing around with OpenCV is to | |||
| import the required dependencies. Most basic computer vision projects | |||
| with OpenCV will use NumPy and matplotlib. Color images in OpenCV are | |||
| represented as NumPy arrays with shape (height, width, 3) and the | |||
| data type uint8. This essentially means that every image is a 2D | |||
| matrix with three color channels in BGR order, where each channel of | |||
| a pixel can have an intensity between 0 and 255. In grayscale, 0 is | |||
| black and 255 is white. | |||
| ```python | |||
| # Open cv library | |||
| import cv2 | |||
| # numpy library for matrix manipulation | |||
| import numpy as np | |||
| # matplotlib for displaying the images | |||
| from matplotlib import pyplot as plt | |||
| ``` | |||
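To make that representation concrete, here is a minimal NumPy-only sketch that builds a tiny image by hand. The array name `tiny` and its size are made up for illustration; a real image loaded with OpenCV would simply be much larger.

```python
import numpy as np

# A hypothetical 2x3 image: 2 rows (height), 3 columns (width), 3 channels (BGR)
tiny = np.zeros((2, 3, 3), dtype=np.uint8)

# Paint the top-left pixel pure blue: channel 0 is blue in BGR order
tiny[0, 0] = [255, 0, 0]

print(tiny.shape)  # (2, 3, 3)
print(tiny.dtype)  # uint8
print(tiny[0, 0])  # [255   0   0]
```

Note that the first index selects the row and the second selects the column, so `tiny[0, 2]` is the pixel in the top-right corner.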
| Reading an image is as easy as using the "cv2.imread" function. If | |||
| you simply try to print the image with Python's print function, you | |||
| will flood your terminal with a massive matrix. In this post, we are | |||
| going to be using the infamous | |||
| [Lenna](https://en.wikipedia.org/wiki/Lenna) image which has been used | |||
| in the Computer Vision field since 1973. | |||
| ```python | |||
| lenna = cv2.imread('lenna.jpg') | |||
| # Prints a single pixel value | |||
| print(lenna[50][50]) | |||
| # Prints the image dimensions | |||
| # (height, width, 3 -- BGR) | |||
| print(lenna.shape) | |||
| ``` | |||
| [ 89 104 220] | |||
| (440, 440, 3) | |||
| By now you might have noticed that I am saying "BGR" instead of "RGB"; | |||
| in OpenCV, colors are stored in the order "BGR" instead of "RGB". This | |||
| makes it awkward to display the images using a different library like | |||
| matplotlib, because it expects images to be in "RGB" order. | |||
| Thankfully, we can use a function in the OpenCV library to convert | |||
| the color order. | |||
| ```python | |||
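| def printI(img): | |||
|     # matplotlib expects RGB; grayscale images (2D arrays) need a | |||
|     # different conversion code than BGR color images | |||
|     code = cv2.COLOR_GRAY2RGB if img.ndim == 2 else cv2.COLOR_BGR2RGB | |||
|     plt.imshow(cv2.cvtColor(img, code)) | |||
| printI(lenna) | |||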
| ``` | |||
|  | |||
| Going a step further with image visualization, we can use matplotlib | |||
| to view images side by side. This makes it easier to compare the | |||
| results of running different algorithms on the same image. | |||
| ```python | |||
| def toRGB(img): | |||
|     # helper: grayscale images are 2D, color images are stored as BGR | |||
|     code = cv2.COLOR_GRAY2RGB if img.ndim == 2 else cv2.COLOR_BGR2RGB | |||
|     return cv2.cvtColor(img, code) | |||
| def printI3(i1, i2, i3): | |||
|     fig = plt.figure() | |||
|     ax1 = fig.add_subplot(1,3,1) | |||
|     ax1.imshow(toRGB(i1)) | |||
|     ax2 = fig.add_subplot(1,3,2) | |||
|     ax2.imshow(toRGB(i2)) | |||
|     ax3 = fig.add_subplot(1,3,3) | |||
|     ax3.imshow(toRGB(i3)) | |||
| def printI2(i1, i2): | |||
|     fig = plt.figure() | |||
|     ax1 = fig.add_subplot(1,2,1) | |||
|     ax1.imshow(toRGB(i1)) | |||
|     ax2 = fig.add_subplot(1,2,2) | |||
|     ax2.imshow(toRGB(i2)) | |||
| ``` | |||
| If we zero out the other color layers and leave only one channel, we | |||
| can visualize each channel individually. In the following example, | |||
| notice that image.copy() creates an independent copy of the image | |||
| array, so the original image is left untouched -- this is a useful | |||
| NumPy method. | |||
| ```python | |||
| def generateBlueImage(image): | |||
|     b = image.copy() | |||
|     # set the green and red channels to 0 | |||
|     # note images are in BGR | |||
|     b[:, :, 1] = 0 | |||
|     b[:, :, 2] = 0 | |||
|     return b | |||
| def generateGreenImage(image): | |||
|     g = image.copy() | |||
|     # sets the blue and red channels to 0 | |||
|     g[:, :, 0] = 0 | |||
|     g[:, :, 2] = 0 | |||
|     return g | |||
| def generateRedImage(image): | |||
|     r = image.copy() | |||
|     # sets the blue and green channels to 0 | |||
|     r[:, :, 0] = 0 | |||
|     r[:, :, 1] = 0 | |||
|     return r | |||
| def visualizeRGB(image): | |||
|     printI3(generateRedImage(image), generateGreenImage(image), generateBlueImage(image)) | |||
| ``` | |||
| ```python | |||
| visualizeRGB(lenna) | |||
| ``` | |||
|  | |||
| # Grayscale Images | |||
| Converting a color image to grayscale reduces the dimensionality | |||
| because you are squishing the three color layers into one channel. | |||
| OpenCV has a built-in function to do this. | |||
| ```python | |||
| glenna = cv2.cvtColor(lenna, cv2.COLOR_BGR2GRAY) | |||
| printI(glenna) | |||
| ``` | |||
|  | |||
| The built-in function works in most applications; however, you | |||
| sometimes want more control over which color layers are weighted more | |||
| when generating the grayscale image. To do that, you can compute a | |||
| weighted sum of the color channels yourself. | |||
| ```python | |||
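| def generateGrayScale(image, rw = 0.25, gw = 0.5, bw = 0.25): | |||
|     """ | |||
|     image is the OpenCV image (BGR) | |||
|     rw, gw, bw = weights to apply to the red, green and blue layers | |||
|     """ | |||
|     # weights are ordered B, G, R to match the channel order | |||
|     w = np.array([[[ bw, gw, rw]]]) | |||
|     # weighted sum over the channel axis, saturated back to uint8 | |||
|     gray2 = cv2.convertScaleAbs(np.sum(image*w, axis=2)) | |||
|     return gray2 | |||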
| ``` | |||
| ```python | |||
| printI(generateGrayScale(lenna)) | |||
| ``` | |||
|  | |||
| Notice that the default weights sum to 1. If the sum is above 1, the | |||
| image gets brighter; if it is below 1, the image gets darker. | |||
| ```python | |||
| printI2(generateGrayScale(lenna, 0.1, 0.3, 0.1), generateGrayScale(lenna, 0.5, 0.6, 0.5)) | |||
| ``` | |||
|  | |||
| We could also use our function to display the grayscale output of each | |||
| color layer. | |||
| ```python | |||
| printI3(generateGrayScale(lenna, 1.0, 0.0, 0.0), generateGrayScale(lenna, 0.0, 1.0, 0.0), generateGrayScale(lenna, 0.0, 0.0, 1.0)) | |||
| ``` | |||
|  | |||
| Based on this output, the red layer is the brightest which makes sense | |||
| because the majority of the image is in a pinkish/red tone. | |||
| # Pixel Operations | |||
| Pixel operations are simply things that you do to every pixel in the | |||
| image. | |||
| ## Negative | |||
| To take the negative of an image, you simply invert it. I.e., if the | |||
| pixel was 0, it would now be 255, and if the pixel was 255, it would | |||
| now be 0. Since all the images are unsigned 8-bit integers, a value | |||
| wraps around automatically as soon as it crosses a boundary; here, | |||
| 255 minus any pixel value always stays between 0 and 255, so nothing | |||
| wraps. With NumPy, if you subtract a matrix from a number, the | |||
| operation is applied to every element in that matrix -- neat. | |||
| Therefore, if we wanted to invert an image, we could just subtract | |||
| the image from 255. | |||
| ```python | |||
| invert_lenna = 255 - lenna | |||
| printI(invert_lenna) | |||
| ``` | |||
|  | |||
| ## Darken And Lighten | |||
| To brighten or darken an image, you can add a constant to it or | |||
| subtract a constant from it, because that pushes the pixel values | |||
| closer towards 255 (white) or 0 (black). | |||
| ```python | |||
| bright_bad_lenna = lenna + 25 | |||
| printI(bright_bad_lenna) | |||
| ``` | |||
|  | |||
| Notice that the image got brighter, but some parts look inverted. | |||
| This is because uint8 arithmetic wraps around when a value exceeds | |||
| 255. To avoid wrapping, we have to clip the result to the range 0 to | |||
| 255. For example, when we add 25 to a pixel with a value of 240, we | |||
| don't want it to wrap around to 9; we want it to stay at 255. OpenCV | |||
| has built-in functions for this. | |||
| ```python | |||
| def brightenImg(img, num): | |||
|     a = np.zeros(img.shape, dtype=np.uint8) | |||
|     a[:] = num | |||
|     return cv2.add(img, a) | |||
| def darkenImg(img, num): | |||
|     a = np.zeros(img.shape, dtype=np.uint8) | |||
|     a[:] = num | |||
|     return cv2.subtract(img, a) | |||
| brighten_lenna = brightenImg(lenna, 50) | |||
| darken_lenna = darkenImg(lenna, 50) | |||
| printI2(brighten_lenna, darken_lenna) | |||
| ``` | |||
|  | |||
| ## Contrast | |||
| Adjusting the contrast of an image is a matter of multiplying the | |||
| image by a constant. Multiplying by a number greater than 1 would | |||
| increase the contrast and multiplying by a number lower than 1 would | |||
| decrease the contrast. | |||
| ```python | |||
| def adjustContrast(img, amount): | |||
|     """ | |||
|     Converts the image to float so the contrast can be scaled by | |||
|     non-integer amounts, then clips the values and converts back | |||
|     to uint8 at the end. | |||
|     """ | |||
|     a = np.zeros(img.shape, dtype=np.float32) | |||
|     a[:] = amount | |||
|     b = img.astype(float) | |||
|     c = np.multiply(a, b) | |||
|     np.clip(c, 0, 255, out=c) # clips between 0 and 255 | |||
|     return c.astype(np.uint8) | |||
| ``` | |||
| ```python | |||
| printI2(adjustContrast(lenna, 0.8) ,adjustContrast(lenna, 1.3)) | |||
| ``` | |||
|  | |||
| # Noise | |||
| In most cases you don't want to add random noise to your image; | |||
| however, for testing some algorithms it becomes necessary. Noise is | |||
| anything that makes the image imperfect. In the "real world" this | |||
| usually takes the form of dead pixels on your camera sensor or other | |||
| things distorting your view. | |||
| ## Salt and Pepper | |||
| Salt and pepper noise adds random black and white pixels to your | |||
| image. | |||
| ```python | |||
| import random | |||
| def uniformNoise(image, num): | |||
|     img = image.copy() | |||
|     h, w, c = img.shape | |||
|     # x runs over columns (width) and y over rows (height) | |||
|     x = np.random.uniform(0,w,num) | |||
|     y = np.random.uniform(0,h,num) | |||
|     for i in range(0, num): | |||
|         # pick black (pepper) or white (salt) with equal probability | |||
|         r = 0 if random.randrange(0,2) == 0 else 255 | |||
|         # NumPy indexes images as [row, column], so y comes first | |||
|         img[int(y[i]), int(x[i])] = np.asarray([r, r, r]) | |||
|     return img | |||
| printI2(uniformNoise(lenna, 1000), uniformNoise(lenna, 7000)) | |||
| ``` | |||
|  | |||
| # Image Denoising | |||
| It is possible to remove the salt and pepper noise from an image to | |||
| clean it up. Unlike how my professor worded it, this is not | |||
| "enhancing" the image, this is merely using filters that remove the | |||
| noise from the image by blurring it. | |||
| ## Moving Average | |||
| The moving average technique sets each pixel equal to the average of | |||
| its neighborhood. The bigger your neighborhood the more the image is | |||
| blurred. | |||
| ```python | |||
| bad_lenna = uniformNoise(lenna, 6000) | |||
| blur_lenna = cv2.blur(bad_lenna,(3,3)) | |||
| printI2(bad_lenna, blur_lenna) | |||
| ``` | |||
|  | |||
| As you can see, most of the noise was removed from the image, but | |||
| some imperfections were left. To see the effect of the filter size, | |||
| you can play around with it. | |||
| ```python | |||
| blur_lenna_3 = cv2.blur(bad_lenna,(3,3)) | |||
| blur_lenna_8 = cv2.blur(bad_lenna,(8,8)) | |||
| printI2(blur_lenna_3, blur_lenna_8) | |||
| ``` | |||
|  | |||
| ## Median Filtering | |||
| Median filters transform every pixel by taking the median value of its | |||
| neighborhood. This is a lot better than average filters for noise | |||
| reduction because it has less of a blurring effect and it is extremely | |||
| good at removing outliers like salt and pepper noise. | |||
| ```python | |||
| median_lenna = cv2.medianBlur(bad_lenna,3) | |||
| printI2(bad_lenna, median_lenna) | |||
| ``` | |||
|  | |||
| # Remarks | |||
| OpenCV is a powerful library for image manipulation. This post only | |||
| covered some of the more basic applications of OpenCV. Future posts | |||
| might explore some of the more advanced techniques in computer vision | |||
| like Canny edge detection, template matching, and Harris corner | |||
| detection. | |||