Personal blog written from scratch using Node.js, Bootstrap, and MySQL. https://jrtechs.net

This blog post goes over the basic image manipulation you can do with
OpenCV. [OpenCV](https://opencv.org/) is an open-source library of
computer vision tools. OpenCV is often used in conjunction with deep
learning frameworks like [TensorFlow](https://www.tensorflow.org/).
This tutorial uses Python 3, although you can also use OpenCV with
C++, Java, and [Matlab](https://www.mathworks.com/products/matlab.html).

# Reading and Displaying Images

The first thing you want to do when you start playing around with
OpenCV is import the required dependencies. Most basic computer
vision projects with OpenCV will use NumPy and matplotlib. All color
images in OpenCV are represented as NumPy matrices with shape
(height, width, 3) and the data type uint8. This essentially means
that every image is a 2D matrix of pixels with three color channels in
BGR order, where each channel can have an intensity between 0 and 255.
In grayscale, 0 is black and 255 is white.

```python
# OpenCV library
import cv2

# numpy library for matrix manipulation
import numpy as np

# matplotlib for displaying the images
from matplotlib import pyplot as plt
```
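
To make the matrix representation concrete, here is a small sketch that builds a synthetic image with NumPy instead of loading one from disk. The array `tiny` and its dimensions are made up for illustration.

```python
import numpy as np

# A hypothetical image with 2 rows and 4 columns, filled with black pixels.
tiny = np.zeros((2, 4, 3), dtype=np.uint8)

# Set the pixel at row 0, column 3 to pure red (channel order is B, G, R).
tiny[0, 3] = (0, 0, 255)

print(tiny.shape)   # (rows, columns, channels)
print(tiny.dtype)
print(tiny[0, 3])
```

Note that the first index is the row (the vertical position) and the second is the column, which is the reverse of the usual (x, y) convention.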

Reading an image is as easy as calling the "cv2.imread" function. If
you simply try to print the image with Python's print function, you
will flood your terminal with a massive matrix. In this post, we are
going to use the infamous
[Lenna](https://en.wikipedia.org/wiki/Lenna) image, which has been used
in the computer vision field since 1973.

```python
lenna = cv2.imread('lenna.jpg')

# Prints a single pixel value (row 50, column 50)
print(lenna[50][50])

# Prints the image dimensions
# (height, width, 3 -- BGR)
print(lenna.shape)
```

    [ 89 104 220]
    (440, 440, 3)

By now you might have noticed that I am saying "BGR" instead of "RGB";
OpenCV stores colors in "BGR" order rather than "RGB". This makes it
awkward to display images with a different library like matplotlib,
because matplotlib expects images to be in "RGB" order. Thankfully, we
can use functions in the OpenCV library to convert the color order.
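
```python
def printI(img):
    # matplotlib expects RGB, so convert from OpenCV's channel order;
    # single-channel (grayscale) images need GRAY2RGB instead of BGR2RGB
    code = cv2.COLOR_GRAY2RGB if img.ndim == 2 else cv2.COLOR_BGR2RGB
    plt.imshow(cv2.cvtColor(img, code))


printI(lenna)
```

![png](media/cv1/output_6_0.png)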
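
For a 3-channel image, going from BGR to RGB is nothing more than reversing the channel axis. The sketch below illustrates that with plain NumPy on a single made-up pixel; the names `bgr` and `rgb` are only for illustration.

```python
import numpy as np

# One hypothetical BGR pixel: blue=10, green=20, red=30.
bgr = np.array([[[10, 20, 30]]], dtype=np.uint8)

# Reversing the last axis swaps the blue and red channels.
rgb = bgr[:, :, ::-1]

print(rgb[0, 0])
```

The slice is a view onto the same data rather than a new array, so in practice the conversion function is usually the safer choice when the result is passed on to other libraries.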

Going a step further with image visualization, we can use matplotlib
to view images side by side. This makes it easier to compare the
results of running different algorithms on the same image.

```python
def toRGB(img):
    # grayscale images have no channel axis, so they need GRAY2RGB
    code = cv2.COLOR_GRAY2RGB if img.ndim == 2 else cv2.COLOR_BGR2RGB
    return cv2.cvtColor(img, code)


def printI3(i1, i2, i3):
    fig = plt.figure()
    ax1 = fig.add_subplot(1,3,1)
    ax1.imshow(toRGB(i1))
    ax2 = fig.add_subplot(1,3,2)
    ax2.imshow(toRGB(i2))
    ax3 = fig.add_subplot(1,3,3)
    ax3.imshow(toRGB(i3))


def printI2(i1, i2):
    fig = plt.figure()
    ax1 = fig.add_subplot(1,2,1)
    ax1.imshow(toRGB(i1))
    ax2 = fig.add_subplot(1,2,2)
    ax2.imshow(toRGB(i2))
```

If we zero out the other color channels and leave only one, we can
visualize each channel individually. In the following example, notice
that image.copy() generates a deep copy of the image matrix -- this is
a useful NumPy function because it keeps the original image intact.

```python
def generateBlueImage(image):
    b = image.copy()
    # set the green and red channels to 0
    # note images are in BGR
    b[:, :, 1] = 0
    b[:, :, 2] = 0
    return b


def generateGreenImage(image):
    g = image.copy()
    # sets the blue and red channels to 0
    g[:, :, 0] = 0
    g[:, :, 2] = 0
    return g


def generateRedImage(image):
    r = image.copy()
    # sets the blue and green channels to 0
    r[:, :, 0] = 0
    r[:, :, 1] = 0
    return r


def visualizeRGB(image):
    printI3(generateRedImage(image), generateGreenImage(image), generateBlueImage(image))
```

```python
visualizeRGB(lenna)
```

![png](media/cv1/output_11_0.png)
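
To see why the copy matters, the following sketch contrasts a NumPy slice (a view that shares memory with the original) against copy(). The array `original` is a made-up stand-in for a real image.

```python
import numpy as np

original = np.full((2, 2, 3), 200, dtype=np.uint8)

view = original[:, :, 0]   # basic slicing returns a view onto the same data
copy = original.copy()     # copy() allocates independent data

view[:] = 0                # this also zeroes the blue channel of `original`
copy[:, :, 1] = 0          # this leaves `original` untouched

print(original[0, 0])
print(copy[0, 0])
```

Without the copy, each of the channel functions above would permanently zero out channels of the image passed in.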

# Grayscale Images

Converting a color image to grayscale reduces the dimensionality
because you are squishing each color layer into one channel. OpenCV
has a built-in function to do this.

```python
glenna = cv2.cvtColor(lenna, cv2.COLOR_BGR2GRAY)
printI(glenna)
```

![png](media/cv1/output_14_0.png)
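
For reference, the OpenCV documentation lists the RGB to gray conversion as Y = 0.299 R + 0.587 G + 0.114 B. The sketch below applies that formula by hand to the sample pixel printed earlier; it is an illustration in plain NumPy, and the built-in conversion may round slightly differently.

```python
import numpy as np

# The sample pixel printed earlier, in B, G, R order.
pixel = np.array([89, 104, 220], dtype=np.float64)

# Luma weights arranged in B, G, R order to match the pixel.
weights = np.array([0.114, 0.587, 0.299])

gray = float(np.dot(pixel, weights))
print(round(gray))
```

Three values per pixel collapse into one, which is why the grayscale image has shape (height, width) with no channel axis.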

The built-in function works in most applications; however, you
sometimes want more control over how heavily each color layer is
weighted when generating the grayscale image. To do that, you can
compute a weighted sum of the channels yourself.

```python
def generateGrayScale(image, rw = 0.25, gw = 0.5, bw = 0.25):
    """
    image is the OpenCV image (BGR)
    rw, gw, bw = weights to apply to the red, green, and blue layers
    """
    # weights are ordered to match OpenCV's BGR channel order
    w = np.array([[[ bw, gw, rw]]])
    gray2 = cv2.convertScaleAbs(np.sum(image*w, axis=2))
    return gray2
```

```python
printI(generateGrayScale(lenna))
```

![png](media/cv1/output_17_0.png)

Notice that the sum of the default weights is equal to 1. If the sum
is above 1, the image gets brighter; if it is below 1, the image gets
darker.

```python
printI2(generateGrayScale(lenna, 0.1, 0.3, 0.1), generateGrayScale(lenna, 0.5, 0.6, 0.5))
```

![png](media/cv1/output_19_0.png)
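
The effect of the weight sum is easy to check on a single pixel. This sketch reuses the sample pixel printed earlier; `weightedGray` is a hypothetical helper that mirrors the weighted sum without the final conversion to uint8.

```python
import numpy as np

pixel = np.array([89, 104, 220], dtype=np.float64)  # B, G, R


def weightedGray(pixel, rw, gw, bw):
    # Weights are passed as red, green, blue but applied in B, G, R order.
    return float(np.dot(pixel, [bw, gw, rw]))


balanced = weightedGray(pixel, 0.25, 0.5, 0.25)  # weights sum to 1.0
darker = weightedGray(pixel, 0.1, 0.3, 0.1)      # weights sum to 0.5
brighter = weightedGray(pixel, 0.5, 0.6, 0.5)    # weights sum to 1.6

print(balanced, darker, brighter)
```

The three results come out to roughly 129, 62, and 217, so the same pixel moves toward black or white as the weight sum moves away from 1.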

We could also use our function to display the grayscale output of each
color layer.

```python
printI3(generateGrayScale(lenna, 1.0, 0.0, 0.0),
        generateGrayScale(lenna, 0.0, 1.0, 0.0),
        generateGrayScale(lenna, 0.0, 0.0, 1.0))
```

![png](media/cv1/output_21_0.png)

Based on this output, the red layer is the brightest, which makes sense
because the majority of the image is in a pinkish/red tone.

# Pixel Operations

Pixel operations are simply things that you do to every pixel in the
image.

## Negative

To take the negative of an image, you simply invert it. I.e., if a
pixel was 0, it would now be 255, and if a pixel was 255, it would now
be 0. Since all the images are 8-bit unsigned integers, every value
stays between 0 and 255, and a value that passes a boundary
automatically wraps around. With NumPy, if you subtract a matrix from
a number, the operation is applied to every element in that matrix --
neat. Therefore, if we want to invert an image, we can just subtract
the image from 255.

```python
invert_lenna = 255 - lenna
printI(invert_lenna)
```

![png](media/cv1/output_25_0.png)
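
A tiny NumPy-only sketch of the same subtraction, using a made-up row of three pixel values, shows the element-wise behavior:

```python
import numpy as np

row = np.array([0, 100, 255], dtype=np.uint8)

# The scalar is broadcast, so every element becomes 255 minus its old value.
inverted = 255 - row

print(inverted)        # [255 155   0]
print(inverted.dtype)  # uint8
```

Inverting twice gives back the original values, since 255 - (255 - v) = v.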

## Darken And Lighten

To brighten or darken an image, you can add or subtract a constant,
because that pushes every pixel closer towards 255 (white) or 0
(black).

```python
bright_bad_lenna = lenna + 25
printI(bright_bad_lenna)
```

![png](media/cv1/output_28_0.png)

Notice that the image got brighter, but in some parts the colors look
inverted. This is because uint8 addition wraps around: when we add to
an image and don't want the values to wrap, we have to clip the result
to the range 0 to 255. I.e., when we add a constant to a pixel with a
value of 240, we don't want it to wrap back around past 0; we just
want it to stay at 255. OpenCV has built-in functions for this.

```python
def brightenImg(img, num):
    a = np.zeros(img.shape, dtype=np.uint8)
    a[:] = num
    return cv2.add(img, a)


def darkenImg(img, num):
    a = np.zeros(img.shape, dtype=np.uint8)
    a[:] = num
    return cv2.subtract(img, a)


brighten_lenna = brightenImg(lenna, 50)
darken_lenna = darkenImg(lenna, 50)
printI2(brighten_lenna, darken_lenna)
```

![png](media/cv1/output_30_0.png)
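
The difference between wrapping and clipping can be seen on two made-up pixel values. This is a NumPy-only sketch of the saturating behavior, not how OpenCV implements it internally.

```python
import numpy as np

row = np.array([10, 240], dtype=np.uint8)

# Plain uint8 addition wraps modulo 256: 240 + 25 = 265 -> 9.
wrapped = row + np.uint8(25)

# Saturating addition: widen the type, add, clip to [0, 255], narrow again.
saturated = np.clip(row.astype(np.int16) + 25, 0, 255).astype(np.uint8)

print(wrapped)    # [35  9]
print(saturated)  # [ 35 255]
```

The bright pixel at 240 turns nearly black when it wraps, which is what produces the inverted-looking patches in the first brightened image.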

## Contrast

Adjusting the contrast of an image is a matter of multiplying the
image by a constant. Multiplying by a number greater than 1 would
increase the contrast and multiplying by a number lower than 1 would
decrease the contrast.

```python
def adjustContrast(img, amount):
    """
    Converts the image to floating point so we can scale the contrast
    by non-integer amounts, then clips the values and converts back
    to uint8 at the end.
    """
    a = np.zeros(img.shape, dtype=np.float32)
    a[:] = amount
    b = img.astype(float)
    c = np.multiply(a, b)
    np.clip(c, 0, 255, out=c) # clips between 0 and 255
    return c.astype(np.uint8)
```

```python
printI2(adjustContrast(lenna, 0.8), adjustContrast(lenna, 1.3))
```

![png](media/cv1/output_33_0.png)
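
The multiply, clip, and convert steps can be traced on two made-up pixel values. The helper `scale` below is a hypothetical, condensed version of the same idea.

```python
import numpy as np

row = np.array([100, 200], dtype=np.uint8)


def scale(values, amount):
    # Multiply in floating point, clip to the valid range, then narrow.
    scaled = values.astype(np.float32) * amount
    return np.clip(scaled, 0, 255).astype(np.uint8)


print(scale(row, 0.5))   # [ 50 100]
print(scale(row, 1.5))   # [150 255]
```

With a factor of 0.5 the gap between the two pixels shrinks from 100 to 50, and with 1.5 the brighter pixel would reach 300, so it is clipped to 255.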

# Noise

In most cases you don't want to add random noise to your image;
however, when testing some algorithms, it becomes necessary. Noise is
anything that makes the image imperfect. In the "real world" this is
usually in the form of dead pixels on your camera sensor or other
things distorting your view.

## Salt and Pepper

Salt and pepper noise is adding random black and white pixels to your
image.

```python
import random


def uniformNoise(image, num):
    img = image.copy()
    h, w, c = img.shape
    x = np.random.uniform(0,w,num)
    y = np.random.uniform(0,h,num)
    for i in range(0, num):
        r = 0 if random.randrange(0,2) == 0 else 255
        # images are indexed [row][column], i.e. [y][x]
        img[int(y[i])][int(x[i])] = np.asarray([r, r, r])
    return img


printI2(uniformNoise(lenna, 1000), uniformNoise(lenna, 7000))
```

![png](media/cv1/output_36_0.png)
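
The same idea can also be written without a Python loop. The following is a sketch of one possible vectorized variant, assuming a 3-channel image; the function name `saltAndPepper` and its `seed` parameter are hypothetical and not part of OpenCV.

```python
import numpy as np


def saltAndPepper(image, num, seed=None):
    """Set `num` randomly chosen pixels of a 3-channel image to black or white."""
    rng = np.random.default_rng(seed)
    img = image.copy()
    h, w = img.shape[:2]
    rows = rng.integers(0, h, size=num)   # row indices are bounded by the height
    cols = rng.integers(0, w, size=num)   # column indices are bounded by the width
    values = rng.choice(np.array([0, 255], dtype=np.uint8), size=num)
    img[rows, cols] = values[:, None]     # same value in all three channels
    return img


# Try it on a made-up, non-square gray image.
gray_square = np.full((20, 30, 3), 128, dtype=np.uint8)
noisy = saltAndPepper(gray_square, 50, seed=0)
print(np.unique(noisy))
```

Because positions are drawn independently, the same pixel can be picked more than once, so slightly fewer than `num` pixels may end up changed.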

# Image Denoising

It is possible to remove the salt and pepper noise from an image to
clean it up. Unlike how my professor worded it, this is not
"enhancing" the image; it is merely using filters that remove the
noise from the image by blurring it.

## Moving Average

The moving average technique sets each pixel equal to the average of
its neighborhood. The bigger your neighborhood, the more the image is
blurred.

```python
bad_lenna = uniformNoise(lenna, 6000)
blur_lenna = cv2.blur(bad_lenna,(3,3))
printI2(bad_lenna, blur_lenna)
```

![png](media/cv1/output_39_0.png)
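
To see what the averaging does to a single noisy pixel, consider a made-up 3x3 neighborhood that is flat except for one "salt" pixel in the center. This sketch computes only the value the center pixel would receive.

```python
import numpy as np

# A flat 3x3 neighborhood with one "salt" pixel in the center.
patch = np.full((3, 3), 100, dtype=np.uint8)
patch[1, 1] = 255

# A 3x3 moving average replaces the center with the mean of all nine values.
mean_value = float(patch.mean())
print(round(mean_value))   # 117
```

The outlier is diluted rather than removed: the center drops from 255 to about 117, and every neighbor whose window includes it is pulled above 100 as well.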

As you can see, most of the noise was removed from the image, but
imperfections were left. To see the effects of the filter size, you
can play around with it.

```python
blur_lenna_3 = cv2.blur(bad_lenna,(3,3))
blur_lenna_8 = cv2.blur(bad_lenna,(8,8))
printI2(blur_lenna_3, blur_lenna_8)
```

![png](media/cv1/output_41_0.png)

## Median Filtering

Median filters transform every pixel by taking the median value of its
neighborhood. This is a lot better than average filters for noise
reduction because it has less of a blurring effect and it is extremely
good at removing outliers like salt and pepper noise.

```python
median_lenna = cv2.medianBlur(bad_lenna,3)
printI2(bad_lenna, median_lenna)
```

![png](media/cv1/output_43_0.png)
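
The reason the median handles outliers so well shows up on a single made-up 3x3 neighborhood containing one "salt" pixel. This sketch computes only the value the center pixel would receive.

```python
import numpy as np

# A flat 3x3 neighborhood with one "salt" pixel in the center.
patch = np.full((3, 3), 100, dtype=np.uint8)
patch[1, 1] = 255

# Sorted, the nine values are eight 100s followed by 255, so the middle is 100.
median_value = float(np.median(patch))
print(median_value)   # 100.0
```

The outlier sits at the end of the sorted list and never reaches the middle, so it disappears completely instead of being smeared into its neighbors.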

# Remarks

OpenCV is a very powerful framework for image manipulation. This
post only covered some of the more basic applications of OpenCV.
Future posts might explore some of the more advanced techniques in
computer vision like filters, Canny edge detection, template matching,
and Harris corner detection.