Repository where I mostly put random python scripts.

587 lines
372 KiB
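
The notebook below walks through k-means in four steps: pick centroids, assign points, recompute centers, repeat. As a compact companion, here is a minimal vectorized NumPy sketch of those steps. It is not the notebook's own `kmeansAlgo`; the names `kmeans_sketch`, `max_iter`, and `seed` are assumptions made for illustration, as is the choice to leave an empty cluster's centroid where it was.

```python
import numpy as np


def kmeans_sketch(data, k, max_iter=100, seed=0):
    """Cluster the rows of `data` into k groups; returns (assignments, centroids)."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 1: pick k distinct data points at random as the initial centroids.
    centroids = data[rng.choice(len(data), size=k, replace=False)].copy()
    assignments = np.full(len(data), -1)
    for _ in range(max_iter):
        # Step 2: distance from every point to every centroid, shape (n, k).
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        new_assignments = dists.argmin(axis=1)
        # Step 4: stop once no assignment changes.
        if np.array_equal(new_assignments, assignments):
            break
        assignments = new_assignments
        # Step 3: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = data[assignments == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return assignments, centroids


# Two well-separated blobs should end up in two different clusters.
points = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                   [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels, centers = kmeans_sketch(points, k=2)
print(labels)
```

Which blob receives label 0 depends on the random initial pick, so only the grouping, not the label numbers, should be relied on.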

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# K-Means Algorithm\n",
    "\n",
    "The general idea of clustering is to group data with similar traits. The main benefit is that you can extract information from new data by finding the cluster it is most similar to. In machine learning this is considered unsupervised learning because it requires no labels on the data: the algorithm assigns clusters automatically, and you infer behavior from those clusters.\n",
    "\n",
    "Clustering has many applications, such as image segmentation, preference prediction, compression, and model fitting.\n",
    "\n",
    "Although the idea of k-means clustering can be traced back to a 1956 paper by Hugo Steinhaus, James MacQueen was the first to coin the term k-means, in 1967. MacQueen's paper, titled \"Some Methods for Classification and Analysis of Multivariate Observations\", describes the k-means process, which partitions an N-dimensional population into k sets. Note: throughout the algorithm, k is the number of sets into which we divide the population.\n",
    "\n",
    "\n",
    "A great deal of MacQueen's article discusses optimality for the k-means algorithm, which is important to consider given when the article was published. Back in 1967, computers were very slow and expensive. Although an optimal solution could in principle always be found, finding it is an NP-hard problem. This is critical because no polynomial-time algorithm is known for NP-hard problems, so the best known exact methods take exponential time in the worst case.\n",
    "\n",
    "Although the k-means algorithm does not guarantee an optimal solution, there is a subset of problems for which it does; the specifics of these problems are discussed later in the article. Nevertheless, since the algorithm is not computationally expensive and generally gives good results, it was a huge breakthrough at the time.\n",
    "\n",
    "\n",
    "This algorithm can be broken down into four major steps:\n",
    "\n",
    "## Step 1:\n",
    "\n",
    "Pick k random points as cluster centers, called centroids.\n",
    "\n",
    "## Step 2:\n",
    "\n",
    "Assign each point to the nearest cluster by calculating its distance to each centroid.\n",
    "\n",
    "## Step 3:\n",
    "\n",
    "Find each new cluster center by taking the average of the points assigned to that cluster.\n",
    "\n",
    "## Step 4:\n",
    "\n",
    "Repeat steps 2 and 3 until no cluster assignments change."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python Implementation\n",
    "\n",
    "Implementing this in Python is rather straightforward. Given the data, we cluster it into k sets and return the cluster assignments and the cluster centers."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sys\n",
    "import numpy as np\n",
    "\n",
    "def distToClust(val, clusterCenter):\n",
    "    \"\"\"\n",
    "    Euclidean distance from a value to a cluster center;\n",
    "    swap in a different distance measure here if desired\n",
    "    \"\"\"\n",
    "    return np.linalg.norm(val-clusterCenter)\n",
    "\n",
    "\n",
    "def closestCenter(val, clusters):\n",
    "    \"\"\"\n",
    "    Finds the index of the cluster closest to the\n",
    "    presented value\n",
    "    \"\"\"\n",
    "    curMin = sys.maxsize\n",
    "    curIndex = 0\n",
    "    for k in range(0, len(clusters)):\n",
    "        d = distToClust(val, clusters[k])\n",
    "        if d < curMin:\n",
    "            curIndex = k\n",
    "            curMin = d\n",
    "    return curIndex\n",
    "\n",
    "\n",
    "def kmeansAlgo(k, data):\n",
    "    \"\"\"\n",
    "    k: number of clusters\n",
    "    data: nxd numpy matrix where n is number of elements\n",
    "          and d is the number of dimensions in the data.\n",
    "\n",
    "    return: tuple of assignments and clusters where clusters\n",
    "            is a list of centroids and assignments maps each value\n",
    "            to a single cluster\n",
    "    \"\"\"\n",
    "\n",
    "    n = data.shape[0] # length of data to cluster\n",
    "    d = data.shape[1] # dimensionality of data\n",
    "\n",
    "    # maps each element in data to a cluster\n",
    "    # (plain int: the np.int alias was removed from recent NumPy releases)\n",
    "    assignments = np.zeros(n, dtype=int)\n",
    "\n",
    "    # use the first k data points as the initial centroids\n",
    "    # (a simple stand-in for the random pick described in step 1)\n",
    "    clusters = []\n",
    "    for i in range(0, k):\n",
    "        clusters.append(data[i])\n",
    "\n",
    "    reAssigned = True\n",
    "    generations = 0\n",
    "    while reAssigned:\n",
    "        reAssigned = False\n",
    "\n",
    "        # assign clusters\n",
    "        for i in range(0, n):\n",
    "            c = closestCenter(data[i], clusters)\n",
    "            if c != assignments[i]:\n",
    "                reAssigned = True\n",
    "                assignments[i] = c\n",
    "\n",
    "        # re-compute centers\n",
    "        clusterValues = []\n",
    "        for _ in range(0, k):\n",
    "            clusterValues.append([])\n",
    "        for i in range(0, n):\n",
    "            clusterValues[assignments[i]].append(data[i])\n",
    "        for i in range(0, k):\n",
    "            # an empty cluster keeps its old center instead of becoming NaN\n",
    "            if len(clusterValues[i]) > 0:\n",
    "                clusters[i] = np.average(clusterValues[i], axis=0)\n",
    "        generations = generations + 1\n",
    "    print(\"Clustering took \" + str(generations) + \" generations\")\n",
    "    return assignments, clusters"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Image Segmentation\n",
    "\n",
    "Using our k-means algorithm we can cluster the pixels in an image together.\n",
    "\n",
    "## Clustering on Color\n",
    "\n",
    "When we cluster the pixels of an image based on color, we map pixels with similar colors to the same cluster. An image is a three-dimensional matrix (height, width, color channels), so each pixel is a point in three-dimensional color space. When clustering, the input is simply the list of pixel values. Note: as far as the k-means algorithm is concerned, there are no coordinates, just a list of pixels; only the RGB values of the pixels are clustered.\n",
    "\n",
    "To make things run a bit faster, we are going to use the k-means implementation from sklearn."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cv2\n",
    "from sklearn.cluster import KMeans\n",
    "\n",
    "def segmentImgClrRGB(imgFilename, k):\n",
    "\n",
    "    #1. Load the image\n",
    "    img = cv2.imread(imgFilename)\n",
    "\n",
    "    h = img.shape[0]\n",
    "    w = img.shape[1]\n",
    "\n",
    "    #2. Flatten the image into a list of pixels, one row of three color values per pixel\n",
    "    img.shape = (img.shape[0] * img.shape[1], 3)\n",
    "\n",
    "    #3. Run k-means on the pixel list to get a vector of labels (the clusters)\n",
    "    kmeans = KMeans(n_clusters=k, random_state=0).fit(img).labels_\n",
    "\n",
    "    #4. Reshape the label results of k-means so that it has the same size as the input image\n",
    "    #   Return the label image which we call idx\n",
    "    kmeans.shape = (h, w)\n",
    "\n",
    "    return kmeans"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "After we have our pixel assignments, we want a useful way to display them. Here we color each pixel with the average color of its assigned cluster. For example, if the algorithm ran with three clusters, the new image would contain only three colors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "import cv2\n",
    "import numpy as np\n",
    "from sklearn.cluster import KMeans\n",
    "\n",
    "def colorClustering(idx, imgFilename, k):\n",
    "    img = cv2.imread(imgFilename)\n",
    "\n",
    "    # collect the pixel values that belong to each cluster\n",
    "    clusterValues = []\n",
    "    for _ in range(0, k):\n",
    "        clusterValues.append([])\n",
    "\n",
    "    for r in range(0, idx.shape[0]):\n",
    "        for c in range(0, idx.shape[1]):\n",
    "            clusterValues[idx[r][c]].append(img[r][c])\n",
    "\n",
    "    imgC = np.copy(img)\n",
    "\n",
    "    # average color of each cluster\n",
    "    clusterAverages = []\n",
    "    for i in range(0, k):\n",
    "#        print(len(clusterValues[i])/(idx.shape[1]*idx.shape[0]))\n",
    "        clusterAverages.append(np.average(clusterValues[i], axis=0))\n",
    "\n",
    "    # paint every pixel with the average color of its cluster\n",
    "    for r in range(0, idx.shape[0]):\n",
    "        for c in range(0, idx.shape[1]):\n",
    "            imgC[r][c] = clusterAverages[idx[r][c]]\n",
    "\n",
    "    return imgC"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next we need a way of displaying the images; I usually use matplotlib. I'm also displaying the image that we are going to use for the clustering."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "(base64-encoded PNG output, truncated in the source)",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "import cv2\n",
    "import matplotlib.pyplot as plt\n",
    "\n",
    "def printI(img):\n",
    "    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)\n",
    "    plt.imshow(rgb)\n",
    "\n",
    "printI(cv2.imread(\"bg.jpg\"))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "***Segmentation done***\n"
     ]
    },
    {
     "data": {
      "image/png": "(base64-encoded PNG output, truncated in the source)",
      "text/plain": [
       "<Figure size 432x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "idx = segmentImgClrRGB(\"bg.jpg\", 5)\n",
    "res = colorClustering(idx, \"bg.jpg\", 5)\n",
    "printI(res)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just as you can segment with RGB values, you can also try different color spaces such as HSV. Changing the distance measure would also give you different results.\n",
    "\n",
    "\n",
    "\n",
    "## Texture Clustering\n",
    "\n",
    "Similarly to color segmentation, you can cluster based on texture. The premise of this method is that you generate a bank of filters with different shapes, sizes, and scales. When you apply this filter bank to a pixel, its responses will be similar to those of pixels in regions with a similar texture.\n",
    "\n",
    "\n",
    "The first step is to construct a filter bank."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [],
   "source": [
    "'''\n",
    "The Leung-Malik (LM) Filter Bank, implementation in python\n",
    "\n",
    "T. Leung and J. Malik. Representing and recognizing the visual appearance of\n",
    "materials using three-dimensional textons. International Journal of Computer\n",
    "Vision, 43(1):29-44, June 2001.\n",
    "\n",
    "Reference: http://www.robots.ox.ac.uk/~vgg/research/texclass/filters.html\n",
    "'''\n",
    "\n",
    "def gaussian1d(sigma, mean, x, ord):\n",
    "    x = np.array(x)\n",
    "    x_ = x - mean\n",
    "    var = sigma**2\n",
    "\n",
    "    # Gaussian Function\n",
    "    g1 = (1/np.sqrt(2*np.pi*var))*(np.exp((-1*x_*x_)/(2*var)))\n",
    "\n",
    "    if ord == 0:\n",
    "        g = g1\n",
    "        return g\n",
    "    elif ord == 1:\n",
    "        g = -g1*((x_)/(var))\n",
    "        return g\n",
    "    else:\n",
    "        g = g1*(((x_*x_) - var)/(var**2))\n",
    "        return g\n",
    "\n",
    "def gaussian2d(sup, scales):\n",
    "    var = scales * scales\n",
    "    shape = (sup,sup)\n",
    "    n,m = [(i - 1)/2 for i in shape]\n",
    "    x,y = np.ogrid[-m:m+1,-n:n+1]\n",
    "    g = (1/np.sqrt(2*np.pi*var))*np.exp( -(x*x + y*y) / (2*var) )\n",
    "    return g\n",
    "\n",
    "def log2d(sup, scales):\n",
    "    var = scales * scales\n",
    "    shape = (sup,sup)\n",
    "    n,m = [(i - 1)/2 for i in shape]\n",
    "    x,y = np.ogrid[-m:m+1,-n:n+1]\n",
    "    g = (1/np.sqrt(2*np.pi*var))*np.exp( -(x*x + y*y) / (2*var) )\n",
    "    h = g*((x*x + y*y) - var)/(var**2)\n",
    "    return h\n",
    "\n",
    "def makefilter(scale, phasex, phasey, pts, sup):\n",
    "\n",
    "    gx = gaussian1d(3*scale, 0, pts[0,...], phasex)\n",
    "    gy = gaussian1d(scale, 0, pts[1,...], phasey)\n",
    "\n",
    "    image = gx*gy\n",
    "\n",
    "    image = np.reshape(image,(sup,sup))\n",
    "    return image\n",
    "\n",
    "def makeLMfilters():\n",
    "    sup = 49\n",
    "    scalex = np.sqrt(2) * np.array([1,2,3])\n",
    "    norient = 6\n",
    "    nrotinv = 12\n",
    "\n",
    "    nbar = len(scalex)*norient\n",
    "    nedge = len(scalex)*norient\n",
    "    nf = nbar+nedge+nrotinv\n",
    "    F = np.zeros([sup,sup,nf])\n",
    "    hsup = (sup - 1)/2\n",
    "\n",
    "    x = [np.arange(-hsup,hsup+1)]\n",
    "    y = [np.arange(-hsup,hsup+1)]\n",
    "\n",
    "    [x,y] = np.meshgrid(x,y)\n",
    "\n",
    "    orgpts = [x.flatten(), y.flatten()]\n",
    "    orgpts = np.array(orgpts)\n",
    "\n",
    "    count = 0\n",
    "    for scale in range(len(scalex)):\n",
    "        for orient in range(norient):\n",
    "            angle = (np.pi * orient)/norient\n",
    "            c = np.cos(angle)\n",
    "            s = np.sin(angle)\n",
    "            rotpts = [[c+0,-s+0],[s+0,c+0]]\n",
    "            rotpts = np.array(rotpts)\n",
    "            rotpts = np.dot(rotpts,orgpts)\n",
    "            F[:,:,count] = makefilter(scalex[scale], 0, 1, rotpts, sup)\n",
    "            F[:,:,count+nedge] = makefilter(scalex[scale], 0, 2, rotpts, sup)\n",
    "            count = count + 1\n",
    "\n",
    "    count = nbar+nedge\n",
    "    scales = np.sqrt(2) * np.array([1,2,3,4])\n",
    "\n",
    "    for i in range(len(scales)):\n",
    "        F[:,:,count] = gaussian2d(sup, scales[i])\n",
    "        count = count + 1\n",
    "\n",
    "    for i in range(len(scales)):\n",
    "        F[:,:,count] = log2d(sup, scales[i])\n",
    "        count = count + 1\n",
    "\n",
    "    for i in range(len(scales)):\n",
    "        F[:,:,count] = log2d(sup, 3*scales[i])\n",
    "        count = count + 1\n",
    "\n",
    "    return F\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now that we have our filters, we display them to verify that they are what we expect."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "(base64-encoded PNG output, truncated in the source)",
      "text/plain": [
       "<Figure size 1440x1440 with 48 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "def displayFilterBank():\n",
    "\n",
    "    fig, axs = plt.subplots(6, 8, figsize=(20,20))\n",
    "    F = makeLMfilters()\n",
    "    for row in range(0, 6):\n",
    "        for col in range(0, 8):\n",
    "            # 8 columns per row, so the flat filter index is row * 8 + col\n",
    "            f = F[:, :, row * 8 + col]\n",
    "            axs[row][col].imshow(f, cmap='hot', interpolation='nearest')\n",
    "            axs[row][col].grid(True)\n",
    "#    plt.show()\n",
    "    plt.savefig('filters.png')\n",
    "displayFilterBank()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With the filter bank, we can now cluster pixels based on their responses to each filter. When we clustered on color we had three dimensions; with our filter bank we now have 48 dimensions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [],
   "source": [
    "def generateGrayScale(image):\n",
    "    w = np.array([[[ 0.25, 0.5, 0.25]]])\n",
    "    gray2 = cv2.convertScaleAbs(np.sum(image*w, axis=2))\n",
    "    return gray2\n",
    "\n",
    "def segmentImg(imgFilename, clusters):\n",
    "    #1. Load and display the image from which you want to segment the foreground from the background\n",
    "    #   Make sure to convert your image to grayscale after loading\n",
    "    img = cv2.imread(imgFilename)\n",
    "    h = img.shape[0]\n",
    "    w = img.shape[1]\n",
    "    gImg = generateGrayScale(img)\n",
    "    # gImg has a single channel, so show it directly; printI expects a 3-channel BGR image\n",
    "    plt.imshow(gImg, cmap='gray')\n",
    "\n",
    "    #2. Create an overcomplete bank of filters F\n",
    "    F = makeLMfilters()\n",
    "\n",
    "    s = F.shape\n",
    "\n",
    "    #3. Convolve the input image with every filter in the bank of filters\n",
    "    #   (CV_32F keeps negative responses, which a uint8 output would clip to 0)\n",
    "    responses = [None] * s[2]\n",
    "    for k in range(0, s[2]):\n",
    "        fb = F[:, :, k]\n",
    "        response = cv2.filter2D(gImg, cv2.CV_32F, fb)\n",
    "        responses[k] = response\n",
    "\n",
    "    #4. Take the absolute values of the responses and\n",
    "    #   reshape the response tensor into a matrix of size [rows*cols, num_filters]\n",
    "    a = np.abs(np.stack(responses, axis=2)).reshape(h*w, s[2]).astype(float)\n",
    "\n",
    "    #5. Run k-means on the vectorized responses to get a vector of labels (the clusters)\n",
    "    kmeans = KMeans(n_clusters=clusters, random_state=0).fit(a).labels_\n",
    "\n",
    "    #6. Reshape the label results of k-means so that it has the same size as the input image\n",
    "    #   Return the label image which we call idx\n",
    "    kmeans.shape = (h, w)\n",
    "\n",
    "    return kmeans\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "image/png": "(base64-encoded PNG output, truncated in the source)",
fa37qflP25ZYpSUx9+OLYHA+kVXpXivhJniaEw9QhgNovU2khlhLEi2tPLRGrnbyfIobcjIahF4lEYrMthOB4PtLDA8o2NUV
  528. "text/plain": [
  529. "<Figure size 432x288 with 1 Axes>"
  530. ]
  531. },
  532. "metadata": {
  533. "needs_background": "light"
  534. },
  535. "output_type": "display_data"
  536. }
  537. ],
  538. "source": [
  539. "idx1 = segmentImg(\"bg.jpg\",6)\n",
  540. "res1 = colorClustering(idx1, \"bg.jpg\", 6)\n",
  541. "printI(res1)"
  542. ]
  543. },
  544. {
  545. "cell_type": "markdown",
  546. "metadata": {},
  547. "source": [
  548. "With the texture clustering, you can see that part of the ocean was in the same cluster as part of the sky or beach. This is to be expected because they were clustered on texture and not color. When doing texture detection, a common thing to pick up is blurriness. This makes texture clustering useful for things like background removal when your foreground is in focus and the background is blurred."
  549. ]
  550. },
  551. {
  552. "cell_type": "markdown",
  553. "metadata": {},
  554. "source": [
  555. "# Takeaways\n",
  556. "\n",
  557. "Because the k-means algorithm does not always converge on an optimal answer and is strongly affected by outliers, it is rarely used by itself. However, it is frequently used to bootstrap and influence other algorithms.\n",
  558. "As MacQueen stated in his article, \"there is no feasible, general method which always yield an optimal partition.\" Due to the nature of NP-Hard problems, this fact is unlikely to change anytime soon. More recently, people have refined k-means with a careful seeding step that makes it less likely to converge on a poor local minimum. This process is called \"k-means++\": each new initial centroid is sampled with probability proportional to its squared distance from the nearest centroid already chosen. Arthur and Vassilvitskii outline it in their 2007 paper titled \"k-means++: The Advantages of Careful Seeding\".\n",
  559. "\n",
  560. "Another major question in this field of research is: how do we choose k? Newer work with k-means and other clustering techniques looks into how we can automatically select k using the elbow technique or other techniques like GVF.\n",
  561. "\n",
  562. "K-means has had a lasting effect on the field of machine learning. Most textbooks and AI classes cover k-means clustering as a starting point when teaching people about unsupervised learning. Moreover, algorithms to this day are still using k-means as a tool behind the scenes to pre-process data before it gets fed to the next step in the data pipeline. \n"
  563. ]
  564. }
  565. ],
  566. "metadata": {
  567. "kernelspec": {
  568. "display_name": "Python 3",
  569. "language": "python",
  570. "name": "python3"
  571. },
  572. "language_info": {
  573. "codemirror_mode": {
  574. "name": "ipython",
  575. "version": 3
  576. },
  577. "file_extension": ".py",
  578. "mimetype": "text/x-python",
  579. "name": "python",
  580. "nbconvert_exporter": "python",
  581. "pygments_lexer": "ipython3",
  582. "version": "3.7.6"
  583. }
  584. },
  585. "nbformat": 4,
  586. "nbformat_minor": 4
  587. }