Cartoonify Reality is an OpenCV project that uses core OpenCV functions and the K-Means clustering algorithm to process images and give them a cartoon look. This project shows how well classical image processing can handle a task that usually calls for a machine learning model and heavy computation; the same effect is typically achieved with auto-encoders or GANs (Generative Adversarial Networks). In this blog, we will explore more of Computer Vision and Image Processing.
The workflow is as described below-
We will start by working on images as input; the same pipeline can later be applied frame by frame to videos. Take the source image and blur it using a bilateral filter. Use Canny edge detection to find its edges. Then convert the image to HSV (Hue-Saturation-Value) format and apply K-Means clustering to it. Convert it back to RGB (Red-Green-Blue) format and draw contours on the image. Finally, use erosion to thicken the boundaries and display the output image.
CONTENT
- Computer Vision
- OpenCV
- Blurring
- Canny Edge-detection
- K-Means Clustering
- Contours
- Erosion
- Raspberry Pi
- Output
Computer Vision
Computer Vision (CV) is a field of computing that deals with replicating the way the human eye perceives and interprets images. Various algorithms and formulae are developed to help computers visualize better. Input is taken from cameras, and the output varies with the use case. The most famous application today is object detection.
OpenCV
OpenCV is an open-source library containing pre-built functions and algorithms for implementing image processing and computer vision. The Python package we will be using is “cv2”. It is one of the most widely used libraries in Python, with over 18 million downloads. We will use some of its built-in functions to process our input image.
Start by creating a new Python file and importing cv2.
import cv2
Define the main function of our code-
# takes the input image as a parameter
def cartoon(image):
    # ... processing steps added in the sections below ...
    # returns the output image
    return output
Blurring
Blurring, in terms of digital image processing, means smoothing an image to remove noise from it. Filtering is one of the fundamental operations of image processing. Filters like the Gaussian blur and the median blur smooth an image, but they also tend to smooth the edges. To keep the edges sharp we will use the bilateral filter, as the short comparison below shows.
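As a rough comparison of the two approaches (a minimal sketch; the file name "original.jpg" is just a placeholder), note how the Gaussian blur smooths everything while the bilateral filter keeps the edges:
import cv2

img = cv2.imread("original.jpg")

# Gaussian blur: smooths flat regions and edges alike
gaussian = cv2.GaussianBlur(img, (5, 5), 0)

# bilateral filter: smooths flat regions but preserves edges
# (diameter=5, sigmaColor=150, sigmaSpace=150, the same values used below)
bilateral = cv2.bilateralFilter(img, 5, 150, 150)

cv2.imwrite("gaussian.jpg", gaussian)
cv2.imwrite("bilateral.jpg", bilateral)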
Now we will apply the bilateral filter inside our cartoon function-
STEP-1:
import numpy as np

# copy the input image into a NumPy array and read its dimensions
output = np.array(image)
x, y, c = output.shape
STEP-2:
# filter each color channel separately
for i in range(c):
    output[:, :, i] = cv2.bilateralFilter(output[:, :, i], 5, 150, 150)
We filter each channel of the colored image separately, treating each color plane as its own 2-D image.
Canny Edge-detection
Canny is a famous edge-detection technique based on the algorithm John Canny published in 1986. Since any edge-detection technique is prone to noise in the image, we applied the bilateral filter first to reduce it.
Now, we will code the Canny edge detection:
# 100 and 200 are the lower and upper hysteresis thresholds
edge = cv2.Canny(output, 100, 200)
K-Means Clustering
K-Means clustering is an algorithm that partitions n observations into k clusters. Why do we need it here? Think of the problem pixel by pixel: each color channel of a pixel is stored in 8 bits, so it can take 256 possible shades/values. If we cluster those possible shades (the observations) into a small number of clusters (k) and replace each shade by the centroid of its cluster, the number of distinct shades is greatly reduced. The output image then contains flat regions of the same shade instead of a wide variety of shades, which is exactly the cartoon look we want. A small illustration of this idea is sketched below.
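To make the idea concrete, here is a small, self-contained illustration of shade reduction using OpenCV's built-in cv2.kmeans. This is only an illustration; the project itself uses the histogram-based clustering described in the steps below, and the file names are placeholders:
import cv2
import numpy as np

img = cv2.imread("original.jpg")

# flatten the image into a list of pixels; cv2.kmeans expects float32 data
pixels = img.reshape((-1, 3)).astype(np.float32)

# cluster all pixel colors into k = 5 shades
k = 5
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
_, labels, centers = cv2.kmeans(pixels, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)

# replace every pixel by the centroid of its cluster and restore the image shape
quantized = centers[labels.flatten()].astype(np.uint8).reshape(img.shape)
cv2.imwrite("quantized.jpg", quantized)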
STEP-1: Convert the image from RGB to HSV so that the clustering can work on each channel's histogram. Compute a histogram for each channel and append it to a list (in OpenCV, Hue ranges from 0 to 179, while Saturation and Value range from 0 to 255, hence the different bin counts).
output = cv2.cvtColor(output, cv2.COLOR_RGB2HSV)
hists = []

# Hue histogram (values 0-179)
hist, _ = np.histogram(output[:, :, 0], bins=np.arange(180 + 1))
hists.append(hist)

# Saturation histogram (values 0-255)
hist, _ = np.histogram(output[:, :, 1], bins=np.arange(256 + 1))
hists.append(hist)

# Value histogram (values 0-255)
hist, _ = np.histogram(output[:, :, 2], bins=np.arange(256 + 1))
hists.append(hist)
STEP-2: Create the following function to update the centroid values. It assigns each non-empty histogram bin to its nearest centroid, recomputes every centroid as the histogram-weighted mean of its bins, and repeats until the centroids stop changing.
from collections import defaultdict

def update_c(C, hist):
    while True:
        groups = defaultdict(list)
        # assign every non-empty histogram bin to its nearest centroid
        for i in range(len(hist)):
            if hist[i] == 0:
                continue
            d = np.abs(C - i)
            index = np.argmin(d)
            groups[index].append(i)

        # recompute each centroid as the histogram-weighted mean of its bins
        new_C = np.array(C)
        for i, indice in groups.items():
            if np.sum(hist[indice]) == 0:
                continue
            new_C[i] = int(np.sum(indice * hist[indice]) / np.sum(hist[indice]))

        # stop once the centroids no longer change
        if np.sum(new_C - C) == 0:
            break
        C = new_C
    return C, groups
STEP-3: Create the function below. It starts with a single centroid, calls the update function above, and then splits any sufficiently large cluster whose histogram is not normally distributed, so the number of centroids adapts to each channel.
from scipy import stats

def K_histogram(hist):
    alpha = 0.001        # significance level for the normality test
    N = 80               # minimum cluster size required before considering a split
    C = np.array([128])  # start with a single centroid

    while True:
        C, groups = update_c(C, hist)

        # decide whether each cluster should be split into two
        new_C = set()
        for i, indice in groups.items():
            if len(indice) < N:
                new_C.add(C[i])
                continue

            # split the cluster if its histogram is not normally distributed
            z, pval = stats.normaltest(hist[indice])
            if pval < alpha:
                left = 0 if i == 0 else C[i - 1]
                right = len(hist) - 1 if i == len(C) - 1 else C[i + 1]
                delta = right - left
                if delta >= 3:
                    c1 = (C[i] + left) / 2
                    c2 = (C[i] + right) / 2
                    new_C.add(c1)
                    new_C.add(c2)
                else:
                    new_C.add(C[i])
            else:
                new_C.add(C[i])

        # stop when no cluster was split
        if len(new_C) == len(C):
            break
        else:
            C = np.array(sorted(new_C))
    return C
STEP-4: Compute the centroids for each channel by calling K_histogram on its histogram. Then snap each pixel value to its nearest centroid, reshape the image back to its original dimensions, and convert it to RGB again.
C = []
for h in hists:
    C.append(K_histogram(h))

output = output.reshape((-1, c))
for i in range(c):
    channel = output[:, i]
    # index of the nearest centroid for every pixel of this channel
    index = np.argmin(np.abs(channel[:, np.newaxis] - C[i]), axis=1)
    output[:, i] = C[i][index]
output = output.reshape((x, y, c))
output = cv2.cvtColor(output, cv2.COLOR_HSV2RGB)
Contours
A contour is a closed curve joining all the points that share the same intensity or color. Contours help visualize the shapes of objects in an image. To find them, the image usually has to be pre-processed with Canny or some form of thresholding. The function's parameters can get tricky, so it is worth going through its documentation. We will use contours to draw boundaries on our output image so that each object is distinctly visible.
STEP-1: Find the contours using the Canny edge image as the source.
contours, _ = cv2.findContours(edge, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
STEP-2: Draw the contours on the output image.
cv2.drawContours(output, contours, -1, 0, thickness=1)
Erosion
Erosion is a morphological operation that deals with altering shapes in an image. Dilation and erosion are sister processes: in simple terms, they are used to thicken or thin shapes and boundaries in an image. We will use erosion to thicken the contour boundaries and make them stand out.
We will code erosion using OpenCV now:
kernel = np.ones((2, 2), np.uint8)  # small all-ones structuring element (assumed size; any small kernel works)
for i in range(3):
    output[:, :, i] = cv2.erode(output[:, :, i], kernel, iterations=1)
As with the bilateral filter, we loop over the color channels and erode each one as its own 2-D image.
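Putting the pieces together, the cartoon function looks roughly like the sketch below. This is just a minimal assembly of the snippets shown so far, assuming update_c and K_histogram are defined in the same file; the erosion kernel size is the assumed default from the step above:
import cv2
import numpy as np

def cartoon(image):
    output = np.array(image)
    x, y, c = output.shape

    # smooth each channel while preserving edges
    for i in range(c):
        output[:, :, i] = cv2.bilateralFilter(output[:, :, i], 5, 150, 150)

    # detect edges for the contour step
    edge = cv2.Canny(output, 100, 200)

    # histogram of each HSV channel, then cluster its shades
    output = cv2.cvtColor(output, cv2.COLOR_RGB2HSV)
    hists = [np.histogram(output[:, :, 0], bins=np.arange(180 + 1))[0],
             np.histogram(output[:, :, 1], bins=np.arange(256 + 1))[0],
             np.histogram(output[:, :, 2], bins=np.arange(256 + 1))[0]]
    C = [K_histogram(h) for h in hists]

    # snap every pixel to the nearest centroid of its channel
    output = output.reshape((-1, c))
    for i in range(c):
        channel = output[:, i]
        index = np.argmin(np.abs(channel[:, np.newaxis] - C[i]), axis=1)
        output[:, i] = C[i][index]
    output = output.reshape((x, y, c))
    output = cv2.cvtColor(output, cv2.COLOR_HSV2RGB)

    # outline the objects and thicken the outlines
    contours, _ = cv2.findContours(edge, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    cv2.drawContours(output, contours, -1, 0, thickness=1)
    kernel = np.ones((2, 2), np.uint8)
    for i in range(3):
        output[:, :, i] = cv2.erode(output[:, :, i], kernel, iterations=1)

    return output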
Raspberry Pi
Raspberry Pi is a small computer with its own RAM, processor, and other components. It mostly runs Raspbian as its operating system but, as of today, it can support almost anything. This device has brought about a huge change in the field of IoT: from home automation to self-driving cars, it is used in almost every hardware project. We will use it to capture input images for our code.
1. Installation
Installing OpenCV on a Raspberry Pi can be tricky, but in most cases it can be done by simply running this command:
pip install opencv-python
If the command fails, you can refer to the following link: https://www.learnopencv.com/install-opencv-4-on-raspberry-pi/
2. Input
After installation, we only need to attach the camera to the Raspberry Pi camera port. After that, we can use either images or a video as input. We can capture images through the camera or simply load custom images, and the same goes for videos.
Demo code for capturing a single frame from the camera:
cap = cv2.VideoCapture(0)
while True:
    # read a single frame from the camera and save it as an image
    ret, frame = cap.read()
    cv2.imwrite('image.jpg', frame)
    break
cap.release()
cv2.destroyAllWindows()
Output
1. Image
To output an image, we call the cartoon function with an image as input and save the result.
output = cartoon(cv2.imread("original.jpg"))
cv2.imwrite("cartoon.jpg", output)
2. Video
To convert a video into a cartoon, we create a new Python file and name it “video.py”. There we import the cartoon function and call it on each frame of the video.
import numpy as np
import cv2
# import the cartoon function from the file created earlier (assumed here to be named cartoonize.py)
from cartoonize import cartoon

videoCaptureObject = cv2.VideoCapture(0)
# the frame size passed to VideoWriter must match the frames written to it
out = cv2.VideoWriter('out.mp4', cv2.VideoWriter_fourcc(*'MP4V'), 24, (720, 1280))

while True:
    ret, img = videoCaptureObject.read()
    img = cartoon(img)
    cv2.imshow("original", np.array(img))
    out.write(img)
    # stop recording when 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

videoCaptureObject.release()
out.release()
cv2.destroyAllWindows()
cv2.VideoWriter creates an object that lets us save the output in video format. For each frame read from the camera, we call the cartoon function and write the result. When q is pressed on the keyboard, the Raspberry Pi camera stops recording and the cartoon video is saved.
With this, we come to the end of the tutorial. I hope you learned some new OpenCV filters and something about the K-Means algorithm. If you run into doubts or errors, try going through the respective documentation, and if you face any syntax or logic errors, you can comment below. The entire working code and directory of this project can be found in this GitHub repository: https://github.com/Shaashwat05/Cartoonify_reality