Computer Vision

Oct 5, 2021

When we look around, we see many objects, and our brain instantly tells them apart. The goal of computer vision is to give machines the same ability: to detect the objects they see and recognize what they are.
Here are a few definitions of computer vision:

“the construction of explicit, meaningful descriptions of physical objects from images” (Ballard & Brown, 1982)
“computing properties of the 3D world from one or more digital images” (Trucco & Verri, 1998)
“to make useful decisions about real physical objects and scenes based on sensed images” (Stockman & Shapiro, 2001)

In fact, we rely on technologies built on computer vision throughout the day. For example, when we search by image in a search engine, algorithms analyze that image to find related ones. An even more common case is facial recognition: many of us unlock our phones with our face. This works because we register our face with the device beforehand, and the device then detects and analyzes our face every time we want to unlock the screen. Similarly, Facebook's face recognition system uses previously collected data to estimate who the people in a photo are.

👀 So how do machines see?

Colors are represented by numbers: Each color is stored as a numeric value, commonly written as a HEX code (for example, #FF0000 for pure red) that encodes its red, green, and blue components. This is how machines are taught to recognize the colors contained within image pixels.
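
To make the idea concrete, here is a minimal sketch of how a HEX color code maps to the three numbers a machine actually works with. The function names `hex_to_rgb` and `rgb_to_hex` are made up for this illustration and assume the common six-digit `#RRGGBB` form.

```python
def hex_to_rgb(hex_color):
    """Convert a '#RRGGBB' string into (red, green, blue) integers in 0..255."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError("expected 6 hex digits, got %r" % hex_color)
    # Each pair of hex digits is one color channel.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Convert (red, green, blue) integers back into a '#RRGGBB' string."""
    return "#{:02X}{:02X}{:02X}".format(*rgb)


orange = hex_to_rgb("#FF8800")
print(orange)              # (255, 136, 0)
print(rgb_to_hex(orange))  # #FF8800
```

An image is then just a grid of such triples, one per pixel.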

Image Segmentation: Groups of similar colors are identified in order to segment the image, that is, to separate the object from the background. Color gradients, the places where color changes sharply, are used to find the edges of different objects.
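
As a rough sketch of the gradient idea, the code below works on a tiny grayscale image (a single intensity per pixel instead of full color, which is a simplifying assumption) and marks a pixel as an edge when the intensity changes sharply around it. The helper names and the threshold value are illustrative choices, not part of any particular library.

```python
def gradient_magnitude(image):
    """Per-pixel gradient magnitude using central differences.

    `image` is a list of rows of numbers. Border pixels are left at 0.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0  # horizontal change
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0  # vertical change
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def edge_mask(image, threshold):
    """True where the gradient magnitude exceeds `threshold`."""
    return [[m > threshold for m in row] for row in gradient_magnitude(image)]


# Dark left half, bright right half: the edge runs down the middle.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = edge_mask(image, threshold=50)
print(edges[2])  # [False, False, True, True, False, False]
```

Production systems usually smooth the image first and use larger filters such as Sobel kernels, but the principle is the same: edges are where neighboring pixel values differ the most.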

Finding corners: Following segmentation, the image is searched for certain features. In brief, algorithms look for lines that intersect at an angle and enclose an area of the image with a single color shade. These features (corners) are the building blocks that help extract more detailed information from an image.
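
One well-known way to find such points is the Harris corner measure, used here as an example technique rather than as the method any specific product relies on. The sketch below is a simplified version: it assumes a grayscale image, an unweighted 3x3 window, and the commonly used constant `k = 0.05`. A large positive score suggests a corner, a negative score suggests a plain edge, and a score near zero suggests a flat region.

```python
def harris_response(image, k=0.05):
    """Simplified Harris corner score for every interior pixel."""
    h, w = len(image), len(image[0])
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    # Image gradients by central differences (borders stay 0).
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx[y][x] = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy[y][x] = (image[y + 1][x] - image[y - 1][x]) / 2.0

    response = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Sum gradient products over the 3x3 neighborhood.
            sxx = syy = sxy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ix, iy = gx[y + dy][x + dx], gy[y + dy][x + dx]
                    sxx += ix * ix
                    syy += iy * iy
                    sxy += ix * iy
            det = sxx * syy - sxy * sxy
            trace = sxx + syy
            response[y][x] = det - k * trace * trace
    return response


# An 8x8 image with a bright block whose top-left corner sits at row 3, column 3.
image = [[1.0 if (y >= 3 and x >= 3) else 0.0 for x in range(8)] for y in range(8)]
response = harris_response(image)

strongest = max(
    ((y, x) for y in range(8) for x in range(8)),
    key=lambda p: response[p[0]][p[1]],
)
print(strongest)            # (3, 3): the corner of the block
print(response[5][3] < 0)   # True: a point on the straight edge
print(response[1][1] == 0)  # True: a flat background point
```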

Finding textures: To describe an image accurately, the textures in it must be determined. The difference in texture between two objects helps a system categorize each object correctly.

Make a guess: The image is matched against those in a database, and the most similar object is presented as the guess.

Finally, the machine considers the larger, clearer picture and checks whether the description of the picture is correct according to the algorithm's instructions.
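
The "make a guess" step can be sketched as a nearest-neighbor lookup. This assumes each image has already been reduced to a short list of numbers describing its colors, corners, and textures. The labels and feature values below are invented purely for illustration.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def best_guess(query, database):
    """Return the label whose stored features are closest to `query`."""
    label, _ = min(database.items(), key=lambda item: euclidean(query, item[1]))
    return label


# Hypothetical database: label -> feature vector.
database = {
    "apple": [0.9, 0.1, 0.2],
    "banana": [0.9, 0.8, 0.1],
    "sky": [0.2, 0.4, 0.9],
}

guess = best_guess([0.8, 0.7, 0.2], database)
print(guess)  # banana
```

Modern systems typically learn the features and the matching together with a neural network, but the underlying idea of picking the most similar known example is the same.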

Computer Vision Applications

🌟 Image Classification: Predict which class an image belongs to.

🌟 Object Detection: Image classification is first used to determine which class an object belongs to. Each appearance of the object in an image or video is then located and enclosed in a bounding box.
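
To show what a bounding box is in practice, here is a small sketch that computes the tightest box around the pixels marked as belonging to an object. It assumes the object's pixels are already known as a mask of 0s and 1s. Real detectors usually predict the box coordinates directly, so treat this only as an illustration of the output format.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right), inclusive, around the nonzero cells.

    Returns None when the mask contains no object pixels.
    """
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(len(mask[0])) if any(row[x] for row in mask)]
    if not rows:
        return None
    return rows[0], cols[0], rows[-1], cols[-1]


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
box = bounding_box(mask)
print(box)  # (1, 1, 2, 3)
```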

🌟 Object Tracking: Once a certain object is detected, it is followed over time. Object tracking is usually done on sequentially captured images or real-time video.

🌟 Segmentation: The pixels that belong to a specific class are colored to separate them from the background.
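
A toy version of this coloring step is shown below. It assumes a grayscale image and uses a plain brightness threshold to decide which pixels belong to the object, standing in for the trained model a real segmentation system would use. The function name and the red highlight color are arbitrary choices.

```python
def color_foreground(gray, threshold, color=(255, 0, 0)):
    """Paint pixels brighter than `threshold` with `color`.

    Other pixels keep their gray value, expanded to an (r, g, b) triple.
    """
    return [
        [color if value > threshold else (value, value, value) for value in row]
        for row in gray
    ]


gray = [
    [10, 200],
    [220, 30],
]
overlay = color_foreground(gray, threshold=128)
print(overlay)
# [[(10, 10, 10), (255, 0, 0)], [(255, 0, 0), (30, 30, 30)]]
```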

🌟 Content-based image retrieval: Used to browse, search, and retrieve images from large data stores based on the content of the images rather than their metadata tags.
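
One simple way to compare images by content is to compare their color histograms. The sketch below assumes each image is available as a list of (r, g, b) pixels and ranks a small library by histogram intersection. The file names and pixel values are hypothetical, and real systems generally use richer features than color alone.

```python
def color_histogram(pixels, bins=4):
    """Normalized joint RGB histogram with bins**3 buckets."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        index = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[index] += 1
    total = len(pixels)
    return [count / total for count in hist]


def histogram_intersection(h1, h2):
    """Similarity in 0..1: the amount of overlap between two histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve(query_pixels, library, top_k=1):
    """Return the names of the `top_k` library images most similar to the query."""
    q = color_histogram(query_pixels)
    ranked = sorted(
        library,
        key=lambda name: histogram_intersection(q, color_histogram(library[name])),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical library: image name -> list of pixels.
library = {
    "sunset.jpg": [(250, 120, 30)] * 3 + [(200, 60, 20)],
    "forest.jpg": [(20, 140, 40)] * 4,
    "ocean.jpg": [(10, 80, 200)] * 3 + [(240, 240, 240)],
}
query = [(245, 110, 25)] * 4  # a mostly orange image

result = retrieve(query, library)
print(result)  # ['sunset.jpg']
```

No tags or captions are consulted here: the match is based only on what the pixels look like.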