Page 143 - Artificial Intellegence_v2.0_Class_11
During image processing or matching, the computer compares the image being checked with the image stored in the system, pixel by pixel. If the pixel layouts correspond, the computer responds that the images match. So, the computer ‘sees’ images differently from humans!
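The idea of matching by pixel layout can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each image is a small grid of grayscale values, and the function name `images_match` and the sample grids are made up for this example. Real systems usually tolerate small differences instead of requiring an exact match.

```python
def images_match(image_a, image_b):
    """Return True when both images have the same size and identical pixels."""
    if len(image_a) != len(image_b):
        return False
    for row_a, row_b in zip(image_a, image_b):
        # Comparing two lists checks both their length and every pixel value.
        if row_a != row_b:
            return False
    return True


# Hypothetical 3x3 grayscale images (0 = black, 255 = white).
stored = [
    [0, 255, 0],
    [255, 255, 255],
    [0, 255, 0],
]
candidate_same = [row[:] for row in stored]  # exact copy of the stored image
candidate_diff = [
    [0, 255, 0],
    [255, 0, 255],  # the centre pixel differs
    [0, 255, 0],
]

print(images_match(stored, candidate_same))  # True
print(images_match(stored, candidate_diff))  # False
```

To the computer, the two images are simply grids of numbers, which is why a single changed pixel is enough to break an exact match.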
Computer Vision: Primary Tasks
There are mainly four tasks that Computer Vision undertakes:
1. Semantic Segmentation (Image Classification)
2. Classification + Localization
3. Object Detection
4. Instance Segmentation
[Figure: four panels. Single object: Classification (CAT) and Classification + Localization (CAT). Multiple objects: Object Detection (CAT, DOG) and Instance Segmentation (CAT, DOG).]
Semantic Segmentation
Semantic segmentation is the process of assigning a class label (for example, flower, person, road, sky, sea, or car) to each pixel in an image. The idea of segmentation is to teach computers to process and understand an image at the pixel level. In simple terms, the computer segments an image, paints the objects in it with different colours, and predicts what each object is. For example, an autonomous car recognizes objects on the road and labels them according to the classes it has already been trained on.
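The pixel-level labelling and 'painting' described above can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the brightness rule in `classify_pixel` stands in for a trained model, and the class names, colours, and sample image are hypothetical.

```python
# Hypothetical classes and the RGB colour used to paint each one.
CLASS_COLOURS = {
    "sky": (135, 206, 235),
    "road": (128, 128, 128),
    "car": (255, 0, 0),
}


def classify_pixel(brightness):
    """Toy rule standing in for a trained model: pick a class from brightness."""
    if brightness >= 200:
        return "sky"
    if brightness >= 100:
        return "road"
    return "car"


def segment(image):
    """Assign a class label to every pixel of a grayscale image."""
    return [[classify_pixel(pixel) for pixel in row] for row in image]


def paint(label_map):
    """Replace each label with its colour to visualise the segments."""
    return [[CLASS_COLOURS[label] for label in row] for row in label_map]


# Hypothetical 3x3 grayscale image: bright sky on top, a dark car on the road.
image = [
    [250, 240, 230],
    [120, 40, 130],
    [110, 115, 125],
]

labels = segment(image)
painted = paint(labels)
print(labels[1])      # ['road', 'car', 'road']
print(painted[0][0])  # (135, 206, 235)
```

The output has the same shape as the input: one label (and one colour) for every pixel, which is what distinguishes segmentation from labelling the image as a whole.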
Classification and Localization
Once the object has been detected, it is classified, that is, mapped to a particular class or label. Then,
the process of localization begins. This involves determining where the object is in the image
and drawing a bounding box around it.
[Figure: example image labelled CAT]
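Localization can be sketched as finding the smallest rectangle that encloses the object. The example below assumes that the object's pixels are already known and marked with 1 in a grid (in practice a trained model predicts the box), and the function name `bounding_box`, the sample mask, and the result layout are hypothetical.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) of the pixels marked 1, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None  # no object pixels in the image
    return (min(rows), min(cols), max(rows), max(cols))


# Hypothetical mask: 1 marks the pixels that belong to the single object.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

# Classification supplies the label; localization supplies the box.
result = {"label": "CAT", "box": bounding_box(mask)}
print(result)  # {'label': 'CAT', 'box': (1, 1, 2, 3)}
```

The result pairs one label with one box, which matches the single-object case this task covers.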
Object Detection
When people look at a video or an image, they immediately identify the objects that
appear in it. A computer can be made to imitate this ability. If there are multiple
objects in the image, the algorithm identifies all of them and locates each one by
drawing a bounding box around it. The result is a set of bounding boxes, each with
its own label, around the objects.
[Figure: example image with objects labelled DOG and CAT]
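The multiple-object case can be illustrated in the same spirit. This is a simplified sketch, not how real detectors work: it assumes every pixel is already marked with a letter for its object, that each kind of object appears only once, and the names `detect_objects`, `NAMES`, and the sample grid are invented for this example.

```python
# Hypothetical mapping from grid letters to class labels.
NAMES = {"D": "DOG", "C": "CAT"}


def detect_objects(grid, background="."):
    """Return a list of (label, (top, left, bottom, right)), one per object."""
    boxes = {}
    for r, row in enumerate(grid):
        for c, letter in enumerate(row):
            if letter == background:
                continue
            if letter not in boxes:
                boxes[letter] = [r, c, r, c]  # first pixel seen for this object
            else:
                box = boxes[letter]
                box[0] = min(box[0], r)
                box[1] = min(box[1], c)
                box[2] = max(box[2], r)
                box[3] = max(box[3], c)
    return [(NAMES[letter], tuple(box)) for letter, box in boxes.items()]


# Hypothetical image: a dog in the top-left corner and a cat on the right.
grid = [
    "DD....",
    "DD..CC",
    "....CC",
    "....CC",
]

detections = detect_objects(grid)
print(detections)  # [('DOG', (0, 0, 1, 1)), ('CAT', (1, 4, 3, 5))]
```

Unlike classification with localization, the output here is a list: one label and one bounding box for each object found in the image.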
AI Applications and Methodologies 141

