Understanding Computer Vision
Computer vision is a field of computer science that focuses on building digital systems capable of processing, interpreting, and understanding visual input (images or videos) in ways similar to human vision. In this context, "vision" means training computers to interpret and understand images.
Common Uses of Computer Vision
Computer vision systems are commonly used for tasks such as:
- Object tracking: The system examines video to find an object (or objects) that meets the search criteria and then follows its movements.
- Object identification: The system examines visual input and identifies a specific object in a photo or video.
- Object classification: The system examines visual content and assigns the object in a photo or video to a relevant category. For example, it can distinguish animals from other objects in an image.
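Following an object's movements across video frames, as described in the list above, can be sketched in a few lines. The example below is a minimal illustration, not a production tracker: it assumes some detector has already produced bounding boxes for each frame, and it links boxes between consecutive frames by greedy intersection-over-union (IoU) matching. The names `iou`, `track`, and `min_iou` are hypothetical and chosen only for this sketch.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def track(frames, min_iou=0.3):
    """Give each detected box a persistent ID across frames.

    frames: list of frames, each a list of (x1, y1, x2, y2) boxes
            (assumed to come from an external detector).
    Returns one dict per frame mapping track ID -> box.
    """
    next_id = 0
    previous = {}   # track ID -> box in the previous frame
    history = []
    for boxes in frames:
        current = {}
        unmatched = dict(previous)  # tracks not yet claimed in this frame
        for box in boxes:
            best_id, best_score = None, min_iou
            for track_id, old_box in unmatched.items():
                score = iou(box, old_box)
                if score >= best_score:
                    best_id, best_score = track_id, score
            if best_id is None:
                # No sufficiently overlapping track: start a new one.
                best_id = next_id
                next_id += 1
            else:
                del unmatched[best_id]
            current[best_id] = box
        history.append(current)
        previous = current
    return history


# Two objects in the first two frames; only the first remains in the third.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(2, 0, 12, 10), (51, 50, 61, 60)],
    [(4, 0, 14, 10)],
]
for index, tracks in enumerate(track(frames)):
    print(index, tracks)
```

Greedy matching is the simplest choice here. It can mislabel objects that cross paths, which is why practical trackers usually add motion models or an optimal assignment step.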
The Mechanics Behind Computer Vision
One pressing question in machine learning (ML) is: "How exactly do human brains function, and how can we emulate that with our algorithms?" At present there are only a handful of working, comprehensive theories of brain computation. So although neural networks are designed to "emulate the operation of the brain," nobody knows for sure how closely they do, because our understanding of how the brain functions is limited.
The same uncertainty applies to computer vision. It is hard to pinpoint how closely its algorithms resemble our own cognitive processes, given how little we know about how the brain and eyes process images.
Training a high-accuracy model, especially in deep learning, usually requires tens of thousands of images, and more images generally help. Even when transfer learning is used to build on the knowledge of an already trained model, several thousand images are typically still needed for training.
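The idea behind transfer learning can be shown with a deliberately tiny sketch. It rests on a loud assumption: the function `frozen_features` is a hand-written stand-in for the frozen layers of a pretrained network, not a real pretrained model, and the 4x4 "images" are synthetic. What the sketch does show is the structure of the technique: the feature extractor is reused unchanged, and only a small classification head (here, logistic regression) is trained on the new task. All names are hypothetical.

```python
import math


def frozen_features(image):
    """Hypothetical stand-in for a pretrained network's frozen layers.

    Maps a small grayscale image (rows of values in [0, 1]) to two numbers:
    mean brightness and mean horizontal contrast. It is never updated.
    """
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    diffs = [abs(row[i] - row[i + 1]) for row in image for i in range(len(row) - 1)]
    return [mean, sum(diffs) / len(diffs)]


def train_head(images, labels, epochs=300, lr=0.5):
    """Train only the logistic-regression head on top of the frozen features."""
    feats = [frozen_features(img) for img in images]  # computed once, reused
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in zip(feats, labels):
            z = w[0] * x[0] + w[1] * x[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - y  # gradient of the log loss with respect to z
            w = [w[0] - lr * err * x[0], w[1] - lr * err * x[1]]
            b -= lr * err
    return w, b


def predict(w, b, image):
    x = frozen_features(image)
    return 1 if w[0] * x[0] + w[1] * x[1] + b > 0 else 0


def flat(value, size=4):
    """Synthetic uniform image (class 0 in this toy task)."""
    return [[value] * size for _ in range(size)]


def striped(low, high, size=4):
    """Synthetic image with alternating columns (class 1 in this toy task)."""
    return [[low if c % 2 == 0 else high for c in range(size)] for _ in range(size)]


train_images = [flat(0.2), flat(0.5), flat(0.8),
                striped(0.0, 1.0), striped(1.0, 0.0), striped(0.2, 0.8)]
train_labels = [0, 0, 0, 1, 1, 1]

w, b = train_head(train_images, train_labels)
print(predict(w, b, flat(0.65)), predict(w, b, striped(0.0, 1.0)))
```

Because only the three numbers in the head are learned, a handful of examples is enough for this toy task. Real image models have far larger heads and noisier data, which is consistent with the point above that thousands of images are still needed.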
Given the substantial computational power and storage needed to train deep learning models for computer vision, it is no wonder that advances in both areas have helped bring machine learning to its current level.
Applications of Computer Vision
Contrary to some beliefs, computer vision and artificial intelligence are not far-off future technologies. Computer vision already affects many facets of our everyday lives. Here are a few applications in use today:
- Media categorization: Many of us are already making use of computer vision systems in organizing our media. Modern software has access to our media libraries and automatically assigns tags, facilitating more structured browsing.
- Medical sector: Medical diagnostics depend heavily on image processing, since by an often-cited estimate about 90% of medical data is image-based. Notable examples include X-rays, MRI, and mammography. Image segmentation is equally valuable in medical image analysis. According to Google, machine vision technologies can identify several types of cancer more accurately than human physicians, owing to their ability to distinguish tumors from benign areas that look similar.
- Facial recognition: Facial recognition technology matches images of people's faces to their identities. It can be found in major products we use daily. Facebook, for instance, has used machine vision to tag people in photos. Many mobile devices also rely on facial authentication, which lets users unlock their phones with their face.
- Self-driving vehicles: Computer vision plays a major role in helping autonomous vehicles make sense of their surroundings. Cameras installed on these vehicles capture video from different angles, and the computer vision software analyzes it in real time, identifying road signs, nearby objects (such as pedestrians or other vehicles), and more. Tesla's Autopilot is a notable application of this technology.
- Virtual and augmented reality: Computer vision is heavily used in augmented reality (AR) applications. It allows AR apps to identify physical objects in real time, along with their locations within a given physical space, and to use this information to place virtual elements in the physical world.
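The image segmentation mentioned in the medical example above can be illustrated with its simplest classical form. The sketch below is only an assumption-laden toy: it treats every pixel at or above a fixed brightness threshold as "of interest" and groups touching pixels into labeled regions with a flood fill. Real medical segmentation typically relies on trained models rather than a single threshold. The function name `segment` and the sample grid are hypothetical.

```python
def segment(image, threshold):
    """Label connected regions of pixels with value >= threshold.

    image: list of rows of numeric pixel values.
    Returns (labels, count): labels has the same shape as image, with 0 for
    background and 1..count for each 4-connected bright region.
    """
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for r in range(height):
        for c in range(width):
            if image[r][c] < threshold or labels[r][c]:
                continue  # background, or already part of a region
            count += 1
            labels[r][c] = count
            stack = [(r, c)]
            while stack:  # flood fill from the seed pixel
                y, x = stack.pop()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and image[ny][nx] >= threshold):
                        labels[ny][nx] = count
                        stack.append((ny, nx))
    return labels, count


# A tiny synthetic "scan" with two separate bright areas.
scan = [
    [0, 0, 0, 0, 0],
    [0, 9, 9, 0, 0],
    [0, 9, 0, 0, 8],
    [0, 0, 0, 0, 8],
]
labels, count = segment(scan, threshold=5)
print(count)
for row in labels:
    print(row)
```

Each labeled region can then be measured or passed to a classifier separately, which is the role segmentation plays in distinguishing one area of an image from another.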