Top 3 Breakthroughs in Computer Vision in 2024
Introduction

Over the past few years, AI has become popular, and the field has never moved faster than it has in that time. 2024 was a crucial year for Computer Vision, with new models and breakthroughs arriving rapidly. Today, I'll be talking about the 3 biggest advancements in the Computer Vision field during 2024.

3. Vision Language Modeling

Simply put, Vision Language Models, or VLMs for short, are Large Language Models (LLMs) that take an image and/or text as input. Think of ChatGPT, where you upload an image and then type out a question about it. The model can understand both and output an answer. These are also what you would call "multimodal models": models that can understand multiple types of input.

Although Vision Language Models have existed for a while, they have recently become better and more accurate. However, this doesn't mean it's all sunshine and rainbows: there are still several barri...