Competition in artificial intelligence continues unabated. Grok, Elon Musk’s AI model available on the X platform, no longer just responds with text; it can also see its surroundings through a camera and interact directly with that visual data. This capability is poised to put Grok, which currently competes with rival systems such as GPT-4 and Google Gemini, in a different class.
Grok sees and interacts through the camera
The new camera-based visual recognition capabilities of Musk’s artificial intelligence system also align with the autonomous driving technology Tesla has been developing for years. This version of Grok, which can perceive the world through a camera, clearly reveals Musk’s strategy of integrating technology across his companies.
The system can directly analyze and comment on visual content. When users show an object to Grok’s camera or scan the environment, the AI offers relevant explanations in real time.
Moreover, this ability is not limited to English; Grok can now also respond to commands in Turkish. In recent tests conducted in Turkish, the AI recognized elements of its surroundings and gave meaningful explanations.
Companies such as Meta and Google have also made major breakthroughs in visual perception. However, Musk’s AI model has the potential to bring this technology to a much wider user base.
Grok’s new capability shows that a significant threshold has been crossed in the transition from text to image, and from image to meaning. The visual interaction feature, which is still in beta, is expected to be integrated into mobile devices and Tesla cars in the near future.