MacDirectory magazine is the premier creative lifestyle magazine for Apple enthusiasts, featuring interviews, in-depth tech reviews, Apple news, insights, the latest Apple patents, apps, market analysis, entertainment and more.
Issue link: https://digital.macdirectory.com/i/1505412
Infographic - If AI Image Generators Are So Smart Why Do They Struggle to Write and Count?

Seyedali Mirjalili - Professor, Director of Centre for Artificial Intelligence Research and Optimisation, Torrens University Australia

Special thanks to The Conversation for republishing permission.

Generative AI tools such as Midjourney, Stable Diffusion and DALL-E 2 have astounded us with their ability to produce remarkable images in a matter of seconds. Despite their achievements, however, there remains a puzzling disparity between what AI image generators can produce and what we can. For instance, these tools often won’t deliver satisfactory results for seemingly simple tasks such as counting objects and producing accurate text.

If generative AI has reached such unprecedented heights in creative expression, why does it struggle with tasks even a primary school student could complete? Exploring the underlying reasons helps shed light on the complex numerical nature of AI, and the nuance of its capabilities.

AI’s limitations with writing

Humans can easily recognise text symbols (such as letters, numbers and characters) written in various fonts and handwriting. We can also produce text in different contexts, and understand how context can change meaning.

Current AI image generators lack this inherent understanding. They have no true comprehension of what any text symbols mean. These generators are built on artificial neural networks trained on massive amounts of image data, from which they “learn” associations and make predictions.

Combinations of shapes in the training images are associated with various entities. For example, two inward-facing lines that meet might represent the tip of a pencil, or the roof of a house.