Midjourney 7.0 Faces Criticism for Declining Text Generation Quality

Apr 7

Midjourney's recent release of version 7.0 has introduced features like voice prompting and a faster draft mode. However, the update has drawn significant criticism for its deteriorated performance in rendering text within images. Users report that the AI's ability to accurately generate textual elements has not only failed to improve but has regressed compared to previous versions.

Artists and designers have taken to social media to express their dissatisfaction. For instance, Instagram user @digiart.of.alex highlighted persistent issues, stating,

"There are still issues: ❌ Expand got worse - watermarks, logos, random text all over ❌ Hands are still a mess 🖐️ ❌ Text generation still..."

Despite claims of enhanced prompt interpretation and image quality, the persistent shortcomings in text generation have overshadowed these advancements. Users have noted that while the model excels at creating complex visuals, it struggles to produce coherent, legible text and often outputs garbled or nonsensical characters.

This decline in text rendering quality has sparked discussion within the AI art community. Some users express frustration over the model's inability to handle prompts that call for text, while others point to the need for specialized training data to address the issue. The underlying challenge is that the model has difficulty distinguishing and replicating the fine details of letterforms, which leads to inconsistent and often unreadable text.

In response to the feedback, Midjourney's development team has acknowledged the limitations and assured users that work to improve the model is ongoing. The team emphasizes that version 7.0 represents a foundational shift in architecture, with regular updates planned to address current shortcomings, including text generation.

We decided to try it ourselves by asking Midjourney to create an image containing the word “Biochemistry.” Version 6.1 got 3 of the 4 images right on the first try. Version 7.0, however, did not get it right in 7 consecutive attempts. It is worth noting that the images 7.0 produced were much higher in quality and, apart from the text, followed the instructions more closely.