On April 1, Alibaba (09988) unveiled Wan2.7-Image, its unified image generation and editing model. To move away from generic "AI faces," the model strengthens virtual avatar creation, supporting customization from bone structure and eyes down to subtle facial features, so that each user gets a distinct appearance. Beyond a qualitative leap in visual effects, an upgrade to its full-chain capabilities addresses limitations of traditional AI image generation such as standardized faces and poor instruction alignment.
Wan2.7-Image adds a "color palette" function: users can extract colors and their proportions from a reference image with one click, or input them directly, to generate images in the same color scheme. The number and proportion of colors can be freely adjusted to create custom color schemes.
In image editing, Wan2.7-Image offers interactive editing with support for "precise frame selection editing." By framing a region, users can add, align, or move elements or logos in the specified area, achieving pixel-level intent alignment. The model also supports generating sets of up to 12 images, taking creative work from a single image to a coherent visual narrative.
Wan2.7-Image also features strong text rendering, achieving print-quality output for lengthy text, tables, and complex formulas. It supports 12 languages and handles ultra-long text inputs of up to 3K tokens, producing content equivalent to a full A4 page of an academic paper, with tables, mathematical formulas, and multilingual typesetting all rendered stably.
Wan2.7-Image adopts a unified architecture for generation and understanding, achieving semantic mapping in a shared latent space. This means the model no longer guesses text to fit pixels but possesses underlying semantic cognition.
The Wan2.7-Image-pro version launched at the same time, offering more stable composition and more precise understanding.