Apple researchers have released a new AI image editing model that lets users describe what they want to change in a photo in plain language, without needing to touch photo editing software.
The MGIE model can crop, flip, resize, and apply filters to images through text prompts.
MGIE stands for MLLM-Guided Image Editing, and it can be applied to simple and complex image editing tasks.
It combines two different uses of multimodal large language models (MLLMs). First, it learns how to interpret user prompts; then it “imagines” what the edit would look like and uses that expressive, visual-aware description to guide the actual edit.
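Apple hasn’t published a simple programmatic interface for this flow, so as an illustration only, here is a minimal Python sketch of the two-stage idea. Every function name and return value below is hypothetical, not MGIE’s actual API:

```python
# Illustrative sketch of the two-stage MLLM-guided flow described above.
# All names here are hypothetical stand-ins, NOT Apple's actual API.

def derive_instruction(user_prompt: str) -> str:
    # Stage 1: an MLLM rewrites a terse user prompt into an explicit,
    # visual-aware editing instruction (stubbed with a fixed template).
    return f"Brighten and saturate the sky region (derived from: {user_prompt!r})"

def imagine_and_edit(image_path: str, instruction: str) -> str:
    # Stage 2: the expressive instruction guides the actual edit;
    # a diffusion model would render the result (stubbed here).
    return f"edited_{image_path}"

if __name__ == "__main__":
    prompt = "make the sky more blue"
    instruction = derive_instruction(prompt)       # the "imagined" edit
    result = imagine_and_edit("photo.jpg", instruction)
    print(instruction, result, sep="\n")
```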
When using MGIE, users just need to type out what they want to change about an image.
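For a concrete sense of what prompt-driven editing looks like in code, here is a short sketch using InstructPix2Pix, a separate open instruction-based editor available through Hugging Face’s diffusers library. This is not MGIE itself, just an example of the same style of workflow; it assumes diffusers, torch, and a CUDA GPU are available:

```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from diffusers.utils import load_image

# InstructPix2Pix: a separate, open instruction-based image editor,
# shown only to illustrate the prompt-driven editing workflow.
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

image = load_image("photo.jpg")  # the photo to edit

# The "edit" is just a plain-language sentence.
edited = pipe(
    "make the sky more blue",
    image=image,
    num_inference_steps=20,
    image_guidance_scale=1.5,  # how closely to stick to the input image
).images[0]
edited.save("photo_edited.jpg")
```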
The researchers said, “Instead of brief but ambiguous guidance, MGIE derives explicit visual-aware intention and leads to reasonable image editing. We conduct extensive studies from various editing aspects and demonstrate that our MGIE effectively improves performance while maintaining competitive efficiency. We also believe the MLLM-guided framework can contribute to future vision-and-language research.”
MGIE is available for download on GitHub, and a web demo is hosted on Hugging Face Spaces. Apple didn’t reveal plans for the model beyond research.
Apple CEO Tim Cook has said the company wants to add more AI features to its devices this year. Back in December, Apple researchers released MLX, an open source machine learning framework that simplifies training AI models on Apple Silicon chips.