On Monday, Google’s AI venture DeepMind announced Genie, a generative AI model that can create playable games from simple prompts after learning game mechanics from hundreds of thousands of gameplay videos.
Developed in collaboration between Google and the University of British Columbia, Genie (short for Generative Interactive Environments) can use a single image to create side-scrolling 2D platformer games in the style of Super Mario Bros. or Contra, based on user prompts.
“In recent years, generative AI has emerged with models that can generate novel and creative content through language, images, and even video,” says Google DeepMind. “Today, we introduce Genie, a new paradigm for generative AI and generative interactive environments.”
Genie can create interactive and playable environments from a single image prompt, thanks to three components described by Google researchers: a latent action model that infers actions between video frames, a video tokenizer that converts raw video frames into discrete tokens, and a dynamics model that predicts the next frame.
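To make the division of labor among those three components concrete, here is a minimal toy sketch in Python. It is not Genie's implementation: in Genie all three parts are learned neural networks, whereas the functions below (`tokenize`, `infer_latent_action`, `predict_next`) are hypothetical, hand-written stand-ins that operate on a single row of pixels and assume the only possible actions are "shift left," "stay," and "shift right."

```python
from typing import List

Frame = List[int]   # assumption: a frame is one row of pixel intensities, 0-255
Tokens = List[int]  # discrete token ids


def tokenize(frame: Frame, levels: int = 4) -> Tokens:
    """Toy video tokenizer: quantize each pixel into one of `levels` discrete tokens."""
    return [min(p * levels // 256, levels - 1) for p in frame]


def predict_next(tokens: Tokens, action: int) -> Tokens:
    """Toy dynamics model: predict the next frame by cyclically shifting the tokens.

    action is -1 (shift left), 0 (stay) or +1 (shift right).
    """
    k = action % len(tokens)
    if k == 0:
        return list(tokens)
    return tokens[-k:] + tokens[:-k]


def infer_latent_action(prev_tokens: Tokens, next_tokens: Tokens) -> int:
    """Toy latent action model: pick the action that best explains the transition.

    No action labels are given; the action is inferred from the two frames alone.
    """
    best_action, best_error = 0, None
    for action in (-1, 0, 1):
        predicted = predict_next(prev_tokens, action)
        error = sum(a != b for a, b in zip(predicted, next_tokens))
        if best_error is None or error < best_error:
            best_action, best_error = action, error
    return best_action


# Two consecutive "frames" in which the content moves one pixel to the right.
frame_a = [0, 64, 128, 255]
frame_b = [255, 0, 64, 128]

tokens_a = tokenize(frame_a)
tokens_b = tokenize(frame_b)
action = infer_latent_action(tokens_a, tokens_b)  # inferred without any label
tokens_c = predict_next(tokens_b, action)         # roll the world forward one step
print(tokens_a, tokens_b, action, tokens_c)
```

The point of the sketch is only the data flow: frames become tokens, an action is inferred from pairs of frames rather than from controller logs, and the same action can then be fed back in to generate the next frame.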
“Rather than adding inductive biases, we focus on scale,” Google DeepMind developer Tim Rocktäschel said on Twitter. “We use a dataset of over 200,000 hours of video from 2D platformers to train an 11 billion parameter world model… [then] Genie learns, in an unsupervised way, diverse latent actions that control the character in a consistent manner.”
Rocktäschel added that Genie can also convert other media types into games. According to the accompanying Google DeepMind research paper, Genie can be prompted to generate action-controllable virtual worlds from a variety of inputs.
“Our model can transform any image into a playable 2D world,” says Rocktäschel. “Genie can bring human-designed sketches and other creations to life, including the beautiful artwork of Seneca and Caspian, two of the youngest world creators ever.”
While Genie is skilled at creating 2D worlds from text or images, Rocktäschel suggested that the AI model can do more than build side-scrollers, including potentially teaching other AI models, or “agents,” about 3D worlds.
“We also trained Genie on robotics data (RT-1) without actions and demonstrated that it is possible to learn action-controllable simulators there as well,” he said. “We believe this is a promising step toward general world models for AGI.”
Sometimes called the singularity, artificial general intelligence (AGI) refers to AI that can understand and apply learned knowledge to a wide range of tasks, just as humans can.
According to Google DeepMind, the Genie dataset was generated by filtering publicly available internet videos, specifically keeping videos with titles containing words like “speedrun” and “playthrough” while excluding those with words like “movie” and “unboxing.”
According to Google DeepMind, advances in AI technology, hardware, and datasets have made it possible to generate coherent, conversational language and images that are “crisp and beautiful.”
“When selecting keywords, we manually spot-checked results to check that they typically produced 2D platformer gameplay videos which are not outnumbered by other sorts of videos which happen to share similar keywords,” the researchers continued.
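The title-based filtering the researchers describe can be sketched in a few lines of Python. This is an illustration only, not DeepMind's pipeline: the function name `keep_video` and the short keyword tuples are assumptions built from the four example words mentioned in this article, and the real keyword lists are not reproduced here.

```python
# Assumed keyword lists, built only from the examples quoted in the article.
INCLUDE_KEYWORDS = ("speedrun", "playthrough")
EXCLUDE_KEYWORDS = ("movie", "unboxing")


def keep_video(title: str) -> bool:
    """Keep a video if its title has an include keyword and no exclude keyword."""
    lowered = title.lower()
    has_include = any(word in lowered for word in INCLUDE_KEYWORDS)
    has_exclude = any(word in lowered for word in EXCLUDE_KEYWORDS)
    return has_include and not has_exclude


# Hypothetical titles, used only to exercise the filter.
titles = [
    "Retro Platformer Speedrun in 12 minutes",
    "Full game movie playthrough (all cutscenes)",
    "Unboxing a new console",
    "Weeknight pasta recipe",
]
kept = [t for t in titles if keep_video(t)]
print(kept)
```

A filter like this is cheap but noisy, which is why a manual spot check of the results, as the researchers describe, is a natural companion step.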
“Genie makes it possible to train future AI agents on a never-ending curriculum of newly generated worlds,” says Google DeepMind. “Our paper provides a proof of concept that the latent actions learned by Genie can be transferred to real human-designed environments, but this only scratches the surface of what might be possible in the future.”
Since the launch of OpenAI’s GPT-4 last year, technology companies like Google, Microsoft, and Amazon have been investing heavily in generative AI. Earlier this month, Google announced the launch of a subscription-based version of its Gemini AI model after rebranding it from Google Bard.
Representatives from Google and its DeepMind unit did not immediately respond to Decrypt’s request for comment.
Edited by Ryan Ozawa.