ImaginateAR: Democratizing AR Creation Through AI-Powered Spatial Intelligence

Date: 10/22/2025
Category: Research

Transforming How We Create Augmented Experiences

Today, we're excited to share groundbreaking research that marks a significant step forward in AI-assisted augmented reality creation. ImaginateAR, developed by our research team, is the first mobile tool for AI-assisted AR authoring that combines outdoor scene understanding, rapid 3D asset generation, and natural language interaction.

Accepted at UIST 2025, this work demonstrates how advanced AI can empower anyone to create personalized AR experiences anywhere—transforming the complex process of AR authoring into something as natural as speaking your imagination.

The Vision: AR Creation for Everyone

Imagine a world where creating immersive AR content requires no technical expertise—where a teacher can transform their schoolyard into an interactive history lesson, or where friends can fill a beach with dancing penguins through natural language voice commands.

While AR has proven transformative in mobile games and educational applications, content creation has typically remained locked behind professional tools requiring specialized skills. ImaginateAR breaks down these barriers through three core technological innovations:

1. Enhanced Outdoor Scene Understanding

Our custom-built pipeline creates structured scene graphs from real-world environments. Unlike existing models trained primarily on indoor datasets, our approach enables automatic labeling and intelligent spatial reasoning for diverse outdoor settings.

Key innovations:

  • Autonomous semantic labeling without user queries

  • Structured 3D scene graphs for LLM-based spatial reasoning

  • Robust operation across varied outdoor environments
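
To make the scene-graph idea concrete, here is a minimal sketch of how labeled outdoor objects might be represented and serialized into text an LLM can reason over. The `SceneObject` fields, the coarse relation rules, and the prompt format are all illustrative assumptions, not the paper's actual representation:

```python
from dataclasses import dataclass

@dataclass
class SceneObject:
    """One labeled instance in a (toy) outdoor scene graph."""
    label: str     # semantic label, e.g. "grass" or "shed"
    center: tuple  # (x, y, z) world-space centroid in meters; y is up
    size: tuple    # axis-aligned extents (width, height, depth)

def spatial_relation(a: SceneObject, b: SceneObject) -> str:
    """Coarse relation between two objects, derived from centroids only."""
    dx = b.center[0] - a.center[0]
    dy = b.center[1] - a.center[1]
    if abs(dy) > max(abs(dx), 1e-6) and dy > 0:
        return "below"  # b sits above a, so a is below b
    return "beside" if abs(dx) < 5.0 else "far_from"

def scene_graph_prompt(objects: list) -> str:
    """Serialize objects and pairwise relations into compact LLM-ready text."""
    lines = [f"{o.label} at {o.center} size {o.size}" for o in objects]
    for i, a in enumerate(objects):
        for b in objects[i + 1:]:
            lines.append(f"{a.label} is {spatial_relation(a, b)} {b.label}")
    return "\n".join(lines)

grass = SceneObject("grass", (0.0, 0.0, 0.0), (10.0, 0.1, 10.0))
shed = SceneObject("shed", (3.0, 1.5, 2.0), (2.0, 3.0, 2.0))
print(scene_graph_prompt([grass, shed]))
```

The key design point is that a structured, compact text serialization lets a general-purpose LLM answer placement queries ("is there grass near the shed?") without any 3D-specific model.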

2. Fast 3D Asset Generation

Traditional 3D modeling demands hours of expert work. High-quality generative models often require several minutes to generate custom assets—impractical for real-time creation. Our pipeline generates fully textured 3D meshes in less than a minute by combining:

  • GPT prompt enhancement for optimal generation

  • DALL-E 2 image synthesis

  • InstantMesh for rapid single-image-to-3D lifting
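
The three-stage pipeline composes cleanly as a chain of functions. The sketch below uses local stubs in place of the real GPT, DALL-E 2, and InstantMesh calls, so the function names and returned fields are assumptions for illustration only:

```python
def boost_prompt(user_prompt: str) -> str:
    """Stage 1 stub: enrich the user's phrase for better image generation.
    The real system uses GPT for this prompt boosting."""
    return f"{user_prompt}, full body, centered, plain background, 3D render style"

def synthesize_image(prompt: str) -> dict:
    """Stage 2 stub: stand-in for DALL-E 2 reference-image synthesis."""
    return {"kind": "image", "prompt": prompt, "resolution": (1024, 1024)}

def lift_to_mesh(image: dict) -> dict:
    """Stage 3 stub: stand-in for InstantMesh single-image-to-3D lifting."""
    return {"kind": "textured_mesh", "source_prompt": image["prompt"]}

def generate_asset(user_prompt: str) -> dict:
    """Full pipeline: prompt boosting -> image synthesis -> 3D lifting."""
    return lift_to_mesh(synthesize_image(boost_prompt(user_prompt)))

asset = generate_asset("a dancing T-Rex")
print(asset["kind"])  # textured_mesh
```

Keeping each stage a pure function makes it easy to swap in faster generative models as they improve, which the sub-minute latency target depends on.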

3. LLM-Driven Speech Interface

Natural interaction lies at the heart of accessible AR creation. Our multi-agent LLM system enables sophisticated voice-driven authoring through specialized agents:

  • Brainstorming Agent: Generates contextual scene ideas

  • Action Plan Agent: Interprets user requests and structures tasks

  • Assembly Agent: Executes spatial placement with intelligent reasoning

Users can create through natural language, e.g., "Put a dancing T-Rex on the grass" or "Make a helicopter hover over the shed", with the system handling complex spatial reasoning automatically.
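
A toy version of the agent hand-off can illustrate the flow: an Action Plan Agent turns an utterance into a structured task, and an Assembly Agent resolves it against the scene. The rule-based parser below is a deliberately simple stand-in for the LLM agents, and all names and fields are assumptions:

```python
import re

def action_plan_agent(utterance: str) -> dict:
    """Toy stand-in for the Action Plan Agent: parse a voice command
    into a structured placement task."""
    m = re.match(r"put (?:a |an )?(.+?) on (?:the )?(.+)", utterance.lower())
    if not m:
        return {"action": "unknown", "utterance": utterance}
    return {"action": "place", "asset": m.group(1), "anchor": m.group(2)}

def assembly_agent(plan: dict, scene_labels: set) -> dict:
    """Toy stand-in for the Assembly Agent: check the anchor exists in
    the scene graph before committing a placement."""
    if plan["action"] != "place" or plan["anchor"] not in scene_labels:
        return {"status": "needs_clarification", "plan": plan}
    return {"status": "placed", "asset": plan["asset"], "on": plan["anchor"]}

scene = {"grass", "shed", "bench"}
plan = action_plan_agent("Put a dancing T-Rex on the grass")
print(assembly_agent(plan, scene))
```

Separating planning from assembly means ambiguous requests can be bounced back for clarification instead of producing a bad placement.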

Research Validation: Technical Excellence and User Enthusiasm

Our comprehensive evaluation demonstrated both technical superiority and strong user preference:

Technical Performance

  • Scene Understanding: Our pipeline improved upon base OpenMask3D models and ablated variants across five diverse outdoor scenes

  • Asset Generation: Achieved quality comparable to state-of-the-art methods while delivering assets in under a minute

Real-World User Study

Through a three-part study with 20 participants in a public park, we discovered fascinating insights about AI-human collaboration in creative processes:

  • Hybrid Workflows Dominate: 18 of 20 participants preferred combining AI creativity with manual precision

  • Strong Creative Support: ImaginateAR received a Creativity Support Index score of 68.8 out of 100 ("Upper Second-class Honours" equivalent)

  • Diverse Content Creation: Participants authored everything from educational scenes to whimsical animal parties

  • High Engagement: Users rated the system highly for Enjoyment (7.71/10) and Expressiveness (7.55/10)

The Future of Spatial Computing

ImaginateAR represents more than a technical achievement—it embodies our commitment to making advanced spatial computing accessible to everyone. The research reveals compelling insights about human-AI collaboration in creative workflows:

Users want co-creation, not automation. Rather than replacing human creativity, AI excels as a creative partner that accelerates ideation while preserving human agency and control. Participants consistently used AI to generate creative "blueprints," then refined results manually for precise personalization.

Context-aware intelligence matters. By understanding real-world environments through scene graphs, AI can make intelligent spatial decisions—positioning objects appropriately, scaling elements realistically, and maintaining spatial coherence across complex scenes.

Natural interaction unlocks creativity. Voice-driven interfaces eliminate technical barriers, allowing users to focus on creative expression rather than tool mastery. The most successful interactions felt conversational and collaborative rather than transactional.

Technical Deep Dive: Advanced Pipeline Architecture

For our technical audience, ImaginateAR's architecture demonstrates sophisticated integration of multiple AI systems:

Offline Scene Processing leverages enhanced OpenMask3D with dense monocular depth estimation, GPT-4o semantic classification, and clustering to create compact yet comprehensive scene representations.
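
The clustering step that compacts per-point labels into per-instance summaries can be sketched as a greedy pass over labeled 3D points. The distance threshold and greedy strategy here are simplifying assumptions, not the paper's actual method:

```python
def cluster_points(points, radius=1.0):
    """Greedy clustering: each (label, xyz) point joins the first cluster
    with the same label whose centroid lies within `radius`; otherwise
    it starts a new cluster. Returns compact per-instance summaries."""
    clusters = []  # each: {"label": str, "points": [xyz, ...]}
    for label, xyz in points:
        for c in clusters:
            n = len(c["points"])
            cx = [sum(p[i] for p in c["points"]) / n for i in range(3)]
            dist = sum((xyz[i] - cx[i]) ** 2 for i in range(3)) ** 0.5
            if c["label"] == label and dist <= radius:
                c["points"].append(xyz)
                break
        else:
            clusters.append({"label": label, "points": [xyz]})
    # Compress each cluster to a label + centroid summary
    return [
        {"label": c["label"],
         "centroid": tuple(sum(p[i] for p in c["points"]) / len(c["points"])
                           for i in range(3))}
        for c in clusters
    ]

pts = [("tree", (0.0, 0.0, 0.0)), ("tree", (0.5, 0.0, 0.0)),
       ("tree", (10.0, 0.0, 0.0))]
print(cluster_points(pts))  # two "tree" instances
```

The payoff is the same as in the full pipeline: thousands of labeled points collapse into a handful of instance summaries small enough to hand to an LLM.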

Dynamic Asset Generation employs a multi-stage pipeline optimized for AR requirements: prompt boosting via GPT, reference image synthesis through DALL-E 2 center-region editing, and InstantMesh 3D lifting.

Real-time Authoring Interface, built on Unity with ARFoundation, integrates Niantic VPS for precise localization, multi-agent LLM orchestration for speech processing, and an adaptive UI supporting manual, AI-assisted, and AI-decided interaction modes.
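
The three interaction modes reduce to a simple dispatch over who decides the final placement. The mode names come from the text above; the dispatch logic and function signature are illustrative assumptions:

```python
def place_object(mode, user_position=None, ai_suggestion=None):
    """Resolve a final placement depending on the authoring mode:
    manual (user decides), ai_assisted (AI proposes, user may override),
    or ai_decided (system places autonomously)."""
    if mode == "manual":
        return user_position
    if mode == "ai_assisted":
        # AI proposal is used only when the user has not overridden it
        return user_position if user_position is not None else ai_suggestion
    if mode == "ai_decided":
        return ai_suggestion
    raise ValueError(f"unknown mode: {mode}")

# User accepts the AI's proposed spot without moving the object
print(place_object("ai_assisted", ai_suggestion=(1.0, 0.0, 2.0)))
```

This mirrors the study's main finding: the AI-assisted middle mode, where the system proposes and the user retains override control, is where most participants settled.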

Looking Forward: Building the Foundation for Ubiquitous AR Creation

While ImaginateAR demonstrates significant progress toward accessible AR authoring, it also illuminates exciting directions for future development:

  • Enhanced Scene Understanding: Moving beyond bounding boxes to support more granular spatial reasoning and part-level interactions

  • Dynamic Content Support: Incorporating animation, audio, and interactive behaviors for richer experiences

  • Collaborative Creation: Enabling multi-user co-authoring and scene sharing across the global community

  • Platform Evolution: Expanding beyond mobile to AR headsets and other emerging spatial computing platforms

Conclusion: Imagination Made Real

ImaginateAR takes a crucial step toward our vision of democratized spatial computing—where creativity knows no technical boundaries and anyone can transform their environment through the power of imagination and AI collaboration.

This research demonstrates our continued commitment to pushing technological frontiers while maintaining human agency and creative expression at the center of innovation. As generative models continue improving in speed and quality, and as spatial computing platforms become more prevalent, the techniques pioneered in ImaginateAR will help unlock unprecedented creative possibilities for users worldwide.

The future of AR isn't just about consuming immersive content—it's about empowering everyone to create it. With ImaginateAR, that future moves one step closer to reality.

Publication Details: ImaginateAR was accepted to UIST 2025 and represents collaborative work between Niantic Spatial, University of Washington, and University College London. The research team includes Jaewook Lee, Filippo Aleotti, Diego Mazala, Guillermo Garcia-Hernando, Sara Vicente, Oliver James Johnston, Isabel Kraus-Liang, Jakub Powierza, Donghoon Shin, Jon E. Froehlich, Gabriel Brostow, and Jessica Van Brummelen.

Learn more about our research at nianticspatial.github.io/imaginatear