{"id":1147961,"date":"2025-08-20T09:00:00","date_gmt":"2025-08-20T16:00:00","guid":{"rendered":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/?p=1147961"},"modified":"2025-09-09T10:06:53","modified_gmt":"2025-09-09T17:06:53","slug":"mindjourney-enables-ai-to-explore-simulated-3d-worlds-to-improve-spatial-interpretation","status":"publish","type":"post","link":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/blog\/mindjourney-enables-ai-to-explore-simulated-3d-worlds-to-improve-spatial-interpretation\/","title":{"rendered":"MindJourney enables AI to explore simulated 3D worlds to improve spatial interpretation"},"content":{"rendered":"\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1.jpg\" alt=\"Three white line icons on a gradient background transitioning from blue to pink. From left to right: a network or molecule structure with a central circle and six surrounding nodes, a 3D cube, and an open laptop with an eye symbol above it.\" class=\"wp-image-1147994\" srcset=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1.jpg 1400w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-240x135.jpg 240w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><\/figure>\n\n\n\n<p>A new research framework helps AI agents explore three-dimensional spaces they can\u2019t directly detect. 
Called <a href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/publication\/mindjourney-test-time-scaling-with-world-models-for-spatial-reasoning\/\" target=\"_blank\" rel=\"noreferrer noopener\">MindJourney<\/a>, the approach addresses a key limitation in vision-language models (VLMs), which give AI agents their ability to interpret and describe visual scenes.<\/p>\n\n\n\n<p>While VLMs are strong at identifying objects in static images, they struggle to interpret the interactive 3D world behind 2D images. This gap shows up in spatial questions like \u201cIf I sit on the couch that is on my right and face the chairs, will the kitchen be to my right or left?\u201d\u2014tasks that require an agent to interpret its position and movement through space.<\/p>\n\n\n\n<p>People overcome this challenge by mentally exploring a space, imagining moving through it and combining those mental snapshots to work out where objects are. MindJourney applies the same process to AI agents, letting them roam a virtual space before answering spatial questions.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"how-mindjourney-navigates-3d-space\">How MindJourney navigates 3D space<\/h2>\n\n\n\n<p>To perform this type of spatial navigation, MindJourney uses a <em>world model<\/em>\u2014in this case, a video generation system trained on a large collection of videos captured from a single moving viewpoint, showing actions such as going forward and turning left or right, much like a 3D cinematographer. From this, it learns to predict how the scene would appear from different perspectives.<\/p>\n\n\n\n<p>At inference time, the world model generates multiple photo-realistic views of the scene, one for each possible movement from the agent\u2019s current position, while the VLM acts as a filter, selecting the generated perspectives most likely to help answer the user&#8217;s question.<\/p>\n\n\n\n<p>The selected views are kept and expanded in the next iteration, while less promising paths are discarded. This process, shown in Figure 1, avoids the need to generate and evaluate thousands of possible movement sequences by focusing only on the most informative perspectives.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"854\" src=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney-fig1.jpg\" alt=\"Figure 1. Given a spatial reasoning query, MindJourney searches through the imagined 3D space using a world model and improves the VLM's spatial interpretation through generated observations when encountering new challenges.\" class=\"wp-image-1147968\" srcset=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney-fig1.jpg 1400w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney-fig1-300x183.jpg 300w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney-fig1-1024x625.jpg 1024w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney-fig1-768x468.jpg 768w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney-fig1-240x146.jpg 240w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><figcaption class=\"wp-element-caption\">Figure 1. Given a spatial reasoning query, MindJourney searches through the imagined 3D space using a world model and improves the VLM&#8217;s spatial interpretation through generated observations when encountering new challenges.<\/figcaption><\/figure>\n\n\n\n<p>To make its search through a simulated space both effective and efficient, MindJourney uses a <em>spatial beam search<\/em>\u2014an algorithm that prioritizes the most promising paths. It works within a fixed number of steps, each representing a movement. By balancing breadth with depth, spatial beam search enables MindJourney to gather strong supporting evidence. This process is illustrated in Figure 2, and a code sketch follows below.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"1400\" height=\"788\" src=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788.jpg\" alt=\"MindJourney pipeline diagram\" class=\"wp-image-1147897\" srcset=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788.jpg 1400w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-300x169.jpg 300w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-1024x576.jpg 1024w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-768x432.jpg 768w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-1066x600.jpg 1066w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-655x368.jpg 655w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-240x135.jpg 240w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-640x360.jpg 640w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-960x540.jpg 960w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/MindJourney_pipeline_1400x788-1280x720.jpg 1280w\" sizes=\"auto, (max-width: 1400px) 100vw, 1400px\" \/><figcaption class=\"wp-element-caption\">Figure 2. The MindJourney workflow starts with a spatial beam search for a set number of steps before answering the query. The world model interactively generates new observations, while a VLM interprets the generated images, guiding the search throughout the process.<\/figcaption><\/figure>
\n\n\n\n<p class=\"has-text-align-left\">By iterating through simulation, evaluation, and integration, MindJourney can reason about spatial relationships far beyond what any single 2D image can convey, all without additional training. On the Spatial Aptitude Training (SAT) benchmark, it improved the accuracy of VLMs by 8% over their baseline performance.<\/p>
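\n\n\n\n<p>To make the search loop concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than the actual MindJourney implementation: the action set, the helper names, and the interfaces <code>world_model.imagine(view, action)<\/code> (render the view after one egocentric move), <code>vlm.score(image, question)<\/code> (rate how useful a generated view is for the question), and <code>vlm.answer(question, views)<\/code> (answer from the gathered views).<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code># Hedged sketch of MindJourney-style spatial beam search.\n# All interfaces here are assumptions, not the published code.\n\nACTIONS = ['move_forward', 'turn_left', 'turn_right']\n\ndef expand_and_filter(world_model, vlm, views, question, keep):\n    # Imagine one step ahead from every surviving viewpoint.\n    candidates = []\n    for view in views:\n        for action in ACTIONS:\n            image = world_model.imagine(view, action)\n            candidates.append((vlm.score(image, question), image))\n    # The VLM acts as the filter: keep only the most informative views.\n    candidates.sort(key=lambda pair: pair[0], reverse=True)\n    return [image for _, image in candidates[:keep]]\n\ndef spatial_beam_search(world_model, vlm, start_view, question,\n                        beam_width=3, depth=4):\n    beam = [start_view]\n    evidence = [start_view]  # views ultimately shown to the VLM\n    for _ in range(depth):  # fixed number of movement steps\n        beam = expand_and_filter(world_model, vlm, beam, question,\n                                 keep=beam_width)\n        evidence.extend(beam)\n    # Answer the spatial query from the start view plus gathered evidence.\n    return vlm.answer(question, evidence)<\/code><\/pre>\n\n\n\n<p>In this sketch, the beam width controls breadth (how many imagined viewpoints survive each round) and the depth controls how far the agent wanders, which is the breadth-versus-depth balance described above.<\/p>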
\n\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-4-3 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe title=\"MindJourney: Test-Time Scaling with World Models for Spatial Reasoning\" width=\"500\" height=\"375\" src=\"https:\/\/www.youtube-nocookie.com\/embed\/Z4-5NZmdV44?feature=oembed&rel=0&enablejsapi=1\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"building-smarter-agents\">Building smarter agents<\/h2>\n\n\n\n<p>MindJourney showed <a href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/publication\/mindjourney-test-time-scaling-with-world-models-for-spatial-reasoning\/\" target=\"_blank\" rel=\"noreferrer noopener\">strong performance<\/a> on multiple 3D spatial-reasoning benchmarks, and even advanced VLMs improved when paired with its imagination loop. This suggests that the spatial patterns world models learn from raw images, combined with the symbolic reasoning of VLMs, create a more complete spatial capability for agents. Together, they enable agents to infer what lies beyond the visible frame and interpret the physical world more accurately.<\/p>\n\n\n\n<p>It also demonstrates that pretrained VLMs and trainable world models can work together in 3D without retraining either one\u2014pointing toward general-purpose agents capable of interpreting and acting in real-world environments. 
This opens the way to possible applications in autonomous robotics, smart home technologies, and accessibility tools for people with visual impairments.<\/p>\n\n\n\n<p>By converting systems that simply describe static images into active agents that continually evaluate where to look next, MindJourney connects computer vision with planning. Because exploration occurs entirely within the model\u2019s latent space\u2014its internal representation of the scene\u2014robots would be able to test multiple viewpoints before determining their next move, potentially reducing wear, energy use, and collision risk.<\/p>\n\n\n\n<p>Looking ahead, we plan to extend the framework to use world models that not only predict new viewpoints but also forecast how the scene might change over time. We envision MindJourney working alongside VLMs that interpret those predictions and use them to plan what to do next. This enhancement could enable agents to more accurately interpret spatial relationships and physical dynamics, helping them to operate effectively in changing environments.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>MindJourney can enable AI to navigate and interpret 3D environments from limited visual input, potentially improving performance in navigation, planning, and safety-critical tasks.<\/p>\n","protected":false},"author":43868,"featured_media":1147994,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":null,"msr_hide_image_in_river":null,"footnotes":""},"categories":[1],"tags":[],"research-area":[13556,13562],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[269148,243984,269142],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-1147961","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-research-area-computer-vision","msr-locale-en_us","msr-post-option-approved-for-river","msr-post-option-blog-homepage-featured","msr-post-option-include-in-river"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[],"related-projects":[1147868],"related-events":[],"related-researchers":[{"type":"guest","value":"yuncong-yang","user_id":"1147889","display_name":"Yuncong Yang","author_link":"<a href=\"https:\/\/yyuncong.github.io\/\" aria-label=\"Visit the profile page for Yuncong Yang\">Yuncong Yang<\/a>","is_active":true,"last_first":"Yang, Yuncong","people_section":0,"alias":"yuncong-yang"},{"type":"user_nicename","value":"Reuben Tan","user_id":43827,"display_name":"Reuben Tan","author_link":"<a href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/people\/tanreuben\/\" aria-label=\"Visit the profile page for Reuben Tan\">Reuben Tan<\/a>","is_active":false,"last_first":"Tan, Reuben","people_section":0,"alias":"tanreuben"},{"type":"user_nicename","value":"Swadheen Shukla","user_id":38248,"display_name":"Swadheen Shukla","author_link":"<a 
href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/people\/swads\/\" aria-label=\"Visit the profile page for Swadheen Shukla\">Swadheen Shukla<\/a>","is_active":false,"last_first":"Shukla, Swadheen","people_section":0,"alias":"swads"},{"type":"user_nicename","value":"Jianfeng Gao","user_id":32246,"display_name":"Jianfeng Gao","author_link":"<a href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/people\/jfgao\/\" aria-label=\"Visit the profile page for Jianfeng Gao\">Jianfeng Gao<\/a>","is_active":false,"last_first":"Gao, Jianfeng","people_section":0,"alias":"jfgao"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-960x540.jpg\" class=\"img-object-cover\" alt=\"Three white line icons on a gradient background transitioning from blue to pink. From left to right: a network or molecule structure with a central circle and six surrounding nodes, a 3D cube, and an open laptop with an eye symbol above it.\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-960x540.jpg 960w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-300x169.jpg 300w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-1024x576.jpg 1024w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-768x432.jpg 768w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-1066x600.jpg 1066w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-655x368.jpg 655w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-240x135.jpg 240w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-640x360.jpg 640w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1-1280x720.jpg 1280w, https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-content\/uploads\/2025\/08\/ImprovingImagination-BlogHeroFeature-1400x788-1.jpg 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"<a href=\"https:\/\/yyuncong.github.io\/\" title=\"Go to researcher profile for Yuncong Yang\" aria-label=\"Go to researcher profile for Yuncong Yang\" data-bi-type=\"byline author\" data-bi-cN=\"Yuncong Yang\">Yuncong Yang<\/a>, <a href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/people\/tanreuben\/\" title=\"Go to researcher profile for Reuben Tan\" aria-label=\"Go to researcher profile for Reuben Tan\" data-bi-type=\"byline author\" data-bi-cN=\"Reuben Tan\">Reuben Tan<\/a>, <a href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/people\/swads\/\" title=\"Go to researcher profile for Swadheen Shukla\" aria-label=\"Go to researcher profile for Swadheen Shukla\" data-bi-type=\"byline author\" data-bi-cN=\"Swadheen Shukla\">Swadheen Shukla<\/a>, and <a 
href=\"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/people\/jfgao\/\" title=\"Go to researcher profile for Jianfeng Gao\" aria-label=\"Go to researcher profile for Jianfeng Gao\" data-bi-type=\"byline author\" data-bi-cN=\"Jianfeng Gao\">Jianfeng Gao<\/a>","formattedDate":"August 20, 2025","formattedExcerpt":"MindJourney can enable AI to navigate and interpret 3D environments from limited visual input, potentially improving performance in navigation, planning, and safety-critical tasks.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1147961","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/users\/43868"}],"replies":[{"embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/comments?post=1147961"}],"version-history":[{"count":32,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1147961\/revisions"}],"predecessor-version":[{"id":1149179,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/posts\/1147961\/revisions\/1149179"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media\/1147994"}],"wp:attachment":[{"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1147961"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/categories?post=1147961"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/tags?post=1147961"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=1147961"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=1147961"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=1147961"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1147961"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1147961"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1147961"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=1147961"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/new-cm-edgedigital.pages.dev\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=1147961"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}