{"id":44153,"date":"2025-07-09T09:00:00","date_gmt":"2025-07-09T16:00:00","guid":{"rendered":""},"modified":"2025-10-06T10:34:44","modified_gmt":"2025-10-06T17:34:44","slug":"reasoning-reimagined-introducing-phi-4-mini-flash-reasoning","status":"publish","type":"post","link":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/","title":{"rendered":"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\" id=\"state-of-the-art-architecture-redefines-speed-for-reasoning-models\">State-of-the-art architecture redefines speed for reasoning models<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Microsoft is excited to unveil a new addition to the Phi model family: <strong>Phi-4-mini-flash-reasoning<\/strong>. Purpose-built for scenarios where compute, memory, and latency are tightly constrained, this new model is engineered to bring advanced reasoning capabilities to edge devices, mobile applications, and other resource-constrained environments. This new model follows Phi-4-mini but is built on a new hybrid architecture that achieves up to 10 times higher throughput and a 2 to 3 times average reduction in latency, enabling significantly faster inference without sacrificing reasoning performance. 
Ready to power real world solutions that demand efficiency and flexibility, Phi-4-mini-flash-reasoning is available on <a href=\"https:\/\/ai.azure.com\/\">Azure AI Foundry<\/a>, <a href=\"https:\/\/build.nvidia.com\/microsoft\" target=\"_blank\" rel=\"noreferrer noopener\">NVIDIA API Catalog<\/a>, and <a href=\"http:\/\/aka.ms\/flashreasoning-hf\" target=\"_blank\" rel=\"noreferrer noopener\">Hugging Face<\/a> today.<\/p>\n\n\n\n<aside class=\"cta-block cta-block--align-left cta-block--has-image wp-block-msx-cta\" data-bi-an=\"CTA Block\">\n\t<div class=\"cta-block__content\">\n\t\t\t\t\t<div class=\"cta-block__image-container\">\n\t\t\t\t<img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"575\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Azure_Hero_Ellipse_OffWhite_FullGrad_cropped-1024x575.webp\" class=\"cta-block__image\" alt=\"A colorful background with a curved line\" srcset=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Azure_Hero_Ellipse_OffWhite_FullGrad_cropped-1024x575.webp 1024w, https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Azure_Hero_Ellipse_OffWhite_FullGrad_cropped-300x169.webp 300w, https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Azure_Hero_Ellipse_OffWhite_FullGrad_cropped-768x432.webp 768w, https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Azure_Hero_Ellipse_OffWhite_FullGrad_cropped.webp 1260w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/>\t\t\t<\/div>\n\t\t\n\t\t<div class=\"cta-block__body\">\n\t\t\t<h2 class=\"cta-block__headline\">Azure AI Foundry<\/h2>\n\t\t\t<p class=\"cta-block__text\">Create without boundaries\u2014Azure AI Foundry has everything you need to design, customize, and manage AI applications and agents<\/p>\n\t\t\t\t\t\t\t<div class=\"cta-block__actions\">\n\t\t\t\t\t<a\n\t\t\t\t\t\thref=\"https:\/\/ai.azure.com\/\"\n\t\t\t\t\t\tclass=\"btn cta-block__link 
btn-link\"\n\t\t\t\t\t\t\t\t\t\t\t>\n\t\t\t\t\t\tExplore solutions\t\t\t\t\t<\/a>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t<\/div>\n<\/aside>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"efficiency-without-compromise\">Efficiency without compromise&nbsp;<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Phi-4-mini-flash-reasoning balances math reasoning ability with efficiency, making it potentially suitable for educational applications, real-time logic-based applications, and more.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Similar to its predecessor, Phi-4-mini-flash-reasoning is a 3.8-billion-parameter open model optimized for advanced math reasoning. It supports a 64K token context length and is fine-tuned on high-quality synthetic data to deliver reliable performance on logic-intensive tasks.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-s-new\">What&#8217;s new?<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">At the core of Phi-4-mini-flash-reasoning is the newly introduced decoder-hybrid-decoder architecture, SambaY, whose central innovation is the Gated Memory Unit (GMU), a simple yet effective mechanism for sharing representations between layers. The architecture includes a self-decoder that combines Mamba (a State Space Model) and Sliding Window Attention (SWA), along with a single layer of full attention. It also includes a cross-decoder that interleaves expensive cross-attention layers with the new, efficient GMUs. 
This new architecture with GMU modules drastically improves decoding efficiency, boosts long-context retrieval performance, and delivers exceptional performance across a wide range of tasks.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Key benefits of the SambaY architecture include:&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Enhanced decoding efficiency.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Preserved linear prefilling time complexity.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Increased scalability and enhanced long-context performance.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Up to 10 times higher throughput.<\/li>\n<\/ul>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Decoder-hybrid-decoder-architecture.webp\" alt=\"Diagram of the decoder-hybrid-decoder architecture\" class=\"wp-image-44177 webp-format\" data-orig-src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Decoder-hybrid-decoder-architecture.webp\"><figcaption class=\"wp-element-caption\"><em>Our decoder-hybrid-decoder architecture, taking Samba [RLL+25] as the self-decoder. Gated Memory Units (GMUs) are interleaved with the cross-attention layers in the cross-decoder to reduce the computational complexity of decoding. As in YOCO [SDZ+24], the full attention layer computes the KV cache only during prefilling with the self-decoder, leading to linear computational complexity for the prefill stage.<\/em><\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"phi-4-mini-flash-reasoning-benchmarks\">Phi-4-mini-flash-reasoning benchmarks&nbsp;<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Like all models in the Phi family, Phi-4-mini-flash-reasoning is deployable on a single GPU, making it accessible for a broad range of use cases. 
However, what sets it apart is its architectural advantage. This new model achieves significantly lower latency and higher throughput compared to Phi-4-mini-reasoning, particularly in long-context generation and latency-sensitive reasoning tasks.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">This makes Phi-4-mini-flash-reasoning a compelling option for developers and enterprises looking to deploy intelligent systems that require fast, scalable, and efficient reasoning\u2014whether on premises or on-device.&nbsp;<\/p>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Generation-Latencies.webp\" alt=\"A graph of a number of people\" class=\"wp-image-44178 webp-format\" data-orig-src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Generation-Latencies.webp\"><\/figure>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Prompt-2000.webp\" alt=\"A graph with red and blue dots and numbers\" class=\"wp-image-44179 webp-format\" data-orig-src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Prompt-2000.webp\"><figcaption class=\"wp-element-caption\"><em>The top plot shows inference latency as a function of generation length, while the bottom plot illustrates how inference latency varies with throughput. 
Both experiments were conducted using the vLLM inference framework on a single A100-80GB GPU with tensor parallelism (TP) set to 1.<\/em><\/figcaption><\/figure>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img decoding=\"async\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Math-benchmarks.webp\" alt=\"A graph of different colored bars\" class=\"wp-image-44181 webp-format\" data-orig-src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Math-benchmarks.webp\"><figcaption class=\"wp-element-caption\"><em>A more accurate evaluation was used where Pass@1 accuracy is averaged over 64 samples for AIME24\/25 and 8 samples for Math500 and GPQA Diamond. In this graph, Phi-4-mini-flash-reasoning outperforms Phi-4-mini-reasoning and is better than models twice its size.<\/em><\/figcaption><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"what-are-the-potential-use-cases\">What are the potential use cases?&nbsp;<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Thanks to its reduced latency, improved throughput, and focus on math reasoning, the model is ideal for:&nbsp;<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Adaptive learning platforms<\/strong>, where real-time feedback loops are essential.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><strong>On-device reasoning assistants<\/strong>, such as mobile study aids or edge-based logic agents.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><strong>Interactive tutoring systems<\/strong> that dynamically adjust content difficulty based on a learner\u2019s performance.<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">Its strength in math and structured reasoning makes it especially valuable for education technology, lightweight simulations, and automated assessment tools that require reliable logic inference with fast response times.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Developers are encouraged to 
connect with peers and Microsoft engineers through the <a href=\"https:\/\/aka.ms\/foundrydevs\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft Developer Discord community<\/a> to ask questions, share feedback, and explore real-world use cases together.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"microsoft-s-commitment-to-trustworthy-ai\">Microsoft\u2019s commitment to trustworthy AI&nbsp;<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Organizations across industries are leveraging Azure AI and <a href=\"https:\/\/www.microsoft.com\/en-us\/microsoft-365\/copilot\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft 365 Copilot<\/a> capabilities to drive growth, increase productivity, and create value-added experiences.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">We\u2019re committed to helping organizations use and build <a href=\"https:\/\/blogs.microsoft.com\/blog\/2024\/09\/24\/microsoft-trustworthy-ai-unlocking-human-potential-starts-with-trust\/\" target=\"_blank\" rel=\"noreferrer noopener\">AI that is trustworthy<\/a>, meaning it is secure, private, and safe. We bring best practices and learnings from decades of researching and building AI products at scale to provide industry-leading commitments and capabilities that span our three pillars of security, privacy, and safety. Trustworthy AI is only possible when you combine our commitments, such as our <a href=\"https:\/\/www.microsoft.com\/en-us\/trust-center\/security\/secure-future-initiative\" target=\"_blank\" rel=\"noreferrer noopener\">Secure Future Initiative<\/a> and our <a href=\"https:\/\/www.microsoft.com\/en-us\/ai\/responsible-ai\" target=\"_blank\" rel=\"noreferrer noopener\">responsible AI principles<\/a>, with our product capabilities to unlock AI transformation with confidence. 
&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Phi models are developed in accordance with Microsoft AI principles: accountability, transparency, fairness, reliability and safety, privacy and security, and inclusiveness.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">The Phi model family, including Phi-4-mini-flash-reasoning, employs a robust safety post-training strategy that integrates Supervised Fine-Tuning (SFT), Direct Preference Optimization (DPO), and Reinforcement Learning from Human Feedback (RLHF). These techniques are applied using a combination of open-source and proprietary datasets, with a strong emphasis on ensuring helpfulness, minimizing harmful outputs, and addressing a broad range of safety categories. Developers are encouraged to apply responsible AI best practices tailored to their specific use cases and cultural contexts.&nbsp;<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Read the model card to learn more about risks and mitigation strategies.&nbsp;<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"learn-more-about-the-new-model\">Learn more about the new model&nbsp;<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Try out the new model on <a href=\"https:\/\/aka.ms\/try-phi\" target=\"_blank\" rel=\"noreferrer noopener\">Azure AI Foundry<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Find code samples and more in the <a href=\"https:\/\/aka.ms\/phicookbook\" target=\"_blank\" rel=\"noreferrer noopener\">Phi Cookbook<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Read the Phi-4-mini-flash-reasoning technical paper on <a href=\"http:\/\/aka.ms\/flashreasoning-hf\" target=\"_blank\" rel=\"noreferrer noopener\">arXiv<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">If you have questions, sign up for the <a href=\"https:\/\/discord.com\/invite\/azureaifoundry?event=1382861149288005693\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft Developer \u201cAsk Me 
Anything\u201d<\/a>.&nbsp;<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"create-with-azure-ai-foundry\">Create with Azure AI Foundry<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Get started with <a href=\"https:\/\/ai.azure.com\/\">Azure AI Foundry<\/a>, and jump directly into <a href=\"https:\/\/marketplace.visualstudio.com\/items?itemName=TeamsDevApp.vscode-ai-foundry\" target=\"_blank\" rel=\"noreferrer noopener\">Visual Studio Code<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Download the <a href=\"https:\/\/aka.ms\/aifoundrysdk\" target=\"_blank\" rel=\"noreferrer noopener\">Azure AI Foundry SDK<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Take the <a href=\"https:\/\/aka.ms\/CreateAgenticAISolutions\" target=\"_blank\" rel=\"noreferrer noopener\">Azure AI Foundry Learn courses<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Review the <a href=\"https:\/\/learn.microsoft.com\/azure\/ai-foundry\/\" target=\"_blank\" rel=\"noreferrer noopener\">Azure AI Foundry documentation<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Keep the conversation going on <a href=\"https:\/\/aka.ms\/azureaifoundry\/forum\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub<\/a> and <a href=\"https:\/\/aka.ms\/azureaifoundry\/discord\" target=\"_blank\" rel=\"noreferrer noopener\">Discord<\/a>.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Unlock faster, efficient reasoning with Phi-4-mini-flash-reasoning\u2014optimized for edge, mobile, and real-time 
applications.<\/p>\n","protected":false},"author":42,"featured_media":44963,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ms_queue_id":["aiblog-content-sync"],"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","_alt_title":"","footnotes":"","msx_community_cta_settings":[]},"categories":[1454,1551],"tags":[2671,2735,2747,3168],"audience":[3072,3055],"content-type":[1465],"product":[1803,3164,1552],"tech-community":[3041],"topic":[],"coauthors":[3174,3264,3267],"class_list":["post-44153","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-machine-learning","category-developer-tools","tag-ai","tag-copilot","tag-generative-ai","tag-large-language-models-llms","audience-ai-professionals","audience-developers","content-type-announcements","product-azure-ai","product-microsoft-foundry","product-visual-studio","review-flag-1680286581-295","review-flag-1680286581-364","review-flag-1-1680286581-825","review-flag-2-1680286581-601","review-flag-3-1680286581-173","review-flag-4-1680286581-250","review-flag-8-1680286581-263","review-flag-microsofts","review-flag-new-1680286579-546","review-flag-on-pr-1680286585-571"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning | Microsoft Azure Blog<\/title>\n<meta name=\"description\" content=\"Unlock faster, efficient reasoning with Phi-4-mini-flash-reasoning\u2014optimized for edge, mobile, and real-time applications.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta 
property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning | Microsoft Azure Blog\" \/>\n<meta property=\"og:description\" content=\"Unlock faster, efficient reasoning with Phi-4-mini-flash-reasoning\u2014optimized for edge, mobile, and real-time applications.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Azure Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/microsoftazure\" \/>\n<meta property=\"article:published_time\" content=\"2025-07-09T16:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-10-06T17:34:44+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1260\" \/>\n\t<meta property=\"og:image:height\" content=\"709\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Weizhu Chen, Jianfeng Gao, Liliang Ren\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.png\" \/>\n<meta name=\"twitter:creator\" content=\"@azure\" \/>\n<meta name=\"twitter:site\" content=\"@azure\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Weizhu Chen, Jianfeng Gao, Liliang Ren\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/\"},\"author\":[{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/weizhu-chen\/\",\"@type\":\"Person\",\"@name\":\"Weizhu Chen\"},{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/jianfeng-gao\/\",\"@type\":\"Person\",\"@name\":\"Jianfeng Gao\"},{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/liliang-ren\/\",\"@type\":\"Person\",\"@name\":\"Liliang Ren\"}],\"headline\":\"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning\",\"datePublished\":\"2025-07-09T16:00:00+00:00\",\"dateModified\":\"2025-10-06T17:34:44+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/\"},\"wordCount\":987,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp\",\"keywords\":[\"AI\",\"Copilot\",\"Generative AI\",\"Large language models (LLMs)\"],\"articleSection\":[\"AI + machine learning\",\"Developer 
tools\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/\",\"name\":\"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning | Microsoft Azure Blog\",\"isPartOf\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp\",\"datePublished\":\"2025-07-09T16:00:00+00:00\",\"dateModified\":\"2025-10-06T17:34:44+00:00\",\"description\":\"Unlock faster, efficient reasoning with Phi-4-mini-flash-reasoning\u2014optimized for edge, mobile, and real-time 
applications.\",\"breadcrumb\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp\",\"contentUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp\",\"width\":1260,\"height\":709,\"caption\":\"A white rectangular sign with blue text reading \\\"Announcing Phi-4-mini-flash-reasoning\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog home\",\"item\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AI + machine learning\",\"item\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/category\/ai-machine-learning\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#website\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\",\"name\":\"Microsoft Azure Blog\",\"description\":\"Get the latest Azure news, updates, and announcements from the Azure blog. 
From product updates to hot topics, hear from the Azure experts.\",\"publisher\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\",\"name\":\"Microsoft Azure Blog\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp\",\"contentUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp\",\"width\":512,\"height\":512,\"caption\":\"Microsoft Azure Blog\"},\"image\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/microsoftazure\",\"https:\/\/x.com\/azure\",\"https:\/\/www.instagram.com\/microsoftdeveloper\/\",\"https:\/\/www.linkedin.com\/company\/16188386\",\"https:\/\/www.youtube.com\/user\/windowsazure\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/person\/b2603da1afac705823964361ce9072c0\",\"name\":\"Kristin 
Gallagher\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/secure.gravatar.com\/avatar\/295fa37b6bb2bbf59603c38b6ac7a7b4b86cd0f736387182fa9d0117f52cdf5e?s=96&d=mm&r=gb83eb8c5c3f8feea9763b473dabe8524\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/295fa37b6bb2bbf59603c38b6ac7a7b4b86cd0f736387182fa9d0117f52cdf5e?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/295fa37b6bb2bbf59603c38b6ac7a7b4b86cd0f736387182fa9d0117f52cdf5e?s=96&d=mm&r=g\",\"caption\":\"Kristin Gallagher\"},\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/kristingallagher\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning | Microsoft Azure Blog","description":"Unlock faster, efficient reasoning with Phi-4-mini-flash-reasoning\u2014optimized for edge, mobile, and real-time applications.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/","og_locale":"en_US","og_type":"article","og_title":"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning | Microsoft Azure Blog","og_description":"Unlock faster, efficient reasoning with Phi-4-mini-flash-reasoning\u2014optimized for edge, mobile, and real-time applications.","og_url":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/","og_site_name":"Microsoft Azure Blog","article_publisher":"https:\/\/www.facebook.com\/microsoftazure","article_published_time":"2025-07-09T16:00:00+00:00","article_modified_time":"2025-10-06T17:34:44+00:00","og_image":[{"width":1260,"height":709,"url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.png","type":"image\/png"}],"author":"Weizhu 
Chen, Jianfeng Gao, Liliang Ren","twitter_card":"summary_large_image","twitter_image":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.png","twitter_creator":"@azure","twitter_site":"@azure","twitter_misc":{"Written by":"Weizhu Chen, Jianfeng Gao, Liliang Ren","Est. reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#article","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/"},"author":[{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/weizhu-chen\/","@type":"Person","@name":"Weizhu Chen"},{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/jianfeng-gao\/","@type":"Person","@name":"Jianfeng Gao"},{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/liliang-ren\/","@type":"Person","@name":"Liliang Ren"}],"headline":"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning","datePublished":"2025-07-09T16:00:00+00:00","dateModified":"2025-10-06T17:34:44+00:00","mainEntityOfPage":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/"},"wordCount":987,"commentCount":0,"publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage"},"thumbnailUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp","keywords":["AI","Copilot","Generative AI","Large language models (LLMs)"],"articleSection":["AI + machine learning","Developer 
tools"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/","name":"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning | Microsoft Azure Blog","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage"},"thumbnailUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp","datePublished":"2025-07-09T16:00:00+00:00","dateModified":"2025-10-06T17:34:44+00:00","description":"Unlock faster, efficient reasoning with Phi-4-mini-flash-reasoning\u2014optimized for edge, mobile, and real-time 
applications.","breadcrumb":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#primaryimage","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp","contentUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2025\/07\/Outlook-dj4dsrl1.webp","width":1260,"height":709,"caption":"A white rectangular sign with blue text reading \"Announcing Phi-4-mini-flash-reasoning"},{"@type":"BreadcrumbList","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/reasoning-reimagined-introducing-phi-4-mini-flash-reasoning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog home","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/"},{"@type":"ListItem","position":2,"name":"AI + machine learning","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/category\/ai-machine-learning\/"},{"@type":"ListItem","position":3,"name":"Reasoning reimagined: Introducing Phi-4-mini-flash-reasoning"}]},{"@type":"WebSite","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","name":"Microsoft Azure Blog","description":"Get the latest Azure news, updates, and announcements from the Azure blog. 
From product updates to hot topics, hear from the Azure experts.","publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/azure.microsoft.com\/en-us\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization","name":"Microsoft Azure Blog","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","contentUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","width":512,"height":512,"caption":"Microsoft Azure Blog"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/microsoftazure","https:\/\/x.com\/azure","https:\/\/www.instagram.com\/microsoftdeveloper\/","https:\/\/www.linkedin.com\/company\/16188386","https:\/\/www.youtube.com\/user\/windowsazure"]},{"@type":"Person","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/person\/b2603da1afac705823964361ce9072c0","name":"Kristin Gallagher","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/295fa37b6bb2bbf59603c38b6ac7a7b4b86cd0f736387182fa9d0117f52cdf5e?s=96&d=mm&r=gb83eb8c5c3f8feea9763b473dabe8524","url":"https:\/\/secure.gravatar.com\/avatar\/295fa37b6bb2bbf59603c38b6ac7a7b4b86cd0f736387182fa9d0117f52cdf5e?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/295fa37b6bb2bbf59603c38b6ac7a7b4b86cd0f736387182fa9d0117f52cdf5e?s=96&d=mm&r=g","caption":"Kristin 
Gallagher"},"url":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/kristingallagher\/"}]}},"msxcm_display_generated_audio":false,"msxcm_animated_featured_image":null,"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Microsoft Azure Blog","distributor_original_site_url":"https:\/\/azure.microsoft.com\/en-us\/blog","push-errors":false,"_links":{"self":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/44153","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/users\/42"}],"replies":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/comments?post=44153"}],"version-history":[{"count":23,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/44153\/revisions"}],"predecessor-version":[{"id":44971,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/44153\/revisions\/44971"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/media\/44963"}],"wp:attachment":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/media?parent=44153"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/categories?post=44153"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tags?post=44153"},{"taxonomy":"audience","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/audience?post=44153"},{"taxonomy":"content-type","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/content-type?post=44153"},{
"taxonomy":"product","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/product?post=44153"},{"taxonomy":"tech-community","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tech-community?post=44153"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/topic?post=44153"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/coauthors?post=44153"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}