{"id":49830,"date":"2026-03-11T00:00:00","date_gmt":"2026-03-11T07:00:00","guid":{"rendered":""},"modified":"2026-03-11T11:30:00","modified_gmt":"2026-03-11T18:30:00","slug":"introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure","status":"publish","type":"post","link":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/","title":{"rendered":"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure"},"content":{"rendered":"\n<p class=\"wp-block-paragraph\">Across industries, organizations are increasingly standardizing on open models to gain greater control over performance, cost, customization, and the security and compliance required for enterprise deployment. Open models give teams the flexibility to choose the right architecture for each workload and avoid lock\u2011in to a single model provider as their needs evolve.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-a89b3969 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/ai.azure.com\/\" target=\"_blank\" rel=\"noreferrer noopener\">Explore models on Microsoft Foundry<\/a><\/div>\n<\/div>\n\n\n\n<p class=\"wp-block-paragraph\">As adoption grows, however, performance alone is no longer enough. Teams need a consistent way to evaluate models quickly, operate them safely in production, and improve them over time without rebuilding infrastructure or fragmenting their tooling. Too often, organizations are forced to assemble bespoke serving stacks, slowing innovation and making it harder to scale and compound progress.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Microsoft Foundry is designed to address this challenge. 
It serves as a unified system of record and enterprise control plane for AI, bringing together models, agents, evaluation, deployment, and governance into a single experience. With Microsoft Foundry, teams can move from experimentation to production with confidence, using the models and frameworks that best fit their requirements, while relying on a consistent operational foundation.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\"><strong>Today, we\u2019re announcing the public preview of Fireworks AI on <\/strong><a href=\"https:\/\/ai.azure.com\/\" target=\"_blank\" rel=\"noreferrer noopener\"><strong>Microsoft Foundry<\/strong><\/a><strong>, bringing high\u2011performance open model inference into Azure.<\/strong> This integration reflects Microsoft Foundry\u2019s broader direction: providing a single place where developers can not only run open models efficiently but also customize and operationalize them as part of a complete enterprise\u2011ready AI lifecycle.<\/p>\n\n\n\n<figure class=\"wp-block-msx-ump-embed wp-block-msxcm-ump-embed\">\n\t<div class=\"wp-block-embed__wrapper\">\n\t\t<universal-media-player id=\"ump-69e78e04375cc\"><\/universal-media-player>\n\t\t<script type=\"module\">\n\t\t\tconst currentTheme =\n\t\t\t\tlocalStorage.getItem('msxcmCurrentTheme') ||\n\t\t\t\t(window.matchMedia('(prefers-color-scheme: dark)').matches ? 
'dark' : 'light');\n\n\t\t\t\/\/ Modify player theme based on localStorage value.\n\t\t\tlet options = {\"autoplay\":false,\"hideControls\":null,\"language\":\"en-us\",\"loop\":false,\"partnerName\":\"cloud-blogs\",\"poster\":\"https:\\\/\\\/cdn-dynmedia-1.microsoft.com\\\/is\\\/image\\\/microsoftcorp\\\/1375250-Fireworks-AI-Microsoft-Foundry_tbmnl_en-us?wid=1280\",\"title\":\"\",\"sources\":[{\"src\":\"https:\\\/\\\/cdn-dynmedia-1.microsoft.com\\\/is\\\/content\\\/microsoftcorp\\\/1375250-Fireworks-AI-Microsoft-Foundry-0x1080-6439k\",\"type\":\"video\\\/mp4\",\"quality\":\"HQ\"},{\"src\":\"https:\\\/\\\/cdn-dynmedia-1.microsoft.com\\\/is\\\/content\\\/microsoftcorp\\\/1375250-Fireworks-AI-Microsoft-Foundry-0x720-3266k\",\"type\":\"video\\\/mp4\",\"quality\":\"HD\"},{\"src\":\"https:\\\/\\\/cdn-dynmedia-1.microsoft.com\\\/is\\\/content\\\/microsoftcorp\\\/1375250-Fireworks-AI-Microsoft-Foundry-0x540-2160k\",\"type\":\"video\\\/mp4\",\"quality\":\"SD\"},{\"src\":\"https:\\\/\\\/cdn-dynmedia-1.microsoft.com\\\/is\\\/content\\\/microsoftcorp\\\/1375250-Fireworks-AI-Microsoft-Foundry-0x360-958k\",\"type\":\"video\\\/mp4\",\"quality\":\"LO\"}],\"ccFiles\":[{\"url\":\"https:\\\/\\\/azure.microsoft.com\\\/en-us\\\/blog\\\/wp-json\\\/msxcm\\\/v1\\\/get-captions?url=https%3A%2F%2Fwww.microsoft.com%2Fcontent%2Fdam%2Fmicrosoft%2Fbade%2Fvideos%2Fproducts-and-services%2Fen-us%2Fazure%2F1375250-fireworks-ai-microsoft-foundry%2F1375250-Fireworks-AI-Microsoft-Foundry_cc_en-us.ttml\",\"locale\":\"en-us\",\"ccType\":\"TTML\"}],\"downloadableFiles\":[{\"url\":\"https:\\\/\\\/cdn-dynmedia-1.microsoft.com\\\/is\\\/content\\\/microsoftcorp\\\/1375250-Fireworks-AI-Microsoft-Foundry_transcript_en-us\",\"locale\":\"en-us\",\"mediaType\":\"transcript\"},{\"url\":\"https:\\\/\\\/cdn-dynmedia-1.microsoft.com\\\/is\\\/content\\\/microsoftcorp\\\/1375250-Fireworks-AI-Microsoft-Foundry_audio_en-us\",\"locale\":\"en-us\",\"mediaType\":\"audio\"}]};\n\n\t\t\tif (currentTheme) 
{\n\t\t\t\toptions.playButtonTheme = currentTheme;\n\t\t\t}\n\n\t\t\tdocument.addEventListener('DOMContentLoaded', () => {\n\t\t\t\tump(\"ump-69e78e04375cc\", options);\n\t\t\t});\n\t\t<\/script>\n\t<\/div>\n\t<\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"fireworks-ai-models-on-microsoft-foundry-a-single-place-for-open-models\">Fireworks AI models on Microsoft Foundry: A single place for open models<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Fireworks AI delivers industry-leading inference for open models, and Microsoft Foundry is what makes that performance usable at enterprise scale. Accessing Fireworks AI through Microsoft Foundry gives teams a single, trusted control plane to evaluate, deploy, customize, and operate open models alongside the rest of their AI stack.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">As open models mature, customization increasingly extends beyond training. Teams need consistent ways to configure, deploy, optimize, govern, and iterate on models in production without fragmenting tools or infrastructure. Microsoft Foundry provides the environment where these customization and operational workflows are standardized, while Fireworks AI supplies the performance and efficiency needed to run open models at scale. This means teams can move from experimentation to production using open models without stitching together separate tools, contracts, and deployment paths.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Together, Fireworks AI and Microsoft Foundry enable a more complete and sustainable approach to working with open models, combining fast, efficient inference with a platform designed to support enterprise open model operations over time. 
<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-a89b3969 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/ai.azure.com\" target=\"_blank\" rel=\"noreferrer noopener\">Start building with Fireworks on Microsoft Foundry today<\/a><\/div>\n<\/div>\n\n\n\n<p class=\"wp-block-paragraph\">With Fireworks AI on Foundry, developers can <strong>get access to best-in-class inferencing for open models<\/strong>, including optimized deployments for custom weight models. Fireworks AI is a market leader in high-performance inference for open models. Its engine already runs at internet scale, processing over 13 trillion tokens daily, sustaining roughly 180,000 requests per second, and generating over 1,000 tokens per second on large models, substantiated by leading benchmark performance on <a href=\"https:\/\/artificialanalysis.ai\/providers\/fireworks\"><em>Artificial Analysis<\/em><\/a>. This performance is now available on Foundry.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">Developers can log into Foundry and access these open models with Fireworks AI today:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">DeepSeek V3.2<\/li>\n\n\n\n<li class=\"wp-block-list-item\">OpenAI gpt-oss-120b<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Kimi K2.5<\/li>\n\n\n\n<li class=\"wp-block-list-item\">MiniMax M2.5 (<em>new<\/em>)<\/li>\n<\/ul>\n\n\n\n<p class=\"wp-block-paragraph\">This brings a new open model (MiniMax M2.5) to Foundry with serverless support and offers optimized inference for already popular open models. 
<\/p>\n\n\n\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;69e78e0439989&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img loading=\"lazy\" decoding=\"async\" width=\"960\" height=\"540\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Fireworks-Catalog-4_models_Final.gif\" alt=\"A demo of model discovery.\" class=\"wp-image-49918\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"state.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"state.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><\/figure>\n\n\n\n<p class=\"wp-block-paragraph\">With Fireworks AI in Microsoft Foundry, developers can:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Evaluate models faster with day\u2011zero access and support:<\/strong> Start building immediately with access to state-of-the-art open models from Fireworks AI through a single Azure endpoint via Foundry.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><strong>Optimize inference: <\/strong>Requests to open models are 
served by Fireworks\u2019 high\u2011throughput inference stack for fast performance with Azure\u2011grade governance.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><strong>Run the models you already trust:<\/strong> With bring-your-own-weights (BYOW), you can upload and register quantized or fine\u2011tuned weights trained elsewhere without changing the serving stack.<\/li>\n<\/ul>\n\n\n\n<figure data-wp-context=\"{&quot;imageId&quot;:&quot;69e78e043aae7&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img loading=\"lazy\" decoding=\"async\" width=\"960\" height=\"540\" data-wp-class--hide=\"state.isContentHidden\" data-wp-class--show=\"state.isContentVisible\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Fireworks-Custom_Model-2_Final.gif\" alt=\"A demo of custom model creation.\" class=\"wp-image-49920\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"state.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"state.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><strong>Choose the right pricing model for 
your workload<\/strong>: Use serverless, pay-per\u2011token inference to experiment securely and quickly with <a href=\"https:\/\/learn.microsoft.com\/en-us\/azure\/foundry\/foundry-models\/concepts\/deployment-types#data-zone-standard\">Data Zone Standard<\/a> or choose provisioned throughput units (PTUs) for predictable, steady-state performance with base or custom models. Whether you\u2019re optimizing for agility or efficiency, you get flexibility without managing infrastructure.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><strong>Operate with enterprise trust and scale<\/strong>: We are committed to enabling customers to build production-ready AI applications quickly, while maintaining the highest levels of safety and security. Foundry provides an end-to-end workspace for agent development, evaluation, and deployment, including unified governance, observability, and agent-ready tooling.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"the-future-of-fireworks-and-ai-use-cases\">The future of Fireworks and AI use cases<\/h2>\n\n\n\n<p class=\"wp-block-paragraph\">Microsoft Foundry is evolving to support the full lifecycle of open models\u2014from early evaluation through production operation and ongoing optimization. As teams scale their use of open models, having a consistent, enterprise\u2011ready foundation becomes increasingly important.<\/p>\n\n\n\n<p class=\"wp-block-paragraph\">By integrating Fireworks AI into Microsoft Foundry, developers gain access to high\u2011performance inference today while building on a platform designed to support deeper customization and enterprise operations over time. This approach gives teams the confidence to adopt open models not just for what they can do now, but for how they can grow, adapt, and operate reliably as their AI ambitions expand. 
We\u2019re looking forward to seeing how developers and enterprises use Fireworks AI on Microsoft Foundry to power the next generation of intelligent applications.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"to-get-started\">To get started:<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li class=\"wp-block-list-item\">Go to <a href=\"https:\/\/ai.azure.com\" target=\"_blank\" rel=\"noreferrer noopener\">Microsoft Foundry<\/a> models and select Fireworks AI open models in the model catalog collection.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Select the open model hosted by Fireworks.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">View the model card.<\/li>\n\n\n\n<li class=\"wp-block-list-item\">Select your deployment option\u2014serverless or PTU\u2014and deploy.<\/li>\n<\/ol>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-a89b3969 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/ai.azure.com\" target=\"_blank\" rel=\"noreferrer noopener\">Start building with Fireworks on Microsoft Foundry<\/a><\/div>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"learn-more-about-fireworks-on-microsoft-foundry\">Learn more about Fireworks on Microsoft Foundry<\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/aka.ms\/fireworks-learn-more\" target=\"_blank\" rel=\"noreferrer noopener\">Learn more about Fireworks on Microsoft Foundry<\/a>.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/aka.ms\/foundry-custom-models\" target=\"_blank\" rel=\"noreferrer noopener\">Learn how to upload custom weight models<\/a> for inferencing with Fireworks on Foundry.<\/li>\n\n\n\n<li class=\"wp-block-list-item\"><a href=\"https:\/\/aka.ms\/model-mondays\">Join Fireworks on Model Mondays on March 23<\/a> live on YouTube or on demand.<\/li>\n\n\n\n<li 
class=\"wp-block-list-item\">Explore <a href=\"https:\/\/info.microsoft.com\/ww-landing-the-fast-track-to-secure-ai-apps-and-agents.html?lcid=en-us\" target=\"_blank\" rel=\"noreferrer noopener\">The Fast Track to AI Apps and Agents<\/a> for a roadmap to build, deploy, and scale AI-native solutions with Azure.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>We\u2019re announcing the public preview of Fireworks AI on Microsoft Foundry, bringing high\u2011performance open model inference into Azure. This integration reflects Microsoft Foundry\u2019s broader direction: providing a single place where developers can not only run open models efficiently but also customize and operationalize them as part of a complete enterprise\u2011ready AI lifecycle.<\/p>\n","protected":false},"author":76,"featured_media":49869,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ms_queue_id":["aiblog-content-sync"],"ep_exclude_from_search":false,"_classifai_error":"","_classifai_text_to_speech_error":"","_alt_title":"","footnotes":"","msx_community_cta_settings":[]},"categories":[1454],"tags":[2671,3213],"audience":[3055],"content-type":[1465,3272],"product":[1803,3164],"tech-community":[3041],"topic":[],"coauthors":[3166],"class_list":["post-49830","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-ai-machine-learning","tag-ai","tag-azure-ai","audience-developers","content-type-announcements","content-type-news","product-azure-ai","product-microsoft-foundry","review-flag-1-1680286581-825","review-flag-2-1680286581-601","review-flag-5-1680286581-950","review-flag-new-1680286579-546","review-flag-publi-1680286584-566"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.2 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure | 
Microsoft Azure Blog<\/title>\n<meta name=\"description\" content=\"Learn how you can access low latency, high throughput inferencing for open models and performance-optimized deployment of custom models with Fireworks AI on Microsoft Foundry.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure | Microsoft Azure Blog\" \/>\n<meta property=\"og:description\" content=\"Learn how you can access low latency, high throughput inferencing for open models and performance-optimized deployment of custom models with Fireworks AI on Microsoft Foundry.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Azure Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/microsoftazure\" \/>\n<meta property=\"article:published_time\" content=\"2026-03-11T07:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-03-11T18:30:00+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"576\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Yina Arenas\" 
\/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:image\" content=\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg\" \/>\n<meta name=\"twitter:creator\" content=\"@azure\" \/>\n<meta name=\"twitter:site\" content=\"@azure\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Yina Arenas\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/\"},\"author\":[{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/yina-arenas\/\",\"@type\":\"Person\",\"@name\":\"Yina Arenas\"}],\"headline\":\"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to 
Azure\",\"datePublished\":\"2026-03-11T07:00:00+00:00\",\"dateModified\":\"2026-03-11T18:30:00+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/\"},\"wordCount\":973,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg\",\"keywords\":[\"AI\",\"Azure AI\"],\"articleSection\":[\"AI + machine learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/\",\"name\":\"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure | Microsoft Azure 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg\",\"datePublished\":\"2026-03-11T07:00:00+00:00\",\"dateModified\":\"2026-03-11T18:30:00+00:00\",\"description\":\"Learn how you can access low latency, high throughput inferencing for open models and performance-optimized deployment of custom models with Fireworks AI on Microsoft Foundry.\",\"breadcrumb\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg\",\"contentUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg\",\"width\":1024,\"height\":576,\"caption\":\"Text reads \\\"Fireworks A I now on Microsoft 
Foundry.\\\"\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Blog home\",\"item\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"AI + machine learning\",\"item\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/category\/ai-machine-learning\/\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#website\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\",\"name\":\"Microsoft Azure Blog\",\"description\":\"Get the latest Azure news, updates, and announcements from the Azure blog. From product updates to hot topics, hear from the Azure experts.\",\"publisher\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization\",\"name\":\"Microsoft Azure 
Blog\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp\",\"contentUrl\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp\",\"width\":512,\"height\":512,\"caption\":\"Microsoft Azure Blog\"},\"image\":{\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/microsoftazure\",\"https:\/\/x.com\/azure\",\"https:\/\/www.instagram.com\/microsoftdeveloper\/\",\"https:\/\/www.linkedin.com\/company\/16188386\",\"https:\/\/www.youtube.com\/user\/windowsazure\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/person\/83fe4c04c61d5e58d555ba137c01a107\",\"name\":\"Garry Guseltsev\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/secure.gravatar.com\/avatar\/8476ebc2bcbe54e1843bd5cce3ec249bed771194411b3052815d4c5d272128f2?s=96&d=mm&r=g4f09d3e62b774b84289036a84f6a8c1c\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/8476ebc2bcbe54e1843bd5cce3ec249bed771194411b3052815d4c5d272128f2?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/8476ebc2bcbe54e1843bd5cce3ec249bed771194411b3052815d4c5d272128f2?s=96&d=mm&r=g\",\"caption\":\"Garry Guseltsev\"},\"url\":\"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/garryguseltsev\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure | Microsoft Azure Blog","description":"Learn how you can access low latency, high throughput inferencing for open models and performance-optimized deployment of custom models with Fireworks AI on Microsoft Foundry.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/","og_locale":"en_US","og_type":"article","og_title":"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure | Microsoft Azure Blog","og_description":"Learn how you can access low latency, high throughput inferencing for open models and performance-optimized deployment of custom models with Fireworks AI on Microsoft Foundry.","og_url":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/","og_site_name":"Microsoft Azure Blog","article_publisher":"https:\/\/www.facebook.com\/microsoftazure","article_published_time":"2026-03-11T07:00:00+00:00","article_modified_time":"2026-03-11T18:30:00+00:00","og_image":[{"width":1024,"height":576,"url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg","type":"image\/jpeg"}],"author":"Yina Arenas","twitter_card":"summary_large_image","twitter_image":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg","twitter_creator":"@azure","twitter_site":"@azure","twitter_misc":{"Written by":"Yina Arenas","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#article","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/"},"author":[{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/yina-arenas\/","@type":"Person","@name":"Yina Arenas"}],"headline":"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure","datePublished":"2026-03-11T07:00:00+00:00","dateModified":"2026-03-11T18:30:00+00:00","mainEntityOfPage":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/"},"wordCount":973,"commentCount":0,"publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage"},"thumbnailUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg","keywords":["AI","Azure AI"],"articleSection":["AI + machine 
learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/","name":"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure | Microsoft Azure Blog","isPartOf":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage"},"thumbnailUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg","datePublished":"2026-03-11T07:00:00+00:00","dateModified":"2026-03-11T18:30:00+00:00","description":"Learn how you can access low latency, high throughput inferencing for open models and performance-optimized deployment of custom models with Fireworks AI on Microsoft 
Foundry.","breadcrumb":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#primaryimage","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg","contentUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2026\/03\/Azure-Foundry-Fireworks.jpg","width":1024,"height":576,"caption":"Text reads \"Fireworks A I now on Microsoft Foundry.\""},{"@type":"BreadcrumbList","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/introducing-fireworks-ai-on-microsoft-foundry-bringing-high-performance-low-latency-open-model-inference-to-azure\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog home","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/"},{"@type":"ListItem","position":2,"name":"AI + machine learning","item":"https:\/\/azure.microsoft.com\/en-us\/blog\/category\/ai-machine-learning\/"},{"@type":"ListItem","position":3,"name":"Introducing Fireworks AI on Microsoft Foundry: Bringing high performance, low latency open model inference to Azure"}]},{"@type":"WebSite","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#website","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","name":"Microsoft Azure Blog","description":"Get the latest Azure news, updates, and announcements from the Azure blog. 
From product updates to hot topics, hear from the Azure experts.","publisher":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/azure.microsoft.com\/en-us\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#organization","name":"Microsoft Azure Blog","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","contentUrl":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-content\/uploads\/2024\/06\/microsoft_logo.webp","width":512,"height":512,"caption":"Microsoft Azure Blog"},"image":{"@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/microsoftazure","https:\/\/x.com\/azure","https:\/\/www.instagram.com\/microsoftdeveloper\/","https:\/\/www.linkedin.com\/company\/16188386","https:\/\/www.youtube.com\/user\/windowsazure"]},{"@type":"Person","@id":"https:\/\/azure.microsoft.com\/en-us\/blog\/#\/schema\/person\/83fe4c04c61d5e58d555ba137c01a107","name":"Garry Guseltsev","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/8476ebc2bcbe54e1843bd5cce3ec249bed771194411b3052815d4c5d272128f2?s=96&d=mm&r=g4f09d3e62b774b84289036a84f6a8c1c","url":"https:\/\/secure.gravatar.com\/avatar\/8476ebc2bcbe54e1843bd5cce3ec249bed771194411b3052815d4c5d272128f2?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/8476ebc2bcbe54e1843bd5cce3ec249bed771194411b3052815d4c5d272128f2?s=96&d=mm&r=g","caption":"Garry 
Guseltsev"},"url":"https:\/\/azure.microsoft.com\/en-us\/blog\/author\/garryguseltsev\/"}]}},"msxcm_display_generated_audio":false,"msxcm_animated_featured_image":49868,"distributor_meta":false,"distributor_terms":false,"distributor_media":false,"distributor_original_site_name":"Microsoft Azure Blog","distributor_original_site_url":"https:\/\/azure.microsoft.com\/en-us\/blog","push-errors":false,"_links":{"self":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/49830","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/users\/76"}],"replies":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/comments?post=49830"}],"version-history":[{"count":21,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/49830\/revisions"}],"predecessor-version":[{"id":49968,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/posts\/49830\/revisions\/49968"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/media\/49869"}],"wp:attachment":[{"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/media?parent=49830"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/categories?post=49830"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tags?post=49830"},{"taxonomy":"audience","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/audience?post=49830"},{"taxonomy":"content-type","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/content-type?post=49830"},{"
taxonomy":"product","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/product?post=49830"},{"taxonomy":"tech-community","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/tech-community?post=49830"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/topic?post=49830"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/azure.microsoft.com\/en-us\/blog\/wp-json\/wp\/v2\/coauthors?post=49830"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}