
AI Gateway Evolution: How Higress Transformed AI Infrastructure in 2025


Alibaba Cloud’s Higress AI Gateway emerged as critical infrastructure for enterprises deploying AI applications in 2025, marking a significant shift from experimental technology to production-grade necessity. The open-source platform gained momentum after supporting rapid transitions between models like DeepSeek R1 and Qwen2.5-Max, while simultaneously addressing the November 2025 retirement of Kubernetes Ingress NGINX.

What Makes AI Gateway Essential Now

AI gateways serve as intermediaries between applications and large language models, managing authentication, load balancing, observability, and protocol conversion. Higress distinguished itself as the first platform in the industry to publish eight typical AI gateway scenarios drawn from real client deployments. The platform addresses the core challenges enterprises face when implementing AI: model switching flexibility, networked search integration, permission controls, and abuse prevention.

The timing proved critical as organizations needed unified infrastructure to manage multiple LLM providers. Higress supports over 100 commonly used models through unified protocol conversion and provides model-level fallback capabilities when primary services fail. This eliminated the need for custom integrations with each AI vendor, reducing deployment complexity by an estimated 30% based on client feedback from Semir Group.
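The model-level fallback described above can be sketched as a simple client-side pattern: try the primary provider, and route to a backup when it fails. The provider names and call signatures below are illustrative stand-ins, not Higress's actual internals, which implement this at the gateway layer.

```python
# Sketch of model-level fallback as a gateway might apply it.
# Provider names and the call signature are illustrative only.

def call_with_fallback(providers, prompt):
    """Try each (name, callable) provider in order; return the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would match specific error classes
            errors.append((name, exc))
    raise RuntimeError(f"all providers failed: {errors}")

# Stub providers standing in for real LLM backends:
def primary(prompt):
    raise TimeoutError("upstream timeout")

def backup(prompt):
    return f"echo: {prompt}"

used, reply = call_with_fallback(
    [("deepseek-r1", primary), ("qwen2.5-max", backup)], "hello")
print(used, reply)  # the backup provider answers when the primary fails
```

Doing this in the gateway rather than in each application is what removes the per-vendor integration work the article describes.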

Higress Technical Capabilities and Performance

Higress functions as a “triple gateway” architecture combining traffic, microservice, and security gateway capabilities on a single platform. The open-source project introduced several performance optimizations in 2025, including a load balancing algorithm specifically designed for large models that cuts initial token latency by 50%.
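LLM-aware load balancing differs from classic round-robin because request costs vary enormously with prompt and output length. As a minimal illustration of the idea (not Higress's actual algorithm, which is not published in this article), a balancer can route each new request to the backend with the fewest outstanding requests:

```python
# Illustrative least-loaded backend selection for LLM traffic.
# Real LLM-aware balancers also weigh factors such as queue depth and
# KV-cache affinity; this is a minimal sketch, not Higress's algorithm.

class Balancer:
    def __init__(self, backends):
        self.in_flight = {b: 0 for b in backends}

    def pick(self):
        # Choose the backend with the fewest outstanding requests.
        return min(self.in_flight, key=self.in_flight.get)

    def start(self, backend):
        self.in_flight[backend] += 1

    def finish(self, backend):
        self.in_flight[backend] -= 1

lb = Balancer(["vllm-a", "vllm-b"])
first = lb.pick()
lb.start(first)          # first backend is now busy
second = lb.pick()       # routes the next request away from it
```

Keeping slow, long-generation requests from piling up on one backend is one way a balancer can cut time-to-first-token for new requests.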

Key technical features include:

  • Multi-model proxy with unified invocation protocol for 100+ LLM providers
  • Real-time observability for AI traffic with token-level monitoring
  • WebAssembly (Wasm) plugin architecture for extensible functionality
  • Integration with service registries like Nacos for microservices communication
  • Production-validated architecture handling hundreds of thousands of requests per second
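The token-level monitoring bullet above amounts to aggregating per-model token counts from each response. A minimal sketch, using the OpenAI-style `usage` field names as an assumption (the aggregator itself is illustrative, not Higress code):

```python
# Minimal token-level metrics aggregation, the kind of observability
# described above. Field names follow the OpenAI-style usage object;
# the aggregator is an illustrative sketch.

from collections import defaultdict

class TokenMeter:
    def __init__(self):
        self.totals = defaultdict(lambda: {"prompt": 0, "completion": 0})

    def record(self, model, usage):
        self.totals[model]["prompt"] += usage["prompt_tokens"]
        self.totals[model]["completion"] += usage["completion_tokens"]

meter = TokenMeter()
meter.record("qwen2.5-max", {"prompt_tokens": 120, "completion_tokens": 80})
meter.record("qwen2.5-max", {"prompt_tokens": 60, "completion_tokens": 40})
print(meter.totals["qwen2.5-max"])  # {'prompt': 180, 'completion': 120}
```

Per-model totals like these are what make cross-provider cost tracking possible at the gateway rather than in each application.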

The platform launched HiMarket, an open-source subproject based on Alibaba’s internal IdeaLAB AI platform, providing unified management for models, applications, and interfaces. When combined with AgentScope and AgentRun, this creates a complete AI development toolchain for enterprise deployment.

MCP Protocol Integration and Developer Tools

Higress became one of the first platforms to fully support Model Context Protocol (MCP), a standardized communication layer between AI systems and data sources introduced by Anthropic in November 2024. The gateway addresses the common challenge of converting existing APIs to MCP format by open-sourcing a low-code conversion tool and building an MCP marketplace with over 50 high-quality integrations.

In December 2025, Higress’s MCPMarket MCP Server ranked in the global Top 100, with major open-source client agents integrating the platform. Alibaba’s Taotian business unit used Higress to convert internal HSF services into MCP Servers, demonstrating production viability at scale. The MCP architecture enables AI models to access real-time data from platforms like Google Drive, Slack, and GitHub without custom connectors for each service.

The openapi-to-mcp tool allows developers to transform OpenAPI specifications into remote MCP servers automatically, significantly reducing integration time. This capability proved essential for enterprises with existing REST APIs that needed quick AI agent connectivity.
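The core idea behind such a conversion is mechanical: each OpenAPI operation already carries a name, a description, and typed parameters, which map naturally onto an MCP tool descriptor. The hand-rolled sketch below illustrates that mapping; the actual output schema of the openapi-to-mcp tool may differ.

```python
# Hand-rolled sketch of the OpenAPI-operation -> MCP-tool mapping idea.
# The real openapi-to-mcp tool's output may differ; this only shows why
# the conversion can be automated.

def openapi_op_to_mcp_tool(path, method, operation):
    """Map one OpenAPI operation to an MCP-style tool descriptor."""
    params = {
        p["name"]: {"type": p.get("schema", {}).get("type", "string"),
                    "description": p.get("description", "")}
        for p in operation.get("parameters", [])
    }
    return {
        "name": operation.get("operationId", f"{method}_{path}"),
        "description": operation.get("summary", ""),
        "inputSchema": {
            "type": "object",
            "properties": params,
            "required": [p["name"] for p in operation.get("parameters", [])
                         if p.get("required")],
        },
    }

op = {"operationId": "getWeather", "summary": "Current weather by city",
      "parameters": [{"name": "city", "required": True,
                      "schema": {"type": "string"}}]}
tool = openapi_op_to_mcp_tool("/weather", "get", op)
```

Because the mapping is this regular, a low-code tool can batch-convert an entire existing REST surface into AI-callable tools.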

Industry Impact and Enterprise Adoption

Ctrip shared implementation details at China’s Trusted Cloud Conference, demonstrating how Higress solved real challenges when launching large model applications. The travel platform’s case became a reference example for enterprises implementing AI gateways across industries. Ant Group’s SOFA team released SOFA Higress, a version optimized for financial-grade clients requiring enhanced security and compliance.

Sealos, an early adopter, documented migration from Nginx Ingress to Higress on Reddit, reporting nearly 100x performance improvements that garnered international developer attention. Semir Group achieved 30% overall efficiency improvement through unified management of multiple models and MCP servers using Higress Enterprise Edition.

The platform received recognition at the 2025 Wuzhen World Internet Conference with the Outstanding Open Source Community Award, while Alibaba’s Feitian-based AI gateway won one of three major innovation practice awards. Higress also participated in drafting AI gateway industry standards led by China Academy of Information and Communications Technology.

Nginx Ingress Retirement Context

The Kubernetes SIG Network and the Security Response Committee announced the retirement of Ingress NGINX in November 2025, citing unsustainable maintenance costs and security concerns. The widely deployed ingress controller will receive only best-effort maintenance until March 2026, after which no further releases, bug fixes, or security updates will be provided.

The Kubernetes community officially listed Higress among recommended alternatives, alongside cloud provider solutions. Cloud native advocate Jimmy Song characterized the retirement as “a pivotal moment in infrastructure evolution,” noting that maintenance costs permanently exceeded community contribution pace. The extensive attack surface created by Nginx Ingress’s flexibility became a liability when secure updates could no longer be guaranteed.

This created migration urgency for organizations running Kubernetes workloads, particularly those deploying AI applications that require gateway capabilities beyond basic ingress routing. Higress positioned itself as a drop-in replacement, remaining compatible with the Kubernetes Ingress API and supporting many Nginx Ingress controller annotations while adding AI-specific features.
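In practice, a drop-in migration of a standard Ingress resource can amount to switching the ingress class. The manifest below is an illustrative sketch: the annotation shown is a standard Nginx Ingress controller annotation, and you should verify against the Higress documentation which annotations and class name your version honors.

```yaml
# Illustrative migration: an existing Ingress moved to Higress by
# changing the ingress class. Verify annotation support in the
# Higress docs for your version.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: higress   # was: nginx
  rules:
    - host: demo.example.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: demo-svc
                port:
                  number: 80
```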

What’s Next for AI Gateway Infrastructure

Alibaba Cloud launched serverless instances of Higress Enterprise Edition at one-tenth the resource cost of traditional deployments, lowering barriers for small to mid-sized organizations. The HiMarket platform will continue expanding as an open-source project, providing enterprises with ready-to-use AI open platform capabilities.

The Higress community hosted the first AI Gateway Developer Challenge in 2025, with 11 finalist teams competing on AI Agent, RAG (Retrieval-Augmented Generation), and intelligent routing implementations. This developer engagement signals growing ecosystem momentum around standardized AI gateway patterns.

Gateway API support is on the roadmap, enabling smooth migration from Ingress API to the newer Kubernetes standard. The platform’s open-source model combined with commercial support aims to address sustainability challenges that led to Nginx Ingress retirement, providing long-term viability for production deployments.

Organizations planning AI implementations in 2026 should evaluate AI gateway requirements early in architecture design, as unified model management and observability have become baseline expectations rather than advanced features.

Frequently Asked Questions

What is an AI Gateway and why do enterprises need it?

An AI Gateway is infrastructure that sits between applications and large language models, managing authentication, load balancing, protocol conversion, and observability. Enterprises need it to switch between multiple LLM providers without rewriting code, monitor token usage and costs, enforce access controls, and maintain consistent performance across AI services.

How does Higress differ from traditional API gateways?

Higress combines traffic gateway, microservice gateway, and security gateway capabilities with AI-specific features like unified protocol conversion for 100+ LLM providers, 50% reduction in initial token latency through optimized load balancing, MCP protocol support for AI agent tool calling, and AI-focused observability including token-level monitoring and cost tracking.

What should organizations know about Nginx Ingress retirement?

Nginx Ingress will receive only best-effort maintenance until March 2026, after which no security updates or bug fixes will be released. The Kubernetes community cited unsustainable maintenance costs and security concerns as retirement reasons. Organizations must migrate to alternatives like Higress, cloud provider ingress controllers, or Gateway API implementations before the March 2026 deadline.

What is Model Context Protocol (MCP) and how does it work with Higress?

MCP is an open standard introduced by Anthropic in November 2024 that enables AI models to connect with external data sources and tools using a unified protocol. Higress hosts MCP servers and provides an openapi-to-mcp conversion tool that transforms existing REST APIs into MCP-compatible services, allowing AI agents to call various tools without custom integrations for each platform.

Mohammad Kashif
Senior Technology Analyst and Writer at AdwaitX, specializing in the convergence of Mobile Silicon, Generative AI, and Consumer Hardware. Moving beyond spec sheets, his reviews rigorously test "real-world" metrics analyzing sustained battery efficiency, camera sensor behavior, and long-term software support lifecycles. Kashif’s data-driven approach helps enthusiasts and professionals distinguish between genuine innovation and marketing hype, ensuring they invest in devices that offer lasting value.
