Featured
Reports

Scott Gutterman from the PGA TOUR discusses the new Studios and the impact on fan experience

Zeus Kerravala and Scott Gutterman, SVP of Digital and Broadcast Technologies discuss the expansion of the PGA TOUR Studios from […]

Philippe Dore, CMO of BNP Paribas Tennis Tournament talks innovation

April 2025 // Zeus Kerravala from ZK Research interviews Philippe Dore, CMO of the BNP Paribas tennis tournament. Philippe discusses […]

Nathan Howe, VP of Global Innovation at Zscaler talks mobile security

March 2025 // Zeus Kerravala from ZK Research interviews Nathan Howe, VP of Global Innovation at Zscaler, about their new […]

Check out
OUR NEWEST VIDEOS

2026 ZKast #121 - The Death of CCaaS? How Agentic AI & Data Gravity Are Changing Workforce Management

6.9K views 11 hours ago

2026 ZKast #120 - Cisco Live Big Announcements: Zeus Kerravala and Ravit Jain - The Ravit Show

6.5K views June 27, 2026 5:48 pm

2026 ZKast #119 - Move Fast & Break Nothing: How Forward’s Digital Twin Prevents Network Outages

7.9K views June 27, 2026 3:37 pm

Recent
ZK Research Blog

News

Artificial intelligence played a prominent role at this week’s Bio International Convention in San Diego, the largest biotech event with vendors spanning the full ecosystem of companies in this industry.

Today in a special address, Kimberly Powell (pictured), vice president and general manager of healthcare and life sciences at Nvidia Corp., made the case that agentic AI is about to do for biotech what it just did for software — and the company’s BioNeMo is the stack that turns generic large language models into working “AI scientists” that are both faster and cheaper to run.

Nvidia wants to make ‘AI scientists’ mainstream in biotech

Powell opened her presentation by outlining where the industry is now. “We are witnessing the fastest platform shift the life sciences industry has ever seen,” she said. She compared AI to the microscope, X-ray crystallography, and gene sequencing, calling them a new class of scientific instruments. This time, the instrument doesn’t just see or measure; it reasons, plans and acts.

At the event, Nvidia announced its BioNeMo Agent Toolkit, a software stack that turns large language models into domain-specific AI agents capable of executing end-to-end biology and chemistry workflows — from literature review to protein design to lab automation — while optimizing for performance and cost.

From generative to agentic AI for science

Powell’s core thesis is that the life sciences, with a $300 billion annual pharmaceutical budget (global R&D is reaching $3.8 trillion), have quietly been preparing for this inflection for a decade. On one side, there has been an explosion of AI research in biology, chemistry, imaging and genomics. On the other, Nvidia has been building the infrastructure to operationalize that research: GPUs, networking, CUDA-X libraries and domain platforms such as MONAI, Parabricks, cuEquivariance and BioNeMo.

What has changed in the last 12 to 18 months is the emergence of agentic AI, systems in which a large language model “brain” is wrapped in a harness that manages tools, memory, security policies and multistep workflows. Nvidia’s NeMo Curator and NemoClaw framework and open-source harness are generic versions of that pattern; the BioNeMo Agent Toolkit is the life-sciences-optimized edition.

“Agents are becoming the modern application layer in life sciences,” Powell said. “Every single one of the thousands of companies in life sciences is about to become an agent builder.” That’s a very different framing than “just another model.” It says the next application tier in biotech won’t be GUIs and pipelines, but rather networks of specialized agents coordinating work across digital and physical labs.

BioNeMo as the scientific toolbox — tuned for speed and cost

Nvidia’s announcement positions BioNeMo as the science that sits behind those agents. In practice, the BioNeMo Agent Toolkit does three important things for biotech teams:

  • Packages proven life-science models, such as protein folding, molecular docking, generative chemistry, genomics and imaging, into agent-callable tools with clear schemas: what each tool does, what inputs it requires, what outputs to expect and how to troubleshoot.
  • Exposes those capabilities via NIM microservices that can run on-premises, in the public cloud or across hybrid environments, so pharma and biotech can place compute where data and regulatory constraints demand.
  • Optimizes for token efficiency and computational cost, not just raw accuracy, by giving agents access to highly accelerated libraries and models, so they spend fewer tokens and less wall-clock time hunting for the right tool or rerunning failed steps.
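
To make the first bullet concrete, here is a minimal sketch of what an agent-callable tool description could look like. The `ToolSpec` class, its field names and the `fold_protein` entry are all hypothetical and chosen for illustration; they are not taken from the BioNeMo Agent Toolkit, whose actual schema format the announcement did not detail.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical description of one agent-callable tool."""
    name: str
    description: str            # what the tool does
    inputs: dict[str, str]      # required input name -> expected type
    outputs: dict[str, str]     # output name -> expected type
    troubleshooting: list[str] = field(default_factory=list)

    def missing_inputs(self, args: dict) -> list[str]:
        """Return required inputs that the caller did not supply."""
        return [key for key in self.inputs if key not in args]


# Illustrative entry only; not a real BioNeMo tool definition.
fold_protein = ToolSpec(
    name="fold_protein",
    description="Predict a 3D structure from an amino-acid sequence.",
    inputs={"sequence": "str"},
    outputs={"structure_pdb": "str", "confidence": "float"},
    troubleshooting=["If the sequence has non-standard residues, clean it first."],
)

print(fold_protein.missing_inputs({}))                    # ['sequence']
print(fold_protein.missing_inputs({"sequence": "MKT"}))   # []
```

The point of a schema like this is that an agent can check a call before spending compute on it, which is one plausible way the "fewer tokens, fewer reruns" claim could play out.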

Powell specifically addressed the historical cost-performance trade-off. She described BioNeMo’s skills and tools as “the knowhow” that lets agents complete complex workflows with “strong task completion, workflow accuracy, and reduced token expense — that means less compute, more reliable results.” In other words, a BioNeMo-enabled agent doesn’t just produce better science; it does so with fewer LLM calls and more efficient graphics processing unit usage, making cost and performance optimization possible at the same time.

Powell emphasized that BioNeMo is agent-agnostic. The same toolkit can serve agents built on OpenAI, Anthropic, in-house LLMs or Nvidia’s own Nemotron models. That matters for buyers who don’t want their next decade of drug discovery workflows locked to a single model vendor.

What an AI ‘co-scientist’ looks like in practice

To ground this in something beyond architectural diagrams, Powell walked through a protein-binder design workflow targeting MCL1, a protein that helps tumor cells survive. Traditionally, that path, from understanding the target to generating binders, predicting structures, scoring candidates and deciding what to synthesize, takes months of specialized human effort.

A generic agent can attempt that workflow but will burn time and tokens “searching for the right tools, figuring out how to call them and oftentimes completely failing to complete the task.” With BioNeMo, Powell said, a scientist gives a single goal such as “Design a binder for MCL1,” and the agent:

  • Retrieves or predicts the target structure and its binding region.
  • Generates candidate binders using BioNeMo generative models.
  • Folds the target and binder together, then evaluates docking poses using accelerated structural engines.
  • Ranks and returns the top candidates for human review — “all done without human intervention.”
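
The four steps above can be sketched as a simple pipeline. This is a toy illustration under stated assumptions, not Nvidia's implementation: `design_binder` and the three `fake_*` stand-in tools are hypothetical names, and the scores are arbitrary placeholders rather than real docking results.

```python
from typing import Callable


def design_binder(
    target: str,
    get_structure: Callable[[str], str],
    generate_binders: Callable[[str, int], list[str]],
    score_complex: Callable[[str, str], float],
    n_candidates: int = 5,
    top_k: int = 2,
) -> list[tuple[str, float]]:
    """Run the four steps in order and return the top-ranked candidates."""
    structure = get_structure(target)                         # step 1: target structure
    candidates = generate_binders(structure, n_candidates)    # step 2: candidate binders
    scored = [(c, score_complex(structure, c)) for c in candidates]  # step 3: evaluate
    scored.sort(key=lambda pair: pair[1], reverse=True)       # step 4: rank
    return scored[:top_k]


# Stand-in tools so the sketch runs; real tools would call models.
def fake_structure(target: str) -> str:
    return f"structure-of-{target}"


def fake_generate(structure: str, n: int) -> list[str]:
    return [f"binder-{i}" for i in range(n)]


def fake_score(structure: str, binder: str) -> float:
    # Arbitrary deterministic score: a higher index scores higher.
    return float(binder.split("-")[1])


top = design_binder("MCL1", fake_structure, fake_generate, fake_score)
print(top)  # [('binder-4', 4.0), ('binder-3', 3.0)]
```

Passing the tools in as arguments mirrors the agent-agnostic framing: the orchestration stays the same while the models behind each step can be swapped.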

This is the “AI scientist” pattern many startups are pursuing. The key nuance is verification. Panelist Andrew White, co-founder and chief technology officer at Edison Scientific, noted that as agents improve, “the era of humans writing questions and agents taking the test is over. We really do need this kind of lab-in-the-loop.” His takeaway: The true bottleneck is shifting from reasoning about existing literature to running new experiments, which is exactly where closed-loop digital and robotic labs come in.

Why this matters for biotech and pharma

For biotech leaders, the strategic implications are less about any single toolkit and more about the operating model shift Powell and the panelists described:

  • Compression of timelines. Powell argued that agents will “take scientific discovery and shrink the timeframe,” so that work that took years moves to months, and months to days. Josh Meier, CEO of Chai Discovery, gave a concrete example: Antibody design success rates have risen from one in 1,000 to between 10% and 15% in just a few years, driven by improved models and faster iteration.
  • Rising expectations on wet-lab speed. As in-silico design compresses from months to hours of GPU time, lab workflows become the new bottleneck. Meier pointed out that many assays were never optimized for speed because there was no incentive; now, tightening that loop is a competitive necessity.
  • New collaboration patterns. Powell sees pharma shifting from primarily “deep scientific relationships” to partnerships that integrate frontier AI labs, tool providers, and platform companies within closed-loop systems, where every experiment feeds back into proprietary foundation models and agents. Benchling CEO Sajith Wickramasekara echoed this, arguing that electronic lab notebooks are evolving from retrospective records into “systems of action” co-authored by AI.
  • Lowering barriers and de-siloing science. Powell believes tools like BioNeMo will let biologists tap into advanced modeling “in a natural language way, instead of having to get into any type of coding at all,” breaking down silos between disciplines and making modern AI tools accessible to more of the bench.

That last point is worth watching. If AI agents can reliably orchestrate high-end modeling and workflow automation behind a conversational front end, the practical distinction between “computational biologist” and “wet-lab biologist” starts to blur.

Reading the signal for the road ahead

From an industry watcher’s perspective, BIO 2026 is less about Nvidia “entering” life sciences, since it has been here for a decade, and more about standardizing the agentic stack for biotech before others do. The BioNeMo Agent Toolkit turns Nvidia’s existing beachheads, such as MONAI, Parabricks, cuEquivariance and BioNeMo models, into a coherent runtime that any agent harness can plug into, with clear value props for speed, accuracy, and cost.

The open-source angle is also notable. Powell made it explicit that the toolkit is available on GitHub and is designed to work with both open- and closed-frontier models, giving pharma and biotech the option to build their own domain-specific “brains” on top of Nvidia’s toolbox. In a world where IP, data residency and regulator trust are existential concerns, that flexibility will matter.

Powell closed with an ambition that neatly captures Nvidia’s posture: “Agentic AI has revolutionized coding — that’s a done deal. Now this ecosystem is assembling to revolutionize science as we know it.” For biotech leaders, the question is no longer whether AI can help science, she argued, but “does AI have the right instruments to run science?” With the BioNeMo Agent Toolkit, Nvidia is betting that the answer for a growing slice of the industry will be yes.

When Amazon Web Services Inc. held its New York Summit last week, Vice President of Agentic AI Swami Sivasubramanian as usual was the headline act, delivering the opening keynote.

Sivasubramanian made the case to enterprise leaders that the artificial intelligence conversation has moved beyond pilots and productivity hacks into a world where the real advantage lies in compounding momentum across work, security, software delivery and data. For IT pros, that means your architectural decisions over the next 12 to 18 months will determine whether AI agents become a force multiplier or a new source of chaos.

Here are five big ideas from Sivasubramanian’s keynote and what they mean for those responsible for building and operating enterprise technology:

1. From ‘faster search bars’ to compounding agents

Sivasubramanian’s main critique of the first generation of AI assistants is that they never broke out of chat-window gravity. They sit on top of tools, answer a question and then forget. “We gave them chat windows and connected them to our tools,” he said. “They answer one question, and then they forget. The promise was intelligence, but what we got was a slightly faster search bar. Faster search doesn’t compound; it flatlines.”

The alternative he laid out is an agentic model in which every completed task feeds the next. “What you really need is agents that actually change the way you work, not just speed up the steps, but completely eliminate them,” Swami argued. “If humans are still forced to be the orchestration layer, your momentum actually has a ceiling.” In his framing, “every task that their agents complete makes the next one smarter,” creating “compounding momentum” and widening the gap between early adopters and those who wait.

That’s the design center for Amazon Quick, an AI assistant that “states the outcome you want and figures out how to get there across all your systems, all your data and all your context,” powered by a knowledge graph that reasons across people, documents, communications and data lakes. In the live demo, Quick assembled a marketing report by pulling data from Slack, Google Drive and OneDrive in about 20 seconds, work that Sivasubramanian said “would have taken probably hours of actual research” before.

Implications for IT pros: This model assumes your collaboration and data platforms are open to agent access and governed by strong identity and policy controls. The job shifts from choosing yet another assistant to curating an ecosystem where agents can safely traverse silos. Connectors, metadata and policy enforcement become as important as model choice. This is a vastly different role for IT pros, but one that’s critical for companies that succeed with their agentic initiatives.

2. Security: Ending the ‘walled garden vs. wild garden’ tradeoff

On security, Sivasubramanian highlighted a dilemma many chief information security officers will face. On one side, “agents that work inside their own walled garden only see what’s inside their own productivity suite. The moment you need something outside the wall, you are back to being the orchestrator.” On the other side, open tools “do not offer the level of security, compliance and governance that enterprises demand. You traded the walled garden for the wild one.”

“This is a false choice,” he said. “Quick doesn’t ask you to choose. No walls, no copy-and-paste bridges, and every action it takes carries its own governance. Who acted on it, what data they touched, where it went, and whether the policy allowed it.” That theme continues with AWS Continuum, a suite of agent-driven security capabilities spanning penetration testing, threat modeling and code vulnerability assessment. Chet Kapoor, who leads security, observability, search and governance products, described the shift from “telemetry, storage, query and dashboards for humans” to “telemetry to context to reasoning to actions for agents.” Telemetry without context is “noise,” he said; with context, it becomes a “signal” agents can act on.

Customer stories were included to make the stakes concrete. Swami cited GoDaddy using Amazon Quick to eliminate “15,000 hours of manual work annually.” He also highlighted the NBA’s use of Quick to structure 25 years of prospect data into interactive leaderboards and comparisons.

Implications for IT pros: Security operations are headed toward agents taking actions under policy, not analysts staring at dashboards. That raises the importance of policy as code, identity boundaries, least-privilege design, and clear “rails” for where agents can operate. The conversation with the CISO is no longer “Should we use AI?” but “What will we allow AI to do, and under what guardrails?”
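
For readers who have not worked with policy as code, here is a minimal sketch of the idea: a deny-by-default allow-list plus an audit trail that captures who acted, what they touched and whether policy allowed it. The `POLICY` table, `AuditRecord` and `authorize` are hypothetical names for illustration and do not represent any AWS API.

```python
from dataclasses import dataclass

# Hypothetical allow-list: agent name -> set of (action, resource) pairs.
POLICY = {
    "triage-agent": {("read", "security-logs"), ("open", "ticket")},
}


@dataclass(frozen=True)
class AuditRecord:
    actor: str
    action: str
    resource: str
    allowed: bool


def authorize(actor: str, action: str, resource: str, log: list[AuditRecord]) -> bool:
    """Deny by default; record every decision, allowed or not."""
    allowed = (action, resource) in POLICY.get(actor, set())
    log.append(AuditRecord(actor, action, resource, allowed))
    return allowed


audit_log: list[AuditRecord] = []
print(authorize("triage-agent", "read", "security-logs", audit_log))    # True
print(authorize("triage-agent", "delete", "security-logs", audit_log))  # False
print(len(audit_log))  # 2
```

Least privilege falls out of the structure: anything not explicitly listed is refused, and the refusal itself is logged for later review.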

3. Software delivery as a closed loop

If the first wave of generative AI was about coding copilots, this keynote reframed the narrative around end-to-end software delivery loops. “Write it right, ship it fast, keep it modern – not three tools, one continuous loop, always running, always compounding,” he said. That loop is already in production at Amazon Stores, where teams behind the retail experience saw a “median 4.5x improvement in how fast correct code reaches production, with some teams hitting up to 17x,” and “AI-generated code changes landing with 95% accuracy, higher than the human baseline.”

Kiro is the engineering agent that anchors the “write it right” part of the loop. You give it a prompt, and it generates “clear requirements, structured design docs, implementation tasks, and validated tests before a single line of code is generated.” It then uses agents and property-based testing to implement and verify. Swami pointed to fintech startup Dhan, which needed to support more than 170 complex trading indicators. Without agents, it estimated “over a dozen engineers in a period of 12 to 24 months;” with Kiro, “all this was built by a single engineer in just eight weeks.”
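
Property-based testing may be unfamiliar to some readers, so here is a hand-rolled sketch of the technique applied to a simple trading indicator. It is not Kiro's implementation; the function names are my own, and the standard-library random generator stands in for a dedicated library such as Hypothesis, which adds features like input shrinking.

```python
import random


def simple_moving_average(prices: list[float], window: int) -> list[float]:
    """Average of each consecutive `window`-sized slice of prices."""
    if window <= 0 or window > len(prices):
        raise ValueError("window must be between 1 and len(prices)")
    return [
        sum(prices[i:i + window]) / window
        for i in range(len(prices) - window + 1)
    ]


def check_sma_properties(trials: int = 200, seed: int = 0) -> int:
    """Generate random inputs and assert invariants that must always hold."""
    rng = random.Random(seed)
    for _ in range(trials):
        prices = [rng.uniform(1.0, 500.0) for _ in range(rng.randint(1, 30))]
        window = rng.randint(1, len(prices))
        sma = simple_moving_average(prices, window)
        # Property 1: one output per full window.
        assert len(sma) == len(prices) - window + 1
        # Property 2: each average lies within its window's min and max
        # (small tolerance for floating-point rounding).
        for i, value in enumerate(sma):
            chunk = prices[i:i + window]
            assert min(chunk) - 1e-9 <= value <= max(chunk) + 1e-9
    return trials


print(check_sma_properties())  # 200
```

Instead of checking a handful of hand-picked cases, the test states rules that every valid output must satisfy and then throws many random inputs at them, which suits machine-generated code where the edge cases are hard to predict.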

The loop extends into operations. AWS DevOps Agent started as an incident-response companion used by customers like T-Mobile and United Airlines; now AWS is adding release management. It can project production risk from a code change, explore an application as an end user would, score releases, and feed its report “directly to your coding agent to start implementing those fixes automatically.”

On the other side of the loop, AWS Transform moves from one-time modernization projects to “continuous modernization,” performing “continuous state analysis and remediation at machine speed, always watching, always fixing across every code base you own.” AWS says customers have already used Transform to eliminate 1.6 million hours of manual modernization work.

Implications for IT pros: This is an opinionated pipeline: spec, code, test, release, modernize, repeat, with agents in each phase. To benefit, enterprises will need to standardize how they organize their Git repositories, pipelines and quality gates so agents can act safely across services, and make a cultural shift that treats modernization and reliability work as continuous flows, not project-of-the-year initiatives.

4. Southwest Airlines: A playbook for a ‘modern fleet’ of systems

The most compelling customer story came from Lauren Woods, executive vice president and chief information officer at Southwest Airlines. She linked technology choices directly to lessons from Winter Storm Elliott. “It wasn’t our systems that were failing, but they were not designed to keep up with the pace and the level of complexity happening across the operation all at once,” she said. To run like a modern airline, “we need technology that operates like a modern fleet.”

Southwest chose AWS as its preferred cloud partner for a “secure, scalable foundation” and access to innovation. Regarding AI, Woods said she uses Amazon Quick every day, describing a shift from “looking at data after the fact to interacting with it in real time” across fare and revenue analysis and call center behavioral trends. The impact has been faster decisions, closer to the point of action.

For engineering, Southwest scaled Kiro to “more than 2,700 developers, about two-thirds of our engineering organization,” using it for unit test generation, infrastructure as code, and faster onboarding. The Southwest.com platform, which is mission-critical and built on legacy architecture, had a long modernization roadmap. Using Kiro, “our teams have accelerated that modernization significantly, pulling the original timeline in by three years,” Lauren said. “We’re making it easier to build on, evolve and scale as our business changes.”

Implications for IT pros: Southwest is an excellent case study: AI-augmented decision-making across the business, agents embedded in the SDLC at scale, and modernization and transformation running in parallel. It’s also a reminder that the key performance indicator for AI initiatives will increasingly be operational resilience and customer satisfaction, not just developer productivity.

5. Agent platforms: Harness, guardrails and context as first-class primitives

The final act of the keynote shifted from AWS-built agents to the agents that customers will build themselves. Sivasubramanian noted that “the agents that will matter the most are the ones for your business that only you can create,” but many are “stuck between prototype and production” because teams are re-implementing basics: authentication, memory, tool access, security and governance.

Amazon’s answer is AgentCore, which provides “core components to build agents” and includes a managed runtime, built-in identity, session memory, observability, evaluations and access controls. It is designed to work with any agent framework and model. Over the past six months, Swami said, “the number of tasks performed by agents in AgentCore has grown by 15x,” and customers such as PGA TOUR, Nasdaq and Visa are building production agents in weeks instead of months.

Two concepts are important here. First, the harness. Sivasubramanian described the model as the “brain” and the harness as the “body” that provides “state persistence, error recovery, context management, [and] session isolation.” AgentCore Harness can turn a model into an agent in minutes with three application programming interface calls. Second, AgentCore Policies define what agents can and cannot do and are enforced “outside the agent’s code, where the agent can’t bypass it,” including detection of prompt attacks, harmful content, and sensitive data. AWS plans to ingest signals from third-party security providers into that policy layer.
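
To illustrate the brain-and-body split, here is a toy harness that keeps per-session history, retries on transient errors and checks a tool allow-list outside the model's own logic. Everything in it (`Harness`, `PolicyViolation`, `toy_model`) is hypothetical; it is a sketch of the pattern, not the AgentCore API, which I have not reproduced here.

```python
from typing import Callable


class PolicyViolation(Exception):
    """Raised by the harness when a requested tool is not permitted."""


class Harness:
    """Hypothetical harness: per-session state, retries, and an external policy gate."""

    def __init__(self, model: Callable[[str, list[str]], str],
                 allowed_tools: set[str], max_retries: int = 2):
        self.model = model
        self.allowed_tools = allowed_tools
        self.max_retries = max_retries
        self.sessions: dict[str, list[str]] = {}   # session isolation

    def call_tool(self, tool: str) -> str:
        # Enforced in the harness, so model output cannot skip the check.
        if tool not in self.allowed_tools:
            raise PolicyViolation(f"tool '{tool}' is not permitted")
        return f"{tool}: ok"

    def run(self, session_id: str, prompt: str) -> str:
        history = self.sessions.setdefault(session_id, [])   # state persistence
        last_error = None
        for _ in range(self.max_retries + 1):                 # error recovery
            try:
                requested_tool = self.model(prompt, history)
                result = self.call_tool(requested_tool)
                history.append(f"{prompt} -> {result}")
                return result
            except RuntimeError as err:   # transient failures are retried
                last_error = err
        raise RuntimeError(f"gave up after retries: {last_error}")


def toy_model(prompt: str, history: list[str]) -> str:
    # Stand-in for an LLM: picks a tool name based on the prompt.
    return "delete_db" if "drop" in prompt else "search"


harness = Harness(toy_model, allowed_tools={"search"})
print(harness.run("s1", "find outages"))   # search: ok
try:
    harness.run("s1", "drop everything")
except PolicyViolation as err:
    print("blocked:", err)                 # blocked: tool 'delete_db' is not permitted
```

Note that the policy violation is deliberately not retried: the retry loop only catches `RuntimeError`, so a blocked action surfaces immediately instead of being attempted again.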

Underpinning this is context. AWS Context automatically builds a knowledge graph across structured and unstructured data and exposes it to agents at runtime. Swami pointed out that within Amazon, the semantic knowledge store behind Q processes “over 1.8 million requests” per day, mapping business semantics (“escalations” vs. “tickets”) and relationships across systems. In the enterprise, that graph spans public web data via managed search tools, organizational content in S3, SharePoint, Confluence, and Google Drive, and structured data in lakes and warehouses.

Implications for IT pros: This is the AI platform north star: an agent runtime/harness, a policy and guardrail layer outside prompts, and a governed context service — often graph-based — that encodes how your business works. Whether you adopt AWS’ stack or assemble your own, success will come down less to prompt engineering and more to how well you design skills, policies and knowledge graphs that reflect your domain.

Final thoughts

Sivasubramanian’s core point is that agents aren’t a feature toggle but an architectural choice. The advantage goes to organizations that design for compounding momentum across work, security, software delivery and data, rather than to those that simply switch on Amazon Quick, Kiro or DevOps Agent.

For information technology leaders, that means treating agent access, guardrails and context as platform services, embedding AI more deeply in delivery and operations, and copying the Southwest playbook: Start with a high-impact domain, align business and engineering on outcomes, and let agents handle the undifferentiated heavy lifting while your teams focus on domain-specific decisions.

Artificial intelligence dominated headlines and keynotes at every event I’ve attended this year, including the recent Cisco Live 2026. Though the thirst for AI has been insatiable for a couple of years, customer feedback at the event showed that the era of AI curiosity has given way to AI urgency.

Information technology and business leaders are no longer satisfied with conversational chatbots or basic AI scribes that merely summarize meetings or draft text. They want systems that proactively identify and resolve problems across their massive, complex IT estates.

The industry is rapidly moving toward autonomous, agentic artificial intelligence — that is, systems that can observe, reason, plan and execute tasks across distributed environments without human intervention. Yet doing this at an enterprise scale is proving remarkably difficult.

Production-ready agentic AI is something startup Fabrix.ai has been developing for a couple of years. At Cisco Live, I stopped by the AI Village to get an update on the vendor’s progress. At the booth, Fabrix.ai demonstrated a multi-vendor, multi-agent platform designed specifically for enterprise operations that customers can run today.

The underlying crisis forcing the shift to AgenticOps

To understand why what Fabrix.ai is building is important, it’s vital to understand the state of traditional operations. For decades, IT teams have operated in a strictly reactive mode. The typical enterprise uses seven to 10 monitoring tools per department. When an outage or performance degradation occurs, these fragmented point tools unleash a storm of alerts. What follows is the notorious “swivel-chair” choreography: subject matter experts jumping from console to console, interpreting logs, reading dashboards and manually correlating issues across network, security and cloud silos.

This model has not scaled and never will. Traditional AIOps suffered from a critical “last-mile” problem. It was highly effective at generating and clustering alerts, but it still left the actual analysis and manual remediation to humans. This friction strains organizations and stalls digital transformation.

According to data cited by Fabrix.ai, failed IT modernization initiatives drain an astonishing $2.3 trillion annually, and 70% of digital transformation programs fail to deliver their promised outcomes. The industry requires a fundamental evolution from AIOps to agentic operations. IT pros need AI agents that not only alert that a fire has started but also autonomously trace the root cause, assess the blast radius and execute remediation before a human analyst even opens a ticket.

The four debts blocking the agentic control plane

If the value of AgenticOps is so obvious, why hasn’t every enterprise deployed it? The reality is that building a unified control plane capable of steering autonomous agents is an architectural nightmare. In fact, fewer than 5% of enterprises have achieved measurable return on investment from their AI initiatives, and only 13% feel truly ready for AI, according to Cisco’s own Readiness Index.

Enterprise architectures are blocked by four compounding debts:

  1. The hallucination and governance gap: Large language models are inherently nondeterministic. In a marketing or copywriting use case, a minor hallucination is harmless. In network engineering or cybersecurity operations, an autonomous agent making a nondeterministic choice can inadvertently take down a core data center fabric. Without strict operational governance, trust frameworks and guardrails, agents cannot be unleashed in production.
  2. The siloed telemetry problem: Agents are only as good as the data they can parse. Dumping raw, unorganized telemetry data into an LLM context window doesn’t make it smarter; it only accelerates hallucination. Agents do not need more volume; they require structure — a unified semantic data layer that maps relationships, identities and causality across disparate tools.
  3. Context degradation in multi-agent orchestration: Complex enterprise troubleshooting requires multiple specialized agents working in parallel. However, maintaining context purity across these boundaries is incredibly difficult. If a network agent and a security agent act on shared infrastructure using fragmented or contradictory data, the operational context breaks down, leading to erroneous or destructive actions.
  4. The lack of universal connectivity: Autonomous agents are trapped by what they cannot reach. Static API catalogs become obsolete the moment an enterprise updates its stack. True operational intelligence demands dynamic, schema-aware connectivity that can interact directly with devices and software at runtime.

How Fabrix.ai bridges the agentic value gap

Fabrix.ai is tackling these hurdles head-on with a vendor-neutral, full-stack AgentOps platform. Rather than forcing companies to undergo expensive rip-and-replace migrations, Fabrix sits atop an organization’s existing software estate via a unique Robotic Data Automation Fabric (RDAF) layer.

“At the core of the Fabrix platform is its multi-agent, Mythos-ready orchestration and Reasoning Layer, which coordinates specialized digital workers across disciplines. Instead of relying on a single, massive generic model, Fabrix uses domain-aware, AI-engineered hierarchical agents, specifically for ITOps, SecOps and NOC use cases,” explained Shailesh Manjrekar, chief marketing officer for AI strategy.

The platform’s architectural pillars map precisely to the challenges mentioned above:

  • Agentic data federation: Fabrix connects to more than 1,900 enterprise data sources and creates run-time MCP wrappers for any data source. The federation agents perform in-place data discovery and continuously link metrics, logs, traces, topology and CMDB metadata into a single semantic data layer, providing agents with a clear, hallucination-resistant view of operational states.
  • Multi-domain context engine: The engine presents only curated data to agents across domains, preserving tokens with a shared state.
  • FinOps and agent governance: To address trust and cost issues, Fabrix features a granular FinOps and spend management engine. Organizations can enforce individual AI quotas per user, departmental limits, and LLM-specific cost caps. More importantly, it embeds an evaluation and guardrail layer that enforces strict, predictable execution limits.
  • Pre-built digital worker catalog: Rather than forcing enterprises to build agents from scratch, Fabrix offers an out-of-the-box Orchestrator AI Agents Catalog. This catalog includes specialized digital workers such as Root Cause Analysts, SecOps Compliance Monitors, and Auto-Remediation Techs that can be deployed in weeks.
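
The spend-management idea in the FinOps bullet can be sketched in a few lines. This is my own illustration of layered caps, not Fabrix.ai's engine: `SpendGuard`, the cap values and the user, department and model names are all hypothetical.

```python
class SpendGuard:
    """Hypothetical budget check with per-user, per-department, and per-model caps."""

    def __init__(self, user_caps: dict, dept_caps: dict, model_caps: dict):
        self.caps = {"user": user_caps, "dept": dept_caps, "model": model_caps}
        self.spent = {"user": {}, "dept": {}, "model": {}}

    def charge(self, user: str, dept: str, model: str, cost: float) -> bool:
        keys = {"user": user, "dept": dept, "model": model}
        # Check every cap first so a rejected call leaves no partial charge.
        for scope, key in keys.items():
            cap = self.caps[scope].get(key, 0.0)   # unknown keys get no budget
            if self.spent[scope].get(key, 0.0) + cost > cap:
                return False
        for scope, key in keys.items():
            self.spent[scope][key] = self.spent[scope].get(key, 0.0) + cost
        return True


guard = SpendGuard(
    user_caps={"alice": 5.0},
    dept_caps={"netops": 8.0},
    model_caps={"large-llm": 100.0},
)
print(guard.charge("alice", "netops", "large-llm", 4.0))  # True
print(guard.charge("alice", "netops", "large-llm", 2.0))  # False (alice would exceed 5.0)
print(guard.spent["user"]["alice"])                       # 4.0
```

Checking all three scopes before recording anything keeps the ledgers consistent: a call rejected at the user level never shows up against the department or model budget.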

CollabOps: Bringing autonomous agents into the team meeting

One of the most interesting components demonstrated at the event was CollabOps. Most enterprise collaboration tools use AI defensively, primarily as a passive scribe on the sidelines, generating transcripts. Fabrix.ai flips this script by making Voice AI agents active participants in the conversation.

With an ambient listening pattern, a Fabrix digital worker can be invited directly into meeting rooms and channels across Webex, Microsoft Teams, Zoom, and Slack. During a live incident bridge, engineers don’t need to leave the call to query data. They can simply speak to the ambient agent: “Hey Fabrix, check the health of the wireless network in Building C” or “Run an RCA on incident CFX-2026.”

The agent processes the request through the semantic data layer, performs automated root cause analysis, runs safe diagnostic checks and drops the live interactive link directly into the channel chat in real time. Furthermore, Fabrix highlighted that this ambient architecture is extending directly into front-line Cisco Webex Contact Center environments to assist agents with live case reconciliation and sentiment triage.

Certified sovereign AI for Cisco Secure AI Factory

For highly regulated verticals such as healthcare, financial services and the public sector, moving operational data to public cloud LLMs is out of the question due to compliance and data sovereignty constraints.

To address this, Fabrix.ai announced at the show that it has become a certified independent software vendor for the Cisco Secure AI Factory and Unified Edge. For customers seeking a fully sovereign, air-gapped AI infrastructure, Fabrix can deploy its entire AgentOps platform natively on-premises on Cisco AI PODs, using local LLMs/GNNs.

By running locally on GPU-optimized, Cisco-validated compute (including UCS Series and Nexus Dashboard infrastructures), enterprise buyers gain the full power of cross-domain agentic reasoning and real-time cluster observability, with their proprietary telemetry data never leaving their physical control. Fabrix estimates that this on-premises architecture can reduce total cost of ownership by 30% to 40% compared with equivalent public cloud deployments.

Real-world results: Proof in production

The proof, as always, lies in the production metrics. Fabrix showcased several customer case studies across diverse verticals, demonstrating that this architecture has moved beyond the experimentation phase:

  • Telco/service providers: An enterprise customer reduced NOC alert noise by 85% by deploying autonomous 5G RAN agents to isolate faults across more than 500,000 network elements. BizOps agents span OSS/CRM systems, connecting records, contracts, and case data to enable instant, governed decisions.
  • Energy: A Fortune 500 energy company used Fabrix DEXOps (Digital Employee Experience) agents to proactively detect and analyze VPN peer losses and wireless authentication failures. Campus hotspot failures were isolated in under two minutes, reducing combined OT/IT downtime by 35% without a single human ticket being opened.
  • FinTech and SOC: Automated triage of billions of daily financial transactions reduced SOC alert noise by 90% using explainable AI reasoning.

Advice for IT pros: How to turn the ‘autonomy dial’

The transition to agentic operations will fundamentally change the day-to-day realities for IT professionals. For engineers and operational leaders seeking to navigate this shift successfully, I offer the following advice:

  • Stop fighting telemetry volume; demand an ontology: Stop spending budget on adding more disconnected point-monitoring tools that dump raw data into isolated buckets. When evaluating platforms, prioritize data liquidity and semantic layers. Your AI strategy will stall if your agents cannot natively resolve identities across Cisco and non-Cisco tools.
  • Look for an extensible harness, not a closed box: Avoid vendors pushing closed, single-ecosystem agent frameworks. True enterprise environments are complex composites of multiple clouds, legacy software and multi-vendor networks. Look for open control planes that embrace standards such as the Model Context Protocol to orchestrate smoothly across your entire ecosystem.
  • Ease into autonomy with human-in-the-loop controls: You don’t have to hand over the keys to the kingdom on day one. Use platforms with a flexible “autonomy dial.” Start by configuring your agents to operate in an advisory capacity — generating root-cause narratives and drafting runbooks. Once an agent has consistently earned your trust in specific error categories, promote those actions to fully automated remediation.
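
The "autonomy dial" in the last point can be pictured as a small promotion policy. The sketch below is illustrative only and is not Fabrix's implementation: the class name, the thresholds and the error-category labels are all assumptions. It keeps an agent in advisory mode for a given error category until humans have approved enough of its recommendations.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    ADVISORY = "advisory"    # agent drafts root-cause narratives and runbooks only
    AUTOMATED = "automated"  # agent may execute the remediation itself


@dataclass
class AutonomyDial:
    # Hypothetical thresholds; tune them to your own risk tolerance.
    min_reviews: int = 20
    min_approval_rate: float = 0.95
    # Maps error category -> (approved_count, total_reviewed).
    history: dict = field(default_factory=dict)

    def record_review(self, category: str, approved: bool) -> None:
        """Log whether a human accepted the agent's recommendation."""
        approved_count, total = self.history.get(category, (0, 0))
        self.history[category] = (approved_count + int(approved), total + 1)

    def mode_for(self, category: str) -> Mode:
        """Return the current autonomy level for one error category."""
        approved_count, total = self.history.get(category, (0, 0))
        if total >= self.min_reviews and approved_count / total >= self.min_approval_rate:
            return Mode.AUTOMATED
        return Mode.ADVISORY


# Example with deliberately low thresholds and made-up category names.
dial = AutonomyDial(min_reviews=3, min_approval_rate=1.0)
for _ in range(3):
    dial.record_review("wifi-auth-failure", approved=True)
dial.record_review("vpn-peer-loss", approved=False)
print(dial.mode_for("wifi-auth-failure"))  # Mode.AUTOMATED
print(dial.mode_for("vpn-peer-loss"))      # Mode.ADVISORY
```

Because trust is tracked per category, a single rejected recommendation drops that category back to advisory mode without affecting the others.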

The shift from reactive dashboards to proactive, autonomous operations is no longer a futuristic concept. Platforms such as Fabrix.ai demonstrate that with the right data federation and governance models, agentic operations can deliver substantial, measurable efficiency today.

Hewlett Packard Enterprise Co. Chief Executive Antonio Neri opened the company’s annual user conference, Discover, this week in Las Vegas with a manifesto for the AI era. Though all events now discuss the changing role of AI, Neri offered a different perspective on AI through the lens of information technology.

The industry is moving from building IT systems to architecting intelligence. In that shift, the enterprise is no longer just an operator of technology but rather a designer of outcomes, a much different role for the people sitting in the audience.

Neri leaned into the architectural metaphor throughout the keynote, and it worked. AI isn’t a feature or a workload; it’s a system-level transformation. That framing set up five clear takeaways that define where HPE is placing its bets and, more importantly, where enterprise IT is headed.

1. The network is back at the center of everything

If the cloud era abstracted the network, the AI era is making it foundational again. Neri stated, “architecting for AI starts with your network.” This reflects a broader industry shift in which AI workloads, especially large-scale training and distributed inference, are fundamentally network-bound and generate massive traffic. Latency, congestion and east-west traffic patterns now directly affect model performance and cost.

The Juniper acquisition is obviously central to this strategy. HPE is positioning itself as a full-stack networking provider spanning campus, data center, and interconnect. The introduction of AI-optimized switching (such as the QFX series) and routing platforms (MX series) underscores the broader point that AI infrastructure is as much a networking problem as it is a compute problem.

During the keynote, Neri named a customer, Vultr, to reinforce this point. Hyperscale AI environments aren’t just about graphics processing units; they’re about how efficiently you can connect them. In that sense, HPE is betting that Ethernet, paired with software intelligence, can compete with and win against proprietary AI fabric solutions. For enterprise buyers, this reframes network investments from “plumbing” to a “performance multiplier.”

2. ‘Self-driving’ is evolving from a tag line to an operational model

Juniper Networks Inc. has discussed the concept of self-driving networks for years, but this keynote presented a more mature, credible vision. The combination of Aruba Central, Juniper Mist and GreenLake Intelligence points to a unified operational model in which AI doesn’t just monitor but actually acts. Neri emphasized systems capable of “detecting, diagnosing and remediating” issues before users notice them.

This matters because AI infrastructure dramatically increases operational complexity. IT pros need to deal with hybrid environments, distributed inference and agent-driven workflows. Human-in-the-loop IT operations won’t scale.

What’s different now is the integration of generative and agentic AI into operations. GreenLake Intelligence isn’t just correlating telemetry; it’s reasoning across domains and increasingly automating actions. A useful way to think about this: traditional AIOps was about insights. This next phase is about execution. Currently, self-driving applies to the network, but during the analyst Q&A, Neri made it crystal-clear that the intent is for agentic capabilities to span the IT stack.

3. The rise of the agentic enterprise is very real — and very messy

One of the more forward-looking parts of the keynote was Neri’s focus on the “agentic enterprise.” The idea that enterprises will soon manage thousands of AI agents isn’t speculative; it’s already underway. What’s still missing is the control plane.

Neri highlighted the looming problem of agent sprawl. Developers are building agents quickly, often outside centralized IT governance, creating risks to security, data access, and operational consistency. HPE’s response is to position Private Cloud AI as the foundation for governed agent deployment. Additions such as agent registration, identity models, policy enforcement and secure runtimes are intended to bring order to what could otherwise descend into chaos.

The key insight is that managing agents will resemble managing users or applications, but with greater autonomy and higher stakes because business-impacting actions will be automated. For enterprise IT leaders, this should be a wakeup call that AI adoption is no longer just about models. It’s about managing the lifecycle of autonomous systems.

4. Data, and increasingly storage architecture, are the real bottlenecks

Neri made a point that often gets overshadowed by GPU headlines. AI is only as good as the data foundation it rests on. The HPE Alletra Storage MP updates, particularly those focused on unified file and object storage and Nvidia certification, highlight an important trend. Storage is becoming an active participant in AI pipelines, not just a passive repository.

Features such as real-time metadata enrichment and tighter integration with AI frameworks are designed to reduce friction between data and models. That’s critical because one of the biggest delays in enterprise AI projects is data preparation and movement.

An interesting claim was that simplifying data pipelines could significantly shorten time-to-value. Though the exact numbers will vary, the direction is clear: Whoever solves the data problem wins the AI race. This is where HPE’s full-stack story matters. Compute gets the attention, but data architecture determines outcomes.

The importance of data management was underscored in a customer Q&A with analysts. I asked Matt Messick, chief information officer of the Dallas Cowboys, about the importance of bringing data silos together, and he said it’s the top priority he thinks about now and that it’s something the organization must get right if its AI aspirations are to be met.

5. Power is the constraint no one can ignore

Perhaps the most grounded moment in the keynote was the discussion of energy. Neri cited a projected 19-gigawatt power gap in the U.S. by 2028, with data centers consuming an increasing share of that capacity. That’s not a theoretical issue; it’s a hard limit on AI expansion.

During the keynote, HPE played a Siemens Energy video illustrating how AI is both driving demand and helping optimize supply. But the broader point is that infrastructure decisions are now inseparable from energy considerations.

This has several implications:

  • Efficiency becomes a competitive advantage, not just a cost metric
  • Location strategy (where you build and run AI) becomes more constrained
  • Cooling, power delivery and sustainability move into the core architecture conversation

In other words, the future of AI won’t be defined by model breakthroughs alone; it will be defined by who can power them.

Final thoughts

In summary, Neri’s keynote wasn’t about a single product or announcement. It was about positioning HPE as the company that can tie together networking, compute, storage, cloud and operations into a coherent AI architecture.

That’s an ambitious claim, but it aligns with where the market is going. Enterprises don’t need more point solutions; they need integrated systems that can handle the scale and complexity of AI. The architectural framing is the right one. The open question is execution. Because in this new era, being an architect isn’t just about designing the blueprint. It’s about delivering the outcome.

When people talk about artificial intelligence and language, the focus usually defaults to English, a handful of European languages and perhaps a few from Asia. African languages – thousands of them, often tonal, hyper-local and deeply contextual – rarely make the roadmap.

That blind spot is exactly where Nkenne is building a business and a developer platform, and arguably an economic on-ramp for an entire continent. Nkenne, founded by musician-turned-tech founder Michael Odokara-Okigbo, is an African language-learning app and AI translation platform designed to “build the infrastructure for African language learning and translation capacities.” In his words, the goal is to bring African languages “to the 21st century” through speech-to-text, text-to-speech and speech-to-speech translation that preserves tonal, dialectal and proverbial nuance.

That vision just got a meaningful boost from Zoom Communications Inc. Odokara-Okigbo was named among the Top 5 winners in Zoom’s inaugural Solopreneur 50 program, selected from more than 3,000 applicants, and earned $30,000 in no-strings-attached funding. For a solo founder with a hyper-lean team, that’s not prize money – it’s runway.

What the Zoom Solopreneur 50 represents

Zoom’s Solopreneur 50 is a recognition program that highlights one of the most interesting trends in this AI cycle: highly leveraged individual builders using cloud, AI and collaboration tools to do what once required a funded startup and a full team. The program highlights solo founders whose businesses demonstrate originality, performance and real-world impact. The Top 5 receive cash awards to accelerate their missions.

Odokara-Okigbo was candid about what it meant to be singled out from thousands of entrepreneurs: “I was surprised, because there are a lot of entrepreneurs out there, and we just try to put our best foot forward every day,” he said. For him, the win is as much about his team as about himself: “Everyone who has been along on this journey, it’s because of them that we’ve been able to achieve this award.”

The recognition also comes from a platform Nkenne relies on daily. The company runs its global operations on Zoom, using it for international meetings and leaning heavily on AI-generated meeting notes to track progress and stay aligned. Being honored by a tool the company already depends on adds a layer of validation.

From a postponed tour to AI infrastructure

Nkenne didn’t start as a big infrastructure play. It started as a personal gap in the market and a push from family. During the pandemic, Odokara-Okigbo’s music tour across Europe and Africa was postponed, leaving him back home with time and energy he didn’t want to waste.

He had long wanted to learn his native language, Igbo, spoken in southeastern Nigeria, but quickly discovered there were no real tools to help him do it at scale. “I didn’t find any resources that could allow me to,” he explained. His mother’s response was simple and decisive: If you can’t find the resource, build it. Nkenne, named after his mother, was born from that challenge.

What began as an effort “just for wanting to teach African languages” has since expanded into a dual-pronged product approach:

  • A business-to-consumer African language learning app, which also has utility for business-to-business and even government users.
  • An AI-powered African language translation platform focused on speech-to-text, text-to-speech, text-to-text and speech-to-speech.

Today, Nkenne supports 15 African languages, with ambitions to grow to hundreds over the next three to five years. That’s a tiny slice of the total addressable space: Odokara-Okigbo points out that Nigeria alone has more than 500 distinct languages, and there are thousands across the continent. The strategy is deliberate: “We’re doing one language at a time,” with the long-term dream of broad coverage.

Culture, code and product as art

Odokara-Okigbo is not a typical AI founder. He is an accomplished singer-songwriter and AMA winner who describes himself as operating at the “intersection of culture and code.” That mindset shapes how Nkenne is built and where it differentiates itself.

“I find building tech as an artistic endeavor as well,” he said, emphasizing that for Nkenne, design is not a layer applied at the end – it’s core to the experience. He estimates that “how it feels to the customer” accounts for “90% of the battle,” citing Steve Jobs’ approach at Apple as an inspiration. For a language platform that aims to serve both a global diaspora and users in fast-growing African markets, a focus on emotional resonance and user experience is not cosmetic; it’s a growth strategy.

He also sees his dual background as bringing left-brain and right-brain strengths together. Music is intuitive and emotional; AI engineering is analytical. “There’s a creativity between those two that I enjoy utilizing,” he said, noting that he often combines them – including using AI tools to help produce his own tracks faster, without outsourcing the core creative work.

Why African languages are a hard but important AI problem

One reason Nkenne exists is that mainstream AI and translation providers have largely sidestepped African languages. “A lot of Western companies… have not put funds towards advancing African language capacities,” Odokara-Okigbo noted.

The technical challenge is significant. Many African languages are both tonal and “dialectally sensitive.” Small changes in tone can completely change meaning. The word “Nkenne” itself, in Igbo, has six meanings depending on how it is pronounced. That complexity undermines naïve translation approaches and requires models trained on the right data, tuned for tone, dialect and proverb-heavy usage, and evaluated in partnership with native speakers.

Nkenne’s translation platform is designed specifically for that environment. The goal is not just to translate, but to translate with trust – to build a standard that governments, telecommunications companies and enterprises can rely on when deploying services across diverse regions. As Odokara-Okigbo put it, the company wants to be “that infrastructural layer and provide that standard where languages on the continent will no longer be ignored and misrepresented.” Without trust in the translations, “systems do degrade.”

AI, solopreneurs and leverage

The Nkenne story also illustrates how AI is reshaping what a single founder can do. Odokara-Okigbo does not see AI as overhyped; he sees it as a force multiplier.

Tools such as Zoom’s AI features, along with other AI products, have enabled him to streamline operations, respond more quickly to customers, and stay connected with partners and team members across geographies – even as he shuttles between the U.S. and Nigeria. AI is not abstract; it is infrastructure that enables him to operate as a global tech company while remaining, fundamentally, a solo founder with a lean team.

This is the broader promise behind the “solopreneur” label Zoom is seeking to elevate: that one person — with no full-time staff, but equipped with the right mix of AI, collaboration tools, contractors and part-time collaborators — can build something with the reach and impact of a traditional startup.

Putting $30,000 to work

For a large vendor, $30,000 barely registers as a budget line. For Nkenne, it is catalytic capital. Odokara-Okigbo already has specific plans for the funds: Nkenne is in active discussions with telco providers across Africa that want to integrate Nkenne AI into their services. The prize money is being used to “expand to those services to provide more value to those customers” and to “be prepared” to serve telcos as full-fledged providers, not just a niche app.

This is a key inflection point for the company. Telcos are among the primary channels for digital services in many African markets, and becoming a trusted language and translation partner at that layer positions Nkenne as infrastructure, not just an app. The funds effectively help Nkenne scale from consumer learning into a broader B2B and business-to-government play faster than originally planned.

A B2C app with B2B and B2G ambitions

Economically, Nkenne sits at an interesting juncture. On one side is the language-learning app, aimed at individuals – both within Africa and across the diaspora – who want to learn and retain their languages. On the other is a translation platform that already has traction among institutions.

Odokara-Okigbo describes the translation business as “mainly a B2B entity,” with usage expanding into B2G. Nkenne already has a contract with Nigeria’s National Information Technology Development Agency, a federal government agency, to deploy its translation capabilities. As AI-based language infrastructure becomes more critical to digital government and citizen services, that early foothold could be strategic.

He also sees Nkenne as a bridge in both directions: helping global businesses enter African markets and helping African businesses and governments reach Western markets, with language as the enabling layer.

Preservation, power and what’s next

A broader societal thread runs through Nkenne’s roadmap. Odokara-Okigbo rejects the notion that AI is inevitably a force of cultural erasure. He argues that tools like Nkenne can “help a lot of small cultures to elevate themselves” by preserving endangered languages – not only in Africa but also among Indigenous communities in Australia, the U.S. and parts of Europe.

Africa’s demographics add urgency. Nigeria alone has over 250 million people and one of the world’s fastest-growing youth populations. Smartphone penetration is rising, and many people carry more than one phone. Investment in African tech is increasing. In that environment, language is not just cultural – it’s economic power. AI translation and learning tools open new economic corridors and ensure that growth doesn’t come at the cost of linguistic diversity.

Looking ahead three to five years, Odokara-Okigbo is explicit about where he wants Nkenne to be:

  • Supporting “hundreds of languages” across both learning and translation.
  • Offering robust text-to-text, speech-to-text, text-to-speech, and speech-to-speech for African languages.
  • Deepening penetration in government and enterprise as the de facto standard for African language translation.
  • Ensuring African languages are “no longer ignored and misrepresented” in digital systems.

On a lighter note, he even has a target for where Nkenne’s work might show up: When asked when Zoom’s real-time captions should seamlessly translate into all African languages, he shoots for 2028.

For other would-be solopreneurs, his advice is grounded in lived experience as an artist who has had to develop a thick skin: “Let no be your guide. Accept no. No is good as it allows you to go where you need to go.” In this framing, rejection is not a verdict; it is a routing logic.

For Zoom, the Solopreneur 50 showcases what happens when AI and collaboration tools meet conviction and cultural purpose. For Nkenne, it is a validation milestone on a much longer journey to turn a mother’s challenge – “if you can’t find it, build it” – into the language infrastructure of a continent.

Enterprise software architecture has long suffered from what can be called an “integration tax.”

When an organization deploys a communications platform, it rarely stops at the basic functions, such as calling and messaging. To extract operational value, it must overlay data analytics layers, emergency notification systems and context-matching engines. Each addition introduces architectural complexity, data egress liabilities and synchronization latency.

This week, communications provider 8×8 expanded its AI tool suite. Specifically, its “Pulse” conversational intelligence tool and “Resolve” critical notification engine enable a transition: the migration of sophisticated application logic directly into the communications layer.

By embedding conversational data ingestion, AI-enabled orchestration and cross-channel mass notification directly into the core communication routing framework, the architecture eliminates traditional application programming interfaces, third-party data middleware and synchronization delays that have plagued enterprise workflows for a decade.

Defragmenting corporate workflows: The unified layer

To understand the operational impact on the business, one must examine the modern data silos within most enterprises. Customer interactions occur across multiple, disconnected modalities: voice calls, web-based chat widgets, direct text messaging and corporate emails. Typically, capturing insights from these interactions requires exporting recordings or text logs via APIs to a secondary enterprise data platform or a specialized third-party artificial intelligence engine.

This legacy approach introduces three points of friction: engineering overhead to maintain API pipelines, compliance risks from moving sensitive conversational data across multi-vendor cloud boundaries, and information delays. By applying AI to the data natively at ingestion, 8×8 Pulse converts communication from an operational expense into a structured data stream in real time. A manager can query the state of client friction or competitor mentions using natural language and receive contextual answers mapped directly back to the primary audio or text transcript, which is securely stored within the original network container.

The practical result is a reduction in the operational inefficiency often caused by blind spots in workflows. Instead of waiting for weekly post-mortem reports compiled by data analysts, corporate leadership receives immediate, data-driven operational visibility. If a pricing update or a software disruption triggers customer friction, the pattern appears through real-time telemetry rather than through retroactive complaints.

True platform consolidation occurs when application logic resides within the data path. Moving intelligence to the infrastructure level eliminates the data movement overhead that has historically limited real-time analysis.

The bridge to the deskless workforce

The concurrent introduction of 8×8 Resolve addresses an equally persistent operational vulnerability: the digital isolation of the front-line, or “deskless,” workforce. Approximately 70% of the global workforce does not work at a traditional desk or on a laptop. Warehouse operators, logistics drivers, field technicians and healthcare workers frequently work outside the core enterprise application ecosystem and often lack corporate email credentials or access to central collaboration environments.

During an operational crisis, such as a facility hazard, a critical supply chain bottleneck or an unexpected network failure, traditional top-down corporate communications fail. Mass email distributions and corporate chat announcements do not reach workers in the field. 8×8 Resolve circumvents this structural gap by leveraging the network provider’s direct access to external communication channels, specifically SMS, WhatsApp and automated voice services, reaching workers on their personal mobile devices without requiring localized software clients or complex authentication walls.

When combined with native conversational parsing, this infrastructure establishes a closed-loop system. A mass notification sent during a supply chain disruption can ingest natural-language responses from hundreds of field operators. The native intelligence engine identifies patterns in those text strings and categorizes roadblocks, such as a specific terminal blockage, without requiring an administrator to manually read individual messages. Integrating notification (the output) and analysis (the input) within a single architectural layer changes how enterprises manage real-time operations.
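
To make the closed loop concrete, here is a minimal sketch of the inbound half: sorting free-text replies to a mass notification into categories and counting them. The announcement does not describe how 8×8's engine does this, so simple keyword matching stands in for it here, and the categories, keywords and sample replies are all hypothetical.

```python
from collections import Counter

# Hypothetical categories and trigger keywords, for illustration only.
CATEGORY_KEYWORDS = {
    "terminal_blockage": ("terminal", "blocked", "gate closed"),
    "vehicle_issue": ("breakdown", "flat tire", "engine"),
    "weather_delay": ("storm", "flood", "snow"),
}


def categorize(reply: str) -> str:
    """Assign a free-text reply to the first category whose keyword it mentions."""
    text = reply.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "uncategorized"


def summarize(replies: list[str]) -> list[tuple[str, int]]:
    """Count replies per category, most frequent first."""
    return Counter(categorize(reply) for reply in replies).most_common()


# Made-up replies from field operators.
replies = [
    "Terminal 4 is blocked, can't unload",
    "Stuck at terminal 4 gate",
    "Flat tire on the highway",
    "All good here",
]
for category, count in summarize(replies):
    print(f"{category}: {count}")
```

A production system would presumably use a language model rather than keyword lists, but the shape of the loop is the same: one outbound message, many inbound replies, and a ranked summary instead of a pile of individual texts.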

Strategic guidance for IT leaders

For information technology professionals, this shift from modular add-ons to a consolidated platform infrastructure requires a reevaluation of long-term software procurement strategies. IT professionals should consider the following actions:

  • Audit the integration tax: Quantify the actual maintenance, licensing and compliance costs for third-party middleware used solely to extract interaction data from communication tools and feed it into analytical platforms.
  • Reevaluate data boundaries: Assess the security implications of data egress. Processing conversational data within the hosting communication environment minimizes data replication and reduces the corporate attack surface.
  • Address front-line accessibility: Review crisis management and operational dispatch workflows. If notification plans rely heavily on corporate email or desktop clients, accept that a significant share of the workforce will remain isolated during an incident.
  • Prioritize verification mechanisms: When evaluating analytical tools, prioritize systems that provide clear lineage — linking every summarized insight directly back to source transcripts or call records to combat model inaccuracies.

Market implications for 8×8

From a market positioning standpoint, these announcements mark a pivot for 8×8. The unified-communications-as-a-service and contact-center-as-a-service markets have entered a highly commoditized phase. Basic voice routing, video conferencing and text messaging are no longer differentiators; they are baseline utilities. Competitors are locked in an aggressive price war, squeezing margins across the industry.

By embedding advanced application capabilities, specifically AI-enabled behavioral intelligence and strict, compliance-oriented notification pathways, directly into its core network stack, 8×8 is attempting to move up the value chain. It is shifting the purchasing conversation away from “cost per seat for phone lines” toward “operational efficiency and risk mitigation.”

Success will hinge on execution. Enterprise buyers are notoriously cautious about vendor lock-in. If 8×8 can prove that this native approach delivers demonstrably lower latency, superior data security and verifiable cost reductions compared with a best-of-breed multivendor stack, it will establish a defensible market position. However, it must clearly demonstrate to technical buyers that these native applications perform on par with specialized standalone software, or risk being viewed as a generalist utility trying to cover too many disciplines.

Final thoughts

The consolidation of communications infrastructure with localized enterprise applications is long overdue. For businesses, the benefit is reduced complexity: fewer APIs to maintain, lower security risks and a direct path to understanding operational realities across both deskbound and front-line workforces. As software architectures mature, the competitive edge will belong to frameworks that minimize data movement and deliver actionable context directly at the source.

Neocloud provider QumulusAI announced today that it has secured more than $124 million in customer subscriptions for three-year terms with Hyperbolic and another leading artificial intelligence inference platform.

These agreements cover deployments totaling 1,280 Nvidia Corp. Blackwell GPUs, delivered via 160 Lenovo and Supermicro bare-metal servers connected with Cisco Systems Inc. Nexus networking to form high-throughput, low-latency clusters.

A notable share of the value is front-loaded, with nearly $21.9 million in combined upfront customer commitments, providing QumulusAI with working capital. Structurally, these are GPU-as-a-service subscriptions rather than one-off hardware deals, which means predictable recurring revenue for QumulusAI and predictable operating expenses for its customers over the life of the contracts. In market terms, this is a significant win for a vertically integrated AI cloud infrastructure provider that is betting on an inference-centric architecture rather than general-purpose “AI cloud” branding.
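
A quick back-of-the-envelope calculation helps put those figures in context. The sketch below uses only the numbers reported above and one assumption of my own: that capacity is billed around the clock for the full three years. The result is a blended average across both customers and everything bundled into the contracts, not a quoted price.

```python
# Figures reported in the announcement.
TOTAL_CONTRACT_USD = 124_000_000  # "more than $124 million", so treat it as a floor
UPFRONT_USD = 21_900_000          # "nearly $21.9 million"
GPUS = 1_280
SERVERS = 160
TERM_YEARS = 3

# Assumption: GPUs are billed 24 hours a day, 365 days a year.
HOURS_IN_TERM = TERM_YEARS * 365 * 24

gpus_per_server = GPUS // SERVERS
revenue_per_gpu = TOTAL_CONTRACT_USD / GPUS
implied_gpu_hour_usd = revenue_per_gpu / HOURS_IN_TERM
upfront_share = UPFRONT_USD / TOTAL_CONTRACT_USD

print(f"GPUs per server: {gpus_per_server}")
print(f"Contract value per GPU: ${revenue_per_gpu:,.0f}")
print(f"Implied blended rate: ${implied_gpu_hour_usd:.2f} per GPU-hour")
print(f"Upfront share of contract value: {upfront_share:.1%}")
```

Under those assumptions, the deals work out to eight GPUs per server, about $96,875 per GPU over the term, roughly $3.69 per GPU-hour and just under 18% of the value paid upfront.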

QumulusAI has been working to reset the floor on AI infrastructure costs by making GPU-class inference more economical and broadly accessible. The best way to understand that shift is to see how it is redesigning infrastructure around utilization and economics rather than peak-performance benchmarks.

How AI infrastructure providers are cutting inference costs by 20%

Traditional AI stacks are often built on generic reference architectures that assume maxed-out central processing units, large memory footprints and oversized local storage “just in case” workloads need them. For inference, that often means enterprises pay for underutilized resources simply because the blueprint was drawn that way.

QumulusAI is challenging that model with an “inference-first” approach. It tunes CPU core counts, system memory and local storage to match the real behavior of large-scale open-source inference workloads, deep-research agents, automated coding systems and other asynchronous applications that prioritize throughput, latency and cost per token. The company’s deployments around Nvidia Blackwell GPUs are designed so that every component above the GPU is rightsized. Its own analysis indicates this can cut AI inference costs by roughly 20% compared with standard configurations, largely by eliminating waste in CPU and storage provisioning.

From GPU scarcity to GPU efficiency

The first wave of generative AI was defined by GPU scarcity. Whoever secured the most accelerators won. That scarcity mindset led AI providers and large enterprises to hoard GPU capacity and overbuild general-purpose infrastructure, assuming training would be the dominant workload. As the market matures, the constraint is shifting from “can I get GPUs?” to “can I afford to run them continuously?” That’s where efficiency becomes the differentiator.

QumulusAI’s architecture pairs Blackwell GPUs with Lenovo and Supermicro bare-metal systems and Cisco Nexus networking. The real innovation is how tightly it aligns those systems with inference utilization patterns. The net effect is that the same GPU remains in play, but the surrounding infrastructure is no longer a generic, overprovisioned shell — it is an efficient, purpose-built environment designed to maximize useful work per watt and per dollar.

Inference is creating a new class of AI infrastructure

Inference is emerging as a distinct class of AI infrastructure, separate from training, with different design goals and success metrics. Training environments are optimized for short, intense bursts and massive data movement. Inference environments, especially for open-source models, are optimized for sustained, high-volume request traffic, predictable latency and stable economics over multiyear horizons.

QumulusAI’s design choices reflect that reality. It leads with GPU-as-a-service contracts, multiyear subscription terms and a distributed deployment model that brings compute closer to end users rather than concentrating everything in a handful of mega-regions. That combination creates an “inference fabric” where capacity can be added incrementally, and the balance of GPUs, CPUs, memory and storage is tuned to maximize utilization rather than headline TOPS. The result is a new category of infrastructure where success is measured by cost per query and utilization rates, not just peak training performance.

How infrastructure teams can reduce AI operating costs

For operations teams, it’s time to rethink how you approach infrastructure. Treat inference infrastructure as a distinct tier, not an extension of existing training clusters or general-purpose virtualized environments.

Start by profiling actual inference workloads. Collect data on request patterns, concurrency, latency targets and model footprints, and use it to right-size CPU, memory and storage around the GPUs you already plan to deploy. Look for providers and partners that offer inference-specific SKUs or architectures, rather than generic “AI-ready” instances that simply bundle more of everything.

Consider distributed or regional deployments where bringing compute closer to users reduces network overhead and improves utilization, especially for asynchronous or agentic workloads that can be scheduled across multiple sites. Finally, shift the financial conversation from “How many GPUs did we buy?” to “What is our cost per 1,000 inferences, and how can we drive it down by 10% to 20% through better utilization?”
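To make the "cost per 1,000 inferences" question concrete, here is a minimal sketch of how a team might compute it. The `InferenceNode` class, the hourly costs, the throughput and the utilization figures are all hypothetical placeholders rather than QumulusAI data. The second node simply assumes a 20% lower all-in hourly cost at the same throughput.

```python
from dataclasses import dataclass


@dataclass
class InferenceNode:
    """Hypothetical cost and throughput profile for one GPU node."""
    hourly_cost_usd: float         # all-in hourly cost (GPU, CPU, memory, storage)
    peak_requests_per_hour: float  # requests served at 100% utilization
    utilization: float             # fraction of peak actually used, 0 < u <= 1

    def cost_per_1k(self) -> float:
        """Cost in USD to serve 1,000 inferences at the current utilization."""
        if not 0 < self.utilization <= 1:
            raise ValueError("utilization must be in (0, 1]")
        served = self.peak_requests_per_hour * self.utilization
        return self.hourly_cost_usd / served * 1000


# Illustrative numbers only: the rightsized node costs 20% less per hour.
baseline = InferenceNode(hourly_cost_usd=40.0, peak_requests_per_hour=100_000, utilization=0.5)
rightsized = InferenceNode(hourly_cost_usd=32.0, peak_requests_per_hour=100_000, utilization=0.5)

saving = 1 - rightsized.cost_per_1k() / baseline.cost_per_1k()
print(f"baseline:   ${baseline.cost_per_1k():.2f} per 1,000 inferences")
print(f"rightsized: ${rightsized.cost_per_1k():.2f} per 1,000 inferences")
print(f"reduction:  {saving:.0%}")
```

The same function shows the second lever: holding hourly cost constant and raising `utilization` lowers the cost per 1,000 inferences in direct proportion.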

Customers such as Hyperbolic are buying optimized capacity, not just GPUs

One proof point of this shift is how customers are structuring their commitments. Companies such as Hyperbolic, which operate large-scale inference services for open-source models, are signing multiyear agreements not simply to lock in GPU inventory but to secure optimized capacity. GPU clusters, CPU and memory configurations, and network fabrics are co-designed for their specific workloads.

In QumulusAI’s case, that has translated into more than $124 million in three-year agreements and substantial upfront commitments. The value proposition is framed around economics — about a 20% reduction in inference costs relative to standard builds — rather than raw accelerator counts. These customers are voting with their budgets for infrastructure that treats inference as a primary workload.

Final thoughts

What’s interesting about this announcement is not just the size of the agreements but the logic behind them. AI infrastructure is entering a second phase where differentiation comes from utilization and economics, not just raw accelerator counts. The pivot from counting GPUs purchased to measuring efficiency is overdue, and QumulusAI is positioning itself in that gap by wrapping rightsized CPUs, memory and storage around Blackwell GPUs.

For enterprises, the takeaway is that AI infrastructure is no longer a monolithic, once-in-a-decade investment. It’s becoming a modular, workload-specific fabric where the winners will be the teams and providers that treat inference economics as a design constraint rather than an afterthought.

Since Zscaler Inc.’s launch, the company’s mission has been to disrupt traditional access and security with its Zero Trust platform. At its user event, Zenith Live, in Las Vegas, the company made its case for what its next act would look like: becoming the foundational “zero trust for agentic AI” platform.

For enterprises, the keynote by Chief Executive Jay Chaudhry (pictured) highlighted that securing artificial intelligence agents, including their connections, data paths and device footprint, is now a board-level architectural decision, not a bolt-on control, and that this will require a rethinking of security.

Here are my top takeaways from Chaudhry’s day 1 keynote at Zenith Live:

Agentic AI as the new risk plane

Throughout his keynote, Chaudhry framed agentic AI as the next major giga-wave after cloud and mobile, arriving faster and with fundamentally different risk characteristics. He warned that enterprises will soon face “dozens of AI agents for every employee,” each running continuously, spawning other agents and autonomously accessing enterprise systems and data. “Agents don’t take coffee breaks, they don’t sleep and they can create more agents,” he said, underscoring the shift from a human-centric to a machine-centric threat model.

This shift reframes users as just one part of a much larger digital workforce, where agents may hold more privileges than people. Chaudhry argued that governance built for humans — periodic certifications, training and manual approvals — cannot keep pace with agents operating at machine speed and scale. “You can’t rely on policies written for people when machines are making decisions in milliseconds,” he told the audience, making the case for a new control plane grounded in identity, data and application context rather than in networks and IP addresses.

Zero Trust Exchange evolves into an agent fabric

A second major takeaway is that Zscaler is evolving its crown jewel, the Zero Trust Exchange, from a user-to-app fabric into an “agent fabric” that brokers interactions among users, workloads and AI agents. Zscaler’s longstanding thesis holds that the internet can become your corporate network and that applications should be hidden behind a policy-driven exchange. This now extends to AI agents as first-class entities. “We always believed the internet should be your corporate network,” Chaudhry reminded the audience. “Now we must treat every AI agent as an untrusted outsider, just as we do with every user.”

That continuity is strategically important for customers who have already standardized on Zscaler for Zero Trust and Security Service Edge. Rather than standing up a new, parallel “AI security stack,” enterprises can onboard AI agents into the same fabric used today to connect users and applications. The platform can then enforce least-privilege access for agents, hide internal applications from direct exposure, and monitor all interactions for anomalous behavior. This positions Zscaler as a logical extension of the existing architecture, not a disruptive rip-and-replace solely to secure AI.

AI Broker: From AI gateway to policy engine for agents

On the product side, the most notable announcement is the Zscaler AI Broker, designed to sit between AI agents and the systems they access, including MCP-based agents and agent-to-agent interactions. With an integrated Agent Registry, the Broker tracks each agent’s identity, purpose and permitted data and actions, enabling granular policies such as restricting financial agents to specific systems or limiting customer-support agents’ access to personally identifiable information. This moves beyond the first generation of AI gateways, which focused largely on prompt filtering and model routing.

Chaudhry positioned AI Broker as the control plane for an emerging agentic ecosystem rather than another inspection point. “We can’t just watch what agents are doing; we must control what they are allowed to do from the very beginning,” he said. For enterprises experimenting with internal orchestrators and AI frameworks, AI Broker offers a way to centralize governance, contain the blast radius, and demonstrate compliance to regulators by treating agents as highly privileged service accounts with continuous authorization.
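The registry-plus-policy idea described in this section can be illustrated with a small default-deny check. This is a conceptual sketch only: `AgentRecord`, `AgentRegistry`, the agent names and the system and data labels are hypothetical and do not represent Zscaler’s product or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentRecord:
    """Hypothetical registry entry: who the agent is and what it may touch."""
    agent_id: str
    purpose: str
    allowed_systems: frozenset[str] = field(default_factory=frozenset)
    allowed_data_classes: frozenset[str] = field(default_factory=frozenset)


class AgentRegistry:
    """Toy in-memory registry that answers allow/deny questions."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        self._agents[record.agent_id] = record

    def is_allowed(self, agent_id: str, system: str, data_class: str) -> bool:
        """Default deny: unknown agents and unlisted systems or data are refused."""
        record = self._agents.get(agent_id)
        if record is None:
            return False
        return system in record.allowed_systems and data_class in record.allowed_data_classes


registry = AgentRegistry()
registry.register(AgentRecord("finance-bot", "invoice reconciliation",
                              frozenset({"erp"}), frozenset({"financial"})))
registry.register(AgentRecord("support-bot", "customer support",
                              frozenset({"crm"}), frozenset({"ticket"})))

print(registry.is_allowed("finance-bot", "erp", "financial"))  # listed system and data
print(registry.is_allowed("support-bot", "crm", "pii"))        # PII is not on the list
print(registry.is_allowed("unknown-bot", "erp", "financial"))  # unregistered agent
```

The design choice worth noting is that every agent starts with no access at all, so a request succeeds only when the agent is registered and both the system and the data class are explicitly listed.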

Endpoint AI Security: Taming shadow AI on devices

Zscaler also addressed the growing reality that much AI experimentation occurs at the endpoint — through browsers, extensions, local tools and plugins — by introducing Endpoint AI Security. These capabilities extend Zscaler’s reach into AI-related activity on endpoints, detecting and blocking behaviors such as malicious browser extensions acting as agents, unmanaged AI tools accessing sensitive files, and data exfiltration through AI assistants. Rather than becoming a traditional extended-detection-and-response provider, Zscaler is leveraging its existing visibility into encrypted traffic and software-as-a-service usage to correlate that visibility with endpoint-level AI behavior.

The goal is to give organizations a way to rein in “shadow AI” without stifling innovation. As Chaudhry put it, “Your employees will use AI, whether you have a policy or not. The question is, will you have visibility and control?” Endpoint AI Security effectively closes a critical blind spot between the cloud security stack and endpoint agents, providing security teams with a unified view of how AI is used across browsers, devices and SaaS applications.

AI Access Graph and AIGuardian: Turning telemetry into AI governance

Finally, Zscaler introduced AI Access Graph, powered in part by its Symmetry Systems acquisition, to map how identities, data and applications connect across the enterprise. This data-centric graph can answer questions such as which users and agents can access a particular sensitive dataset and what access chain led to a specific AI action. For AI governance, this level of lineage and visibility is increasingly critical, especially as regulators and boards demand proof of who or what interacted with sensitive data and under which policies.

AI Access Graph slots into the broader AI-Guardian initiative, which combines Zscaler’s Zero Trust Everywhere framework, AI Broker, Endpoint AI Security and AI Access Graph, with consulting and integration support from global system integrators. This recognizes that securing agentic AI is as much an operating-model challenge as a technology challenge.

“We see our customers as partners in this transformation,” Chaudhry said. “Our job is not just to provide technology, but to give you a path to adopt AI safely and at scale.” For large enterprises, this ecosystem approach may be the difference between staying stuck in pilot mode and confidently moving AI into production.

Mythos creates a long-term tailwind for Zscaler

Mythos wasn’t addressed at length in the keynote, but I did ask Chaudhry about it during the analyst Q&A. Given the confusion around this, I felt it was worth getting Chaudhry’s thoughts. He explained that Mythos creates a long-term tailwind because it validates the company’s core thesis that eliminating the attack surface matters more than chasing every new vulnerability, especially in an era when frontier models can find and weaponize bugs at machine speed.

The Mythos “SaaSpocalypse” narrative assumes that AI-accelerated vulnerability discovery is existential for SaaS security vendors, but Zscaler’s model is structurally different from that of a typical exposed SaaS app. Its Zero Trust Exchange is designed to hide applications from the public internet, remove public IPs and open ports, and make users and workloads reachable only through identity and policy-driven connections.

As Anthropic’s Project Glasswing and the Claude Mythos leak have already shown, most catastrophic exposures trace back to misconfigured internet-facing services rather than sophisticated exploits. This directly supports Zscaler’s message that “if you are reachable, you are breachable” and that shrinking what’s reachable is the only sustainable response to AI-driven reconnaissance. By being an early Glasswing partner, feeding Mythos with rich telemetry from hundreds of billions of daily transactions, and using it to harden both its own stack and customers’ attack surfaces, Zscaler can turn the same frontier AI that terrifies the market into a differentiator for its Zero Trust Everywhere architecture, reinforcing its relevance as AI makes legacy perimeter and VPN models obsolete.

Final thoughts

For chief information security officers and chief information officers, the key takeaway from Zenith Live is that AI security can no longer be deferred until projects “settle down.” Chaudhry acknowledged that many organizations remain in pilot mode not because they lack AI ideas, but because they don’t trust their ability to govern AI access to sensitive systems and data. By extending zero-trust principles to AI agents and anchoring them in a unified platform, Zscaler aims to give enterprises a credible path to move from experimentation to production with guardrails.

I expect that as AI moves into the mainstream, the secure access service edge and SSE vendors will ride a rising tide that lifts most of them. It’s critical that Zscaler and its peers clearly articulate their agentic AI strategies and how they integrate with or compete against emerging AI security fabrics.

For enterprises, Zenith Live provides a blueprint: Converge user, app and agent connectivity on a single zero-trust fabric; treat AI agents as untrusted yet governable entities; and use data-centric visibility, not network topology, as the foundation of AI governance. In Chaudhry’s words, “This is the kind of moment Zscaler was built for,” and the company is clearly betting that securing the agentic future will define the next decade of cybersecurity.

Though this is the kind of message one would expect from Zscaler’s CEO, the reality is that information technology has continued to grow in complexity, and securing that environment has become an order of magnitude more complicated. Zscaler’s message has always been about shrinking the attack surface and limiting east-west traffic to minimize the blast radius of a breach. With AI coming, and coming fast, those basic principles can make the difference between security teams keeping up with the business and falling behind.

The networking industry loves inflection points. Over the years, we have had many new compute models that require the network to evolve. For as long as I can remember, the holy war between InfiniBand and Ethernet was fought on a relatively simple battlefield: throughput versus ubiquity.

But as artificial intelligence workloads scale from tens of thousands of processors to massive clusters approaching the million-graphics-processing-unit mark, the network is fundamentally changing. It is no longer just a standalone infrastructure layer; it has become the critical backplane of a tightly integrated AI supersystem.

The InfiniBand vs. Ethernet debate has been interesting. Data center engineers have always preferred Ethernet when all else was equal, but all else was not equal: InfiniBand consistently outperformed Ethernet. Over the past couple of years, however, that gap has closed to the point where the performance difference is negligible in most use cases.

Today Arista Networks Inc. made its next move by announcing its Arista 7060XE7 Series of switches (pictured), based on Broadcom Inc.’s Tomahawk 6 silicon. The 1.6Tb portfolio, powered by the Arista 7060XE7 Series, delivers a whopping 100 terabits per second of switching capacity and 224G SerDes technology. However, though the speeds and feeds tend to grab headlines, the real innovation is the architectural pivot toward rack-scale integration, the operationalization of open standards, and what these shifts signal for the enterprise and Tier-2 market segments.

Moving beyond the box: The reality of rack-scale systems

Historically, networking vendors sold switches as discrete, fixed boxes or standalone chassis. If customers needed to scale, they built out traditional leaf-spine topologies. However, the physical constraints of generative AI, specifically power density and extreme thermal demands, have made the individual switch a suboptimal unit of scale.

With the 7060XE7 Series, Arista is leaning heavily into comprehensive, rack-scale system design. This shift is most clearly demonstrated by its specialized liquid-cooled platform, the 7060XE7-64PRS-RV3-L. Optimized for Open Rack v3 or ORv3 specifications, this 2OU system has no internal fans and draws DC power directly from the rack bus bar. It is designed to sit directly within liquid-cooled XPU server environments, matching inlet and outlet fluid dynamics to maximize compute density per kilowatt.

Eliminating internal fans removes a significant share of power overhead. In standard air-cooled environments, power usage effectiveness or PUE overhead can account for 30% to 50% of power just to move air. By shifting to a unified liquid-cooled rack architecture, the operational overhead drops to 5% to 15%. In a world where data center capacity is severely power-constrained, saving that energy isn’t just an environmental victory; it means a customer can redirect that power to run more revenue-generating GPUs.
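A rough calculation shows how much power that shift frees up. The sketch below reads the percentages in the paragraph above as the share of a fixed facility power budget consumed by cooling overhead, which is an interpretive assumption, and the 1,000 kW budget is a hypothetical figure chosen for easy arithmetic.

```python
def it_power_kw(facility_budget_kw: float, overhead_fraction: float) -> float:
    """Power left for IT gear when a fraction of the budget goes to cooling overhead."""
    if not 0 <= overhead_fraction < 1:
        raise ValueError("overhead_fraction must be in [0, 1)")
    return facility_budget_kw * (1 - overhead_fraction)


BUDGET_KW = 1000.0  # hypothetical fixed facility budget

# Overhead ranges taken from the text: 30% to 50% air-cooled, 5% to 15% liquid-cooled.
for label, low, high in [("air-cooled", 0.30, 0.50), ("liquid-cooled", 0.05, 0.15)]:
    best = it_power_kw(BUDGET_KW, low)
    worst = it_power_kw(BUDGET_KW, high)
    print(f"{label}: {worst:.0f} to {best:.0f} kW available for IT load")
```

Under these assumptions, the gap between the two ranges is the capacity that could be redirected to GPUs without raising the facility’s total power draw.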

The economics of linear pluggable optics

One of the most fiercely debated topics in high-performance networking is co-packaged optics versus pluggables. Proponents of CPO argue that moving the optical engine closer to the silicon is the only way to manage power at ultra-high speeds. However, CPO introduces a massive serviceability nightmare: If a single optical lane fails, an entire 100-terabit system could go down.

Arista is doubling down on linear pluggable optics, or LPO, for this 1.6T generation. By leveraging advanced signal-integrity engineering and removing power-hungry DSPs from the optical modules, Arista claims LPO can slash interconnect power consumption by roughly 60%.

This directly affects the total cost of ownership in two ways:

  • Thermal cascading: Lower optic power means the switch runs cooler, the fans spin more slowly, and the mean time between failures of the components improves dramatically.
  • Asset optimization: In large AI clusters, components fail frequently. Pluggable optics preserve operational optionality, allowing operators to replace a single failed port rather than risk a catastrophic rack-level outage.

Every fractional percentage point of network downtime stalls an expensive training job. By engineering a system that pairs the low-power benefits of CPO with the serviceability of pluggables, Arista is protecting utilization rates for the most expensive assets in the data center.

When debating LPO versus CPO, there are pros and cons on both sides, and neither is better in every situation. Customers should do their homework and choose the configuration that works best in their environment.

Demystifying the ‘scale-up’ and ‘scale-out’ architectural moat

To understand where the market is headed, we must look at how Arista is splitting the architectural responsibilities for these platforms between scale-out (the traditional back-end fabric connecting thousands of nodes) and scale-up applications (the tight interconnect inside the compute complex).

A significant piece of news tucked into this launch is Arista’s formal entry into the scale-up domain. For proprietary architectures, scale-up has been dominated by NVLink. However, as the non-Nvidia Corp. ecosystem, consisting of Advanced Micro Devices Inc., Intel Corp. and custom hyperscaler silicon, continues to gain momentum, there is demand for an open, Ethernet-based scale-up architecture.

Arista’s scale-up solutions are co-developed and custom-engineered to match the specific GPU blade characteristics and mechanical layouts of their ecosystem partners. By using Broadcom’s Tomahawk 6 silicon to unlock massive on-chip radix, Arista is providing a unified, open-standards alternative to proprietary compute fabrics.

For scale-out architectures, Arista is pushing the limits of physical tier reduction. By combining its high-density 7060XE7 leaf switches with its deeply buffered 7800 AI Spine chassis, it can build a two-tier network that supports up to 4.5 times more GPUs than standard fixed-box configurations while maintaining a flat, low-latency topology. This architectural flexibility is critical for mitigating the “packet microbursts” that inherently plague AI collective communication patterns.
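The reason a larger spine lets a two-tier design reach more GPUs follows from standard leaf-spine arithmetic. The sketch below uses the generic textbook model, in which every leaf has one uplink to every spine, and the port counts are hypothetical rather than Arista specifications, so it is not meant to reproduce the 4.5 times figure.

```python
def two_tier_endpoints(leaf_ports: int, spine_ports: int, oversubscription: float = 1.0) -> int:
    """Maximum endpoints in a two-tier leaf-spine fabric.

    Each leaf splits its ports into downlinks (to endpoints) and uplinks (one
    per spine) at the given downlink:uplink ratio. Each spine port connects to
    one leaf, so the spine port count caps the number of leaves.
    """
    if leaf_ports < 2 or spine_ports < 1 or oversubscription <= 0:
        raise ValueError("invalid fabric parameters")
    downlinks = int(leaf_ports * oversubscription / (1 + oversubscription))
    max_leaves = spine_ports
    return max_leaves * downlinks


# Hypothetical port counts: a 64-port leaf under a 64-port fixed spine,
# then under a 256-port modular spine chassis.
fixed = two_tier_endpoints(leaf_ports=64, spine_ports=64)
modular = two_tier_endpoints(leaf_ports=64, spine_ports=256)
print(f"fixed spine:   {fixed} endpoints")
print(f"modular spine: {modular} endpoints ({modular / fixed:.1f}x)")
```

In this model the endpoint count grows linearly with spine port count, which is why pairing dense leaves with a high-port-count chassis spine postpones the need for a third tier.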

Reading the enterprise weather vane

Although hyperscalers and frontier AI labs remain the primary consumers of 1.6T bandwidth, we are seeing early signs of a broader market shift. Arista’s historical enterprise footprint has always skewed toward the ultra-high end — financial hedge funds, automotive simulation, biotech research and sovereign government clouds.

These verticals are acting as a leading indicator for mainstream enterprise AI adoption. They aren’t building million-GPU clusters, but their workloads are rapidly scaling to thousands of nodes. These organizations lack the massive internal engineering teams of a Meta Platforms Inc. or a Microsoft Corp.; they cannot build custom network transport protocols from scratch.

This is where software execution matters. Features such as dynamic load balancing, multipath reliable connection fabric resiliency and hardware-level congestion signaling are built directly into Arista EOS. By taking performance optimizations codeveloped with cloud giants and packaging them into validated enterprise designs, Arista is simplifying the operational complexity of deploying high-performance AI fabrics.

Final thoughts: The inevitable march of Ethernet

InfiniBand has been widely adopted in AI networking systems for its performance, and it has also become a bundled, turnkey system. Ethernet is a tried-and-true technology that has stood the test of time. In fact, Bob Metcalfe, the co-inventor of Ethernet, famously stated that “What’s next after Ethernet is Ethernet,” meaning it’s not a static technology but one that continually evolves to meet current challenges.

I do believe InfiniBand will be around for a long time, but it will be used primarily in high-performance environments, while Ethernet is the high-growth networking technology. Customers are increasingly deploying hybrid environments that mix XPU vendors for training and inference, and they want a networking fabric that remains entirely agnostic to the underlying compute silicon.

In networking, open standards have historically won out, and Arista’s 1.6T announcement shows that Ethernet is not merely playing catch-up; it is actively delivering the density, power efficiency and operational software needed to support the next era of infrastructure.

At Cisco Systems Inc.’s annual event, Cisco Live, this week in Las Vegas, it was no surprise that artificial intelligence was the top theme of the show and dominated most of the news and product innovations announced.

Cisco has been successful in riding the AI wave and using it as a growth engine. Over the past year, revenue and profits have grown, and the stock price has doubled. The company has accomplished this by positioning itself as “critical infrastructure for the AI era” and by revamping its entire product line to back that claim.

As customers move from AI pilots to production, the value of the network will continue to grow because AI is inherently a network service. During his Day 1 keynote, Jeetu Patel (pictured), Cisco’s executive vice president and chief product officer, framed the challenge: “Humans click, but agents swarm.” As this shift occurs, spiky, human-led chatbot traffic will give way to consistently higher volumes of network traffic from swarms of autonomous agents running at machine speed, which will stress every part of the technology environment.

Liz Centoni, executive vice president and general manager for applications and customer experience, carried that theme into day two from a different angle: “I’m not here to talk about the future of AI, but what I am here to talk about is the boring problems it’s already fixing for you in your environment.” Her message was that Cisco’s AI story is not just about graphics processing units and glossy demos; it’s about using AI to fix the operational drudgery customers face every day.

Here are five takeaways from Cisco Live that matter for enterprise buyers:

1. Cloud Control is Cisco’s AI-era control plane

If there was one announcement that stood out from the rest in Patel’s keynote, it was Cisco Cloud Control. Patel called it “one of the best pieces of work that I’ve seen a team do,” describing it as “simplicity without losing the sophistication of Cisco.” Cloud Control is an ambitious effort by the company. “Every Cisco product you know will be managed from Cisco Cloud Control, and every new product and acquisition will start there,” he said.

Cloud Control is a unified control plane for all infrastructure domains, with the following benefits:

Cross-domain visibility and action. From a single interface, engineers can view campus and branch networks (Catalyst and Meraki), data center fabrics (Nexus, Application Centric Infrastructure or ACI, Kubernetes and VMware), security controls and collaboration devices. Patel emphasized that every demo on day two, from AI-ready data centers to future-proof workplaces and digital resilience, “will actually originate and start from Cloud Control.”

Agentic operations are the default. Cloud Control is built around AI agents and runbooks rather than static dashboards. In one demo, an AI assistant recognized a mis-cabled switch that created a spanning-tree loop, traced the root cause, proposed a fix and then enabled the operator to promote that remediation to an autonomous action for the next occurrence. Patel described this as “the autonomy dial,” where customers choose when an agent has “earned your trust” before handing over categories of fixes.

Workflows as guardrails. When Patel asked how workflows fit, Anurag Dhingra, senior vice president and general manager of enterprise connectivity and collaboration at Cisco, explained that a workflow is essentially a codified runbook that makes “nondeterministic models behave in a much more deterministic way for predictable outcomes.” If you don’t have a runbook, the agent will generate one from your intent, execute it and save it back to the catalog for future use.

Open harness for agents and data. Patel emphasized that Cisco “can’t be arrogant enough to think that the only technology you’re going to use is Cisco.” Cloud Control was described as a secure agent harness in which Cisco agents, partner agents and customer-built agents run under a common identity and policy system. Thanks to the Splunk-powered Cisco Data Fabric, those agents can reason across network, security, application and AI telemetry in one place.

Strategically, Cisco wants Cloud Control to be the GPS for AI-era information technology operations: continuously ingesting telemetry, reasoning and proposing the next best action across a complex, hybrid estate. I spoke with several customers at Cisco Live, and they all echoed the same sentiment: They have lived with a sprawl of point tools and disjointed consoles for years, and a single, AI-native control plane is highly appealing.

2. Cisco IQ makes the ‘era of guesswork’ obsolete

Day two was kicked off by Centoni and Cisco IQ. Whereas Cloud Control focuses on live operations, Cisco IQ is aimed at lifecycle, risk and support, and it’s arguably the best example of Cisco using AI to turn its own complexity into a customer-facing product.

Centoni began with a reality check: “Spreadsheet after spreadsheet, but you still don’t have confidence in what assets you have in your environment,” she said. “That’s not information, it’s educated guesswork.” In a post-Mythos world, she warned, “human-speed reactive defense is no longer viable.”

Cisco IQ’s value proposition is to wipe out that guesswork:

Real-time, always-current inventory. IQ provides “broad visibility into your Cisco environment, a single, always-current view of all your assets, including hardware, software and cryptographic assets, with no fragmentation or lag.” Centoni urges customers to ask themselves a simple question: Can you answer, right now, “Do you have a complete and accurate inventory of every asset in your environment and its current security status?” If not, Cisco IQ can help illuminate that blind spot.

Post-Mythos vulnerability management. With AI-enabled attacks able to map networks in minutes at machine speed, Centoni framed this as “the Mythos moment” and argued that organizations that don’t move decisively “are not going to get a second chance.” IQ enables customers to see exactly which devices are exposed to a given vulnerability, their lifecycle state, and the most targeted remediation path.

Traction at machine speed. Cisco IQ went GA just weeks before Cisco Live, but Centoni reported that “as of this morning, we have 2,036 customers already onboarded,” far exceeding her own expectation of 800. Customers such as Geodis and GlobalFoundries described IQ as “too much, too fast and too clear,” and then quickly concluded, “This is what it looks like when the system actually knows your environment.”

Support that never starts at zero. Centoni captured the traditional TAC experience in a line every network team recognizes: “You’re spending the first 35 to 40 minutes before a single line of troubleshooting is done on just collecting basic data.” With IQ, she said, “your engineer does not brief TAC. TAC briefs itself.” Topology, config history, prior cases, and logs are pre-populated; Cisco’s AI routes 88% of cases to the right engineer; and resolution starts with context, not interrogation.

Centoni’s closing line is the one many customers will remember: “You’ve always had the map; now you have the GPS. Use it, as the era of guesswork is over.” For buyers, IQ reframes “support” as a data product: a continuously updated view of asset risk and health, integrated with both Cloud Control and Cisco’s human support organization.

3. Cisco Silicon One is the engine of an AI networking supercycle

Patel spent a significant amount of time on silicon and optics, and for good reason. His argument is that AI will drive a massive networking supercycle, and that without rethinking the network from the chip up, enterprises will discover bottlenecks in the places they least expect.

Patel’s diagnosis is that “inference is happening everywhere,” from GPU clusters in data centers to “desk-side computing” with Mac minis hosting hundreds of agents per user. Cisco’s internal study, he said, shows that “on average, every agent generates about 450 more traffic than a human for conducting that same task.” As agents proliferate — his phrase was “trillions of agents that are going to be proliferated everywhere” — the demands on networks will dwarf those of prior waves, such as video.

Cisco’s answer is its Silicon One family, which includes products to scale up, out and across, covering all the critical components needed to build a network that supports an AI factory. Patel noted that Silicon One powers both enterprise gear and hyperscale systems, with different system families (e.g., Cisco 8100 for hyperscalers, Nexus 9300 for enterprises) built on the same silicon. On the campus side, he highlighted the Catalyst 9550 as “the most powerful core switch we’ve ever created,” emphasizing that “if you don’t have a strong backbone, then that becomes the bottleneck of your network” for agentic workloads.

4. Security is being rearchitected for the post-Mythos world

Security was the most urgent thread across both days. Chief Executive Chuck Robbins warned that AI “changes the speed of defense” while “empowering our adversaries at a pace we’ve never seen in our careers.” Patel echoed this view, saying we now live in a “post-Mythos world” where the time from disclosure to exploitation is “a matter of minutes, if not seconds.”

Cisco’s response spans silicon, software and operations:

Live Protect is a bridge, not a crutch. In a live demo, Cisco showed Live Protect on Nexus switches automatically identifying devices affected by a new advisory and deploying shields and runtime compensating controls without reboots or downtime. Tom Gillis, senior vice president and general manager of the infrastructure and security groups, emphasized that “compensating control does not eliminate the need to patch; it’s a bridge between patches,” and Patel underscored that customers still take “about 40 to 45 to patch vulnerabilities.” The design intent is that, over time, customers will “let them go in auto mode” to close exposure windows quickly.

Security is embedded in the fabric. Cisco’s Nexus 9K “smart switches,” powered by Silicon One and on-board data processing units, can host stateful L4 firewall functions directly in the data plane. As was demonstrated, Cloud Control can push a policy that inserts a firewall “between that same AI workload and legacy applications” without external appliances or hairpinned traffic. “No additional appliances, no hairpinning and no complex routing,” Gillis summarized, just “lots more little, tiny firewalls throughout the infrastructure” to contain lateral movement.

Agentic SOC on Splunk. SVP and GM of Splunk, Kamal Hathi, showcased an agentic security operations center in which triage agents discard about 92% of alerts as false positives, guided-response agents orchestrate containment across Cisco and third-party tools, and every agent decision is accompanied by a reasoning log and an evidentiary trail. Patel argued that “in a matter of a few months, a SOC that is not agentic won’t even make any sense.”

5. Cisco wants to help operationalize AI, not just power it

The most important takeaway for business leaders may be that Cisco devoted as much time to AI governance, operations and economics as to raw performance.

A few themes ran through both days of keynotes:

Tokenomics and cost control. In a fireside chat with Advanced Micro Devices Inc. Chief Information Officer Hasmukh Ranjan, Patel walked through a back-of-the-envelope example: If an AI-empowered employee consumes roughly $200 in tokens per week, that’s $10,000 a year. At 40,000 employees, that’s $400 million, and at 90,000 employees, $900 million. Ranjan noted that “that line item never existed a few years back” — and that IT is now “naturally designed to optimize” it. By integrating Splunk and Cloud Control into agent-level observability, Cisco aims to make “tokenomics” a mainstream operational key performance indicator.
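Patel’s back-of-the-envelope math is easy to reproduce. The sketch below is illustrative only: the function name is mine, and the 50-week working year is an assumption (it is what makes $200 a week come out to the $10,000 a year he cited), not a figure from Cisco.

```python
def annual_token_spend(weekly_usd_per_employee, employees, weeks_per_year=50):
    """Estimated yearly token bill in US dollars.

    weeks_per_year=50 is an assumption chosen to match the keynote's
    rounding of $200 a week to $10,000 a year.
    """
    return weekly_usd_per_employee * weeks_per_year * employees


# Headcounts taken from the keynote example.
for headcount in (1, 40_000, 90_000):
    spend = annual_token_spend(200, headcount)
    print(f"{headcount:>6,} employees: ${spend:,.0f} per year")
```

Because the total is linear in both headcount and weekly spend, any reduction in per-employee token consumption flows straight through to the annual line item, which is why the number is worth tracking as a KPI.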

Digital twins and trusted autonomy. Cloud Control’s digital twin capability creates one-for-one virtual copies of networks, “running the same IOS XE version and the exact same configuration as my network.” Operators can describe tests in natural language, have the AI generate scenarios, and validate changes in the twin before deploying them live. Dhingra positioned this as a way for agents to “earn your trust” and for customers to decide, category by category, when they’re ready for full autonomy.

AI-ready workplaces. Cisco tied its AI story to collaboration and the edge of work with new devices (Room Kit Pro G2, Desk Pro G2, Board Pro G3) that ship with Nvidia chipsets and on-device AI agents such as Director and Note Taker. These devices, combined with Cloud Control and the refreshed campus portfolio, aim to create “future-proof workplaces” that are as agent-ready as data centers.

Final thoughts

Patel closed his keynote by reminding the audience how much has changed in two years: “We used to be a collection of products,” he said. “We are now a fully integrated platform, and that’s why we believe that Cisco is the critical infrastructure for the AI era.” Centoni added the operator’s perspective: “Windows open and close,” she said. “The organizations that move decisively inside them get to decide what comes and shape what comes next.”

Cisco has clearly decided what its next act will be. The open question for customers is whether Cloud Control, Cisco IQ and the broader platform deliver enough real-world simplification and trustworthy automation to justify reshaping how they run networks, security and AI operations.

Sovereign cloud discussions have been a core part of artificial intelligence and infrastructure conversations for the past few years and are now critical to communications.

Historically, the communications sector has been a late adopter of technology trends due to the mission-critical nature of its operations. However, given that customer service is one of the “low-hanging fruit” use cases for AI, the sovereign cloud conversation has come to communications.

The sovereign topic in communications has reached an inflection point, especially in Europe, but it’s also a consideration for many U.S. firms. What began as a niche requirement for government agencies and highly regulated industries has become a mainstream procurement criterion. Boards are asking tougher questions about data residency. Regulators are imposing significant fines for cross-border data governance failures. Enterprises are realizing that “our hyperscaler is compliant” doesn’t answer the question their legal team is asking.

The problem isn’t that vendors don’t understand sovereignty requirements. It’s that most communications platforms treat sovereignty as a deployment option rather than an architectural principle. You can’t bolt governance onto a multitenant infrastructure after the fact and call it sovereign. You need to design it in from the beginning — dedicated environments, documented data residency, managed encryption keys and a vendor willing to share accountability with the customer.

Mitel Networks Corp.’s CX platform is one of the first genuine attempts to solve this problem at the product layer rather than just the infrastructure layer. The timing matters because the market is splitting faster than most vendors realize.

The market bifurcation nobody’s talking about

The enterprise communications market has bifurcated into two segments with fundamentally different architectural requirements. On one side are organizations with straightforward compliance needs — well-served by mainstream unified communications-as-a-service platforms. These platforms are good for a large portion of the market, and sovereignty isn’t a meaningful decision factor for them.

On the other side are organizations for which compliance is the architectural requirement. These include financial services firms processing transactions across European markets, healthcare systems managing patient communications under increasingly aggressive data-protection regimes, government agencies facing explicit in-country processing mandates, and hospitality groups operating properties across multiple jurisdictions with guest data residency obligations.

That second segment can’t accept a shared infrastructure model, regardless of how many certifications vendors have accumulated. They need dedicated tenancy, provable data residency and a platform vendor that understands sovereignty isn’t a feature toggle — it’s a governance framework that runs through every layer of the stack.

IDC Europe’s data supports this thesis. Two-thirds of businesses are already adopting hybrid communications solutions, and 60% plan to replace existing platforms to align with evolving compliance requirements. The demand signal is real and accelerating. The challenge is that very few vendors can meet the needs of this segment.

It’s important to note that the decision “to be sovereign or not sovereign” isn’t about fear of the cloud and does not stifle innovation. It’s a business decision. I spoke with Tom Boyle, head of telecom at Sheffield Teaching Hospitals NHS Foundation Trust. “Sovereignty in communications is ultimately about control, reliability and patient safety,” he said. “It’s not about being cloud-avoidant or lacking a desire for innovation — it’s about resilience. In the NHS, we need absolute certainty that a nurse can make a critical crash call, regardless of any macro-environmental issue or geopolitical shift. Maintaining communication sovereignty ensures that our core operational continuity is safeguarded, no matter what happens on the global stage.”

This sentiment has been echoed by many other information technology leaders I have spoken with recently. The world is becoming more uncertain, and having control has never been more important.

Why most platforms can’t get there

Building a sovereign communications platform isn’t primarily a technology problem. The components — private infrastructure, dedicated tenancy, encrypted key management — are well-understood. The hard part is the operations.

Most UCaaS vendors are optimized for multitenant efficiency. Their economics, product development and support models assume shared infrastructure. Offering a truly sovereign deployment requires dedicated environments and a cost structure that doesn’t destroy unit economics. Most vendors won’t do it because their business models don’t support it.

Generic managed service providers can run dedicated infrastructure, but they lack the communications-specific expertise to manage the experience with depth. They’ll keep the platform available. They won’t tell you why call quality degraded during a distributed contact center shift change, how to configure AI-assisted routing to align with a compliance workflow, or what your data governance exposure looks like when an agent uses an unsupported endpoint. That gap between infrastructure management and communications experience management is where most sovereign cloud offerings fail in practice.

Mitel CX is architected for sovereignty

Mitel CX (MCX) was designed to solve the sovereignty problem at the platform level, and its architectural choices are worth examining. First, deployment agnosticism isn’t marketing; it’s structural. MCX runs across private cloud, on-premises and hybrid environments without requiring organizations to surrender control of their infrastructure to access modern capabilities. That matters because sovereignty isn’t just about where data lives. It’s about who controls the environment where processing occurs.

Second, the platform integrates generative AI virtual agents and unified workspace capabilities without requiring a clean-room cloud environment. Most AI-embedded contact center platforms assume you’re running in a hyperscaler environment where the AI tooling can access shared compute resources. MCX brings AI into the workflow within the governance architecture that regulated industries require — not as an afterthought, not as a separate module that breaks the sovereignty model, but as part of the core platform that respects the data residency boundaries the customer has defined.

Third, the tiered governance model, which includes hosted, trusted and sovereign tiers, allows organizations to align their spend with their actual regulatory posture rather than forcing everyone into the most restrictive architecture. A multinational financial services firm with operations across GDPR, U.K. data protection and APAC residency requirements needs full sovereignty. A midmarket healthcare provider with regional compliance obligations may need only enhanced data protection within a dedicated environment. MCX lets customers choose the governance tier that matches their risk profile without replatforming.

That flexibility is uncommon in the contact center market, where most vendors offer a single architecture and expect customers to adapt their compliance strategy to fit it.

The hospitality case study

Mitel’s extension of MCX into hospitality shows why platform-level sovereignty matters in ways infrastructure-only solutions can’t address. Large hotel groups and cruise operators face a structurally complex version of the sovereign cloud problem. They process high volumes of personal guest data across multiple jurisdictions. European properties are subject to the GDPR, while properties in certain APAC markets face local data residency mandates. The guest communications infrastructure, integrated into property management systems, staff mobility tools and in-room experience platforms, is both a regulatory surface and a differentiator.

Most communications vendors pitch hospitality as a vertical use case for a horizontal platform. Mitel built a communications platform specifically for this sector. The property management integrations, in-room communications architecture and staff workflow tooling aren’t generic UCaaS features adapted for hotels — they’re purpose-built capabilities that understand the operational context of running a global hospitality operation. Layering MCX’s sovereign governance capabilities onto that operational depth creates something competitors can’t easily replicate: a contact center and communications platform that delivers AI-enhanced guest experience management within a governance framework that respects per-property data residency requirements.

That’s not a product feature. That’s an institutional capability that would take most vendors years to build.

What I’m still watching

Geographic expansion will determine how much of this market Mitel can capture. MCX and Secure Cloud are live in the U.K. and Europe (Austria, Belgium, France, Germany, Italy and Spain), but the sovereign cloud demand signal is global. APAC markets, in particular, are tightening data residency requirements, and how quickly Mitel extends platform availability across those regions will shape its competitive position.

The multivendor management story also needs clearer articulation. Mitel’s ability to manage heterogeneous communications environments, not just its own stack, is a differentiator, but enterprises with mixed infrastructure are skeptical by default. More proof points for buyers who haven’t seen it in practice would strengthen the positioning.

Then there is execution. Amazon Web Services Inc., Microsoft Corp. and specialized providers are all moving into enterprise communications with sovereignty-aligned offerings. Mitel’s advantage lies in the combination of communications platform depth and governance architecture, but that combination must be visible in the sales conversation rather than buried in product documentation.

Final thoughts

The sovereign cloud conversation has shifted from theory to procurement reality. Enterprises with genuine data governance obligations can no longer accept “secure” as a substitute for “sovereign.” The number of vendors that can deliver both the platform capability and the governance architecture is smaller than the market noise suggests.

Mitel CX represents a serious attempt to solve this problem at the product layer, building sovereignty into the platform design rather than offering it as an infrastructure add-on. The real question is whether Mitel executes well enough to capture a market opportunity that may not remain open indefinitely. That is worth watching closely.

Useful artificial intelligence has arrived, and if Nvidia Chief Executive Jensen Huang is right, it is about to reshape not only data centers but also the structure of the global economy and the tech labor market.

In his GTC Taipei 2026 keynote, Huang laid out his vision for the “age of agents,” agentic AI systems that don’t just answer questions but also observe, reason, plan and act across distributed infrastructure. For enterprise information technology leaders, the key message was that compute is now directly convertible into revenue, and that the architectural choices they make over the next few years will define both competitiveness and cost structure in an AI-saturated world.

Below are five takeaways from Huang’s keynote:

1. ‘Useful AI’ has arrived, and it’s a net job creator

Huang’s first major claim was that the industry has moved beyond experimentation to deliver economic impact. “Today we can say that agentic AI has arrived, that useful AI has arrived,” he told the Taipei audience. He backed that up with GitHub data: Commits have nearly tripled between 2023 and early 2026, even though the number of professional developers has not. In his framing, the roughly 30–40 million software developers are now generating vastly more output thanks to AI copilots.

Crucially, Huang dismissed the idea that AI is a net job destroyer. Pointing to the massive productivity uplift, he argued that the economics are self-reinforcing: if developers can collectively generate “$9 trillion worth of productive work” for $3 trillion in salary, enterprises will want more developers, not fewer. “People talk about AI reducing jobs – complete nonsense,” he said. “If you can hire a software engineer and generate $9 trillion worth of productive work, why wouldn’t you want to hire more software engineers?” For chief information officers and chief technology officers, that frames AI not as a headcount-reduction lever but as a force multiplier for already scarce technical talent. Businesses have lived through technical staffing shortages for decades, and AI can help close that gap.

2. Tokens are now profitable units

The second major takeaway is that the core economic unit of AI has shifted. In Huang’s words, “tokens are now profitable units of revenues.” Once you assume tokens – the slices of model output that power copilots, agents, and generative services – are directly monetizable, the industry logic shifts: every token generated efficiently is incremental revenue, and every watt wasted is foregone profit.

Huang linked this directly to the current supply-demand imbalance in high-end compute. Because AI services can now be priced and measured in tokens, “AI companies want to build a lot more tokens, generate a lot more tokens, build more AI factories, which is the reason why compute demand here in Taiwan has skyrocketed.” He was explicit that data center design is becoming an exercise in financial engineering: “If you have 1 gigawatt of power, then throughput per watt is revenues, because every token is profitable, every token is revenues.”

For cloud providers and enterprises building their own clusters, the implication is to choose architectures that maximize tokens per watt and minimize time-to-first-token, or risk being permanently behind on unit economics. That may seem overly dramatic but in AI, a little behind will lead to a widening gap over time.
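Huang’s “throughput per watt is revenues” line reduces to simple arithmetic, sketched below. Every number in it is a hypothetical placeholder: the tokens-per-joule efficiencies and the price per million tokens are made up for illustration and are not Nvidia or market data. The sketch also assumes the full power budget goes to token generation, ignoring cooling and other overhead.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def annual_token_revenue(power_watts, tokens_per_joule, usd_per_million_tokens):
    """Revenue from running a fixed power budget flat out for one year.

    One watt is one joule per second, so
    watts * tokens-per-joule = tokens per second.
    """
    tokens_per_second = power_watts * tokens_per_joule
    tokens_per_year = tokens_per_second * SECONDS_PER_YEAR
    return tokens_per_year / 1_000_000 * usd_per_million_tokens


POWER_WATTS = 1e9   # the 1-gigawatt site from the keynote
PRICE = 1.0         # hypothetical: US dollars per million tokens

# Two hypothetical architectures, one 20% more efficient than the other.
baseline = annual_token_revenue(POWER_WATTS, 1.0, PRICE)
improved = annual_token_revenue(POWER_WATTS, 1.2, PRICE)
print(f"Revenue ratio at the same power budget: {improved / baseline:.2f}x")
```

With power fixed, revenue scales one-for-one with efficiency, so under these assumptions a 20% gain in tokens per joule is a 20% gain in revenue from the same site.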

3. Agentic AI is the new application model

Huang spent a significant portion of the keynote defining what he means by an “agent” and why that matters more than traditional apps. In the old world, you had code running in an application on an operating system. In the new world, “it is an agent, which consists of a large language model or many sitting inside a harness, and that harness orchestrates it to do productive work.”

That harness manages the lifecycle of work. It understands the user’s intent, observes context, reasons, plans, calls tools, and juggles working memory with long-term memory, whether those tools are spreadsheets, compilers, databases, or CUDA-accelerated libraries. Huang likened it to a person in a workshop: “You can think of the model as the brain, the harness as the body, and the tools it uses working in a runtime. Think of it as a workshop.”
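To make the brain, body and workshop analogy concrete, here is a deliberately tiny sketch of the pattern. It is not Nvidia’s design or any vendor’s API: the stub_model function stands in for a large language model, and the tool names and the run_agent loop are hypothetical, chosen only to show how a harness asks the model what to do, runs the chosen tool and keeps the result in working memory.

```python
from typing import Callable, Dict, List, Tuple

# The "workshop": tools the harness is allowed to call (hypothetical examples).
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda args: str(sum(int(x) for x in args.split())),
    "upper": lambda args: args.upper(),
}


def stub_model(goal: str, memory: List[str]) -> Tuple[str, str]:
    """Stand-in for the model (the "brain"): returns (action, argument).

    A real model would reason over the goal and memory. This stub picks
    the tool named by the goal's first word, then finishes with whatever
    the tool returned.
    """
    if not memory:
        tool, _, args = goal.partition(" ")
        return tool, args
    return "finish", memory[-1]


def run_agent(goal: str, max_steps: int = 5) -> str:
    """The harness (the "body"): loop of decide, act, remember."""
    memory: List[str] = []  # working memory for this one task
    for _ in range(max_steps):
        action, arg = stub_model(goal, memory)
        if action == "finish":
            return arg
        if action not in TOOLS:
            return f"unknown tool: {action}"
        memory.append(TOOLS[action](arg))
    return "step limit reached"


print(run_agent("add 2 3 4"))
```

The step limit and the unknown-tool check live in the harness rather than the model, which reflects the division of labor Huang described: the model proposes, while the harness decides what actually runs.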

This represents a fundamentally disaggregated and distributed computing pattern in which different stages of an agent’s loop activate distinct parts of the data center, such as graphics processing units for thinking, central processing units for tools, data processing units for security, storage for memory, and fabric for orchestration. For enterprises, the shift is not just adopting large language model application programming interfaces but redesigning systems, workflows and even org charts around agents that can own entire business processes end-to-end.

4. AI factories and DSX: Nvidia as an AI infrastructure company

If agentic AI is the new workload, the new unit of infrastructure is the “AI factory.” Huang described AI factories as the largest infrastructure buildout in human history, with single sites heading toward 1 gigawatt and capital costs “at $50 billion to $60 billion, and soon it will be $80 billion to $100 billion per gigawatt.” These facilities must “work the first time, and it must work right away,” because any delay is extremely expensive idle capital.

To address that, Nvidia is pushing DSX, a full-stack blueprint for designing and operating AI factories, spanning simulation in Omniverse (DSX SIM), runtime operations (DSX OS), and power optimization (DSX Max LPS and DSX Flex). The idea is to co-design chips, racks, networking, power, cooling and grid interactions as a single system, then validate it in a digital twin before “a single rack lands.”

Huang was clear that this marks another transformation for Nvidia. “A long time ago, Nvidia used to be a GPU company, but over the years, we’ve evolved to become a systems company,” he said. Now, “Nvidia has really started to transform ourselves yet again” into an AI infrastructure company that helps customers build entire AI factories, not just buy servers. For hyperscalers, telcos, and a growing tier of regional clouds, this positions Nvidia as a strategic partner across the technical and economic architecture of AI.

5. Vera Rubin and Vera CPU: Hardware built for the agentic loop

Finally, Huang introduced Vera Rubin and Vera CPUs as hardware platforms purpose-built for the agentic era. Vera Rubin is not a single GPU; it is a multi-rack, pod-scale system that integrates next-generation GPUs, Vera CPUs, BlueField DPUs, Groq LPUs, NVLink 72, and Spectrum-X into a cable-free rack design to maximize throughput, reliability, and assembly speed. “Vera Rubin is the most ambitious endeavor in the history of our company,” he said, noting that what used to take two hours to assemble in Grace Blackwell racks now takes about five minutes.

On the CPU side, Vera is pitched as “the CPU for agents,” featuring a monolithic 88-core design, high instructions per clock, extreme per-core bandwidth, LPDDR5X memory, and fabric bandwidth built to remove CPU bottlenecks that limit GPU utilization. Historically, CPUs were built “for humans,” rented by the core and measured in seconds; agents, Huang argued, “live in a world that’s in nanoseconds” and are impatient with tool calls and database access. The four design pillars he highlighted were single-thread performance, bandwidth per core, total bandwidth, and energy efficiency – the last being critical to pack more CPU into a fixed power envelope without stealing watts from token generation.

Huang summarized the new division of labor: “The CPU is now the conductor, and the GPU is the orchestra.” For enterprises, that translates into a new optimization problem: design systems in which CPUs, GPUs, DPUs, and storage are co-tuned to agents’ latency and throughput needs, rather than treating CPUs as general-purpose workhorses and GPUs as isolated accelerators.

Final thoughts

Jensen Huang’s keynotes have become must-see TV, regardless of the time zone. His message from Taipei is that the AI story has moved beyond proofs of concept and into production. This shift moves the narrative from bits and bytes to a discussion of economics, architecture and operational discipline. Useful AI is here. It is creating more work for more people, and the winners will be those who can translate tokens, watts and racks into durable business advantage. It’s time to recognize that the risks of not hopping on the AI train far outweigh the risks of moving too fast.
