EM360 Enterprise Management 360

Podcast

Scaling AI in Digital Commerce Effectively

Vespa AI on the architectural shifts, personalization, and migration paths required to make AI-native search work in digital commerce.

Intro

AI is reshaping digital commerce, but most teams are still running search, ranking, and personalization on infrastructure designed for a different era. The ambition to deliver AI-driven experiences keeps colliding with fragmented architectures, batch pipelines, and bolted-on vector stores.

In this episode, Jorgen Oberman and Piotr Kubziakowski of Vespa AI explain what actually breaks at scale and how technical leaders can move toward an AI-native platform without ripping everything out at once.

Where Digital Commerce Teams Get Stuck

Three problem areas keep showing up across e-commerce organizations:

  • Operational drag from fragmented, microservice-heavy architectures with 90–180 day delivery times for small changes
  • Customer experience limits — current stacks cannot deliver the personalized, AI-driven search users now expect
  • Slow business iteration — campaigns and ranking changes take weeks of data-science work instead of minutes

The Architectural Bottleneck

Lucene-based stacks tend to hit a ceiling the moment vector operations enter the picture. In one customer migration, throughput jumped from roughly four queries per second to roughly four thousand on the same hardware envelope, a thousandfold increase (three orders of magnitude), simply by moving vectors and tensors into the same engine as search and ranking.

The deeper problem is distribution. When search, ranking, recommendation, feature store, and inference all live in separate systems, every request becomes a chain of network calls, the same catalog gets replicated multiple times, and updates ripple through every API.
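To make the cost of chained calls concrete, here is a minimal back-of-the-envelope sketch. It assumes, purely for illustration, that each of the five systems named above is called once per request and that their latencies are independent; neither assumption comes from the episode, and the function name is hypothetical.

```python
def prob_any_call_slow(num_calls: int, percentile: float = 0.99) -> float:
    """Probability that at least one of `num_calls` independent calls
    lands above its own latency percentile (e.g. slower than its p99)."""
    if num_calls < 0 or not 0.0 <= percentile <= 1.0:
        raise ValueError("num_calls must be >= 0 and percentile within [0, 1]")
    # Each call stays under its percentile with probability `percentile`;
    # all of them do so with probability percentile ** num_calls.
    return 1.0 - percentile ** num_calls


# Hypothetical request path: one hop per system listed in the text.
hops = ["search", "ranking", "recommendation", "feature store", "inference"]
print(f"{prob_any_call_slow(len(hops)):.3f}")  # 0.049
```

Under these assumptions, roughly 5% of requests would hit at least one hop running slower than its own p99, compared with 1% for a single system. Real services are rarely independent, so treat the number as an intuition aid rather than a prediction.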

Designing an AI-Native Search Platform

The core principle: put processing where the data is.

  • Combine lexical, semantic, and signal-based retrieval in one engine
  • Treat ranking as a multi-phase compute layer that can mix business logic, personalization, and ML inference
  • Use tensors — not just vectors — to represent users across multiple categories and preferences
  • Run model inference (gradient-boosted trees, ONNX models) next to the data instead of round-tripping to a separate service
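The idea of ranking as a multi-phase compute layer can be sketched in a few lines. This is an illustrative in-memory model, not Vespa's API: the `Product` fields, the 50/50 score blend, and the `rerank_count` parameter are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Product:
    id: str
    lexical: float    # assumed precomputed text-match score
    semantic: float   # assumed precomputed embedding similarity
    category_weights: Dict[str, float] = field(default_factory=dict)
    business_boost: float = 0.0  # e.g. a campaign or margin adjustment


def first_phase(p: Product) -> float:
    """Cheap score evaluated for every retrieved candidate."""
    return 0.5 * p.lexical + 0.5 * p.semantic


def second_phase(p: Product, user_profile: Dict[str, float]) -> float:
    """Costlier score: adds user affinity and business logic."""
    affinity = sum(user_profile.get(c, 0.0) * w
                   for c, w in p.category_weights.items())
    return first_phase(p) + affinity + p.business_boost


def rank(products: List[Product], user_profile: Dict[str, float],
         rerank_count: int = 2) -> List[str]:
    """Order by first phase, then re-rank only the top `rerank_count`."""
    candidates = sorted(products, key=first_phase, reverse=True)
    head = sorted(candidates[:rerank_count],
                  key=lambda p: second_phase(p, user_profile), reverse=True)
    return [p.id for p in head + candidates[rerank_count:]]
```

With a user profile of `{"running": 0.5}`, a running shoe that scores slightly lower on text match can overtake a hiking boot in the second phase, while candidates outside the top `rerank_count` keep their first-phase order and never pay the cost of the expensive score.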

Real-Time Personalization

Static segmentation and nightly batches cannot keep up with live user behavior. With partial updates at the field level, and even at the level of individual tensor cells, personalization profiles can be refreshed continuously and applied as a reordering stage on top of text or semantic results. Users see what they searched for, ranked by what they actually like.
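A cell-level update can be pictured as follows. This is a plain in-memory sketch, not a real engine's update API: the sparse profile keyed by (category, preference), the `update_cell` helper, and the exponential-smoothing rule are assumptions chosen for illustration.

```python
from typing import Dict, Tuple

# One tensor cell is addressed by (category, preference), e.g. ("shoes", "trail").
Cell = Tuple[str, str]
Profile = Dict[Cell, float]


def update_cell(profile: Profile, cell: Cell, signal: float,
                rate: float = 0.5) -> float:
    """Blend a fresh behavioral signal into a single cell.

    Only the addressed cell is written; every other cell in the profile
    is left untouched, which is the point of a partial update.
    """
    old = profile.get(cell, 0.0)  # cells never seen before start at 0.0
    profile[cell] = (1.0 - rate) * old + rate * signal
    return profile[cell]


profile: Profile = {("shoes", "trail"): 0.2, ("jackets", "waterproof"): 0.8}
update_cell(profile, ("shoes", "trail"), signal=1.0)  # user clicked a trail shoe
```

After the call, the ("shoes", "trail") cell moves from 0.2 toward 1.0 while the jacket preference stays exactly as it was, so the next query can be reordered with the refreshed profile without waiting for a nightly batch.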

The Cost of Running Everything Separately

  • Data inconsistency across replicated catalogs
  • Tail latency (P95/P99) blowing out from cross-system calls at thousands of QPS
  • Platform complexity that slows experimentation and innovation
  • Testing friction — running and comparing ranking variants becomes painful

A Phased Migration Path

You cannot rip and replace a legacy commerce stack. A practical sequence:

  • Start with personalization on category pages — this forces a unified catalog
  • Reuse that catalog to add semantic and lexical search as new ranking profiles
  • Roll out category by category, measuring impact before expanding scope
  • Retire redundant search and vector systems once the unified platform proves out
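The "measure before expanding" step can be expressed as a simple gate. The sketch below is hypothetical: the function names, the choice of relative lift as the metric, and the `min_lift` threshold are assumptions for illustration, not a process described in the episode.

```python
from typing import Dict, List, Tuple


def relative_lift(baseline: float, variant: float) -> float:
    """Relative change of a metric (e.g. revenue per session) versus baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (variant - baseline) / baseline


def next_rollout(migrated: List[str], pending: List[str],
                 lifts: Dict[str, float],
                 min_lift: float = 0.0) -> Tuple[List[str], List[str]]:
    """Migrate the next pending category only if every category already
    migrated has a measured lift of at least `min_lift`."""
    # A migrated category with no measurement yet blocks further expansion.
    healthy = all(lifts.get(c, float("-inf")) >= min_lift for c in migrated)
    if healthy and pending:
        return migrated + [pending[0]], pending[1:]
    return migrated, pending
```

For example, with `migrated=["shoes"]`, `pending=["jackets", "bags"]`, and a measured lift of 0.2 for shoes, the gate moves jackets onto the unified platform; with a negative or missing measurement it holds the rollout where it is.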

The Business Impact

Leadership often pushes for AI without a clear picture of what it means operationally. Yet customers replacing legacy infrastructure with a unified AI-native platform have seen revenue gains of 20–25% in certain categories — driven by better product representation, faster discovery, and more relevant personalization.

Advice for Technical Leaders

Look beyond LLMs. GPT-class models are excellent for extracting features and enriching catalogs, but the value only lands if the surrounding platform can store tensors, update them in real time, run inference next to the data, and combine everything into a single low-latency response. Evaluate latency, data availability, update paths, and how all the pieces compose, not just the model in isolation.

Speakers

Jorgen Oberman

Senior Go-to-Market Leader, EMEA, Vespa AI

Piotr Kubziakowski

Senior Principal Solutions Architect, Vespa AI

Dana Gardner

Host · Principal Analyst, Interarbor Solutions