Friday, May 16, 2025

How Qualtrics built Socrates: An AI platform powered by Amazon SageMaker and Amazon Bedrock | Amazon Web Services


This post is co-authored by Jay Kshirsagar and Ronald Quan from Qualtrics. The content and opinions in this post are those of the third-party author and AWS is not responsible for the content or accuracy of this post.

Qualtrics, founded in 2002, is a pioneering software company that has spent over two decades creating exceptional frontline experiences, building high-performing teams, and designing products that people love. As the creators and stewards of the Experience Management (XM) category, Qualtrics serves over 20,000 clients globally, bringing humanity, connection, and empathy back to businesses across various industries, including retail, government, and healthcare.

Qualtrics’s comprehensive XM platform enables organizations to consistently understand, measure, and improve the experiences they deliver for customers, employees, and the broader market. With its three core product suites—XM for Customer Experience, XM for Employee Experience, and XM for Research & Strategy—Qualtrics provides actionable insights and purpose-built solutions that empower companies to deliver exceptional experiences.

Qualtrics uses generative AI, machine learning (ML), and natural language processing (NLP) to provide capabilities that are purpose-built for experience management (XM). These AI capabilities help organizations of all sizes deeply understand and address the needs of every customer, employee, and stakeholder, driving stronger connections, increased loyalty, and sustainable growth.

In this post, we share how Qualtrics built an AI platform powered by Amazon SageMaker and Amazon Bedrock.

AI at Qualtrics

Qualtrics has a deep history of using advanced ML to power its industry-leading experience management platform. In early 2020, with the push for deep learning and transformer models, Qualtrics created its first enterprise-level ML platform, called Socrates. Built on top of SageMaker, this new platform enabled ML scientists to efficiently build, test, and deliver new AI-powered capabilities for the Qualtrics XM suite. This strong foundation in ML and AI has been a key driver of Qualtrics’s innovation in experience management.

Qualtrics AI, a powerful engine that sits at the heart of the company’s XM platform, harnesses the latest advances in ML, NLP, and AI. Trained on Qualtrics’s expansive database of human sentiment and experience data, Qualtrics AI unlocks richer, more personalized connections between organizations and their customers, employees, and stakeholders. Qualtrics’s unwavering commitment to innovation and customer success has solidified its position as the global leader in experience management.

To learn more about how AI is transforming experience management, visit this blog from Qualtrics.

Socrates platform: Powering AI at Qualtrics

Qualtrics AI is powered by a custom-built ML platform, a suite of tools and services designed to enable a diverse set of Qualtrics personas (researchers, scientists, engineers, and knowledge workers) to use AI and ML. Qualtrics refers to it internally as the “Socrates” platform. It uses managed AWS services like SageMaker and Amazon Bedrock to support the entire ML lifecycle. Knowledge workers can source, explore, and analyze Qualtrics data using Socrates’s ML workbenches and AI Data Infrastructure. Scientists and researchers can conduct research, prototype, develop, and train models using a host of SageMaker features. ML engineers can test, productionize, and monitor a heterogeneous set of ML models with a wide range of capabilities, inference modes, and production traffic patterns. Partner application teams are provided with an abstracted model inference interface that makes integrating an ML model into the Qualtrics product a seamless engineering experience. This holistic approach enables internal teams to integrate advanced AI and ML capabilities into their workflows and decision-making processes.

Science Workbench

The Socrates Science Workbench, purpose-built for Qualtrics data and knowledge workers, provides a platform for model training and hyperparameter optimization (HPO). It offers a JupyterLab interface, support for a range of programming languages, and secure, scalable infrastructure through SageMaker integration, giving users the flexibility and reliability to focus on their core ML tasks. Users can rely on the SageMaker infrastructure to maintain the confidentiality and integrity of their data and models, while also taking advantage of its scalability to handle even the most demanding ML workloads.

AI Data Infrastructure

Socrates’s AI Data Infrastructure is a comprehensive and cohesive end-to-end ML data ecosystem. It features a secure and scalable data store integrated with the Socrates Science Workbench, enabling users to effortlessly store, manage, and share datasets with capabilities for anonymization, schematization, and aggregation. The AI Data Infrastructure also provides scientists with interfaces for distributed compute, data pulls and enrichment, and ML processing.

AI Playground

The AI Playground is a user-friendly interface that gives Socrates users direct access to the language models and other generative AI capabilities hosted on the Socrates platform, using backend tools like SageMaker Inference, Amazon Bedrock, and OpenAI GPT. It allows them to experiment and rapidly prototype new ideas without extensive coding or technical expertise. By continuously integrating the latest models, the AI Playground helps Socrates users keep up with advancements in large language models (LLMs) and other generative AI technologies, exploring their potential and discovering new ways to drive innovation.

Model deployment for inference

The Socrates platform features a sophisticated model deployment infrastructure that is essential for the scalable implementation of ML and AI models. This infrastructure allows users to host models across the variety of hardware options available for SageMaker endpoints, providing the flexibility to select a deployment environment that optimally meets their specific needs for inference, whether those needs are related to performance optimization, cost-efficiency, or particular hardware requirements.

One of the defining characteristics of the Socrates model deployment infrastructure is its capability to simplify the complexities of model hosting. This allows users to concentrate on the essential task of deploying their models for inference within the larger Socrates ecosystem. Users benefit from an efficient and user-friendly interface that enables them to effortlessly package their models, adjust deployment settings, and prepare them for inference use.

By offering an adaptable model deployment solution, the Socrates platform makes sure ML models created within the system are smoothly integrated into real-world applications and workflows. This integration not only speeds up the transition to production but also maximizes the usage of Qualtrics’s AI-driven features, fostering innovation and providing significant business value to its customers.

Model capacity management

Model capacity management is a critical component that offers efficient and reliable delivery of ML models to Qualtrics users by providing oversight of model access and the allocation of computing resources across multiple consumers. The Socrates team closely monitors resource usage and sets up rate limiting and auto scaling policies, where applicable, to meet the evolving demands of each use case.
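To make the rate-limiting idea concrete, the following is a minimal sketch of per-consumer admission control using a token bucket. The post does not say which algorithm Socrates uses, so the token bucket, the `CapacityManager` class, and the consumer names are all illustrative assumptions.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Allows bursts of up to `capacity` requests and refills at `rate` tokens per second."""
    rate: float
    capacity: float
    tokens: float = field(init=False)
    updated: float = field(init=False)

    def __post_init__(self):
        # Start full so a new consumer can burst immediately.
        self.tokens = self.capacity
        self.updated = time.monotonic()

    def allow(self, now=None):
        # `now` can be injected to make the behavior deterministic in tests.
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class CapacityManager:
    """Hypothetical per-consumer limiter: one bucket per consumer of a shared model."""

    def __init__(self, limits):
        # limits maps consumer name -> (requests per second, burst size)
        self._buckets = {
            name: TokenBucket(rate, burst) for name, (rate, burst) in limits.items()
        }

    def admit(self, consumer, now=None):
        bucket = self._buckets.get(consumer)
        # Unknown consumers have no allocation and are rejected.
        return bucket is not None and bucket.allow(now)
```

A caller would check `manager.admit("some-team")` before forwarding a request to the model and return a throttling error when it is `False`. Auto scaling is complementary: rate limits protect shared capacity in the short term, while scaling policies adjust that capacity as demand changes.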

Unified GenAI Gateway

The Socrates platform’s Unified GenAI Gateway simplifies and streamlines access to LLMs and embedding models across the Qualtrics ecosystem. The Unified GenAI Gateway is an API that provides a common interface for consumers to interact with all of the platform-supported LLMs and embedding models, regardless of their underlying providers or hosting environments. This means that Socrates users can use the power of cutting-edge language models without having to worry about the complexities of integrating with multiple vendors or managing self-hosted models.

The standout feature of the Unified GenAI Gateway is its centralized integration with inference platforms like SageMaker Inference and Amazon Bedrock, which allows the Socrates team to handle the intricate details of model access, authentication, and attribution on behalf of users. This not only simplifies the user experience but also enables cost attribution and control mechanisms, making sure the consumption of these AI resources is carefully monitored and aligned with specific use cases and billing codes. Furthermore, the Unified GenAI Gateway supports rate limiting, making sure the system’s resources are efficiently allocated, and an upcoming semantic caching feature is intended to further optimize model inference and enhance overall performance.
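The common-interface and cost-attribution ideas can be sketched in a few lines. This is not the Socrates API: `UnifiedGateway`, `Completion`, the billing-code strings, and the stand-in `echo_provider` are hypothetical names used only to show the shape of such a gateway, where each backend sits behind an adapter with the same signature.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Completion:
    model: str
    text: str
    input_tokens: int
    output_tokens: int


# A provider adapter hides one backend behind a uniform call signature.
Provider = Callable[[str], Completion]


class UnifiedGateway:
    """Hypothetical gateway: one interface for all models, with usage tracked per billing code."""

    def __init__(self):
        self._providers: Dict[str, Provider] = {}
        self.usage = defaultdict(int)  # billing code -> total tokens consumed

    def register(self, model, provider):
        self._providers[model] = provider

    def complete(self, model, prompt, billing_code):
        if model not in self._providers:
            raise KeyError(f"unsupported model: {model}")
        result = self._providers[model](prompt)
        # Attribute consumption to the caller's billing code.
        self.usage[billing_code] += result.input_tokens + result.output_tokens
        return result


def echo_provider(model):
    """Stand-in adapter; a real one would wrap a hosted model (assumption)."""
    def call(prompt):
        words = prompt.split()
        # Word counts serve as a crude token count for the sketch.
        return Completion(model, " ".join(reversed(words)), len(words), len(words))
    return call


gateway = UnifiedGateway()
gateway.register("model-a", echo_provider("model-a"))
gateway.register("model-b", echo_provider("model-b"))
reply = gateway.complete("model-a", "hello big world", billing_code="team-x")
```

Because consumers only see `complete`, a backend can be swapped or added by registering a different adapter, and authentication, rate limiting, or caching can be layered inside the gateway without changing callers.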

Managed Inference APIs (powered by SageMaker Inference)

The Socrates Managed Inference APIs provide a comprehensive suite of services that simplify the integration of advanced ML and AI capabilities into Qualtrics applications. This infrastructure, built on top of SageMaker Inference, handles the complexities of model deployment, scaling, and maintenance, boasting a growing catalog of production-ready models.

Managed Inference APIs offer both asynchronous and synchronous modes to accommodate a wide range of application use cases. Importantly, these managed APIs come with guaranteed production-level SLAs, providing reliable performance and cost-efficiency as usage scales. With readily available pre-trained Qualtrics models for inference, the Socrates platform empowers Qualtrics application teams to focus on delivering exceptional user experiences, without the burden of building and maintaining AI infrastructure.
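The difference between the two modes can be illustrated with the SageMaker runtime client calls that underlie them. This is a sketch of the general pattern, not the Socrates Managed Inference API itself: the endpoint names, S3 URIs, and the `FakeRuntime` stub are assumptions, and the functions accept any object that exposes the two `boto3` `sagemaker-runtime` methods.

```python
import io
import json


def invoke_sync(runtime, endpoint_name, payload):
    """Synchronous mode: block until the model returns its response."""
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps(payload),
    )
    return json.loads(response["Body"].read())


def invoke_async(runtime, endpoint_name, input_s3_uri):
    """Asynchronous mode: return at once with the S3 location where the result will appear.

    The endpoint must have been created with an asynchronous inference configuration.
    """
    response = runtime.invoke_endpoint_async(
        EndpointName=endpoint_name,
        ContentType="application/json",
        InputLocation=input_s3_uri,
    )
    return response["InferenceId"], response["OutputLocation"]


class FakeRuntime:
    """Local stub mimicking the two client methods so the sketch runs without AWS access."""

    def invoke_endpoint(self, EndpointName, ContentType, Body):
        echoed = {"endpoint": EndpointName, "input": json.loads(Body)}
        return {"Body": io.BytesIO(json.dumps(echoed).encode("utf-8"))}

    def invoke_endpoint_async(self, EndpointName, ContentType, InputLocation):
        return {
            "InferenceId": "demo-id",
            "OutputLocation": "s3://example-bucket/output/demo-id.out",
        }


runtime = FakeRuntime()  # in practice: boto3.client("sagemaker-runtime")
sync_result = invoke_sync(runtime, "sentiment-endpoint", {"text": "great service"})
inference_id, output_uri = invoke_async(
    runtime, "topics-endpoint", "s3://example-bucket/input/batch-1.json"
)
```

Synchronous calls suit interactive features with small payloads and tight latency budgets, whereas the asynchronous mode suits large payloads or long-running inference, where the caller collects the output from the returned location later.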

GenAI Orchestration Framework

Socrates’s GenAI Orchestration Framework is a collection of tools and patterns designed to streamline the development and deployment of LLM-powered applications within the Qualtrics ecosystem. The framework includes tools such as:

  • Socrates Agent Platform, built on top of LangGraph Platform, providing a flexible orchestration framework for developing agents as graphs that expedites delivery of agentic features while centralizing core infrastructure and observability components
  • A GenAI SDK, providing straightforward coding convenience for interacting with LLMs and third-party orchestration packages
  • Prompt Lifecycle Management Service (PLMS) for maintaining the security and governance of prompts
  • LLM guardrail tooling, enabling LLM consumers to define the protections they want applied to their model inference
  • Synchronous and asynchronous inference gateways

These tools all contribute to the overall reliability, scalability, and performance of the LLM-powered applications built on the framework. Its capabilities are anticipated to grow and evolve alongside the rapid advancements in the field of LLMs. This means that Qualtrics users have access to the latest AI capabilities from generative AI inference platforms like SageMaker Inference and Amazon Bedrock, and can use these technologies with greater ease and confidence.
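As one illustration of the guardrail tooling listed above, the sketch below shows how a consumer might declare the protections to apply around a model call. The post does not describe the actual tooling, so every name here (`guarded`, `redact_emails`, `block_terms`, `GuardrailViolation`) is hypothetical, and the two checks are deliberately simple.

```python
import re

# Simplified email pattern, good enough for a demonstration.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class GuardrailViolation(Exception):
    """Raised when a guardrail refuses to let text through."""


def redact_emails(text):
    # Rewriting guardrail: masks email addresses instead of rejecting the text.
    return EMAIL.sub("[EMAIL]", text)


def block_terms(terms):
    # Rejecting guardrail: raises if any listed term appears (case-insensitive).
    lowered = [term.lower() for term in terms]

    def check(text):
        haystack = text.lower()
        for term in lowered:
            if term in haystack:
                raise GuardrailViolation(f"blocked term: {term}")
        return text

    return check


def guarded(model, input_rails, output_rails):
    """Wrap a model callable so the chosen guardrails run before and after inference."""
    def call(prompt):
        for rail in input_rails:
            prompt = rail(prompt)
        answer = model(prompt)
        for rail in output_rails:
            answer = rail(answer)
        return answer

    return call


def toy_model(prompt):
    # Stand-in for an LLM call (assumption).
    return f"echo: {prompt}"


safe_model = guarded(toy_model, [redact_emails], [block_terms(["secret"])])
```

Each consumer composes its own lists of input and output guardrails, which keeps the protections configurable per use case while the wrapping logic stays in one shared place.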

Ongoing enhancements to the Socrates platform using SageMaker Inference

As the Socrates platform continues to evolve, Qualtrics is continuously integrating the latest advancements in SageMaker Inference to further enhance the capabilities of their AI-powered ecosystem:

  • Improved cost, performance, and usability of generative AI inference – One prominent area of focus is the integration of cost and performance optimizations for generative AI inference. The SageMaker Inference team has launched innovative techniques to optimize the use of accelerators, enabling SageMaker Inference to reduce foundation model (FM) deployment costs by 50% on average and latency by 20% on average with inference components. Using this feature, we’re working on achieving significant cost savings and performance improvements for Qualtrics customers running their generative AI workloads on the Socrates platform. In addition, SageMaker has streamlined deployment of open source LLMs and FMs with just three clicks. This user-friendly functionality removes the complexity traditionally associated with deploying these advanced models, empowering more Qualtrics customers to harness the power of generative AI within their workflows and applications.
  • Improved auto scaling speeds – The SageMaker team has developed an advanced auto scaling capability to better handle the scaling requirements of generative AI models. These improvements significantly reduce scaling latency (from multiple minutes to under a minute), cutting auto scaling times by up to 40% and making auto scaling detection six times faster for Meta Llama 3 8B. This enables Socrates users to rapidly scale their generative AI workloads on SageMaker to meet spikes in demand without compromising performance.
  • Straightforward deployment of self-managed OSS LLMs – A new SageMaker Inference capability provides a more streamlined and intuitive process for packaging generative AI models, reducing the technical complexity traditionally associated with this task. This, in turn, empowers a wider range of Socrates users, including application teams and subject matter experts, to use these AI technologies within their workflows and decision-making processes.
  • Generative AI inference optimization toolkit – Qualtrics is also actively using the latest advancements in the SageMaker Inference optimization toolkit within the Socrates platform, which offers two times higher throughput while reducing costs by up to 50% for generative AI inference. By integrating these capabilities, Socrates is working on lowering the cost of generative AI inference. This is particularly impactful for Qualtrics’s customers, who rely on the Socrates platform to power AI-driven applications and experiences.
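The throughput and cost figures quoted for the optimization toolkit are related by simple arithmetic: if the same instance serves twice as many requests per second, the cost of each request halves. The sketch below works this through with made-up numbers; the hourly price and request rates are assumptions for illustration, not Qualtrics or AWS figures.

```python
def cost_per_1k_requests(hourly_cost_usd, requests_per_second):
    """Cost of serving 1,000 requests on an instance running at a steady request rate."""
    requests_per_hour = requests_per_second * 3600
    return hourly_cost_usd / requests_per_hour * 1000


# Hypothetical instance price and request rates, chosen only for illustration.
baseline = cost_per_1k_requests(hourly_cost_usd=10.0, requests_per_second=5.0)
optimized = cost_per_1k_requests(hourly_cost_usd=10.0, requests_per_second=10.0)
savings = 1 - optimized / baseline  # fraction of per-request cost saved
```

With these inputs `savings` comes out to one half, which is how a doubling of throughput at constant instance cost corresponds to a reduction of about 50% in cost per request.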

“By seamlessly integrating SageMaker Inference into our Socrates platform, we’re able to deliver inference advancements in AI to our global customer base. The generative AI inference capabilities in SageMaker like inference components, faster auto scaling, easy LLM deployment, and the optimization toolkit have been a game changer for Qualtrics to reduce the cost and improve the performance for our generative AI workloads. The level of sophistication and ease of use that SageMaker Inference brings to the table is remarkable.”

– James Argyropoulos, Sr AI/ML Engineer at Qualtrics.

Partnership with SageMaker Inference

Since adopting SageMaker Inference, the Qualtrics Socrates team has been a key collaborator in the development of AI capabilities in SageMaker Inference. Building on expertise to serve Socrates users, Qualtrics has worked closely with the SageMaker Inference team to enhance and expand the platform’s generative AI functionalities. From the early stages of generative AI, they offered invaluable insights and expertise to the SageMaker team. This has enabled the introduction of several new features and optimizations that have strengthened the platform’s generative AI offerings, including:

  • Cost and performance optimizations for generative AI inference – Qualtrics helped the SageMaker Inference team build a new inference capability for SageMaker Inference to reduce FM deployment costs by 50% on average and latency by 20% on average with inference components. This feature delivers significant cost savings and performance improvements for customers running generative AI inference on SageMaker.
  • Faster auto scaling for generative AI inference – Qualtrics has helped the SageMaker team develop faster auto scaling for generative AI inference. These improvements have reduced auto scaling times by up to 40% for models like Meta Llama 3 and made auto scaling detection six times faster. With this, generative AI inference can scale with changing traffic without compromising performance.
  • Inference optimization toolkit for generative AI inference – Qualtrics has been instrumental in giving feedback for AWS to launch the inference optimization toolkit, which increases throughput by up to two times and reduces costs by up to 50%.
  • Launch of multi-model endpoint (MME) support for GPU – MMEs allow customers to reduce inference costs by up to 90%. Qualtrics was instrumental in helping AWS with the launch of this feature by providing valuable feedback.
  • Launch of asynchronous inference – Qualtrics was a launch partner for asynchronous inference and has played a key role in helping AWS improve the offering to give customers optimal price-performance.

The partnership between Qualtrics and the SageMaker Inference team has been instrumental in advancing the state of the art in generative AI within the AWS ecosystem. Qualtrics’s deep domain knowledge and technical proficiency have played a crucial role in shaping the evolution of this rapidly developing field on SageMaker Inference.

“Our partnership with the SageMaker Inference product team has been instrumental in delivering incredible performance and cost benefits for Socrates platform consumers running AI Inference workloads. By working hand in hand with the SageMaker team, we’ve been able to introduce game changing optimizations that have reduced AI inference costs multiple folds for some of our use cases. We look forward to continued innovation through valuable partnership to improve state-of-the-art AI inference capabilities.”

–  Jay Kshirsagar, Senior Manager, Machine Learning

Conclusion

The Socrates platform underscores Qualtrics’s commitment to advancing innovation in experience management by integrating advanced AI and ML technologies. Thanks to a strong partnership with the SageMaker Inference team, the platform has seen enhancements that boost performance, reduce costs, and increase the accessibility of AI-driven features within the Qualtrics XM suite. As AI technology continues to develop rapidly, the Socrates platform is geared to empower Qualtrics’s AI teams to innovate and deliver exceptional customer experiences.


About the Authors

Jay Kshirsagar is a seasoned ML leader driving GenAI innovation and scalable AI infrastructure at Qualtrics. He has built high-impact ML teams and delivered enterprise-grade LLM solutions that power key product features.

Ronald Quan is a Staff Engineering Manager for the Data Intelligence Platform team within Qualtrics. The team’s charter is to enable, expedite and evolve AI and Agentic developments on the Socrates platform. He focuses on the team’s technical roadmap and strategic alignment with the business needs.

Saurabh Trikande is a Senior Product Manager for Amazon Bedrock and SageMaker Inference. He is passionate about working with customers and partners, motivated by the goal of democratizing AI. He focuses on core challenges related to deploying complex AI applications, inference with multi-tenant models, cost optimizations, and making the deployment of Generative AI models more accessible. In his spare time, Saurabh enjoys hiking, learning about innovative technologies, following TechCrunch, and spending time with his family.

Micheal Nguyen is a Senior Startup Solutions Architect at AWS, specializing in using AI/ML to drive innovation and develop business solutions on AWS. Micheal holds 12 AWS certifications and has a BS/MS in Electrical/Computer Engineering and an MBA from Penn State University, Binghamton University, and the University of Delaware.

Ranga Malaviarachchi is a Sr. Customer Solutions Manager in the ISV Strategic Accounts organization at AWS. He has been closely associated with Qualtrics over the past 4 years in supporting their AI initiatives. Ranga holds a BS in Computer Science and Engineering and an MBA from Imperial College London.

