How to Fix Fragile SSO Systems Using AWS Cognito: A Scalable Identity Platform Guide

December 30, 2025
Juan Pablo Rubio

Rebuilding SSO the Right Way: From Fragile Logins to a Reliable Identity Platform

When authentication fails, users don’t complain. They leave.

That was the challenge facing a large professional association in the architecture and planning industry, operating multiple member-facing digital products. Their Universal Login experience had become unreliable, hard to evolve, and increasingly risky from a security standpoint. What started as “occasional login issues” was actually a deeper architectural problem.

At Streaver, we were brought in not just to fix bugs, but to rethink how identity should work at scale.

The Problem: Why the Existing SSO Kept Failing

The client’s existing SSO system was built on AWS Cognito, but over time it had accumulated complexity in all the wrong places.

The most critical issue was a race condition during sign-up:

  • Users authenticated successfully
  • Cognito triggers attempted to create users in Salesforce
  • Asynchronous flows executed out of order
  • Result: users stuck in partial or inconsistent states

On top of that, error handling was fragile. When something failed, recovery was unclear, both for the system and for the user.

There were also security and scalability concerns:

  • A single OAuth client ID was shared across multiple internal and partner applications
  • Custom OAuth logic increased the risk of non-compliance
  • Maintaining a homegrown auth layer required ongoing audits and manual infrastructure work

This wasn’t just a technical inconvenience. It directly impacted user trust, internal velocity, and the ability to safely evolve the platform.

Our Goal: A Secure, Scalable SSO Built on AWS Cognito

Our objective wasn’t to add features. It was to remove uncertainty.

We aligned on a few clear goals:

  • Eliminate sign-up and login race conditions
  • Guarantee consistent user states across Cognito and Salesforce
  • Improve error recovery and observability
  • Reduce custom security surface area
  • Automate everything that shouldn’t require manual effort
  • Modernize the experience to align with the main website
  • Prepare the platform for MFA and future growth

In short: make authentication reliable, scalable, and invisible.

The Solution: Simpler Architecture, Predictable Outcomes

Instead of extending yet another custom authentication layer, we leaned into what AWS already does well.

We rebuilt the platform using AWS Cognito’s managed login experience, combined with a clean, event-driven backend that gives us explicit control over when and how side effects occur. The result is a simpler, more reliable SSO architecture that’s easier to reason about, operate, and evolve.

Using Cognito Managed Login and Standard OAuth Flows

At the core of the new system is AWS Cognito’s Managed Login, paired with Cognito’s standard OAuth 2.0 endpoints for authorization, token issuance, and logout.

Rather than implementing custom OAuth logic, we relied on Cognito’s compliant /authorize, /token, and /logout endpoints while using the Managed Login solely for the user-facing authentication experience. This allowed us to keep full control over application flows without taking on the risk of maintaining a bespoke OAuth server.

Authentication, token lifecycle management, and standards compliance are now handled by Cognito, while the rest of the platform focuses on business logic and system integration.

This separation reduced security risk, simplified long-term maintenance, and made it straightforward to introduce additional capabilities such as MFA in later phases.
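To make the standard flow concrete, here is a minimal sketch of an authorization code request with PKCE against Cognito's hosted endpoints. It is an illustration, not the project's code: the domain, client ID, redirect URI, and scopes are placeholder assumptions. On a Cognito domain the authorization and token endpoints live at /oauth2/authorize and /oauth2/token.

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

# Hypothetical values: replace with your own Cognito domain, app client, and callback.
COGNITO_DOMAIN = "https://auth.example.com"
CLIENT_ID = "example-app-client-id"
REDIRECT_URI = "https://app.example.com/callback"


def make_pkce_pair() -> tuple[str, str]:
    """Return (code_verifier, code_challenge) using the S256 method (RFC 7636)."""
    verifier = secrets.token_urlsafe(64)  # 86 characters, within the 43-128 limit
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge


def build_authorize_url(code_challenge: str, state: str) -> str:
    """Build the browser redirect to the authorization endpoint."""
    params = {
        "response_type": "code",
        "client_id": CLIENT_ID,
        "redirect_uri": REDIRECT_URI,
        "scope": "openid email profile",  # assumed scopes
        "state": state,
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return f"{COGNITO_DOMAIN}/oauth2/authorize?{urlencode(params)}"


def build_token_request(code: str, code_verifier: str) -> tuple[str, dict[str, str]]:
    """Return the URL and form body for exchanging the code for tokens."""
    body = {
        "grant_type": "authorization_code",
        "client_id": CLIENT_ID,
        "code": code,
        "redirect_uri": REDIRECT_URI,
        "code_verifier": code_verifier,
    }
    return f"{COGNITO_DOMAIN}/oauth2/token", body
```

Giving each application its own app client ID, instead of sharing one, means a sketch like this would differ per application only in its configuration values.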

Eliminating Race Conditions with Event-Driven Flows

The most critical fix addressed the race condition between user authentication and CRM user creation.

Cognito invokes its Lambda triggers synchronously and expects a response within five seconds, which makes them unsuitable for long-running or failure-prone integrations such as creating users in an external CRM. Attempting to perform these operations synchronously during sign-up had previously led to timeouts, retries, and inconsistent user states.

To solve this, we introduced a deterministic, event-driven flow:

  • A Cognito trigger publishes an event to SQS
  • A dedicated Lambda consumes the message asynchronously
  • The Lambda creates or synchronizes the user in Salesforce

By moving CRM integration out of the authentication path, we respected Cognito’s execution constraints while gaining explicit control over ordering, retries, and failure handling. This eliminated partial user states and made the sign-up process predictable, resilient, and easier to operate.
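The flow above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the production code: the queue URL, the message shape, and the upsert_contact callback (standing in for the Salesforce call) are hypothetical, and the SQS client is passed in so the functions work with any object that has a boto3-style send_message method.

```python
import json
from typing import Any, Callable

# Hypothetical queue URL; in a real deployment this would come from configuration.
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/user-sync"


def enqueue_user_sync(event: dict[str, Any], sqs: Any) -> dict[str, Any]:
    """Cognito trigger body: publish a small message and return immediately.

    `sqs` is anything with a boto3-style send_message method,
    for example boto3.client("sqs").
    """
    message = {
        "userPoolId": event["userPoolId"],
        "username": event["userName"],
        "email": event["request"]["userAttributes"].get("email"),
    }
    sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(message))
    return event  # Cognito expects the trigger to hand the event back


def consume_user_sync(
    event: dict[str, Any],
    upsert_contact: Callable[[dict[str, Any]], None],
) -> dict[str, Any]:
    """SQS-triggered Lambda body: sync each user and report failures per message.

    `upsert_contact` is a hypothetical stand-in for the CRM call. It should be
    idempotent, because SQS standard queues may deliver a message more than once.
    """
    failures = []
    for record in event["Records"]:
        try:
            upsert_contact(json.loads(record["body"]))
        except Exception:
            # Only the failed message is returned to the queue for retry.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}
```

The batchItemFailures response only takes effect when partial batch responses (ReportBatchItemFailures) are enabled on the Lambda's SQS event source mapping. Pairing the queue with a dead-letter queue is one common way to keep repeatedly failing messages visible instead of losing them.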

A Custom /userinfo API for Salesforce Data

To avoid coupling internal applications directly to Salesforce, we introduced a custom /userinfo API.

Built as a serverless REST endpoint using API Gateway and Lambda, this service:

  • Uses a Cognito Authorizer to validate access tokens
  • Extracts a contact_id from Cognito user attributes
  • Fetches additional profile data from Salesforce on demand

This approach keeps authentication fast while allowing applications to retrieve enriched user data through a single, stable interface.
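As a rough sketch of what such an endpoint's Lambda could look like behind an API Gateway REST API with a Cognito authorizer: the authorizer validates the token and passes its claims to the function in the request context. The claim name custom:contact_id and the fetch_profile callback (standing in for the Salesforce lookup) are assumptions made for illustration. Cognito access tokens do not carry custom attributes by default, so a real implementation may need to resolve the contact ID another way, for example by reading the user's attributes from Cognito.

```python
import json
from typing import Any, Callable


def _response(status: int, payload: dict[str, Any]) -> dict[str, Any]:
    """Shape a Lambda proxy integration response for API Gateway."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def make_userinfo_handler(
    fetch_profile: Callable[[str], dict[str, Any]],
) -> Callable[..., dict[str, Any]]:
    """Build a handler around a hypothetical Salesforce lookup function."""

    def handler(event: dict[str, Any], context: Any = None) -> dict[str, Any]:
        # Claims are placed here by the Cognito authorizer after token validation.
        authorizer = event.get("requestContext", {}).get("authorizer") or {}
        claims = authorizer.get("claims") or {}

        contact_id = claims.get("custom:contact_id")  # assumed claim name
        if not contact_id:
            return _response(404, {"error": "contact_id not found"})

        try:
            profile = fetch_profile(contact_id)
        except Exception:
            # The CRM is a downstream dependency, so report a gateway error.
            return _response(502, {"error": "profile lookup failed"})

        return _response(
            200, {"sub": claims.get("sub"), "contact_id": contact_id, **profile}
        )

    return handler
```

Injecting the lookup function keeps the handler independent of any particular Salesforce client, which also makes it easy to exercise without network access.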

Observability, Monitoring, and CI/CD Automation

Reliability requires visibility.

We introduced structured logging, CloudWatch dashboards, and alarms to provide real-time insight into authentication flows and failures. Issues that were previously hard to diagnose are now immediately observable and actionable.
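One simple way to get structured logs out of a Lambda function is to emit one JSON object per line, which CloudWatch Logs can then filter and query by field. The sketch below shows the idea with Python's standard logging module. The field names and the "fields" convention are illustrative assumptions, not the project's actual log schema.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Extra context passed via logger.info(..., extra={"fields": {...}}).
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry)


def build_logger(name: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines (to stderr when stream is None)."""
    logger = logging.getLogger(name)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as build_logger("auth").info("user_sync_failed", extra={"fields": {"username": "jane"}}) then produces a line whose fields a metric filter or alarm can match on.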

At the same time, we implemented a fully automated CI/CD pipeline using GitHub Actions, Terraform, and AWS SAM. Infrastructure and application code are deployed consistently across environments with zero manual steps. Deeper observability with Datadog was planned for Phase 2.

Using AWS Cognito’s UserMigration Trigger for a Better User Experience

One of the most delicate parts of this project was migrating more than 150,000 existing users from a legacy User Pool to a new one. A traditional migration would have required forcing password resets or scheduling a coordinated cutover, both of which would have introduced friction and risk.

Instead, we relied on Cognito's UserMigration trigger to make the transition effectively invisible.

When a user attempts to sign in and does not yet exist in the new User Pool, Cognito automatically invokes the migration trigger with the credentials the user submitted. At that point, our system retrieves the required user data from Salesforce, which remains the source of truth, and creates the user in the new pool on the fly.

From the user's perspective, nothing changes.

They log in exactly as they always have, using the same credentials, without being prompted to reset passwords or take any additional steps. Behind the scenes, the account is migrated, attributes are synchronized, and the user is now part of the new identity platform.

This approach allowed us to migrate users progressively and safely, only when they actually signed in, instead of attempting a risky bulk migration.
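The migration flow described above can be sketched as a UserMigration trigger handler. This is a minimal illustration under assumptions: verify_legacy_credentials (a check against the legacy system) and load_profile (the Salesforce lookup) are hypothetical callbacks, and the attribute set is reduced to an email address. The event fields and response keys follow Cognito's documented trigger contract.

```python
from typing import Any, Callable, Optional


def make_migration_handler(
    verify_legacy_credentials: Callable[[str, str], bool],
    load_profile: Callable[[str], Optional[dict[str, Any]]],
) -> Callable[..., dict[str, Any]]:
    """Build a UserMigration trigger handler around two hypothetical callbacks."""

    def handler(event: dict[str, Any], context: Any = None) -> dict[str, Any]:
        source = event["triggerSource"]
        username = event["userName"]

        if source == "UserMigration_Authentication":
            # Cognito passes the submitted password so it can be checked
            # against the legacy system before the user is created.
            password = event["request"]["password"]
            if not verify_legacy_credentials(username, password):
                raise ValueError("Bad password")
            # Auto-confirm so the user keeps signing in with the same password.
            event["response"]["finalUserStatus"] = "CONFIRMED"
        elif source != "UserMigration_ForgotPassword":
            raise ValueError(f"Unexpected trigger source: {source}")

        profile = load_profile(username)
        if profile is None:
            raise ValueError("User not found")

        event["response"]["userAttributes"] = {
            "email": profile["email"],
            "email_verified": "true",
        }
        # Suppress Cognito's welcome message for migrated users.
        event["response"]["messageAction"] = "SUPPRESS"
        return event

    return handler
```

Raising an error makes Cognito reject the sign-in, so users with wrong credentials or no legacy record are never created in the new pool.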

Most importantly, it preserved trust. Users never had to think about the migration at all, which is exactly how authentication systems should behave.

Why This Worked

By leaning on managed services and event-driven design, we reduced complexity rather than adding to it. Authentication is now reliable, scalable, and easy to extend, which is exactly what a modern identity platform should be.

This project wasn’t just about AWS services. It worked because of how decisions were made.

  • Engineers worked directly on architecture, backend, and DevOps
  • Tradeoffs were discussed openly with stakeholders
  • Security, UX, and operability were treated as first-class concerns
  • We optimized for long-term maintainability, not short-term patches

Instead of adding complexity to “fix” complexity, we removed it.

Results: Better Performance, Lower Cost, Happier Users

The impact was immediate and measurable:

  • Login failures caused by race conditions: eliminated
  • Scalability: fully serverless, scales automatically with demand
  • Infrastructure costs: reduced by removing a custom OAuth server
  • Deployment effort: significantly reduced through CI/CD automation
  • Operational confidence: improved thanks to monitoring and visibility
  • User experience: faster, clearer, more consistent authentication flows

Just as important, the platform is now ready for what’s next, including MFA and future identity integrations, without another rewrite.

A Broader Lesson About Identity Systems

Authentication is not where you want creativity. It’s where you want clarity.

By relying on managed services and clean event flows, we helped this organization turn a fragile SSO into a foundation they can trust.

This is the kind of work Streaver focuses on:

  • AI-first and cloud-native
  • Secure by design
  • Built to evolve, not break

No unnecessary layers. No over-engineering. Just systems that do what they’re supposed to do, reliably.

Thinking about modernizing your authentication or identity platform?

Let’s talk about how Streaver can help you build systems that scale, stay secure, and get out of your users’ way.

👉 Explore what we do!
