Modernizing Without the Big Bang: A Phased Approach to Data Platform Migration
My colleague Isaac Alexander wrote recently about why data platform implementations fail before they even begin: the stakeholder misalignment, undocumented requirements, and scope creep that derail projects in the preparation phase. He's right on all of it. But let's say your firm has done that work. The requirements are documented. Stakeholders are aligned. The vendor is selected. Now what?
That's where most implementation guides go quiet. And it's where, in my experience running complex implementations across hedge funds and institutional asset managers, the real execution risk lives. McKinsey found that only 10% of cloud transformations achieve their full value, and in most cases, the gap between ambition and outcome isn't about the technology chosen. It's about how the implementation was executed. That's why we believe a phased, incremental implementation is the most reliable path to successful outcomes.
Getting the pre-work right is necessary but not sufficient. The firms that extract value quickly from a data platform transformation are the ones that execute each phase with discipline — knowing what to prioritize, where hidden complexity tends to surface, and how to validate progress before moving forward. Here's what that actually looks like.
Phase 1 - Data ingestion and normalization: Where surprises live
The first phase is establishing reliable, governed connections to every source system. This sounds straightforward. It rarely is.
Phase 1 is where you find the undocumented data feeds, inconsistent field labeling across vendors, identifier mismatches between systems, and data quality issues no one knew existed. The instinct is to push through them and clean up later. Resist it. Problems not resolved at the ingestion layer don't disappear; they propagate into the model and show up as reconciliation breaks and reporting errors three phases down the line. If your platform is already showing signs it can't handle the data load, these ingestion-layer issues will compound quickly.
Two things make this phase go well. First, a complete source system inventory, built before configuration begins (every feed, its owner, its format, its latency characteristics, and its known quirks). Second, clearly defined exception handling: what happens when a feed is late, incomplete, or malformed? If your implementation plan doesn't specify this, you'll be improvising under pressure when it happens in production.
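To make that concrete, here is a minimal sketch of what the inventory and exception policy could look like if captured as configuration rather than left in a spreadsheet. The FeedPolicy structure, field names, and feed entries are hypothetical; they illustrate the decision points, not any particular platform's schema.

```python
from dataclasses import dataclass
from enum import Enum


class OnFailure(Enum):
    """What to do when a feed misses its SLA or fails validation."""
    HOLD_AND_ALERT = "hold_and_alert"   # quarantine the file and notify the data owner
    USE_PRIOR_DAY = "use_prior_day"     # roll forward yesterday's values, flag downstream
    FAIL_PIPELINE = "fail_pipeline"     # stop the run; downstream consumers wait


@dataclass
class FeedPolicy:
    """One entry in the source system inventory, captured as configuration."""
    feed_name: str
    owner: str                 # the person accountable when the feed breaks
    file_format: str
    expected_by_utc: str       # latency characteristic: when the feed should arrive
    known_quirks: str          # documented oddities, so they aren't rediscovered in production
    on_late: OnFailure
    on_malformed: OnFailure


# Hypothetical entries: the point is that late/malformed behavior is decided up front.
INVENTORY = [
    FeedPolicy(
        feed_name="custodian_positions",
        owner="ops-data-team",
        file_format="csv",
        expected_by_utc="06:00",
        known_quirks="weekend files carry Friday's date",
        on_late=OnFailure.USE_PRIOR_DAY,
        on_malformed=OnFailure.HOLD_AND_ALERT,
    ),
    FeedPolicy(
        feed_name="loan_servicer_tape",
        owner="credit-ops",
        file_format="xlsx",
        expected_by_utc="09:30",
        known_quirks="identifier column switches between CUSIP and internal ID",
        on_late=OnFailure.FAIL_PIPELINE,
        on_malformed=OnFailure.FAIL_PIPELINE,
    ),
]
```

The format matters less than the fact that every feed has a named owner and a pre-agreed answer to "what happens when this breaks."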
The goal of Phase 1 is not perfection on every feed; it is knowing precisely what you have, and what you've decided to do about it. You should be normalizing incoming data to a standard data model: identifying core data entities such as reference data, transactions, and holdings, and defining their relationships across source systems. This is what makes Phase 2 tractable. Without a consistent, standardized model at the ingestion layer, every business-specific data product built on top of it inherits the inconsistencies underneath.
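As a rough illustration of what "a standard data model" can mean at this layer, here is a minimal sketch of those core entities and their relationships. The entities, fields, and identifier scheme are illustrative assumptions, not a prescribed model.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Instrument:
    """Reference data: one canonical record per instrument, keyed by an internal ID
    so that vendor identifiers (CUSIP, ISIN, loan IDs) map onto a single entity."""
    instrument_id: str
    asset_class: str
    vendor_identifiers: dict[str, str]   # e.g. {"cusip": "...", "isin": "..."}


@dataclass(frozen=True)
class Transaction:
    """A normalized trade or cash event, regardless of which source system produced it."""
    transaction_id: str
    instrument_id: str        # relationship to Instrument
    portfolio_id: str
    trade_date: date
    settle_date: date
    quantity: Decimal
    source_system: str        # lineage: which feed this record came from


@dataclass(frozen=True)
class Holding:
    """A point-in-time position, derivable from transactions and reconcilable against custodians."""
    portfolio_id: str
    instrument_id: str
    as_of: date
    quantity: Decimal
```

Once every source feed lands in shapes like these, the business data products in Phase 2 are built against one model rather than 28 vendor formats.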
Phase 2 - Data modeling and validation: Where business and technology must actually agree
Normalized data flowing reliably into a platform doesn't mean anyone can use it yet. In Phase 2, that data gets translated into the business data products that operations and investment teams actually work with, such as positions, transactions, NAV components, cash flows, and loan-level attributes.
This phase exposes the gap between what business teams assumed the platform would do and what technology teams are actually building. In my experience, it's the most common place for implementations to stall. A portfolio manager's definition of "current position" is often different from an operations team's, which is often different from what the risk system expects. Getting alignment on these definitions (in writing, with sign-off) before configuration begins is not bureaucracy. It's what prevents a three-week rework cycle at testing.
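To make the definition problem concrete, consider two calculations of "current position" that both sound right in a meeting but return different numbers. This is a hypothetical sketch (it reuses the illustrative Transaction shape from the Phase 1 sketch), and the trade-date versus settlement-date split is just one common example of the ambiguity that needs written sign-off.

```python
from datetime import date
from decimal import Decimal


def position_trade_date(transactions, portfolio_id: str, instrument_id: str, as_of: date) -> Decimal:
    """A portfolio manager's view: include everything traded on or before as_of."""
    return sum(
        (t.quantity for t in transactions
         if t.portfolio_id == portfolio_id
         and t.instrument_id == instrument_id
         and t.trade_date <= as_of),
        Decimal("0"),
    )


def position_settled(transactions, portfolio_id: str, instrument_id: str, as_of: date) -> Decimal:
    """An operations/custody view: include only what has actually settled by as_of."""
    return sum(
        (t.quantity for t in transactions
         if t.portfolio_id == portfolio_id
         and t.instrument_id == instrument_id
         and t.settle_date <= as_of),
        Decimal("0"),
    )

# Both are defensible definitions of "current position." If the signed-off specification
# doesn't say which one the position data product uses, the gap surfaces later as a
# reconciliation break nobody can explain quickly.
```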
Treating validation as a one-time UAT event at the end also consistently trips up Phase 2. Validation should be iterative, with business stakeholders actively reviewing modeled outputs at each sprint and not just at the finish line. By the time a problem surfaces in a final UAT review, it's already expensive to fix.
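One lightweight way to make validation iterative is to encode what stakeholders signed off on as checks that run against the modeled outputs every sprint. The checks below are a sketch under assumed names and tolerances, not a recommendation of any specific validation tool.

```python
from decimal import Decimal


def check_no_orphan_transactions(transactions, instruments) -> list[str]:
    """Every transaction must reference an instrument that exists in reference data."""
    known = {i.instrument_id for i in instruments}
    return [t.transaction_id for t in transactions if t.instrument_id not in known]


def check_positions_reconcile(modeled: dict, custodian: dict,
                              tolerance: Decimal = Decimal("0.01")) -> list[str]:
    """Modeled positions must match the custodian within an agreed tolerance."""
    breaks = []
    for key, custodian_qty in custodian.items():
        modeled_qty = modeled.get(key, Decimal("0"))
        if abs(modeled_qty - custodian_qty) > tolerance:
            breaks.append(f"{key}: modeled {modeled_qty} vs custodian {custodian_qty}")
    return breaks


def run_sprint_validation(transactions, instruments, modeled, custodian) -> None:
    """Run at each sprint review, with business stakeholders looking at the failures."""
    issues = check_no_orphan_transactions(transactions, instruments)
    issues += check_positions_reconcile(modeled, custodian)
    if issues:
        raise AssertionError("Sprint validation failed:\n" + "\n".join(issues))
```

A failure caught here costs a conversation; the same failure discovered in final UAT costs a rework cycle.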
Phase 3 - Reporting, data access policies, and downstream integration: Where governance pays off (or doesn't)
In the third phase, end users finally interact with the platform directly: reporting surfaces, permissioned data access, and the downstream integrations that feed risk systems, investor reports, and regulatory submissions.
Firms that embedded governance early — defining data ownership, access controls, and lineage documentation in Phase 1 — find this phase relatively straightforward. Firms that deferred governance to "later" find it waiting for them here, except now it's more expensive and disruptive to implement because data is already flowing. Gartner predicts that 80% of data governance initiatives will fail by 2027, with a major contributing factor being governance treated as an afterthought rather than a foundational design decision. A data-first approach to governance, where lineage, quality controls, and stewardship are established before users are handed access, is consistently what separates clean Phase 3 rollouts from messy ones.
Permissioning, in particular, is chronically underestimated. Who can see which portfolios? Which data sets require regulatory access controls? What happens when a user's role changes? These aren't edge cases; they're day-one operational requirements, and the earlier those conversations happen, the cleaner the implementation.
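As an illustration of why permissioning is a design decision rather than an afterthought, here is a hedged sketch of a role-to-portfolio entitlement model; the roles, funds, and flags are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entitlement:
    """What a role is allowed to see. Defined before go-live, not discovered after it."""
    role: str
    portfolio_ids: set[str] = field(default_factory=set)
    can_see_investor_level_data: bool = False   # e.g. gated by regulatory access controls


ENTITLEMENTS = {
    "pm_credit": Entitlement("pm_credit", {"FUND_A", "FUND_B"}),
    "ops_all": Entitlement("ops_all", {"FUND_A", "FUND_B", "FUND_C"}),
    "investor_relations": Entitlement("investor_relations", {"FUND_C"},
                                      can_see_investor_level_data=True),
}


def visible_portfolios(user_roles: list[str]) -> set[str]:
    """Union of entitlements across a user's roles. A role change becomes a membership
    change, not a re-permissioning exercise across every report."""
    allowed: set[str] = set()
    for role in user_roles:
        entitlement = ENTITLEMENTS.get(role)
        if entitlement:
            allowed |= entitlement.portfolio_ids
    return allowed
```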
Downstream integration is the other place where scope has a habit of quietly expanding. Every system that consumes data from the platform is a dependency. Map them all before Phase 3 begins, agree on file formats and delivery schedules, and test them before go-live, not after. For firms with regulatory reporting obligations downstream, the stakes of getting this wrong are particularly high. Data accuracy and regulatory controls need to be treated as implementation requirements, not post-launch considerations.
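One way to keep that downstream scope visible is to treat each consuming system as an explicit contract that can be checked mechanically before go-live. The consumers, fields, and schedule strings below are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class IntegrationContract:
    """One downstream consumer of the platform, agreed before Phase 3 begins."""
    consumer: str              # e.g. risk system, investor reporting, regulatory submission
    file_format: str
    delivery_schedule: str     # cron-style or plain-language SLA
    required_fields: list[str]
    regulatory: bool           # accuracy is a compliance requirement, not a preference


CONTRACTS = [
    IntegrationContract("risk_system", "parquet", "daily 07:00 UTC",
                        ["portfolio_id", "instrument_id", "quantity", "market_value"], False),
    IntegrationContract("regulatory_reporting", "csv", "monthly, T+3",
                        ["portfolio_id", "instrument_id", "quantity", "valuation_source"], True),
]


def check_extract_against_contract(extract_columns: set[str],
                                   contract: IntegrationContract) -> list[str]:
    """Run in pre-go-live testing: report any agreed field the extract fails to deliver."""
    return [f for f in contract.required_fields if f not in extract_columns]
```

If every consumer has an entry like this, "quiet" scope expansion becomes a visible change to a contract rather than a surprise at go-live.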
What this looks like end to end
Consider a $10B AUM alternative asset manager operating across four legacy providers with 28 distinct data integrations. Reconciliation was a daily manual exercise. The operations team was spending its time managing data processes rather than analyzing the data, and the firm's ability to onboard new asset classes was constrained by the architecture underneath. This is a pattern that tends to emerge when firms defer modernization past the point where incremental fixes still work.
Before any migration work began, the team ran a comprehensive data inventory: every feed mapped, every data owner documented, every legacy dependency catalogued. That pre-work determined the sequencing of the entire implementation, allowing the transition to proceed without disrupting live operations.
Phase by phase, those 28 integrations were consolidated into a unified data model. By completion, the firm was processing over 100,000 trades and cash events and modeling more than 2 million loans. Operations teams now work from a single governed data foundation, one they trust enough to use as the basis for new product launches and asset class expansion.
The metrics matter. But the more important outcome was an operations team that stopped spending its day managing data and started using it.
The implementation is the strategy
A modern data platform is only as valuable as the implementation that delivers it. Deloitte's 2025 investment management outlook notes that firms seeking to harness AI at scale will need a robust data foundation and strong governance controls first, which means the quality of today's implementation directly determines a firm's readiness for tomorrow's capabilities.
Firms that treat implementation as a project to be completed, rather than a discipline to be practiced, tend to get a platform that works eventually, mostly, for the use cases scoped on day one. Firms that approach each phase with the rigor it requires get something different: a foundation they can actually build on.
That's the difference between deploying a platform and transforming how your operations run.
Key takeaways
Q1. What's the most underestimated phase of a data platform implementation?
Data ingestion and normalization. It consistently surfaces more complexity than expected (undocumented feeds, inconsistent identifiers, data quality issues), and problems not resolved here compound downstream.
Q2. Why do modeling and validation stall implementations?
Misaligned definitions between business and technology teams, and validation treated as a one-time event rather than an iterative process throughout the sprint cycle.
Q3. When should governance be established?
Phase 1. Permissioning, data lineage, and access controls retrofitted in Phase 3 are significantly more disruptive and costly than those built in from the start.
Q4. How do you prevent downstream integration surprises at go-live?
Map every consuming system before Phase 3 begins. Agree on formats and delivery schedules early, and test integrations before go-live, not after.
Q5. What separates firms that get value quickly from those that don't?
Treating implementation as a discipline rather than a project: with structured validation at each phase gate and governance built into the foundation, not bolted on afterward.
Authored By
Ankit Jain
Ankit has 14 years of experience building technology-driven products for the investment management industry, focusing on turning complex operational challenges into scalable, user-centric solutions. His work sits at the intersection of product management and solutions architecture, where he combines strategic thinking with hands-on execution. Ankit has led the end-to-end development of platforms supporting the full investment lifecycle, from trade processing to reporting and analytics. He partners with stakeholders across business and technology teams to define product vision, prioritize roadmaps, and deliver robust and adaptable solutions.