Legacy systems create a dangerous illusion of data accuracy that erodes as information moves through the operational chain. Fragmented operational systems are most certainly a problem. But looking under the hood, one will find that the astonishing operational complexity banks are dealing with also comes from a big data monster. Anybody working in finance is grappling with the data monstrosity that digital transformation, cloud computing, and AI have created.
The potent advantage of data architecture modernization initiatives is that they not only solve the data problem but can also solve many operational issues that arise when institutions rely on numerous, fragmented legacy systems. Banking ecosystems still reliant on decades-old mainframe architectures create the risk of nuclear events in technology.
In my most recent article, Legacy Infrastructure Is Creating Invisible Ceilings on Sell-Side Efficiency and Growth, I outlined why older technologies are massive barriers to efficiency and growth, and why now is the time for institutions to revamp core banking systems. Now, we are going to get at the heart of the investment banking operations problem. Here is an in-depth look at how global banks and sell-side institutions can streamline operations and drive scalable efficiency by modernizing the data foundations that power them.
A three-headed data monster: Volume, complexity, and fragmentation
The big data monster in finance is, firstly, a problem of volume. You may have heard claims that 90% of the world’s data was created between 2021 and 2023, and that 23% more was generated in 2025.i Understandably, legacy infrastructure, designed for a world of megabytes and gigabytes, lacks the processing power to handle today’s petabyte-scale volumes. But it’s not merely a volume problem. It’s also a problem of high dimensionality (where the number of predictors is larger than the number of observations) and complexity, such as the preponderance of unstructured financial data.ii
Addressing banks’ data & operational problems
Workflow delays, bottlenecks, interruptions, and errors plague sell-side institutions still laboring under legacy data infrastructure. Valuable information is often trapped in siloed systems; if it were accessible, it could generate actionable insights for multi-functional decision making. Large banks may spend $50 million annually on a centralized data lake that does not meet the specific needs of a prime brokerage or swaps unit that generates $1 billion in revenue.
Investment bank data architecture modernization changes an institution’s cannots into “can do.”
Cannot access data in real-time: In a fragmented environment, pulling real-time information across global markets is nearly impossible, putting the firm at a distinct disadvantage. Over 90% of data users in banks report that the data they need is often unavailable or takes too long to retrieve.iii
Cannot collect revenue: Manually parsing messy data causes loan settlement delays of 7 to 20 days. Banks cannot charge financing fees until a transaction settles, so every day of delay is direct revenue forfeiture.
Cannot launch new products: Because data is trapped in closed data lake architectures, it can take up to 18 months to get into a development queue just to generate a new report or analytics dashboard for a new product line.
Cannot take advantage of market shifts: This inability to be nimble prevents banks from capitalizing on animal spirits and rapid shifts in market demand.
Cannot get a house view: Fragmentation prevents a cohesive house view of risk and exposure.
Cannot do deals: When executing M&A transactions, institutions with scattershot ecosystems struggle to migrate decades of data from acquired balance sheets, jeopardizing the success of these mergers.
Cannot see oncoming data traffic: Many tier two banks still operate with 1990s-era systems that provide reports only once per day. In a market where volatility happens in milliseconds and treasurers must fund trillion-dollar balance sheets in real time, reliance on viewing data through the rear-view mirror is a competitive disadvantage.
Cannot focus on high value analyses: In complex operational chains, having even 95% confidence in data quality at the point of ingestion is insufficient.iv In older systems, there is degradation as data passes through clearing, matching, and settlement. Lacking automated data quality controls and integrity checks, teams spend too much time defensively fixing errors rather than performing high-value analyses.
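To make the degradation point concrete, here is a minimal sketch in Python of how confidence can compound across an operational chain. The 95% ingestion figure comes from the text above; the per-stage retention rates for clearing, matching, and settlement are purely hypothetical values chosen for illustration, not measured figures, and multiplying them assumes errors at each stage are independent of one another.

```python
from math import prod

# 95% confidence at ingestion comes from the article; the per-stage
# retention rates below are hypothetical, for illustration only.
INGESTION_CONFIDENCE = 0.95
STAGE_RETENTION = {
    "clearing": 0.98,
    "matching": 0.97,
    "settlement": 0.98,
}


def end_to_end_confidence(ingestion: float, stages: dict[str, float]) -> float:
    """Multiply ingestion confidence by each stage's retention rate."""
    return ingestion * prod(stages.values())


def confidence_by_stage(
    ingestion: float, stages: dict[str, float]
) -> list[tuple[str, float]]:
    """Return the running confidence after ingestion and after each stage."""
    running = ingestion
    trail = [("ingestion", running)]
    for name, retention in stages.items():
        running *= retention
        trail.append((name, running))
    return trail


if __name__ == "__main__":
    for stage, confidence in confidence_by_stage(
        INGESTION_CONFIDENCE, STAGE_RETENTION
    ):
        print(f"{stage:<12}{confidence:.1%}")
```

Under these assumed rates, data that starts at 95% confidence ends the chain at roughly 88.5%, which illustrates why a seemingly high figure at ingestion can still leave teams fixing errors downstream.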
Modernized data foundation streamlines operations and drives scalable efficiency
Modernization allows banks to find commonalities across asset classes and consolidate these onto a single scalable platform. By implementing a unified data foundation, firms can decouple their core data from underlying legacy systems, allowing them to harmonize information into a golden source of truth that feeds critical downstream functions like risk management, funding, and regulatory reporting, and enables straight-through processing.
Can understand optimal uses of capital and what has been deployed for funding.
Can manage different regulatory metrics, like Core Tier 1 equity or RWAs, across multiple sovereigns more efficiently.
Can automate unstructured data extraction for loans and private credit deals, including popular asset-backed finance (ABF) structures that legacy systems cannot natively model, ensuring risk and credit officers have accurate, complete data for informed decision-making.
Can allow for greater growth of your platforms in the future by getting your data models correct, now.
Can reposition rates market strategies as desired by monitoring interest rate volatility with bitemporal interest rates, inflation, and consumer pricing datasets.
Can capture increased market share without a linear increase in headcount by handling major surges in trading data volumes.
Can view real-time cash positions and P&L that include all intraday accruals for financing, legal fees, and management fees.
Can achieve complete data lineage and auditability required to satisfy regulatory mandates such as those from the Office of the Comptroller of the Currency (OCC).v
Can monetize intelligent observations about market trends, such as when banks clear enormous volumes of data during volatility events like large Treasury sell-offs.
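The bitemporal datasets mentioned in the list above record two timelines for every observation: when a value applied in the market (valid time) and when the institution's systems learned of it (recorded time). A minimal Python sketch of an "as-of" lookup follows. The RateObservation class, its field names, and the sample rates are hypothetical and for illustration only; a modern data platform would typically provide this capability natively.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class RateObservation:
    """One bitemporal record (hypothetical schema, for illustration)."""
    valid_on: date      # the date the rate applies to
    recorded_on: date   # the date this value entered the system
    rate_pct: float


def rate_as_of(
    history: list[RateObservation], valid_on: date, known_on: date
) -> Optional[float]:
    """Return the rate for `valid_on` as the system knew it on `known_on`."""
    candidates = [
        obs for obs in history
        if obs.valid_on == valid_on and obs.recorded_on <= known_on
    ]
    if not candidates:
        return None
    # The latest record entered on or before `known_on` wins.
    return max(candidates, key=lambda obs: obs.recorded_on).rate_pct


# Made-up sample: a rate for 3 March is recorded, then corrected two days later.
HISTORY = [
    RateObservation(date(2025, 3, 3), date(2025, 3, 3), 4.25),
    RateObservation(date(2025, 3, 3), date(2025, 3, 5), 4.30),
]

if __name__ == "__main__":
    print(rate_as_of(HISTORY, date(2025, 3, 3), known_on=date(2025, 3, 4)))
    print(rate_as_of(HISTORY, date(2025, 3, 3), known_on=date(2025, 3, 6)))
```

Because corrections are appended rather than overwritten, the same query can reproduce what a desk saw on any given day as well as the corrected view, which also supports the lineage and auditability goals in the list.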
Data architecture that is ready for anything
Banks that install flexible, open data architecture can build stellar resilience, scalability, and future-proofing. As evidenced by the past five years of market shocks and volatility, financial institutions have to be ready for anything and everything, and so do their technology systems. As long as you’re making decisions about data architecture modernization, you may as well adopt technology that can withstand the test of time and tumult. When a challenge arises that the current system does not support, the architecture needs the flexibility to integrate easily with another solution.
Prepared to grow
Banks should avoid reliance on platform models that lock them into a single vendor, add disruption, come with a heavy integration burden, or restrict data ownership. To drive scalable efficiency, firms must shift toward a model of 100% data confidence at the point of ingestion. Data portability, ownership, and scale can become the ultimate competitive advantage.
Prepared for AI
Crucially, installing a unified data fabric and centralized data platform helps ready sell-side institutions for ongoing AI adoption. AI governance and data governance are twin brothers with common responsibilities in guiding data that AI systems create and consume, both overseeing data integration, quality, security, privacy, and accessibility. AI is only as good as the data we feed it. Banks can properly operationalize AI with a modern, cloud-based data platform that empowers them to make critical decisions from a stronger data management foundation.
Prepared for industry consolidation
Data modernization leaves banks best prepared for industry consolidation. In the old days of the five-day settlement cycle, a failed bank or intermediary was simply allowed to fade into the past, and counterparties had to wait five days to understand exactly what their settlement cycle exposure to that institution was. Now, banks don't just disappear; somebody buys them. Those looking to buy distressed institutions need precise, up-to-the-minute data on everything from real-time exposure and settlement and funding obligations to collateral and liquidity positions. Moreover, legacy data systems will choke and sputter when attempting to migrate decades of data from acquired balance sheets.