The Pitfalls of the 95% Confidence Paradigm for Banking Data Quality
Every bank is at a different stage of its data journey. Recently, while attending the InvestOps Europe conference in Paris, I heard a presenter say that 95% confidence is the barometer banking leadership should adhere to when gauging the integrity of its data. Ninety-five percent has always been a desirable grade on a paper or in a class, but is it good enough for a multinational bank operating in dozens of jurisdictions?
Like the air we breathe, data is odorless, colorless, silent, and hard to measure. That is, until data is presented next to dollar signs on a disclosure report, balance sheet, or interminable spreadsheet; then it becomes real. The past few years have seen financial institutions grappling with suddenly ballooning volumes of financial data, not an easy ask for legacy data systems and banks that might run on scores of different systems.
To complicate matters, banks have recently been plagued by an unstructured data problem: hundreds of unformatted or inconsistently formatted datasets that legacy systems struggle to transform into usable information. Here are some of the factors and trends causing these challenges, and some tactics banks can use to overcome them and become data champions with 100% trust in their data.
The 95% confidence fallacy
While a 95% confidence interval[i] in data may be the target, banks today really have only 80-90% confidence in their data. In a 2024 study of sell-side reference data operations, over 90% of respondents reported that poor data quality caused issues in clearing and settlement, risk management, and regulatory reporting, with 80% citing challenges in automated trading and market connectivity stemming from inaccurate data.[ii] Moreover, that 80-90% is a bit of an illusion. Here's the reality. Say I am a bank CTO or chief data scientist, and I have 80% confidence in the data coming to me via any type of transaction. I push that data into the clearing or matching process, then into the settlement process, with cash movement along the way. That data keeps getting pushed from one process to the next, and the next, and a little degradation happens at every step. By the time I reach the end of my processes, I have 50% confidence in my data, and a small anomaly in the first process has become a serious data problem 10 steps later. Yet this is a difficult problem to recognize, much less solve. It depends on the robustness of the institution's existing data and operational infrastructure, the stage of its data transformation journey, and the asset classes and structures involved.
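The arithmetic behind this degradation is worth making explicit. A minimal sketch, assuming (purely for illustration) that each downstream handoff preserves about 95% of data integrity, shows how 80% starting confidence erodes toward 50% over 10 steps:

```python
def end_to_end_confidence(initial: float, per_step_retention: float, steps: int) -> float:
    """Confidence remaining after pushing data through `steps` downstream processes.

    Degradation compounds multiplicatively: each handoff (clearing, matching,
    settlement, ...) keeps only a fraction of the previous step's integrity.
    The 95% per-step retention figure is an illustrative assumption, not a
    measured banking benchmark.
    """
    return initial * per_step_retention ** steps


# Start at 80% confidence; 10 handoffs at ~95% retention each.
result = end_to_end_confidence(0.80, 0.95, 10)
print(f"{result:.0%}")  # roughly 48% -- close to the 50% figure above
```

The point of the sketch is that small, seemingly tolerable per-step losses compound geometrically, which is why an anomaly that looks negligible at ingestion becomes a serious problem at the end of the chain.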
Meanwhile, the risk of getting it wrong is high. On the undesirable end of the 95% spectrum, Citi paid roughly a billion dollars in fines over the last five years for irregularities in its regulatory reporting data and governance failures, and responded by spending millions to modernize its technology.[iii] Deutsche Bank, Wells Fargo, and Mitsubishi Bank are examples of institutions that have worked through confidential supervisory findings called Matters Requiring Attention (MRAs) and Matters Requiring Immediate Attention (MRIAs), many of them rooted in data processes. In this context, even 95% (and even if it were a true 95%) isn't enough for global banks. UBS, for instance, has a balance sheet larger than the Swiss economy, and a Swiss bailout of such a bank would be challenging. The risk needs to be near-zero, which means confidence needs to be near-perfect.
A prime example: private credit’s unstructured data problem
There is probably no better example of how this plays out than in private credit, where a lack of a uniform standard for data coming from private credit loan documents, originators, term sheets, mortgage servicers, and administrators is creating major challenges for financial institutions across capital markets, including banks.
When policymakers clamped down on banks after the 2008 crisis, private credit was not on anyone's radar. It has since become a dominant force, and it is not going away. With much less regulation and transparency in private markets, regulators have now awakened to a bigger problem they can't get their arms around. While regulators focus on private credit's risks as retail investors gain greater access, banks are partnering with private credit firms in ways that let them keep doing what they do well as traditional advisories. In the past couple of years, Wall Street has made inroads in reclaiming lost credit business by modernizing systems and embracing a new hybrid public-private credit world.[iv]
“Banks must enhance structured and unstructured data and content. This can include leveraging lineage generators to automatically map information flow based on existing sources and systems. This reduces manual effort to capture end-to-end lineage, creates consistent labels for data paths, and clarifies the source of data, how the data changed over time, and its intended use. Additionally, the usability of unstructured data can be increased by creating descriptions via metadata labels that help business teams define use cases by specifying details such as the source of the data, applicable usage rights, and how the content connects with or complements related datasets.” — Boston Consulting Group[v]
As such, banks now must deal with the unstructured data volume and complexity that come with private credit, especially acute in the hugely popular asset-based finance (ABF) business. PDF and Excel documents covering dozens or hundreds of borrowers (private manufacturing companies, for example) arrive in a patchwork of financial formats, requiring data infrastructure that can ingest and normalize them.
It starts when the operations team must reconcile the initial loan tapes, the lens through which investors view and price risk via cash flow models, assess the validity of reported returns, and gain the confidence to raise capital. And it's not only middle-market business loans: forward flow agreements and programmatic funding have turned consumer loans into a scalable, securitizable asset class, fueling a $25 trillion ABF market.[vi]
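To make the reconciliation challenge concrete, here is a hypothetical sketch of the normalization step: mapping loan-tape rows that arrive with different column names from different originators and servicers onto one canonical schema. The field names and synonym map below are illustrative assumptions, not any institution's actual data model:

```python
# Canonical fields mapped to the aliases different servicers might use.
# Both the fields and the aliases are made up for illustration.
CANONICAL = {
    "loan_id": ["loan_id", "LoanID", "loan #"],
    "balance": ["balance", "Current Balance", "upb"],
    "rate":    ["rate", "Interest Rate", "coupon_pct"],
}

def normalize_row(raw: dict) -> dict:
    """Map one servicer's raw row onto canonical field names.

    Unrecognized columns are dropped; matching is case-insensitive so
    'LoanID' and 'loan_id' land in the same canonical field.
    """
    lookup = {
        alias.lower(): field
        for field, aliases in CANONICAL.items()
        for alias in aliases
    }
    normalized = {}
    for key, value in raw.items():
        field = lookup.get(key.strip().lower())
        if field:
            normalized[field] = value
    return normalized


row = normalize_row({"LoanID": "A-102", "Current Balance": "1,250,000", "coupon_pct": "8.75"})
print(row)  # {'loan_id': 'A-102', 'balance': '1,250,000', 'rate': '8.75'}
```

A real pipeline would layer type coercion, validation, and lineage tracking on top of this mapping, but the synonym-resolution step is where inconsistent loan tapes first become comparable.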
The risk credit officers fear most is missing critical information in the data, leading to losses on loans. Losing $100 million on a loan because something was overlooked is an existential issue for a bank. This goes beyond ensuring that regulators and other stakeholders get good data; risk officers and credit officers need near-perfect data to make the best-informed decisions. Back to our fallacy: 95% confidence isn't enough.
Is AI the key?
AI has lit a fire in the bellies of buy-side and sell-side institutions alike, as they know their data house must be in order for the AI house to be in order. According to Deloitte, “Banks’ AI readiness is often slowed by the data foundations that models depend on. Poor infrastructure can result in data sprawl, vulnerability, and limited data-led innovation, limiting model efficacy.”[vii] But once a bank has its AI game in place, AI can play a pivotal role in bringing order to the data chaos. AI agents are already helping with several data quality management functions. For example, one financial institution recently leveraged generative AI to automate data lineage capture and metadata generation, achieving 40% to 70% productivity gains in specific tasks.[viii]
AI offers ready assistance with the unstructured data problem in particular. If managing structured data is like sorting pre-labeled packages, managing unstructured data with AI is like instantly reading thousands of handwritten letters, identifying the key facts in each one, and organizing those facts into a searchable spreadsheet, a task impossible for humans at scale. But, again, the art of the possible with AI comes back to data quality; it will require institutions to centralize their data management capabilities, with an emphasis on tools that support strong data lineage and reporting accuracy.
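The "letters into a spreadsheet" idea can be sketched in miniature. A production system would use an LLM or document-AI service; the toy version below uses regular expressions to pull key facts from free-form deal text into a structured record. The patterns, field names, and sample text are illustrative assumptions only:

```python
import re

# Hypothetical extraction patterns -- a stand-in for the ML-driven
# extraction a real unstructured-data pipeline would perform.
PATTERNS = {
    "borrower": re.compile(r"borrower[:\s]+([A-Z][\w& ]+?)(?:[.,]|$)", re.I),
    "amount":   re.compile(r"\$([\d,]+(?:\.\d+)?)"),
    "maturity": re.compile(r"matur\w*\s+(?:on\s+)?(\d{4}-\d{2}-\d{2})", re.I),
}

def extract_facts(text: str) -> dict:
    """Return whichever structured fields can be found in one document's text."""
    facts = {}
    for field, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            facts[field] = match.group(1).strip()
    return facts


doc = "Borrower: Acme Tooling. Facility of $12,500,000 maturing on 2029-06-30."
print(extract_facts(doc))
```

Run over thousands of documents, each resulting record becomes one row in the "searchable spreadsheet" from the analogy above; the hard part at scale is exactly the data quality and lineage discipline the rest of this piece argues for.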
The 100% data confidence paradigm
There are severe pitfalls in using a 95% data confidence barometer when executing tech transformations. Regulatory considerations, data governance challenges (especially with unstructured data), surging market volumes, private credit, and the adoption of AI in the financial services industry are forces that cannot be ignored as the calendar turns to 2026. Realistically, banking leaders need to keep their eyes on the 100% prize for data management. Everyone in the organization will do a better job if they trust that the information they work with is reliable, timely, and precise.
Authored By
Ted O’Connor
Ted is a Senior Vice President focused on Business Development at Arcesium. In this role, Ted works with leading financial institutions in the capital markets to optimize data, technology, and operational needs.
[i] Investopedia, May 6, 2025. https://www.investopedia.com/terms/c/confidenceinterval.asp#toc-explain-like-im-five
[ii] Acuity Knowledge Partners, November 2024. https://assets.ctfassets.net/cy2jgjrgaerj/5V6yrRfzYZU1LXqUgvulAD/.../increasing-efficiency-in-sell-side-reference-data-management-fow.pdf
[iii] Banking Dive, July 11, 2024. https://www.bankingdive.com/news/citi-occ-fed-135-million-penalties-2020-orders-data-quality-risk-management-control-fraser-hsu/721061/
[iv] Traders Magazine, July 28, 2025. https://www.tradersmagazine.com/am/the-lending-showdown-how-banks-are-reasserting-power-in-the-private-credit-arena/
[v] BCG, May 6, 2025. https://www.bcg.com/publications/2025/tech-banking-transformation-starts-with-smarter-tech-investment
[vi] Guggenheim, September 25, 2025. https://www.guggenheiminvestments.com/perspectives/portfolio-strategy/asset-backed-finance
[vii] Deloitte, October 30, 2025. https://www.deloitte.com/us/en/insights/industry/financial-services/financial-services-industry-outlooks/banking-industry-outlook.html
[viii] BCG, May 6, 2025. https://www.bcg.com/publications/2025/tech-banking-transformation-starts-with-smarter-tech-investment