
The Data Lineage Problem in Sustainability Reporting

  • Feb 23
  • 5 min read
Visual representation of complex data lineage

TL;DR: As assurance expectations move toward the rigour of financial reporting, the gap between a well-maintained spreadsheet workflow and a genuinely traceable data architecture is becoming harder to paper over with documentation and sign-offs.

The solution is structural, not procedural - building systems where the audit trail constructs itself rather than relying on a person to reconstruct it on request.


Sustainability teams have good data. The problem is that most reporting systems were never built to prove it.


---


Sustainability teams are, in many organisations, some of the most analytically rigorous people in the building. The challenge they face is not one of capability. It’s that the systems they work inside were built for a different version of the job - one where the audit trail was a nice-to-have rather than a legal requirement.


That distinction matters more and more.


  • In Workiva's 2024 survey [1] of more than 2,000 finance, sustainability, and risk professionals, 98% said they were confident in the accuracy of their sustainability data.

  • Yet 83% said that collecting data to fulfil mandatory reporting requirements would be a significant challenge for their organisation.


On the surface, those two statistics seem contradictory. In fact, they describe the same situation from two different angles: the data that teams have built and trust is good, but the systems that produced it were not built for what the assurance process now requires.


The assurance bar has moved; data traceability is now table stakes.


Sustainability professionals in Aotearoa know the landscape well. The XRB’s mandatory climate standards, the ISSB frameworks, the growing expectation of independent assurance… none of this is news to teams that have been working in this space for years.


What’s changing is not the existence of assurance, but the technical specificity of what assurance providers are starting to ask for.


Auditors assessing ESG disclosures are increasingly adopting SOX-style financial controls logic: clear data ownership, defined evidence requirements, and direct traceability between what appears in a disclosure and the source documentation that supports it. [2]

That last part - direct traceability between disclosure and source - is where many reporting architectures, even well-maintained ones, run into friction.

The Deloitte 2024 Sustainability Action Report, which surveyed 300 senior executives, found that 57% identified data quality as their single biggest challenge in sustainability reporting, with 88% ranking it among their top three concerns. [3]


These are not people who do not know their data. These are people who know their data well, and are finding that knowing it is not the same as being able to prove the chain of custody in the way an assurance process demands.


What data lineage (actually) means at a system level


Data lineage is the unbroken, machine-readable record of where a piece of data came from, what was done to it, and what it is now connected to. In financial reporting, this is a solved problem - your general ledger maintains it automatically, and any figure in your accounts can be traced to its source transaction.


In sustainability reporting, this problem is largely unsolved - not because teams have not tried, but because the underlying system architecture was never designed for it. [4]


Consider what needs to be true for a single Scope 1 figure in a disclosure to have genuine, auditable lineage:


  1. The raw energy or fuel reading needs to be traceable to a specific meter or telemetry source.


  2. That source needs to be mapped to a reporting boundary, as defined by a specific version of the GHG Protocol Corporate Standard.


  3. The emission factor applied needs to be documented - which factor, which version, sourced from where.


  4. If an estimate was used because primary data was unavailable, the estimation methodology needs to be recorded and defensible.


  5. And all of this needs to be stored in a form that an assurance provider can interrogate without needing to ask a person to manually reconstruct it.
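
The five requirements above can be sketched as a data structure. The following is a minimal illustration in Python, not a description of any particular product. Every class name, field name, and example value (the meter ID, the factor value, the version strings) is a hypothetical placeholder chosen for the sketch.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class EmissionFactor:
    name: str                # which factor (requirement 3)
    version: str             # which version
    source: str              # sourced from where
    kg_co2e_per_unit: float


@dataclass(frozen=True)
class Reading:
    source_id: str                  # specific meter or telemetry source (requirement 1)
    quantity: float
    unit: str
    boundary: str                   # reporting boundary the source maps to (requirement 2)
    boundary_standard_version: str  # version of the standard defining that boundary
    # Recorded only when primary data was unavailable (requirement 4).
    estimation_method: Optional[str] = None


@dataclass(frozen=True)
class Scope1Figure:
    reading: Reading
    factor: EmissionFactor

    @property
    def kg_co2e(self) -> float:
        return self.reading.quantity * self.factor.kg_co2e_per_unit

    def lineage(self) -> dict:
        """Return the full trail as plain data an auditor's tooling could query (requirement 5)."""
        trail = asdict(self)
        trail["kg_co2e"] = self.kg_co2e
        return trail


# Illustrative values only: 2.5 is a made-up factor, not a published one.
figure = Scope1Figure(
    reading=Reading(
        source_id="meter-001",
        quantity=1000.0,
        unit="litre",
        boundary="operational-control",
        boundary_standard_version="hypothetical-version-label",
    ),
    factor=EmissionFactor(
        name="diesel",
        version="v1",
        source="hypothetical-factor-dataset",
        kg_co2e_per_unit=2.5,
    ),
)
trail = figure.lineage()
```

The point of the sketch is that the figure cannot exist without its reading and factor attached, so the trail is a by-product of the calculation rather than a document written afterwards.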


In most reporting environments today, parts of that chain exist in different systems: the meter data in a building management system, the boundary definitions in a spreadsheet, the emission factors in a methodology document, the estimation decisions in someone's institutional memory. When an auditor pulls on the thread, a human being has to walk the path manually.


That works until the person who knows the path leaves. It also works until the assurance standard rises to the point where "a person walked me through it" is no longer sufficient.


Why this is an engineering problem, not a process problem


The standard response to assurance pressure has been to add more process. More sign-off steps. More documentation requirements. More rows in the spreadsheet tracking who approved what. These are reasonable responses within the constraints of existing systems, and the teams implementing them are doing diligent work.


The limitation is that a process layered on top of a poorly connected data architecture does not produce data lineage. It produces a paper trail that approximates lineage - one that requires manual maintenance, is vulnerable to gaps when things change, and does not scale gracefully as the reporting scope expands.


What produces genuine lineage is a system where the connections between a disclosure figure and its source data are structural, not narrative. Where the path from a meter reading to a reported number is encoded in the system itself, not documented after the fact in a supporting workbook. When a methodology changes or an emission factor is updated, the system propagates that change through every connected calculation automatically, and flags every figure that is affected.
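
One way to picture "structural, not narrative" is a dependency graph in which every calculation records what it was built from. The sketch below is an assumption-laden illustration in Python: the `LineageGraph` class and the node identifiers are hypothetical, and a production system would also need versioning and persistence. It shows only the flagging step, where a changed emission factor identifies every downstream figure it touches.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph from upstream inputs to the calculations that use them."""

    def __init__(self) -> None:
        self._dependents = defaultdict(set)

    def add_dependency(self, upstream: str, downstream: str) -> None:
        self._dependents[upstream].add(downstream)

    def affected_by(self, changed: str) -> set:
        """Breadth-first walk returning everything downstream of a changed node."""
        seen = set()
        queue = deque([changed])
        while queue:
            node = queue.popleft()
            for dependent in self._dependents.get(node, ()):
                if dependent not in seen:
                    seen.add(dependent)
                    queue.append(dependent)
        return seen


# Hypothetical node names for illustration.
graph = LineageGraph()
graph.add_dependency("factor:diesel", "calc:site-a-diesel")
graph.add_dependency("meter:site-a-01", "calc:site-a-diesel")
graph.add_dependency("calc:site-a-diesel", "figure:scope1-total")
graph.add_dependency("meter:site-b-01", "calc:site-b-gas")
graph.add_dependency("calc:site-b-gas", "figure:scope1-total")

flagged = graph.affected_by("factor:diesel")
```

Here `flagged` contains the site A calculation and the Scope 1 total, but not the site B gas calculation, which does not depend on the diesel factor.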


This is not a novel concept in data engineering. It’s standard practice in financial systems, in clinical trial data management, in any domain where the chain of custody of a data point is a regulatory requirement. The specific challenge in sustainability is that the data landscape is more heterogeneous - it crosses organisational boundaries, connects structured and unstructured information, and sits at the intersection of physical measurement, regulatory logic, and scientific methodology in ways that financial data does not.


That intersection is where DataLoom's work sits: building the connective layer that makes sustainability data lineage automatic rather than manual, without requiring organisations to rebuild their existing systems from scratch. The people doing the reporting stay in control of the analysis and the judgement - what changes is whether the audit trail builds itself or relies on a person to hold it together.


The practical question for right now


A useful self-assessment for any sustainability team: pick your three most material figures from last year's report. For each one, ask whether a new team member or an external auditor could independently reconstruct the full path from that number back to its source data, without anyone having to explain it from memory.


If the answer is yes for all three, you are ahead of most. If the answer involves phrases like "you'd need to talk to Jamie about that" or "it's in the methodology notes document, but those aren't linked to the figures themselves," you have identified the specific version of this problem that matters for your organisation.
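
Where the links between figures and their inputs are recorded at all, the same self-assessment can be run mechanically. The following Python sketch assumes a simple mapping from each node to its recorded inputs. The function name, the mapping, and the node identifiers are all hypothetical. It walks back from a reported figure and lists every point where the recorded path stops before reaching a known raw source.

```python
def find_lineage_gaps(figure, upstream, sources):
    """Walk back from a figure and return dead ends that are not raw sources."""
    gaps = []
    seen = set()
    stack = [figure]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        parents = upstream.get(node, [])
        # A node with no recorded inputs is fine only if it is itself a source.
        if not parents and node not in sources:
            gaps.append(node)
        stack.extend(parents)
    return sorted(gaps)


# Hypothetical example: the gas estimate has no recorded method or source.
upstream = {
    "figure:scope1-total": ["calc:fleet-diesel", "calc:site-gas"],
    "calc:fleet-diesel": ["meter:fuel-card-feed", "factor:diesel"],
    "calc:site-gas": ["estimate:site-gas"],
}
sources = {"meter:fuel-card-feed", "factor:diesel"}

gaps = find_lineage_gaps("figure:scope1-total", upstream, sources)
```

In this example `gaps` holds the single entry `estimate:site-gas`, which is the structural equivalent of "you'd need to talk to Jamie about that".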


The gap is closeable. The direction of travel is clear. The question is whether it gets closed proactively, before assurance expectations tighten further, or reactively, when an assurance provider asks a question that takes three weeks to answer properly.


---


DataLoom builds automated data lineage infrastructure for sustainability reporting - connecting operational data sources, regulatory frameworks, and emission methodologies into a system where audit trails build themselves. If the problem described here sounds familiar, feel free to get in touch.


---


Sources referenced:


  1. Workiva 2024 Sustainability Practitioner Survey (n=2,000+): https://www.workiva.com/resources/2024-sustainability-practitioner-survey 

  2. DFIN - ESG Trends from 2025 and What to Expect in 2026: https://www.dfinsolutions.com/knowledge-hub/blog/esg-trends-2025-and-what-expect-2026 

  3. Deloitte 2024 Sustainability Action Report (n=300 executives): https://www.deloitte.com/us/en/services/audit-assurance/articles/esg-survey.html 

  4. KEY ESG - Auditable ESG Reporting: https://www.keyesg.com/article/auditable-esg-reporting-staying-ahead-of-the-curve 


 
 