Credibility is Enhanced Through Cross Links Between Different Data from Different Domains

Introducing Our Data Curators

Jun 8, 2021 4 min read

Karel Volckaert

As a consultant, what type of data do you usually work with?

I work at the intersection between strategy, finance and organisation. My usual dataset is quite broad - and sometimes unstructured. Oftentimes, the most decisive data are ones that cross domains: economic data coupled with environmental measurements, sociodemographic characteristics linked with online analytics.

If you were able to pick, what would be the ultimate dataset, or datasets that you would like to see in the Green Deal Data Observatory? And the Economy Data Observatory?

If I may venture that far, the interesting point is where these two data observatories meet. But high on my wishlist would be anything related to geospatial dispersion of environmental and climate data: land erosion, aerosols, solar incidence. From an economic perspective, my interest would go especially to - again - dispersion across regions or other geographical domains of, say, number of new enterprises, disposable income, tax incidence…

See our [case study](https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/) on connecting local tax revenues, climate awareness poll data and drought data in Belgium. — See our case study on connecting local tax revenues, climate awareness poll data and drought data in Belgium.

Why did you decide to join the challenge and why do you think that this would be a game changer for policymakers and for business leaders?

There is, both from an ecological and a societal point of view, an urgent need for open-access, real-time, trustworthy data to base decisions on. Ever since Kydland & Prescott’s analyses of “rules rather than discretion” and even earlier analyses of investment under uncertainty, the dynamic rules for optimal decision-making (including investment) require fast-response reliable data.

Do you have a favorite, or most used open governmental or open science data source? What do you think about it? Could it be improved?

Let me give one example: the AMECO annual macro-economic database is great for long-term historical analyses but its components ought to be real-time available. As an anecdote, as a fund manager in emerging markets we needed to anticipate macro-economic evolutions and in particular the manner in which capital markets anticipate these evolutions by adjusting foreign exchange rates or positioning themselves along yield curves. To some extent, we needed to predict what AMECO would tell us one year later by means of any real-time trustworthy assessments of the financial or economic situation. The latter data is what we would ideally have in an observatory.

To some extent, we needed to predict what [AMECO](https://ec.europa.eu/info/business-economy-euro/indicators-statistics/economic-databases/macro-economic-database-ameco/ameco-database_en) would tell us one year later by means of any real-time trustworthy assessments of the financial or economic situation. The latter data is what we would ideally have in an observatory. — To some extent, we needed to predict what AMECO would tell us one year later by means of any real-time trustworthy assessments of the financial or economic situation. The latter data is what we would ideally have in an observatory.

Is there a piece of information that recently surprised you? What was it?

I am currently working on water-related issues and came across a result reported in Nature Energy earlier this year that in more than one in ten hydropower stations, the extra warming from the dark surface of the water reservoir was enough to outbalance its “green” electricity generation potential, leading to no net climate benefits.

The researchers found that almost half of the reservoirs they surveyed took just four years to reach a net climate benefit. Unfortunately, they also found that 19% of those surveyed took more than 40 years to do so, and approximately 12% of them took 80 years—the average lifetime of a hydroelectric plant. Calculating the albedo-climate penalty of hydropower dammed reservoirs

Again: spatial distribution matters…

Photo: Kees Streefkerk, [Unplash License](https://unsplash.com/license) — Photo: Kees Streefkerk, Unplash License

From your experience, what do you think the greatest problem with open data in 2021 will be?

Trust. In a society where “value” and even “truth” is determined more by the amount of (web) links to a particular “fact” than by its intrinsic characteristics, we need to be able to trust data — open data because it’s open and “closed” data because it’s closed.

What can our automated data observatories do to make open data more credible in the European economic policy and climate change or mitigation community and be more accepted as verified information?

If I may refer to the previous answer: credibility is enhanced through cross-links between different data from different domains that “does not disprove” one another or that is internally consistent. If, say, data on taxable income goes in one direction and taxes in another, it is the reasoned reconciliation of the - alleged or real - inconsistency that will validate the comprehensive data set. So I am a great believer in broad, real-time observatories where not only the data capture, but the data reconciliation is automated, sometimes by means of a simple comparative statics analysis, in other cases maybe through quite elaborate artificial intelligence.

Join us

Join our open collaboration Green Deal Data Observatory team as a data curator, developer or business developer. More interested in antitrust, innovation policy or economic impact analysis? Try our Economy Data Observatory team! Or your interest lies more in data governance, trustworthy AI and other digital market problems? Check out our Digital Music Observatory team!