statcodelists: use standard, language-independent variable codes to help international data interoperability and machine reuse in R
An R data package with all the SDMX standard codelists
The goal of statcodelists is to promote the reuse and exchange of statistical information and related metadata by making the internationally standardized SDMX code lists available to R users. SDMX (Statistical Data and Metadata eXchange) has been published as an ISO International Standard (ISO 17369). The metadata definitions, including the code lists, are updated regularly according to the standard. The authoritative version of the code lists made available in this package is https://sdmx.org/?page_id=3215/.
Cross-domain concepts in the SDMX framework describe concepts relevant to many, if not all, statistical domains. SDMX recommends using these concepts whenever feasible in SDMX structures and messages to promote the reuse and exchange of statistical information and related metadata between organisations.
Code lists are predefined sets of terms from which some statistical coded concepts take their values. SDMX cross-domain code lists are used to support cross-domain concepts. What are these cross-domain coded concepts?
- Geographical codes, like NL for the Netherlands in the CL_AREA code list.
- Standard industry codes, like J631 for Data processing, hosting and related activities in NACE Rev 2, which is used in Europe. (Beware: it is J592 in Australia and New Zealand, see CL_ACTIVITY_ANZSIC06.)
- Occupations, like Database designers and administrators in CL_OCCUPATIONS.
- Time formatting standards, like CCYY for annual data series in CL_TIME_FORMAT.
Check out the available code lists on the package homepage.
Using common code lists helps users work more efficiently: it eases maintenance and reduces the need for mapping systems and interfaces that deliver data and metadata to them. An obvious advantage of the code systems is that you can retrieve data from national sources regardless of the natural language used in North Macedonia, Japan, the U.S. or the Netherlands. While the data labels may change to be locally human-readable, computers and geeks can read the codes and understand them immediately, provided that the standard codes are used.
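The idea of matching on codes rather than on labels can be sketched in base R. Everything in this sketch is hypothetical and hand-typed for illustration: the small area_codes table is not the real CL_AREA code list, and the obs data frame stands in for data retrieved from two sources that label countries in different languages.

```r
# Hypothetical, hand-typed excerpt of an area code list (NOT the real CL_AREA).
area_codes <- data.frame(
  id   = c("NL", "JP"),
  name = c("Netherlands", "Japan"),
  stringsAsFactors = FALSE
)

# Hypothetical observations: the labels differ by source language,
# but the standard code is the same.
obs <- data.frame(
  geo_label = c("Nederland", "Netherlands", "Japon", "Japan"),
  geo_code  = c("NL", "NL", "JP", "JP"),
  value     = c(10, 12, 20, 21),
  stringsAsFactors = FALSE
)

# Join on the code, not on the label, so the source language does not matter.
harmonised <- merge(obs, area_codes, by.x = "geo_code", by.y = "id")

# Aggregate by code: rows with different local labels end up in the same group.
totals <- aggregate(value ~ geo_code, data = harmonised, FUN = sum)
print(totals)
```

Grouping by geo_label instead would split each country into two groups, which is the kind of mapping work that shared codes are meant to avoid.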
Our data observatories are rolling out SDMX coding across all datasets to help data ingestion and interoperability, data findability and data reuse.
statcodelists can help you use standard SDMX codes in your R workflow, both for downloading data from statistical agencies and for producing publication-ready datasets that the rest of the world (and even APIs) will understand.
You can install statcodelists from CRAN:
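A standard CRAN installation should look like the sketch below. The last line uses base R's data() to list the datasets shipped with the installed package; it is assumed here to be a convenient way to browse the available code lists, and is not taken from the package documentation.

```r
# Install the released version from CRAN (requires an internet connection)
install.packages("statcodelists")

# Attach the package
library(statcodelists)

# List the datasets that the installed package provides
data(package = "statcodelists")
```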
Further recommended code values for expressing general statistical concepts like not applicable, etc., can be found in section Generic codes of the Guidelines for the creation and management of SDMX Cross-Domain Code Lists.
The creator of this package is not affiliated with SDMX, and this package has not been endorsed by SDMX.
Code of Conduct
Please note that the statcodelists project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.