A data dictionary describes your data. It describes the choices made about column names, codes, methods, or sampling. It enables anyone to better find, understand, reuse, and manage that data.
The benefits of a data dictionary depend on your relationship to the data and what you are trying to achieve.
Data dictionaries:
Downloadable data dictionaries in CSV and PDF formats serve the basic needs of data analysts. That is, if the data is readily available, then a data dictionary in CSV format will typically give them the information they need to confidently analyse your data.
However, these data dictionaries are merely static 'codebooks'. CSV formats are, at least, both human and machine readable. But these 'codebooks' can result in a lack of standardisation, as each analyst or data producer creates data dictionaries in their own style.
There are also fantastic software products that can now 'intelligently' (digitally) bring together and maintain your data in interactive data catalogues. These will help your organisation keep track of your data, help others find your data, and link your data to similar data published by other organisations.
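To illustrate what 'machine readable' means in practice, here is a minimal sketch in Python that loads a CSV data dictionary and checks whether every column in a dataset is documented. The column headings (`variable`, `label`, `type`, `allowed_values`) and the example variables are assumptions made for this illustration, not a prescribed layout.

```python
import csv
import io

# Hypothetical data dictionary in CSV form; the headings and rows are illustrative only.
DICTIONARY_CSV = """variable,label,type,allowed_values
age_group,Age group of respondent,code,1|2|3
region,Region of usual residence,text,
income,Annual personal income (NZD),integer,
"""


def load_dictionary(text):
    """Return a mapping of variable name -> data dictionary entry."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["variable"]: row for row in reader}


def undocumented_columns(dataset_columns, dictionary):
    """List the dataset columns that have no entry in the data dictionary."""
    return [col for col in dataset_columns if col not in dictionary]


dictionary = load_dictionary(DICTIONARY_CSV)

# 'ethnicity' is in the dataset but not described in the dictionary.
missing = undocumented_columns(["age_group", "region", "ethnicity"], dictionary)
print(missing)
```

Because each producer is free to choose different headings, a check like this only works for dictionaries that share the same layout, which is the standardisation problem described above.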
If every organisation maintained data dictionaries and catalogues like this, then the findability and accessibility of Aotearoa NZ data would be phenomenal.
Software products like Colectica can help you to create these standardised, maintainable, and interactive data inventories. Furthermore, these products follow metadata standards from the Data Documentation Initiative (DDI). Following these international standards improves the interoperability of all our metadata and data.
The Data Documentation Initiative
These standards and software products also improve the findability of your data. For instance, a basic search on a data catalogue often looks for your search term in only the title and description. But I am sure you are familiar with titles or descriptions that make no sense or use unexpected words for certain ideas. This is common, and as a result such searches often fail to find all of the relevant data.
In comparison, rather than relying on a match between your search term and the title and description, these technologies and standards allow you to search for a concept or a theme across all of the metadata.
Searching for 'adolescent' would then discover any data related to, linked by, or containing variables related to the concept of 'adolescent'.
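The difference between the two kinds of search can be sketched in a few lines of Python. This is a toy illustration only: the catalogue entries, the `Dataset` structure, and the concept tags are invented for the example, and it does not represent how DDI or any particular product implements search.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    title: str
    description: str
    # Hypothetical variable-level metadata: variable name -> set of concept tags.
    variables: dict = field(default_factory=dict)


# Invented catalogue entries, for illustration only.
CATALOGUE = [
    Dataset("Youth wellbeing survey 2019", "Survey of people aged 12 to 18.",
            {"age": {"adolescent"}, "school_year": {"education"}}),
    Dataset("Adolescent health indicators", "Health measures for adolescents.",
            {"bmi": {"health"}}),
    Dataset("Household labour force", "Quarterly employment statistics.",
            {"hours_worked": {"employment"}}),
]


def basic_search(term, catalogue):
    """Match the term against the title and description only."""
    term = term.lower()
    return [d.title for d in catalogue
            if term in d.title.lower() or term in d.description.lower()]


def concept_search(term, catalogue):
    """Match the term against title, description, and variable-level concepts."""
    term = term.lower()
    hits = []
    for d in catalogue:
        in_text = term in d.title.lower() or term in d.description.lower()
        in_concepts = any(term in concepts for concepts in d.variables.values())
        if in_text or in_concepts:
            hits.append(d.title)
    return hits


print(basic_search("adolescent", CATALOGUE))
print(concept_search("adolescent", CATALOGUE))
```

In this sketch the basic search misses the youth wellbeing survey because neither its title nor its description uses the word 'adolescent', while the concept search finds it through the concept attached to its `age` variable.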
You can see examples of this kind of concept-based search on the following websites:
The Question Variable database
The Closer Discovery database
If you want to explore DDI and its benefits for your organisation, the following documents have further explanation and examples. Unfortunately, they are only available in PDF and PPTX formats. If you need them in alternate formats for accessibility purposes, contact datalead@stats.govt.nz.
An introduction to DDI [PDF 947 KB]
What can DDI do for you? [PPTX 5.4 MB]
Question driven harmonisation of data – The variable cascade in practice [PDF 3.7 MB]
Data stewardship is the careful and responsible creation, collection, management, and use of data. Data dictionaries are one of the first, easiest, and best ways for an organisation to improve how it works with data and how data works for Aotearoa NZ.
If you’d like more information, have a question, or want to provide feedback, email datalead@stats.govt.nz.
Content last reviewed 11 January 2021.