Metadata Registries in NSW Govt

Hi champs,

I am looking at options for adopting a metadata registry for our team, with a view to perhaps sharing more widely within the cluster. We currently use a spreadsheet with a nifty Tableau front end, but I would like to improve interoperability and automated harvesting as simply as possible.

I’m aware that OEH’s Information Asset Register and data.nsw use CKAN, but I don’t know whether other departments are using any metadata registry tools, such as METeOR (AIHW), CKAN, Aristotle or Data61’s MAGDA (which powers data.gov.au).

Any tips appreciated.
With thanks, Drew


Hi Drew, outside of tools, are you investigating common metadata standards for interoperability between agencies? For instance, Dublin Core?

Hi James, I’m hoping that is handled by the tool, but if not then would definitely look at adding DC, AGLS etc.

Hi Drew

I’ll email you some further details. There is a group within the Dept of Communities and Justice that has recently implemented Aristotle for a specific metadata registry purpose (around KPIs).

As to your question, James: we in DCJ are looking at how we can reconvene a former FACS group called ISAG (the Information Standards Advisory Group), which was set up to develop various information standards, re-using existing standards where possible. ISAG had some participants from across government years ago, but that fell away.



Fabulous, thanks Mark! I’m keen to see how they fared in terms of usability & implementation.

Hi Drew.
At Sydney Water we use Collibra. Happy to give you a demo and some feedback if you are interested…

Thanks Juanita, I’ll drop you a line outside the forum.

Ta, Drew

Hi Mark,

Do you know if there is any inter-agency group for metadata standards? We also had a group here for standards that we are looking at re-convening, the Enterprise Information Working Group.

Hi Drew

In the projects I work on we implement linked data principles and tooling to create and manage the metadata and then provide an endpoint for a registry to consume to support sharing.

In short, have a look at the work being implemented at the Geological Survey of Queensland (GSQ). It’s an excellent and achievable model for both sides of the process. Moreover, the whole approach has been written up for others to use.

A colleague formerly from Geoscience Australia was the architect of this approach in a project GSQ engaged Geoscience Australia to undertake. The team at GSQ are now up and running with the methodology under their own steam.

As a bit more background…

The metadata management side of things has good options for tools (free and paid), base standards, domain-standard customisation and interchange mechanisms. The GSQ work is a good mix of tools working together that suits the constraints of small teams and agencies at an early stage of governance maturity. The main challenges are twofold: 1. getting buy-in and everyone on board to implement it, and 2. the learning curve (for key people) and/or access to resources who can implement linked data (i.e. RDF) infrastructure. Once those aspects are on a roadmap, modelling, taxonomies, vocabularies, domain-specific standards and linking your data are not trivial, but not that hard to get started with, and there are some easy early wins to be had. All three things (buy-in, skills and depth of standards) mature well over time.

On the registry side of things it is more of a mixed bag, with a different set of challenges to weigh up. There are only a small number of free options, and these require a degree of work to implement (though no more than CKAN or MAGDA). They are certainly achievable for small teams with commitment, but they do need a good roadmap for the two challenges above.
Then there are paid registry solutions, which are quite expensive. However, if the compliance requirements and risks are high, that is a price which must ultimately be confronted to meet those requirements, and we see many more organisations, particularly Commonwealth agencies, agreeing that a linked data, web-enabled approach is the way forward for this compliance.

The reality for most agencies at state level, and even Commonwealth level, is that they are at the beginning of the journey, so the GSQ approach is a good model to kick off with.

Just a quick additional note of detail, to help place RDF and Dublin Core in context.

At the lowest level, linked data is based on the W3C standards OWL and RDF, and is supported by a range of tools (free and commercial) that fully support those standards.

Any domain can be modelled using OWL/RDF, and by adopting this base standard any data can be used, linked and shared with any party that also adopts it. It also lets you re-use and link to other standards, e.g. whole-of-government taxonomies, without vendor lock-in. This is the main strength of the approach.
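To make the triple model concrete, here is a minimal sketch in plain Python; the namespaces and dataset names are invented for illustration (only the Dublin Core Terms namespace URI is real), and a real project would use an RDF library such as rdflib and serialise to Turtle rather than hand-rolling this:

```python
# Illustration of the RDF triple model: every statement is a
# (subject, predicate, object) triple, with URIs naming the things linked.
DCT = "http://purl.org/dc/terms/"      # Dublin Core Terms namespace (real)
EX = "http://example.org/dataset/"     # hypothetical agency namespace

triples = [
    (EX + "water-quality-2023", DCT + "title", "Water quality samples 2023"),
    (EX + "water-quality-2023", DCT + "creator", EX + "agency/example"),
    (EX + "water-quality-2023", DCT + "issued", "2023-07-01"),
]

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Because predicates are shared URIs, any party adopting the standard can
# ask the same question of any graph, e.g. "find every title":
titles = match(triples, p=DCT + "title")
```

Pattern matching over shared-URI triples is the whole trick: two agencies that have never coordinated schemas can still query each other’s data, because the predicate URIs are the agreed vocabulary.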

@james.bibby mentioned Dublin Core. This is just one such standard, using OWL/RDF as the underlying interoperability mechanism. By itself, Dublin Core only provides a specific set of “attributes” (properties) and classes to describe data: a relatively narrow domain of applicability, though reasonably extensive (deep) within that domain. It is aimed at libraries and managers of catalogues, lists and collections of items, with properties such as author, title and published date, plus some more abstract ones. This may be enough for some things, but it will quickly run out of expressiveness.

So for other domains, and especially for describing very specific data objects and properties (e.g. some aspect of water sampling data), more work is required: extending Dublin Core, finding other standards and customising them to define your domain-specific properties.
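As a sketch of that extension pattern (the water-sampling namespace and property names below are entirely made up; only the Dublin Core Terms namespace is real), a single record can carry Dublin Core properties alongside domain-specific ones:

```python
# One record mixing Dublin Core metadata with hypothetical
# domain-specific water-sampling properties.
DCT = "http://purl.org/dc/terms/"                 # real DC Terms namespace
WS = "http://example.org/vocab/water-sampling/"   # invented domain vocabulary

record = {
    DCT + "title": "Sample site 42, dissolved oxygen",
    DCT + "created": "2024-03-05",
    WS + "siteId": "42",                # not expressible in Dublin Core alone
    WS + "dissolvedOxygenMgL": 8.4,     # domain-specific measurement property
}

# A generic catalogue only needs to understand the DC subset;
# domain tooling picks up the rest.
dc_fields = {k: v for k, v in record.items() if k.startswith(DCT)}
domain_fields = {k: v for k, v in record.items() if k.startswith(WS)}
```

The namespacing is what keeps the two vocabularies from colliding: each consumer reads the properties it understands and ignores the rest.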

Happy to chat more if you have any questions



Hi @james.bibby and @mark.holdsworth

Check out the Australian Linked Data Working Group.

Even if the linked data approach isn’t adopted, there is a range of whole-of-government controlled vocabularies and data vocabularies that are useful. There is also plenty of material from which to learn the principles and methods of standards development and re-use, with a proven track record. That said, achieving a similar level of interoperability, re-use and provenance outside the linked data toolset is generally not possible.




Hi @Simon.Opper,

I would characterize the applicability of Dublin Core the other way around: it is not at all deep, but it is broad - hence its usefulness for metadata.

We have good examples of this, where we have successfully applied it across a diverse range of data sets to provide consistency, making it easy to automate data discovery and cataloguing. Essentially we apply Dublin Core to describe the data, and then domain-specific data standards for the data itself.

See attached visualization.

Note that while it’s shown here at record level, we apply it up to data set, repository and service level as well.
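A rough sketch of how that split plays out (the catalogue entries and payload standard names here are invented examples): discovery only ever reads the common Dublin Core wrapper, so it works the same way across data sets whose payloads follow completely different domain standards.

```python
# Hypothetical catalogue: every entry shares the same DC-style wrapper,
# while the payload follows whatever domain standard fits the data.
datasets = [
    {"dct:title": "Rainfall gauges", "dct:subject": "hydrology",
     "payload_standard": "WaterML"},
    {"dct:title": "Borehole logs", "dct:subject": "geology",
     "payload_standard": "GeoSciML"},
]

def discover(catalogue, subject):
    """Discovery touches only the common wrapper, never the domain payload."""
    return [d["dct:title"] for d in catalogue if d["dct:subject"] == subject]
```

Because the discovery step never inspects the payload, adding a new domain standard to the catalogue requires no change to the harvesting code.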

Hi James

I think this is a case of ‘it depends’ on the use case and requirements. To be clear, I’m not saying it is not useful. My main point was that it is one of a number of standards which can be combined to make a very expressive set of linked models and queries. Adopting Dublin Core and SKOS is a fundamental step onto the data governance pathway, but being aware of their limitations, and having a view to future needs, is the main insight I’m keen to share with anyone on the journey.

Dublin Core allows a level of specialisation/extension of its classes, suiting a wide range of use cases and meeting a set of required competencies. But using extensions of only Dublin Core classes and properties will necessarily limit the complexity (competency) of the queries you can then ask of your data. For example:

  • Can I get the provenance of what events have occurred on that piece of data over time, e.g. between applying standards or data quality fixes? No.
  • Can I understand the spatio-temporal relation of data at different scales (data cubes, content negotiation and link-sets)? No.
  • Can I query what unit of measure is used and how it relates to another? No.
  • Is there a statement of quality? No.
  • As a user from viewpoint 1, can I query what data is relevant to my data model (and the same for users 2 to n)? No.

Each of these facets has an extensive set of standards, vocabularies and ontologies which give much greater ability to develop metadata and to query it, e.g. PROV-O, QB4ST, SOSA and SHACL.
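To illustrate the provenance case: PROV-O models this properly with entities, activities and wasDerivedFrom relations, and the sketch below is a much-simplified stand-in with invented dataset and activity names, just to show the kind of question plain Dublin Core cannot answer:

```python
# Simplified provenance records, loosely in the spirit of PROV-O:
# each tuple is (entity, activity_that_produced_it, entity_it_was_derived_from).
prov = [
    ("dataset-v2", "quality-fix-2024-01", "dataset-v1"),
    ("dataset-v3", "standard-migration", "dataset-v2"),
]

def lineage(entity, prov):
    """Walk derivation links back to the original entity."""
    derived_from = {e: src for e, _activity, src in prov}
    chain = [entity]
    while chain[-1] in derived_from:
        chain.append(derived_from[chain[-1]])
    return chain
```

In a real linked data stack this walk is a SPARQL property-path query over `prov:wasDerivedFrom`; the point is that the question is only answerable at all because the derivation events were captured in a provenance vocabulary.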

If these more detailed data queries don’t need answering, happy days: a lean, minimal approach is the ticket (though you’d be surprised what value the others can easily bring). But for many projects of high importance they definitely are a requirement, and this is where this bunch of cool standards can really be leveraged :muscle:

Cheers mate


Hi @Simon.Opper. Great, so we agree on the main point “it is one of a number of standards which can be combined”.

I would advocate for Dublin Core to be used for the metadata component of any service/standard we develop/adopt.

When working for SWIFT in the financial standards space, I saw the organisation come to the realization, after 40 years, that the standards “wars” would never be won. ISO 20022 was a recognition that the focus needed to be on interoperability. It provides a common data dictionary, used by many standards initiatives, to ensure that as data passes from one standard to another along the supply chain it can be reliably mapped to the same concepts. That notion of a supply chain is a key part of the thinking.

The business payload, the data that you actually want to work with, should indeed be based on standards specific to the knowledge domain that the data relates to. By combining the two we can do the complex queries you mention, while making discovery, the first step of the process, easier.


Hi folks,

A colleague put me onto the FAIR data principles over coffee today.

Thought I’d share it on…

For an Australian view on it, ANDS has some info.

Are others familiar with it ?

Its intent is quite familiar to me, though I hadn’t heard of it before…

  • Data should be Findable
  • Data should be Accessible
  • Data should be Interoperable
  • Data should be Re-usable

The implementation of these concepts is the challenging and interesting bit for us in our work. It’s only one way to skin this problem of course… but it’s a great example encapsulating the problem areas.

It’s also very closely tied to the linked data community, through the paradigm of unique, re-usable identifiers (URIs) and open exchange protocols (i.e. RDF), and it leads nicely into the benefits of standards such as Dublin Core. Hence my posting here.

Looking around the related work, I also found a very neat standards discovery tool based on FAIR that I like a lot. It would be great to see many of the metadata typologies expressed in the FAIR principles applied in Australian data catalogues :slight_smile:




Thanks @Simon.Opper and @james.bibby for your informed and generous replies.

Agreed that RDF/OWL is a preferred foundation for interoperability, and DC fits the bill on that front. Does AGLS still get any love?


Hi Drew

AGLS seems as loved as any metadata standard :wink: All depends on whom you are talking to and what the purpose is…

AGRIF is a more recent and complementary vocabulary to AGLS, worth checking out.

Additionally, a growing ‘family’ of Australian government vocabularies is Longspine. My group is building a lot of records management and data catalogue capability based on it.

Though the National Archives has expressed the possibility of a different records management vocabulary in the future. So maybe at some stage a more expressive national vocab is to come… we will see.



Hi Simon, thanks for the pointer to AGRIF - it looks like it will be a welcome leap into the semantic web for records management. A more current URL looks to be

LongSpine looks really useful, e.g. to help consistency across cross-walks.

Thanks again, Drew

© Data.NSW