A Category in Dimensions represents a topic. Categories can be as broad as “cancer”, “engineering” or “basic research”; they can describe populations (indigenous people, teens); and they can cover new, yet-to-be-defined topics like “implementation research” as well as topics that are difficult to define (“advanced manufacturing”).

In Dimensions there are two main types of categories: those that belong to internationally recognised "classification systems" (sets of categories) that are pre-built into the system, including FOR, RCDC and HRCS categories, and those that can be created by the user (“My Categories”, not available to all users). When new data comes into the Dimensions system, it is automatically allocated to every category to which it belongs. This means that categories can quickly and easily be used to search Dimensions or to filter any list of results from a search.

Within Dimensions we have embedded a number of classification systems (sets of categories), and over time we will increase the number and scope of these. There are a number of advantages to using pre-existing classification systems:

  • You can use them in conjunction with keyword and abstract searches, or with other classification systems. For example, if you search for ‘Salt OR sodium’ and filter to HRCS Cardiovascular, you will only return grants or publications that fall into the ‘cardiovascular’ set as defined by HRCS and that also include the term ‘salt’ or ‘sodium’. This means you don’t always have to create a complete, complicated query when looking for specific sets of data.

  • They are maintained by subject matter experts, and their governance is externally controlled.

  • They generally come with guidance on usage, for example on the precise meaning and scope of a term and on the rules governing the number of terms per grant, so you know what to expect from the results.

  • Their usage is often already quite extensive.
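The ‘Salt OR sodium’ example above can be illustrated with a small sketch. It is not the Dimensions interface or API: the `Record` class, the `search` function and the sample records are all hypothetical, and the category labels are written as plain strings purely for illustration. It only shows the idea of a keyword match that is then narrowed by a category filter.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical stand-in for a grant or publication."""
    title: str
    abstract: str
    categories: set[str] = field(default_factory=set)


def matches_any_term(record, terms):
    """True if any search term appears as a whole word in the title or abstract."""
    text = f"{record.title} {record.abstract}".lower()
    words = set(re.findall(r"[a-z]+", text))
    return any(term.lower() in words for term in terms)


def search(records, terms, category=None):
    """Keyword search (terms are OR-ed), optionally filtered to one category."""
    hits = [r for r in records if matches_any_term(r, terms)]
    if category is not None:
        hits = [r for r in hits if category in r.categories]
    return hits


# Made-up sample data; the labels only mimic the style of real category names.
records = [
    Record("Dietary sodium and blood pressure",
           "A trial of reduced salt intake.",
           {"HRCS: Cardiovascular"}),
    Record("Salt marsh erosion",
           "Coastal sediment study.",
           {"FOR: Earth Sciences"}),
    Record("Statin adherence",
           "Cholesterol management in adults.",
           {"HRCS: Cardiovascular"}),
]

results = search(records, ["salt", "sodium"], category="HRCS: Cardiovascular")
titles = [r.title for r in results]
print(titles)
```

Without the category filter, the salt marsh record would also be returned, because it matches the keyword. The category does the narrowing that would otherwise need a longer, more complicated query.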

How are the categories created?

Standard categories are built in Dimensions by using machine learning to emulate the categorisation systems. This is done by taking a set of documents coded by subject matter experts in that system and feeding these into the Dimensions machine learning algorithm, which then uses what it has learned to automatically categorise new documents. The algorithms are refined by identifying false positive and false negative allocations, and once a high enough level of accuracy has been achieved, these definitions are used in Dimensions to automatically label all information coming into the system.
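The workflow described above (learn from expert-coded documents, count false positives and false negatives, refine, then label new documents) can be sketched as a toy example. This is not the Dimensions algorithm, which is not described here. The "model" below is a deliberately simple stand-in that remembers which words appear only in documents experts assigned to a category. All function names, the sample documents and the stop-word list are assumptions made for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case a text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def train(labelled_docs, stop_words=frozenset()):
    """For each category, keep the words seen only in documents coded with it."""
    seen_in = defaultdict(set)   # category -> words from documents with that category
    seen_out = defaultdict(set)  # category -> words from documents without it
    all_categories = set().union(*(cats for _, cats in labelled_docs))
    for text, cats in labelled_docs:
        terms = tokenize(text) - stop_words
        for cat in all_categories:
            if cat in cats:
                seen_in[cat] |= terms
            else:
                seen_out[cat] |= terms
    return {cat: seen_in[cat] - seen_out[cat] for cat in all_categories}


def categorise(model, text, min_hits=1):
    """Assign every category whose distinctive words appear in the text."""
    terms = tokenize(text)
    return {cat for cat, words in model.items() if len(terms & words) >= min_hits}


def evaluate(model, labelled_docs):
    """Count false positive and false negative allocations on held-out documents."""
    false_pos = false_neg = 0
    for text, expected in labelled_docs:
        predicted = categorise(model, text)
        false_pos += len(predicted - expected)
        false_neg += len(expected - predicted)
    return false_pos, false_neg


# Made-up documents "coded by experts" (illustrative labels only).
training = [
    ("Sodium intake and heart failure outcomes", {"Cardiovascular"}),
    ("Blood pressure control after stroke", {"Cardiovascular", "Stroke"}),
    ("Tumour growth in lung cancer patients", {"Cancer"}),
    ("Chemotherapy response in breast cancer", {"Cancer"}),
]
held_out = [
    ("Radiotherapy and tumour imaging", {"Cancer"}),
    ("Heart rhythm monitoring in athletes", {"Cardiovascular"}),
]

model_v1 = train(training)
errors_v1 = evaluate(model_v1, held_out)

# Refinement step: the errors come from common words, so exclude them and retrain.
model_v2 = train(training, stop_words={"and", "in", "after"})
errors_v2 = evaluate(model_v2, held_out)

new_labels = categorise(model_v2, "Stroke risk and blood pressure in older adults")
print(errors_v1, errors_v2, sorted(new_labels))
```

In this sketch the first model makes two false positive allocations because it treats “and” and “in” as distinctive words. After the refinement step the held-out documents are labelled correctly, and only then is the model used on a new document. A production system would use a far larger training set and a proper statistical model, but the loop of train, check errors, refine and apply is the same shape as the process described above.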