Dimensions has a series of in-built categorisation systems that are used by funders and researchers around the world and were originally defined by field experts outside of Dimensions. These are:
Fields of Research (FOR) from Australia and New Zealand.
This system covers all areas of research, from Arts and Humanities to Science and Engineering, and is divided into 22 divisions, with 157 groups within these divisions. The FOR classification also defines a third, more detailed level, but this level is not present in Dimensions.
Research, Condition, and Disease Categorization (RCDC) from the USA.
This biomedical categorisation system is used by the NIH to report to Congress. It consists of 237 categories, some of which are very specific in topic (ataxia telangiectasia), while others are more general (neuroscience).
Health Research Classification System (HRCS) from the UK.
This system is used by a large number of health research funders in the UK, and is subdivided into the Research Activity Codes (RAC) and Health Categories (HC).
International Cancer Research Partnership (ICRP) CSO and Cancer Type codes.
The Common Scientific Outline or 'CSO' is a classification system organized into six broad areas of scientific interest in cancer research. The CSO is complemented by a standard cancer type coding scheme. The CSO is maintained by the International Cancer Research Partnership; further information on versions, on using the CSO, and on training guides is available at https://www.icrpartnership.org/cso
How are these categories constructed in Dimensions?
These standard categories are built in Dimensions as machine-learning emulations of the original categorisation systems. Briefly, a set of documents already coded by subject matter experts in a given system is fed into the Dimensions machine learning algorithm, which learns from these examples and is then used to categorise new documents automatically. The algorithms are refined by identifying false positives and false negatives, and once a sufficiently high level of accuracy has been achieved, the resulting definitions are used in Dimensions to automatically label all information coming into the system.
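As an illustration only, the train, check and label loop described above can be sketched in a few lines of Python. The actual Dimensions algorithm is not described here, so this sketch substitutes a toy Naive Bayes text classifier. The class name, function names, example documents and the two category labels are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (deliberately crude)."""
    return text.lower().split()


class NaiveBayesCategoriser:
    """Toy multinomial Naive Bayes; a stand-in, NOT the Dimensions algorithm."""

    def fit(self, documents, labels):
        # Learn from documents that experts have already coded.
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(documents, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Return the category with the highest log-probability.
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(doc_count / total_docs)
            for word in tokenize(text):
                if word in self.vocabulary:  # ignore unseen words
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def error_counts(model, documents, labels, category):
    """Count false positives and false negatives for one category."""
    fp = fn = 0
    for text, truth in zip(documents, labels):
        predicted = model.predict(text)
        if predicted == category and truth != category:
            fp += 1
        elif predicted != category and truth == category:
            fn += 1
    return fp, fn


# Hypothetical expert-coded training documents (invented for illustration).
train_docs = [
    "tumour growth in breast cancer cells",
    "chemotherapy response in lung cancer patients",
    "neuron signalling in the brain cortex",
    "synaptic plasticity and memory in the brain",
]
train_labels = ["Cancer", "Cancer", "Neuroscience", "Neuroscience"]

# Held-out expert-coded documents used to look for false positives/negatives.
held_out_docs = ["cancer cells and tumour response", "memory and brain signalling"]
held_out_labels = ["Cancer", "Neuroscience"]

model = NaiveBayesCategoriser().fit(train_docs, train_labels)
fp, fn = error_counts(model, held_out_docs, held_out_labels, "Cancer")

# Once the error counts are acceptable, label new incoming documents.
new_label = model.predict("lung tumour chemotherapy")
print(fp, fn, new_label)
```

In practice the refinement step would be repeated, adjusting the training set or the model whenever the false positive or false negative counts are too high, before the classifier is applied to all incoming documents.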