Functions Syntax

The DSL contains several functions that can be used to perform non-search operations. For example, functions can be used to check if an object exists, or to automatically extract categories from a piece of text.

Note

The DSL supports UTF-8 encoded search terms. See also the FAQs.

Basic functions structure

Each function call starts with a function name followed by parentheses, between which the arguments are placed, separated by commas.

For example, calling function extract_grants looks like this:

extract_grants  ("R01HL117329", fundref="100000050")
-------------------------------------------------------
<function name>:(positional arguments, named arguments)

The DSL supports two types of arguments:

Positional Arguments

Positional arguments are bare values given without a name. They are entirely optional and can be replaced by named arguments. Their main purpose is to simplify calls to functions that take just one or two arguments, by letting you omit the argument name.

Named Arguments

Named arguments always come after positional arguments, and their order is not important. In the example above, one could also write extract_grants(fundref="100000050", grant_number="R01HL117329").

Arguments can be of various types, for example string or integer. See below for the full list of supported functions.
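The call structure described above can be sketched as a small Python helper that renders a function call as a DSL string, with positional arguments first and named arguments after them. The helper names dsl_literal and dsl_call are hypothetical and not part of the DSL, and the backslash escaping of quotes inside strings is an assumption rather than documented behaviour.

```python
def dsl_literal(value):
    """Render a Python value as a DSL argument literal (assumed syntax)."""
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, str):
        # Assumption: embedded quotes can be escaped with a backslash.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'
    raise TypeError(f"unsupported argument type: {type(value).__name__}")


def dsl_call(function_name, *positional, **named):
    """Build a call string: positional arguments first, then named ones."""
    parts = [dsl_literal(v) for v in positional]
    parts += [f"{name}={dsl_literal(v)}" for name, v in named.items()]
    return f"{function_name}({', '.join(parts)})"


print(dsl_call("extract_grants", "R01HL117329", fundref="100000050"))
print(dsl_call("extract_concepts", "text of abstract", return_scores=True))
```

Because Python keyword arguments can only follow positional ones, the helper cannot produce a call that places a named argument before a positional one.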

Function: classify

This function retrieves suggested codes for any text.

Note

Our classifier is optimised to work with titles and abstracts, that is, fairly short texts. When using it with longer texts, it may produce unexpected or no results.

Warning

When using the API to classify documents, you may sometimes find that a document is assigned categories that differ from those listed in Dimensions (if the document is already indexed there). This happens because pre-computed document classifications often derive from multiple data enrichment steps aimed at improving category precision, e.g. by looking at the overall topic area of a journal. As these enrichment steps are not carried out for API classifications, differences can arise even though both use the same classification algorithms.

Currently, the following classification systems are supported:

  • Fields of Research 2020 (FOR, FOR_2020)

  • Research, Condition, and Disease Categorization (RCDC)

  • Health Research Classification System Health Categories (HRCS_HC)

  • Health Research Classification System Research Activity Classifications (HRCS_RAC)

  • Health Research Areas (HRA)

  • Broad Research Areas (BRA)

  • ICRP Common Scientific Outline (ICRP_CSO)

  • ICRP Cancer Types (ICRP_CT)

  • Units of Assessment (UOA)

  • Sustainable Development Goals (SDG, SDG_2021)

Examples

classify(
    title="Burnout and intentions to quit the practice among community pediatricians: associations with specific professional activities",
    abstract="BACKGROUND: Burnout is an occupational disease expressed by loss of mental and physical energy due to prolonged and unsuccessful coping with stressors at work. A prior survey among Israeli pediatricians published in 2006 found a correlation between burnout and job structure match, defined as the match between engagement with, and satisfaction from, specific professional activities. The aims of the present study were to characterize the current levels of burnout and its correlates among community pediatricians, to identify changes over time since the prior survey, and to identify professional activities that may reduce burnout. METHODS: A questionnaire was distributed among pediatricians both at a medical conference and by a web-based survey. RESULTS: Of the 518 pediatricians approached, 238 (46%) responded to the questionnaire. High burnout levels were identified in 33% (95% CI:27-39%) of the respondents. Higher burnout prevalence was found among pediatricians who were not board-certified, salaried, younger, and working long hours. The greater the discrepancy between the engagement of the pediatrician and the satisfaction felt in the measured professional activities, the greater was the burnout level (p < 0.01). The following activities were especially associated with burnout: administrative work (frequent engagement, disliked duty) and research and teaching (infrequent engagement, satisfying activities). A comparison of the engagement-satisfaction match between 2006 and 2017 showed that the discrepancy had increased significantly in research (p < 0.001), student tutoring (P < 0.001), continuing medical education and participation in professional conferences (P = 0.0074), management (p = 0.043) and community health promotion (P = 0.006). A significant correlation was found between burnout and thoughts of quitting pediatrics or medicine (p < 0.001). 
CONCLUSIONS: Healthcare managers should encourage diversification of the pediatrician's job by enabling greater engagement in the identified anti-burnout professional activities, such as: participation in professional consultations, management, tutoring students and conducting research.",
    system="FOR")
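A call like the one above can also be assembled programmatically. The following is a minimal Python sketch, assuming the system codes listed earlier are the only accepted values and that embedded quotes can be escaped with a backslash. The names SUPPORTED_SYSTEMS, quote and build_classify_call are hypothetical helpers, not part of the DSL.

```python
# System codes taken from the list of supported classification systems.
SUPPORTED_SYSTEMS = {
    "FOR", "FOR_2020", "RCDC", "HRCS_HC", "HRCS_RAC", "HRA",
    "BRA", "ICRP_CSO", "ICRP_CT", "UOA", "SDG", "SDG_2021",
}


def quote(text):
    """Wrap text in double quotes (backslash escaping is an assumption)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_classify_call(title, abstract, system):
    """Return a classify(...) call string, rejecting unknown system codes."""
    if system not in SUPPORTED_SYSTEMS:
        raise ValueError(f"unsupported classification system: {system}")
    return (
        f"classify(title={quote(title)}, "
        f"abstract={quote(abstract)}, system={quote(system)})"
    )


print(build_classify_call("A short title", "A short abstract.", "FOR"))
```

Checking the system code locally avoids sending a request that is likely to be rejected.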

Arguments

Argument Name    Argument Type    Optional?
title            string
abstract         string
system           string

Function: extract_affiliations

Extract affiliations from either structured or unstructured input. Batch extraction is also available, for up to 200 input objects.

Unstructured Affiliation Matching

Single input

Find the GRID ID for an unstructured affiliation. If multiple affiliations are detected, they are split and returned as separate “affiliation_parts” along with any GRID IDs matched. As with structured matching, this is geared towards precision: whenever the results are ambiguous (more than one result is found) or fall below an internal confidence threshold, they are marked as requiring manual review; otherwise the result is considered correct.

For this type of affiliation matching, the user needs to provide just the affiliation argument as shown in the following example:

extract_affiliations(affiliation="university of oxford, uk")

Batch input

Find the GRID IDs for a batch of unstructured affiliations. The parameter is the same as for a single match (affiliation), but multiple entries can be specified in a JSON-encoded array of hashes. Each entry in the query array is returned in the result array along with any GRID institutes found. There is no result formatting; all results are returned in the basic format.

For this type of affiliation matching, the user needs to provide the json argument as shown in the following example:

extract_affiliations(json=[
    {"affiliation": "university of oxford, uk"}
])

The json argument can contain up to 200 individual objects, each of which must provide the affiliation parameter. The argument expects standard JSON syntax.
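When there are more than 200 affiliations to match, the input has to be split across several calls. A minimal Python sketch of that chunking, using the standard json module to produce the JSON array, might look as follows. The names MAX_BATCH_SIZE and affiliation_batches are hypothetical.

```python
import json

MAX_BATCH_SIZE = 200  # limit stated for the json argument


def affiliation_batches(affiliations, batch_size=MAX_BATCH_SIZE):
    """Yield extract_affiliations calls with at most batch_size inputs each."""
    for start in range(0, len(affiliations), batch_size):
        chunk = affiliations[start:start + batch_size]
        payload = [{"affiliation": text} for text in chunk]
        # ensure_ascii=False keeps UTF-8 characters unescaped in the query.
        body = json.dumps(payload, ensure_ascii=False)
        yield f"extract_affiliations(json={body})"


for query in affiliation_batches(["university of oxford, uk"]):
    print(query)
```

Using json.dumps rather than manual string concatenation ensures that quotes and other special characters inside affiliation strings are escaped according to standard JSON rules.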

Note

This function accepts an additional parameter, results, whose value can be basic (default), full or publisher. These options make it possible to restrict the returned metadata.

Structured Affiliation Matching

Single input

Find the GRID ID for a structured representation of an institute. This matching is geared towards precision so that the data can be used in an automated process with minimal human interaction. Whenever the results are ambiguous (more than one result is found) or fall below an internal confidence threshold, they are marked as requiring manual review; otherwise the result is considered correct. For best results, provide as much geographical information as possible in the available fields.

For this type of affiliation matching, the user needs to provide name, city, state and country arguments as shown in the following example:

extract_affiliations(name="Southwestern University", city="Georgetown", state="Texas", country="USA")

Batch input

Find the GRID IDs for a batch of structured data. The parameters are the same as for a single match (name, city, state and country), but multiple entries can be specified in a JSON-encoded array of hashes. Each entry in the query array is returned in the result array along with any GRID institutes found. There is no result formatting; all results are returned in the basic format.

For this type of affiliation matching, the user needs to provide the json argument as shown in the following example:

extract_affiliations(json=[
    {"name":"Southwestern University",
    "city":"Georgetown",
    "state":"Texas",
    "country":"USA"}
])

The json argument can contain up to 200 individual objects, each of which must provide all four parameters: name, city, state and country. The argument expects standard JSON syntax.
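The constraints on a structured batch (at most 200 objects, each with all four fields) can be checked before a query is sent. The following Python sketch illustrates one way to do this. REQUIRED_FIELDS and structured_batch_call are hypothetical names, and the checks reflect only the rules stated above.

```python
import json

REQUIRED_FIELDS = ("name", "city", "state", "country")


def structured_batch_call(records):
    """Validate structured records and return an extract_affiliations call."""
    if len(records) > 200:
        raise ValueError("a batch may contain at most 200 objects")
    for index, record in enumerate(records):
        missing = [field for field in REQUIRED_FIELDS if field not in record]
        if missing:
            raise ValueError(f"record {index} is missing: {', '.join(missing)}")
    return f"extract_affiliations(json={json.dumps(records, ensure_ascii=False)})"


print(structured_batch_call([
    {"name": "Southwestern University", "city": "Georgetown",
     "state": "Texas", "country": "USA"},
]))
```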

Arguments

Argument Name    Argument Type    Optional?
name             string
city             string
state            string
country          string
affiliation      string
json             json
results          string

Function: extract_concepts

Extract concepts from any text. Text input is processed and extracted concepts are returned as an array of strings ordered by their relevance. For background information about concepts, see also the searching with concepts section.

Optionally, the return_scores parameter may be set to true (the default is false) to include the relevance score of each concept.

Examples

extract_concepts("text of abstract")

Arguments

Argument Name    Argument Type    Optional?
text             string
return_scores    boolean

Function: extract_grants

Extract the Dimensions grant ID from the provided parameters. A grant number must be provided together with either a fundref or a funder name. The function returns an object with a key grant_id whose value is either the identified Dimensions grant ID or null, if no grant was identified.

This function also uses GRID data to identify grants in any subsidiaries of the provided funder. Additionally, it performs a partial match on grant_number, so its success rate in identifying a grant is expected to be higher than that of a search grants query.

Examples

extract_grants(grant_number="R01HL117329", fundref="100000050")
extract_grants(grant_number="HL117648", funder_name="NIH")
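The argument rule for this function (a grant number plus a funder identifier) can be enforced when building the call. The Python sketch below takes a conservative reading and requires exactly one of fundref or funder_name. Whether the DSL accepts both at once is not stated here, so that restriction is an assumption, as is the backslash escaping. The names quote and build_extract_grants_call are hypothetical.

```python
def quote(text):
    """Wrap text in double quotes (backslash escaping is an assumption)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_extract_grants_call(grant_number, fundref=None, funder_name=None):
    """Return an extract_grants(...) call with exactly one funder identifier."""
    # Assumption: supplying both identifiers is treated as an error here.
    if (fundref is None) == (funder_name is None):
        raise ValueError("provide exactly one of fundref or funder_name")
    if fundref is not None:
        funder_arg = f"fundref={quote(fundref)}"
    else:
        funder_arg = f"funder_name={quote(funder_name)}"
    return f"extract_grants(grant_number={quote(grant_number)}, {funder_arg})"


print(build_extract_grants_call("R01HL117329", fundref="100000050"))
print(build_extract_grants_call("HL117648", funder_name="NIH"))
```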

Arguments

Argument Name    Argument Type    Optional?
grant_number     string
fundref          string
funder_name      string