Frequently Asked Questions

Here are some frequently asked questions regarding Dimensions on BigQuery.

1. Access & Support

Can I access the Dimensions BigQuery data sets using my normal Dimensions account?

Unfortunately, no. Access to Dimensions on BigQuery is mediated through a Google Account, which is separate from the account used to access Dimensions. Dimensions on BigQuery is a paid service, additional to Dimensions Analytics.

Can I try out Dimensions BigQuery?

Yes, we have a sandbox environment that is freely available on Google Cloud. The dataset is updated daily and contains several hundred thousand documents related to COVID-19 research.

How can I get support?

Please send an email to supportbigquery@dimensions.ai. Provide as much information about your problem as possible, in particular:

What you are trying to do and what problem you are facing.
How we can reproduce the problem.
Any additional details that might be relevant for reproducing the issue, such as the BigQuery SQL for the query.

2. Schema

How often do you change the schema?

Minor changes to the schema, such as additional fields, may be released at any time. The Dimensions BigQuery data set is updated daily, and any changes are pushed out with the next daily release. Minor changes are guaranteed not to break, or change the underlying semantic intent of, existing queries written against the same major version of the schema. Additional fields may be added at the top level or within a nested structure, so take care if you rely on the exact structure of a nested record.
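Because minor releases may add fields, code that consumes query results tends to be more robust when it reads nested fields by name and ignores any it does not recognise. The following is a minimal Python sketch of that idea, assuming rows are handled as dictionaries. The helper `pick_fields` and the field names (`journal`, `title`, `issn`, `new_field`) are hypothetical and are not taken from the Dimensions schema.

```python
from typing import Any, Mapping


def pick_fields(record: Mapping[str, Any], wanted: tuple[str, ...]) -> dict[str, Any]:
    """Return only the named fields from a record, ignoring any others.

    Fields that are absent come back as None, so an added or missing
    field does not raise an error.
    """
    return {name: record.get(name) for name in wanted}


# Hypothetical nested record as it might look before a minor release...
journal_before = {"title": "Example Journal", "issn": "0000-0000"}
# ...and after a minor release adds a field inside the nested record.
journal_after = {"title": "Example Journal", "issn": "0000-0000", "new_field": "x"}

wanted = ("title", "issn")
# Both versions yield the same result because only named fields are read.
same_result = pick_fields(journal_before, wanted) == pick_fields(journal_after, wanted)
```

The same principle applies in SQL: naming the fields you need, rather than depending on the complete shape of a nested record, keeps a query unaffected by added fields.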

Major schema changes are rare. Each major version will be available for a minimum of two years from its release date. When a new major version of the BigQuery schema is released, the previous major release is maintained in tandem until a subsequent major version is released (so any single major version is supported for a minimum of four years). At most two major schema versions are maintained at any one time. Queries written against the schema of one major release are not guaranteed to work with a new major release, but every effort will be made to minimise any differences.
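The four-year figure can be read as two consecutive two-year windows. The sketch below works through that arithmetic in Python. It assumes, as the four-year figure implies, that consecutive major releases are at least two years apart. The release date used is hypothetical and is not a real Dimensions release date.

```python
from datetime import date

MIN_YEARS_BETWEEN_MAJORS = 2  # assumption: each major version is current for at least two years


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years (29 February falls back to 28 February)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


# Hypothetical release date of major version N.
release_n = date(2024, 1, 1)
# Earliest release of version N+1: N runs in tandem with it from this date.
release_n1 = add_years(release_n, MIN_YEARS_BETWEEN_MAJORS)
# Earliest release of version N+2: only now may N stop being maintained.
release_n2 = add_years(release_n1, MIN_YEARS_BETWEEN_MAJORS)

supported_years = release_n2.year - release_n.year
print(supported_years)  # 4
```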

Snapshots are not retrospectively updated to reflect a new major schema. Each snapshot keeps the schema revision that was current at the time the snapshot was generated.

Deprecated Fields

From time to time, fields will be deprecated. Deprecated fields are normally maintained within the current major version of the schema, but we advise replacing them in queries with the fields that supersede them. Deprecated fields may be removed in the next major revision of the data source's schema.

Deprecations of versioned categories, and their eventual removal from the schema, are however not tied to a major schema release. When an old version of a category model is marked as deprecated, it will be removed in the near future, so we recommend switching to the replacement version. The top-level, non-versioned categories point to the revision of the model used within the Dimensions web app and DSL API, and are normally the preferred fields to use for analysis.

Partitioning and Google BI Engine

One limitation of Google BI Engine is that the maximum number of partitions it currently supports is 500 (see Google's BI Engine limitations and quotas).

The vast majority of our primary data source tables (e.g. publications, patents) have a primary partitioning with fewer than 500 partitions, so these tables can be used directly with BI Engine.
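If you want to check a particular table against this limit yourself, one possible approach is to count its partitions through BigQuery's `INFORMATION_SCHEMA.PARTITIONS` view. The Python sketch below only builds the query text and flags tables that exceed the limit. It assumes your account is permitted to read that view for the dataset in question. The project and dataset names, the partition counts, and both helper functions are hypothetical.

```python
BI_ENGINE_MAX_PARTITIONS = 500  # limit quoted above; confirm against Google's current documentation


def partition_count_sql(project: str, dataset: str) -> str:
    """Build a query that counts partitions per table in one dataset."""
    return (
        "SELECT table_name, COUNT(DISTINCT partition_id) AS partition_count\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.PARTITIONS`\n"
        "WHERE partition_id IS NOT NULL\n"
        "GROUP BY table_name"
    )


def tables_over_limit(counts: dict[str, int], limit: int = BI_ENGINE_MAX_PARTITIONS) -> list[str]:
    """Return the names of tables whose partition count exceeds the limit."""
    return sorted(name for name, n in counts.items() if n > limit)


# Hypothetical project and dataset names, for illustration only.
sql = partition_count_sql("my-project", "my_dataset")

# Hypothetical counts, shaped like the rows the query above would return.
example_counts = {"table_a": 120, "table_b": 512}
flagged = tables_over_limit(example_counts)
```

The query text can be pasted into the BigQuery console or run through any BigQuery client, and the resulting per-table counts passed to `tables_over_limit`.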

3. Snapshots

When are snapshots created and how long are they available for?

Snapshots are generated on the first of every month and become available approximately two days later. Once a snapshot is available, it is guaranteed to remain available for at least 12 months.
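As a worked example of this timeline, the Python sketch below estimates when a given month's snapshot becomes available and the earliest date on which it could stop being available. It treats "approximately two days" as exactly two days and counts the 12 months from the availability date. Both are simplifying assumptions, and the function names are hypothetical.

```python
from datetime import date, timedelta

AVAILABILITY_DELAY_DAYS = 2  # assumption: "approximately two days" taken as exactly two
MIN_RETENTION_MONTHS = 12    # guaranteed minimum availability


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months (the day must exist in the target month)."""
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)


def snapshot_window(year: int, month: int) -> tuple[date, date]:
    """Return (estimated availability date, earliest possible removal date)."""
    generated = date(year, month, 1)
    available = generated + timedelta(days=AVAILABILITY_DELAY_DAYS)
    earliest_removal = add_months(available, MIN_RETENTION_MONTHS)
    return available, earliest_removal


available, earliest_removal = snapshot_window(2024, 3)
print(available, earliest_removal)  # 2024-03-03 2025-03-03
```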