Frequently Asked Questions

Here are some frequently asked questions regarding Dimensions on BigQuery.

1. Access & Support

Can I access the Dimensions BigQuery data sets using my normal Dimensions account?

No, this is not possible. Access to Dimensions on BigQuery is mediated through a Google Account, which is separate from the account used to access Dimensions. Dimensions on BigQuery is a paid service, additional to Dimensions Analytics.

Can I try out Dimensions BigQuery?

Yes, we have a sandbox environment that is freely available on Google Cloud. The dataset is updated daily and contains several hundred thousand documents related to COVID-19 research.

How can I get support?

Please send an email to supportbigquery@dimensions.ai and provide as much information about your problem as possible, in particular:

  • What you are trying to do and what problem you are facing.

  • How we can reproduce the problem.

  • Any additional details that might be relevant for reproducing the issue, such as the BigQuery SQL for the query.

2. Schema

How often do you change the schema?

Minor changes to the schema, such as additional fields, can be released at any time. The Dimensions BigQuery data set is updated daily, and any such change is pushed out with the next daily release. Minor changes are guaranteed not to break, or change the underlying semantic intent of, existing queries written against the same major version of the schema. Additional fields may be added at the top level or within a nested structure, so care should be taken if a query relies on the exact structure of a nested record.
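As an illustration of the caution about nested records, code that consumes query results can read fields by name rather than by position, so that a field added in a minor release does not disturb it. This is a minimal sketch that is not tied to any client library; the row layout and the field names `authors`, `first_name`, `last_name` and `new_field` are assumptions made for the example, not a description of the actual schema.

```python
def author_names(row):
    """Return display names for the authors in a publication-like row.

    Fields are looked up by name, so extra fields that appear after a
    minor schema release are simply ignored.
    """
    names = []
    for author in row.get("authors") or []:
        first = author.get("first_name") or ""
        last = author.get("last_name") or ""
        names.append(f"{first} {last}".strip())
    return names


# Hypothetical rows: the second carries an extra nested field, as might
# happen after a minor schema release.
before = {"authors": [{"first_name": "Ada", "last_name": "Lovelace"}]}
after = {"authors": [{"first_name": "Ada", "last_name": "Lovelace", "new_field": 1}]}

# Both rows yield the same result because the extra field is never touched.
print(author_names(before) == author_names(after))
```

Code that unpacks a nested record positionally, or compares whole records for equality, would not have this property.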

Major schema changes are rare. Each major version will be available for a minimum of two years from its release date. When a new major version of the BigQuery schema is released, the previous major version will be maintained in tandem until a subsequent major version is released (so any single major version would be supported for a minimum of four years). No more than two major schema versions are available at any one time. Queries written against the schema of one major release are not guaranteed to work with a new major release (but every effort will be made to minimise the differences).
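The overlap rule can be worked through with dates. The sketch below is only an illustration: the release dates are invented, and it assumes that a major version is retired on the day the version two releases after it comes out, which is one reading of the policy above.

```python
from datetime import date


def retirement_dates(release_dates):
    """Map each major version's release date to the date it is retired.

    Assumption: a version is retired when the version two releases later
    comes out. None means that later release has not happened yet.
    """
    ordered = sorted(release_dates)
    result = {}
    for i, released in enumerate(ordered):
        result[released] = ordered[i + 2] if i + 2 < len(ordered) else None
    return result


# Invented release dates, spaced two years apart.
releases = [date(2020, 1, 1), date(2022, 1, 1), date(2024, 1, 1)]
windows = retirement_dates(releases)

first = releases[0]
supported_days = (windows[first] - first).days
print(first, "->", windows[first], f"({supported_days} days)")
```

With releases two years apart, the first version stays supported from its own release until the third version appears, which is where the four-year figure comes from.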

Snapshots are not retrospectively updated to reflect a new major schema. Each snapshot keeps the schema revision that was current at the time the snapshot was generated.

Deprecated Fields

From time to time, fields will be deprecated. They will normally be maintained within the current major version of the schema, but we advise replacing them in queries with the fields that supersede them. Deprecated fields may be removed in the next major revision of the data source's schema.

Partitioning and Google BI Engine

One limitation of Google BI Engine is that the maximum number of partitions it currently supports is 500 (see limitations and quotas).

The vast majority of our primary data source tables (i.e. publications, patents, etc.) have a primary partitioning with more than 500 partitions, so using these tables directly with BI Engine will result in an error about exceeding the maximum allowed partitions. If you need to use BI Engine on tables that exceed the limit, one possible workaround is an intermediate table that is repopulated on a regular, ongoing basis and is partitioned on a different key or over a different key range. This new table can then be used as the source of the query for BI Engine.
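One way to set up such an intermediate table is sketched below. It is a hedged example, not a recommended implementation: it builds a `CREATE OR REPLACE TABLE` statement that repartitions by integer range using BigQuery's `RANGE_BUCKET` and `GENERATE_ARRAY`, and refuses to do so if the range would produce more than 500 buckets. The table names, the `year` column and the chosen range are all hypothetical.

```python
import math

BI_ENGINE_MAX_PARTITIONS = 500  # limit quoted in the text above


def range_partition_count(start, end, interval):
    """Number of integer-range buckets needed to cover [start, end)."""
    if interval <= 0 or end <= start:
        raise ValueError("need interval > 0 and end > start")
    return math.ceil((end - start) / interval)


def build_intermediate_table_sql(source, target, column, start, end, interval):
    """Build a statement that copies `source` into `target`, repartitioned
    by integer range on `column`. Raises if the range needs too many buckets."""
    count = range_partition_count(start, end, interval)
    if count > BI_ENGINE_MAX_PARTITIONS:
        raise ValueError(f"{count} partitions exceeds {BI_ENGINE_MAX_PARTITIONS}")
    return (
        f"CREATE OR REPLACE TABLE `{target}`\n"
        f"PARTITION BY RANGE_BUCKET({column}, GENERATE_ARRAY({start}, {end}, {interval}))\n"
        f"AS SELECT * FROM `{source}`"
    )


# Hypothetical names: one bucket per year from 1800 up to 2100 gives 300 buckets.
sql = build_intermediate_table_sql(
    source="source_project.source_dataset.publications",
    target="my_project.my_dataset.publications_by_year",
    column="year",
    start=1800,
    end=2100,
    interval=1,
)
print(sql)
```

The generated statement could be run as a scheduled query to keep the intermediate table fresh. Rows whose key is NULL or outside the range are placed in additional special partitions, so it is prudent to leave some headroom below the limit.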

3. Snapshots

When are snapshots created and how long are they available for?

Snapshots are generated on the first of every month and become available approximately two days later. Once a snapshot is available, it is guaranteed to remain available for at least 12 months.
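The timing above can be turned into a small date calculation, for example to decide which snapshot a scheduled job should target. This is a rough sketch: the two-day delay is approximate, so `DELAY_DAYS` is an assumption rather than a guarantee, and the function names are made up for the example.

```python
from datetime import date, timedelta

DELAY_DAYS = 2  # approximate lag between generation and availability


def latest_available_snapshot(today):
    """Return the date of the newest snapshot expected to be available."""
    this_month = today.replace(day=1)
    if today >= this_month + timedelta(days=DELAY_DAYS):
        return this_month
    # Too early in the month: fall back to the previous month's snapshot.
    return (this_month - timedelta(days=1)).replace(day=1)


def guaranteed_until(snapshot):
    """Date until which a first-of-month snapshot is guaranteed to remain,
    taken as 12 months after it is expected to become available."""
    one_year_on = snapshot.replace(year=snapshot.year + 1)
    return one_year_on + timedelta(days=DELAY_DAYS)


snapshot = latest_available_snapshot(date(2024, 3, 2))
print(snapshot, guaranteed_until(snapshot))
```

On 2 March the March snapshot is not yet expected, so the function falls back to the February snapshot.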