Copyright © 2002-2023. Cloud Software Group, Inc. All Rights Reserved.
TIBCO® Data Virtualization
Google Big Query Adapter Guide
Version 8.7.0 | October 2023
Contents

Google BigQuery Adapter
Google BigQuery Version Support
SQL Compliance
Getting Started
Connecting to Google BigQuery
Deploying the Google BigQuery Adapter
Basic Tab
Logging
Creating a Custom OAuth App
Advanced Integrations
Minimum Required Roles
Changelog
Advanced Features
User Defined Views
SSL Configuration
Firewall and Proxy
Query Processing
Logging
User Defined Views
SSL Configuration
Firewall and Proxy
Query Processing
Logging
SQL Compliance
SELECT Statements
INSERT Statements
UPDATE Statements
DELETE Statements
EXECUTE Statements
Names and Quoting
SELECT Statements
SELECT INTO Statements
INSERT Statements
UPDATE Statements
DELETE Statements
EXECUTE Statements
PIVOT and UNPIVOT
Data Model
Views
Stored Procedures
Additional Metadata
Tables
Views
Stored Procedures
Data Type Mapping
Connection String Options
Authentication
BigQuery
Storage API
Uploading
OAuth
JWT OAuth
SSL
Firewall
Proxy
Logging
Schema
Miscellaneous
Authentication
BigQuery
Storage API
Uploading
OAuth
JWT OAuth
SSL
Firewall
Proxy
Logging
Schema
Miscellaneous
Google BigQuery Adapter Limitations
TIBCO Product Documentation and Support Services
How to Access TIBCO Documentation
Release Version Support
How to Contact TIBCO Support
How to Join TIBCO Community
Legal and Third-Party Notices
Google BigQuery Adapter
Google BigQuery Version Support
The adapter enables read/write SQL-92 access to the BigQuery tables in your Google
account or Google Apps domain. The complete aggregate and join syntax in BigQuery is
supported. Additionally, statements in the BigQuery syntax can be passed through. The
adapter uses version 2.0 of the BigQuery Web services API. You must enable this API by
creating a project in the Google Developers Console. See Connecting to Google BigQuery
for a guide to creating a project and authenticating to this API.
SQL Compliance
The SQL Compliance section shows the SQL syntax supported by the adapter and points
out any limitations.
Getting Started
Connecting to Google BigQuery
Basic Tab shows how to authenticate to Google BigQuery and configure any necessary
connection properties. Additional adapter capabilities can be configured using the
available Connection properties on the Advanced tab. The Advanced Settings section
shows how to set up more advanced configurations and troubleshoot connection errors.
Deploying the Google BigQuery Adapter
To deploy the adapter, execute the server_util utility from the command line:
1. Unzip the tdv.googlebigquery.zip file to the location of your choice.
2. Open a command prompt window.
3. Navigate to the <TDV_install_dir>/bin directory.
4. Enter the server_util command with the -deploy option:
server_util -server <hostname> [-port <port>] -user <user> -password <password> -deploy -package <TDV_install_dir>/adapters/tdv.googlebigquery/tdv.googlebigquery.jar
Note: When deploying a new build of an existing adapter, you must first undeploy the
existing adapter using the server_util command with the -undeploy option:
server_util -server <hostname> [-port <port>] -user <user> -password <password> -undeploy -version 1 -name GoogleBigQuery
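For example, a deployment against a local server might look like the following sketch (the hostname, port, credentials, and install path are illustrative placeholders; substitute the values for your environment):

server_util -server localhost -port 9400 -user admin -password mypassword -deploy -package "C:\Program Files\TIBCO\TDV Server 8.7\adapters\tdv.googlebigquery\tdv.googlebigquery.jar"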
Basic Tab
Connecting to Google BigQuery
By default, the CData adapter connects to all available projects in your database. To limit
the scope of your connection, set combinations of the following properties:
- ProjectId: specifies which projects the driver connects to
- BillingProjectId: specifies which projects are billed
- DatasetId: specifies which datasets the driver accesses
Authenticating to Google BigQuery
The adapter supports using user accounts, service accounts and GCP instance accounts for
authentication.
The following sections discuss the available authentication schemes for Google BigQuery:
- User Accounts (OAuth)
- Service Accounts (OAuthJWT)
- GCP Instance Accounts
User Accounts (OAuth)
AuthScheme must be set to OAuth in all user account flows.
Web Applications
When connecting via a Web application, you need to create and register a custom OAuth
application with Google BigQuery. You can then use the adapter to acquire and manage
the OAuth token values. See Creating a Custom OAuth App for more information about
custom applications.
Get an OAuth Access Token
Set the following connection properties to obtain the OAuthAccessToken:
- OAuthClientId: Set this to the Client Id in your application settings.
- OAuthClientSecret: Set this to the Client Secret in your application settings.
Then call stored procedures to complete the OAuth exchange:
1. Call the GetOAuthAuthorizationURL stored procedure. Set the CallbackURL input to
the Callback URL you specified in your application settings. The stored procedure
returns the URL to the OAuth endpoint.
2. Navigate to the URL that the stored procedure returned in Step 1. Log in to the
custom OAuth application and authorize the web application. Once authenticated,
the browser redirects you to the callback URL.
3. Call the GetOAuthAccessToken stored procedure. Set AuthMode to WEB and the
Verifier input to the "code" parameter in the query string of the callback URL.
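As a sketch, the two stored procedure calls might look like the following (the callback URL and verifier value are placeholders, and the name = value parameter syntax shown here is an assumption; see EXECUTE Statements for the exact syntax):

EXECUTE GetOAuthAuthorizationURL CallbackURL = 'http://localhost:33333';
EXECUTE GetOAuthAccessToken AuthMode = 'WEB', Verifier = '<code from the callback URL>', CallbackURL = 'http://localhost:33333';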
Once you have obtained the access and refresh tokens, you can connect to data and
refresh the OAuth access token either automatically or manually.
Automatic Refresh of the OAuth Access Token
To have the driver automatically refresh the OAuth access token, set the following on the
first data connection:
- InitiateOAuth: Set this to REFRESH.
- OAuthClientId: Set this to the Client Id in your application settings.
- OAuthClientSecret: Set this to the Client Secret in your application settings.
- OAuthAccessToken: Set this to the access token returned by GetOAuthAccessToken.
- OAuthRefreshToken: Set this to the refresh token returned by GetOAuthAccessToken.
- OAuthSettingsLocation: Set this to the path where the adapter saves the OAuth token values, which persist across connections.
On subsequent data connections, the values for OAuthAccessToken and
OAuthRefreshToken are taken from OAuthSettingsLocation.
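For instance, a minimal sketch of the first connection's property settings (all values are placeholders):

InitiateOAuth=REFRESH;OAuthClientId=<client id>;OAuthClientSecret=<client secret>;OAuthAccessToken=<access token>;OAuthRefreshToken=<refresh token>;OAuthSettingsLocation=C:\oauth\OAuthSettings.txt;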
Manual Refresh of the OAuth Access Token
The only value needed to manually refresh the OAuth access token when connecting to
data is the OAuth refresh token.
Use the RefreshOAuthAccessToken stored procedure to manually refresh the
OAuthAccessToken after the ExpiresIn parameter value returned by GetOAuthAccessToken
has elapsed, then set the following connection properties:
- OAuthClientId: Set this to the Client Id in your application settings.
- OAuthClientSecret: Set this to the Client Secret in your application settings.
Then call RefreshOAuthAccessToken with OAuthRefreshToken set to the OAuth refresh
token returned by GetOAuthAccessToken. After the new tokens have been retrieved, open a
new connection by setting the OAuthAccessToken property to the value returned by
RefreshOAuthAccessToken.
Finally, store the OAuth refresh token so that you can use it to manually refresh the OAuth
access token after it has expired.
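A sketch of the manual refresh call (the token value is a placeholder; the name = value parameter syntax is an assumption, as noted above):

EXECUTE RefreshOAuthAccessToken OAuthRefreshToken = '<refresh token>';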
Headless Machines
To use OAuth with a user account on a headless machine, you need to authenticate on
another device that has an Internet browser.
1. Choose one of two options:
   - Option 1: Obtain the OAuthVerifier value as described in "Obtain and Exchange a Verifier Code" below.
   - Option 2: Install the adapter on a machine with an Internet browser and transfer the OAuth authentication values after you authenticate through the usual browser-based flow.
2. Then configure the adapter to automatically refresh the access token on the headless machine.
Option 1: Obtain and Exchange a Verifier Code
To obtain a verifier code, you must authenticate at the OAuth authorization URL.
Follow the steps below to authenticate from the machine with an Internet browser and
obtain the OAuthVerifier connection property.
1. Choose one of these options:
   - If you are using the Embedded OAuth Application, click Google BigQuery OAuth endpoint to open the endpoint in your browser.
   - If you are using a custom OAuth application, create the Authorization URL by setting the following properties:
     - InitiateOAuth: Set to OFF.
     - OAuthClientId: Set to the client Id assigned when you registered your application.
     - OAuthClientSecret: Set to the client secret assigned when you registered your application.
     Then call the GetOAuthAuthorizationURL stored procedure with the appropriate CallbackURL. Open the URL returned by the stored procedure in a browser.
2. Log in and grant permissions to the adapter. You are then redirected to the callback
URL, which contains the verifier code.
3. Save the value of the verifier code. Later you will set this in the OAuthVerifier
connection property.
Next, you need to exchange the OAuth verifier code for OAuth refresh and access tokens.
On the headless machine, set the following connection properties to obtain the OAuth
authentication values:
- InitiateOAuth: Set this to REFRESH.
- OAuthVerifier: Set this to the verifier code.
- OAuthClientId: (custom applications only) Set this to the Client Id in your custom OAuth application settings.
- OAuthClientSecret: (custom applications only) Set this to the Client Secret in the custom OAuth application settings.
- OAuthSettingsLocation: Set this to persist the encrypted OAuth authentication values to the specified file.
After the OAuth settings file is generated, you need to re-set the following properties to
connect:
- InitiateOAuth: Set this to REFRESH.
- OAuthClientId: (custom applications only) Set this to the client Id assigned when you registered your application.
- OAuthClientSecret: (custom applications only) Set this to the client secret assigned when you registered your application.
- OAuthSettingsLocation: Set this to the file containing the encrypted OAuth authentication values. Make sure this file gives read and write permissions to the adapter to enable the automatic refreshing of the access token.
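For example, a sketch of the verifier-exchange connection properties (all values are placeholders; the client Id and secret apply to custom applications only):

InitiateOAuth=REFRESH;OAuthVerifier=<verifier code>;OAuthClientId=<client id>;OAuthClientSecret=<client secret>;OAuthSettingsLocation=C:\oauth\OAuthSettings.txt;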
Option 2: Transfer OAuth Settings
Prior to connecting on a headless machine, you need to create and install a connection
with the driver on a device that supports an Internet browser. Set the connection
properties as described in "Desktop Applications" above.
After completing the instructions in "Desktop Applications", the resulting authentication
values are encrypted and written to the path specified by OAuthSettingsLocation. The
default filename is OAuthSettings.txt.
Once you have successfully tested the connection, copy the OAuth settings file to your
headless machine.
On the headless machine, set the following connection properties to connect to data:
- InitiateOAuth: Set this to REFRESH.
- OAuthClientId: (custom applications only) Set this to the client Id assigned when you registered your application.
- OAuthClientSecret: (custom applications only) Set this to the client secret assigned when you registered your application.
- OAuthSettingsLocation: Set this to the path to your OAuth settings file. Make sure this file gives read and write permissions to the adapter to enable the automatic refreshing of the access token.
Service Accounts (OAuthJWT)
To authenticate using a service account, you must create a new service account and have a
copy of the account's certificate. If you do not already have a service account, you can
create one by following the procedure in Creating a Custom OAuth App.
For a JSON file, set these properties:
- AuthScheme: Set this to OAuthJWT.
- InitiateOAuth: Set this to GETANDREFRESH.
- OAuthJWTCertType: Set this to GOOGLEJSON.
- OAuthJWTCert: Set this to the path to the .json file provided by Google.
- OAuthJWTSubject: (optional) Only set this value if the service account is part of a GSuite domain and you want to enable delegation. The value of this property should be the email address of the user whose data you want to access.
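For example, a sketch of a JSON-key connection (the file path is a placeholder):

AuthScheme=OAuthJWT;InitiateOAuth=GETANDREFRESH;OAuthJWTCertType=GOOGLEJSON;OAuthJWTCert=C:\certs\service_account.json;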
For a PFX file, set these properties instead:
- AuthScheme: Set this to OAuthJWT.
- InitiateOAuth: Set this to GETANDREFRESH.
- OAuthJWTCertType: Set this to PFXFILE.
- OAuthJWTCert: Set this to the path to the .pfx file provided by Google.
- OAuthJWTCertPassword: (optional) Set this to the .pfx file password. In most cases you must provide this since Google encrypts PFX certificates.
- OAuthJWTCertSubject: (optional) Set this only if you are using an OAuthJWTCertType which stores multiple certificates. It should not be set for PFX certificates generated by Google.
- OAuthJWTIssuer: Set this to the email address of the service account. This address will usually include the domain iam.gserviceaccount.com.
- OAuthJWTSubject: (optional) Only set this value if the service account is part of a GSuite domain and you want to enable delegation. The value of this property should be the email address of the user whose data you want to access.
GCP Instance Accounts
When running on a GCP virtual machine, the adapter can authenticate using a service
account tied to the virtual machine. To use this mode, set AuthScheme to
GCPInstanceAccount.
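For example, a minimal sketch (the project Id is a placeholder; other properties are set as usual):

AuthScheme=GCPInstanceAccount;ProjectId=<project id>;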
Logging
The adapter uses TDV Server's logging (log4j) to generate log files. The settings within the
TDV Server's logging (log4j) configuration file are used by the adapter to determine the
type of messages to log. The following categories can be specified:
- Error: Only error messages are logged.
- Info: Both Error and Info messages are logged.
- Debug: Error, Info, and Debug messages are logged.
The Other property of the adapter can be used to set Verbosity to specify the amount of
detail to be included in the log file, for example:
Verbosity=4;
You can use Verbosity to specify the amount of detail to include in the log within a
category. The following verbosity levels are mapped to the log4j categories:
- 0 = Error
- 1-2 = Info
- 3-5 = Debug
For example, if the log4j category is set to DEBUG, the Verbosity option can be set to 3 for
the minimum amount of debug information or 5 for the maximum amount of debug
information.
Note that the log4j settings override the Verbosity level specified. The adapter never logs at
a Verbosity level greater than what is configured in the log4j properties. In addition, if
Verbosity is set to a level less than the log4j category configured, Verbosity defaults to the
minimum value for that particular category. For example, if Verbosity is set to a value less
than 3 and the Debug category is specified, the Verbosity defaults to 3.
The following list is an explanation of the Verbosity levels and the information that they
log.
- 1 - Will log the query, the number of rows returned by it, the start of execution and the time taken, and any errors.
- 2 - Will log everything included in Verbosity 1 and HTTP headers.
- 3 - Will additionally log the body of the HTTP requests.
- 4 - Will additionally log transport-level communication with the data source. This includes SSL negotiation.
- 5 - Will additionally log communication with the data source and additional details that may be helpful in troubleshooting problems. This includes interface commands.
Configure Logging for the Google BigQuery Adapter
By default, logging is turned on without debugging. If debugging information is desired,
uncomment the following line in the TDV Server's log4j.properties file (default location of
this file is: C:\Program Files\TIBCO\TDV Server <version>\conf\server):
log4j.logger.com.cdata=DEBUG
The TDV Server must be restarted after changing the log4j.properties file, which can be
accomplished by running the composite.bat script located at: C:\Program Files\TIBCO\TDV
Server <version>\bin. Note that reauthenticating to the TDV Studio is required after
restarting the server.
Here is an example of the calls:
.\composite.bat monitor restart
All logs for the adapter are written to the "cs_server_dsrc.log" file as specified in the log4j
properties.
Note: The "log4j.logger.com.cdata=DEBUG" option is not required if the Debug Output
Enabled option is set to true within the TDV Studio. To set this option, navigate to
Administrator > Configuration. Select Server > Configuration > Debugging and set the
Debug Output Enabled option to True.
Creating a Custom OAuth App
You can use a custom OAuth app to authenticate a service account or a user account.
When to Create a Custom OAuth App
Web Applications
You need to create a custom OAuth app in the web flow.
Desktop Applications
Creating a custom OAuth app is optional as the driver is already registered with Google
BigQuery and you can connect with its embedded credentials. You might want to create a
custom OAuth app to change the information displayed when users log into the Google
BigQuery OAuth endpoint to grant permissions to the driver.
Headless Machines
Creating a custom OAuth app is optional to authenticate a headless machine; the driver is
already registered with Google BigQuery and you can connect with its embedded
credentials. In the headless OAuth flow, users need to authenticate via a browser on
another machine. You might want to create a custom OAuth app to change the information
displayed when users log into the Google BigQuery OAuth endpoint to grant permissions
to the driver.
Create an OAuth App for User Account Authentication
Follow the procedure below to register an app and obtain the OAuthClientId and
OAuthClientSecret.
Create a Custom OAuth App: Desktop
1. Log into the Google API Console and open a project. Select the API Manager from the
main menu.
2. In the user consent flow, click Credentials > Create Credentials > OAuth Client Id.
Then click Other.
3. After creating the app, the OAuthClientId and OAuthClientSecret are displayed.
4. Click Library > BigQuery API > Enable API.
Create an OAuth App for Service Account Authentication
Follow the steps below to create an OAuth application and generate a private key.
1. Log into the Google API Console and open a project. Select the API Manager from the
main menu.
2. Click Create Credentials > Service Account Key.
3. In the Service Account menu, select New Service Account or select an existing
service account.
4. If you are creating a new service account, additionally select one or more roles. You
can assign primitive roles at the project level in the IAM and Admin section; other
roles enable you to further customize access to Google APIs.
5. In the Key Type section, select the JSON key type.
6. Create the app to download the key file.
7. Click Library > BigQuery API > Enable API.
Advanced Integrations
The following sections detail adapter settings that may be needed in advanced
integrations.
Saving Result Sets
Large result sets must be saved in a temporary or permanent table. You can use the
following properties to control table persistence:
Accessing Temporary Tables
You can use the following properties to manage temporary tables. These temporary tables
are managed by Google BigQuery and automatically expire after 24 hours.
- AllowLargeResultSets: Store large result sets in temporary tables in a hidden dataset. If this is not set, an error is returned for result sets bigger than a certain size.
Accessing Persistent Tables
Persist result sets larger than 128MB in a permanent table. To do so, you can set
DestinationTable to the qualified name of the table. For example,
"DestinationProjectId:DestinationDataSet.TableName". Subsequent queries replace the
table with the new result set.
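As an illustrative sketch, the persistence properties might be combined as follows (the project, dataset, and table names are placeholders):

AllowLargeResultSets=true;DestinationTable=MyProject:MyDataset.MyResults;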
Limiting Billing
Set MaximumBillingTier to override your project limits on the maximum cost for any given
query in a connection.
Bulk Modes
Google BigQuery provides several interfaces for operating on batches of rows. The adapter
supports these methods through the InsertMode option; each is specialized for different
use cases:
- The Streaming API is intended for use where the most important factor is being able to insert quickly. However, rows which are inserted via this API are queued and only appear in the table after a delay. Sometimes this delay can be as high as 20-30 minutes, which makes this API incompatible with cases where you want to insert data and then run other operations on it immediately. You should avoid modifying the table while any rows are in the streaming queue: Google BigQuery prevents DML operations from running on the table while any rows are in the streaming queue, and changing the table's metadata (name, schema, etc.) may cause streamed rows that haven't been committed to be lost.
- The DML mode uses Standard SQL INSERT queries to upload data. This is by far the most robust method of uploading data because any errors in the uploaded rows are reported immediately. The adapter also uses this API synchronously, so once the INSERT is processed, the rows can be used by other operations without waiting. However, it is by far the slowest insert method and should only be used for small data volumes.
- The Upload mode uses the multipart upload API for uploading data. This method is intended for performing low-cost medium to large data loads within a reasonable time. When using this mode, the adapter uploads the inserted rows to public temporary storage within BigQuery and then creates a load job for them. The adapter can either wait for the job (see WaitForBatchResults) or let it run asynchronously. Waiting for the job reports any errors that the job encounters but takes more time. Determining whether the job failed without waiting for it requires manually checking the job status via the job stored procedures.
- The GCSStaging mode is the same as Upload except that it uses your Google Cloud Storage account to store staged data instead of BigQuery public storage. The adapter cannot act asynchronously in this mode because it must delete the file after the load is complete, which means that WaitForBatchResults has no effect. Because this mode depends on external storage, you must set GCSBucket to the name of your bucket and ensure that Scope (a space-delimited set of scopes) contains at least the scopes https://www.googleapis.com/auth/bigquery and https://www.googleapis.com/auth/devstorage.read_write. The devstorage scope used for GCS also requires that you connect using a service account, because Google BigQuery does not allow user accounts to use this scope.
In addition to bulk inserts, the adapter also supports performing bulk UPDATE and DELETE
operations. This requires the adapter to upload the data containing the filters and rows to
set into a new table in BigQuery, then perform a MERGE between the two tables and drop
the temporary table. InsertMode determines how the rows are inserted into the temporary
table, but the Streaming and DML modes are not supported.
To perform bulk UPDATE specifically, you must define a primary key on the relevant
tables. The simplest way to do this is via the PrimaryKeyIdentifiers property. Columns that
are used for UPDATE filters must be marked as primary keys, while all other columns may
be modified by bulk UPDATE.
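As a sketch, a connection configured for bulk loads might include one of the following (the bucket name is a placeholder; the Scope value is the space-delimited pair of scopes described above):

InsertMode=Upload;WaitForBatchResults=true;

InsertMode=GCSStaging;GCSBucket=my-staging-bucket;Scope=https://www.googleapis.com/auth/bigquery https://www.googleapis.com/auth/devstorage.read_write;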
Minimum Required Roles
Minimum Required Roles for Service Accounts
The following roles allow SELECT queries to work with a service account:
- BigQuery Data Viewer (roles/bigquery.dataViewer): read data and metadata
- BigQuery Filtered Data Viewer (roles/bigquery.filteredDataViewer): view filtered table data
- BigQuery Job User (roles/bigquery.jobUser): run jobs, including queries, within the project
Changelog
General Changes

08/17/2022, build 8264 (General)
Changed:
- We now support handling the keyword "COLLATE" as a standard function name as well.

06/06/2022, build 8192 (Google BigQuery)
Added:
- Added support for MERGE statements. The syntax is a subset of what BigQuery natively supports, including everything except the MATCHED BY SOURCE, DEFAULT and WHEN MATCHED AND clauses.

04/27/2022, build 8152 (Google BigQuery)
Added:
- Added support for the SAFE_CAST function. Due to syntax limitations, it must be written this way when passthrough mode is disabled (the default): SAFE_CAST(somecol, 'INTEGER'). It supports SQL type names (VARCHAR, INT, etc.) as well as BigQuery type names (STRING, INT64, etc.).

04/11/2022, build 8136 (Google BigQuery)
Changed:
- Changed what views are retrieved depending on the connection property UseLegacySQL. Connections using standard SQL will not report views that are only available in legacy SQL, and the same goes for standard views in legacy SQL connections.

03/01/2022, build 8095 (Google BigQuery)
Added:
- Added support for CREATE VIEW AS statements.

02/18/2022, build 8084 (Google BigQuery)
Added:
- Added support for ALTER TABLE. Due to BigQuery limitations, the only supported operations are ADD COLUMN, DROP COLUMN, ALTER COLUMN and (table) RENAME TO.

01/26/2022, build 8061 (Google BigQuery)
Added:
- Added support for querying snapshot tables. Append the snapshot type to the AllowedTableTypes connection property (along with other allowed table types) to make them show up.

01/13/2022, build 8048 (Google BigQuery)
Added:
- Added support for the NATIVEQUERY table function. This function can be used after a FROM to execute a query using BigQuery-native SQL. For example, SELECT * FROM NATIVEQUERY('SELECT * FROM UNNEST([1,2,3]) AS a') will execute the inner query in BigQuery directly and return the results. This will work even in tools which are not normally compatible with QueryPassthrough=true.

01/04/2022, build 8039 (Google BigQuery)
Added:
- Added support for querying materialized views.

12/20/2021, build 8024 (Google BigQuery)
Added:
- Added support for using the BigQuery TABLESAMPLE without enabling QueryPassthrough. If TableSamplePercent is set, then the random row selection happens server-side for each table used in a query.

12/10/2021, build 8014 (Google BigQuery)
Changed:
- Migrated to the latest version of the Storage API.

10/21/2021, build 7964 (Google BigQuery)
Added:
- Added support for reading policy tags from the Data Catalog service. These are exposed using the PolicyTags column on sys_tablecolumns.

10/20/2021, build 7963 (Google BigQuery)
Added:
- Added support for parameterized data types, both on the read side with and without UseStorageAPI as well as on the metadata side. The metadata is only reported on the table itself and not in queries, since BigQuery only enforces length/precision/scale when the data is written.

09/02/2021, build 7915 (General)
Added:
- Added support for the STRING_SPLIT table-valued function in the CROSS APPLY clause.

09/02/2021, build 7915 (Google BigQuery)
Added:
- Added support for bulk UPDATE and DELETE operations. These make better use of BigQuery's DML query limits because each batch is just one MERGE operation. They are also faster because they use the bulk upload methods to send the list of rows to be changed. Primary keys must be defined on the tables to use DELETE.

08/09/2021, build 7891 (Google BigQuery)
Added:
- Support for staging data in all formats allowed by BigQuery. JSON and Avro were already supported; this adds options that allow uploading data stored in the CSV, Parquet and ORC formats.

08/07/2021, build 7889 (General)
Changed:
- Add the KeySeq column to the sys_foreignkeys table.

08/06/2021, build 7888 (General)
Changed:
- Add the new sys_primarykeys system table.

07/28/2021, build 7879 (Google BigQuery)
Added:
- Added support for reading TIME, DATETIME and TIMESTAMP data at the full precision supported by BigQuery. We now report microsecond precision in .NET-based editions and millisecond precision in Java-based editions.

07/23/2021, build 7874 (General)
Changed:
- Updated the Literal Function Names for relative date/datetime functions. Previously, relative date/datetime functions resolved to a different value when used in the projection vs. the predicate, e.g. SELECT LAST_MONTH() AS lm, Col FROM Table WHERE Col > LAST_MONTH(). Formerly the two LAST_MONTH() calls would resolve to different datetimes. Now they will match.
- As a replacement for the previous behavior, the relative date/datetime functions in the criteria may have an 'L' appended to them, e.g. WHERE col > L_LAST_MONTH(). This will continue to resolve to the same values that previously were calculated in the criteria. Note that the "L_" prefix will only work in the predicate; it is not available in the projection.

07/20/2021, build 7871 (Google BigQuery)
Added:
- Added support for staging data with Google Cloud Storage when performing inserts. This is similar to the existing Upload insert mode, but transfers the row data into GCS and then creates a load job referencing it instead of performing the upload directly into BigQuery.

07/08/2021, build 7859 (General)
Added:
- Added the TCP Logging Module for logging information on the TCP wire protocol. The transport bytes that are incoming and outgoing will be logged at verbosity=5.

06/18/2021, build 7839 (Google BigQuery)
Added:
- Added support for the GOOGLEJSONBLOB JWT certificate type. This works like the existing GOOGLEJSON certificate type except that the certificate is provided as JSON text instead of as a file path.

05/27/2021, build 7817 (Google BigQuery)
Added:
- Added support for automatically reconnecting on Storage connections that have long idle times. This is most useful for ETL applications and other scenarios where the consumer reading data out of the driver is expected to be slower than the driver itself. In these cases the connection can be dropped by the other side (either by the Storage API itself or by network appliances like firewalls and proxies), leading to timeouts. To avoid this, the driver will restart the read from the same position using a new connection.

04/23/2021, build 7785 (General)
Added:
- Added support for handling client-side formulas during insert/update. For example: UPDATE Table SET Col1 = Concat(Col1, " - ", Col2) WHERE Col2 LIKE 'A%'

04/23/2021, build 7783 (General)
Changed:
- Updated how display sizes are determined for varchar primary key and foreign key columns so they will match the reported length of the column.

04/16/2021, build 7776 (General)
Added:
- Non-conditional updates between two columns are now available to all drivers. For example: UPDATE Table SET Col1=Col2
Changed:
- Reduced the length to 255 for varchar primary key and foreign key columns.
- Updated implicit and metadata caching to improve performance and support for multiple connections. Old metadata caches are not compatible; you would need to generate new metadata caches if you are currently using CacheMetadata.
- Updated the index naming convention to avoid duplicates.
- Updated and standardized the Getting Started connection help.
- Added the Advanced Features section to the help of all drivers.
- Categorized connection property listings in the help for all editions.

04/15/2021, build 7775 (General)
Changed:
- Kerberos authentication is updated to use TCP by default, but will fall back to UDP if a TCP connection cannot be established.

04/05/2021, build 7765 (Google BigQuery)
Added:
- Added support for data unnesting. This mode performs client-side expansion of array data across multiple rows in a way that emulates the compact preview in the BigQuery UI. Server-side support of some options is limited in this mode (LIMIT and OFFSET), but otherwise it behaves like the nested mode, with the exception that it always outputs flat data and never outputs aggregates.
Changed:
- Updated the default value for WaitForBatchResults to true, since not checking the job status at smaller batch sizes can lead to lost jobs.
Deprecated:
- Deprecated the UseStreamingInserts connection property in favor of InsertMode. The default for InsertMode is Streaming.

03/04/2021, build 7733 (Google BigQuery)
Removed:
- Removed the TempTableDataset and TempTableExpirationTime connection properties. These properties have been defunct since we moved to using the BigQuery-managed query cache for large resultsets.

01/29/2021, build 7699 (Google BigQuery)
Added:
- Added support for the BIGNUMERIC type, which is twice the size of the existing numeric type. JDBC supports this type at full precision, while .NET requires using the connection property IgnoreTypes=decimal because the full precision of BIGNUMERIC (as with regular NUMERIC) is too high for System.Decimal.

01/08/2021, build 7678 (Google BigQuery)
Added:
- Added support for using the upload API for batch inserts. This API is slower than the streaming API for large volumes of data, but is free to use and doesn't have the same buffering delays that streaming does. It is also async by default, although the option to wait on the batch is available via the new WaitForBatchResults connection property.

12/23/2020, build 7663 (Google BigQuery)
Added:
- Added support for unnesting metadata with the Storage API.

12/21/2020, build 7660 (Google BigQuery)
Added:
- Added support for unnesting metadata with the REST API. By enabling UnnestArrays, metadata for fields within REPEATED RECORD types is returned as separate columns instead of generating JSON aggregates. These values can be selected within queries but are currently treated as NULL.

12/07/2020, build 7646 (Google BigQuery)
Added:
- Added support for injecting partition filters into queries that require them but do not have them, if the user enables the InsertPartitionFilterOption. This modifies queries against partitioned tables that require a partition filter so that they always contain a valid partition filter, which allows queries like a simple SELECT * FROM t to work. Currently all partitions are selected by the generated filter.
Advanced Features
This section details a selection of advanced features of the Google BigQuery adapter.
User Defined Views
The adapter allows you to define virtual tables, called user defined views, whose contents
are decided by a pre-configured query. These views are useful when you cannot directly
control the queries being issued to the driver. See User Defined Views for an overview of
creating and configuring custom views.
SSL Configuration
Use SSL Configuration to adjust how the adapter handles TLS/SSL certificate negotiations.
You can choose from various certificate formats; see the SSLServerCert property under
"Connection String Options" for more information.
Firewall and Proxy
Configure the adapter to connect through firewalls and proxies, including Windows proxies
and HTTP proxies. You can also set up tunnel connections.
Query Processing
The adapter offloads as much of the SELECT statement processing as possible to Google
BigQuery and then processes the rest of the query in memory (client-side).
Logging
See Logging for an overview of configuration settings that can be used to refine CData
logging. For basic logging, you only need to set two connection properties, but there are
numerous features that support more refined logging, where you can select subsets of
information to be logged using the LogModules connection property.
User Defined Views
The Google BigQuery Adapter allows you to define a virtual table whose contents are
decided by a pre-configured query. These are called User Defined Views, which are useful in
situations where you cannot directly control the query being issued to the driver, e.g. when
using the driver from a tool. The User Defined Views can be used to define predicates that
are always applied. If you specify additional predicates in the query to the view, they are
combined with the query already defined as part of the view.
There are two ways to create user defined views:
- Create a JSON-formatted configuration file defining the views you want.
- Execute DDL statements.
Defining Views Using a Configuration File
User Defined Views are defined in a JSON-formatted configuration file called
UserDefinedViews.json. The adapter automatically detects the views specified in this file.
You can also have multiple view definitions and control them using the UserDefinedViews
connection property. When you use this property, only the specified views are seen by the
adapter.
This User Defined View configuration file is formatted as follows:
- Each root element defines the name of a view.
- Each root element contains a child element, called query, which contains the custom SQL query for the view.
For example:
{
  "MyView": {
    "query": "SELECT * FROM [publicdata].[samples].github_nested WHERE MyColumn = 'value'"
  },
  "MyView2": {
    "query": "SELECT * FROM MyTable WHERE Id IN (1,2,3)"
  }
}
Use the UserDefinedViews connection property to specify the location of your JSON
configuration file. For example:
UserDefinedViews=C:\Users\yourusername\Desktop\tmp\UserDefinedViews.json
Defining Views Using DDL Statements
The adapter is also capable of creating and altering the schema via DDL Statements such
as CREATE LOCAL VIEW, ALTER LOCAL VIEW, and DROP LOCAL VIEW.
Create a View
To create a new view using DDL statements, provide the view name and query as follows:
CREATE LOCAL VIEW [MyViewName] AS SELECT * FROM Customers LIMIT 20;
If no JSON file exists, the above code creates one. The view is then created in the JSON
configuration file and is now discoverable. The JSON file location is specified by the
UserDefinedViews connection property.
Alter a View
To alter an existing view, provide the name of an existing view alongside the new query
you would like to use instead:
ALTER LOCAL VIEW [MyViewName] AS SELECT * FROM Customers WHERE
TimeModified > '3/1/2020';
The view is then updated in the JSON configuration file.
Drop a View
To drop an existing view, provide the name of the view you would like to remove:
DROP LOCAL VIEW [MyViewName]
This removes the view from the JSON configuration file. It can no longer be queried.
Schema for User Defined Views
User Defined Views are exposed in the UserViews schema by default. This is done to avoid
the view's name clashing with an actual entity in the data model. You can change the
name of the schema used for UserViews by setting the UserViewsSchemaName property.
Working with User Defined Views
For example, suppose a User Defined View called UserViews.RCustomers is defined by a
query that only lists customers in Raleigh:
SELECT * FROM Customers WHERE City = 'Raleigh';
An example of a query to the driver:
SELECT * FROM UserViews.RCustomers WHERE Status = 'Active';
Resulting in the effective query to the source:
SELECT * FROM Customers WHERE City = 'Raleigh' AND Status = 'Active';
That is a very simple example of a query to a User Defined View that is effectively a
combination of the view query and the view definition. It is possible to compose these
queries in much more complex patterns. All SQL operations are allowed in both queries
and are combined when appropriate.
SSL Configuration
Customizing the SSL Configuration
By default, the adapter attempts to negotiate SSL/TLS by checking the server's certificate
against the system's trusted certificate store.
To specify another certificate, see the SSLServerCert property for the available formats to
do so.
Firewall and Proxy
Connecting Through a Firewall or Proxy
HTTP Proxies
To connect through the Windows system proxy, you do not need to set any additional
connection properties. To connect to other proxies, set ProxyAutoDetect to false.
To authenticate to an HTTP proxy, set ProxyAuthScheme, ProxyUser, and ProxyPassword,
in addition to ProxyServer and ProxyPort.
Other Proxies
Set the following properties:
- To use a proxy-based firewall, set FirewallType, FirewallServer, and FirewallPort.
- To tunnel the connection, set FirewallType to TUNNEL.
- To authenticate, specify FirewallUser and FirewallPassword.
- To authenticate to a SOCKS proxy, additionally set FirewallType to SOCKS5.
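For example, a sketch of HTTP proxy settings (the server, port, and credentials are placeholders):

ProxyAutoDetect=false;ProxyServer=192.168.1.100;ProxyPort=8080;ProxyAuthScheme=BASIC;ProxyUser=proxyuser;ProxyPassword=proxypassword;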
Query Processing
CData has a client-side SQL engine built into the adapter library. This enables support for
the full capabilities that SQL-92 offers, including filters, aggregations, functions, etc.
For sources that do not support SQL-92, the adapter offloads as much of the SQL
statement processing as possible to Google BigQuery and then processes the rest of the
query in memory (client-side). This results in optimal performance.
For data sources with limited query capabilities, the adapter handles transformations of
the SQL query to make it simpler for the adapter. The goal is to make smart decisions
based on the query capabilities of the data source to push down as much of the
computation as possible. The Google BigQuery Query Evaluation component examines SQL
queries and returns information indicating what parts of the query the adapter is not
capable of executing natively.
The Google BigQuery Query Slicer component is used in more specific cases to separate a
single query into multiple independent queries. The client-side Query Engine makes
decisions about simplifying queries, breaking queries into multiple queries, and pushing
down or computing aggregations on the client-side while minimizing the size of the result
set.
There's a significant trade-off in evaluating queries, even partially, client-side. There are
always queries that are impossible to execute efficiently in this model, and some can be
particularly expensive to compute in this manner. CData always pushes down as much of
the query as is feasible for the data source to generate the most efficient query possible
and provide the most flexible query capabilities.
More Information
For a full discussion of how CData handles query processing, see CData Architecture:
Query Execution.
Logging
Capturing adapter logging can be very helpful when diagnosing error messages or other
unexpected behavior.
Basic Logging
You will simply need to set two connection properties to begin capturing adapter logging.
- Logfile: A filepath which designates the name and location of the log file.
- Verbosity: This is a numerical value (1-5) that determines the amount of detail in the log. See the page in the Connection Properties section for an explanation of the five levels.
- MaxLogFileSize: When the limit is hit, a new log is created in the same folder with the date and time appended to the end. The default limit is 100 MB. Values lower than 100 kB will use 100 kB as the value instead.
- MaxLogFileCount: A string specifying the maximum file count of log files. When the limit is hit, a new log is created in the same folder with the date and time appended to the end and the oldest log file will be deleted. The minimum supported value is 2. A value of 0 or a negative value indicates no limit on the count.
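For instance, a sketch of a basic logging configuration (the file path and limit values are illustrative placeholders):

Logfile=C:\logs\googlebigquery.log;Verbosity=3;MaxLogFileSize=10MB;MaxLogFileCount=5;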
Once this property is set, the adapter will populate the log file as it carries out various
tasks, such as when authentication is performed or queries are executed. If the specified
file doesn't already exist, it will be created.
Log Verbosity
The verbosity level determines the amount of detail that the adapter reports to the Logfile.
Verbosity levels from 1 to 5 are supported. These are described in the following list:
- 1: Setting Verbosity to 1 will log the query, the number of rows returned by it, the start of execution and the time taken, and any errors.
- 2: Setting Verbosity to 2 will log everything included in Verbosity 1 and additional information about the request.
- 3: Setting Verbosity to 3 will additionally log HTTP headers, as well as the body of the request and the response.
- 4: Setting Verbosity to 4 will additionally log transport-level communication with the data source. This includes SSL negotiation.
- 5: Setting Verbosity to 5 will additionally log communication with the data source and additional details that may be helpful in troubleshooting problems. This includes interface commands.
The Verbosity should not be set to greater than 1 for normal operation. Substantial
amounts of data can be logged at higher verbosities, which can delay execution times.
To refine the logged content further by showing/hiding specific categories of information,
see LogModules.
Sensitive Data
Verbosity levels 3 and higher may capture information that you do not want shared
outside of your organization. The following lists information of concern for each level:
- Verbosity 3: The full body of the request and the response, which includes all the data returned by the adapter.
- Verbosity 4: SSL certificates.
- Verbosity 5: Any extra transfer data not included at Verbosity 3, such as non-human-readable binary transfer data.
Best Practices for Data Security
Although we mask sensitive values, such as passwords, in the connection string and any
request in the log, it is always best practice to review the logs for any sensitive information
before sharing outside your organization.
Java Logging
When Java logging is enabled in Logfile, the Verbosity will instead map to the following
logging levels.
- 0: Level.WARNING
- 1: Level.INFO
- 2: Level.CONFIG
- 3: Level.FINE
- 4: Level.FINER
- 5: Level.FINEST
Advanced Logging
You may want to refine the exact information that is recorded to the log file. This can be
accomplished using the LogModules property.
This property allows you to filter the logging using a semicolon-separated list of logging
modules.
All modules are four characters long. Please note that modules containing three letters
have a required trailing blank space. The available modules are:
- EXEC: Query Execution. Includes execution messages for original SQL queries, parsed SQL queries, and normalized SQL queries. Query and page success/failure messages appear here as well.
- INFO: General Information. Includes the connection string, driver version (build number), and initial connection messages.
- HTTP: HTTP Protocol messages. Includes HTTP requests/responses (including POST messages), as well as Kerberos-related messages.
- SSL : SSL certificate messages.
- OAUT: OAuth-related failure/success messages.
- SQL : Includes SQL transactions, SQL bulk transfer messages, and SQL result set messages.
- META: Metadata cache and schema messages.
- TCP : Incoming and outgoing raw bytes on TCP transport layer messages.
An example value for this property would be:
LogModules=INFO;EXEC;SSL ;SQL ;META;
Note that these modules refine the information as it is pulled after taking the Verbosity
into account.
SQL Compliance
The Google BigQuery Adapter supports several operations on data, including querying,
deleting, modifying, and inserting.
SELECT Statements
See SELECT Statements for a syntax reference and examples.
See Data Model for information on the capabilities of the Google BigQuery API.
INSERT Statements
See INSERT Statements for a syntax reference and examples, as well as retrieving the new
records' Ids.
UPDATE Statements
The primary key Id is required to update a record. See UPDATE Statements for a syntax
reference and examples.
DELETE Statements
The primary key Id is required to delete a record. See DELETE Statements for a syntax
reference and examples.
EXECUTE Statements
Use EXECUTE or EXEC statements to execute stored procedures. See EXECUTE Statements
for a syntax reference and examples.
Names and Quoting
- Table and column names are considered identifier names; as such, they are restricted to the following characters: [A-Z, a-z, 0-9, _:@].
- To use a table or column name with characters not listed above, the name must be quoted using double quotes ("name") in any SQL statement.
- Strings must be quoted using single quotes (e.g., 'John Doe').
SELECT Statements
Google BigQuery API Syntax
The Google BigQuery API offers additional SQL operators and functions. A complete list of
the available syntax is located at: https://cloud.google.com/bigquery/query-reference
A SELECT statement can consist of the following basic clauses.
- SELECT
- INTO
- FROM
- JOIN
- WHERE
- GROUP BY
- HAVING
- UNION
- ORDER BY
- LIMIT
SELECT Syntax
The following syntax diagram outlines the syntax supported by the Google BigQuery
adapter:
SELECT {
  [ TOP <numeric_literal> | DISTINCT ]
  {
    *
    | {
        <expression> [ [ AS ] <column_reference> ]
        | { <table_name> | <correlation_name> } .*
      } [ , ... ]
  }
  [ INTO csv:// [ filename= ] <file_path> [ ;delimiter=tab ] ]
  {
    FROM <table_reference> [ [ AS ] <identifier> ]
  } [ , ... ]
  [ [
      INNER | { { LEFT | RIGHT | FULL } [ OUTER ] }
    ] JOIN <table_reference> [ ON <search_condition> ] [ [ AS ] <identifier> ]
  ] [ ... ]
  [ WHERE <search_condition> ]
  [ GROUP BY <column_reference> [ , ... ]
  [ HAVING <search_condition> ]
  [
    ORDER BY <column_reference> [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]
  ]
  [
    LIMIT <expression>
  ]
} | SCOPE_IDENTITY()

<expression> ::=
  | <column_reference>
  | @ <parameter>
  | ?
  | COUNT( * | { [ DISTINCT ] <expression> } )
  | { AVG | MAX | MIN | SUM | COUNT } ( <expression> )
  | NULLIF ( <expression> , <expression> )
  | COALESCE ( <expression> , ... )
  | CASE <expression>
      WHEN { <expression> | <search_condition> } THEN { <expression> | NULL } [ ... ]
      [ ELSE { <expression> | NULL } ]
    END
  | <literal>
  | <sql_function>

<search_condition> ::=
  {
    <expression> { = | > | < | >= | <= | <> | != | LIKE | NOT LIKE | IN | NOT IN | IS NULL | IS NOT NULL | AND | OR | + | - | * | / | % | || } [ <expression> ]
  } [ { AND | OR } ... ]
Examples
1. Return all columns:
   SELECT * FROM [publicdata].[samples].github_nested
2. Rename a column:
   SELECT "repository.name" AS MY_repository.name FROM [publicdata].[samples].github_nested
3. Cast a column's data as a different data type:
   SELECT CAST(repository.watchers AS VARCHAR) AS Str_repository.watchers FROM [publicdata].[samples].github_nested
4. Search data:
   SELECT * FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework'
5. The Google BigQuery APIs support the following operators in the WHERE clause: =, >, <, >=, <=, <>, !=, LIKE, NOT LIKE, IN, NOT IN, IS NULL, IS NOT NULL, AND, OR, +, -, *, /, %, ||.
   SELECT * FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework';
6. Return the number of items matching the query criteria:
   SELECT COUNT(*) AS MyCount FROM [publicdata].[samples].github_nested
7. Return the unique items matching the query criteria:
   SELECT DISTINCT repository.name FROM [publicdata].[samples].github_nested
8. Summarize data:
   SELECT repository.name, MAX(repository.watchers) FROM [publicdata].[samples].github_nested GROUP BY repository.name
   See Aggregate Functions for details.
9. Retrieve data from multiple tables:
   SELECT * FROM CRMAccounts INNER JOIN ERPCustomers ON CRMAccounts.BillingState = ERPCustomers.BillingState
   See JOIN Queries for details.
10. Sort a result set in ascending order:
   SELECT actor.attributes.email, repository.name FROM [publicdata].[samples].github_nested ORDER BY repository.name ASC
Aggregate Functions
Google BigQuery API Syntax
The Google BigQuery API offers additional SQL operators and functions. A complete list of
the available syntax is located at: https://cloud.google.com/bigquery/query-reference
Examples of Aggregate Functions
Below are several examples of SQL aggregate functions. You can use these with a GROUP
BY clause to aggregate rows based on the specified GROUP BY criterion. This can be
useful in reporting tools.
COUNT
Returns the number of rows matching the query criteria.
SELECT COUNT(*) FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework'
AVG
Returns the average of the column values.
SELECT repository.name, AVG(repository.watchers) FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework' GROUP BY repository.name
MIN
Returns the minimum column value.
SELECT MIN(repository.watchers), repository.name FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework' GROUP BY repository.name
MAX
Returns the maximum column value.
SELECT repository.name, MAX(repository.watchers) FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework' GROUP BY repository.name
SUM
Returns the total sum of the column values.
SELECT SUM(repository.watchers) FROM [publicdata].[samples].github_nested WHERE repository.name = 'EntityFramework'
CORR
Returns the Pearson correlation coefficient of a set of number pairs.
SELECT repository.name, CORR(repository.watchers, repository.size) FROM [publicdata].[samples].github_nested
COVAR_POP
Computes the population covariance of the values computed by a set of number pairs.
SELECT repository.name, COVAR_POP(repository.watchers, repository.size) FROM [publicdata].[samples].github_nested
COVAR_SAMP
Computes the sample covariance of the values computed by a set of number pairs.
SELECT repository.name, COVAR_SAMP(repository.watchers, repository.size) FROM [publicdata].[samples].github_nested
NTH
Returns the nth sequential value in the scope of the function, where n is a constant. The
NTH function starts counting at 1, so there is no zeroth term. If the scope of the function
has fewer than n values, the function returns NULL.
SELECT repository.name, NTH(n, actor.attributes.email) FROM [publicdata].[samples].github_nested
STDDEV
Returns the standard deviation of the computed values. Rows with a NULL value are not
included in the calculation.
SELECT repository.name, STDDEV(repository.watchers) FROM [publicdata].[samples].github_nested
JOIN Queries
The adapter supports the complete join syntax in Google BigQuery. Google BigQuery
supports inner joins, outer joins, and cross joins. The default is inner. Multiple join
operations are supported.
SELECT field_1 [..., field_n] FROM
table_1 [[AS] alias_1]
[[INNER|[FULL|RIGHT|LEFT] OUTER|CROSS] JOIN [EACH]
table_2 [[AS] alias_2]
[ON join_condition_1 [... AND join_condition_n]]
]+
Note that the default join is an inner join. The following limitations exist on joins in Google
BigQuery:
- Cross joins must not contain an ON clause.
- Normal joins require that the right-side table contain less than 8 MB of compressed data. If you are working with tables larger than 8 MB, use the EACH modifier, as shown in the sketch below. Note that EACH cannot be used in cross joins.
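For example, a minimal sketch (with hypothetical Orders and Customers tables) that performs a left outer join with the EACH modifier:
SELECT Orders.OrderId, Customers.Name FROM Orders
LEFT OUTER JOIN EACH Customers ON Orders.CustomerId = Customers.CustomerId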
Projection Functions
ANY_VALUE(expression)
Returns any value from the input or NULL if there are zero input rows. The value returned
is non-deterministic, which means you might receive a different result each time you use
this function.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to retrieve a value from.
APPROX_COUNT_DISTINCT(expression)
Returns the approximate result for COUNT(DISTINCT expression). The value returned is a
statistical estimate, not necessarily the actual value. This function is less accurate than
COUNT(DISTINCT expression), but performs better on huge input.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to perform the approximate count distinct on.
APPROX_QUANTILES(expression, number)
Returns the approximate boundaries for a group of expression values, where number
represents the number of quantiles to create. This function returns an array of number + 1
elements, where the first element is the approximate minimum and the last element is the
approximate maximum.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to perform the approximate quantiles on.
- number: The number of quantiles to create.
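For example, a minimal Standard SQL sketch (assuming a hypothetical Sales table with a numeric Amount column) that computes quartile boundaries:
SELECT APPROX_QUANTILES(Amount, 4) FROM Sales
This returns a five-element array: the approximate minimum, the three quartile boundaries, and the approximate maximum.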
APPROX_TOP_COUNT(expression, number)
Returns the approximate top elements of expression. The number parameter specifies the
number of elements returned.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to perform the approximate top count on.
- number: The number of elements to be returned.
APPROX_TOP_SUM(expression, weight, number)
Returns the approximate top elements of expression, based on the sum of an assigned
weight. The number parameter specifies the number of elements returned. If the weight
input is negative or NaN, this function returns an error.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to perform the approximate top sum on.
- weight: The assigned weight.
- number: The number of elements to be returned.
ARRAY(subquery)
The ARRAY function returns an ARRAY with one element for each row in a subquery.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- subquery: The subquery to execute.
ARRAY_CONCAT(array_expr1 [, array_expr2 [, ...]])
Concatenates one or more arrays with the same element type into a single array.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- array_expr1: The first array to concatenate.
- array_expr2: The second array to concatenate.
ARRAY_LENGTH(array_expr)
Returns the size of the array. Returns 0 for an empty array. Returns NULL if array_expr is NULL.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- array_expr: The array expression to get the size of.
ARRAY_TO_STRING(array_expr, delimiter [, null_text])
Returns a concatenation of the elements in array_expr as a STRING. The value for
array_expr can either be an array of STRING or BYTES data types.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- array_expr: The array expression to convert to a string.
- delimiter: The delimiter string used to delimit the array values.
- null_text: If the null_text parameter is used, the function replaces any NULL values in the array with the value of null_text. If the null_text parameter is not used, the function omits the NULL value and its preceding delimiter.
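For example, a minimal Standard SQL sketch using literal values:
SELECT ARRAY_TO_STRING(['a', NULL, 'c'], '-', 'none')
This returns 'a-none-c'; without the null_text argument it would return 'a-c'.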
GENERATE_ARRAY(start_expr, end_expr [, step_expr])
Returns an array of values. The start_expr and end_expr parameters determine the inclusive
start and end of the array.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- start_expr: The starting value.
- end_expr: The ending value.
- step_expr: The increment used to generate array values.
GENERATE_DATE_ARRAY(start_date, end_date [, INTERVAL
int_expr date_part])
Returns an array of dates. The start_date and end_date parameters determine the inclusive
start and end of the array.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- start_date: The starting date.
- end_date: The ending date.
- int_expr: The increment used to generate dates.
- date_part: The date part used to increment the generated dates. Valid values are: DAY, WEEK, MONTH, QUARTER, and YEAR.
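For example, a minimal Standard SQL sketch that generates the first day of each month in a quarter:
SELECT GENERATE_DATE_ARRAY('2023-01-01', '2023-03-01', INTERVAL 1 MONTH)
This returns ['2023-01-01', '2023-02-01', '2023-03-01'].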
ARRAY_REVERSE(array_expr)
Returns the input ARRAY with elements in reverse order.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- array_expr: The array to reverse.
ARRAY_AGG(expression)
Returns an ARRAY of expression values.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression values to generate an array from.
ARRAY_CONCAT_AGG(expression1[, expression2 [,...]])
Concatenates elements from expression of type ARRAY, returning a single ARRAY as a
result. This function ignores NULL input arrays, but respects the NULL elements in non-
NULL input arrays (an error is raised, however, if an array in the final query result contains
a NULL element).
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression1: The first expression to concatenate.
- expression2: The second expression to concatenate.
AVG([DISTINCT] expression)
Returns the average of non-null values. Each distinct value of expression is aggregated
only once into the result.
- expression: The expression to use to compute the average.
BIT_AND(numeric_expression)
Returns the result of a bitwise AND operation between each instance of numeric_expr
across all rows. NULL values are ignored. This function returns NULL if all instances of
numeric_expr evaluate to NULL.
- numeric_expression: The numeric expression to perform the bitwise operation on.
BIT_COUNT(expression)
The input, expression, must be an integer or BYTES. Returns the number of bits that are set
in the input expression. For integers, this is the number of bits in two's complement form.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to perform the bit count operation on.
BIT_OR(numeric_expression)
Returns the result of a bitwise OR operation between each instance of numeric_expr across
all rows. NULL values are ignored. This function returns NULL if all instances of numeric_
expr evaluate to NULL.
- numeric_expression: The numeric expression to perform the bitwise operation on.
BIT_XOR(numeric_expression)
Returns the result of a bitwise XOR operation between each instance of numeric_expr
across all rows. NULL values are ignored. This function returns NULL if all instances of
numeric_expr evaluate to NULL.
- numeric_expression: The numeric expression to perform the bitwise operation on.
CORR(numeric_expression1, numeric_expression2)
Returns the Pearson correlation coefficient of a set of number pairs.
- numeric_expression1: The first series.
- numeric_expression2: The second series.
COUNTIF(expression)
Returns the count of TRUE values for expression. Returns 0 if there are zero input rows or
expression evaluates to FALSE for all rows.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to evaluate.
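For example, a minimal Standard SQL sketch (assuming a hypothetical Orders table with a Status column):
SELECT COUNTIF(Status = 'shipped') FROM Orders
This counts only the rows where the predicate evaluates to TRUE.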
COVAR_POP(numeric_expression1, numeric_expression2)
Computes the population covariance of the values computed by numeric_expression1 and
numeric_expression2.
- numeric_expression1: The first series.
- numeric_expression2: The second series.
COVAR_SAMP(numeric_expression1, numeric_expression2)
Computes the sample covariance of the values computed by numeric_expression1 and
numeric_expression2.
- numeric_expression1: The first series.
- numeric_expression2: The second series.
FIRST(column)
Returns the first sequential value in the scope of the function.
Note: this function is only available when UseLegacySQL=True.
- column: Any column expression.
FIRST_VALUE(value_expression [(IGNORE/RESPECT) NULLS])
Returns the value of the value_expression for the first row in the current window frame.
Note: this function only supports [IGNORE NULLS] when using Standard SQL
(UseLegacySQL=False).
- value_expression: Any value expression.
GROUP_CONCAT(string_expression [, separator])
Concatenates multiple strings into a single string, where each value is separated by the
optional separator parameter. If separator is omitted, returns a comma-separated string.
Note: this function is only available when UseLegacySQL=True.
- string_expression: The string expression to concat.
- separator: The separator.
GROUP_CONCAT_UNQUOTED(string_expression [, separator])
Concatenates multiple strings into a single string, where each value is separated by the
optional separator parameter. If separator is omitted, BigQuery returns a comma-separated
string. Unlike GROUP_CONCAT, this function will not add double quotes to returned values
that include a double quote character. For example, the string a"b would return as a"b.
Note: this function is only available when UseLegacySQL=True.
- string_expression: The string expression to concat.
- separator: The separator.
LAST(column)
Returns the last sequential value in the scope of the function.
Note: this function is only available when UseLegacySQL=True.
- column: Any column expression.
LAST_VALUE(value_expression [(IGNORE/RESPECT) NULLS])
Returns the value of the value_expression for the last row in the current window frame.
Note: this function only supports [IGNORE NULLS] when using Standard SQL
(UseLegacySQL=False).
- value_expression: Any value expression.
LOGICAL_AND(expression)
Returns the logical AND of all non-NULL expressions. Returns NULL if there are zero input
rows or expression evaluates to NULL for all rows.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to perform the logical AND on.
LOGICAL_OR(expression)
Returns the logical OR of all non-NULL expressions. Returns NULL if there are zero input
rows or expression evaluates to NULL for all rows.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to perform the logical OR on.
NEST(expression)
Aggregates all values in the current aggregation scope into a repeated field. For example,
the query SELECT x, NEST(y) FROM ... GROUP BY x returns one output record for each
distinct x value, and contains a repeated field for all y values paired with x in the query
input. The NEST function requires a GROUP BY clause.
Note: this function is only available when UseLegacySQL=True.
- expression: Any expression.
NOW()
Returns the current UNIX timestamp in microseconds.
Note: this function is only available when UseLegacySQL=True.
NTH(n, field)
Returns the nth sequential value in the scope of the function, where n is a constant. The
NTH function starts counting at 1, so there is no zeroth term. If the scope of the function
has less than n values, the function returns NULL.
Note: this function is only available when UseLegacySQL=True.
- n: The nth sequential value.
- field: The column name.
NTH_VALUE(value_expression, constant_integer_expression)
Returns the value of value_expression at the Nth row of the current window frame, where
Nth is defined by constant_integer_expression. Returns NULL if there is no such row.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value_expression: Any value expression.
- constant_integer_expression: The nth sequential value.
QUANTILES(expression [, buckets])
Computes approximate minimum, maximum, and quantiles for the input expression. NULL
input values are ignored. Empty or exclusively-NULL input results in NULL output. The
number of quantiles computed is controlled with the optional buckets parameter, which
includes the minimum and maximum in the count.
Note: this function is only available when UseLegacySQL=True.
- expression: The numeric expression to compute quantiles on.
- buckets: The number of buckets.
STDDEV(numeric_expression)
Returns the standard deviation of the values computed by numeric_expr. Rows with a
NULL value are not included in the calculation.
- numeric_expression: The series to calculate STDDEV on.
STDDEV_POP(numeric_expression)
Computes the population standard deviation of the value computed by numeric_expr.
- numeric_expression: The series to calculate STDDEV on.
STDDEV_SAMP([DISTINCT] numeric_expression)
Computes the sample standard deviation of the value computed by numeric_expr.
- numeric_expression: The series to calculate STDDEV on.
STRING_AGG(expression[, delimiter])
Returns a value (either STRING or BYTES) obtained by concatenating non-null values. If a
delimiter is specified, concatenated values are separated by that delimiter; otherwise, a
comma is used as a delimiter.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The string expression to concatenate.
- delimiter: The delimiter to separate concatenated values.
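For example, a minimal Standard SQL sketch (assuming a hypothetical Employees table):
SELECT Department, STRING_AGG(Name, '; ') FROM Employees GROUP BY Department
This returns one semicolon-separated list of names per department.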
SUM([DISTINCT] expression)
Returns the sum of non-null values. Each distinct value of expression is aggregated only
once into the result.
- expression: The expression to use to compute the sum.
TOP(column [, max_values][, multiplier])
TOP is a function that is an alternative to the GROUP BY clause. It is used as simplified
syntax for GROUP BY ... ORDER BY ... LIMIT .... Generally, the TOP function performs faster
than the full ... GROUP BY ... ORDER BY ... LIMIT ... query, but may only return approximate
results.
Note: this function is only available when UseLegacySQL=True.
- column: The column to return top values for.
- max_values: The maximum number of results to return. Default is 20.
- multiplier: A positive integer that increases the value(s) returned by COUNT(*) by the multiple specified.
UNIQUE(expression)
Returns the set of unique, non-NULL values in the scope of the function in an undefined
order.
Note: this function is only available when UseLegacySQL=True.
- expression: Any expression.
VARIANCE(numeric_expression)
Computes the variance of the values computed by numeric_expr. Rows with a NULL value
are not included in the calculation.
- numeric_expression: The series to calculate VARIANCE on.
VAR_POP(numeric_expression)
Computes the population variance of the values computed by numeric_expr.
- numeric_expression: The series to calculate VARIANCE on.
VAR_SAMP([DISTINCT] numeric_expression)
Computes the sample variance of the values computed by numeric_expr.
- numeric_expression: The series to calculate VARIANCE on.
RANK()
Returns the ordinal (1-based) rank of each row within the ordered partition. All peer rows
receive the same rank value. The next row or set of peer rows receives a rank value which
increments by the number of peers with the previous rank value, instead of a rank value
which always increments by 1.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
PERCENT_RANK()
Returns the percentile rank of a row, defined as (RK-1)/(NR-1), where RK is the RANK of the
row and NR is the number of rows in the partition. Returns 0 if NR=1.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
NTILE(constant_integer_expression)
This function divides the rows into constant_integer_expression buckets based on row
ordering and returns the 1-based bucket number that is assigned to each row. The number
of rows in the buckets can differ by at most 1. The remainder values (the remainder of the
number of rows divided by buckets) are distributed one for each bucket, starting with
bucket 1. If constant_integer_expression evaluates to NULL, 0, or a negative value, an error
is raised.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- constant_integer_expression: The number of buckets to divide the rows into.
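For example, a minimal Standard SQL sketch (assuming a hypothetical Results table) that assigns each row to a quartile:
SELECT Name, Score, NTILE(4) OVER (ORDER BY Score) FROM Results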
LEAD(value_expression [, offset [, default_expression]])
Returns the value of the value_expression on a subsequent row. Changing the offset value
changes which subsequent row is returned; the default value is 1, indicating the next row
in the window frame. An error occurs if offset is NULL or a negative value.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value_expression: The value expression.
- offset: The offset to use. Must be a non-negative integer.
- default_expression: The default expression. Must be compatible with the value_expression type.
LAG(value_expression [, offset [, default_expression]])
Returns the value of the value_expression on a preceding row. Changing the offset value
changes which preceding row is returned; the default value is 1, indicating the previous row
in the window frame. An error occurs if offset is NULL or a negative value.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value_expression: The value expression.
- offset: The offset to use. Must be a non-negative integer.
- default_expression: The default expression. Must be compatible with the value_expression type.
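For example, a minimal Standard SQL sketch (assuming a hypothetical Orders table) that pairs each order with the previous order's amount:
SELECT OrderDate, Amount, LAG(Amount) OVER (ORDER BY OrderDate) FROM Orders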
PERCENTILE_CONT(value_expression [, percentile [{RESPECT
| IGNORE} NULLS]])
Computes the specified percentile value for the value_expression, with linear interpolation.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value_expression: A numeric expression.
- percentile: A literal in the range [0, 1].
PERCENTILE_DISC(value_expression, percentile [{RESPECT |
IGNORE} NULLS])
Computes the specified percentile value for a discrete value_expression. The returned
value is the first sorted value of value_expression with cumulative distribution greater than
or equal to the given percentile value.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value_expression: Any orderable type.
- percentile: A literal in the range [0, 1].
COALESCE(expr1 [, expr2 [, ...]])
Returns the value of the first non-null expression. The remaining expressions are not
evaluated. All input expressions must be implicitly coercible to a common supertype.
Note: this function currently accepts up to 9 expressions.
- expr1: Any expression.
- expr2: Any expression.
NULLIF(expression, expression_to_match)
Returns NULL if expression = expression_to_match is true, otherwise returns expression.
expression and expression_to_match must be implicitly coercible to a common supertype;
equality comparison is done on coerced values.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: Any expression.
- expression_to_match: Any expression to be matched.
CUME_DIST()
Returns the relative rank of a row, defined as NP/NR, where NP is the number of rows that
either precede or are peers with the current row, and NR is the number of rows in the
partition.
Note: this function returns a double when using Legacy SQL (UseLegacySQL=True).
DENSE_RANK()
Returns the ordinal (1-based) rank of each row within the window partition. All peer rows
receive the same rank value, and the subsequent rank value is incremented by one.
ROW_NUMBER()
Does not require the ORDER BY clause. Returns the sequential row ordinal (1-based) of
each row for each ordered partition. If the ORDER BY clause is unspecified then the result is
non-deterministic.
IFNULL(expr, null_result)
If expr is NULL, returns null_result. Otherwise, returns expr. If expr is not NULL, null_result
is not evaluated. expr and null_result must be implicitly coercible to a common supertype.
Synonym for COALESCE(expr, null_result).
- expr: Any expression.
- null_result: The result to return if expr is null.
CAST(expression AS type)
Cast is used in a query to indicate that the result type of an expression should be
converted to some other type.
- expression: The expression to cast.
- type: The type to cast the expression to.
SAFE_CAST(expression, type)
Cast is used in a query to indicate that the result type of an expression should be
converted to some other type. SAFE_CAST is identical to CAST, except that it returns NULL
instead of raising an error.
- expression: The expression to cast.
- type: The type to cast the expression to.
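For example, a minimal Standard SQL sketch:
SELECT SAFE_CAST('apple' AS INT64)
CAST would raise an error here; SAFE_CAST returns NULL instead.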
CURRENT_DATE()
Returns a human-readable string of the current date in the format %Y-%m-%d.
DATE(timestamp [, timezone])
Converts a timestamp_expression to a DATE data type.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp from which to return the date.
- timezone: The timezone to use when converting the timestamp. If not specified, the default timezone, UTC, is used.
DATEDIFF(timestamp1, timestamp2)
Returns the number of days between two TIMESTAMP data types. The result is positive if
the first TIMESTAMP data type comes after the second TIMESTAMP data type, and
otherwise the result is negative.
Note: this function is only available when UseLegacySQL=True.
- timestamp1: The first timestamp.
- timestamp2: The second timestamp.
DATE_DIFF(date1, date2, date_part)
Computes the number of specified date_part differences between two date expressions.
This can be thought of as the number of date_part boundaries crossed between the two
dates. If the first date occurs before the second date, then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- date1: The first date.
- date2: The second date.
- date_part: The date part. Supported values are: DAY, MONTH, QUARTER, YEAR.
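For example, a minimal Standard SQL sketch:
SELECT DATE_DIFF(DATE '2023-03-15', DATE '2023-01-01', DAY)
This returns 73, the number of day boundaries crossed between the two dates.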
DATE_TRUNC(date, date_part)
Truncates the date to the specified granularity.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- date: The date to truncate.
- date_part: The date part. Supported values are: DAY, WEEK, ISOWEEK, MONTH, QUARTER, YEAR, ISOYEAR.
FORMAT_DATE(format_string, date_expr)
Formats the date_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the date_expr.
- date_expr: The date to format.
PARSE_DATE(format_string, date_string)
Uses a format_string and a string representation of a date to return a DATE object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the date_string.
- date_string: The date string to parse.
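For example, minimal Standard SQL sketches for each direction:
SELECT FORMAT_DATE('%Y%m%d', DATE '2023-03-15')
SELECT PARSE_DATE('%Y%m%d', '20230315')
The first returns the string '20230315'; the second returns the DATE 2023-03-15.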
CURRENT_DATETIME([timezone])
Returns the current time as a DATETIME object.
- timezone: The timezone to use when retrieving the current datetime object.
DATETIME(timestamp [, timezone])
Constructs a DATETIME object using a TIMESTAMP object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp from which to return the datetime.
- timezone: The timezone to use when converting the timestamp. If not specified, the default timezone, UTC, is used.
DATETIME_DIFF(datetime1, datetime2, date_part)
Computes the number of specified date_part differences between two date expressions.
This can be thought of as the number of date_part boundaries crossed between the two
dates. If the first date occurs before the second date, then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- datetime1: The first datetime.
- datetime2: The second datetime.
- date_part: The date part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, and YEAR.
DATETIME_TRUNC(datetime, part)
Truncates the datetime to the specified granularity.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- datetime: The datetime to truncate.
- part: The date part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, ISOWEEK, MONTH, QUARTER, YEAR, and ISOYEAR.
FORMAT_DATETIME(format_string, datetime_expr)
Formats the datetime_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the datetime_expr.
- datetime_expr: The datetime to format.
PARSE_DATETIME(format_string, datetime_string)
Uses a format_string and a string representation of a datetime to return a DATETIME object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the datetime_string.
- datetime_string: The datetime string to parse.
CURRENT_TIME()
Returns a human-readable string of the server's current time in the format %H:%M:%S.
TIME(timestamp [, timezone])
Constructs a TIME object using a TIMESTAMP object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp from which to return the time.
- timezone: The timezone to use when converting the timestamp. If not specified, the default timezone, UTC, is used.
TIME_DIFF(time1, time2, time_part)
Computes the number of specified time_part differences between two time expressions.
This can be thought of as the number of time_part boundaries crossed between the two
times. If the first time occurs before the second time, then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- time1: The first time.
- time2: The second time.
- time_part: The time part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR.
TIME_TRUNC(time, part)
Truncates the time to the specified granularity.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- time: The time to truncate.
- part: The time part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR.
FORMAT_TIME(format_string, time_expr)
Formats the time_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the time_expr.
- time_expr: The time to format.
PARSE_TIME(format_string, time_string)
Uses a format_string and a string representation of a time to return a TIME object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the time_string.
- time_string: The time string to parse.
CURRENT_TIMESTAMP()
Returns a TIMESTAMP data type of the server's current time in the format %Y-%m-%d
%H:%M:%S.
TIMESTAMP_DIFF(timestamp1, timestamp2, time_part)
Computes the number of specified time_part differences between two timestamp
expressions. This can be thought of as the number of time_part boundaries crossed
between the two timestamps. If the first timestamp occurs before the second timestamp,
then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp1: The first timestamp.
- timestamp2: The second timestamp.
- time_part: The timestamp part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR.
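For example, a minimal Standard SQL sketch:
SELECT TIMESTAMP_DIFF(TIMESTAMP '2023-01-01 12:00:00', TIMESTAMP '2023-01-01 10:30:00', MINUTE)
This returns 90.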
FORMAT_TIMESTAMP(format_string, timestamp_expr)
Formats the timestamp_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the timestamp_expr.
- timestamp_expr: The timestamp to format.
PARSE_TIMESTAMP(format_string, timestamp_string)
Uses a format_string and a string representation of a timestamp to return a TIMESTAMP
object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the timestamp_string.
- timestamp_string: The timestamp string to parse.
DAY(timestamp)
Returns the day of the month of a TIMESTAMP data type as an integer between 1 and 31,
inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the day of the month.
DAYOFWEEK(timestamp)
Returns the day of the week of a TIMESTAMP data type as an integer between 1 (Sunday)
and 7 (Saturday), inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the day of the week.
DAYOFYEAR(timestamp)
Returns the day of the year of a TIMESTAMP data type as an integer between 1 and 366,
inclusively. The integer 1 refers to January 1.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the day of the year.
FORMAT_UTC_USEC(unix_timestamp)
Returns a human-readable string representation of a UNIX timestamp in the format YYYY-
MM-DD HH:MM:SS.uuuuuu.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to format.
HOUR(timestamp)
Returns the hour of a TIMESTAMP data type as an integer between 0 and 23, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the hour as an integer.
MINUTE(timestamp)
Returns the minutes of a TIMESTAMP data type as an integer between 0 and 59, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the minutes as an integer.
MONTH(timestamp)
Returns the month of a TIMESTAMP data type as an integer between 1 and 12, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the month as an integer.
MSEC_TO_TIMESTAMP(unix_timestamp)
Converts a UNIX timestamp in milliseconds to a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
PARSE_UTC_USEC(date_string)
Converts a date string to a UNIX timestamp in microseconds. date_string must have the
format YYYY-MM-DD HH:MM:SS[.uuuuuu]. The fractional part of the second can be up to 6
digits long or can be omitted.
Note: this function is only available when UseLegacySQL=True.
- date_string: The date string to convert.
QUARTER(timestamp)
Returns the quarter of the year of a TIMESTAMP data type as an integer between 1 and 4,
inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the quarter as an integer.
SEC_TO_TIMESTAMP(unix_timestamp)
Converts a UNIX timestamp in seconds to a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
SECOND(timestamp)
Returns the seconds of a TIMESTAMP data type as an integer between 0 and 59, inclusively.
During a leap second, the integer range is between 0 and 60, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the second as an integer.
STRFTIME_UTC_USEC(unix_timestamp, date_format_str)
Returns a human-readable date string in the format date_format_str. date_format_str can
include date-related punctuation characters (such as / and -) and special characters
accepted by the strftime function in C++ (such as %d for day of month).
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
- date_format_str: The date format string.
TIMESTAMP_SECONDS(unix_timestamp)
Interprets unix_timestamp as the number of seconds since 1970-01-01 00:00:00 UTC.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- unix_timestamp: The unix timestamp to convert.
TIMESTAMP_MILLIS(unix_timestamp)
Interprets unix_timestamp as the number of milliseconds since 1970-01-01 00:00:00 UTC.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- unix_timestamp: The unix timestamp to convert.
TIMESTAMP_MICROS(unix_timestamp)
Interprets unix_timestamp as the number of microseconds since 1970-01-01 00:00:00 UTC.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- unix_timestamp: The unix timestamp to convert.
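For example, a minimal Standard SQL sketch:
SELECT TIMESTAMP_SECONDS(1672531200)
This returns the TIMESTAMP 2023-01-01 00:00:00 UTC.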
TIMESTAMP_TO_MSEC(timestamp)
Converts a TIMESTAMP data type to a UNIX timestamp in milliseconds.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp to convert.
TIMESTAMP_TO_SEC(timestamp)
Converts a TIMESTAMP data type to a UNIX timestamp in seconds.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp to convert.
TIMESTAMP_TO_USEC(timestamp)
Converts a TIMESTAMP data type to a UNIX timestamp in microseconds.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp to convert.
UNIX_DATE(date_string)
Returns the number of days since 1970-01-01.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- date_string: The date string to convert.
UNIX_SECONDS(timestamp)
Returns the number of seconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of
precision.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp to convert.
UNIX_MILLIS(timestamp)
Returns the number of milliseconds since 1970-01-01 00:00:00 UTC. Truncates higher levels
of precision.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp to convert.
UNIX_MICROS(timestamp)
Returns the number of microseconds since 1970-01-01 00:00:00 UTC. Truncates higher
levels of precision.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp to convert.
USEC_TO_TIMESTAMP(unix_timestamp)
Converts a UNIX timestamp in microseconds to a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
UTC_USEC_TO_DAY(unix_timestamp)
Shifts a UNIX timestamp in microseconds to the beginning of the day it occurs in.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
UTC_USEC_TO_HOUR(unix_timestamp)
Shifts a UNIX timestamp in microseconds to the beginning of the hour it occurs in.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
UTC_USEC_TO_MONTH(unix_timestamp)
Shifts a UNIX timestamp in microseconds to the beginning of the month it occurs in.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
UTC_USEC_TO_WEEK(unix_timestamp, day_of_week)
Returns a UNIX timestamp in microseconds that represents a day in the week of the unix_
timestamp argument.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
- day_of_week: A day of the week from 0 (Sunday) to 6 (Saturday).
UTC_USEC_TO_YEAR(unix_timestamp)
Returns a UNIX timestamp in microseconds that represents the year of the unix_timestamp
argument.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
WEEK(timestamp)
Returns the week of a TIMESTAMP data type as an integer between 1 and 53, inclusively.
Weeks begin on Sunday, so if January 1 is on a day other than Sunday, week 1 has fewer
than 7 days and the first Sunday of the year is the first day of week 2.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the week as an integer.
YEAR(timestamp)
Returns the year of a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the year as an integer.
ABS(expression)
Returns the absolute value of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
ACOS(expression)
Returns the arc cosine of the argument.
- expression: Any column or literal expression.
ACOSH(expression)
Returns the arc hyperbolic cosine of the argument.
- expression: Any column or literal expression.
ASIN(expression)
Returns the arcsine of the argument, in radians.
- expression: Any column or literal expression.
ASINH(expression)
Returns the arc hyperbolic sine of the argument.
- expression: Any column or literal expression.
ATAN(expression)
Returns the arc tangent of the argument.
- expression: Any column or literal expression.
ATANH(expression)
Returns the arc hyperbolic tangent of the argument.
- expression: Any column or literal expression.
ATAN2(expression1, expression2)
Returns the arc tangent of the two arguments.
- expression1: Any column or literal expression.
- expression2: Any column or literal expression.
CEIL(expression)
Rounds the argument up to the nearest whole number and returns the rounded value.
- expression: Any column or literal expression.
CEILING(expression)
Synonym for the CEIL function.
- expression: Any column or literal expression.
COS(expression)
Returns the cosine of the argument.
- expression: Any column or literal expression.
COSH(expression)
Returns the hyperbolic cosine of the argument.
- expression: Any column or literal expression.
DEGREES(expression)
Returns expression, converted from radians to degrees.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
EXP(expression)
Returns the result of raising the constant e, the base of the natural logarithm, to the
power of expression.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
FLOOR(expression)
Rounds the argument down to the nearest whole number and returns the rounded value.
- expression: Any column or literal expression.
LN(expression)
Returns the natural logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
LOG(expression)
Returns the natural logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
LOG2(expression)
Returns the Base-2 logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
LOG10(expression)
Returns the Base-10 logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
PI()
Returns PI.
Note: this function is only available when UseLegacySQL=True.
POW(expression1, expression2)
Returns the result of raising expression1 to the power of expression2.
- expression1: Any column or literal expression.
- expression2: Any column or literal expression.
POWER(expression1, expression2)
Synonym for the POW function.
- expression1: Any column or literal expression.
- expression2: Any column or literal expression.
RADIANS(expression)
Returns expression, converted from degrees to radians.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
RAND([expression])
Returns a random float value in the range 0.0 <= value < 1.0. Each int32_seed value always
generates the same sequence of random numbers within a given query, as long as you
don't use a LIMIT clause. If int32_seed is not specified, BigQuery uses the current
timestamp as the seed value.
Note: this function is only available when UseLegacySQL=True.
- expression: The optional int32_seed value.
ROUND(expression [, integer_digits])
Rounds the argument either up or down to the nearest whole number (or if specified, to
the specified number of digits) and returns the rounded value.
- expression: Any column or literal expression.
- integer_digits: The number of digits to round to.
GREATEST(value1[, value2 [, ...]])
Returns NULL if any of the inputs is NULL. Otherwise, returns NaN if any of the inputs is
NaN. Otherwise, returns the largest value among X1,...,XN according to the < comparison.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value1: The first value to compare.
- value2: The second value to compare.
LEAST(value1[, value2 [, ...]])
Returns NULL if any of the inputs is NULL. Otherwise, returns NaN if any of the inputs is
NaN. Otherwise, returns the smallest value among X1,...,XN according to the > comparison.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value1: The first value to compare.
- value2: The second value to compare.
SIN(expression)
Returns the sine of the argument.
- expression: Any column or literal expression.
SINH(expression)
Returns the hyperbolic sine of the argument.
- expression: Any column or literal expression.
SQRT(expression)
Returns the square root of the expression.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
TAN(expression)
Returns the tangent of the argument.
- expression: Any column or literal expression.
TANH(expression)
Returns the hyperbolic tangent of the argument.
- expression: Any column or literal expression.
TRUNC(expression [, integer_digits])
Rounds X to the nearest integer whose absolute value is not greater than the absolute
value of X. When the integer_digits parameter is specified, this function is similar to
ROUND(X, N) but always rounds towards zero. Unlike ROUND(X, N), it never overflows.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: Any column or literal expression.
- integer_digits: The number of digits to round to.
BYTE_LENGTH(str)
Returns the length of the value in bytes, regardless of whether the type of the value is
STRING or BYTES.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to calculate the length on.
CHAR_LENGTH(str)
Returns the length of the STRING in characters.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to calculate the length on.
CONCAT(str1, str2 [, str3] [, ...])
Returns the concatenation of two or more strings, or NULL if any of the values are NULL.
- str1: The first string to concatenate.
- str2: The second string to concatenate.
- str3: The third string to concatenate.
ENDS_WITH(str1, str2)
Takes two values. Returns TRUE if the second value is a suffix of the first.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to search in.
- str2: The string to search for.
FROM_BASE64(string_expr)
Converts the base64-encoded input string_expr into BYTES format. To convert BYTES to a
base64-encoded STRING, use TO_BASE64.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert from base64 encoding.
FROM_HEX(string_expr)
Converts a hexadecimal-encoded STRING into BYTES format. Returns an error if the input
STRING contains characters outside the range (0..9, A..F, a..f). The lettercase of the
characters does not matter. To convert BYTES to a hexadecimal-encoded STRING, use TO_
HEX.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert from hexadecimal encoding.
INSTR(str1, str2)
Returns the one-based index of the first occurrence of str2 in str1, or returns 0 if str2 does
not occur in str1.
Note: this function is only available when UseLegacySQL=True.
- str1: The string to search in.
- str2: The string to search for.
LEFT(str, numeric_expression)
Returns the leftmost numeric_expr characters of str. If the number is longer than str, the
full string will be returned. Example: LEFT('seattle', 3) returns sea.
Note: this function is only available when UseLegacySQL=True.
- str: The string to perform the LEFT operation on.
- numeric_expression: The number of characters to return.
LENGTH(str)
Returns a numerical value for the length of the string. Example: if str is '123456', LENGTH
returns 6.
- str: The string to calculate the length on.
LOWER(str)
Returns the original string with all characters in lower case.
- str: The string to lower.
LPAD(str1, numeric_expression[, str2])
Pads str1 on the left with str2, repeating str2 until the result string is exactly numeric_expr
characters. Example: LPAD('1', 7, '?') returns ??????1.
- str1: The string to pad.
- numeric_expression: The length of the result string, in characters.
- str2: The pad characters.
LTRIM(str1 [, str2])
Removes characters from the left side of str1. If str2 is omitted, LTRIM removes spaces from
the left side of str1. Otherwise, LTRIM removes any characters in str2 from the left side of
str1 (case-sensitive).
- str1: The string to trim.
- str2: The characters to trim from str1.
REPEAT(str, repetitions)
Returns a value that consists of str, repeated. The repetitions parameter specifies the
number of times to repeat str. Returns NULL if either str or repetitions is NULL. This
function returns an error if the repetitions value is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to repeat.
- repetitions: The number of repetitions.
REPLACE(original_value, from_value, to_value)
Replaces all instances of from_value within original_value with to_value.
- original_value: The string to search in.
- from_value: The string to search for.
- to_value: The string to replace instances of from_value.
REVERSE(str)
Returns the reverse of the input STRING or BYTES.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to reverse.
RIGHT(str, numeric_expression)
Returns the rightmost numeric_expr characters of str. If the number is longer than the
string, it will return the whole string. Example: RIGHT('kirkland', 4) returns land.
Note: this function is only available when UseLegacySQL=True.
- str: The string to perform the RIGHT operation on.
- numeric_expression: The number of characters to return.
RPAD(str1, numeric_expression, str2)
Pads str1 on the right with str2, repeating str2 until the result string is exactly numeric_
expr characters. Example: RPAD('1', 7, '?') returns 1??????.
- str1: The string to pad.
- numeric_expression: The length of the result string, in characters.
- str2: The pad characters.
RTRIM(str1 [, str2])
Removes trailing characters from the right side of str1. If str2 is omitted, RTRIM removes
trailing spaces from str1. Otherwise, RTRIM removes any characters in str2 from the right
side of str1 (case-sensitive).
- str1: The string to trim.
- str2: The characters to trim from str1.
SPLIT(str [, delimiter])
Splits a string into repeated substrings. If delimiter is specified, the SPLIT function breaks
str into substrings, using delimiter as the delimiter.
- str: The string to split.
- delimiter: The delimiter to split the string on. Default delimiter is a comma (,).
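For example, a minimal sketch:
SELECT SPLIT('a,b,c', ',')
This returns the substrings a, b, and c.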
STARTS_WITH(str1, str2)
Takes two values. Returns TRUE if the second value is a prefix of the first.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to search in.
- str2: The string to search for.
STRPOS(str1, str2)
Returns the 1-based index of the first occurrence of str2 inside str1. Returns 0 if str2 is not
found.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to search in.
- str2: The string to search for.
SUBSTR(str, index [, max_len])
Returns a substring of str, starting at index. If the optional max_len parameter is used, the
returned string is a maximum of max_len characters long. Counting starts at 1, so the first
character in the string is in position 1 (not zero). If index is 5, the substring begins with the
5th character from the left in str. If index is -4, the substring begins with the 4th character
from the right in str. Example: SUBSTR('awesome', -4, 4) returns the substring some.
- str: The original string.
- index: The starting index.
- max_len: The maximum length of the return string.
TO_BASE64(string_expr)
Converts a sequence of BYTES into a base64-encoded STRING. To convert a base64-
encoded STRING into BYTES, use FROM_BASE64.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert to base64 encoding.
TO_HEX(string_expr)
Converts a sequence of BYTES into a hexadecimal STRING. Each byte is converted to two
hexadecimal characters in the range (0..9, a..f). To convert a hexadecimal-encoded STRING
to BYTES, use FROM_HEX.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert to hexadecimal encoding.
TRIM(str1 [, str2])
Removes all leading and trailing characters that match str2. If str2 is not specified, all
leading and trailing whitespace characters (as defined by the Unicode standard) are
removed. If the first argument is of type BYTES, the second argument is required. If str2
contains more than one character or byte, the function removes all leading or trailing
characters or bytes contained in str2.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to trim.
- str2: The optional string characters to trim from str1.
UPPER(str)
Returns the original string with all characters in upper case.
l
str: The string to upper.
JSON_EXTRACT(json, json_path)
Selects a value in json according to the JSONPath expression json_path. json_path must be
a string constant. Returns the value in JSON string format.
- json: The JSON to select a value from.
- json_path: The JSON path of the value contained in json.
JSON_EXTRACT_SCALAR(json, json_path)
Selects a value in json according to the JSONPath expression json_path. json_path must be
a string constant, and bracket notation is not supported. Returns a scalar JSON value.
- json: The JSON to select a value from.
- json_path: The JSON path of the value contained in json.
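For example, a minimal sketch using a JSON literal:
SELECT JSON_EXTRACT_SCALAR('{"name": "Alice"}', '$.name')
This returns the scalar string Alice.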
REGEXP_CONTAINS(str, reg_exp)
Returns TRUE if str is a partial match for the regular expression reg_exp. You can search
for a full match by using ^ (beginning of text) and $ (end of text). If the reg_exp argument is
invalid, the function returns an error.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to match in the regular expression.
- reg_exp: The regular expression to match.
REGEXP_EXTRACT(str, reg_exp)
Returns the portion of str that matches the capturing group within the regular expression.
- str: The string to match in the regular expression.
- reg_exp: The regular expression to match.
REGEXP_EXTRACT_ALL(str, reg_exp)
Returns an array of all substrings of str that match the regular expression reg_exp. The
REGEXP_EXTRACT_ALL function only returns non-overlapping matches. For example, using
this function to extract ana from banana returns only one substring, not two.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to match in the regular expression.
- reg_exp: The regular expression to match.
REGEXP_REPLACE(orig_str, reg_exp, replace_str)
Returns a string where any substring of orig_str that matches reg_exp is replaced with
replace_str. For example, REGEXP_REPLACE ('Hello', 'lo', 'p') returns Help.
- orig_str: The original string to match in the regular expression.
- reg_exp: The regular expression to match.
- replace_str: The replacement for the matched orig_str in the regular expression.
FORMAT_IP(integer_value)
Converts the 32 least significant bits of integer_value to a human-readable IPv4 address string.
Note: this function is only available when UseLegacySQL=True.
- integer_value: The integer value to convert to an IPv4 address.
PARSE_IP(readable_ip)
Converts a string representing an IPv4 address to an unsigned integer value. For example,
PARSE_IP('0.0.0.1') returns 1. If the string is not a valid IPv4 address, PARSE_IP returns
NULL.
Note: this function is only available when UseLegacySQL=True.
- readable_ip: The IPv4 address to convert to an integer.
NET.IPV4_FROM_INT64(integer_value)
Converts an IPv4 address from integer format to binary (BYTES) format in network byte
order. In the integer input, the least significant bit of the IP address is stored in the least
significant bit of the integer, regardless of host or client architecture. For example, 1 means
0.0.0.1, and 0x1FF means 0.0.1.255.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- integer_value: The integer value to convert to an IPv4 address.
NET.IPV4_TO_INT64(readable_ip)
Converts an IPv4 address from binary (BYTES) format in network byte order to integer
format. In the integer output, the least significant bit of the IP address is stored in the least
significant bit of the integer, regardless of host or client architecture. For example, 1 means
0.0.0.1, and 0x1FF means 0.0.1.255. The output is in the range [0, 0xFFFFFFFF]. If the input
length is not 4, this function throws an error. This function does not support IPv6.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- readable_ip: The IPv4 address to convert to an integer.
FARM_FINGERPRINT(expression)
Computes the fingerprint of the STRING or BYTES input using the Fingerprint64 function
from the open-source FarmHash library. The output of this function for a particular input
will never change.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to use to compute the fingerprint.
MD5(expression)
Computes the hash of the input using the MD5 algorithm. The input can either be STRING
or BYTES. The string version treats the input as an array of bytes. This function returns 16
bytes.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to use to compute the hash.
SHA1(expression)
Computes the hash of the input using the SHA-1 algorithm. The input can either be STRING
or BYTES. The string version treats the input as an array of bytes. This function returns 20
bytes.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to use to compute the hash.
SHA256(expression)
Computes the hash of the input using the SHA-256 algorithm. The input can either be
STRING or BYTES. The string version treats the input as an array of bytes. This function
returns 32 bytes.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to use to compute the hash.
SHA512(expression)
Computes the hash of the input using the SHA-512 algorithm. The input can either be
STRING or BYTES. The string version treats the input as an array of bytes. This function
returns 64 bytes.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: The expression to use to compute the hash.
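Since the hash functions return BYTES, the result is typically rendered as text with TO_HEX (a minimal sketch; the input literal is illustrative):
SELECT TO_HEX(MD5('hello')) -- 32 hexadecimal characters
SELECT TO_HEX(SHA256('hello')) -- 64 hexadecimal characters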
TIMESTAMP(datetime_expression[, timezone])
Converts a date, datetime, or string to a TIMESTAMP data type.
Note: this function does not support the timezone parameter and requires datetime_
expression to be a string when using Legacy SQL (UseLegacySQL=True).
- datetime_expression: The expression to be converted to a timestamp.
- timezone: The timezone to be used. If no timezone is specified, the default timezone, UTC, is used.
JSON_EXTRACT(json, jsonpath)
Selects any value in a JSON array or object. The path to the array is specified in the
jsonpath argument. Return value is numeric or null.
- json: The JSON document to extract from.
- jsonpath: The JSONPath used to select the nodes. The JSONPath must be a string constant. The values of the selected nodes are returned in a token-separated list.
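For example (a minimal sketch; the JSON literal is illustrative):
SELECT JSON_EXTRACT('{"a": {"b": 1}}', '$.a.b') -- returns 1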
Predicate Functions
REGEXP_MATCH(str, reg_exp)
Returns true if str matches the regular expression. For string matching without regular
expressions, use CONTAINS instead of REGEXP_MATCH.
Note: this function is only available when UseLegacySQL=True.
- str: The string to match in the regular expression.
- reg_exp: The regular expression to match.
CAST(expression AS type)
Cast is used in a query to indicate that the result type of an expression should be
converted to some other type.
- expression: The expression to cast.
- type: The type to cast the expression to.
SAFE_CAST(expression AS type)
Cast is used in a query to indicate that the result type of an expression should be
converted to some other type. SAFE_CAST is identical to CAST, except it returns NULL
instead of raising an error.
- expression: The expression to cast.
- type: The type to cast the expression to.
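For example, CAST raises an error on invalid input while SAFE_CAST returns NULL (a minimal sketch):
SELECT CAST('123' AS INT64) -- returns 123
SELECT SAFE_CAST('apple' AS INT64) -- returns NULL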
CURRENT_DATE()
Returns a human-readable string of the current date in the format %Y-%m-%d.
DATE(timestamp [, timezone])
Converts a timestamp_expression to a DATE data type.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp from which to return the date.
- timezone: The timezone to use when converting the timestamp. If not specified, the default timezone, UTC, is used.
DATEDIFF(timestamp1, timestamp2)
Returns the number of days between two TIMESTAMP data types. The result is positive if
the first TIMESTAMP data type comes after the second TIMESTAMP data type, and
otherwise the result is negative.
Note: this function is only available when UseLegacySQL=True.
- timestamp1: The first timestamp.
- timestamp2: The second timestamp.
DATE_DIFF(date1, date2, date_part)
Computes the number of specified date_part differences between two date expressions.
This can be thought of as the number of date_part boundaries crossed between the two
dates. If the first date occurs before the second date, then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- date1: The first date.
- date2: The second date.
- date_part: The date part. Supported values are: DAY, MONTH, QUARTER, YEAR.
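For example (a minimal sketch; the date literals are illustrative):
SELECT DATE_DIFF(DATE '2023-10-09', DATE '2023-10-02', DAY) -- returns 7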
DATE_TRUNC(date, date_part)
Truncates the date to the specified granularity.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- date: The date to truncate.
- date_part: The date part. Supported values are: DAY, WEEK, ISOWEEK, MONTH, QUARTER, YEAR, ISOYEAR.
FORMAT_DATE(format_string, date_expr)
Formats the date_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the date_expr.
- date_expr: The date to format.
PARSE_DATE(format_string, date_string)
Uses a format_string and a string representation of a date to return a DATE object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the date_string.
- date_string: The date string to parse.
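For example, FORMAT_DATE and PARSE_DATE are inverse operations (a minimal sketch; the literals are illustrative):
SELECT PARSE_DATE('%Y-%m-%d', '2023-10-01') -- returns DATE '2023-10-01'
SELECT FORMAT_DATE('%Y/%m/%d', DATE '2023-10-01') -- returns '2023/10/01'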
CURRENT_DATETIME([timezone])
Returns the current time as a DATETIME object.
- timezone: The timezone to use when retrieving the current datetime object.
DATETIME(timestamp [, timezone])
Constructs a DATETIME object using a TIMESTAMP object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp from which to return the datetime.
- timezone: The timezone to use when converting the timestamp. If not specified, the default timezone, UTC, is used.
DATETIME_DIFF(datetime1, datetime2, date_part)
Computes the number of specified date_part differences between two datetime expressions.
This can be thought of as the number of date_part boundaries crossed between the two
datetimes. If the first datetime occurs before the second datetime, then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- datetime1: The first datetime.
- datetime2: The second datetime.
- date_part: The date part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, MONTH, QUARTER, and YEAR.
DATETIME_TRUNC(datetime, part)
Truncates the datetime to the specified granularity.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- datetime: The datetime to truncate.
- part: The date part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR, DAY, WEEK, ISOWEEK, MONTH, QUARTER, YEAR, and ISOYEAR.
FORMAT_DATETIME(format_string, datetime_expr)
Formats the datetime_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the datetime_expr.
- datetime_expr: The datetime to format.
PARSE_DATETIME(format_string, datetime_string)
Uses a format_string and a string representation of a datetime to return a DATETIME object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the datetime_string.
- datetime_string: The datetime string to parse.
CURRENT_TIME()
Returns a human-readable string of the server's current time in the format %H:%M:%S.
TIME(timestamp [, timezone])
Constructs a TIME object using a TIMESTAMP object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp from which to return the time.
- timezone: The timezone to use when converting the timestamp. If not specified, the default timezone, UTC, is used.
TIME_DIFF(time1, time2, time_part)
Computes the number of specified time_part differences between two time expressions.
This can be thought of as the number of time_part boundaries crossed between the two
times. If the first time occurs before the second time, then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- time1: The first time.
- time2: The second time.
- time_part: The time part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR.
TIME_TRUNC(time, part)
Truncates the time to the specified granularity.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- time: The time to truncate.
- part: The time part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR.
FORMAT_TIME(format_string, time_expr)
Formats the time_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the time_expr.
- time_expr: The time to format.
PARSE_TIME(format_string, time_string)
Uses a format_string and a string representation of a time to return a TIME object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the time_string.
- time_string: The time string to parse.
CURRENT_TIMESTAMP()
Returns a TIMESTAMP data type of the server's current time in the format %Y-%m-%d
%H:%M:%S.
TIMESTAMP_DIFF(timestamp1, timestamp2, time_part)
Computes the number of specified time_part differences between two timestamp
expressions. This can be thought of as the number of time_part boundaries crossed
between the two timestamps. If the first timestamp occurs before the second timestamp,
then the result is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp1: The first timestamp.
- timestamp2: The second timestamp.
- time_part: The timestamp part. Possible values include: MICROSECOND, MILLISECOND, SECOND, MINUTE, HOUR.
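For example (a minimal sketch; the timestamp literals are illustrative):
SELECT TIMESTAMP_DIFF(TIMESTAMP '2023-10-01 12:00:00', TIMESTAMP '2023-10-01 10:30:00', MINUTE) -- returns 90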
FORMAT_TIMESTAMP(format_string, timestamp_expr)
Formats the timestamp_expr according to the specified format_string.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to format the timestamp_expr.
- timestamp_expr: The timestamp to format.
PARSE_TIMESTAMP(format_string, timestamp_string)
Uses a format_string and a string representation of a timestamp to return a TIMESTAMP
object.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- format_string: The format string used to parse the timestamp_string.
- timestamp_string: The timestamp string to parse.
DAY(timestamp)
Returns the day of the month of a TIMESTAMP data type as an integer between 1 and 31,
inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the day of the month.
DAYOFWEEK(timestamp)
Returns the day of the week of a TIMESTAMP data type as an integer between 1 (Sunday)
and 7 (Saturday), inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the day of the week.
DAYOFYEAR(timestamp)
Returns the day of the year of a TIMESTAMP data type as an integer between 1 and 366,
inclusively. The integer 1 refers to January 1.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the day of the year.
FORMAT_UTC_USEC(unix_timestamp)
Returns a human-readable string representation of a UNIX timestamp in the format YYYY-
MM-DD HH:MM:SS.uuuuuu.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to format.
HOUR(timestamp)
Returns the hour of a TIMESTAMP data type as an integer between 0 and 23, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the hour as an integer.
MINUTE(timestamp)
Returns the minutes of a TIMESTAMP data type as an integer between 0 and 59, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the minutes as an integer.
MONTH(timestamp)
Returns the month of a TIMESTAMP data type as an integer between 1 and 12, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the month as an integer.
MSEC_TO_TIMESTAMP(unix_timestamp)
Converts a UNIX timestamp in milliseconds to a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
PARSE_UTC_USEC(date_string)
Converts a date string to a UNIX timestamp in microseconds. date_string must have the
format YYYY-MM-DD HH:MM:SS[.uuuuuu]. The fractional part of the second can be up to 6
digits long or can be omitted.
Note: this function is only available when UseLegacySQL=True.
- date_string: The date string to convert.
QUARTER(timestamp)
Returns the quarter of the year of a TIMESTAMP data type as an integer between 1 and 4,
inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the quarter as an integer.
SEC_TO_TIMESTAMP(unix_timestamp)
Converts a UNIX timestamp in seconds to a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
SECOND(timestamp)
Returns the seconds of a TIMESTAMP data type as an integer between 0 and 59, inclusively.
During a leap second, the integer range is between 0 and 60, inclusively.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the second as an integer.
STRFTIME_UTC_USEC(unix_timestamp, date_format_str)
Returns a human-readable date string in the format date_format_str. date_format_str can
include date-related punctuation characters (such as / and -) and special characters
accepted by the strftime function in C++ (such as %d for day of month).
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
- date_format_str: The date format string.
TIMESTAMP_SECONDS(unix_timestamp)
Interprets unix_timestamp as the number of seconds since 1970-01-01 00:00:00 UTC.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- unix_timestamp: The unix timestamp to convert.
TIMESTAMP_MILLIS(unix_timestamp)
Interprets unix_timestamp as the number of milliseconds since 1970-01-01 00:00:00 UTC.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- unix_timestamp: The unix timestamp to convert.
TIMESTAMP_MICROS(unix_timestamp)
Interprets unix_timestamp as the number of microseconds since 1970-01-01 00:00:00 UTC.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- unix_timestamp: The unix timestamp to convert.
TIMESTAMP_TO_MSEC(timestamp)
Converts a TIMESTAMP data type to a UNIX timestamp in milliseconds.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp to convert.
TIMESTAMP_TO_SEC(timestamp)
Converts a TIMESTAMP data type to a UNIX timestamp in seconds.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp to convert.
TIMESTAMP_TO_USEC(timestamp)
Converts a TIMESTAMP data type to a UNIX timestamp in microseconds.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp to convert.
UNIX_DATE(date_string)
Returns the number of days since 1970-01-01.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- date_string: The date string to convert.
UNIX_SECONDS(timestamp)
Returns the number of seconds since 1970-01-01 00:00:00 UTC. Truncates higher levels of
precision.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp to convert.
UNIX_MILLIS(timestamp)
Returns the number of milliseconds since 1970-01-01 00:00:00 UTC. Truncates higher levels
of precision.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp to convert.
UNIX_MICROS(timestamp)
Returns the number of microseconds since 1970-01-01 00:00:00 UTC. Truncates higher
levels of precision.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- timestamp: The timestamp to convert.
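The UNIX_* functions invert the TIMESTAMP_SECONDS, TIMESTAMP_MILLIS, and TIMESTAMP_MICROS conversions above (a minimal sketch):
SELECT UNIX_SECONDS(TIMESTAMP '1970-01-01 00:01:00 UTC') -- returns 60
SELECT TIMESTAMP_SECONDS(60) -- returns TIMESTAMP '1970-01-01 00:01:00 UTC'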
USEC_TO_TIMESTAMP(unix_timestamp)
Converts a UNIX timestamp in microseconds to a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
UTC_USEC_TO_DAY(unix_timestamp)
Shifts a UNIX timestamp in microseconds to the beginning of the day it occurs in.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
UTC_USEC_TO_HOUR(unix_timestamp)
Shifts a UNIX timestamp in microseconds to the beginning of the hour it occurs in.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
UTC_USEC_TO_MONTH(unix_timestamp)
Shifts a UNIX timestamp in microseconds to the beginning of the month it occurs in.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
UTC_USEC_TO_WEEK(unix_timestamp, day_of_week)
Returns a UNIX timestamp in microseconds that represents a day in the week of the unix_
timestamp argument.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to shift.
- day_of_week: A day of the week from 0 (Sunday) to 6 (Saturday).
UTC_USEC_TO_YEAR(unix_timestamp)
Returns a UNIX timestamp in microseconds that represents the year of the unix_timestamp
argument.
Note: this function is only available when UseLegacySQL=True.
- unix_timestamp: The unix timestamp to convert.
WEEK(timestamp)
Returns the week of a TIMESTAMP data type as an integer between 1 and 53, inclusively.
Weeks begin on Sunday, so if January 1 is on a day other than Sunday, week 1 has fewer
than 7 days and the first Sunday of the year is the first day of week 2.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the week as an integer.
YEAR(timestamp)
Returns the year of a TIMESTAMP data type.
Note: this function is only available when UseLegacySQL=True.
- timestamp: The timestamp from which to return the year as an integer.
ABS(expression)
Returns the absolute value of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
ACOS(expression)
Returns the arc cosine of the argument.
- expression: Any column or literal expression.
ACOSH(expression)
Returns the arc hyperbolic cosine of the argument.
- expression: Any column or literal expression.
ASIN(expression)
Returns arcsine in radians.
- expression: Any column or literal expression.
ASINH(expression)
Returns the arc hyperbolic sine of the argument.
- expression: Any column or literal expression.
ATAN(expression)
Returns arc tangent of the argument.
- expression: Any column or literal expression.
ATANH(expression)
Returns the arc hyperbolic tangent of the argument.
- expression: Any column or literal expression.
ATAN2(expression1, expression2)
Returns the arc tangent of the two arguments.
- expression1: Any column or literal expression.
- expression2: Any column or literal expression.
CEIL(expression)
Rounds the argument up to the nearest whole number and returns the rounded value.
- expression: Any column or literal expression.
CEILING(expression)
Synonym for CEIL function.
- expression: Any column or literal expression.
COS(expression)
Returns the cosine of the argument.
- expression: Any column or literal expression.
COSH(expression)
Returns the hyperbolic cosine of the argument.
- expression: Any column or literal expression.
DEGREES(expression)
Returns expression, converted from radians to degrees.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
EXP(expression)
Returns the result of raising the constant "e" - the base of the natural logarithm - to the
power of expression.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
FLOOR(expression)
Rounds the argument down to the nearest whole number and returns the rounded value.
- expression: Any column or literal expression.
LN(expression)
Returns the natural logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
LOG(expression)
Returns the natural logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
LOG2(expression)
Returns the Base-2 logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
LOG10(expression)
Returns the Base-10 logarithm of the argument.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
PI()
Returns PI.
Note: this function is only available when UseLegacySQL=True.
POW(expression1, expression2)
Returns the result of raising expression1 to the power of expression2.
- expression1: Any column or literal expression.
- expression2: Any column or literal expression.
POWER(expression1, expression2)
Synonym of POW function.
- expression1: Any column or literal expression.
- expression2: Any column or literal expression.
RADIANS(expression)
Returns expression, converted from degrees to radians.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
RAND([expression])
Returns a random float value in the range 0.0 <= value < 1.0. Each int32_seed value always
generates the same sequence of random numbers within a given query, as long as you
don't use a LIMIT clause. If int32_seed is not specified, BigQuery uses the current
timestamp as the seed value.
Note: this function is only available when UseLegacySQL=True.
- expression: An optional int32 seed value.
ROUND(expression [, integer_digits])
Rounds the argument either up or down to the nearest whole number (or if specified, to
the specified number of digits) and returns the rounded value.
- expression: Any column or literal expression.
- integer_digits: The number of digits to round to.
GREATEST(value1[, value2 [, ...]])
Returns NULL if any of the inputs is NULL. Otherwise, returns NaN if any of the inputs is
NaN. Otherwise, returns the largest value among X1,...,XN according to the < comparison.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value1: The first value to compare.
- value2: The second value to compare.
LEAST(value1[, value2 [, ...]])
Returns NULL if any of the inputs is NULL. Otherwise, returns NaN if any of the inputs is
NaN. Otherwise, returns the smallest value among X1,...,XN according to the > comparison.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- value1: The first value to compare.
- value2: The second value to compare.
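For example (a minimal sketch):
SELECT GREATEST(3, 7, 5) -- returns 7
SELECT LEAST(3, NULL, 5) -- returns NULL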
SIN(expression)
Returns the sine of the argument.
- expression: Any column or literal expression.
SINH(expression)
Returns the hyperbolic sine of the argument.
- expression: Any column or literal expression.
SQRT(expression)
Returns the square root of the expression.
Note: this function is only available when UseLegacySQL=True.
- expression: Any column or literal expression.
TAN(expression)
Returns the tangent of the argument.
- expression: Any column or literal expression.
TANH(expression)
Returns the hyperbolic tangent of the argument.
- expression: Any column or literal expression.
TRUNC(expression [, integer_digits])
Rounds the expression to the nearest integer whose absolute value is not greater than the
absolute value of the expression (that is, it rounds towards zero). When the
integer_digits parameter is specified, this function is similar to ROUND(X, N) but always
rounds towards zero. Unlike ROUND(X, N), it never overflows.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- expression: Any column or literal expression.
- integer_digits: The number of digits to round to.
BYTE_LENGTH(str)
Returns the length of the value in bytes, regardless of whether the type of the value is
STRING or BYTES.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to calculate the length on.
CHAR_LENGTH(str)
Returns the length of the STRING in characters.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to calculate the length on.
CONCAT(str1, str2 [, str3] [, ...])
Returns the concatenation of two or more strings, or NULL if any of the values are NULL.
- str1: The first string to concatenate.
- str2: The second string to concatenate.
- str3: The third string to concatenate.
ENDS_WITH(str1, str2)
Takes two values. Returns TRUE if the second value is a suffix of the first.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to search in.
- str2: The string to search for.
FROM_BASE64(string_expr)
Converts the base64-encoded input string_expr into BYTES format. To convert BYTES to a
base64-encoded STRING, use TO_BASE64.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert from base64 encoding.
FROM_HEX(string_expr)
Converts a hexadecimal-encoded STRING into BYTES format. Returns an error if the input
STRING contains characters outside the range (0..9, A..F, a..f). The lettercase of the
characters does not matter. To convert BYTES to a hexadecimal-encoded STRING, use TO_
HEX.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert from hexadecimal encoding.
INSTR(str1, str2)
Returns the one-based index of the first occurrence of str2 in str1, or returns 0 if str2 does
not occur in str1.
Note: this function is only available when UseLegacySQL=True.
- str1: The string to search in.
- str2: The string to search for.
LEFT(str, numeric_expression)
Returns the leftmost numeric_expr characters of str. If the number is longer than str, the
full string will be returned. Example: LEFT('seattle', 3) returns sea.
Note: this function is only available when UseLegacySQL=True.
- str: The string to perform the LEFT operation on.
- numeric_expression: The number of characters to return.
LENGTH(str)
Returns a numerical value for the length of the string. Example: if str is '123456', LENGTH
returns 6.
- str: The string to calculate the length on.
LOWER(str)
Returns the original string with all characters in lower case.
- str: The string to convert to lower case.
LPAD(str1, numeric_expression[, str2])
Pads str1 on the left with str2, repeating str2 until the result string is exactly numeric_expr
characters. Example: LPAD('1', 7, '?') returns ??????1.
- str1: The string to pad.
- numeric_expression: The total length, in characters, of the padded result.
- str2: The pad characters.
LTRIM(str1 [, str2])
Removes characters from the left side of str1. If str2 is omitted, LTRIM removes spaces from
the left side of str1. Otherwise, LTRIM removes any characters in str2 from the left side of
str1 (case-sensitive).
- str1: The string to trim.
- str2: The characters to trim from str1.
REPEAT(str, repetitions)
Returns a value that consists of str repeated. The repetitions parameter specifies the
number of times to repeat str. Returns NULL if either str or repetitions is NULL. This
function returns an error if the repetitions value is negative.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to repeat.
- repetitions: The number of repetitions.
REPLACE(original_value, from_value, to_value)
Replaces all instances of from_value within original_value with to_value.
- original_value: The string to search in.
- from_value: The string to search for.
- to_value: The string to replace instances of from_value.
REVERSE(str)
Returns the reverse of the input STRING or BYTES.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str: The string to reverse.
RIGHT(str, numeric_expression)
Returns the rightmost numeric_expr characters of str. If the number is longer than the
string, it will return the whole string. Example: RIGHT('kirkland', 4) returns land.
Note: this function is only available when UseLegacySQL=True.
- str: The string to perform the RIGHT operation on.
- numeric_expression: The number of characters to return.
RPAD(str1, numeric_expression, str2)
Pads str1 on the right with str2, repeating str2 until the result string is exactly numeric_
expr characters. Example: RPAD('1', 7, '?') returns 1??????.
- str1: The string to pad.
- numeric_expression: The total length, in characters, of the padded result.
- str2: The pad characters.
RTRIM(str1 [, str2])
Removes trailing characters from the right side of str1. If str2 is omitted, RTRIM removes
trailing spaces from str1. Otherwise, RTRIM removes any characters in str2 from the right
side of str1 (case-sensitive).
- str1: The string to trim.
- str2: The characters to trim from str1.
SPLIT(str [, delimiter])
Splits a string into repeated substrings. If delimiter is specified, the SPLIT function breaks
str into substrings, using delimiter as the delimiter.
- str: The string to split.
- delimiter: The delimiter to split the string on. Default delimiter is a comma (,).
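For example (a minimal sketch):
SELECT SPLIT('a,b,c') -- returns ['a', 'b', 'c']
SELECT SPLIT('a|b|c', '|') -- returns ['a', 'b', 'c']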
STARTS_WITH(str1, str2)
Takes two values. Returns TRUE if the second value is a prefix of the first.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to search in.
- str2: The string to search for.
STRPOS(str1, str2)
Returns the 1-based index of the first occurrence of str2 inside str1. Returns 0 if str2
is not found.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to search in.
- str2: The string to search for.
SUBSTR(str, index [, max_len])
Returns a substring of str, starting at index. If the optional max_len parameter is used, the
returned string is a maximum of max_len characters long. Counting starts at 1, so the first
character in the string is in position 1 (not zero). If index is 5, the substring begins with the
5th character from the left in str. If index is -4, the substring begins with the 4th character
from the right in str. Example: SUBSTR('awesome', -4, 4) returns the substring some.
- str: The original string.
- index: The starting index.
- max_len: The maximum length of the return string.
TO_BASE64(string_expr)
Converts a sequence of BYTES into a base64-encoded STRING. To convert a base64-
encoded STRING into BYTES, use FROM_BASE64.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert to base64 encoding.
TO_HEX(string_expr)
Converts a sequence of BYTES into a hexadecimal STRING. Converts each byte in the
STRING as two hexadecimal characters in the range (0..9, a..f). To convert a hexadecimal-
encoded STRING to BYTES, use FROM_HEX.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- string_expr: The string to convert to hexadecimal encoding.
TRIM(str1 [, str2])
Removes all leading and trailing characters that match str2. If str2 is not specified, all
leading and trailing whitespace characters (as defined by the Unicode standard) are
removed. If the first argument is of type BYTES, the second argument is required. If str2
contains more than one character or byte, the function removes all leading or trailing
characters or bytes contained in str2.
Note: this function is only available when using Standard SQL (UseLegacySQL=False).
- str1: The string to trim.
- str2: The optional string characters to trim from str1.
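For example (a minimal sketch):
SELECT TRIM('  banana  ') -- returns 'banana'
SELECT TRIM('xxbananaxx', 'x') -- returns 'banana'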
UPPER(str)
Returns the original string with all characters in upper case.
- str: The string to convert to upper case.
SELECT INTO Statements
You can use the SELECT INTO statement to export formatted data to a file.
Data Export with an SQL Query
The following query exports data into a file formatted in comma-separated values (CSV):
Statement stat = conn.createStatement();
boolean ret = stat.execute("SELECT actor.attributes.email, repository.name INTO \"csv://c:/[publicdata].[samples].github_nested.txt\" FROM \"[publicdata].[samples].github_nested\" WHERE repository.name = 'EntityFramework'");
System.out.println(stat.getUpdateCount() + " rows affected");
You can specify other file formats in the URI. The following example exports tab-separated
values:
Statement stat = conn.createStatement();
boolean ret = stat.execute("SELECT * INTO \"[publicdata].[samples].github_nested\" IN 'csv://filename=c:/[publicdata].[samples].github_nested.csv;delimiter=tab' FROM \"[publicdata].[samples].github_nested\" WHERE repository.name = 'EntityFramework'");
System.out.println(stat.getUpdateCount() + " rows affected");
INSERT Statements
To create new records, use INSERT statements.
INSERT Syntax
The INSERT statement specifies the columns to be inserted and the new column values.
You can specify the column values in a comma-separated list in the VALUES clause, as
shown in the following example:
INSERT INTO <table_name>
( <column_reference> [ , ... ] )
VALUES
( { <expression> | NULL } [ , ... ] )
<expression> ::=
| @ <parameter>
| ?
| <literal>
You can use the executeUpdate method of the Statement and PreparedStatement classes
to execute data manipulation commands and retrieve the number of affected rows. To
retrieve the Id of the last inserted record, use getGeneratedKeys and set the
RETURN_GENERATED_KEYS flag of the Statement class when you call prepareStatement.
String cmd = "INSERT INTO [publicdata].[samples].github_nested
(repository.name) VALUES (?)";
PreparedStatement pstmt = connection.prepareStatement
(cmd,Statement.RETURN_GENERATED_KEYS);
pstmt.setString(1, "CoreCLR");
int count = pstmt.executeUpdate();
System.out.println(count+" rows were affected");
ResultSet rs = pstmt.getGeneratedKeys();
while(rs.next()){
System.out.println(rs.getString("Id"));
}
connection.close();
UPDATE Statements
To modify existing records, use UPDATE statements.
Update Syntax
The UPDATE statement takes as input a comma-separated list of columns and new column
values as name-value pairs in the SET clause, as shown in the following example:
UPDATE <table_name> SET { <column_reference> = <expression> } [ , ... ]
WHERE { Id = <expression> } [ { AND | OR } ... ]
<expression> ::=
| @ <parameter>
| ?
| <literal>
You can use the executeUpdate method of the Statement or PreparedStatement classes to
execute data manipulation commands and retrieve the rows affected, as shown in the
following example:
String cmd = "UPDATE [publicdata].[samples].github_nested SET
repository.name='CoreCLR' WHERE Id = ?";
PreparedStatement pstmt = connection.prepareStatement(cmd);
pstmt.setString(1, "1");
int count = pstmt.executeUpdate();
System.out.println(count + " rows were affected");
connection.close();
DELETE Statements
To delete information from a table, use DELETE statements.
DELETE Syntax
The DELETE statement requires the table name in the FROM clause and the row's primary
key in the WHERE clause, as shown in the following example:
<delete_statement> ::= DELETE FROM <table_name> WHERE { Id =
<expression> } [ { AND | OR } ... ]
<expression> ::=
| @ <parameter>
| ?
| <literal>
You can use the executeUpdate method of the Statement or PreparedStatement classes to
execute data manipulation commands and retrieve the number of affected rows, as shown
in the following example:
Connection connection = DriverManager.getConnection("jdbc:googlebigquery:InitiateOAuth=GETANDREFRESH;ProjectId=NameOfProject;DatasetId=NameOfDataset;");
String cmd = "DELETE FROM [publicdata].[samples].github_nested WHERE Id = ?";
PreparedStatement pstmt = connection.prepareStatement(cmd);
pstmt.setString(1, "1");
int count = pstmt.executeUpdate();
connection.close();
EXECUTE Statements
To execute stored procedures, you can use EXECUTE or EXEC statements.
EXEC and EXECUTE assign stored procedure inputs, referenced by name, to values or
parameter names.
Stored Procedure Syntax
To execute a stored procedure as an SQL statement, use the following syntax:
{ EXECUTE | EXEC } <stored_proc_name>
{
[ @ ] <input_name> = <expression>
} [ , ... ]
<expression> ::=
| @ <parameter>
| ?
| <literal>
Example Statements
Reference stored procedure inputs by name:
EXECUTE my_proc @second = 2, @first = 1, @third = 3;
Execute a parameterized stored procedure statement:
EXECUTE my_proc second = @p1, first = @p2, third = @p3;
PIVOT and UNPIVOT
PIVOT and UNPIVOT can be used to change a table-valued expression into another table.
PIVOT
PIVOT rotates a table-valued expression by turning unique values from one column into
multiple columns in the output. PIVOT can run aggregations where required on any column
value.
PIVOT Syntax
SELECT 'AverageCost' AS Cost_Sorted_By_Production_Days, [0], [1], [2], [3], [4]
FROM
(
SELECT DaysToManufacture, StandardCost
FROM Production.Product
) AS SourceTable
PIVOT
(
AVG(StandardCost)
FOR DaysToManufacture IN ([0], [1], [2], [3], [4])
) AS PivotTable;
UNPIVOT
UNPIVOT carries out nearly the opposite of PIVOT by rotating the columns of a table-valued
expression into column values.
UNPIVOT Syntax
SELECT VendorID, Employee, Orders
FROM
(SELECT VendorID, Emp1, Emp2, Emp3, Emp4, Emp5
FROM pvt) p
UNPIVOT
(Orders FOR Employee IN
(Emp1, Emp2, Emp3, Emp4, Emp5)
) AS unpvt;
For further information on PIVOT and UNPIVOT, see FROM clause plus JOIN, APPLY, PIVOT
(Transact-SQL).
Data Model
The Google BigQuery Adapter models the data as defined within Google BigQuery for the
ProjectId and DatasetId configured.
Views
Views are client-side tables that cannot be modified. The adapter uses these to report
metadata about the Google BigQuery projects and datasets it is connected to.
In addition, the adapter supports server-side views defined within Google BigQuery. These
views may be used in SELECT statements the same way as tables. However, view schemas
can easily become out of date and require the adapter to refresh them. Please see
RefreshViewSchemas for more details.
Stored Procedures
Stored Procedures are function-like interfaces to the data source. The adapter uses these
to manage Google BigQuery tables and jobs and to perform OAuth operations.
In addition to the client-side stored procedures offered by the adapter, there is also
support for server-side stored procedures defined in Google BigQuery. The adapter
supports both CALL and EXEC using the procedure's parameter names. Note that the adapter
only supports IN parameters and result set return values.
CALL `psychic-valve-137816`.Northwind.MostPopularProduct()
CALL `psychic-valve-137816`.Northwind.GetStockedValue(24, 0.75)
EXEC `psychic-valve-137816`.Northwind.MostPopularProduct
EXEC `psychic-valve-137816`.Northwind.GetStockedValue productId = 24, discountRate = 0.75
Additional Metadata
Table Descriptions
Google BigQuery supports setting descriptions on tables but the adapter does not report
these by default. ShowTableDescriptions can be used to report table descriptions.
Primary Keys
Google BigQuery does not support primary keys natively, but the adapter allows you to
define them so they can be used in environments that require primary keys to modify data.
Primary keys can be defined using the PrimaryKeyIdentifiers option.
Policy Tags
If policy tags from the Data Catalog service are defined on a table, they can be retrieved
from the system tables using the PolicyTags column:
SELECT ColumnName, PolicyTags FROM sys_tablecolumns
WHERE CatalogName = 'psychic-valve-137816'
AND SchemaName = 'Northwind'
AND TableName = 'Customers'
Tables
Table definitions are dynamically generated based on the table definitions within Google
BigQuery for the Project and Dataset specified in the connection string options.
Views
Views are composed of columns and pseudo columns. Views are similar to tables in the
way that data is represented; however, views do not support updates. Entities that are
represented as views are typically read-only entities. Often, a stored procedure is available
to update the data if such functionality is applicable to the data source.
Queries can be executed against a view as if it were a normal table, and the data returned
is structured in the same way.
Dynamic views, such as queries exposed as views, are supported.
Google BigQuery Adapter Views
Name              Description
Datasets          Lists all the accessible datasets for a given project.
PartitionsList    Lists the partitioning definitions for tables.
PartitionsValues  Lists the partitioning ranges for tables.
Projects          Lists all the projects for the authorized user.
Datasets
Lists all the accessible datasets for a given project.
Columns
Name                        Type    Description
Id [KEY]                    String  The fully qualified, unique, opaque Id of the dataset.
Kind                        String  The resource type.
FriendlyName                String  A descriptive name for the dataset.
DatasetReference_ProjectId  String  A unique reference to the container project.
DatasetReference_DatasetId  String  A unique reference to the dataset, without the project name.
PartitionsList
Lists the partitioning definitions for tables
Columns
Name           Type     Description
Id [KEY]       String   A unique identifier for the partition.
ProjectId      String   The project that the table belongs to.
DatasetId      String   The dataset that the table belongs to.
TableName      String   The name of the table.
ColumnName     String   The name of the column used for partitioning.
ColumnType     String   The type of the partitioning column.
Kind           String   The type of partitioning used by the table. One of DATE, RANGE or INGESTION.
RequireFilter  Boolean  Whether a filter on the partition column is required to query the table.
PartitionsValues
Lists the partitioning ranges for tables
Columns
Name            Type    Description
Id              String  A unique identifier for the partition.
RangeLow        String  The lowest value of the partition column. Either an integer when Kind is RANGE, or a date otherwise.
RangeHigh       String  The highest value of the partition column. Either an integer when Kind is RANGE, or a date otherwise.
RangeInterval   String  The range of values which are included in each partition. Only valid when Kind is RANGE.
DateResolution  String  How much of the date is significant to a TIME or INGESTION partition column. One of DAY, HOUR, MONTH or YEAR.
Projects
Lists all the projects for the authorized user.
Columns
Name                        Type    Description
Id [KEY]                    String  The unique identifier of the project.
Kind                        String  The resource type.
FriendlyName                String  A descriptive name for the project.
NumericId                   String  The numeric Id of the project.
ProjectReference_ProjectId  String  A unique reference to the project.
Stored Procedures
Stored procedures are function-like interfaces that extend the functionality of the adapter
beyond simple SELECT/INSERT/UPDATE/DELETE operations with Google BigQuery.
Stored procedures accept a list of parameters, perform their intended function, and then
return, if applicable, any relevant response data from Google BigQuery, along with an
indication of whether the procedure succeeded or failed.
Google BigQuery Adapter Stored Procedures
Name                      Description
CancelJob                 Cancels a running BigQuery job.
CreateSchema              Creates a schema file for the specified table or view.
DeleteTable               Deletes the specified table from Google BigQuery.
GetJob                    Retrieves the configuration information and execution state for an existing job.
GetOAuthAccessToken       Obtains the OAuth access token to be used for authentication with various Google services.
GetOAuthAuthorizationURL  Obtains the OAuth authorization URL for authentication with various Google services.
InsertJob                 Inserts a Google BigQuery job, which can then be selected later to retrieve the query results.
InsertLoadJob             Inserts a Google BigQuery load job, which adds data from Google Cloud Storage into an existing table.
RefreshOAuthAccessToken   Refreshes the OAuth access token used for authentication with various Google services.
CancelJob
Cancels a running BigQuery job.
Input
Name    Type    Description
JobId   String  The Id of the job you wish to cancel.
Region  String  The region where the job is executing. Not required if the job is in a US or EU region.
Result Set Columns
Name                                            Type    Description
JobId                                           String  The JobId of the cancelled job.
Region                                          String  The region where the job was executing.
Configuration_query_query                       String  The query of the cancelled job.
Configuration_query_destinationTable_tableId    String  The destination table tableId of the cancelled job.
Configuration_query_destinationTable_projectId  String  The destination table projectId of the cancelled job.
Configuration_query_destinationTable_datasetId  String  The destination table datasetId of the cancelled job.
Status_State                                    String  Running state of the job.
Status_errorResult_reason                       String  A short error code that summarizes the error.
Status_errorResult_message                      String  A human-readable description of the error.
CreateSchema
Creates a schema file for the specified table or view.
Creates a local schema file (.rsd) from an existing table or view in the data model.
The schema file is created in the directory set in the Location connection property when
this procedure is executed. You can edit the file to include or exclude columns, rename
columns, or adjust column datatypes.
The adapter checks the Location to determine if the names of any .rsd files match a table
or view in the data model. If there is a duplicate, the schema file will take precedence over
the default instance of this table in the data model. If a schema file is present in Location
that does not match an existing table or view, a new table or view entry is added to the
data model of the adapter.
Input
Name        Type    Accepts Output Streams  Description
TableName   String  False                   The name of the table or view.
FileName    String  False                   The full file path and name of the schema to generate. Ex: 'C:\\Users\\User\\Desktop\\GoogleBigQuery\\table.rsd'
FileStream  String  True                    Stream to write the schema to. Only used if FileName is not provided.
Result Set Columns
Name Type Description
Result String Returns Success or Failure.
FileData String The content of the schema encoded as base64. Only returned if FileName and FileStream are not provided.
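As a sketch, assuming the EXECUTE syntax from the SQL Compliance section, a schema file could be generated for a table (the table and file names are hypothetical):
EXECUTE CreateSchema TableName = 'customers', FileName = 'C:\\Users\\User\\Desktop\\GoogleBigQuery\\customers.rsd'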
DeleteTable
Deletes the specified table from Google BigQuery.
Input
Name Type Description
TableId String TableId of the table you wish to delete. ProjectId and DatasetId can come from connection properties; to override them, use the format projectId:datasetId.TableId.
Result Set Columns
Name Type Description
Success String Returns true if the operation is successful; otherwise an exception is returned.
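For example, using the override format described above (all identifiers hypothetical):
EXECUTE DeleteTable TableId = 'psychic-valve-137816:Northwind.customers'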
GetJob
Retrieves the configuration information and execution state for an existing job.
Input
Name Type Description
JobId String The Id of the job you wish to return.
Region String The region where the job is executing. Not required if the job is in the US or EU region.
Result Set Columns
Name Type Description
JobId String The JobId of the job.
Region String The region where the job is executing.
Configuration_query_query String The query of the job.
Configuration_query_destinationTable_tableId String The destination table tableId of the job.
Configuration_query_destinationTable_projectId String The destination table projectId of the job.
Configuration_query_destinationTable_datasetId String The destination table datasetId of the job.
Status_State String Running state of the job.
Status_errorResult_reason String A short error code that summarizes the error.
Status_errorResult_message String A human-readable description of the error.
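For example, the state of a previously inserted job could be checked like this (the JobId value is hypothetical):
EXECUTE GetJob JobId = 'job_abc123'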
GetOAuthAccessToken
Obtains the OAuth access token to be used for authentication with various Google services.
Input
Name Type Description
AuthMode String The type of authentication mode to use. The allowed values are APP, WEB. The default value is WEB.
Verifier String The verifier code returned by Google after permissions have been granted for the app to connect. WEB AuthMode only.
Scope String The scope of access to Google APIs. By default, access to all APIs used by this data provider will be specified.
CallbackURL String Determines where the response is sent. The value of this parameter must exactly match one of the values registered in the APIs Console (including the http or https schemes, case, and trailing '/').
Prompt String This field indicates the prompt to present the user. It accepts one of the following values: NONE, CONSENT, SELECT_ACCOUNT. With the default of SELECT_ACCOUNT, the user is prompted to select the account to connect to. If set to CONSENT, the user sees a consent page every time, even if they have previously given consent to the application for a given set of scopes. If set to NONE, no authentication or consent screens are displayed to the user. The default value is SELECT_ACCOUNT.
AccessType String Indicates if your application needs to access a Google API when the user is not present at the browser. If your application needs to refresh access tokens when the user is not present at the browser, use OFFLINE. This results in your application obtaining a refresh token the first time it exchanges an authorization code for a user. The allowed values are ONLINE, OFFLINE. The default value is OFFLINE.
State String Indicates any state which may be useful to your application upon receipt of the response. The Google Authorization Server round-trips this parameter, so your application receives the same value it sent. Possible uses include redirecting the user to the correct resource in your site, nonces, and cross-site-request-forgery mitigations.
Result Set Columns
Name Type Description
OAuthAccessToken String The authentication token returned from Google. This can be used in subsequent calls to other operations for this particular service.
OAuthRefreshToken String A token that may be used to obtain a new access token.
ExpiresIn String The remaining lifetime on the access token.
GetOAuthAuthorizationURL
Obtains the OAuth authorization URL for authentication with various Google services.
Input
Name Type Description
Scope String The scope of access to Google APIs. By default, access to all APIs used by this data provider will be specified.
CallbackURL String Determines where the response is sent. The value of this parameter must exactly match one of the values registered in the APIs Console (including the http or https schemes, case, and trailing '/').
Prompt String This field indicates the prompt to present the user. It accepts one of the following values: NONE, CONSENT, SELECT_ACCOUNT. With the default of SELECT_ACCOUNT, the user is prompted to select the account to connect to. If set to CONSENT, the user sees a consent page every time, even if they have previously given consent to the application for a given set of scopes. If set to NONE, no authentication or consent screens are displayed to the user. The default value is SELECT_ACCOUNT.
AccessType String Indicates if your application needs to access a Google API when the user is not present at the browser. If your application needs to refresh access tokens when the user is not present at the browser, use OFFLINE. This results in your application obtaining a refresh token the first time it exchanges an authorization code for a user. The allowed values are ONLINE, OFFLINE. The default value is OFFLINE.
State String Indicates any state which may be useful to your application upon receipt of the response. The Google Authorization Server round-trips this parameter, so your application receives the same value it sent. Possible uses include redirecting the user to the correct resource in your site, nonces, and cross-site-request-forgery mitigations.
Result Set Columns
Name Type Description
URL String The URL to complete user authentication.
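As a sketch of the WEB flow, these two procedures can be chained: first obtain the authorization URL, open it in a browser and grant access, then exchange the verifier code Google returns via GetOAuthAccessToken. The callback URL and verifier value here are hypothetical, and the EXECUTE syntax from the SQL Compliance section is assumed:
EXECUTE GetOAuthAuthorizationURL CallbackURL = 'http://localhost:33333'
EXECUTE GetOAuthAccessToken AuthMode = 'WEB', Verifier = 'MyVerifierCode', CallbackURL = 'http://localhost:33333'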
InsertJob
Inserts a Google BigQuery job, which can then be selected later to retrieve the query
results.
Input
Name Type Description
Query String The query to submit to Google BigQuery.
IsDML String Should be true if the query is a DML statement and false otherwise. The default value is false.
DestinationTable String The destination table for the query, in the format DestProjectId:DestDatasetId.DestTable.
WriteDisposition String How to write data to the destination table, such as truncating existing results, appending to existing results, or writing only when the table is empty. The allowed values are WRITE_TRUNCATE, WRITE_APPEND, WRITE_EMPTY. The default value is WRITE_TRUNCATE.
DryRun String Whether or not this is a dry run of the job.
MaximumBytesBilled String If provided, BigQuery cancels the job if it attempts to process more than this many bytes.
Region String The region to start executing the job in.
Result Set Columns
Name Type Description
JobId String The JobId of the newly inserted job.
Region String The region where the job is executing.
Configuration_query_query String The query of the newly inserted job.
Configuration_query_destinationTable_tableId String The destination table tableId of the newly inserted job.
Configuration_query_destinationTable_projectId String The destination table projectId of the newly inserted job.
Configuration_query_destinationTable_datasetId String The destination table datasetId of the newly inserted job.
Status_State String Running state of the job.
Status_errorResult_reason String A short error code that summarizes the error.
Status_errorResult_message String A human-readable description of the error.
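For example, a query job writing to a destination table could be inserted like this (all identifiers are hypothetical):
EXECUTE InsertJob Query = 'SELECT * FROM `Northwind`.`customers`', DestinationTable = 'psychic-valve-137816:Northwind.customers_copy', WriteDisposition = 'WRITE_TRUNCATE'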
InsertLoadJob
Inserts a Google BigQuery load job, which adds data from Google Cloud Storage into an
existing table.
Input
Name Type Description
SourceURIs String A space-separated list of Google Cloud Storage URIs.
SourceFormat String The source format that the files are formatted in. The allowed values are AVRO, NEWLINE_DELIMITED_JSON, DATASTORE_BACKUP, PARQUET, ORC, CSV.
DestinationTable String The destination table for the query, in the format DestProjectId.DestDatasetId.DestTable.
DestinationTableProperties String A JSON object containing the table friendlyName, description, and list of labels.
DestinationTableSchema String A JSON list containing the fields used to create the table.
DestinationEncryptionConfiguration String A JSON object giving the KMS encryption settings for the table.
SchemaUpdateOptions String A JSON list giving the options to apply when updating the destination table schema.
TimePartitioning String A JSON object giving the time partitioning type and field.
RangePartitioning String A JSON object giving the range partitioning field and buckets.
Clustering String A JSON object giving the fields to be used for clustering.
Autodetect String Whether options and schema should be automatically determined for JSON and CSV files.
CreateDisposition String Whether to create the destination table if it does not exist. The allowed values are CREATE_IF_NEEDED, CREATE_NEVER. The default value is CREATE_IF_NEEDED.
WriteDisposition String How to write data to the destination table, such as truncating existing results, appending to existing results, or writing only when the table is empty. The allowed values are WRITE_TRUNCATE, WRITE_APPEND, WRITE_EMPTY. The default value is WRITE_APPEND.
Region String The region to start executing the job in. Both the GCS resources and the BigQuery dataset must be in the same region.
DryRun String Whether or not this is a dry run of the job. The default value is false.
MaximumBadRecords String If provided, the number of records that can be invalid before the entire job is canceled. By default all records must be valid. The default value is 0.
IgnoreUnknownValues String Whether to ignore unknown fields in the input file or treat them as errors. By default they are treated as errors. The default value is false.
AvroUseLogicalTypes String Whether to use Avro logical types when converting Avro data into BigQuery types. The default value is true.
CSVSkipLeadingRows String How many rows to skip at the start of CSV files. Usually used for skipping header rows.
CSVEncoding String The name of the encoding used for CSV files. The allowed values are ISO-8859-1, UTF-8. The default value is UTF-8.
CSVNullMarker String If provided, this string is used for NULL values within CSV files. By default CSV files cannot use NULL.
CSVFieldDelimiter String The character used to separate columns within CSV files. The default value is ','.
CSVQuote String The character used for quoted fields in CSV files. May be set to empty to disable quoting. The default value is '"'.
CSVAllowQuotedNewlines String Whether CSV files can contain newlines within quoted fields. The default value is false.
CSVAllowJaggedRows String Whether lines in CSV files may contain missing fields. The default value is false.
DSBackupProjectionFields String A JSON list of fields to load from a Cloud Datastore backup.
ParquetOptions String A JSON object giving the Parquet-specific import options.
DecimalTargetTypes String A JSON list giving the preference order applied to numeric types.
HivePartitioningOptions String A JSON object giving the source-side partitioning options.
Result Set Columns
Name Type Description
JobId String The JobId of the newly inserted job.
Region String The region where the job is executing.
Configuration_load_destinationTable_tableId String The destination table tableId of the newly inserted job.
Configuration_load_destinationTable_projectId String The destination table projectId of the newly inserted job.
Configuration_load_destinationTable_datasetId String The destination table datasetId of the newly inserted job.
Status_State String Running state of the job.
Status_errorResult_reason String A short error code that summarizes the error.
Status_errorResult_message String A human-readable description of the error.
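For example, a CSV file staged in Google Cloud Storage could be loaded like this (the bucket, path, and table identifiers are hypothetical):
EXECUTE InsertLoadJob SourceURIs = 'gs://my-bucket/exports/customers.csv', SourceFormat = 'CSV', DestinationTable = 'psychic-valve-137816.Northwind.customers', CSVSkipLeadingRows = '1'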
RefreshOAuthAccessToken
Obtains a new OAuth access token for authentication with various Google services using the OAuth refresh token.
Input
Name Type Description
OAuthRefreshToken String The refresh token returned from the original authorization code exchange.
Result Set Columns
Name Type Description
OAuthAuthToken String The authentication token returned from Google. This can be used in subsequent calls to other operations for this particular service.
OAuthRefreshToken String A token that may be used to obtain a new access token.
ExpiresIn String The remaining lifetime on the access token.
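For example (the token value is hypothetical):
EXECUTE RefreshOAuthAccessToken OAuthRefreshToken = 'MyRefreshToken'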
Data Type Mapping
The adapter maps types from the data source to the corresponding data type available in the schema. The table below documents these mappings.
Google BigQuery CData Schema
STRING string
BYTES binary
INTEGER long
FLOAT double
NUMERIC decimal
BIGNUMERIC decimal
BOOLEAN bool
DATE date
TIME time
DATETIME datetime
TIMESTAMP datetime
STRUCT See below
ARRAY See below
GEOGRAPHY string
Note that the NUMERIC type supports 38 digits of precision and the BIGNUMERIC type supports 76 digits of precision. Most platforms do not have a decimal type that supports the full precision of these values (.NET decimal supports 28 digits, and Java BigDecimal supports 38 by default). If this is the case, these columns can be cast to a string when queried, or the connection can be configured to ignore them by setting IgnoreTypes=decimal.
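For example, a high-precision column could be read as text with a cast like the following; the table and column names are hypothetical, and this assumes the string conversion described above:
SELECT CAST(account_balance AS VARCHAR) AS account_balance FROM `Northwind`.`ledger`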
STRUCT and ARRAY Types
Google BigQuery supports two kinds of types for storing compound values in a single row, STRUCT and ARRAY. In some places within Google BigQuery these are also known as RECORD and REPEATED types.
A STRUCT is a fixed-size group of values which are accessed by name and can have
different types. The adapter flattens structs so their individual fields can be accessed using
dotted names. Note that these dotted names must be quoted.
-- trade_value STRUCT<currency STRING, value FLOAT>
SELECT CONCAT([trade_value.value], ' ', NULLIF([trade_value.currency],
'USD'))
FROM trades
An ARRAY is a group of values with the same type that can have any size. The adapter
treats the array as a single compound value and reports it as a JSON aggregate.
These types may be combined such that a STRUCT type contains an ARRAY field, or an
ARRAY field is a list of STRUCT values. The outer type takes precedence in how the field is
processed:
/* Table contains fields:
stocks STRUCT<symbol STRING, prices ARRAY<FLOAT>>
offers: ARRAY<STRUCT<currency STRING, value FLOAT>>
*/
SELECT [stocks.symbol], /* ARRAY field can be read from STRUCT, but is
converted to JSON */
[stocks.prices],
[offers] /* STRUCT fields in an ARRAY cannot be accessed
*/
FROM market
Type Parameters
The adapter exposes parameters on the following types. In each case the type parameters are optional; Google BigQuery has default values for types that are not parameterized.
- STRING(length)
- BYTES(length)
- NUMERIC(precision) or NUMERIC(precision, scale)
- BIGNUMERIC(precision) or BIGNUMERIC(precision, scale)
These parameters are primarily for restricting the data written to the table. They are included in the table metadata as the column size for STRING and BYTES, and as the numeric precision and scale for NUMERIC and BIGNUMERIC.
Type parameters have no effect on queries and are not reported within query metadata. For example, here the output of CONCAT is a plain STRING even though its inputs a and b are both STRING(100):
SELECT CONCAT(a, b) FROM table_with_length_params
Connection String Options
The connection string properties are the various options that can be used to establish a
connection. This section provides a complete list of the options you can configure in the
connection string for this provider. Click the links for further details.
For more information on establishing a connection, see Basic Tab.
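For example, a minimal user-account connection string might look like the following (the project and dataset values are hypothetical):
AuthScheme=OAuth;InitiateOAuth=GETANDREFRESH;ProjectId=psychic-valve-137816;DatasetId=Northwind;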
Authentication
Property Description
AuthScheme The type of authentication to use when connecting to Google BigQuery.
ProjectId The ProjectId used to resolve unqualified tables.
DatasetId The DatasetId used to resolve unqualified tables.
BillingProjectId The ProjectId of the billing project for executing jobs.
BigQuery
Property Description
AllowLargeResultSets Whether or not to allow large result sets to be stored in temporary tables.
DestinationTable This property determines where query results are stored in Google BigQuery.
UseQueryCache Specifies whether to use Google BigQuery's built-in query cache.
PageSize The number of results to return per page from Google BigQuery.
PollingInterval This determines how long to wait, in seconds, between checks to see if a job has completed.
AllowUpdatesWithoutKey Whether or not to allow updates without primary keys.
FilterColumns AllowUpdatesWithoutKey must be set to true before this property can be used.
UseLegacySQL Specifies whether to use BigQuery's legacy SQL dialect for this query. By default, Standard SQL will be used.
Storage API
Property Description
UseStorageAPI Specifies whether to use BigQuery's Storage API for bulk data reads.
UseArrowFormat Specifies whether to use the Arrow format with BigQuery's Storage API.
StorageThreshold The minimum number of rows a query must return to invoke the Storage API.
StoragePageSize Specifies the page size to use for Storage API queries.
Uploading
Property Description
InsertMode Specifies what kind of method to use when inserting data. By default streaming inserts are used.
WaitForBatchResults Whether to wait for the job to complete when using the bulk upload API. Only active when InsertMode is set to Upload.
GCSBucket Specifies the name of a GCS bucket to upload bulk data for staging.
GCSBucketFolder Specifies the name of the folder in GCSBucket to upload bulk data for staging.
TempTableDataset The prefix of the dataset that will contain temporary tables when performing bulk UPDATE or DELETE operations.
OAuth
Property Description
InitiateOAuth Set this property to initiate the process to obtain or refresh the OAuth access token when you connect.
OAuthClientId The client Id assigned when you register your application with an OAuth authorization server.
OAuthClientSecret The client secret assigned when you register your application with an OAuth authorization server.
OAuthAccessToken The access token for connecting using OAuth.
OAuthSettingsLocation The location of the settings file where OAuth values are saved when InitiateOAuth is set to GETANDREFRESH or REFRESH. Alternatively, this can be held in memory by specifying a value starting with memory://.
Scope Identifies the Google API access that your application is requesting. Scopes are space-delimited.
OAuthVerifier The verifier code returned from the OAuth authorization URL.
OAuthRefreshToken The OAuth refresh token for the corresponding OAuth access token.
OAuthExpiresIn The lifetime in seconds of the OAuth AccessToken.
OAuthTokenTimestamp The Unix epoch timestamp in milliseconds when the current Access Token was created.
JWT OAuth
Property Description
OAuthJWTCert The JWT Certificate store.
OAuthJWTCertType The type of key store containing the JWT Certificate.
OAuthJWTCertPassword The password for the OAuth JWT certificate.
OAuthJWTCertSubject The subject of the OAuth JWT certificate.
OAuthJWTIssuer The issuer of the JSON Web Token.
OAuthJWTSubject The user subject for which the application is requesting delegated access.
SSL
Property Description
SSLServerCert The certificate to be accepted from the server when connecting using TLS/SSL.
Firewall
Property Description
FirewallType The protocol used by a proxy-based firewall.
FirewallServer The name or IP address of a proxy-based firewall.
FirewallPort The TCP port for a proxy-based firewall.
FirewallUser The user name to use to authenticate with a proxy-based firewall.
FirewallPassword A password used to authenticate to a proxy-based firewall.
Proxy
Property Description
ProxyAutoDetect This indicates whether to use the system proxy settings or not. This takes precedence over other proxy settings, so you'll need to set ProxyAutoDetect to FALSE in order to use custom proxy settings.
ProxyServer The hostname or IP address of a proxy to route HTTP traffic through.
ProxyPort The TCP port the ProxyServer proxy is running on.
ProxyAuthScheme The authentication type to use to authenticate to the ProxyServer proxy.
ProxyUser A user name to be used to authenticate to the ProxyServer proxy.
ProxyPassword A password to be used to authenticate to the ProxyServer proxy.
ProxySSLType The SSL type to use when connecting to the ProxyServer proxy.
ProxyExceptions A semicolon separated list of destination hostnames or IPs that are exempt from connecting through the ProxyServer.
Logging
Property Description
LogModules Core modules to be included in the log file.
Schema
Property Description
Location A path to the directory that contains the schema files defining tables, views, and stored procedures.
RefreshViewSchemas Allows the provider to determine up-to-date view schemas automatically.
ShowTableDescriptions Controls whether table descriptions are returned via the platform metadata APIs and sys_tables / sys_views.
PrimaryKeyIdentifiers Set this property to define primary keys.
AllowedTableTypes Specifies what kinds of tables will be visible.
FlattenObjects Determines whether the provider flattens STRUCT fields into top-level columns.
Miscellaneous
Property Description
StorageTimeout How long a Storage API connection must remain idle before the provider reconnects.
AllowAggregateParameters Allows raw aggregates to be used in parameters when QueryPassthrough is enabled.
ApplicationName An application name in the form application/version. For example, AcmeReporting/1.0.
AuditLimit The maximum number of rows which will be stored within an audit table.
AuditMode What provider actions should be recorded to audit tables.
BigQueryOptions A comma separated list of Google BigQuery options.
GenerateSchemaFiles Indicates the user preference as to when schemas should be generated and saved.
MaximumBillingTier The MaximumBillingTier is a positive integer that serves as a multiplier of the basic price per TB. For example, if you set MaximumBillingTier to 2, the maximum cost for that query will be 2x the basic price per TB.
MaximumBytesBilled Limits how many bytes BigQuery will allow a job to consume before it is cancelled.
MaxRows Limits the number of rows returned when no aggregation or GROUP BY is used in the query. This helps avoid performance issues at design time.
Other These hidden properties are used only in specific use cases.
Readonly You can use this property to enforce read-only access to Google BigQuery from the provider.
TableSamplePercent This determines what percent of a table is sampled with the TABLESAMPLE operator.
Timeout The value in seconds until the timeout error is thrown, canceling the operation.
Authentication
This section provides a complete list of the Authentication properties you can configure in
the connection string for this provider.
Property Description
AuthScheme The type of authentication to use when connecting to Google BigQuery.
ProjectId The ProjectId used to resolve unqualified tables.
DatasetId The DatasetId used to resolve unqualified tables.
BillingProjectId The ProjectId of the billing project for executing jobs.
AuthScheme
The type of authentication to use when connecting to Google BigQuery.
Possible Values
Auto, OAuth, OAuthJWT, GCPInstanceAccount
Data Type
string
Default Value
"Auto"
Remarks
- Auto: Lets the driver decide automatically based on the other connection properties you have set.
- OAuth: Set this to perform OAuth authentication using a standard user account.
- OAuthJWT: Set this to perform OAuth authentication using an OAuth service account.
- GCPInstanceAccount: Set this to obtain an access token from a Google Cloud Platform instance.
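As a sketch, a service account connection using a Google JSON key might be configured like this (the key path, project, and dataset are hypothetical):
AuthScheme=OAuthJWT;OAuthJWTCertType=GOOGLEJSON;OAuthJWTCert=C:\keys\service_account.json;ProjectId=psychic-valve-137816;DatasetId=Northwind;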
ProjectId
The ProjectId used to resolve unqualified tables.
Data Type
string
Default Value
""
Remarks
When a query refers to a table it can leave the project implicit, or qualify the project
directly as the catalog portion of the table:
/* Implicit, resolved against connection string */
SELECT FirstName, LastName FROM `Northwind`.`customers`
/* Explicit, project specified as catalog */
SELECT FirstName, LastName FROM `psychic-valve-137816`.`Northwind`.`customers`
If the query contains unqualified table references then they are resolved this way:
1. If this property is set then the specified project is used.
2. Otherwise, if the BillingProjectId is set then that project is used.
3. Finally, if neither are set then the catalog from the first table in the query is used. In the above example the psychic-valve-137816 project would be used to resolve any other unqualified tables that appear in the query (for example, if a JOIN were added).
Note that the query is not consulted when QueryPassthrough is enabled. So you must either set the connection ProjectId and DatasetId or qualify each individual table; otherwise the SELECT query fails.
DatasetId
The DatasetId used to resolve unqualified tables.
Data Type
string
Default Value
""
Remarks
When a query refers to a table it can leave the dataset implicit, or qualify the dataset
directly as the schema portion of the table:
/* Implicit, resolved against connection string */
SELECT FirstName, LastName FROM `customers`
/* Explicit, dataset specified as schema */
SELECT FirstName, LastName FROM `psychic-valve-137816`.`Northwind`.`customers`
If the query contains unqualified table references then they are resolved this way:
1. If this property is set then the specified dataset is used.
2. Otherwise the schema from the first table in the query is used. In the above example the Northwind dataset would be used to resolve any other unqualified tables that appear in the query (for example, if a JOIN were added).
Note that the query is not consulted when QueryPassthrough is enabled. So you must either set the connection ProjectId and DatasetId or qualify each individual table; otherwise the SELECT query fails.
BillingProjectId
The ProjectId of the billing project for executing jobs.
Data Type
string
Default Value
""
Remarks
The ProjectId of the billing project for executing jobs. You can obtain the project Id in the
Google APIs console: In the main menu, click API Project and copy the Id. For example:
psychic-valve-137816. (Note that your domain name is not part of the Id.)
The billing project is determined based on the value of this property, the ProjectId and the
query:
1. If this property is set then the specified project will be billed.
2. Otherwise, if the ProjectId is set then that project will be billed.
3. Finally, if neither are set then the catalog from the first table in the query will be
used. For example, the below query would bill the psychic-valve-137816 project.
SELECT FirstName, LastName FROM `psychic-valve-137816`.`Northwind`.`customers`
BigQuery
This section provides a complete list of the BigQuery properties you can configure in the
connection string for this provider.
Property Description
AllowLargeResultSets Whether or not to allow large result sets to be stored in temporary tables.
DestinationTable This property determines where query results are stored in Google BigQuery.
UseQueryCache Specifies whether to use Google BigQuery's built-in query cache.
PageSize The number of results to return per page from Google BigQuery.
PollingInterval This determines how long to wait, in seconds, between checks to see if a job has completed.
AllowUpdatesWithoutKey Whether or not to allow updates without primary keys.
FilterColumns AllowUpdatesWithoutKey must be set to true before this property can be used.
UseLegacySQL Specifies whether to use BigQuery's legacy SQL dialect for this query. By default, Standard SQL will be used.
AllowLargeResultSets
Whether or not to allow large result sets to be stored in temporary tables.
Data Type
bool
Default Value
true
Remarks
Whether or not to allow large result sets to be stored in temporary tables.
DestinationTable
This property determines where query results are stored in Google BigQuery.
Data Type
string
Default Value
""
Remarks
Google BigQuery queries have a maximum amount of data they are allowed to return directly. If this limit is exceeded, queries fail with an error message like "Response too large to return". When this option is set the response limit does not apply, because all query responses are stored in a Google BigQuery table before being returned.
This option is set differently depending upon whether your connection is using UseLegacySQL or not. By default this option is set using the standard SQL syntax:
DestinationTable=project-name.dataset-name.table-name
When UseLegacySQL is enabled, this option is set using the legacy table syntax:
DestinationTable=project-name:dataset-name.table-name
When using this option with multiple connections, make sure that each connection has its own destination table. Sharing a table between connections can lead to results getting lost because parallel queries can overwrite each other's results.
UseQueryCache
Specifies whether to use Google BigQuery's built-in query cache.
Data Type
bool
Default Value
true
Remarks
Google BigQuery will cache the results of recent queries, and will use this cache for queries
by default. Google BigQuery automatically updates the cache when a table is modified, so
performance is generally better without any risk of queries returning stale data.
If this is set to false, the query is always run against the table directly.
PageSize
The number of results to return per page from Google BigQuery.
Data Type
string
Default Value
"100000"
Remarks
The page size can control the number of results returned per page from Google BigQuery. Setting a higher page size will cause more data to come back in a single HTTP request, but may take longer to execute. Setting a smaller page size will increase the number of HTTP requests needed to get all the data, but is generally recommended to ensure timeout exceptions do not occur.
Note that this option does not have an effect if UseStorageApi is enabled and the queries being executed can be executed on the Storage API. See StoragePageSize for more information.
PollingInterval
This determines how long to wait, in seconds, between checks to see if a job has completed.
Data Type
string
Default Value
"1"
Remarks
This only applies to queries which are stored to a table instead of streamed directly to the adapter. This applies in only three cases:
- DestinationTable is set.
- AllowLargeResultSets is true and the query takes longer than Timeout seconds.
- UseStorageApi is enabled and the query is complex.
This property determines how long to wait between checks on whether or not the query's results are ready. Very large result sets or complex queries may take longer to process, and a low polling interval may result in many unnecessary requests being made to check the query status.
AllowUpdatesWithoutKey
Whether or not to allow updates without primary keys.
Data Type
bool
Default Value
false
Remarks
Whether or not to allow updates without primary keys.
FilterColumns
AllowUpdatesWithoutKey must be set to true before this property can be used.
Data Type
string
Default Value
""
Remarks
AllowUpdatesWithoutKey must be set to true before this property can be used. Set the property like this:
`filterColumns=col1[,col2[,col3]];`
UseLegacySQL
Specifies whether to use BigQuery's legacy SQL dialect for this query. By default, Standard
SQL will be used.
Data Type
bool
Default Value
false
Remarks
If set to true, the query will use BigQuery's Legacy SQL dialect to rebuild the query. If set to false, the query will use BigQuery's standard SQL: https://cloud.google.com/bigquery/sql-reference/
When UseLegacySQL is set to false, the value of AllowLargeResultSets is ignored. The query will be run as if AllowLargeResultSets is true.
Storage API
This section provides a complete list of the Storage API properties you can configure in the
connection string for this provider.
Property Description
UseStorageAPI Specifies whether to use BigQuery's Storage API for bulk data reads.
UseArrowFormat Specifies whether to use the Arrow format with BigQuery's Storage API.
StorageThreshold The minimum number of rows a query must return to invoke the Storage API.
StoragePageSize Specifies the page size to use for Storage API queries.
UseStorageAPI
Specifies whether to use BigQuery's Storage API for bulk data reads.
Data Type
bool
Default Value
true
Remarks
By default the adapter will use the Storage API instead of the default REST API. Depending upon the complexity of the query, the adapter may execute the query in one of two ways:
- Simple queries that read all columns from only one table, and have no extra clauses except LIMIT, are executed directly within the Storage API.
- All other queries are executed as a query job which writes to a temporary table. Once the query is complete, the results are read from the temporary table using the Storage API.
The BigQuery Storage API can read data faster and more efficiently than the REST API (accessible by setting this option to false), but is priced differently and requires extra OAuth permissions when using your own OAuth app. It also uses the separate StoragePageSize property instead of PageSize.
The BigQuery REST API requires no extra permissions and uses standard pricing, but is slower than the Storage API.
UseArrowFormat
Specifies whether to use the Arrow format with BigQuery's Storage API.
Data Type
bool
Default Value
false
Remarks
This property only has an effect when UseStorageApi is enabled. When performing reads
against the Storage API, the adapter can request data in different formats. By default it
uses Avro but enabling this option makes it use Arrow.
This option should be enabled when working with time series data or other datasets that have many date, time, datetime, or timestamp fields. For these datasets, using Arrow can show noticeable improvements over Avro. Otherwise, Avro and Arrow read times are very close, and switching between them is unlikely to make a significant difference.
StorageThreshold
The minimum number of rows a query must return to invoke the Storage API.
Data Type
string
Default Value
"100000"
Remarks
When the adapter receives a query too complex to be run directly in the Storage API, it
creates a query job and uses the Storage API to read from the query results table. If the
query job returns fewer than the number of rows provided in this option, then the results
are returned directly and the Storage API is not used.
This value should be set between 1 and 100000. Higher values will use the Storage API only
for large resultsets, but will be delayed by reading more results from the query job. Lower
values will result in smaller delays but will use the Storage API for more queries.
Note that this option only has an effect if UseStorageApi is enabled and the queries being
executed cannot be executed directly on the Storage API. Queries which run directly on
Storage never create query jobs.
StoragePageSize
Specifies the page size to use for Storage API queries.
Data Type
string
Default Value
"10000"
Remarks
When UseStorageApi is enabled and the query being executed can be run on the Storage
API, this option controls how many rows the adapter is allowed to buffer on the client.
A higher value will generally make queries faster at the expense of consuming more
memory, while lower values will conserve memory but make queries slower.
Uploading
This section provides a complete list of the Uploading properties you can configure in the
connection string for this provider.
Property Description
InsertMode Specifies what kind of method to use when inserting data. By default streaming inserts are used.
WaitForBatchResults Whether to wait for the job to complete when using the bulk upload API. Only active when InsertMode is set to Upload.
GCSBucket Specifies the name of a GCS bucket to upload bulk data for staging.
GCSBucketFolder Specifies the name of the folder in GCSBucket to upload bulk data for staging.
TempTableDataset The prefix of the dataset that will contain temporary tables when performing bulk UPDATE or DELETE operations.
InsertMode
Specifies what kind of method to use when inserting data. By default streaming inserts are
used.
Possible Values
Streaming, DML, Upload, GCSStaging
Data Type
string
Default Value
"Streaming"
Remarks
This section provides only a summary of the mechanisms that each of these modes uses. Please see Advanced Integrations for more details on how to use each of these modes. A configuration example is shown below the list.
- Streaming uses the Google BigQuery streaming API (also called insertAll).
- DML uses the Google BigQuery query API to generate INSERT SQL statements which insert individual rows.
- Upload uses the Google BigQuery upload API to create a load job which copies the rows from temporary server-side storage.
- GCSStaging is similar to the Upload mode except that it uses your Google Cloud Storage account instead of public storage.
When UseLegacySQL is true only Streaming and Upload modes are allowed. The Legacy SQL dialect does not support DML statements.
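For example, bulk staging through your own bucket might be configured like this (the bucket and folder names are hypothetical); see GCSBucket and GCSBucketFolder below:
InsertMode=GCSStaging;GCSBucket=my-staging-bucket;GCSBucketFolder=bulk-loads;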
WaitForBatchResults
Whether to wait for the job to complete when using the bulk upload API. Only active when
InsertMode is set to Upload.
Data Type
bool
Default Value
true
Remarks
This property determines whether the adapter will wait for batch jobs to report their status. By default this property is true and INSERT queries will complete only once Google BigQuery has finished executing them. When this property is false the INSERT query will complete as soon as a job is submitted for it.
The default mode is recommended for reliability:
1. Inserts will never fail silently. If the adapter does not wait for the job to finish, it will never receive an error if the job failed to execute.
2. If the insert batch size is small enough, the adapter may submit jobs quickly enough that it hits Google BigQuery's load job limits. This does not happen when waiting for batch results because the adapter will not allow more than one job to execute at the same time on the same connection.
You can disable this option to achieve lower delays when inserting, but you must also make sure to obey the Google BigQuery rate limits and check the status of each job to determine whether it has succeeded or failed.
GCSBucket
Specifies the name of a GCS bucket to upload bulk data for staging.
Data Type
string
Default Value
""
Remarks
This property only applies when InsertMode is set to GCSStaging; when that mode is used, this property is required.
GCSBucketFolder
Specifies the name of the folder in GCSBucket to upload bulk data for staging.
Data Type
string
Default Value
""
Remarks
Only applies when InsertMode is set to GCSStaging. If not set the adapter defaults to
writing to the root of the bucket.
TempTableDataset
The prefix of the dataset that will contain temporary tables when performing bulk UPDATE
or DELETE operations.
Data Type
string
Default Value
"_CDataTempTableDataset"
Remarks
Internally bulk UPDATE and DELETE use Google BigQuery MERGE queries, which require
creating a table to hold all the update operations. This option is used along with the target
table's region to determine the name of the dataset where these temporary tables are
created. Each region must have its own temporary dataset so that the temporary table and
the MERGE table can be stored in the same project/dataset. This avoids unnecessary data
transfer charges.
For example, the adapter would create a dataset called "_CDataTempTableDataset_US" for
tables in the US region and a dataset called "_CDataTempTableDataset_asia_southeast_1"
for tables in the Singapore region.
OAuth
This section provides a complete list of the OAuth properties you can configure in the
connection string for this provider.
Property Description
InitiateOAuth Set this property to initiate the process to obtain or refresh the OAuth access token when you connect.
OAuthClientId The client Id assigned when you register your application with an OAuth authorization server.
OAuthClientSecret The client secret assigned when you register your application with an OAuth authorization server.
OAuthAccessToken The access token for connecting using OAuth.
OAuthSettingsLocation The location of the settings file where OAuth values are saved when InitiateOAuth is set to GETANDREFRESH or REFRESH. Alternatively, this can be held in memory by specifying a value starting with memory://.
Scope Identifies the Google API access that your application is requesting. Scopes are space-delimited.
OAuthVerifier The verifier code returned from the OAuth authorization URL.
OAuthRefreshToken The OAuth refresh token for the corresponding OAuth access token.
OAuthExpiresIn The lifetime in seconds of the OAuth AccessToken.
OAuthTokenTimestamp The Unix epoch timestamp in milliseconds when the current Access Token was created.
InitiateOAuth
Set this property to initiate the process to obtain or refresh the OAuth access token when
you connect.
Possible Values
OFF, GETANDREFRESH, REFRESH
Data Type
string
Default Value
"OFF"
Remarks
The following options are available:
1. OFF: Indicates that the OAuth flow will be handled entirely by the user. An
OAuthAccessToken will be required to authenticate.
2. GETANDREFRESH: Indicates that the entire OAuth Flow will be handled by the
adapter. If no token currently exists, it will be obtained by prompting the user via the
browser. If a token exists, it will be refreshed when applicable.
3. REFRESH: Indicates that the adapter will only handle refreshing the OAuthAccessToken. The user will never be prompted by the adapter to authenticate via the browser. The user must handle obtaining the OAuthAccessToken and OAuthRefreshToken initially.
OAuthClientId
The client Id assigned when you register your application with an OAuth authorization
server.
Data Type
string
Default Value
""
Remarks
As part of registering an OAuth application, you will receive the OAuthClientId value,
sometimes also called a consumer key, and a client secret, the OAuthClientSecret.
OAuthClientSecret
The client secret assigned when you register your application with an OAuth authorization
server.
Data Type
string
Default Value
""
Remarks
As part of registering an OAuth application, you will receive the OAuthClientId, also called a
consumer key. You will also receive a client secret, also called a consumer secret. Set the
client secret in the OAuthClientSecret property.
OAuthAccessToken
The access token for connecting using OAuth.
Data Type
string
Default Value
""
Remarks
The OAuthAccessToken property is used to connect using OAuth. The OAuthAccessToken is
retrieved from the OAuth server as part of the authentication process. It has a server-
dependent timeout and can be reused between requests.
The access token is used in place of your user name and password. The access token
protects your credentials by keeping them on the server.
OAuthSettingsLocation
The location of the settings file where OAuth values are saved when InitiateOAuth is set to
GETANDREFRESH or REFRESH. Alternatively, this can be held in memory by specifying a
value starting with memory://.
Data Type
string
Default Value
"%APPDATA%\\CData\\GoogleBigQuery Data Provider\\OAuthSettings.txt"
Remarks
When InitiateOAuth is set to GETANDREFRESH or REFRESH, the adapter saves OAuth values
to avoid requiring the user to manually enter OAuth connection properties and allowing
the credentials to be shared across connections or processes.
Alternatively to specifying a file path, memory storage can be used instead. Memory
locations are specified by using a value starting with 'memory://' followed by a unique
identifier for that set of credentials (ex: memory://user1). The identifier can be anything
you choose but should be unique to the user. Unlike with the file based storage, you must
manually store the credentials when closing the connection with memory storage to be
able to set them in the connection when the process is started again. The OAuth property
values can be retrieved with a query to the sys_connection_props system table. If there are
multiple connections using the same credentials, the properties should be read from the
last connection to be closed.
If left unspecified, the default location is "%APPDATA%\\CData\\GoogleBigQuery Data
Provider\\OAuthSettings.txt" with %APPDATA% being set to the user's configuration
directory:
Platform %APPDATA%
Windows The value of the APPDATA environment variable
Mac ~/Library/Application Support
Linux ~/.config
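For example, the following fragments show the file-based and in-memory storage described above (the path and identifier are hypothetical):
OAuthSettingsLocation=C:\oauth\settings.txt;InitiateOAuth=GETANDREFRESH;
OAuthSettingsLocation=memory://user1;InitiateOAuth=GETANDREFRESH;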
Scope
Identifies the Google API access that your application is requesting. Scopes are space-
delimited.
Data Type
string
Default Value
""
Remarks
By default the adapter requests only the scope https://www.googleapis.com/auth/bigquery,
but in some cases more scopes are required. For example, when reading data using an
external table connected to Google Drive, the additional scope
https://www.googleapis.com/auth/drive is also required.
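For example, to request both of the scopes mentioned above in one space-delimited value:
Scope=https://www.googleapis.com/auth/bigquery https://www.googleapis.com/auth/drive;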
OAuthVerifier
The verifier code returned from the OAuth authorization URL.
Data Type
string
Default Value
""
Remarks
The verifier code returned from the OAuth authorization URL. This can be used on systems
where a browser cannot be launched such as headless systems.
Authentication on Headless Machines
To obtain the OAuthVerifier value, open the OAuth authorization URL (see the GetOAuthAuthorizationURL stored procedure) in a browser on a machine with internet access and copy the verifier code returned after permissions are granted.
Set OAuthSettingsLocation along with OAuthVerifier. When you connect, the adapter exchanges the OAuthVerifier for the OAuth authentication tokens and saves them, encrypted, to the specified file. Set InitiateOAuth to GETANDREFRESH to automate the exchange.
Once the OAuth settings file has been generated, you can remove OAuthVerifier from the
connection properties and connect with OAuthSettingsLocation set.
To automatically refresh the OAuth token values, set OAuthSettingsLocation and
additionally set InitiateOAuth to REFRESH.
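As a sketch of the headless flow described above (the verifier and file path are hypothetical), the first connection exchanges the verifier and saves the tokens, and later connections only refresh them:
InitiateOAuth=GETANDREFRESH;OAuthVerifier=MyVerifierCode;OAuthSettingsLocation=/home/user/oauth.txt;
InitiateOAuth=REFRESH;OAuthSettingsLocation=/home/user/oauth.txt;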
OAuthRefreshToken
The OAuth refresh token for the corresponding OAuth access token.
Data Type
string
Default Value
""
Remarks
The OAuthRefreshToken property is used to refresh the OAuthAccessToken when using
OAuth authentication.
OAuthExpiresIn
The lifetime in seconds of the OAuth AccessToken.
Data Type
string
Default Value
""
Remarks
Pair with OAuthTokenTimestamp to determine when the AccessToken will expire.
OAuthTokenTimestamp
The Unix epoch timestamp in milliseconds when the current Access Token was created.
Data Type
string
Default Value
""
Remarks
Pair with OAuthExpiresIn to determine when the AccessToken will expire.
JWT OAuth
This section provides a complete list of the JWT OAuth properties you can configure in the
connection string for this provider.
Property Description
OAuthJWTCert The JWT Certificate store.
OAuthJWTCertType The type of key store containing the JWT Certificate.
OAuthJWTCertPassword The password for the OAuth JWT certificate.
OAuthJWTCertSubject The subject of the OAuth JWT certificate.
OAuthJWTIssuer The issuer of the JSON Web Token.
OAuthJWTSubject The user subject for which the application is requesting delegated access.
OAuthJWTCert
The JWT Certificate store.
Data Type
string
Default Value
""
Remarks
The name of the certificate store for the client certificate.
The OAuthJWTCertType field specifies the type of the certificate store specified by
OAuthJWTCert. If the store is password protected, specify the password in
OAuthJWTCertPassword.
OAuthJWTCert is used in conjunction with the OAuthJWTCertSubject field in order to
specify client certificates. If OAuthJWTCert has a value, and OAuthJWTCertSubject is set, a
search for a certificate is initiated. Please refer to the OAuthJWTCertSubject field for
details.
Designations of certificate stores are platform-dependent.
The following are designations of the most common User and Machine certificate stores in
Windows:
MY A certificate store holding personal certificates with their associated private keys.
CA Certifying authority certificates.
ROOT Root certificates.
SPC Software publisher certificates.
In Java, the certificate store normally is a file containing certificates and optional private
keys.
When the certificate store type is PFXFile, this property must be set to the name of the file.
When the type is PFXBlob, the property must be set to the binary contents of a PFX file (i.e.
PKCS12 certificate store).
OAuthJWTCertType
The type of key store containing the JWT Certificate.
Possible Values
USER, MACHINE, PFXFILE, PFXBLOB, JKSFILE, JKSBLOB, PEMKEY_FILE, PEMKEY_BLOB,
PUBLIC_KEY_FILE, PUBLIC_KEY_BLOB, SSHPUBLIC_KEY_FILE, SSHPUBLIC_KEY_BLOB,
P7BFILE, PPKFILE, XMLFILE, XMLBLOB, GOOGLEJSON, GOOGLEJSONBLOB
Data Type
string
Default Value
"GOOGLEJSON"
Remarks
This property can take one of the following values:
USER For Windows, this specifies that the certificate store is a certificate
store owned by the current user. Note: This store type is not
available in Java.
MACHINE For Windows, this specifies that the certificate store is a machine
store. Note: this store type is not available in Java.
PFXFILE The certificate store is the name of a PFX (PKCS12) file containing
certificates.
PFXBLOB The certificate store is a string (base-64-encoded) representing a
certificate store in PFX (PKCS12) format.
JKSFILE The certificate store is the name of a Java key store (JKS) file
containing certificates. Note: this store type is only available in
Java.
JKSBLOB The certificate store is a string (base-64-encoded) representing a
certificate store in Java key store (JKS) format. Note: this store
type is only available in Java.
PEMKEY_FILE The certificate store is the name of a PEM-encoded file that
contains a private key and an optional certificate.
PEMKEY_BLOB The certificate store is a string (base64-encoded) that contains a
private key and an optional certificate.
PUBLIC_KEY_FILE The certificate store is the name of a file that contains a PEM- or
DER-encoded public key certificate.
PUBLIC_KEY_BLOB The certificate store is a string (base-64-encoded) that contains a
PEM- or DER-encoded public key certificate.
SSHPUBLIC_KEY_FILE The certificate store is the name of a file that contains an SSH-
style public key.
SSHPUBLIC_KEY_BLOB The certificate store is a string (base-64-encoded) that contains an
SSH-style public key.
P7BFILE The certificate store is the name of a PKCS7 file containing certificates.
PPKFILE The certificate store is the name of a file that contains a PPK
(PuTTY Private Key).
XMLFILE The certificate store is the name of a file that contains a
certificate in XML format.
XMLBLOB The certificate store is a string that contains a certificate in XML
format.
GOOGLEJSON The certificate store is the name of a JSON file containing the
service account information. Only valid when connecting to a
Google service.
GOOGLEJSONBLOB The certificate store is a string that contains the service account
JSON. Only valid when connecting to a Google service.
OAuthJWTCertPassword
The password for the OAuth JWT certificate.
Data Type
string
Default Value
""
Remarks
If the certificate store is of a type that requires a password, this property is used to specify
that password in order to open the certificate store.
This is not required when using the GOOGLEJSON OAuthJWTCertType. Google JSON keys
are not encrypted.
OAuthJWTCertSubject
The subject of the OAuth JWT certificate.
Data Type
string
Default Value
"*"
Remarks
When loading a certificate the subject is used to locate the certificate in the store.
If an exact match is not found, the store is searched for subjects containing the value of
the property.
If a match is still not found, the property is set to an empty string, and no certificate is
selected.
The special value "*" picks the first certificate in the certificate store.
The certificate subject is a comma separated list of distinguished name fields and values. For instance "CN=www.server.com, OU=test, C=US". Common fields and their meanings are displayed below.
Field Meaning
CN Common Name. This is commonly a host name like www.server.com.
O Organization
OU Organizational Unit
L Locality
S State
C Country
E Email Address
If a field value contains a comma it must be quoted.
OAuthJWTIssuer
The issuer of the JSON Web Token.
Data Type
string
Default Value
""
Remarks
The issuer of the JSON Web Token. This is typically either the Client Id or Email Address of the OAuth Application.
This is not required when using the GOOGLEJSON OAuthJWTCertType. Google JSON keys
contain a copy of the issuer account.
OAuthJWTSubject
The user subject for which the application is requesting delegated access.
Data Type
string
Default Value
""
Remarks
The user subject for which the application is requesting delegated access. Typically, the
user account name or email address.
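As an illustration only, a minimal JWT service account fragment might combine the
properties above. This is a sketch, not a definitive connection string; the file path and
subject are hypothetical placeholders:

OAuthJWTCertType=GOOGLEJSON;OAuthJWTCert=C:\keys\service_account.json;OAuthJWTSubject=admin@example.com;

As noted in the remarks above, OAuthJWTCertPassword and OAuthJWTIssuer can be
omitted with the GOOGLEJSON store type, because Google JSON keys are not encrypted
and already contain the issuer account.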
SSL
This section provides a complete list of the SSL properties you can configure in the
connection string for this provider.
Property Description
SSLServerCert The certificate to be accepted from the server when connecting using
TLS/SSL.
SSLServerCert
The certificate to be accepted from the server when connecting using TLS/SSL.
Data Type
string
Default Value
""
Remarks
If using a TLS/SSL connection, this property can be used to specify the TLS/SSL certificate
to be accepted from the server. Any other certificate that is not trusted by the machine is
rejected.
This property can take the following forms:
Description Example
A full PEM Certificate (example shortened for brevity) -----BEGIN CERTIFICATE----- MIIChTCCAe4CAQAwDQYJKoZIhv... ...Qw== -----END CERTIFICATE-----
A path to a local file containing the certificate C:\cert.cer
The public key (example shortened for brevity) -----BEGIN RSA PUBLIC KEY----- MIGfMA0GCSq......AQAB -----END RSA PUBLIC KEY-----
The MD5 Thumbprint (hex values can also be either space or colon separated) ecadbdda5a1529c58a1e9e09828d70e4
The SHA1 Thumbprint (hex values can also be either space or colon separated) 34a929226ae0819f2ec14b4a3d904f801cbb150d
If not specified, any certificate trusted by the machine is accepted.
Certificates are validated as trusted by the machine based on the System's trust store. The
trust store used is the 'javax.net.ssl.trustStore' value specified for the system. If no value is
specified for this property, Java's default trust store is used (for example, JAVA_
HOME\lib\security\cacerts).
Use '*' to signify to accept all certificates. Note that this is not recommended due to
security concerns.
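For example, a connection that pins the server certificate to a local file could use a
fragment like the following; the path is a hypothetical placeholder:

SSLServerCert=C:\cert.cer;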
Firewall
This section provides a complete list of the Firewall properties you can configure in the
connection string for this provider.
Property Description
FirewallType The protocol used by a proxy-based firewall.
FirewallServer The name or IP address of a proxy-based firewall.
FirewallPort The TCP port for a proxy-based firewall.
FirewallUser The user name to use to authenticate with a proxy-based firewall.
FirewallPassword A password used to authenticate to a proxy-based firewall.
FirewallType
The protocol used by a proxy-based firewall.
Possible Values
NONE, TUNNEL, SOCKS4, SOCKS5
Data Type
string
Default Value
"NONE"
Remarks
This property specifies the protocol that the adapter will use to tunnel traffic through the
FirewallServer proxy. Note that by default, the adapter connects to the system proxy; to
disable this behavior and connect to one of the following proxy types, set ProxyAutoDetect
to false.
Type Default Port Description
TUNNEL 80 When this is set, the adapter opens a connection to Google BigQuery and traffic flows back and forth through the proxy.
SOCKS4 1080 When this is set, the adapter sends data through the SOCKS 4 proxy specified by FirewallServer and FirewallPort and passes the FirewallUser value to the proxy, which determines if the connection request should be granted.
SOCKS5 1080 When this is set, the adapter sends data through the SOCKS 5 proxy specified by FirewallServer and FirewallPort. If your proxy requires authentication, set FirewallUser and FirewallPassword to credentials the proxy recognizes.
To connect to HTTP proxies, use ProxyServer and ProxyPort. To authenticate to HTTP
proxies, use ProxyAuthScheme, ProxyUser, and ProxyPassword.
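As a sketch, a SOCKS 5 connection fragment built from the properties in this section
might look like the following; the host and credentials are hypothetical placeholders:

ProxyAutoDetect=false;FirewallType=SOCKS5;FirewallServer=192.168.1.100;FirewallPort=1080;FirewallUser=socksUser;FirewallPassword=socksPass;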
FirewallServer
The name or IP address of a proxy-based firewall.
Data Type
string
Default Value
""
Remarks
This property specifies the IP address, DNS name, or host name of a proxy allowing
traversal of a firewall. The protocol is specified by FirewallType: Use FirewallServer with
this property to connect through SOCKS or do tunneling. Use ProxyServer to connect to an
HTTP proxy.
Note that the adapter uses the system proxy by default. To use a different proxy, set
ProxyAutoDetect to false.
FirewallPort
The TCP port for a proxy-based firewall.
Data Type
int
Default Value
0
Remarks
This specifies the TCP port for a proxy allowing traversal of a firewall. Use FirewallServer to
specify the name or IP address. Specify the protocol with FirewallType.
FirewallUser
The user name to use to authenticate with a proxy-based firewall.
Data Type
string
Default Value
""
Remarks
The FirewallUser and FirewallPassword properties are used to authenticate against the
proxy specified in FirewallServer and FirewallPort, following the authentication method
specified in FirewallType.
FirewallPassword
A password used to authenticate to a proxy-based firewall.
Data Type
string
Default Value
""
Remarks
This property is passed to the proxy specified by FirewallServer and FirewallPort, following
the authentication method specified by FirewallType.
Proxy
This section provides a complete list of the Proxy properties you can configure in the
connection string for this provider.
Property Description
ProxyAutoDetect This indicates whether to use the system proxy settings or not. This
takes precedence over other proxy settings, so you'll need to set
ProxyAutoDetect to FALSE in order to use custom proxy settings.
ProxyServer The hostname or IP address of a proxy to route HTTP traffic through.
ProxyPort The TCP port the ProxyServer proxy is running on.
ProxyAuthScheme The authentication type to use to authenticate to the ProxyServer
proxy.
ProxyUser A user name to be used to authenticate to the ProxyServer proxy.
ProxyPassword A password to be used to authenticate to the ProxyServer proxy.
ProxySSLType The SSL type to use when connecting to the ProxyServer proxy.
ProxyExceptions A semicolon separated list of destination hostnames or IPs that are
exempt from connecting through the ProxyServer.
ProxyAutoDetect
This indicates whether to use the system proxy settings or not. This takes precedence over
other proxy settings, so you'll need to set ProxyAutoDetect to FALSE in order to use custom
proxy settings.
Data Type
bool
Default Value
true
Remarks
This takes precedence over other proxy settings, so you'll need to set ProxyAutoDetect to
FALSE in order to use custom proxy settings.
NOTE: When this property is set to True, the proxy used is determined as follows:
- A search of the JVM properties (http.proxy, https.proxy, socksProxy, and so on) is performed.
- If the JVM properties do not exist, a search of java.home/lib/net.properties is performed.
- If java.net.useSystemProxies is set to True, a search of the SystemProxy is performed.
- On Windows only, an attempt is made to retrieve these properties from the Internet Options in the registry.
To connect to an HTTP proxy, see ProxyServer. For other proxies, such as SOCKS or
tunneling, see FirewallType.
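For instance, to bypass the system proxy and route traffic through a specific HTTP
proxy, a fragment such as the following could be used; the host name is a hypothetical
placeholder:

ProxyAutoDetect=false;ProxyServer=proxy.internal.example.com;ProxyPort=8080;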
ProxyServer
The hostname or IP address of a proxy to route HTTP traffic through.
Data Type
string
Default Value
""
Remarks
The hostname or IP address of a proxy to route HTTP traffic through. The adapter can use
the HTTP, Windows (NTLM), or Kerberos authentication types to authenticate to an HTTP
proxy.
If you need to connect through a SOCKS proxy or tunnel the connection, see FirewallType.
By default, the adapter uses the system proxy. If you need to use another proxy, set
ProxyAutoDetect to false.
ProxyPort
The TCP port the ProxyServer proxy is running on.
Data Type
int
Default Value
80
Remarks
The port the HTTP proxy is running on that you want to redirect HTTP traffic through.
Specify the HTTP proxy in ProxyServer. For other proxy types, see FirewallType.
ProxyAuthScheme
The authentication type to use to authenticate to the ProxyServer proxy.
Possible Values
BASIC, DIGEST, NONE, NEGOTIATE, NTLM, PROPRIETARY
Data Type
string
Default Value
"BASIC"
Remarks
This value specifies the authentication type to use to authenticate to the HTTP proxy
specified by ProxyServer and ProxyPort.
Note that the adapter will use the system proxy settings by default, without further
configuration needed; if you want to connect to another proxy, you will need to set
ProxyAutoDetect to false, in addition to ProxyServer and ProxyPort. To authenticate, set
ProxyAuthScheme and set ProxyUser and ProxyPassword, if needed.
The authentication type can be one of the following:
- BASIC: The adapter performs HTTP BASIC authentication.
- DIGEST: The adapter performs HTTP DIGEST authentication.
- NEGOTIATE: The adapter retrieves an NTLM or Kerberos token based on the applicable protocol for authentication.
- PROPRIETARY: The adapter does not generate an NTLM or Kerberos token. You must supply this token in the Authorization header of the HTTP request.
If you need to use another authentication type, such as SOCKS 5 authentication, see
FirewallType.
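As an illustrative sketch, authenticating to an NTLM proxy with the properties described
here might look like the following; the server and credentials are hypothetical
placeholders:

ProxyAutoDetect=false;ProxyServer=proxy.example.com;ProxyPort=8080;ProxyAuthScheme=NTLM;ProxyUser=domain\user;ProxyPassword=password;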
ProxyUser
A user name to be used to authenticate to the ProxyServer proxy.
Data Type
string
Default Value
""
Remarks
The ProxyUser and ProxyPassword options are used to connect and authenticate against
the HTTP proxy specified in ProxyServer.
You can select one of the available authentication types in ProxyAuthScheme. If you are
using HTTP authentication, set this to the user name of a user recognized by the HTTP
proxy. If you are using Windows or Kerberos authentication, set this property to a user
name in one of the following formats:
user@domain
domain\user
ProxyPassword
A password to be used to authenticate to the ProxyServer proxy.
Data Type
string
Default Value
""
Remarks
This property is used to authenticate to an HTTP proxy server that supports NTLM
(Windows), Kerberos, or HTTP authentication. To specify the HTTP proxy, you can set
ProxyServer and ProxyPort. To specify the authentication type, set ProxyAuthScheme.
If you are using HTTP authentication, additionally set ProxyUser and ProxyPassword to a
user recognized by the HTTP proxy.
If you are using NTLM authentication, set ProxyUser to your Windows user name and
ProxyPassword to your Windows password. You may also need these to complete Kerberos
authentication.
For SOCKS 5 authentication or tunneling, see FirewallType.
By default, the adapter uses the system proxy. If you want to connect to another proxy, set
ProxyAutoDetect to false.
ProxySSLType
The SSL type to use when connecting to the ProxyServer proxy.
Possible Values
AUTO, ALWAYS, NEVER, TUNNEL
Data Type
string
Default Value
"AUTO"
Remarks
This property determines when to use SSL for the connection to an HTTP proxy specified
by ProxyServer. The applicable values are the following:
AUTO Default setting. If the URL is an HTTPS URL, the adapter will use the TUNNEL option. If the URL is an HTTP URL, the adapter will use the NEVER option.
ALWAYS The connection is always SSL enabled.
NEVER The connection is not SSL enabled.
TUNNEL The connection is through a tunneling proxy. The proxy server opens a connection to the remote host and traffic flows back and forth through the proxy.
ProxyExceptions
A semicolon separated list of destination hostnames or IPs that are exempt from
connecting through the ProxyServer.
Data Type
string
Default Value
""
Remarks
The ProxyServer is used for all addresses, except for addresses defined in this property. Use
semicolons to separate entries.
Note that the adapter uses the system proxy settings by default, without further
configuration needed; if you want to explicitly configure proxy exceptions for this
connection, you need to set ProxyAutoDetect = false, and configure ProxyServer and
ProxyPort. To authenticate, set ProxyAuthScheme and set ProxyUser and ProxyPassword, if
needed.
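For example, to route all traffic through the proxy except requests to two internal hosts,
a fragment like the following could be used. The host names are hypothetical
placeholders, and the list is quoted here on the assumption that an unquoted value
would clash with the connection string's own semicolon delimiter:

ProxyAutoDetect=false;ProxyServer=proxy.example.com;ProxyPort=8080;ProxyExceptions='localhost;intranet.example.com';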
Logging
This section provides a complete list of the Logging properties you can configure in the
connection string for this provider.
Property Description
LogModules Core modules to be included in the log file.
LogModules
Core modules to be included in the log file.
Data Type
string
Default Value
""
Remarks
Only the modules specified (separated by ';') will be included in the log file. By default all
modules are included.
See the Logging page for an overview.
Schema
This section provides a complete list of the Schema properties you can configure in the
connection string for this provider.
Property Description
Location A path to the directory that contains the schema files defining
tables, views, and stored procedures.
RefreshViewSchemas Allows the provider to determine up-to-date view schemas
automatically.
ShowTableDescriptions Controls whether table descriptions are returned via the platform
metadata APIs and sys_tables / sys_views.
PrimaryKeyIdentifiers Set this property to define primary keys.
AllowedTableTypes Specifies what kinds of tables will be visible.
FlattenObjects Determines whether the provider flattens STRUCT fields into top-
level columns.
Location
A path to the directory that contains the schema files defining tables, views, and stored
procedures.
Data Type
string
Default Value
"%APPDATA%\\CData\\GoogleBigQuery Data Provider\\Schema"
Remarks
The path to a directory which contains the schema files for the adapter (.rsd files for tables
and views, .rsb files for stored procedures). The folder location can be a relative path from
the location of the executable. The Location property is only needed if you want to
customize definitions (for example, change a column name, ignore a column, and so on) or
extend the data model with new tables, views, or stored procedures.
If left unspecified, the default location is "%APPDATA%\\CData\\GoogleBigQuery Data
Provider\\Schema" with %APPDATA% being set to the user's configuration directory:
Platform %APPDATA%
Windows The value of the APPDATA environment variable
Mac ~/Library/Application Support
Linux ~/.config
RefreshViewSchemas
Allows the provider to determine up-to-date view schemas automatically.
Data Type
bool
Default Value
true
Remarks
When using BigQuery views, BigQuery stores a copy of the view schema with the view itself.
However, these stored view schemas are not updated when the tables used by the view
change. This means that the stored view schema can easily become out of date and cause
queries using the view to fail.
By default, the adapter will not use the stored view schema and will instead query the view
to determine the available columns. This guarantees that the schema will be up to date
although it requires the adapter to start a query job.
You can disable this option to force the adapter to use the stored view schemas. This
prevents the adapter from running any queries when getting a view schema, but also
means that queries using the view will fail if the schema is out of date.
ShowTableDescriptions
Controls whether table descriptions are returned via the platform metadata APIs and sys_
tables / sys_views.
Data Type
bool
Default Value
false
Remarks
By default table descriptions are not shown, since the Google BigQuery API requires an
extra request beyond what is usually required for reading tables.
Enabling this option will show table descriptions, but will cost an extra API request for
every table when a table list is fetched. This can slow down metadata operations on large
datasets.
PrimaryKeyIdentifiers
Set this property to define primary keys.
Data Type
string
Default Value
""
Remarks
Google BigQuery does not natively support primary keys, but for certain DML operations or
database tools you may need to define them. By default this option is disabled and no
tables will have primary keys except for the ones defined in schema files (if you set
Location).
Primary keys are defined using a list of rules which match tables and provide a list of key
columns. For example, PrimaryKeyIdentifiers="*=key;transactions=tx_date,tx_
serial;user_comments=" has three rules separated by semicolons:
1. The first rule *=key means that every table without a more specific rule will have one
primary key column called key. Tables that do not have a key column will not have
any primary keys.
2. The second rule transactions=tx_date,tx_serial means that the transactions table
will have the two primary key columns tx_date and tx_serial. If any of those columns
are missing from the table then they will be ignored.
3. The third rule user_comments= means that the user_comments table will have no
primary keys. The only use that empty key lists have is in overriding the default rule.
If there is no default rule present then the only tables with primary keys would be the
ones explicitly listed.
Note that the table names can include just the table; the dataset and table; or the project,
dataset, and table. Both column and table names may be quoted using SQL quotes:
/* Rules with just table names use the connection ProjectId (or
DataProjectId) and DatasetId.
All these rules refer to the same table with a connection where
ProjectId=someProject;DatasetId=someDataset */
someTable=a,b,c
someDataset.someTable=a,b,c
someProject.someDataset.someTable=a,b,c
/* Any table or column name may be quoted */
`someProject`."someDataset".[someTable]=`a`,[b],"c"
AllowedTableTypes
Specifies what kinds of tables will be visible.
Data Type
string
Default Value
"TABLE,EXTERNAL,VIEW,MATERIALIZED_VIEW"
Remarks
This option is a comma-separated list of the table type values that the adapter displays.
Any table-like or view-like entity that doesn't have a matching type will not be reported
when listing tables.
- TABLE: Standard tables.
- EXTERNAL: Read-only table stored on another service (like GCS or Drive).
- SNAPSHOT: A read-only table that preserves the data of another table at a specific point in time.
- VIEW: Standard views.
- MATERIALIZED_VIEW: A view that is recalculated and cached each time its base table changes.
For example, to restrict the adapter to listing only simple tables and views, set this option
to TABLE,VIEW.
FlattenObjects
Determines whether the provider flattens STRUCT fields into top-level columns.
Data Type
bool
Default Value
true
Remarks
By default the adapter reports each field in a STRUCT column as its own column while the
STRUCT column itself is hidden. This process is recursively applied to nested STRUCT
values. For example, if the following table is defined in Google BigQuery then the adapter
reports 3 columns: location.coords.lat, location.coords.lon and location.country:
CREATE TABLE t(location STRUCT<coords STRUCT<lat FLOAT64, lon FLOAT64>,
country STRING>);
If this property is disabled, then the top-level STRUCT is not expanded and is left as its own
column. The value of this column is reported as a JSON aggregate. In the above example,
the adapter reports only the location column when flattening is disabled.
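To make the effect concrete, a hypothetical query against the table above could reference
the flattened fields as top-level columns. The bracket quoting below follows the quoting
styles shown under PrimaryKeyIdentifiers and is an assumption, not confirmed syntax:

-- FlattenObjects=true: STRUCT fields are exposed as top-level columns
SELECT [location.coords.lat], [location.country] FROM t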
Miscellaneous
This section provides a complete list of the Miscellaneous properties you can configure in
the connection string for this provider.
Property Description
StorageTimeout How long a Storage API connection must remain idle before
the provider reconnects.
AllowAggregateParameters Allows raw aggregates to be used in parameters when
QueryPassthrough is enabled.
ApplicationName An application name in the form application/version. For
example, AcmeReporting/1.0.
AuditLimit The maximum number of rows which will be stored within an
audit table.
AuditMode What provider actions should be recorded to audit tables.
BigQueryOptions A comma separated list of Google BigQuery options.
GenerateSchemaFiles Indicates the user preference as to when schemas should be
generated and saved.
MaximumBillingTier The MaximumBillingTier is a positive integer that serves as a
multiplier of the basic price per TB. For example, if you set
MaximumBillingTier to 2, the maximum cost for that query
will be 2x basic price per TB.
MaximumBytesBilled Limits how many bytes BigQuery will allow a job to consume
before it is cancelled.
MaxRows Limits the number of rows returned when no aggregation or group by
is used in the query. This helps avoid performance issues at design
time.
Other These hidden properties are used only in specific use cases.
Readonly You can use this property to enforce read-only access to
Google BigQuery from the provider.
TableSamplePercent This determines what percent of a table is sampled with the
TABLESAMPLE operator.
Timeout The value in seconds until the timeout error is thrown,
canceling the operation.
StorageTimeout
How long a Storage API connection must remain idle before the provider reconnects.
Data Type
string
Default Value
"60"
Remarks
Google BigQuery and many proxies/firewalls restrict the amount of time that idle
connections stay alive before they are forcibly closed. This can be a problem when using
the Storage API because the adapter may stream data faster than it can be consumed.
While the consumer is catching up, the adapter does not use its connection and it may be
closed by the next time the adapter uses it.
To avoid this, the adapter automatically closes and reopens the connection if it has been
idle for too long. This property controls how many seconds the connection must be idle
before the adapter resets it. To disable these resets, set this property to 0 or a negative
value.
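For example, to disable these idle-connection resets entirely, the property could be set as
follows:

StorageTimeout=0;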
AllowAggregateParameters
Allows raw aggregates to be used in parameters when QueryPassthrough is enabled.
Data Type
bool
Default Value
false
Remarks
This option affects how string parameters are handled when using direct queries through
QueryPassthrough. For example, consider this query:
INSERT INTO proj.data.tbl(x) VALUES (@x)
By default, this option is disabled and string parameters are quoted and escaped into SQL
strings. That means that any value can be safely used as a string parameter, but it also
means that parameters cannot be used as raw aggregate values:
/*
* If @x is set to: test value ' contains quote
*
* Result is a valid query
*/
INSERT INTO proj.data.tbl(x) VALUES ('test value \' contains quote')
/*
* If @x is set to: ['valid', ('aggregate', 'value')]
*
* Result contains string instead of aggregate:
*/
INSERT INTO proj.data.tbl(x) VALUES ('[\'valid\', (\'aggregate\',
\'value\')]')
When this option is enabled, string parameters are inserted directly into the query. This
means that raw aggregates can be used as parameters, but it also means that all simple
strings must be escaped:
/*
* If @x is set to: test value ' contains quote
*
* Result is an invalid query
*/
INSERT INTO proj.data.tbl(x) VALUES (test value ' contains quote)
/*
* If @x is set to: ['valid', ('aggregate', 'value')]
*
* Result is an aggregate
*/
INSERT INTO proj.data.tbl(x) VALUES (['valid', ('aggregate', 'value')])
ApplicationName
An application name in the form application/version. For example, AcmeReporting/1.0.
Data Type
string
Default Value
""
Remarks
The adapter identifies itself to BigQuery using a Google partner User-Agent header. The
first part of the User-Agent is fixed and identifies the client as a specific build of the CData
adapter. The last portion reports the specific application using the adapter.
AuditLimit
The maximum number of rows which will be stored within an audit table.
Data Type
string
Default Value
"1000"
Remarks
When auditing is enabled with the AuditMode option, this property is used to determine
how many rows will be allowed in the audit table at once.
By default this property is 1000, meaning that only the 1000 most recent audit events will
be available within the audit table.
This property can also be set to -1, which places no limits on the size of the audit table. In
this mode, the audit table should be periodically cleared to prevent the adapter from using
excessive memory.
DELETE FROM AuditJobs#TEMP
AuditMode
What provider actions should be recorded to audit tables.
Data Type
string
Default Value
""
Remarks
The adapter can record certain internal actions taken when it runs queries. For each of
the actions listed in this option, the adapter creates a temporary audit table which logs
when the action took place, what query caused the action, and any other relevant
information.
By default this option is set to 'none' and the adapter does not record any audit
information. This option can also be set to a comma-separated list of the following actions:
Mode Name Audit Table Description Columns
start-jobs AuditJobs#TEMP Records all jobs started by the adapter Timestamp, Query, ProjectId, Location, JobId
Refer to AuditLimit for more information on how to limit the size of these tables.
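As a sketch, enabling job auditing and then reading the audit table might look like the
following, using the mode name and table documented above:

-- Connection property: AuditMode=start-jobs
SELECT Timestamp, Query, ProjectId, Location, JobId FROM AuditJobs#TEMP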
BigQueryOptions
A comma separated list of Google BigQuery options.
Data Type
string
Default Value
""
Remarks
A list of Google BigQuery options:
Option Description
gbqoImplicitJoinAsUnion This option will prevent the driver from converting an IMPLICIT JOIN into a CROSS JOIN as expected by SQL92. Instead, it will leave it as an IMPLICIT JOIN, which Google BigQuery will execute as a UNION ALL.
GenerateSchemaFiles
Indicates the user preference as to when schemas should be generated and saved.
Possible Values
Never, OnUse, OnStart, OnCreate
Data Type
string
Default Value
"Never"
Remarks
This property outputs schemas to .rsd files in the path specified by Location.
Available settings are the following:
- Never: A schema file will never be generated.
- OnUse: A schema file will be generated the first time a table is referenced, provided the schema file for the table does not already exist.
- OnStart: A schema file will be generated at connection time for any tables that do not currently have a schema file.
- OnCreate: A schema file will be generated when running a CREATE TABLE SQL query.
Note that if you want to regenerate a file, you will first need to delete it.
Generate Schemas with SQL
When you set GenerateSchemaFiles to OnUse, the adapter generates schemas as you
execute SELECT queries. Schemas are generated for each table referenced in the query.
When you set GenerateSchemaFiles to OnCreate, schemas are only generated when a
CREATE TABLE query is executed.
Generate Schemas on Connection
Another way to use this property is to obtain schemas for every table in your database
when you connect. To do so, set GenerateSchemaFiles to OnStart and connect.
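For instance, to save a schema file for every table at connection time, the two properties
could be combined as follows; the directory path is a hypothetical placeholder:

GenerateSchemaFiles=OnStart;Location=C:\GoogleBigQuery\Schema;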
MaximumBillingTier
The MaximumBillingTier is a positive integer that serves as a multiplier of the basic price
per TB. For example, if you set MaximumBillingTier to 2, the maximum cost for that query
will be 2x basic price per TB.
Data Type
string
Default Value
""
Remarks
Limits the billing tier for this job. Queries that have resource usage beyond this tier will fail
(without incurring a charge). If unspecified, this will be set to your project default. If your
query is too compute intensive for BigQuery to complete at the standard per TB pricing
tier, BigQuery returns a billingTierLimitExceeded error and an estimate of how much the
query would cost. To run the query at a higher pricing tier, pass a new value for
maximumBillingTier as part of the query request. The maximumBillingTier is a positive
integer that serves as a multiplier of the basic price per TB. For example, if you set
maximumBillingTier to 2, the maximum cost for that query will be 2x basic price per TB.
MaximumBytesBilled
Limits how many bytes BigQuery will allow a job to consume before it is cancelled.
Data Type
string
Default Value
""
Remarks
When this value is provided, all jobs will use this value as their default billing cap. If a job
uses more than this many bytes, BigQuery will cancel it and it will not be billed. By default
there is no cap and all jobs will be billed for however many bytes they consume.
This only has an effect when using DestinationTable or when using the InsertJob stored
procedure. BigQuery does not allow standard query jobs to have byte limits.
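For example, to cap such jobs at roughly 1 GB of billed bytes, the property could be set as
follows:

MaximumBytesBilled=1000000000;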
MaxRows
Limits the number of rows returned when no aggregation or group by is used in the query.
This helps avoid performance issues at design time.
Data Type
int
Default Value
-1
Remarks
Limits the number of rows returned when no aggregation or group by is used in the query.
This helps avoid performance issues at design time.
Other
These hidden properties are used only in specific use cases.
Data Type
string
Default Value
""
Remarks
The properties listed below are available for specific use cases. Normal driver use cases
and functionality should not require these properties.
Specify multiple properties in a semicolon-separated list.
Integration and Formatting
DefaultColumnSize Sets the default length of string fields when the data source does not provide column length in the metadata. The default value is 2000.
ConvertDateTimeToGMT Determines whether to convert date-time values to GMT, instead of the local time of the machine.
RecordToFile=filename Records the underlying socket data transfer to the specified file.
Readonly
You can use this property to enforce read-only access to Google BigQuery from the
provider.
Data Type
bool
Default Value
false
Remarks
If this property is set to true, the adapter will allow only SELECT queries. INSERT, UPDATE,
DELETE, and stored procedure queries will cause an error to be thrown.
TableSamplePercent
This determines what percent of a table is sampled with the TABLESAMPLE operator.
Data Type
string
Default Value
""
Remarks
This option can be set to make the adapter use the TABLESAMPLE operator for each table
referenced by a query. The value determines what percent is provided to the PERCENT
clause. That clause will only be generated if this property's value is above zero.
-- Input SQL
SELECT * FROM `tbl`
-- Generated Google BigQuery SQL when TableSamplePercent=10
SELECT * FROM `tbl` TABLESAMPLE SYSTEM (10 PERCENT)
This option is subject to a few limitations:
- It is applied during query conversion and has no effect when QueryPassthrough is set.
- More rows may be returned than expected due to how the server implements TABLESAMPLE. Please see the Google BigQuery documentation for more information.
- TABLESAMPLE is not supported on views. If a view is queried in sampling mode, the adapter will omit the TABLESAMPLE clause for the view.
Timeout
The value in seconds until the timeout error is thrown, canceling the operation.
Data Type
string
Default Value
"300"
Remarks
If Timeout = 0, operations do not time out. The operations run until they complete
successfully or until they encounter an error condition.
If Timeout expires and the operation is not yet complete, the adapter throws an exception.
Google BigQuery Adapter Limitations
Following are some of the limitations of the Google BigQuery adapter:
1. For clearing cache, the following two conditions must be met:
—Rows can be removed from Google BigQuery tables after 90 minutes.
—The cache_status/cache_tracking tables should be pointed to a different data source, such as PostgreSQL. It is recommended to choose the "Use a different data source" option while defining the Cache status and Tracking tables.
2. In the caching configuration, while choosing the table for multi-table caching, the fields "Table Schema" and "Table Catalog" are not optional for a Google BigQuery data source.
3. CREATE/DROP indexes for tables is not supported. You must uncheck the option "Drop indexes before load and create indexes after load" in the caching configuration before initiating caching.
4. To use the Select/Insert native cache loading functions, choose the "Enable Native Data Loading" option in the Overrides panel and uncheck the "Use Streaming Inserts" option in the data source configuration.
5. Native cache loading with Insert/Select is not supported for synthesized views.
6. The following datatypes are not supported by Google BigQuery. Tables with columns of these datatypes cannot be cached:
—BLOB
—CLOB
—BT_INT_LONG
TIBCO Product Documentation and Support
Services
For information about this product, you can read the documentation, contact TIBCO
Support, and join the TIBCO Community.
How to Access TIBCO Documentation
Documentation for TIBCO products is available on the TIBCO Product Documentation
website, mainly in HTML and PDF formats.
The TIBCO Product Documentation website is updated frequently and is more current than
any other documentation included with the product.
Product-Specific Documentation
The following documentation for this product is available on the TIBCO® Data Virtualization
page.
Users
TDV Getting Started Guide
TDV User Guide
TDV Web UI User Guide
TDV Client Interfaces Guide
TDV Tutorial Guide
TDV Northbay Example
Administration
TDV Installation and Upgrade Guide
TDV Administration Guide
TDV Active Cluster Guide
TDV Security Features Guide
Data Sources
TDV Adapter Guides
TDV Data Source Toolkit Guide (Formerly Extensibility Guide)
References
TDV Reference Guide
TDV Application Programming Interface Guide
Other
TDV Business Directory Guide
TDV Discovery Guide
TIBCO TDV and Business Directory Release Notes: Read the release notes for a list of
new and changed features. This document also contains lists of known issues and
closed issues for this release.
Release Version Support
TDV 8.5 is designated as a Long Term Support (LTS) version. Some release versions of
TIBCO® Data Virtualization products are selected to be long-term support (LTS) versions.
Defect corrections will typically be delivered in a new release version and as hotfixes or
service packs to one or more LTS versions. See also
https://docs.tibco.com/pub/tdv/general/LTS/tdv_LTS_releases.htm.
How to Contact TIBCO Support
Get an overview of TIBCO Support. You can contact TIBCO Support in the following ways:
- For accessing the Support Knowledge Base and getting personalized content about products you are interested in, visit the TIBCO Support website.
- For creating a Support case, you must have a valid maintenance or support contract with TIBCO. You also need a user name and password to log in to the TIBCO Support website. If you do not have a user name, you can request one by clicking Register on the website.
How to Join TIBCO Community
TIBCO Community is the official channel for TIBCO customers, partners, and employee
subject matter experts to share and access their collective experience. TIBCO Community
offers access to Q&A forums, product wikis, and best practices. It also offers access to
extensions, adapters, solution accelerators, and tools that extend and enable customers to
gain full value from TIBCO products. In addition, users can submit and vote on feature
requests from within the TIBCO Ideas Portal. For a free registration, visit TIBCO
Community.
Legal and Third-Party Notices
SOME TIBCO SOFTWARE EMBEDS OR BUNDLES OTHER TIBCO SOFTWARE. USE OF SUCH
EMBEDDED OR BUNDLED TIBCO SOFTWARE IS SOLELY TO ENABLE THE FUNCTIONALITY (OR
PROVIDE LIMITED ADD-ON FUNCTIONALITY) OF THE LICENSED TIBCO SOFTWARE. THE
EMBEDDED OR BUNDLED SOFTWARE IS NOT LICENSED TO BE USED OR ACCESSED BY ANY
OTHER TIBCO SOFTWARE OR FOR ANY OTHER PURPOSE.
USE OF TIBCO SOFTWARE AND THIS DOCUMENT IS SUBJECT TO THE TERMS AND
CONDITIONS OF A LICENSE AGREEMENT FOUND IN EITHER A SEPARATELY EXECUTED
SOFTWARE LICENSE AGREEMENT, OR, IF THERE IS NO SUCH SEPARATE AGREEMENT, THE
CLICKWRAP END USER LICENSE AGREEMENT WHICH IS DISPLAYED DURING DOWNLOAD OR
INSTALLATION OF THE SOFTWARE (AND WHICH IS DUPLICATED IN THE LICENSE FILE) OR IF
THERE IS NO SUCH SOFTWARE LICENSE AGREEMENT OR CLICKWRAP END USER LICENSE
AGREEMENT, THE LICENSE(S) LOCATED IN THE “LICENSE” FILE(S) OF THE SOFTWARE. USE
OF THIS DOCUMENT IS SUBJECT TO THOSE TERMS AND CONDITIONS, AND YOUR USE
HEREOF SHALL CONSTITUTE ACCEPTANCE OF AND AN AGREEMENT TO BE BOUND BY THE
SAME.
This document is subject to U.S. and international copyright laws and treaties. No part of
this document may be reproduced in any form without the written authorization of TIBCO
Software Inc.
TIBCO, TIBCO logo, Two-Second Advantage, TIBCO Spotfire, TIBCO ActiveSpaces, TIBCO
Spotfire Developer, TIBCO EMS, TIBCO Spotfire Automation Services, TIBCO Enterprise
Runtime for R, TIBCO Spotfire Server, TIBCO Spotfire Web Player, TIBCO Spotfire Statistics
Services, S-PLUS, and TIBCO Spotfire S+ are either registered trademarks or trademarks of
TIBCO Software Inc. in the United States and/or other countries.
Java and all Java based trademarks and logos are trademarks or registered trademarks of
Oracle Corporation and/or its affiliates.
All other product and company names and marks mentioned in this document are the
property of their respective owners and are mentioned for identification purposes only.
This software may be available on multiple operating systems. However, not all operating
system platforms for a specific software version are released at the same time. See the
readme file for the availability of this software version on a specific operating system
platform.
THIS DOCUMENT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT.
THIS DOCUMENT COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS.
CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION HEREIN; THESE CHANGES WILL
BE INCORPORATED IN NEW EDITIONS OF THIS DOCUMENT. TIBCO SOFTWARE INC. MAY
MAKE IMPROVEMENTS AND/OR CHANGES IN THE PRODUCT(S) AND/OR THE PROGRAM(S)
DESCRIBED IN THIS DOCUMENT AT ANY TIME.
THE CONTENTS OF THIS DOCUMENT MAY BE MODIFIED AND/OR QUALIFIED, DIRECTLY OR
INDIRECTLY, BY OTHER DOCUMENTATION WHICH ACCOMPANIES THIS SOFTWARE,
INCLUDING BUT NOT LIMITED TO ANY RELEASE NOTES AND "READ ME" FILES.
This and other products of TIBCO Software Inc. may be covered by registered patents.
Please refer to TIBCO's Virtual Patent Marking document (https://www.tibco.com/patents)
for details.
Copyright © 2002-2023 Cloud Software Group, Inc. All Rights Reserved.