• Skip to main content
  • Skip to primary sidebar
BMA

BeMyAficionado

Inspire Affection

Big Query: How to Fetch Data In JSON Format By Consuming BigQuery’s REST API(s)

August 19, 2018 by varunshrivastava 10 Comments

BigQuery is revolutionizing the way Big Data is maintained and processed. Its “server-less” architecture is not just enterprise friendly but also developer friendly as it takes care of all the hardware configuration and scalability matrices on its own.

As a developer you just need to know the process of data extraction and that is all. All the performance related hard-work have been taken care by Google’s BigQuery.

Before you begin, I would suggest you to read following article:

  • Big Query: Everything You Need to Jump Start Your Development

This article will give you the entire picture of Big Query and why you should be using it in the first place to deal with Big Data.

Table of Contents

  • Create a New Project
  • Understanding Service Account
    • How does Service Account maintain authentication and authorization?
    • How to Create Service Account
  • Environment Setup
  • JAVA Project Setup for BigQuery Data Access Using REST API(s)
    • Project Structure
    • Classes Used
  • Dataset.java
  • BigQueryResponseVO.java
  • LabelsSample.java
  • BigQueryDemo.java
      • Form Request Body
      • Send Request and Get Query Result
  • Conclusion

Create a New Project

To do anything, you first need to create a Project under Google Cloud Platform.

Navigate to the Google Credentials Page. And select your project from the top navbar. Look at the image below:

If your project is not visible, then you will have to create it.

Click on select project drop-down and  New Project. Take a look at the below snapshot.

Upon clicking the New Project button, a new page will open where you will be asked to enter project details and create new project.

Create New GCP Project

Once your project is created, navigate back to the Credentials Page and look for Manage service Account Link at the bottom. Your application should use Service Account to interact with different Google Services.

Before moving forward, let me give you a quick overview of Service Account and its significance to better understand the tutorial.

Understanding Service Account

The first thing I tell everyone: Do not confuse Service Account with User Account.

Service accounts are not user specific, in-fact, these are special type of Google Accounts that belongs to your application instead of a User. Service account is a way to access Google Services with proper authentication and authorization.

Google assume identity of a Service account while calling Google APIs. Service accounts are made with the sole purpose to eliminate User from directly accessing the resources.

How does Service Account maintain authentication and authorization?

Service account have special keys that are used to authenticate themselves to google while accessing google services.

When you create Service Account, you decide:

  • Which resources must be accessible by the service account?
  • What permissions does the Service Account need in order to access a specific resource?
  • Where will the code that assumes the identity of the service account be running: on Google Cloud Platform or on-premises?

Use below flowchart to figure out how to configure service account:

How to Create Service Account

Login to your Google Cloud Platform dashboard. Navigate to IAM & admin link on the left side bar and click on Service Accounts link.

Access Service Account Link in Google Cloud Platform

Alternatively, you can Click on Manage Service Account link and that will lead you to Service Account Creation page.

Create new service account

After selecting project you will see all the listed accounts in the list format. And on the top navbar there is an option to create new service account. Click on Create New Service Account and provide the necessary information.

Make sure to choose the BigQuery role under Project role option. Checkout the image below:

Service Account Creation Page Google cloud Platform
Service Account Creation Page Google cloud Platform

Do not forget to check the furnish new private key checkbox. This will provide you file for download which we will be using later to access data from BigQuery tables.

Click save and your Service account will be created.

Note: Download the private key and save it on your local machine.

Great, you are done with all the boring service account creation part. Now, let’s jump into the development part. Here, you will be fetching the data from the BigQuery tables in JSON format.

Environment Setup

Set the path to the Private Key file in your System Environment. Follow following steps:

  • Right Click on PC in windows explore and click on properties.
PC Properties in windows explorer
  • Click on Advanced System Settings
Advanced System Settings Windows
Advanced System Settings
  • Click on Environment Variables button and create new environment variable with the key GOOGLE_APPLICATION_CREDENTIALS.
Environment Variables
GOOGLE_APPLICATION_CREDENTIALS
  • You are done setting up the environment and now can make calls to Google BigQuery.

JAVA Project Setup for BigQuery Data Access Using REST API(s)

In this project, you will query the Hacker News comments from the BigQuery database. This dataset is uploaded to BigQuery publicly and is accessible by everyone. To give you the overview, below is the query that you will be firing on the Bigquery database:

BigQuery Hacker News Comments Query Result
Query result for Hacker News comments on 3.41 GB dataset

You will first need to setup JAVA project and include all the required dependencies to make HTTP calls and retrieve data from the BigQuery database. For this project you will be using the pom.xml file provided by Google in their sample projects. You can copy and paste the file contents from the following link: https://github.com/vslala/BigQueryRestSample/blob/master/pom.xml. The main dependencies included in the pom.xml file are:

  • Google Cloud BigQuery Library (1.35.0)
  • Google API Client (1.23.0)
  • Google HTTP Client (1.23.0)
  • Google OAuth Client (1.23.0)

You can explore pom.xml file later. Once you have created a project and imported all the dependencies into your classpath, then its time for some action 😉

Project Structure

For this project, I would suggest you to use simple structure and try out the application.

  1. I assume you have already created a project with all the required dependencies.
  2. Create Package Structure: com.example.bigquery
  3. For simplicity, keep all the code inside this package.

Classes Used

  • BigQueryDemo.java
  • BigQueryResponseVO.java
  • Dataset.java
  • LabelsSample.java

Let’s take each class one-by-one and see what it does.

Dataset.java

Go through the below code:

package com.example.bigquery;

import java.util.HashMap;
import java.util.Map;

public class Dataset {
    
    private Map<String, Object> labels;
    
    public Dataset() {
        labels = new HashMap<>();
    }
    
    public Dataset addLabel(String labelKey, String labelValue) {
        labels.put(labelKey, labelValue);
        return this;
    }
    
    public Dataset removeLabel(String labelKey) {
        labels.remove(labelKey);
        return this;
    }
    
    public Map<String, Object> getLabels() {
        return labels;
    }
}

BigQueryResponseVO.java

This class is Response Value Object for the BigQuery Response. Take a look at its properties:

package com.example.bigquery;

import java.util.List;

import org.codehaus.jackson.annotate.JsonProperty;

import com.google.api.services.bigquery.model.JobReference;
import com.google.cloud.bigquery.BigQueryError;
import com.google.cloud.bigquery.Schema;

public class BigQueryResponseVO {
    @JsonProperty
    String kind;
    @JsonProperty
    String etag;
    @JsonProperty
    Schema schema;
    @JsonProperty
    JobReference jobReference;
    @JsonProperty
    long totalRows;
    @JsonProperty
    String pageToken;
    @JsonProperty
    boolean cacheHit;
    @JsonProperty
    long totalBytesProcessed;
    @JsonProperty
    List<BigQueryError> errors;
    @JsonProperty
    long numDmlAffectedRows;
    
    
}

LabelsSample.java

This class is the bootstrap class for the project. It contains the main() method that gives call to the BigQueryDemo.implicit() method. 

Take a look at the class:

/*
 * Copyright 2016 Google Inc.
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 * you may not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

package com.example.bigquery;

import java.io.IOException;

/** Sample demonstrating labeling a BigQuery dataset or table. */
public class LabelsSample {

    public static void main(String[] args) throws IOException, InterruptedException {
        BigQueryDemo.implicit();
    }
}

BigQueryDemo.java

Go through the below code and then read further:

package com.example.bigquery;

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Arrays;

import org.apache.log4j.Logger;

import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.GenericUrl;
import com.google.api.client.http.HttpContent;
import com.google.api.client.http.HttpHeaders;
import com.google.api.client.http.HttpRequest;
import com.google.api.client.http.HttpRequestFactory;
import com.google.api.client.http.HttpResponse;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.http.json.JsonHttpContent;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.common.io.CharStreams;

public class BigQueryDemo {
    
    private static final String QUERY_URL_FORMAT = "https://www.googleapis.com/bigquery/v2/projects/%s/queries" + "?access_token=%s";

    private static final String QUERY = "query";

    private static final String QUERY_HACKER_NEWS_COMMENTS = "SELECT * FROM [bigquery-public-data:hacker_news.comments] LIMIT 1000";

    private static final Logger logger = Logger.getLogger(BigQueryDemo.class);
    
    static GoogleCredential credential = null;
    static final HttpTransport HTTP_TRANSPORT = new NetHttpTransport();
    static final JsonFactory JSON_FACTORY = new JacksonFactory();
    static {
        // Authenticate requests using Google Application Default credentials.
        try {
            credential = GoogleCredential.getApplicationDefault();
            credential = credential.createScoped(Arrays.asList("https://www.googleapis.com/auth/bigquery"));
            credential.refreshToken();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void implicit() {
        String projectId = credential.getServiceAccountProjectId();
        String accessToken = generateAccessToken();
        // Set the content of the request.
        Dataset dataset = new Dataset().addLabel(QUERY,    QUERY_HACKER_NEWS_COMMENTS);
        HttpContent content = new JsonHttpContent(JSON_FACTORY, dataset.getLabels());
        // Send the request to the BigQuery API.
        GenericUrl url = new GenericUrl(String.format(QUERY_URL_FORMAT, projectId, accessToken));
        logger.debug("URL: " + url.toString());
        String responseJson = getQueryResult(content, url);
        logger.debug(responseJson);
    }

    private static String getQueryResult(HttpContent content, GenericUrl url) {
        String responseContent = null;
        HttpRequestFactory requestFactory = HTTP_TRANSPORT.createRequestFactory();
        HttpRequest request = null;
        try {
            request = requestFactory.buildPostRequest(url, content);
            request.setParser(JSON_FACTORY.createJsonObjectParser());
            request.setHeaders(
                    new HttpHeaders().set("X-HTTP-Method-Override", "POST").setContentType("application/json"));
            HttpResponse response = request.execute();
            InputStream is = response.getContent();
            responseContent = CharStreams.toString(new InputStreamReader(is));
        } catch (IOException e) {
            logger.error(e);
        }
        return responseContent;
    }

    private static String generateAccessToken() {
        String accessToken = null;
        if ((System.currentTimeMillis() > credential.getExpirationTimeMilliseconds())) {
            accessToken = credential.getRefreshToken();
        } else {
            accessToken = credential.getAccessToken();
        }
        System.out.println(accessToken);
        return accessToken;
    }
}

This is the main class that would do all the work.

The very first thing that executes in this class is the static block.

static {
        // Authenticate requests using Google Application Default credentials.
        try {
            credential = GoogleCredential.getApplicationDefault();
            credential = credential.createScoped(Arrays.asList("https://www.googleapis.com/auth/bigquery"));
            credential.refreshToken();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

The static block gets the Secret Json File of Service Account from the system environment properties. Then it sends a request to Google OAuth service to get the refresh token.

This refresh token is used to access the BigQuery with a scope as `https://www.googleapis.com/auth/bigquery`.

Form Request Body

Once the static block has executed successfully, the implicit method will be invoked.

The starting point for this class is its implicit() method. This implicit method is called from the main() method of LabelsSample.java.

Next, you will have to create the request body that will contain the query and BigQuery post URL.

public static void implicit() {
        String projectId = credential.getServiceAccountProjectId();
        String accessToken = generateAccessToken();
        // Set the content of the request.
        Dataset dataset = new Dataset().addLabel(QUERY,    QUERY_HACKER_NEWS_COMMENTS);
        HttpContent content = new JsonHttpContent(JSON_FACTORY, dataset.getLabels());
        // Send the request to the BigQuery API.
        GenericUrl url = new GenericUrl(String.format(QUERY_URL_FORMAT, projectId, accessToken));
        logger.debug("URL: " + url.toString());
        String responseJson = getQueryResult(content, url);
        logger.debug(responseJson);
    }

Send Request and Get Query Result

Once you have formed the request content, it is time to call the BigQuery’s REST API and fetch the results in JSON format.

Use request = requestFactory.buildPostRequest(url, content) (Request Factory) to create post request and pass in the URL and Content to it.

Set the appropriate headers (mandatory) and call the execute method.

request.setHeaders(
    new HttpHeaders().set("X-HTTP-Method-Override", "POST").setContentType("application/json"));

The response is provided in the InputStream. Convert the InputStream into Characters to retrieve the response in JSON.

That is all you need to do in order to fetch the data from BigQuery in JSON format.

Conclusion

Great! I hope you were able to get the response by following this tutorial.

This article was only intended to fetch the JSON response from BigQuery. Do not use the example as it is. Try to play around and modify the code to suit your need.

I will be modifying the code to make it more friendly and act as a framework for your project. So, do check the master branch of the repository as well.

If you have any question or if there is anything that you do not understand then please do comment below. I would be more than happy to interact with you and solve your queries.

Related

Filed Under: Tutorials Tagged With: Big Query, Fetch Data, java, Rest API

Primary Sidebar

Subscribe to Blog via Email

Do you enjoy the content? Feel free to leave your email with me to receive new content straight to your inbox. I'm an engineer, you can trust me :)

Join 874 other subscribers

Latest Podcasts

Recent Posts

  • Is The Cosmos a Vast Computation?
  • Building Semantic Search for E-commerce Using Product Embeddings and OpenSearch
  • Leader Election with ZooKeeper: Simplifying Distributed Systems Management
  • AWS Serverless Event Driven Data Ingestion from Multiple and Diverse Sources
  • A Step-by-Step Guide to Deploy a Static Website with CloudFront and S3 Using CDK Behind A Custom Domain

Recent Comments

  • Varun Shrivastava on Deploy Lambda Function and API Gateway With Terraform
  • Vaibhav Shrivastava on Deploy Lambda Function and API Gateway With Terraform
  • Varun Shrivastava on Should Girls Wear Short Clothes?
  • D on Should Girls Wear Short Clothes?
  • disqus_X5PikVsRAg on Basic Calculator Leetcode Problem Using Object-Oriented Programming In Java

Categories

  • Blogging
  • Cooking
  • Fashion
  • Finance & Money
  • Programming
  • Reviews
  • Software Quality Assurance
  • Technology
  • Travelling
  • Tutorials
  • Web Hosting
  • Wordpress N SEO

Archives

  • November 2024
  • September 2024
  • July 2024
  • April 2024
  • February 2024
  • November 2023
  • June 2023
  • May 2023
  • April 2023
  • August 2022
  • May 2022
  • April 2022
  • February 2022
  • January 2022
  • November 2021
  • September 2021
  • August 2021
  • June 2021
  • May 2021
  • April 2021
  • February 2021
  • January 2021
  • December 2020
  • November 2020
  • October 2020
  • September 2020
  • August 2020
  • July 2020
  • June 2020
  • May 2020
  • April 2020
  • February 2020
  • December 2019
  • November 2019
  • October 2019
  • August 2019
  • July 2019
  • June 2019
  • May 2019
  • April 2019
  • March 2019
  • January 2019
  • November 2018
  • October 2018
  • September 2018
  • August 2018
  • July 2018
  • June 2018
  • May 2018
  • March 2018
  • February 2018
  • January 2018
  • December 2017
  • November 2017
  • October 2017
  • September 2017
  • August 2017
  • July 2017
  • June 2017
  • May 2017
  • April 2017
  • March 2017
  • February 2017
  • January 2017
  • December 2016
  • November 2016
  • October 2016
  • September 2016
  • August 2016
  • July 2016
  • June 2016
  • May 2016

Tags

Affordable Hosting (4) algorithms (4) amazon (3) aoc-2020 (7) believe in yourself (4) best (4) database (4) earn money blogging (5) education (4) elementary sorting algorithms (4) experience (3) fashion (4) finance (6) Financial Freedom (7) food (7) friends (3) goals (5) google (5) india (10) indian cuisine (5) indian education system (4) java (16) life (16) life changing (4) love (4) make money (3) microservices (9) motivation (4) oops (4) podcast (6) poor education system (4) principles of microservices (5) problem-solving (7) programmer (5) programming (28) python (5) reality (3) seo (6) spring (3) success (10) success factor (4) technology (4) top 5 (7) typescript (3) wordpress (7)

Copyright © 2025 · Be My Aficionado · WordPress · Log in

Go to mobile version