Top "Google-hadoop" questions

The open-source Apache Hadoop framework can run on Google Cloud Platform for large-scale data processing, using Google Compute Engine VMs and Persistent Disks. It can optionally incorporate Google's tools and libraries for integrating Hadoop with other cloud services such as Google Cloud Storage and BigQuery.

Getting 'sudo: unknown user: hadoop' and 'sudo: unable to initialize policy plugin' errors on Google Cloud Platform while running a Hadoop cluster

I am trying to deploy the sample Hadoop app provided by Google at https://github.com/GoogleCloudPlatform/solutions-google-compute-engine-cluster-for-hadoop on Google …

linux hadoop google-compute-engine google-cloud-platform google-hadoop

"No Filesystem for Scheme: gs" when running a Spark job locally

I am running a Spark job (version 1.2.0), and the input is a folder inside a Google Cloud Storage bucket (i.…

apache-spark hadoop google-cloud-storage google-cloud-dataproc google-hadoop

Read from BigQuery into Spark in an efficient way?

When using the BigQuery connector to read data from BigQuery, I found that it first copies all data to Google Cloud …

apache-spark google-bigquery google-cloud-dataproc google-hadoop

Migrating 50 TB of data from a local Hadoop cluster to Google Cloud Storage

I am trying to migrate existing data (JSON) in my Hadoop cluster to Google Cloud Storage. I have explored GSUtil …

google-api google-api-java-client google-hadoop