Permission denied (publickey) on EC2 while starting Hadoop

user149332 · Jan 21, 2013

My manager has provided me with an Amazon EC2 instance along with a .ppk key. I am able to log in and am trying to install Hadoop. I have made the needed configuration changes: edited the masters and slaves files from localhost to the EC2 instance name, added the required properties to mapred-site.xml, hdfs-site.xml and core-site.xml, and formatted the HDFS namenode. Now, when I run the start-dfs.sh script, I get the following errors:

    starting namenode, logging to /home/ubuntu/hadoop/libexec/../logs/hadoop-ubuntu-namenode-domU-12-31-39-07-60-A9.out
    The authenticity of host 'XXX.amazonaws.com (some IP)' can't be established.

    Are you sure you want to continue connecting (yes/no)? yes
    XXX.amazonaws.com: Warning: Permanently added 'XXX.amazonaws.com,' (ECDSA) to the list of known hosts.
    XXX.amazonaws.com: Permission denied (publickey).
    XXX.amazonaws.com: Permission denied (publickey).

As of now, the master and slave nodes are the same machine.

XXX is the instance hostname and "some IP" is its IP address; I have masked them for security reasons.
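For reference, this is roughly what the relevant configuration looks like on my instance (hostname masked as above; the core-site.xml shown here is a sketch of the sort of values I used, not my exact file):

    $ cat /home/ubuntu/hadoop/conf/masters
    XXX.amazonaws.com
    $ cat /home/ubuntu/hadoop/conf/slaves
    XXX.amazonaws.com
    $ cat /home/ubuntu/hadoop/conf/core-site.xml
    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://XXX.amazonaws.com:9000</value>
      </property>
    </configuration>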

I have very little experience with EC2 instances, SSH, etc.; I only need to run a simple MapReduce program on the instance.

Kindly suggest.

Answer

Eric Alberson · Jan 21, 2013

Hadoop uses SSH to transfer information from the master to the slaves. It looks like your nodes are trying to talk to each other over SSH but haven't been configured to do so. For this to work, the Hadoop master node needs passwordless SSH access to the slave nodes; otherwise you would have to re-enter a password for every slave node each time you run a job, which would be quite tedious. You'll have to set this up between the nodes before you can continue.
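As a rough sketch of what that involves on a single machine like yours (assuming the default ubuntu user; adapt the key name and paths to your setup):

    # generate a passphrase-less keypair for the ubuntu user (skip if one already exists)
    ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
    # authorize that key for logins to this same machine (your master and slave)
    cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    # test: this should now log you in without any password or key prompt
    ssh XXX.amazonaws.com

Once that test login works without a prompt, start-dfs.sh should be able to reach the node without the "Permission denied (publickey)" errors.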

I would suggest you check this guide and find the section called "Configuring SSH". It lays out how to accomplish this.