Friday, April 1, 2016

Making Spark use /etc/hosts file for binding in YARN cluster mode


I have a Spark cluster set up on machines with two network interfaces each, one public and one private. The /etc/hosts file on each machine in the cluster lists the internal IP of all the other machines in the cluster, like so:

internal_ip FQDN
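To make the expected mapping concrete, here is a minimal diagnostic sketch in Python. It parses hosts-file text in the `internal_ip FQDN` format above and compares the entry with what the system resolver returns for the same name. The function names and the sample hostname and IP are hypothetical. It checks resolution only from Python's point of view; the JVM that Spark and YARN run in may choose its address differently.

```python
import socket


def parse_hosts(text):
    """Map every hostname/alias in hosts-file text to the first IP listed for it."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Keep the first mapping seen for a name.
            mapping.setdefault(name, ip)
    return mapping


def check_resolution(fqdn, hosts_path="/etc/hosts"):
    """Return (ip_from_hosts_file, ip_from_resolver) for the given name."""
    with open(hosts_path) as handle:
        expected = parse_hosts(handle.read()).get(fqdn)
    try:
        resolved = socket.gethostbyname(fqdn)
    except socket.gaierror:
        resolved = None
    return expected, resolved


if __name__ == "__main__":
    # Hypothetical sample entry; substitute a real FQDN from the cluster.
    sample = "10.0.0.5 node1.example.internal node1\n"
    print(parse_hosts(sample))
```

If the two values returned by `check_resolution` differ for a cluster node, the name is being resolved from somewhere other than the hosts file, which would be worth ruling out before looking at Spark itself.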

However, when I request a SparkContext via pyspark in YARN client mode (pyspark --master yarn --deploy-mode client), Akka binds to the public IP and the connection times out.

15/11/07 23:29:23 INFO Remoting: Starting remoting
15/11/07 23:29:23 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkYarnAM@public_ip:44015]
15/11/07 23:29:23 INFO util.Utils: Successfully started service 'sparkYarnAM' on port 44015.
15/11/07 23:29:23 INFO yarn.ApplicationMaster: Waiting for Spark driver to be reachable.
15/11/07 23:31:30 ERROR yarn.ApplicationMaster: Failed to connect to driver at yarn_driver_public_ip:48875, retrying ...
15/11/07 23:31:30 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Failed to connect to driver!
    at org.apache.spark.deploy.yarn.ApplicationMaster.waitForSparkDriver(ApplicationMaster.scala:427)
    at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:293)
    at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:149)
    at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:574)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66)
    at org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:65)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:65)
    at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:572)
    at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:599)
    at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 10, (reason: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Unregistering ApplicationMaster with FAILED (diag message: Uncaught exception: org.apache.spark.SparkException: Failed to connect to driver!)
15/11/07 23:31:30 INFO yarn.ApplicationMaster: Deleting staging directory .sparkStaging/application_1446960366742_0002

As the log shows, the private IP is completely ignored. How can I make YARN and Spark use the private IP address specified in the hosts file?

The cluster was provisioned using Ambari (HDP 2.4).
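One direction that may be worth trying, untested on this cluster and not confirmed as a fix: the log shows the ApplicationMaster trying to reach the driver at its public IP, so pinning the driver's address to the private IP might help. The sketch below only builds the launch command and environment. It assumes Spark's documented spark.driver.host property and SPARK_LOCAL_IP environment variable; the function name and the sample IP are hypothetical.

```python
import os


def build_pyspark_launch(private_ip):
    """Build the argv and environment for launching pyspark pinned to a private IP.

    Assumption: spark.driver.host and SPARK_LOCAL_IP affect the address the
    driver uses; whether that is sufficient on this cluster is not verified.
    """
    args = [
        "pyspark",
        "--master", "yarn",
        "--deploy-mode", "client",
        "--conf", "spark.driver.host=" + private_ip,
    ]
    # Copy the current environment and add the bind address for Spark.
    env = dict(os.environ, SPARK_LOCAL_IP=private_ip)
    return args, env


if __name__ == "__main__":
    # Hypothetical private IP of the machine running the driver.
    args, env = build_pyspark_launch("10.0.0.5")
    print(" ".join(args))
    # To actually launch: subprocess.call(args, env=env)
```

If the AM-side log still reports public_ip after this, the remaining difference would be on the YARN nodes rather than on the driver.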

0 Answers
