I am learning from the class. I have run the code as shown in the class and I get the errors below. Any idea what I should do?
I have Spark 1.6.1 and Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_74).
val datadir = "C:/Personal/V2Maestros/Courses/Big Data Analytics with Spark/Scala"

//............................................................................
//// Building and saving the model
//............................................................................

val tweetData = sc.textFile(datadir + "/movietweets.csv")
tweetData.collect()

def convertToRDD(inStr: String): (Double, String) = {
  val attList = inStr.split(",")
  val sentiment = attList(0).contains("positive") match {
    case true  => 0.0
    case false => 1.0
  }
  return (sentiment, attList(1))
}

val tweetText = tweetData.map(convertToRDD)
tweetText.collect()

//val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.implicits._
var ttDF = sqlContext.createDataFrame(tweetText).toDF("label", "text")
ttDF.show()
The error is:
scala> ttDF.show()
[Stage 2:> (0 + 2) / 2]
16/03/30 11:40:25 ERROR ExecutorClassLoader: Failed to check existence of class org.apache.spark.sql.catalyst.expressio... on REPL class server at http://192.168.56.1:54595
java.net.ConnectException: Connection timed out: connect
        at java.net.TwoStacksPlainSocketImpl.socketConnect(Native Method)
3 Answers
Answer 1
I'm no expert, but the IP in the error message (192.168.56.1) is a private address; it looks like a local node or even your router/modem's local address.
As stated in the comment, it could be that you're running the context with a wrong configuration that tries to spread the work to a cluster that isn't there, instead of running everything in your local JVM process.
For further information, you can read the Spark documentation and experiment with something like:
new SparkContext(master = "local[4]", appName = "tweetsClass", conf = new SparkConf)
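If you're working in the interactive shell rather than a standalone program, the context is created for you at startup, so the equivalent fix is to pass the master on the command line instead (a minimal sketch, assuming the standard spark-shell launcher):

spark-shell --master "local[4]"

With a local master, the executors run inside the same JVM, so nothing needs to reach the REPL class server over the network.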
Answer 2
I think the problem is with connectivity, not with the code itself.
Check if you can actually connect to this address and port (54595).
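For example, from the machine running the shell (the C:/ path in the question suggests Windows), a quick check on recent Windows versions from PowerShell:

Test-NetConnection -ComputerName 192.168.56.1 -Port 54595

or, with the classic telnet client installed:

telnet 192.168.56.1 54595

If the connection times out here too, the problem is the network path (a firewall, or a VirtualBox host-only adapter, whose default range is 192.168.56.x), not Spark.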
Answer 3
Probably your Spark master is not accessible at the specified port. Use local[*] to validate with a smaller dataset and a local master. Then check whether the port is accessible, or change it based on the Spark port configuration (http://spark.apache.org/docs/latest/configuration.html).
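A minimal sketch of forcing a local master from code, assuming you build your own context rather than using the shell-provided sc (the app name is arbitrary):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Run Spark entirely inside the local JVM, one worker thread per core;
// no standalone master or remote class server is contacted.
val conf = new SparkConf()
  .setMaster("local[*]")
  .setAppName("tweetsLocal")

val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)

Once the pipeline works locally, point setMaster back at the real cluster URL and revisit the port settings from the configuration page above.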