These are the dependencies in my pom.xml file. When I run Driver from my Maven project as a Scala application, it works fine. But when I build a jar-with-dependencies and try to run my project through spark-submit:
    spark-submit --class package.signature.Driver --master local[*] /path/to/my/jar-with-dependencies.jar
I get the following exception.
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.http.conn.ssl.SSLConnectionSocketFactory.<init>(Ljavax/net/ssl/SSLContext;[Ljava/lang/String;[Ljava/lang/String;Ljavax/net/ssl/HostnameVerifier;)V
    at com.gargoylesoftware.htmlunit.httpclient.HtmlUnitSSLConnectionSocketFactory.<init>(HtmlUnitSSLConnectionSocketFactory.java:125)
    at com.gargoylesoftware.htmlunit.httpclient.HtmlUnitSSLConnectionSocketFactory.buildSSLSocketFactory(HtmlUnitSSLConnectionSocketFactory.java:112)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.configureHttpsScheme(HttpWebConnection.java:597)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.createHttpClient(HttpWebConnection.java:532)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getHttpClientBuilder(HttpWebConnection.java:494)
    at com.gargoylesoftware.htmlunit.HttpWebConnection.getResponse(HttpWebConnection.java:158)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponseFromWebConnection(WebClient.java:1321)
    at com.gargoylesoftware.htmlunit.WebClient.loadWebResponse(WebClient.java:1238)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:346)
    at com.gargoylesoftware.htmlunit.WebClient.getPage(WebClient.java:415)
    at org.openqa.selenium.htmlunit.HtmlUnitDriver.get(HtmlUnitDriver.java:541)
    at org.openqa.selenium.htmlunit.HtmlUnitDriver.get(HtmlUnitDriver.java:530)
The exception originates in getPageSource in the following code:
    import org.openqa.selenium.htmlunit.HtmlUnitDriver
    import java.util.concurrent.TimeUnit
    import scala.concurrent.duration._
    import org.apache.spark.Logging

    object Selenium extends Logging {
      // false = JavaScript support disabled in the headless browser
      val driver = new HtmlUnitDriver(false)
      val implicitWaitTimeout = 30 // seconds
      driver.manage.timeouts.implicitlyWait(implicitWaitTimeout, TimeUnit.SECONDS)

      // Loads the page at urlPage and returns its source
      def getPageSource(urlPage: String): String = {
        driver.get(urlPage)
        driver.getPageSource
      }
    }
What I have tried so far: excluding the multiple versions of Apache httpcomponents. But that didn't work.
2 Answers
Answer 1
From the exception, it appears that selenium-htmlunit-driver (judging by com.gargoylesoftware.htmlunit in the stack trace) cannot find org.apache.httpcomponents httpclient version 4.4 or higher. SSLConnectionSocketFactory lives in that library, and the constructor it cannot find was introduced in version 4.4. Also, if you look at the dependencies of selenium-htmlunit-driver, it appears to use version 4.5.1. So add the following to your dependencies:
    <dependency>
      <groupId>org.apache.httpcomponents</groupId>
      <artifactId>httpclient</artifactId>
      <version>4.5.1</version>
    </dependency>
so that this library is packaged with your JAR. And of course, make sure you don't have another httpclient-x.x.x.jar in the same folder.
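To confirm whether the httpclient that actually ends up on the runtime classpath is new enough, a small reflection check can help. The following is only a sketch: HttpClientCheck and hasConstructor are made-up names, and the parameter types are taken from the signature in the NoSuchMethodError above. It looks up the class by name, so it compiles even without httpclient on the compile classpath.

```scala
import scala.util.Try

// Hypothetical helper (not part of Selenium, HtmlUnit or HttpClient).
object HttpClientCheck {
  // True if the named class can be loaded and declares a public
  // constructor with exactly these parameter types.
  def hasConstructor(className: String, paramTypes: Class[_]*): Boolean =
    Try(Class.forName(className).getConstructor(paramTypes: _*)).isSuccess

  def main(args: Array[String]): Unit = {
    // Parameter types copied from the signature in the stack trace:
    // (SSLContext, String[], String[], HostnameVerifier)
    val present = hasConstructor(
      "org.apache.http.conn.ssl.SSLConnectionSocketFactory",
      classOf[javax.net.ssl.SSLContext],
      classOf[Array[String]],
      classOf[Array[String]],
      classOf[javax.net.ssl.HostnameVerifier])
    println(s"4-argument SSLConnectionSocketFactory constructor present: $present")
  }
}
```

If this prints false when launched through spark-submit but true when run from the IDE, the two environments are most likely resolving different httpclient versions.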
Answer 2
It is a transitive dependency problem. It means that several jars on your classpath contain this class, and at least one of them has a version of the class in which this method is not present. By default, Maven loads whichever version of the class it gets first. You can look up the class by pressing Ctrl+Shift+T (Open Type in Eclipse), where you will see multiple versions of that class. Once you find the version that does not have the method, add an exclusion in pom.xml for the dependency that brings in the jar containing it.
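If you are not sure which jar wins at runtime, you can also ask the JVM where it loaded the class from. This is a rough sketch using only standard Java reflection; ClassOrigin and locate are hypothetical names chosen for illustration.

```scala
// Hypothetical helper: reports where a class was loaded from.
object ClassOrigin {
  // Returns the code source location (usually a jar path) of the class,
  // or None if the class is missing or has no code source
  // (as can happen for classes loaded by the bootstrap class loader).
  def locate(className: String): Option[String] =
    try {
      val cls = Class.forName(className)
      for {
        domain <- Option(cls.getProtectionDomain)
        source <- Option(domain.getCodeSource)
        url    <- Option(source.getLocation)
      } yield url.toString
    } catch {
      case _: ClassNotFoundException => None
    }

  def main(args: Array[String]): Unit = {
    val name = "org.apache.http.conn.ssl.SSLConnectionSocketFactory"
    println(s"$name loaded from: ${locate(name).getOrElse("<not found or unknown>")}")
  }
}
```

Calling ClassOrigin.locate from inside the job submitted with spark-submit should show whether the class comes from your jar-with-dependencies or from some other jar on the classpath, which tells you where the exclusion is needed.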