I am running a Hadoop job that works fine in pseudo-distributed mode without YARN, but when I run it with YARN it gives me a ClassNotFoundException:
16/03/24 01:43:40 INFO mapreduce.Job: Task Id : attempt_1458775953882_0002_m_000003_1, Status : FAILED
Error: java.lang.RuntimeException: java.lang.ClassNotFoundException: Class com.hadoop.keyword.count.ItemMapper not found
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2195)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.getMapperClass(JobContextImpl.java:186)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:745)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassNotFoundException: Class com.hadoop.keyword.count.ItemMapper not found
    at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:2101)
    at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2193)
    ... 8 more
Here is the source code for the job:
Configuration conf = new Configuration();
conf.set("keywords", args[2]);

Job job = Job.getInstance(conf, "item count");
job.setJarByClass(ItemImpl.class);
job.setMapperClass(ItemMapper.class);
job.setReducerClass(ItemReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(IntWritable.class);
FileInputFormat.addInputPath(job, new Path(args[0]));
FileOutputFormat.setOutputPath(job, new Path(args[1]));
System.exit(job.waitForCompletion(true) ? 0 : 1);
Here is the command I am running:
hadoop jar ~/itemcount.jar /user/rohit/tweets /home/rohit/outputs/23mar-yarn13 vodka,wine,whisky
Edit: code after the suggested changes
package com.hadoop.keyword.count;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import org.json.simple.JSONObject;
import org.json.simple.parser.JSONParser;
import org.json.simple.parser.ParseException;

public class ItemImpl {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("keywords", args[2]);

        Job job = Job.getInstance(conf, "item count");
        job.setJarByClass(ItemImpl.class);
        job.setMapperClass(ItemMapper.class);
        job.setReducerClass(ItemReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

    public static class ItemMapper extends Mapper<Object, Text, Text, IntWritable> {

        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();
        JSONParser parser = new JSONParser();

        @Override
        public void map(Object key, Text value, Context output)
                throws IOException, InterruptedException {
            JSONObject tweetObject = null;
            String[] keywords = this.getKeyWords(output);

            try {
                tweetObject = (JSONObject) parser.parse(value.toString());
            } catch (ParseException e) {
                e.printStackTrace();
            }
            if (tweetObject != null) {
                String tweetText = (String) tweetObject.get("text");
                if (tweetText == null) {
                    return;
                }
                tweetText = tweetText.toLowerCase();
                for (String keyword : keywords) {
                    keyword = keyword.toLowerCase();
                    if (tweetText.contains(keyword)) {
                        output.write(new Text(keyword), one);
                    }
                }
                output.write(new Text("count"), one);
            }
        }

        String[] getKeyWords(Mapper<Object, Text, Text, IntWritable>.Context context) {
            Configuration conf = context.getConfiguration();
            String param = conf.get("keywords");
            return param.split(",");
        }
    }

    public static class ItemReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context output)
                throws IOException, InterruptedException {
            int wordCount = 0;
            for (IntWritable value : values) {
                wordCount += value.get();
            }
            output.write(key, new IntWritable(wordCount));
        }
    }
}
2 Answers
Answer 1
Can you check the contents of your itemcount.jar (jar -tvf itemcount.jar)? I faced this issue once, only to find that the .class file was missing from the jar.
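For example (my addition, assuming the jar sits at ~/itemcount.jar as in the question), you can confirm whether the mapper class was actually packaged:

jar -tvf ~/itemcount.jar | grep ItemMapper
# If the class was packaged, you should see an entry like:
#   com/hadoop/keyword/count/ItemImpl$ItemMapper.class
# No output means the class is missing from the jar, which would
# explain the ClassNotFoundException on the YARN side.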
Answer 2
I had the same error a few days ago.
- Changing my map and reduce classes to static fixed my problem.
- Make your map and reduce classes static nested classes (see the sketch after this list).
- Check the declarations of your map and reduce methods (input/output types and the @Override annotation).
- Check your jar command:
Old:
hadoop jar ~/itemcount.jar /user/rohit/tweets /home/rohit/outputs/23mar-yarn13 vodka,wine,whisky
New:
hadoop jar ~/itemcount.jar com.hadoop.keyword.count.ItemImpl /user/rohit/tweets /home/rohit/outputs/23mar-yarn13 vodka,wine,whisky
- Add the fully qualified main class (packageName.MainClass) after the jar file.
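To make the static-nested-class point concrete, here is a minimal sketch (my addition, not part of the original answer). Hadoop instantiates the mapper reflectively from its class name on the task side, and a non-static inner class cannot be constructed that way because it needs an enclosing instance:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class ItemImpl {

    // Broken: a non-static inner class carries a hidden reference to an
    // enclosing ItemImpl instance, so the framework's reflective
    // instantiation by class name alone fails at runtime.
    public class InnerMapper extends Mapper<Object, Text, Text, IntWritable> {
    }

    // Works: a static nested class has a plain no-argument constructor
    // that the framework can call.
    public static class ItemMapper extends Mapper<Object, Text, Text, IntWritable> {
    }
}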
Try-catch
try {
    tweetObject = (JSONObject) parser.parse(value.toString());
} catch (Exception e) { // change ParseException to Exception if you don't only expect a parse error
    e.printStackTrace();
    return; // return from the method in case of any error
}
Extend Configured and implement Tool
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class ItemImpl extends Configured implements Tool {

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new ItemImpl(), args);
        System.exit(res);
    }

    @Override
    public int run(String[] args) throws Exception {
        Job job = Job.getInstance(getConf(), "ItemImpl");
        job.setJarByClass(ItemImpl.class); // or setJarByClass(this.getClass()); one call is enough
        job.setMapperClass(ItemMapper.class);
        job.setReducerClass(ItemReducer.class);
        job.setMapOutputKeyClass(Text.class);          // probably not essential, but makes it certain and clear
        job.setMapOutputValueClass(IntWritable.class); // probably not essential, but makes it certain and clear
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
    }

    // add your public static ItemMapper class here
    // add your public static ItemReducer class here
}

I'm not an expert on this topic, but this implementation is from one of my working projects. Try it; if it doesn't work for you, I would suggest checking the libraries you added to your project.
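A side benefit of the Tool/ToolRunner route, worth noting (my addition): GenericOptionsParser processes generic flags before your run() method sees the arguments, so you can ship extra jars and set configuration from the command line. A hypothetical invocation, reusing the paths from the question (the json-simple jar name is assumed):

hadoop jar ~/itemcount.jar com.hadoop.keyword.count.ItemImpl \
    -libjars json-simple-1.1.1.jar \
    /user/rohit/tweets /home/rohit/outputs/23mar-yarn13 vodka,wine,whisky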
The first step will probably solve it, but if these steps don't work, share the code with us.