Two problems occur in my MapReduce program:

Error 1: java.io.IOException: wrong value class: class org.apache.hadoop.io.IntWritable is not class org.apache.hadoop.io.Text
Error 2: java.lang.ArrayIndexOutOfBoundsException: 4
I've set the map output key and value classes as suggested in other posts, but still couldn't solve either problem. For the second problem, I specifically tested the parsing code from map, and it works correctly in a simple file-reading program.
For reference, this is the full output of problem 1:
Error: java.io.IOException: wrong value class: class org.apache.hadoop.io.IntWritable is not class org.apache.hadoop.io.Text
    at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:194)
    at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:1350)
    at peoplemail.DomainGenderCount$ReduceClass.reduce(DomainGenderCount.java:52)
    at peoplemail.DomainGenderCount$ReduceClass.reduce(DomainGenderCount.java:1)
    at org.apache.hadoop.mapred.Task$OldCombinerRunner.combine(Task.java:1615)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1637)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1489)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

And this is the full output of problem 2:
Error: java.lang.ArrayIndexOutOfBoundsException: 4
    at peoplemail.DomainGenderCount$MapClass.map(DomainGenderCount.java:34)
    at peoplemail.DomainGenderCount$MapClass.map(DomainGenderCount.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Data

This is a few lines of the CSV file I'm processing:
18,Daveen,Cupitt,dcupitth@last.fm,6288608483,Female
19,Marney,Eskell,meskelli@nifty.com,8164369834,Female
20,Teri,Yitzhak,tyitzhakj@bloglovin.com,2548784310,Female
21,Alain,Niblo,aniblok@howstuffworks.com,5195420924,Male
22,Vin,Creevy,vcreevyl@sfgate.com,8574528831,Female
23,Ermina,Pena,epenam@mediafire.com,2236545787,Female
24,Chrisy,Chue,cchuen@google.com,9455751444,Male
25,Morgen,Izakof,mizakofo@noaa.gov,8031181365,Male

MapClass
public static class MapClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter r) throws IOException {
        String fields[] = value.toString().split(",");
        String gender = fields[5];
        String domain = fields[3].split("@")[1];
        output.collect(new Text(domain), new Text(gender));
    }
}

ReduceClass
public static class ReduceClass extends MapReduceBase implements Reducer<Text, Text, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterator<Text> value, OutputCollector<Text, IntWritable> output, Reporter r) throws IOException {
        int count = 0;
        while (value.hasNext()) {
            value.next();
            count++;
        }
        output.collect(key, new IntWritable(count));
    }
}

run method
public int run(String[] paths) throws Exception {
    JobConf jobConf = new JobConf(getConf(), DomainGenderCount.class);
    jobConf.setMapOutputKeyClass(Text.class);
    jobConf.setMapOutputValueClass(Text.class);
    jobConf.setJobName("Number of Users in each domain:");
    jobConf.setOutputKeyClass(Text.class);
    jobConf.setOutputValueClass(IntWritable.class);
    jobConf.setMapperClass(MapClass.class);
    jobConf.setReducerClass(ReduceClass.class);
    jobConf.setCombinerClass(ReduceClass.class);
    FileInputFormat.setInputPaths(jobConf, new Path(paths[0]));
    FileOutputFormat.setOutputPath(jobConf, new Path(paths[1]));
    JobClient.runJob(jobConf);
    return 0;
}

This is my call to hadoop:
hadoop jar C:\Users\suman\Desktop\domaingendercount.jar /Data/people.csv /Data/Output/

I tested the input file with this small program:
package peoplemail;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class Test {
    public static void main(String[] args) throws IOException {
        File file = new File("C:\\Users\\suman\\Desktop\\people.csv");
        BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
        String line;
        while (null != (line = bufferedReader.readLine())) {
            String fields[] = line.split(",");
            String gender = fields[5];
            String domain = fields[3].split("@")[1];
            System.out.println(domain + " " + gender);
        }
        bufferedReader.close();
    }
}

This code ran correctly.
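As an extra sanity check, this minimal standalone snippet (illustrative only, reusing one sample row from above; not part of the job) shows which index holds each column after the split:

```java
public class FieldIndexDemo {
    public static void main(String[] args) {
        // One well-formed row from the sample data
        String line = "18,Daveen,Cupitt,dcupitth@last.fm,6288608483,Female";
        String[] fields = line.split(",");
        // Six fields, valid indices 0..5:
        // 0=id, 1=first name, 2=last name, 3=email, 4=phone, 5=gender
        System.out.println(fields.length);           // 6
        System.out.println(fields[3]);               // dcupitth@last.fm
        System.out.println(fields[3].split("@")[1]); // last.fm
        System.out.println(fields[5]);               // Female
    }
}
```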
These files contain all the code, data and output from hadoop.
1 Answer
The number in the exception message is the illegal index, so the map task read index 4 of an array with fewer elements: at least one line in your input splits into fewer columns than the six in your sample rows. A blank line, a header row, or a truncated record is enough to trigger this, and your standalone test only proves that the rows it actually read were well formed. Java arrays are zero-indexed, so a line must split into at least six fields before fields[5] (gender) is safe to read; guard against short lines before indexing.
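To see how this happens even when every visible row looks fine, consider a hypothetical truncated line (this snippet is illustrative; the record is made up, not from the original file):

```java
public class ShortLineDemo {
    public static void main(String[] args) {
        // A hypothetical truncated record with only four columns
        String[] truncated = "26,Jo,Bloggs,jbloggs@example.com".split(",");
        System.out.println(truncated.length); // 4 - only indices 0..3 are valid
        try {
            String gender = truncated[5]; // same access the mapper performs
            System.out.println(gender);
        } catch (ArrayIndexOutOfBoundsException e) {
            // This is the failure the map task reports
            System.out.println("ArrayIndexOutOfBoundsException caught");
        }
    }
}
```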
Here is a corrected mapper that skips malformed lines:
public static class MapClass extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
    @Override
    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter r) throws IOException {
        String fields[] = value.toString().split(",");
        // Skip blank, truncated, or malformed records instead of
        // indexing past the end of the array
        if (fields.length < 6 || !fields[3].contains("@")) {
            return;
        }
        String domain = fields[3].split("@")[1];
        String gender = fields[5];
        output.collect(new Text(domain), new Text(gender));
    }
}
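The first error (wrong value class) is separate from the mapper and is most likely caused by jobConf.setCombinerClass(ReduceClass.class). A combiner runs between map and reduce, so its output key/value types must match the map output types (Text, Text), but ReduceClass emits IntWritable values, which the intermediate spill writer rejects; that matches the OldCombinerRunner.combine and IFile$Writer.append frames in your first stack trace. A minimal sketch of the fix, assuming the rest of the run method stays as posted, is simply to not register the combiner:

```java
// Sketch: configure the job without reusing ReduceClass as a combiner.
// A combiner's output types must equal the map output types (Text/Text
// here); ReduceClass emits IntWritable values, so it cannot serve as
// a combiner for this job.
JobConf jobConf = new JobConf(getConf(), DomainGenderCount.class);
jobConf.setMapOutputKeyClass(Text.class);
jobConf.setMapOutputValueClass(Text.class);
jobConf.setOutputKeyClass(Text.class);
jobConf.setOutputValueClass(IntWritable.class);
jobConf.setMapperClass(MapClass.class);
jobConf.setReducerClass(ReduceClass.class);
// jobConf.setCombinerClass(ReduceClass.class);  // removed: type mismatch
```

If you do want map-side combining, you would need a separate Reducer<Text, Text, Text, Text> that emits its counts as Text, and a final reducer that parses and sums those values.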