Sunday, August 19, 2018

Hadoop MapReduce Errors: Different Data Types


Two problems occur in my MapReduce program:

  1. java.io.IOException: wrong value class: class org.apache.hadoop.io.IntWritable is not class org.apache.hadoop.io.Text
  2. java.lang.ArrayIndexOutOfBoundsException: 4

I've set the map output key and value classes as suggested in other posts, but that didn't solve either problem. For the second problem, I specifically tested the parsing code from the map method in a simple file-reading program, and it worked correctly there.

For reference, this is the full output of problem 1:

Error: java.io.IOException: wrong value class: class org.apache.hadoop.io.IntWritable is not class org.apache.hadoop.io.Text
    at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:194)
    at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:1350)
    at peoplemail.DomainGenderCount$ReduceClass.reduce(DomainGenderCount.java:52)
    at peoplemail.DomainGenderCount$ReduceClass.reduce(DomainGenderCount.java:1)
    at org.apache.hadoop.mapred.Task$OldCombinerRunner.combine(Task.java:1615)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1637)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1489)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:460)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

and this is the full output of problem 2:

Error: java.lang.ArrayIndexOutOfBoundsException: 4
    at peoplemail.DomainGenderCount$MapClass.map(DomainGenderCount.java:34)
    at peoplemail.DomainGenderCount$MapClass.map(DomainGenderCount.java:1)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:453)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Data: these are a few lines of the CSV file I'm processing:

18,Daveen,Cupitt,dcupitth@last.fm,6288608483,Female
19,Marney,Eskell,meskelli@nifty.com,8164369834,Female
20,Teri,Yitzhak,tyitzhakj@bloglovin.com,2548784310,Female
21,Alain,Niblo,aniblok@howstuffworks.com,5195420924,Male
22,Vin,Creevy,vcreevyl@sfgate.com,8574528831,Female
23,Ermina,Pena,epenam@mediafire.com,2236545787,Female
24,Chrisy,Chue,cchuen@google.com,9455751444,Male
25,Morgen,Izakof,mizakofo@noaa.gov,8031181365,Male
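The per-line parsing the mapper relies on can be checked in plain Java (a minimal sketch; the class name and the hard-coded sample line are illustrative, not part of the original code):

```java
public class ParseDemo {
    public static void main(String[] args) {
        // One well-formed record from the CSV sample above
        String line = "18,Daveen,Cupitt,dcupitth@last.fm,6288608483,Female";

        String[] fields = line.split(",");          // 6 fields, indices 0..5
        String gender = fields[5];                  // last column
        String domain = fields[3].split("@")[1];    // part after '@' in the email

        System.out.println(domain + " " + gender);  // prints "last.fm Female"
    }
}
```

Each well-formed record therefore has six comma-separated fields; the mapper's `fields[5]` and `fields[3]` indices only make sense on lines of exactly this shape.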

MapClass

public static class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    @Override
    public void map(LongWritable key, Text value,
            OutputCollector<Text, Text> output, Reporter r) throws IOException {

        String fields[] = value.toString().split(",");
        String gender = fields[5];
        String domain = fields[3].split("@")[1];
        output.collect(new Text(domain), new Text(gender));
    }
}

ReduceClass

public static class ReduceClass extends MapReduceBase
        implements Reducer<Text, Text, Text, IntWritable> {

    @Override
    public void reduce(Text key, Iterator<Text> value,
            OutputCollector<Text, IntWritable> output, Reporter r) throws IOException {

        int count = 0;
        while (value.hasNext()) {
            value.next();
            count++;
        }
        output.collect(key, new IntWritable(count));
    }
}

run method

public int run(String[] paths) throws Exception {

    JobConf jobConf = new JobConf(getConf(), DomainGenderCount.class);

    jobConf.setMapOutputKeyClass(Text.class);
    jobConf.setMapOutputValueClass(Text.class);

    jobConf.setJobName("Number of Users in each domain:");

    jobConf.setOutputKeyClass(Text.class);
    jobConf.setOutputValueClass(IntWritable.class);

    jobConf.setMapperClass(MapClass.class);
    jobConf.setReducerClass(ReduceClass.class);
    jobConf.setCombinerClass(ReduceClass.class);

    FileInputFormat.setInputPaths(jobConf, new Path(paths[0]));
    FileOutputFormat.setOutputPath(jobConf, new Path(paths[1]));

    JobClient.runJob(jobConf);
    return 0;
}

This is my call to Hadoop:

hadoop jar C:\Users\suman\Desktop\domaingendercount.jar /Data/people.csv /Data/Output/ 

I tested the input file with this small program:

package peoplemail;

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class Test {

    public static void main(String[] args) throws IOException {
        File file = new File("C:\\Users\\suman\\Desktop\\people.csv");
        BufferedReader bufferedReader = new BufferedReader(new FileReader(file));
        String line;
        while (null != (line = bufferedReader.readLine())) {
            String fields[] = line.split(",");
            String gender = fields[5];
            String domain = fields[3].split("@")[1];
            System.out.println(domain + " " + gender);
        }
        bufferedReader.close();
    }
}

This code ran correctly.

These files contain all the code, data, and output from Hadoop:

DomainGenderCount.java

people.csv

output log

1 Answer

Answer 1

Java array indices start at 0, so a line must split into at least six fields for fields[5] to be valid. The ArrayIndexOutOfBoundsException means at least one line in your input splits into fewer fields than the mapper expects, so indexing past the end of the resulting array throws. A blank trailing line or a truncated record in the CSV is the usual culprit.
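The failure is easy to reproduce in plain Java: a truncated line splits into a short array, and indexing past its end throws. A self-contained sketch (the class name and the truncated sample line are illustrative):

```java
public class ShortLineDemo {
    public static void main(String[] args) {
        // A truncated record: only 4 of the expected 6 fields
        String shortLine = "24,Chrisy,Chue,cchuen@google.com";
        String[] fields = shortLine.split(",");
        System.out.println(fields.length);  // prints 4

        try {
            String gender = fields[5];      // index past the end of the array
            System.out.println(gender);
        } catch (ArrayIndexOutOfBoundsException e) {
            // The exception message reports the offending index
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```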

Here is the corrected mapper, with a guard that skips malformed lines instead of throwing:

public static class MapClass extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, Text> {

    @Override
    public void map(LongWritable key, Text value,
            OutputCollector<Text, Text> output, Reporter r) throws IOException {

        String fields[] = value.toString().split(",");
        // Skip blank or truncated lines rather than indexing past the array
        if (fields.length < 6) {
            return;
        }
        String domain = fields[3].split("@")[1];
        String gender = fields[5];
        output.collect(new Text(domain), new Text(gender));
    }
}
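Problem 1 is likely a separate issue. Reading the first stack trace, the IOException is raised inside OldCombinerRunner.combine, and the job registers ReduceClass as the combiner. A combiner's output is fed back into the shuffle as map output, so it must emit the same key/value types as the mapper; ReduceClass emits <Text, IntWritable> pairs while the map output value class is Text. This reading of the trace is an assumption on my part, not something stated in the original answer; a sketch of the corresponding change in run():

```java
// A combiner must emit the same key/value types as the map output
// (<Text, Text> here), because its output re-enters the shuffle.
// ReduceClass emits <Text, IntWritable>, hence "wrong value class"
// in IFile$Writer.append. The simplest fix is to not register it:

// jobConf.setCombinerClass(ReduceClass.class);  // remove this line
```

If counting in the combine step is desired, the map output value class would have to become IntWritable (e.g. emit a 1 per record and sum in both combiner and reducer).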
