Monday, May 25, 2015

Map Reduce job failed with "Unable to initialize any output collector"

Problem:
A MapReduce job or Hive query failed with the error below:
2015-04-24 11:41:41,861 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.io.IOException: Unable to initialize any output collector
 at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:412)
 at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:439)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)

Cause: 
The error "Unable to initialize any output collector" indicates that the map task could not set up its output collector once its container was running. There can be several reasons for this, so review the container logs in HDFS to identify the actual cause.
In this specific instance, mapreduce.task.io.sort.mb was set to a value greater than 2047 MB. The maximum value it allows is 2047 MB, so anything above that is rejected as invalid and causes the job to fail.
The container logs revealed the error below:
2015-04-24 11:41:41,858 WARN [main] org.apache.hadoop.mapred.MapTask: Unable to initialize MapOutputCollector org.apache.hadoop.mapred.MapTask$MapOutputBuffer
java.io.IOException: Invalid "mapreduce.task.io.sort.mb": 2048
 at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:975)
 at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:401)
Solution:
Set mapreduce.task.io.sort.mb to a value below 2048 MB (that is, 2047 or less).
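
To catch a bad value before submitting a job, the limit can be checked up front. The sketch below is a minimal, standalone illustration that does not use the Hadoop libraries; the class SortMbCheck and its methods are hypothetical names made up for this example. It assumes the check behind the "Invalid" message is a bitmask test against 0x7FF (2047), which matches the behaviour seen in the log above, and it also prints why 2048 is likely a problem: 2048 MB expressed in bytes no longer fits in a signed 32-bit int.

```java
import java.io.IOException;

// Hypothetical helper for illustration only; not part of Hadoop.
class SortMbCheck {
    // Assumption: values are accepted only if they fit in 11 bits (0x7FF == 2047).
    static final int MAX_SORT_MB = 0x7FF;

    // Returns true when sortMb is in the range 0..2047.
    static boolean isValid(int sortMb) {
        return (sortMb & MAX_SORT_MB) == sortMb;
    }

    // Returns sortMb unchanged, or throws with a message shaped like the one in the log.
    static int requireValid(int sortMb) throws IOException {
        if (!isValid(sortMb)) {
            throw new IOException("Invalid \"mapreduce.task.io.sort.mb\": " + sortMb);
        }
        return sortMb;
    }

    public static void main(String[] args) {
        for (int mb : new int[] {100, 2047, 2048}) {
            System.out.println(mb + " MB valid? " + isValid(mb));
        }
        // Shifting by 20 converts MB to bytes; for 2048 the result overflows an int.
        System.out.println("2048 << 20 = " + (2048 << 20));
    }
}
```

With this check, 2047 passes and 2048 is rejected, which is consistent with the value 2048 reported in the container log.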
