Monday, May 25, 2015

Symptom:
A MapReduce job fails with the following error:
Error: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for output/attempt_1426272186088_184484_m_000232_4/file.out
 at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:398)

Cause:
During execution, a MapReduce task stores its intermediate data in local directories. These directories are specified by the "mapreduce.cluster.local.dir" parameter (old deprecated name: mapred.local.dir) in mapred-site.xml.
While processing a job, the MapReduce framework scans the directories listed in mapreduce.cluster.local.dir and checks whether any of them has enough free space to create the intermediate file. If no directory has the required space, the job fails with the error shown above.
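For reference, the relevant setting in mapred-site.xml might look like the sketch below; the directory paths are placeholders and should be replaced with the disks actually mounted on your nodes:

```xml
<!-- mapred-site.xml: local directories for intermediate map/reduce data.
     The paths below are examples only; list one directory per physical
     disk, comma-separated, so spill files are spread across spindles. -->
<property>
  <name>mapreduce.cluster.local.dir</name>
  <value>/data/1/mapred/local,/data/2/mapred/local</value>
</property>
```

Listing a directory on every local disk both spreads I/O and gives the allocator more candidates when one disk fills up.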

Workaround:
1. Ensure the local directories have enough free space for the amount of data being processed.
2. Compress the intermediate map output to reduce space consumption.
3. If space and compression (the points above) do not resolve the issue, review the job itself: either improve its design or reduce the amount of data it processes.
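For point 2, intermediate (map-output) compression can be enabled in mapred-site.xml roughly as follows. SnappyCodec is shown here as an assumed choice; it requires the Snappy native libraries to be installed on the cluster nodes:

```xml
<!-- Compress intermediate map output to reduce local-disk usage -->
<property>
  <name>mapreduce.map.output.compress</name>
  <value>true</value>
</property>
<property>
  <name>mapreduce.map.output.compress.codec</name>
  <!-- Snappy trades compression ratio for speed; other codecs
       (e.g. GzipCodec) compress more but cost more CPU. -->
  <value>org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
```

These properties can also be set per job on the command line (e.g. via -D mapreduce.map.output.compress=true) instead of cluster-wide.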
