Monday, May 25, 2015

Ambari fails with "Fail: Applying File['/etc/hadoop/conf/hadoop-env.sh'] failed, parent directory /etc/hadoop/conf doesn't exist"

Environment:
PHD 3.0
Ambari 1.7.1

Problem:
Pivotal Hadoop deploy failed with the below error:
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 123, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/stacks/PHD/2.0.6/hooks/before-ANY/scripts/hook.py", line 31, in hook
    setup_hadoop_env()
 ...
    raise Fail("Applying %s failed, parent directory %s doesn't exist" % (self.resource, dirname))
Fail: Applying File['/etc/hadoop/conf/hadoop-env.sh'] failed, parent directory /etc/hadoop/conf doesn't exist
Error: Error: Unable to run the custom hook script ['/usr/bin/python2.6', '/var/lib/ambari-agent/cache/stacks/PHD/2.0.6/hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-51.json', '/var/lib/ambari-agent/cache/stacks/PHD/2.0.6/hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-51.json', 'INFO', '/var/lib/ambari-agent/data/tmp']

Troubleshooting / Cause:
As suggested in the error, deploy fails to create a symbolic link from /etc/hadoop/conf.empty to /etc/hadoop/conf. However, in this specific case, the problem lied in the steps prior to this stage in deployment.
Before the symbolic is created, users and groups are created / modified, which based on the logs appeared to be completed successful, but were not existing on the server.
In order to add users / groups, the script "/usr/lib/python2.6/site-packages/resource_management/core/providers/accounts.py" is invoked which in-turns call /usr/lib/python2.6/site-packages/resource_management/core/shell.py and creates the command string to create the users or groups. It uses the below command add the users in a non interactive mode (--login).
/bin/bash --login -c useradd -m -G hadoop apple
{Note: Based on some conditions, it can also use command similar to: su -s /bin/bash - user -c "command". Please check script for details.}
At the server, there was a policy defined in the .bash_profile in which it prompts the user to enter their username before login for auditing purpose. However, the deploy steps was not capturing this issue and continuing with deploy and failing ultimately. Below is a snippet of the prompt which was configured.
                         WARNING!!! WARNING!!! WARNING!!!
        Entering an id other than your own is grounds for termination!!!!
 
Enter your user id :******
After disabling the auditing mechanism in .bash_profile, users were created successfuly and deploy was successful. If you run into a similar issue, verify if there are any kind of banner or prompts which requests an input as in this case.

No comments:

Post a Comment