Use the instructions in this section to configure Full-Stack HA failover resiliency for the HDP clients.
Note: Your Hadoop configuration directories are defined during the HDP installation. For details, see Setting Up Hadoop Configuration.
Step 1: Edit the $HADOOP_CONF_DIR/hdfs-site.xml file to add the following properties:
Enable the HDFS client retry policy.
<property>
<name>dfs.client.retry.policy.enabled</name>
<value>true</value>
<description>Enables HDFS client retry in case of NameNode failure.</description>
</property>
Configure protection for the NameNode edit log.
<property>
<name>dfs.namenode.edits.toleration.length</name>
<value>8192</value>
<description>Prevents corruption of the NameNode edit log.</description>
</property>
Configure the safe mode extension time.
<property>
<name>dfs.safemode.extension</name>
<value>10</value>
<description>The default value (30 seconds) is applicable for very large clusters. For small to large clusters (up to 200 nodes), the recommended value is 10 seconds.</description>
</property>
Ensure that the allocated DFS blocks persist across multiple failovers.
<property>
<name>dfs.persist.blocks</name>
<value>true</value>
<description>Ensures that the allocated DFS blocks persist across multiple failovers.</description>
</property>
Configure the delay for the first block report.
<property>
<name>dfs.blockreport.initialDelay</name>
<value>10</value>
<description>Delay (in seconds) for the first block report.</description>
</property>
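After editing, it can help to confirm that the file actually contains the Step 1 values. The following is a minimal sketch rather than an HDP tool: it uses only the Python standard library, the function names `read_properties` and `find_mismatches` are hypothetical, and the expected values are copied from the properties listed in Step 1.

```python
import xml.etree.ElementTree as ET

# Expected values copied from the Step 1 properties (hdfs-site.xml).
EXPECTED_HDFS = {
    "dfs.client.retry.policy.enabled": "true",
    "dfs.namenode.edits.toleration.length": "8192",
    "dfs.safemode.extension": "10",
    "dfs.persist.blocks": "true",
    "dfs.blockreport.initialDelay": "10",
}


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def find_mismatches(xml_text, expected):
    """Return {name: (expected, actual)} for properties that are missing or differ.

    A missing property is reported with an actual value of None.
    """
    actual = read_properties(xml_text)
    return {
        name: (want, actual.get(name))
        for name, want in expected.items()
        if actual.get(name) != want
    }
```

Passing the contents of `$HADOOP_CONF_DIR/hdfs-site.xml` together with `EXPECTED_HDFS` to `find_mismatches` returns an empty dictionary when every Step 1 property is present with the listed value.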
Step 2: Modify the following property in the $HADOOP_CONF_DIR/core-site.xml file:
<property>
<name>fs.checkpoint.period</name>
<value>3600</value>
<description> The number of seconds between two periodic checkpoints.</description>
</property>
This ensures that a checkpoint is performed every hour (3600 seconds).
Step 3: Edit the $HADOOP_CONF_DIR/mapred-site.xml file to add the following properties:
Enable the JobTracker’s safe mode functionality.
<property>
<name>mapreduce.jt.hdfs.monitor.enable</name>
<value>true</value>
<description>Enable the JobTracker to go into safe mode when the NameNode is not responding.</description>
</property>
Enable retry for JobTracker clients (when the JobTracker is in safe mode).
<property>
<name>mapreduce.jobclient.retry.policy.enabled</name>
<value>true</value>
<description>Enable the MapReduce job client to retry job submission when the JobTracker is in safe mode.</description>
</property>
Enable recovery of the JobTracker’s queue after it is restarted.
<property>
<name>mapred.jobtracker.restart.recover</name>
<value>true</value>
<description>Enable the JobTracker to recover its queue after it is restarted.</description>
</property>
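If the same properties have to be applied on several client hosts, the edit can be scripted. The sketch below is one possible approach, not part of HDP: the helper name `upsert_properties` is hypothetical, and it assumes the file is a standard Hadoop configuration document with a single root element containing `<property>` entries. Note that `xml.etree.ElementTree` does not preserve comments or the XML declaration, so the result should be reviewed before it replaces the original file.

```python
import xml.etree.ElementTree as ET

# Properties from Step 3 (mapred-site.xml).
MAPRED_PROPS = {
    "mapreduce.jt.hdfs.monitor.enable": "true",
    "mapreduce.jobclient.retry.policy.enabled": "true",
    "mapred.jobtracker.restart.recover": "true",
}


def upsert_properties(xml_text, wanted):
    """Return the config XML with each wanted property set.

    Existing properties are updated in place; missing ones are appended
    to the root element. Running it twice gives the same output.
    """
    root = ET.fromstring(xml_text)
    # Index existing <property> elements by their <name> text.
    existing = {p.findtext("name", "").strip(): p for p in root.iter("property")}
    for name, value in wanted.items():
        prop = existing.get(name)
        if prop is None:
            prop = ET.SubElement(root, "property")
            ET.SubElement(prop, "name").text = name
            ET.SubElement(prop, "value").text = value
        else:
            value_el = prop.find("value")
            if value_el is None:
                value_el = ET.SubElement(prop, "value")
            value_el.text = value
    return ET.tostring(root, encoding="unicode")
```

The function works on strings rather than files so that the caller decides where the result is written, for example to a temporary file that is compared against the original before being moved into `$HADOOP_CONF_DIR`.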

