Note: This post is part of the HDPCA exam objective series. Apache Ranger is a security framework that lets you define policies to control data access in Hadoop. It provides a web-based console that Hadoop cluster administrators can use to define and activate access policies. […]
Archives for July 2018
HDPCA Exam Objective – Recover a snapshot
Note: This post is part of the HDPCA exam objective series. We mentioned earlier that HDFS replication alone is not a suitable backup strategy. Hadoop 2 added snapshots to the filesystem, which bring another level of data protection to HDFS. As changes to the filesystem are made, any change that would affect […]
HDPCA Exam Objective – Create a snapshot of an HDFS directory
Note: This post is part of the HDPCA exam objective series. HDFS Snapshot: Even with a replication factor of 3, data can still be lost in a Hadoop cluster through human error or corruption. Hadoop 2.0 added the capability of taking a snapshot (read-only copy and copy-on-write) of the filesystem […]
HDPCA Exam Objective – Change the configuration of a service using Ambari
Note: This post is part of the HDPCA exam objective series. When you install an HDP cluster using Ambari, it selects a suitable default value for each configuration parameter of every service in the cluster. But you may need to change these default values. You can use Ambari to change these […]
HDPCA Exam Objective – Restart an HDP service
Note: This post is part of the HDPCA exam objective series. Ambari has made the job of a Hadoop admin much easier. With Ambari you can start, stop, or restart any service with the click of a button, and you can see exactly what is happening behind the scenes when you start or stop a service. You can also […]
HDPCA Exam Objective – Configure ResourceManager HA
Note: This post is part of the HDPCA exam objective series. In a Hadoop cluster, if the ResourceManager (RM) goes offline for any reason, all the jobs on the cluster will fail. In production, there will be critical jobs that might be running for a long time, and it does not make sense to start […]
HDPCA Exam Objective – Configure NameNode HA
Note: This post is part of the HDPCA exam objective series. NameNode High Availability Concepts: When the NameNode fails accidentally or is taken down for regular maintenance, the cluster becomes unavailable. This is a big problem for a production Hadoop cluster. In the cluster, there are two separate machines: the active state NameNode and […]
HDPCA Exam Objective – Add a new node to an existing cluster
Note: This post is part of the HDPCA exam objective series. Adding a new node to a live cluster is very similar to installing a new HDP cluster with Ambari. To add a new host, you must enable passwordless SSH from the ambari-server host to the new host. How to setup passwordless SSH login in […]
HDPCA Exam Objective – Create a home directory for a user and configure permissions
Note: This post is part of the HDPCA exam objective series. HDFS: HDFS (Hadoop Distributed File System) is the storage layer of the Hadoop cluster. It is a distributed filesystem, and it is very important for a Hadoop admin to know how to configure and manage HDFS inside out. For […]
HDPCA Exam Objective – Define and deploy a rack topology script
Note: This post is part of the HDPCA exam objective series. What is Rack Awareness? Rack awareness plays an important role in making sure that there is no single point of failure across the entire Hadoop infrastructure and that resource contention is distributed. Rack awareness is a […]
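As a rough illustration of the rack topology script named in the last entry, here is a minimal Python sketch. It relies on the standard Hadoop behavior that the script configured through `net.topology.script.file.name` receives one or more IP addresses or host names as arguments and writes one rack path per argument to standard output. The addresses, rack names, and the `resolve_racks` helper are hypothetical and are not taken from the post itself.

```python
#!/usr/bin/env python3
"""Sketch of a rack topology script; the host-to-rack table is made up."""
import sys

# Hypothetical mapping for illustration; a real cluster would supply its own.
RACK_MAP = {
    "10.0.1.11": "/rack1",
    "10.0.1.12": "/rack1",
    "10.0.2.21": "/rack2",
}
# Fallback rack for hosts that are missing from the table.
DEFAULT_RACK = "/default-rack"


def resolve_racks(hosts, rack_map=RACK_MAP, default=DEFAULT_RACK):
    """Return one rack path per host, in the same order as the input."""
    return [rack_map.get(host, default) for host in hosts]


if __name__ == "__main__":
    # Hadoop passes the hosts as command-line arguments and reads the
    # rack paths back from standard output, one per argument.
    print("\n".join(resolve_racks(sys.argv[1:])))
```

Falling back to a default rack for unknown hosts keeps the script from returning fewer answers than the number of arguments it was given.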