Note: This post is part of the HDPCA exam objective series. The Capacity Scheduler is designed mainly for multitenancy, where multiple organizations collectively fund the cluster based on their computing needs. An added benefit is that an organization can access any excess capacity not being used by others. This provides elasticity for the organizations […]
Hadoop
How to Create HDFS policies in Ranger
Note: This post is part of the HDPCA exam objective series. Apache Ranger is an application that enables data architects to implement security policies on a big data ecosystem. The goal of the project is to provide a unified way for all Hadoop applications to adhere to the defined security guidelines. Here […]
How to Configure Hive Authorization Using Apache Ranger
Note: This post is part of the HDPCA exam objective series. Apache Ranger is a framework for enabling, monitoring, and managing comprehensive data security across the Hadoop platform. Ranger helps a Hadoop admin with various security management tasks. It provides a mechanism to manage security from a single pane for various […]
HDPCA Exam Objective – Configure HiveServer2 HA (Part 1 – Installing HiveServer)
Note: This post is part of the HDPCA exam objective series. It is important to configure high availability in production so that if one of the HiveServer2 instances fails, the others can respond to client requests. This can be achieved by using the ZooKeeper discovery mechanism to point clients to the active Hive servers. […]
HDPCA Exam Objective – View an application’s log file (Troubleshoot a failed job)
Note: This post is part of the HDPCA exam objective series. Troubleshooting running or failed jobs is an integral part of Hadoop administration. To troubleshoot a running or failed job, we must view the application’s log file. This post focuses on the HDPCA exam objective “View an application’s log file”. We will […]
HDPCA Exam Objective – Configure and manage alerts
Note: This post is part of the HDPCA exam objective series. Monitoring the health of a Hadoop cluster is an important aspect of Hadoop administration. Ambari provides centralized management of health alerts and checks for the services in your cluster. You can set thresholds and enable or disable alerts using the Ambari UI. You […]
HDPCA Exam Objective – Install and configure Knox
Note: This post is part of the HDPCA exam objective series. Knox Basics: Knox Gateway is another Apache project that addresses the concern of secure access to the Hadoop cluster from corporate networks. Knox Gateway provides a single point of authentication and access for Apache Hadoop services in a cluster. Knox runs as a […]
HDPCA Exam Objective – Configure the Capacity Scheduler
Note: This post is part of the HDPCA exam objective series. YARN Schedulers: The Hadoop YARN scheduler is responsible for assigning resources to the applications submitted by users. There are three types of schedulers in YARN: First In, First Out (FIFO) (Hadoop 1.x), the Fair Scheduler, and the Capacity Scheduler. First In, First Out (FIFO): By default, […]
HDPCA Exam Objective – Configure HDFS ACLs
Note: This post is part of the HDPCA exam objective series. Starting with Hadoop 2.4, HDFS can be configured with ACLs. These ACLs work in much the same way as extended ACLs in a Unix environment. They allow files and directories in HDFS to carry more permissions than the basic POSIX permissions. To verify […]
HDPCA Exam Objective – Install and configure Ranger
Note: This post is part of the HDPCA exam objective series. Apache Ranger is a security framework that lets you define policies to control data access in Hadoop. It provides a web-based console that the system administrators of the Hadoop cluster can use to define and activate access policies. […]