Cloudera Installation Paths
There are 3 paths by which a Cloudera cluster can be installed and configured.
- PATH A – Automated Installation
- PATH B – Using Cloudera Manager with local Parcel/Packages
- PATH C – Manual Installation using tarballs
Cloudera Installation Phases
Whichever installation path you choose, the 6 phases below have to be completed to finish the CDH cluster installation.
- Installing JDK
- Setting up the databases
- Installing Cloudera Manager Server
- Installing Cloudera Manager Agents
- Installing CDH and Managed Services Packages
- Creating, configuring and starting CDH & Managed Services
The first 2 phases, i.e. installing the JDK and setting up the databases, need to be done manually in paths B and C; in path A the installer takes care of them as well. Let’s discuss each installation path in a bit more detail.
1. PATH A – Automated Installation
This is one of the easiest methods of installation, as all 6 phases are completed by the Cloudera installer itself and no manual work is required. This path is generally not used in production setups, though; it is suited to quick installations for testing and development purposes.
2. PATH B – Using Cloudera Manager with local Parcel/Packages
In this path of installation, phases 3 through 6 are done manually as well. This is the preferred path for setting up a CDH cluster in production environments. Cloudera Manager Server, the Agents and the managed service packages are installed using the system’s package manager (yum, for example).
3. PATH C – Manual Installation using tarballs
In this path of installation, all the phases have to be done manually using the tarballs from the Cloudera repository page. Unlike the 2nd path, here we cannot use the system’s package manager (such as yum) to install Cloudera Manager Server, the Agents, etc.
For the sake of the exam, and because it is the most widely used installation method in production environments, we will outline the steps to install Cloudera Manager Server and Agents using PATH B.
Install Cloudera Manager server and agents
As shown in the figure and the table below, we have to install the respective packages on the master and worker nodes.
Packages | Master node | Worker Nodes |
---|---|---|
cloudera-manager-agent.x86_64 | ✔ | ✔ |
cloudera-manager-daemons.x86_64 | ✔ | ✔ |
cloudera-manager-server.x86_64 | ✔ | ✘ |
cloudera-manager-server-db-2.x86_64 | ✘ | ✘ |
oracle-j2sdk1.7.x86_64 | ✔ | ✔ |
The package “cloudera-manager-server-db-2.x86_64” is only needed when we do not want a custom DB instance and instead let Cloudera create its own built-in, ready-to-use database instance for the Cloudera Manager Server. In production environments, we generally do not use this package and build our own DB instance on a separate server.
Installing Cloudera Manager Server
We will install Cloudera Manager Server on the “master” node. Install the required packages with yum as shown below:
# yum install cloudera-manager-agent.x86_64 cloudera-manager-daemons.x86_64 cloudera-manager-server.x86_64 oracle-j2sdk1.7.x86_64
Installing Cloudera Manager Agents
On the rest of the cluster worker nodes we will install the Cloudera Manager Agents along with the other required packages.
# yum install cloudera-manager-agent.x86_64 cloudera-manager-daemons.x86_64 oracle-j2sdk1.7.x86_64
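The package split from the table above can also be captured in a small helper, so that the same script can be used on every node. This is only a sketch: the function name `packages_for_role` and the role names `master` and `worker` are made up for illustration, and the embedded `cloudera-manager-server-db-2` package is left out because we use our own database. The loop only prints the yum commands instead of running them.

```shell
# Print the packages a node needs, based on the package table above.
# "master" and "worker" are hypothetical role names used only in this sketch.
# The body runs in a subshell "( ... )" so its variables do not leak out.
packages_for_role() (
    common="cloudera-manager-agent.x86_64 cloudera-manager-daemons.x86_64 oracle-j2sdk1.7.x86_64"
    case "$1" in
        master) echo "$common cloudera-manager-server.x86_64" ;;
        worker) echo "$common" ;;
        *)      echo "unknown role: $1" >&2; exit 1 ;;
    esac
)

# Dry run: show the command for each role instead of executing it.
for role in master worker; do
    echo "yum install -y $(packages_for_role "$role")"
done
```

Once the printed commands look right, the real installation on a worker would be `yum install -y $(packages_for_role worker)`, run as root.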
Setting up the JAVA_HOME variable on master node
Once Cloudera Manager Server and the Agents are installed, we need to set JAVA_HOME on the master node in the configuration file “/etc/default/cloudera-scm-server”. Make sure the JDK is actually located in the “/usr/java/jdk1.7.0_67-cloudera” directory.
# vi /etc/default/cloudera-scm-server
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
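If you would rather not open an editor, the same change can be scripted. Below is a minimal sketch, assuming the file holds at most one line starting with `export JAVA_HOME=`; the helper name `set_java_home` is made up for this post. The demo at the end works on a scratch file, not on the real configuration.

```shell
# Set (or replace) the JAVA_HOME export in a config file without an editor.
# The body runs in a subshell "( ... )" so its variables do not leak out.
set_java_home() (
    conf_file=$1
    java_home=$2
    tmp_file=$(mktemp) || exit 1
    # Keep every line except an existing JAVA_HOME export (the file may not exist yet).
    grep -v '^export JAVA_HOME=' "$conf_file" > "$tmp_file" 2>/dev/null || true
    printf 'export JAVA_HOME=%s\n' "$java_home" >> "$tmp_file"
    # Write back with cat so an existing file keeps its owner and permissions.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
)

# Try it on a scratch file first.
scratch=$(mktemp)
echo 'export JAVA_HOME=/old/jdk' > "$scratch"
set_java_home "$scratch" /usr/java/jdk1.7.0_67-cloudera
cat "$scratch"
rm -f "$scratch"
```

On the master node the real call would be `set_java_home /etc/default/cloudera-scm-server /usr/java/jdk1.7.0_67-cloudera`, run as root.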
Setting up the Cloudera Manager Database (MariaDB)
As we are following PATH B, we need to install and configure the Cloudera Manager Database manually. We will install MariaDB (which replaces MySQL in CentOS/RHEL 7) and use it as the Cloudera Manager Database. We are using the master node to host MariaDB, but in a real production environment a separate server is recommended for this. Follow the steps outlined below:
1. Install the required MariaDB packages on the master node. We also need the MySQL connector package for remote JDBC connectivity to the database.
# yum install mariadb mariadb-server mysql-connector-java
2. Start the MariaDB service and enable it to auto-start upon boot.
# systemctl start mariadb
# systemctl enable mariadb
3. To verify that the installation went through properly, we will connect to the MariaDB database and list the default databases. By default, the root user does not have a password set in MariaDB, which is fine for our lab setup.
# mysql -u root
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 2
Server version: 5.5.60-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| test               |
+--------------------+
4 rows in set (0.00 sec)
MariaDB [(none)]>
4. The next step is to create a temporary database user (tmp) and make it a superuser, so that it has all grants on all database objects. We will use this user later on for configuring the Cloudera Manager Server. The password for the user is set to “password”. This user can be deleted once the Cloudera Manager Server configuration is completed.
GRANT ALL ON *.* TO 'tmp'@'master.localdomain' IDENTIFIED BY 'password' WITH GRANT OPTION;
5. Verify if you can log in with the “tmp” user.
# mysql -u tmp -h master.localdomain -p
Enter password:
Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 3
Server version: 5.5.60-MariaDB MariaDB Server
Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
MariaDB [(none)]> show grants;
+------------------------------------------------------------------------------------------------------------------------------------------------+
| Grants for tmp@master.localdomain                                                                                                              |
+------------------------------------------------------------------------------------------------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'tmp'@'master.localdomain' IDENTIFIED BY PASSWORD '*2470C0C06DEE42FD1618BB99005ADCA2EC9D1E19' WITH GRANT OPTION |
+------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)
MariaDB [(none)]>
Configuring the Cloudera Manager Database
1. Now that we have the MariaDB database set up, we can configure it as the Cloudera Manager Database. Cloudera ships a script that prepares the external database, as shown below:
[root@master ~]# /usr/share/cmf/schema/scm_prepare_database.sh mysql -h master.localdomain -utmp -ppassword --scm-host master.localdomain scm scm scm
JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
Verifying that we can write to /etc/cloudera-scm-server
Creating SCM configuration file in /etc/cloudera-scm-server
Executing: /usr/java/jdk1.7.0_67-cloudera/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
Here,
-h master.localdomain – The database host. The default is to connect locally.
-u tmp – Database username that has privileges for creating users and grants. The default is ‘root’ for a MariaDB/MySQL database.
-p password – Database password. The default is no password.
--scm-host master.localdomain – The SCM server’s hostname.
scm (1st) – name of the database to be created.
scm (2nd) – username for the scm database.
scm (3rd) – password for the scm user.
2. You can verify that the new database “scm” now exists in MariaDB.
[root@master ~]# mysql -u tmp -h master.localdomain -p
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| scm                |
+--------------------+
4 rows in set (0.00 sec)
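The order of the arguments to scm_prepare_database.sh is easy to get wrong, so it can help to assemble the call from named variables. Here is a minimal sketch using the lab values from this post; the variable names and the function `scm_prepare_cmd` are made up for illustration, and the function only prints the command so that it can be reviewed before running it as root.

```shell
# Lab values from this post; the variable names are made up for this sketch.
DB_HOST=master.localdomain     # -h: host running MariaDB
DB_ADMIN_USER=tmp              # -u: user allowed to create databases and users
DB_ADMIN_PASS=password         # -p: password of that user
SCM_HOST=master.localdomain    # --scm-host: host running Cloudera Manager Server
SCM_DB=scm                     # 1st positional argument: database to create
SCM_USER=scm                   # 2nd positional argument: user for that database
SCM_PASS=scm                   # 3rd positional argument: password for that user

# Print the full command line, in the same order as the call shown above.
scm_prepare_cmd() {
    echo "/usr/share/cmf/schema/scm_prepare_database.sh mysql" \
         "-h $DB_HOST -u$DB_ADMIN_USER -p$DB_ADMIN_PASS" \
         "--scm-host $SCM_HOST $SCM_DB $SCM_USER $SCM_PASS"
}

scm_prepare_cmd
```

Keeping the values in variables also makes it obvious which of the three “scm” arguments is the database name, which the user and which the password.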
Configuring Cloudera Manager Agents
We need to set the hostname of the Cloudera Manager Server in the agent configuration file “/etc/cloudera-scm-agent/config.ini” on every host, including the master node.
# vi /etc/cloudera-scm-agent/config.ini
server_host=master.localdomain
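With more than a handful of worker nodes, editing this file by hand gets tedious. The sketch below rewrites the `server_host=` line in place; the helper name `set_server_host` is made up, and the scratch file in the demo only mimics what the relevant part of config.ini is assumed to look like.

```shell
# Point an agent config file at the Cloudera Manager host.
# Fails if the file has no server_host= line, rather than guessing where to add one.
# The body runs in a subshell "( ... )" so its variables do not leak out.
set_server_host() (
    conf_file=$1
    manager_host=$2
    grep -q '^server_host=' "$conf_file" || exit 1
    tmp_file=$(mktemp) || exit 1
    sed "s|^server_host=.*|server_host=$manager_host|" "$conf_file" > "$tmp_file"
    # Write back with cat so the file keeps its owner and permissions.
    cat "$tmp_file" > "$conf_file"
    rm -f "$tmp_file"
)

# Try it on a scratch file first (contents are an assumed excerpt of config.ini).
scratch=$(mktemp)
printf '[General]\nserver_host=localhost\nserver_port=7182\n' > "$scratch"
set_server_host "$scratch" master.localdomain
cat "$scratch"
rm -f "$scratch"
```

On each host the real call would be `set_server_host /etc/cloudera-scm-agent/config.ini master.localdomain`, run as root.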
Start Cloudera Manager Server and Cloudera Manager Agents
1. Start the Cloudera Manager Agents on all hosts including master. Also enable the agent to auto-start on boot.
# systemctl start cloudera-scm-agent
# systemctl enable cloudera-scm-agent
2. Similarly, start the Cloudera Manager Server on the master node and enable it to auto-start on boot.
# systemctl start cloudera-scm-server
# systemctl enable cloudera-scm-server
3. You can now check the MariaDB database “scm” and verify the new tables that are created as part of the CM Server/Agent configuration.
# mysql -u tmp -h master.localdomain -p
MariaDB [(none)]> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| scm                |
+--------------------+
4 rows in set (0.00 sec)
MariaDB [(none)]> use scm;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Database changed
MariaDB [scm]> show tables;
+--------------------------------+
| Tables_in_scm                  |
+--------------------------------+
| AUDITS                         |
| CLIENT_CONFIGS                 |
| CLIENT_CONFIGS_TO_HOSTS        |
| CLUSTERS                       |
.....
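Once these checks pass, the temporary “tmp” user from the database setup is no longer needed. A minimal sketch of the cleanup follows, assuming the account was created exactly as 'tmp'@'master.localdomain' and that root can still log in locally without a password, as in this lab. The function name `drop_user_sql` is made up, and it only prints the SQL so that it can be reviewed first.

```shell
# Print the SQL that removes a MariaDB account (user and host are arguments).
drop_user_sql() {
    printf "DROP USER '%s'@'%s';\n" "$1" "$2"
}

drop_user_sql tmp master.localdomain
```

To actually remove the user, pipe the statement to the client on the master node: `drop_user_sql tmp master.localdomain | mysql -u root`.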
Cloudera Manager Web UI
To access the Cloudera Manager Web UI, point your browser to “http://[manager_host]:7180”. The default username and password are admin:admin.
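The server may need a little while after starting before the Web UI answers, so a small polling loop can save some browser refreshing. This is only a sketch: the function names `cm_url` and `wait_for_cm` are made up, the 30 attempts and the 10 second pause are arbitrary choices, and it assumes curl is installed on the machine you run it from.

```shell
# Build the Web UI address; 7180 is the port mentioned above.
cm_url() {
    echo "http://$1:${2:-7180}"
}

# Poll the URL until it answers or the attempts run out (30 tries, 10 s apart).
wait_for_cm() {
    url=$(cm_url "$1")
    tries=0
    while [ "$tries" -lt 30 ]; do
        # curl exits with 0 as soon as the server returns any HTTP response.
        if curl -s -o /dev/null --max-time 5 "$url"; then
            echo "Cloudera Manager is answering at $url"
            return 0
        fi
        tries=$((tries + 1))
        sleep 10
    done
    echo "No answer from $url" >&2
    return 1
}
```

For our lab the call would be `wait_for_cm master.localdomain`.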
We will cover the CDH installation process in the next post. I hope the post was informative. Stay tuned for the rest of the exam objectives.