Deploy Cloudera CDP Data Center on Oracle Cloud Infrastructure (OCI)

Cloudera Data Platform (CDP) Data Center is a complete data platform that unifies the latest open-source analytics (Spark, Impala, Hive, HBase, Kafka, Hadoop, and more) into a multi-function analytics and data management system that features:

  • Higher performance SQL analytics
  • Real-time stream processing and management
  • Granular attribute-based access control
  • Dynamic column filtering and row masking
  • A 10x increase in file and object scalability

I used a Terraform (v0.12) module that deploys CDP Data Center on Oracle Cloud Infrastructure (OCI).

This template can target an existing VCN and its subnets for cluster deployment. To use this feature, select an existing VCN in the schema menu, then select the appropriate subnet for each cluster host type.

 

1. Download the zip file for the Cloudera Terraform deployment with Resource Manager.

2. Sign in to the Oracle Cloud Infrastructure (OCI) Console.

3. Select Resource Manager, then click Stacks.

 

 

4. Click Create Stack.

 

5. On the Stack Information page, upload the zip file that you downloaded in step 1, then click Next.
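The console upload in this step can also be scripted. The sketch below is one possible approach, not part of the template: it assumes the OCI Python SDK (the `oci` package) is installed and that a default profile exists in `~/.oci/config`. The function names and every argument value (zip path, compartment OCID, display name) are placeholders supplied by the caller. The SDK's zip-upload config source takes the archive as a base64 string, so a small helper encodes it first.

```python
import base64
from pathlib import Path


def encode_stack_zip(zip_path: str) -> str:
    """Return the zip file's bytes as a base64-encoded ASCII string."""
    return base64.b64encode(Path(zip_path).read_bytes()).decode("ascii")


def create_stack(zip_path: str, compartment_id: str, display_name: str) -> str:
    """Create a Resource Manager stack from a local zip and return its OCID.

    Assumption: the `oci` package is installed and ~/.oci/config holds a
    valid default profile. All argument values come from the caller.
    """
    import oci  # imported lazily so encode_stack_zip works without the SDK

    config = oci.config.from_file()
    client = oci.resource_manager.ResourceManagerClient(config)
    models = oci.resource_manager.models
    details = models.CreateStackDetails(
        compartment_id=compartment_id,
        display_name=display_name,
        config_source=models.CreateZipUploadConfigSourceDetails(
            zip_file_base64_encoded=encode_stack_zip(zip_path)
        ),
    )
    return client.create_stack(details).data.id
```

Only `encode_stack_zip` runs without credentials; `create_stack` contacts the tenancy and is worth trying in a test compartment first.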

6. On the Configure Variables page, enter the contents of an “SSH provided key” (optional). Note: If you deploy Cloudera Manager to a private subnet, you need a VPN or an SSH tunnel through an edge node to access cluster management. When you create a VCN, you must specify a range of IPv4 addresses for it as a Classless Inter-Domain Routing (CIDR) block, for example 10.0.0.0/16. Click Next.
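To make the CIDR notation in this step concrete, here is a minimal sketch using Python's standard `ipaddress` module. It takes the 10.0.0.0/16 example and carves one /24 subnet per host type out of it. The host type names and the /24 size are illustrative assumptions; the template's own subnet layout may differ.

```python
import ipaddress

# 10.0.0.0/16 is the example range from the step above.
vcn = ipaddress.ip_network("10.0.0.0/16")

# Hypothetical host types; the template's own subnet names may differ.
host_types = ["edge", "master", "worker"]

# Carve one /24 subnet per host type out of the network range.
subnets = dict(zip(host_types, vcn.subnets(new_prefix=24)))


def summarize(network):
    """Return (CIDR string, total address count, is-private flag)."""
    return str(network), network.num_addresses, network.is_private


print(summarize(vcn))
for name, net in subnets.items():
    print(name, summarize(net))
```

A /16 block holds 65,536 addresses and each /24 holds 256, so a range of this size leaves ample room for additional subnets.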

 

 

7. Verify your configuration and click Create.

 

9. From the Terraform Actions menu, select Plan.

 

10. Click Plan

 

11. Wait a few moments for this job to complete.
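When this wait is automated rather than watched in the console, it amounts to polling the job's state until it reaches a terminal value. The sketch below shows that loop in isolation. `get_state` is a hypothetical callable standing in for whatever fetches the state (for example, a wrapper around an SDK or CLI call), and the state names are assumed to follow Resource Manager's job lifecycle (ACCEPTED, IN_PROGRESS, SUCCEEDED, FAILED, CANCELING, CANCELED).

```python
import time
from typing import Callable

# Assumed terminal states for a Resource Manager job.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELED"}


def wait_for_job(get_state: Callable[[], str],
                 interval_s: float = 10.0,
                 timeout_s: float = 1800.0) -> str:
    """Poll get_state() until a terminal state is seen, then return it.

    Raises TimeoutError if no terminal state is reached within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {state} after {timeout_s} seconds")
        time.sleep(interval_s)


# Demonstration with a canned sequence of states instead of a real job.
fake_states = iter(["ACCEPTED", "IN_PROGRESS", "SUCCEEDED"])
print(wait_for_job(lambda: next(fake_states), interval_s=0))
```

Passing the state lookup in as a callable keeps the loop independent of any particular client library, which also makes it easy to exercise with canned values as shown.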

 

12. The job log reports that Terraform has been successfully initialized.

 

13. The job completes with the status Succeeded.

14. The diagram shows what this template typically deploys. The resources are automatically distributed among “Fault Domains” in an “Availability Domain” to ensure fault tolerance.
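The idea of spreading hosts across fault domains can be illustrated with a short sketch. This post does not show how the template itself places hosts, so the round-robin scheme below is an assumption chosen for simplicity, and the worker names are hypothetical. It relies only on the fact that an OCI availability domain contains three fault domains.

```python
from collections import defaultdict
from itertools import cycle

# An OCI availability domain contains three fault domains.
FAULT_DOMAINS = ["FAULT-DOMAIN-1", "FAULT-DOMAIN-2", "FAULT-DOMAIN-3"]


def spread_hosts(hostnames):
    """Assign hosts to fault domains in round-robin order."""
    placement = defaultdict(list)
    for host, fault_domain in zip(hostnames, cycle(FAULT_DOMAINS)):
        placement[fault_domain].append(host)
    return dict(placement)


# Hypothetical worker names, for illustration only.
workers = [f"worker-{i}" for i in range(1, 6)]
for fault_domain, hosts in spread_hosts(workers).items():
    print(fault_domain, hosts)
```

With this scheme, no fault domain holds more than one host above any other, so losing a single fault domain takes out at most about a third of the workers.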

 

 

 

References

Learn about deploying Hadoop on Oracle Cloud Infrastructure. Available at https://docs.oracle.com/en/solutions/learn-deploy-hadoop-oci/index.html#GUID-6BC025FF-829B-4BBD-9C80-69044F61F35B

Deploy Hadoop Easily on Oracle Cloud Infrastructure Using Resource Manager. Available at https://blogs.oracle.com/cloud-infrastructure/deploy-hadoop-easily-on-oracle-cloud-infrastructure-using-resource-manager

Cloudera on Oracle Cloud Infrastructure (Terraform deployment template). Available at https://github.com/oracle-quickstart/oci-cloudera

Overview of Resource Manager. Available at https://docs.cloud.oracle.com/en-us/iaas/Content/ResourceManager/Concepts/resourcemanager.htm

CDP Data Center. Available at https://docs.cloudera.com/cdp/latest/overview/topics/cdpdc-overview.html