          Zeppelin

          Zeppelin Introduction

          Zeppelin is an interactive data analysis tool that supports Spark, SQL, and other data analysis engines. For more information, see the official Zeppelin website.

          This document introduces the basic use of SQL on Zeppelin by describing how to connect to and configure HiveServer2 in Zeppelin.

          Cluster Preparation

          Prepare the Baidu AI Cloud environment.

          1. Log in to the console (Baidu AI Cloud Login Platform), select "Product Service -> Baidu MapReduce BMR", and click "Create Cluster" to open the cluster creation page. Configure the following:

            • Set the cluster name.
            • Set the administrator password.
            • Disable logging. If you enable logging, you must select a BOS directory for log storage, and the bucket for that directory must already exist.
            • Select the image version "BMR 2.0 (hadoop 3.1)". Zeppelin is available only on BMR 2.0 and above.
            • Select the built-in template "zeppelin". Hive is selected by default. If you need Spark, select the Spark component manually.
            • High availability (HA) is enabled by default; you can disable it.
            • Keep the default cluster network and security settings.
            • Proceed to the next step, then select the machine configuration (at least 8 CPU cores and 16 GB of memory are recommended for the master node) and the number of machines for each group. The number of master nodes depends on whether HA mode was enabled in the previous step.

          Keep the other default settings, proceed to the next step, and click "Pay". The new cluster then appears on the cluster list page. The cluster has been created successfully when its status changes from "Initializing" to "Waiting".

          2. Access the cluster

            • Refer to Access Cluster to set up a network environment (ssh or openvpn) in which the local browser can reach the cluster.
            • Log in to the cluster master node and run the hostname command in the terminal to obtain the cluster's FQDN (hostname_master).
            • Enter $hostname_master:9995 in the browser to open the Zeppelin UI.
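The address used in the last step can be assembled from the master node's FQDN. The following is a minimal sketch, not part of the BMR tooling: the helper names `zeppelin_url` and `port_is_open` and the sample hostname are hypothetical, and the `http://` scheme is an assumption. Only port 9995 comes from the step above.

```python
import socket


def zeppelin_url(hostname_master: str, port: int = 9995) -> str:
    """Build the Zeppelin UI address from the master node's FQDN.

    The http:// scheme is an assumption; port 9995 is the one given in this guide.
    """
    return f"http://{hostname_master}:{port}"


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical FQDN; replace it with the output of `hostname` on the master node.
    master = "bmr-master.example.internal"
    print(zeppelin_url(master))
```

Calling `port_is_open(master, 9995)` from the local machine is a quick way to check that the ssh tunnel or openvpn connection actually reaches the cluster before opening the browser.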
          3. Use Zeppelin

            • The default login account and password are admin/admin.
            • Create a notebook named hive.

          [Screenshot: create_notebook.png]

          • Configure the key parameters. If you select the jdbc group, set the Hive driver, user, password, and JDBC connection URL.

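The four parameters named above can be written out as a small property map. This is a sketch under stated assumptions rather than the exact values used on a BMR cluster: the `hive.` prefix follows the prefix convention of Zeppelin's JDBC interpreter, port 10000 is the usual HiveServer2 default and may differ on your cluster, and the helper name, host, and user are hypothetical.

```python
def hive_jdbc_properties(hiveserver2_host: str, user: str, password: str = "",
                         port: int = 10000, database: str = "default") -> dict:
    """Return interpreter properties for a HiveServer2 connection.

    Assumptions: the "hive." property prefix and port 10000 are not taken
    from this guide; check them against your cluster's settings.
    """
    return {
        "hive.driver": "org.apache.hive.jdbc.HiveDriver",
        "hive.url": f"jdbc:hive2://{hiveserver2_host}:{port}/{database}",
        "hive.user": user,
        "hive.password": password,
    }


if __name__ == "__main__":
    # Hypothetical host and user; replace them with your cluster's values.
    props = hive_jdbc_properties("bmr-master.example.internal", "hive")
    for name, value in props.items():
        print(f"{name} = {value}")
```

Each printed name and value pair corresponds to one row to enter in the interpreter settings page.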

          • Execute commands

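As an illustration of what a notebook paragraph might contain, the sketch below builds paragraph text for a few common Hive statements. It assumes the interpreter is bound as `%jdbc(hive)`, which matches a JDBC interpreter configured with a `hive` prefix; if your notebook uses a different binding, change the `prefix` argument. The helper name `jdbc_paragraph` and the sample statements are hypothetical and are not taken from the original screenshots.

```python
def jdbc_paragraph(sql: str, prefix: str = "hive") -> str:
    """Return Zeppelin paragraph text that runs one SQL statement.

    The first line selects the interpreter; the statement follows,
    terminated by exactly one semicolon.
    """
    statement = sql.strip().rstrip(";").rstrip()
    return f"%jdbc({prefix})\n{statement};"


if __name__ == "__main__":
    # Sample statements only; any HiveQL accepted by HiveServer2 can be used.
    for sql in ("show databases", "show tables", "select 1"):
        print(jdbc_paragraph(sql))
        print()
```

Paste one generated paragraph into a notebook cell and run it; the result appears under the cell.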

          Reference Documentation:

          1. http://zeppelin.apache.org/docs/0.8.0/index.html
          2. https://zeppelin.apache.org/