          MapReduce

          API Introduction

          Overview

          Baidu MapReduce (BMR) is a fully managed Hadoop/Spark cluster service that supports on-demand deployment and elastic scaling, and focuses on big data processing, analysis, and reporting. Baidu's operations team, which has years of experience with large-scale distributed computing, takes full responsibility for operating and maintaining the clusters.

          Baidu MapReduce supports the complete Hadoop ecosystem:

          Hadoop: Provides reliable storage through HDFS and the MapReduce programming paradigm for massively parallel data processing.

          Spark: Provides a distributed, in-memory, massively parallel processing framework that significantly improves big data analysis performance. Spark also provides a SQL query interface, stream processing, and machine learning.

          HBase: A massively distributed NoSQL database that provides random access to large volumes of unstructured and semi-structured data.

          Compared with a self-built Hadoop cluster, Baidu MapReduce offers the following advantages:

          Convenience: Create a cluster in a few minutes, without spending time provisioning, deploying, and tuning nodes.

          Elasticity: Create a cluster of any size and adjust it dynamically. For example, enlarge the cluster during peak periods to increase computing capacity, and shrink it during off-peak periods to reduce costs.

          Openness: Fully compatible with the open-source Hadoop/Spark community, so workloads can be migrated at zero cost.

          Cost-effectiveness: Pay-as-you-go and prepaid billing, with simple and transparent pricing.

          Security: A private VPC (Virtual Private Cloud) and a dedicated system environment ensure data security.

          Baidu MapReduce components

          [Figure: Baidu MapReduce components]

          If you are calling a Baidu AI Cloud product API for the first time, watch the API Introductory Video Guide to learn quickly how to call the API.

          Interface Overview

          This section summarizes the APIs available for BMR clusters. Click a link to see the details of the corresponding interface.

          Interface                            Description
          Cluster Operation Interface          Query the cluster list, query cluster information, create a cluster, and release a cluster
          Instance Group Operation Interface   Query the instance group list and modify the instance group configuration
          Instance Operation Interface         Query the instance list
          Step Operation Interface             Add a step, query the step list, and query step information

          Product Limitations

          System Limitations

          • Active clusters: At most 5 clusters can be active at the same time.
          • Steps per cluster: At most 256 steps can be submitted to a single cluster.
          • Nodes per cluster: By default, the Master instance group contains 1 instance (2 in HA mode), the Core instance group contains at least 2 and at most 20 instances, and the Task instance group contains at most 20 instances.
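
          The limits listed above can be checked on the client side before a cluster request is sent. The following is a minimal sketch in Python, not part of the BMR API or SDK: the names ClusterPlan and check_plan and all field names are hypothetical, and only the numeric limits come from this page.

```python
from dataclasses import dataclass
from typing import List

# Default system limits quoted in the list above.
MAX_ACTIVE_CLUSTERS = 5
MAX_STEPS_PER_CLUSTER = 256
CORE_MIN, CORE_MAX = 2, 20
TASK_MAX = 20


@dataclass
class ClusterPlan:
    """Hypothetical description of a planned cluster (not a BMR API type)."""
    master_count: int
    core_count: int
    task_count: int = 0
    ha_enabled: bool = False
    step_count: int = 0


def check_plan(plan: ClusterPlan, active_clusters: int = 0) -> List[str]:
    """Return the list of limit violations; an empty list means the plan fits."""
    problems = []
    # Creating this cluster adds one to the clusters that are already active.
    if active_clusters + 1 > MAX_ACTIVE_CLUSTERS:
        problems.append("more than %d active clusters" % MAX_ACTIVE_CLUSTERS)
    # The Master group has 1 instance by default and 2 in HA mode.
    expected_masters = 2 if plan.ha_enabled else 1
    if plan.master_count != expected_masters:
        problems.append("Master group must have %d instance(s)" % expected_masters)
    if not CORE_MIN <= plan.core_count <= CORE_MAX:
        problems.append("Core group must have %d to %d instances" % (CORE_MIN, CORE_MAX))
    if not 0 <= plan.task_count <= TASK_MAX:
        problems.append("Task group must have at most %d instances" % TASK_MAX)
    if plan.step_count > MAX_STEPS_PER_CLUSTER:
        problems.append("more than %d steps in one cluster" % MAX_STEPS_PER_CLUSTER)
    return problems


if __name__ == "__main__":
    plan = ClusterPlan(master_count=1, core_count=1, task_count=21)
    for problem in check_plan(plan, active_clusters=2):
        print(problem)
```

          A check of this kind only reflects the default limits stated here; the service itself remains the authority on whether a request is accepted.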