
          MapReduce

          FAQs About Configuration

          How do I configure a cluster to terminate automatically after its steps finish running?

          When creating the cluster, enable the "Automatic Termination" option; the cluster will then terminate automatically after the last step completes.

          Must I wait for a running step to complete, or can BMR stop it manually?

          No, you do not need to wait. A step in the step flow can be stopped manually from the job list page in the console.

          Does the cluster support login over the public network?

          Yes. While the cluster is running, you can connect to the Master node over Secure Shell (SSH) and interact with the cluster using the Master node's public IP address, login username and password, and SSH port number.

          • In a Linux environment: Connect to the Master node with the ssh [username]@[eip] -p [port number] command.
          • In a Windows environment: Log in with an SSH client (PuTTY, SecureCRT, Xshell, etc.).
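As a minimal sketch of the Linux command above (the username, EIP, and port here are placeholder values; substitute your cluster's actual login user, Master node public IP, and SSH port):

```shell
# Placeholder values -- replace with your cluster's credentials.
user="root"          # hypothetical login username
eip="180.76.0.1"     # hypothetical Master node public IP (EIP)
port=22              # SSH port (22 unless changed)

# Connect to the Master node; you will be prompted for the password.
ssh "${user}@${eip}" -p "${port}"
```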

          What are the usage differences between Core nodes and Task nodes? How do I choose between them?

          1. Difference between Core and Task nodes: A Core node stores data and runs both DataNode and NodeManager, while a Task node runs only NodeManager. Because a Task node deploys no HDFS, scaling it in or out carries no risk of losing HDFS data replicas, which makes it better suited for dynamically adjusting the cluster's compute capacity.
          2. How to choose: For example, to process 10 GB of data with a 128 MB split size, there are 80 map tasks in total. If one compute node (for example, a memory-optimized instance) is expected to run 8 tasks, you need 10 compute nodes to finish as quickly as possible; with 5 Core nodes configured, you would then add 5 Task nodes. For the number of Core nodes: if the cluster uses HDFS, plan the Core node count according to the size of the data in HDFS (typically the case for a long-running cluster); if BOS is the data source and HDFS is not used in the cluster, the minimum number of Core nodes suffices (typically the case for an on-demand cluster). The above example is for reference only.
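The sizing arithmetic in the example above can be sketched as a small shell calculation (the data size, split size, and tasks-per-node figures come from the example; they are illustrative, not prescriptive):

```shell
# Sizing sketch: how many compute nodes for the example workload.
data_mb=$((10 * 1024))   # 10 GB of input data, in MB
split_mb=128             # split (HDFS block) size in MB

maps=$(( data_mb / split_mb ))                            # 80 map tasks
maps_per_node=8                                           # tasks one node is expected to run
nodes=$(( (maps + maps_per_node - 1) / maps_per_node ))   # round up: 10 compute nodes

core_nodes=5
task_nodes=$(( nodes - core_nodes ))                      # 5 Task nodes to add

echo "${maps} map tasks -> ${nodes} compute nodes (${core_nodes} Core + ${task_nodes} Task)"
```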