Create PaddlePaddle Task
Last Updated: 2022-01-14
This topic describes how to create a PaddlePaddle task in the CCE console.
Prerequisites
- You have successfully installed the CCE AI Job Scheduler and CCE Deep Learning Frameworks Operator components. Otherwise, the cloud native AI feature is unavailable.
- If you are a sub-user, you can create tasks in a queue only if you are one of the users associated with that queue.
- Installing the CCE Deep Learning Frameworks Operator component also installs the PaddlePaddle deep learning framework.
Restrictions
- At present, PaddlePaddle tasks do not support GPU memory sharing.
Operation Steps
- Log in to the Baidu AI Cloud official website, and then enter the management console.
- Select “Product Service > Cloud Native > CCE”, and click CCE to enter the container engine management console.
- In the left navigation pane, click Cluster Management > Cluster List.
- On the cluster list page, click the target cluster name to enter the cluster management page.
- On the cluster management page, click Cloud Native AI > Task Management.
- On the task management page, click Create Task.
- In the basic information section, configure the task:
  - Task name: Enter a custom task name. The name can contain uppercase and lowercase letters, numbers, -, _, /, ., and other special characters. It must start with a Chinese character or a letter and be 1-65 characters long.
  - Queue: Select the queue to associate with the new task.
  - Framework: Select "PaddlePaddle" as the deep learning framework for the task.
- Complete the configuration by referring to the following YAML template:
    apiVersion: batch.paddlepaddle.org/v1
    kind: PaddleJob
    metadata:
      name: resnet
    spec:
      cleanPodPolicy: Never
      worker:
        replicas: 2
        template:
          spec:
            schedulerName: volcano
            containers:
              - name: resnet
                image: registry.baidubce.com/paddle-operator/demo-resnet:v1
                command:
                  - python
                args:
                  - "-m"
                  - "paddle.distributed.launch"
                  - "train_fleet.py"
                volumeMounts:
                  - mountPath: /dev/shm
                    name: dshm
                resources:
                  requests:
                    cpu: 1
                    memory: 2Gi
                  limits:
                    baidu.com/v100_16g_cgpu: "1"
            volumes:
              - name: dshm
                emptyDir:
                  medium: Memory
- Click “OK” to create the task.
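
The task-name rule in the basic information step can be checked locally before you fill in the form. The following Python sketch is only an illustration and is not part of the CCE console or any Baidu AI Cloud SDK. The function name `is_valid_task_name` is hypothetical. It assumes that "Chinese character" means the CJK Unified Ideographs range U+4E00 to U+9FFF, and that only the special characters listed explicitly in this document (-, _, /, .) are allowed; the console may accept others.

```python
import re

# Assumption: "Chinese character" is read as the CJK Unified Ideographs
# range U+4E00..U+9FFF. The console may accept a wider range.
_CJK = "\u4e00-\u9fff"

# First character: a letter or a Chinese character.
# Remaining 0-64 characters: letters, digits, Chinese characters, -, _, /, .
# Assumption: no special characters beyond the four listed are permitted.
_TASK_NAME = re.compile(rf"[A-Za-z{_CJK}][A-Za-z0-9{_CJK}\-_/.]{{0,64}}")


def is_valid_task_name(name: str) -> bool:
    """Return True if name satisfies the assumed task-name rule (1-65 chars)."""
    return _TASK_NAME.fullmatch(name) is not None
```

Under these assumptions, `is_valid_task_name("resnet")` returns True, while `"1resnet"` is rejected because it starts with a digit and a 66-character name is rejected because it exceeds the length limit.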