
          Baidu Machine Learning

          AutoDL Job

          AutoDL is an automated deep learning product that applies advanced transfer learning and neural architecture search technology to train models on user-provided business data. The platform exposes a simple, easy-to-understand API: training, evaluating, and deploying a model takes only a few steps, giving users a one-stop experience. AutoDL suits a wide range of users: beginners only need to submit data to obtain a high-quality model, while experienced engineers can build further on the high-quality models the platform produces.

          Create a Job

          Select "Training > AutoDL Job" in the left navigation bar to enter the AutoDL job list page. Click the "Create job" button to start the job-creation flow.

          Image Sorting-Transfer Learning

          Transfer learning takes a convolutional neural network (CNN) that has already been trained and fine-tunes its last few layers on the user's training data, producing a model suited to that data while avoiding a long, resource-intensive training run from scratch. Image Sorting-Transfer Learning targets image datasets on the order of hundreds of images and can train a usable model in about 10 minutes. The training process is displayed, including the preprocessing results and the loss and accuracy metrics of the training stage. Users can deploy the resulting model directly as a prediction service.

          Configuration Instructions:

          Configuration name | Required | Description
          Job name | Yes | May contain only numbers, letters, `-`, or `_`; must start with a letter; fewer than 40 characters
          Algorithm or framework | Yes | Image Sorting-Transfer Learning
          Send SMS at the end of job | Yes | Whether to notify the user by SMS when the job completes
          Training data path | Yes | Path where the training data is stored. If a zip package is given, only that package is used for training; if a folder is given, all zip packages directly under it are used (subdirectories are not traversed, and only the first 10 packages are taken when there are more than 10). Only zip packages are currently supported
          Output path | Yes | Path where the model and logs are stored. After the job succeeds, the model is stored under path/{job_id}/model and the logs under path/{job_id}/log
          Computing resources | Yes | BML cluster or the user's own CCE cluster
          Resource package | Yes | Deep learning development card and other single-GPU packages
          Number of instances | Yes | 1
          Maximum running time | Yes | If the job reaches the maximum running time, BML automatically forces it to stop, which may cause the job to fail

          Notes for Training Data Format:


          • All prepared photos must be sorted into per-category folders, and all folders must be compressed into a single .zip archive.
          • If there are many photos, it is recommended to split them into multiple packages; at most 10 packages are supported for training.
          • If category folders in multiple packages share the same name, the system automatically merges their data into one category.
          • Category folders should be named with numbers, letters, and underscores. Chinese characters are not currently supported, and names must not contain spaces.
          • Photo requirements:
            (1) File extension: three common types are supported: jpg, jpeg, png
            (2) Photo size: currently unlimited
            (3) Number of photos per category: at least 20; note that the categories should be balanced in size for a better model
            (4) Number of categories: between 2 and 200
            (5) Total number of photos: at most 100,000
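          The per-category folder layout above can be packed into the required .zip archive with a short script; a minimal standard-library sketch (the directory and file names are illustrative, not from the platform):

```python
import zipfile
from pathlib import Path

def pack_dataset(data_dir, zip_path):
    """Compress per-category image folders into a single zip archive.

    Expects data_dir to contain one sub-folder per category, e.g.
    data_dir/cat/*.jpg, data_dir/dog/*.jpg.
    """
    data_dir = Path(data_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for class_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
            for image in sorted(class_dir.iterdir()):
                # Store each file as "<category_name>/<file_name>" in the archive
                zf.write(image, arcname=f"{class_dir.name}/{image.name}")

# Example: pack_dataset("my_images", "my_images.zip")
```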

          Example Configuration:

          The training data is the cifar10 dataset downloaded from the Internet. The developer converts the data into the input format expected by the Image Sorting-Transfer Learning algorithm (for the conversion code, see transfer_cifar10.py, which runs in a Python 2 environment and depends on the numpy and opencv libraries), i.e. the photos are split into one directory per category. As an example, the developer takes 100 photos from each category, 1,000 photos in total, for training. You can download and convert the data yourself, or directly use our public BOS data for training.
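          Taking a fixed number of photos per category, as described above, can be scripted as well; a hedged standard-library sketch (this is not the referenced transfer_cifar10.py, and the directory names are illustrative):

```python
import shutil
from pathlib import Path

def sample_per_class(src_dir, dst_dir, n=100):
    """Copy at most n images from each category folder in src_dir to dst_dir,
    preserving the one-folder-per-category layout."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for class_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        out = dst_dir / class_dir.name
        out.mkdir(parents=True, exist_ok=True)
        # Sort for a deterministic selection, then copy the first n files
        for image in sorted(class_dir.iterdir())[:n]:
            shutil.copy2(image, out / image.name)

# Example: sample_per_class("cifar10_full", "cifar10_sample", n=100)
```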

          Training data path: bos:/bml-public/autodl-demo/data/cifar10-for-transferlearning.zip

          Output path: configure your own BOS path.

          Click "OK" to submit the job.

          Notes for Model Output Format:

          • The output model is in PyTorch format, with a .pth suffix, and contains the network's weight information. The model file is stored under the user-specified output path/{job_id}/model and can be used for the prediction service.

          Activate Prediction Service:

          After the Image Sorting-Transfer Learning job runs successfully, copy the model output path from the job details page, e.g.: bos:/xxx/yyy/autodl-qianyixuexi/job-8c0yiasq02caa6hf/model/

          • On the Prediction > Prediction model library > Create model page, add the Image Sorting-Transfer Learning model and fill in the path above as the model file path.
          • On the Prediction > Template configuration library > Create template page, create a template configuration.
          • On the Prediction > Endpoint management > Create endpoint page, load the above template and start the prediction service.

          Send Prediction Request:

          When the prediction endpoint status is in service, the prediction service request can be sent with the following code:

           #!/usr/bin/env python
           # -*- coding: utf-8 -*-
           import json
           import base64
           import requests
           
           IMAGE_PATH = "aaa.jpg"
           ENDPOINT_URL = "http://10.181.114.16:8023/v1/endpoints/yyyy/invocations"
           PARAMS = "?interface=predict&action=predict"
           TARGET_URL = ENDPOINT_URL + PARAMS
           
           def get_request():
               """Construct the request body: a list of base64-encoded images."""
               arr_instances = []
               with open(IMAGE_PATH, "rb") as f:
                   data = f.read()
               # base64.b64encode returns bytes; decode so the value is JSON-serializable
               str_data = base64.b64encode(data).decode("utf-8")
               obj_instance = {
                   "data": str_data,
               }
               arr_instances.append(obj_instance)
               request = {
                   "instances": arr_instances,
               }
               return request
               
           if __name__ == '__main__':
               request = get_request()
               json_request = json.dumps(request)
           
               headers = {'Content-type': 'application/json'}
               res = requests.post(TARGET_URL, data=json_request, headers=headers)
               print(res.text)

          Set IMAGE_PATH to the photo's location and ENDPOINT_URL to the endpoint URL, then execute the above code to send a prediction request and obtain a classification result for the photo. Predicted photo:

          Predicted results:

          {"result":"[[1.288892149925232, -3.755728244781494, -1.8610944747924805, 6.285534381866455, -1.8683143854141235, 2.2759482860565186, -3.119175672531128, 0.8390857577323914, 1.8015419244766235, -1.3871209621429443]]"}

          The result shows that the photo is predicted to be label 3, the index with the highest score.
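          The predicted label can be read off the response programmatically: the "result" field is itself a JSON-encoded 2-D array of per-class scores, one row per input instance, and the label is the index of the largest score. For example, with the response above:

```python
import json

response_text = ('{"result":"[[1.288892149925232, -3.755728244781494, '
                 '-1.8610944747924805, 6.285534381866455, -1.8683143854141235, '
                 '2.2759482860565186, -3.119175672531128, 0.8390857577323914, '
                 '1.8015419244766235, -1.3871209621429443]]"}')

# Decode twice: once for the outer JSON object, once for the embedded array
scores = json.loads(json.loads(response_text)["result"])[0]
predicted_label = max(range(len(scores)), key=scores.__getitem__)
print(predicted_label)  # 3
```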

          Image Sorting - ENAS

          Using advanced neural architecture search technology, training is performed on the user-provided data; the search for an optimal model and the output of a high-quality model can complete within a few hours. Given the user's labeled image data, a trained model is returned, and the training process is displayed, including the preprocessing results, the metric values of each round, the neural network structure diagram, and the tuning results.

          Configuration Instructions:

          Configuration name | Required | Description
          Job name | Yes | May contain only numbers, letters, `-`, or `_`; must start with a letter; fewer than 40 characters
          Algorithm or framework | Yes | Image Sorting - ENAS
          Send SMS at the end of job | Yes | Whether to notify the user by SMS when the job completes
          Training data path | Yes | Path where the training data is stored. If a zip package is given, only that package is used for training; if a folder is given, all zip packages directly under it are used (subdirectories are not traversed, and only the first 10 packages are taken when there are more than 10). Only zip packages are currently supported
          Output path | Yes | Path where the model and logs are stored. After the job succeeds, the model is stored under path/{job_id}/model and the logs under path/{job_id}/log
          Computing resources | Yes | BML cluster or the user's own CCE cluster
          Resource package | Yes | Deep learning development card and other single-GPU packages
          Number of instances | Yes | 1
          Maximum running time | Yes | If the job reaches the maximum running time, BML automatically forces it to stop, which may cause the job to fail

          Notes for Training Data Format:

          • All prepared photos must be sorted into per-category folders, and all folders must be compressed into a single .zip archive.
          • If there are many photos, it is recommended to split them into multiple packages; at most 10 packages are supported for training.
          • If category folders in multiple packages share the same name, the system automatically merges their data into one category.
          • Category folders should be named with numbers, letters, and underscores. Chinese characters are not currently supported, and names must not contain spaces.
          • Photo requirements:
            (1) File extension: three common types are supported: jpg, jpeg, png
            (2) Photo size: between 100 KB and 3 MB, aspect ratio within 3:1, longest side less than 4096 px, shortest side greater than 30 px
            (3) Number of photos per category: at least 50; note that the categories should be balanced in size for a better model
            (4) Number of categories: between 2 and 200
            (5) Total number of photos: at most 100,000
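          The count limits above can be checked locally before uploading; a minimal standard-library sketch (function and parameter names are illustrative; the file-size and aspect-ratio limits are omitted here because checking them needs an image library such as Pillow):

```python
from pathlib import Path

ALLOWED_EXT = {".jpg", ".jpeg", ".png"}

def check_dataset(data_dir, min_per_class=50, min_classes=2,
                  max_classes=200, max_total=100_000):
    """Check a per-category folder layout against the count limits.

    Returns a list of problem descriptions; an empty list means
    these checks pass.
    """
    problems = []
    classes = [p for p in Path(data_dir).iterdir() if p.is_dir()]
    if not min_classes <= len(classes) <= max_classes:
        problems.append(f"{len(classes)} categories (need {min_classes}-{max_classes})")
    total = 0
    for class_dir in classes:
        images = [p for p in class_dir.iterdir() if p.suffix.lower() in ALLOWED_EXT]
        total += len(images)
        if len(images) < min_per_class:
            problems.append(f"{class_dir.name}: only {len(images)} images")
    if total > max_total:
        problems.append(f"{total} images total (limit {max_total})")
    return problems
```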

          Example Configuration:

          The training data is the cifar10 dataset downloaded from the Internet. The developer converts the data into the input format expected by the Image Sorting - ENAS algorithm (for the conversion code, see transfer_cifar10.py, which runs in a Python 2 environment and depends on the numpy and opencv libraries), i.e. the images are sorted into one directory per category and packed into zip packages. You can download and convert the data yourself, or directly use our public BOS data for training.

          Training data path: bos:/bml-public/autodl-demo/data/cifar10.zip

          Output path: configure your own BOS path.

          Click "OK" to submit the job.

          Notes for Model Output Format:

          • The output model is in Keras HDF5 format and contains both the network structure and the weight parameters.

          Job Management:

          • Terminate: stop a job that is currently queued or running; it will no longer be queued or run. After termination, the job results and logs are not uploaded to the specified BOS path.
          • Clone: copy a job's configuration items into the create-job page.
          • Delete: delete the job. If the job is still queued or running at the time of deletion, it is terminated first and then deleted. After deletion, the job disappears from the job list.
          • View job details: click the job name to open the job details and view job information, parameter information, and cluster information.
          • View operation details: click the job name and select the operation details tab to view the operation status, start and end time, log details, operation curves, etc.

          View Job Results

          After the job runs successfully, the model and logs are stored at the BOS address given by the output path specified during job configuration. Go to BOS to view or download the job's model or logs.

          If the job model or log was not saved, possible reasons include:

          • The job was terminated manually
          • The job ran past its time limit and was terminated automatically
          • The job failed to run

          A job may fail for reasons including:

          • The input data does not match the required data format
          • The BOS address of the input data does not exist or is not accessible
          • The bucket of the output path does not exist or is not accessible
          • Training timed out