
          Baidu Machine Learning

          AutoDL Job

          AutoDL is an automated deep learning product that applies advanced transfer learning and neural architecture search technology to train models on user-provided business data. The platform exposes a simple, easy-to-understand API: training, evaluating, and deploying a model takes only a few steps, giving users a one-stop experience. AutoDL suits a wide range of users: beginners only need to submit data to obtain a high-quality model, while experienced engineers can build further on the high-quality models the platform produces.

          Create a Job

          Select "Training > AutoDL Job" in the left navigation bar to enter the AutoDL job list page. Click the "Create job" button to start the job-creation flow.

          Image Sorting-Transfer Learning

          Transfer learning takes a convolutional neural network (CNN) that has already been trained and fine-tunes its last few layers on the user's training data, producing a model suited to that data while avoiding a long, resource-intensive training run from scratch. Image Sorting-Transfer Learning targets image datasets on the order of hundreds of images and can train a usable model in about 10 minutes. The training process is displayed, including the preprocessing results and the loss and accuracy metrics of the training stage. Users can deploy the resulting model directly as a prediction service.

          Configuration Instructions:

          Configuration name | Required | Description
          Job name | Yes | May contain only numbers, letters, `-`, or `_`; must start with a letter; fewer than 40 characters
          Algorithm or framework | Yes | Image Sorting-Transfer Learning
          Send SMS at the end of job | Yes | Whether to notify the user by SMS when the job completes
          Training data path | Yes | Path where the training data is stored. If a zip package is given, only that package is used for training; if a folder is given, all zip packages directly under it are used (subdirectories are not traversed, and only the first 10 packages are taken when there are more than 10). Only zip packages are currently supported
          Output path | Yes | Path where the model and logs are stored. After the job succeeds, the model is stored under path/{job_id}/model and the logs under path/{job_id}/log
          Computing resources | Yes | BML cluster or the user's own CCE cluster
          Resource package | Yes | Deep learning development card and other single-GPU packages
          Number of instances | Yes | 1
          Maximum running time | Yes | If the job reaches the maximum running time, BML automatically forces it to stop, which may cause the job to fail

          Notes for Training Data Format:


          • All prepared photos must be sorted into per-category folders, and all folders must be compressed into a single .zip archive.
          • If there are many photos, it is recommended to split them into multiple packages; at most 10 packages are supported for training.
          • If category folders in multiple packages share the same name, the system automatically merges their data into one category.
          • Category folders should be named with numbers, letters, and underscores. Chinese characters are not currently supported, and names must not contain spaces.
          • Photo requirements:
            (1) File extension: three common types are supported: jpg, jpeg, png
            (2) Photo size: currently unlimited
            (3) Number of photos per category: at least 20; note that the categories should be balanced in size for a better model
            (4) Number of categories: between 2 and 200
            (5) Total number of photos: at most 100,000
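          The per-category folder layout above can be packed into the required .zip archive with a short script; a minimal standard-library sketch (the directory and file names are illustrative, not from the platform):

```python
import zipfile
from pathlib import Path

def pack_dataset(data_dir, zip_path):
    """Compress per-category image folders into a single zip archive.

    Expects data_dir to contain one sub-folder per category, e.g.
    data_dir/cat/*.jpg, data_dir/dog/*.jpg.
    """
    data_dir = Path(data_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for class_dir in sorted(p for p in data_dir.iterdir() if p.is_dir()):
            for image in sorted(class_dir.iterdir()):
                # Store each file as "<category_name>/<file_name>" in the archive
                zf.write(image, arcname=f"{class_dir.name}/{image.name}")

# Example: pack_dataset("my_images", "my_images.zip")
```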

          Example Configuration:

          The training data is the cifar10 dataset downloaded from the Internet. The developer converts the data into the input format expected by the Image Sorting-Transfer Learning algorithm (for the conversion code, see transfer_cifar10.py, which runs in a Python 2 environment and depends on the numpy and opencv libraries), i.e. the photos are split into one directory per category. As an example, the developer takes 100 photos from each category, 1,000 photos in total, for training. You can download and convert the data yourself, or directly use our public BOS data for training.
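          Taking a fixed number of photos per category, as described above, can be scripted as well; a hedged standard-library sketch (this is not the referenced transfer_cifar10.py, and the directory names are illustrative):

```python
import shutil
from pathlib import Path

def sample_per_class(src_dir, dst_dir, n=100):
    """Copy at most n images from each category folder in src_dir to dst_dir,
    preserving the one-folder-per-category layout."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for class_dir in sorted(p for p in src_dir.iterdir() if p.is_dir()):
        out = dst_dir / class_dir.name
        out.mkdir(parents=True, exist_ok=True)
        # Sort for a deterministic selection, then copy the first n files
        for image in sorted(class_dir.iterdir())[:n]:
            shutil.copy2(image, out / image.name)

# Example: sample_per_class("cifar10_full", "cifar10_sample", n=100)
```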

          Training data path: bos:/bml-public/autodl-demo/data/cifar10-for-transferlearning.zip

          Output path: configure your own BOS path.

          Click "OK" to submit the job.

          Notes for Model Output Format:

          • The output model is in PyTorch format, with a .pth suffix, and contains the network's weight information. The model file is stored under the user-specified output path/{job_id}/model and can be used for the prediction service.

          Activate Prediction Service:

          After the Image Sorting-Transfer Learning job runs successfully, copy the model output path from the job details page, e.g.: bos:/xxx/yyy/autodl-qianyixuexi/job-8c0yiasq02caa6hf/model/

          • On the Prediction > Prediction model library > Create model page, add the Image Sorting-Transfer Learning model and fill in the path above as the model file path.
          • On the Prediction > Template configuration library > Create template page, create a template configuration.
          • On the Prediction > Endpoint management > Create endpoint page, load the above template and start the prediction service.

          Send Prediction Request:

          When the prediction endpoint status is in service, the prediction service request can be sent with the following code:

           #!/usr/bin/env python
           # -*- coding: utf-8 -*-
           import json
           import base64
           import requests
           
           IMAGE_PATH = "aaa.jpg"
           ENDPOINT_URL = "http://10.181.114.16:8023/v1/endpoints/yyyy/invocations"
           PARAMS = "?interface=predict&action=predict"
           TARGET_URL = ENDPOINT_URL + PARAMS
           
           def get_request():
               """Construct the request body: a list of base64-encoded images."""
               arr_instances = []
               with open(IMAGE_PATH, "rb") as f:
                   data = f.read()
               # base64.b64encode returns bytes; decode so the value is JSON-serializable
               str_data = base64.b64encode(data).decode("utf-8")
               obj_instance = {
                   "data": str_data,
               }
               arr_instances.append(obj_instance)
               request = {
                   "instances": arr_instances,
               }
               return request
               
           if __name__ == '__main__':
               request = get_request()
               json_request = json.dumps(request)
           
               headers = {'Content-type': 'application/json'}
               res = requests.post(TARGET_URL, data=json_request, headers=headers)
               print(res.text)

          Set IMAGE_PATH to the photo's location and ENDPOINT_URL to the endpoint URL, then execute the above code to send a prediction request and obtain a classification result for the photo. Predicted photo:

          Predicted results:

          {"result":"[[1.288892149925232, -3.755728244781494, -1.8610944747924805, 6.285534381866455, -1.8683143854141235, 2.2759482860565186, -3.119175672531128, 0.8390857577323914, 1.8015419244766235, -1.3871209621429443]]"}

          The result shows that the photo is predicted to be label 3, the index with the highest score.
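          The predicted label can be read off the response programmatically: the "result" field is itself a JSON-encoded 2-D array of per-class scores, one row per input instance, and the label is the index of the largest score. For example, with the response above:

```python
import json

response_text = ('{"result":"[[1.288892149925232, -3.755728244781494, '
                 '-1.8610944747924805, 6.285534381866455, -1.8683143854141235, '
                 '2.2759482860565186, -3.119175672531128, 0.8390857577323914, '
                 '1.8015419244766235, -1.3871209621429443]]"}')

# Decode twice: once for the outer JSON object, once for the embedded array
scores = json.loads(json.loads(response_text)["result"])[0]
predicted_label = max(range(len(scores)), key=scores.__getitem__)
print(predicted_label)  # 3
```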

          Image Sorting - ENAS

          Using advanced neural architecture search technology, training is performed on the user-provided data; the search for an optimal model and the output of a high-quality model can complete within a few hours. Given the user's labeled image data, a trained model is returned, and the training process is displayed, including the preprocessing results, the metric values of each round, the neural network structure diagram, and the tuning results.

          Configuration Instructions:

          Configuration name | Required | Description
          Job name | Yes | May contain only numbers, letters, `-`, or `_`; must start with a letter; fewer than 40 characters
          Algorithm or framework | Yes | Image Sorting - ENAS
          Send SMS at the end of job | Yes | Whether to notify the user by SMS when the job completes
          Training data path | Yes | Path where the training data is stored. If a zip package is given, only that package is used for training; if a folder is given, all zip packages directly under it are used (subdirectories are not traversed, and only the first 10 packages are taken when there are more than 10). Only zip packages are currently supported
          Output path | Yes | Path where the model and logs are stored. After the job succeeds, the model is stored under path/{job_id}/model and the logs under path/{job_id}/log
          Computing resources | Yes | BML cluster or the user's own CCE cluster
          Resource package | Yes | Deep learning development card and other single-GPU packages
          Number of instances | Yes | 1
          Maximum running time | Yes | If the job reaches the maximum running time, BML automatically forces it to stop, which may cause the job to fail

          Notes for Training Data Format:

          • All prepared photos must be sorted into per-category folders, and all folders must be compressed into a single .zip archive.
          • If there are many photos, it is recommended to split them into multiple packages; at most 10 packages are supported for training.
          • If category folders in multiple packages share the same name, the system automatically merges their data into one category.
          • Category folders should be named with numbers, letters, and underscores. Chinese characters are not currently supported, and names must not contain spaces.
          • Photo requirements:
            (1) File extension: three common types are supported: jpg, jpeg, png
            (2) Photo size: between 100 KB and 3 MB, aspect ratio within 3:1, longest side less than 4096 px, shortest side greater than 30 px
            (3) Number of photos per category: at least 50; note that the categories should be balanced in size for a better model
            (4) Number of categories: between 2 and 200
            (5) Total number of photos: at most 100,000
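          The count limits above can be checked locally before uploading; a minimal standard-library sketch (function and parameter names are illustrative; the file-size and aspect-ratio limits are omitted here because checking them needs an image library such as Pillow):

```python
from pathlib import Path

ALLOWED_EXT = {".jpg", ".jpeg", ".png"}

def check_dataset(data_dir, min_per_class=50, min_classes=2,
                  max_classes=200, max_total=100_000):
    """Check a per-category folder layout against the count limits.

    Returns a list of problem descriptions; an empty list means
    these checks pass.
    """
    problems = []
    classes = [p for p in Path(data_dir).iterdir() if p.is_dir()]
    if not min_classes <= len(classes) <= max_classes:
        problems.append(f"{len(classes)} categories (need {min_classes}-{max_classes})")
    total = 0
    for class_dir in classes:
        images = [p for p in class_dir.iterdir() if p.suffix.lower() in ALLOWED_EXT]
        total += len(images)
        if len(images) < min_per_class:
            problems.append(f"{class_dir.name}: only {len(images)} images")
    if total > max_total:
        problems.append(f"{total} images total (limit {max_total})")
    return problems
```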

          Example Configuration:

          The training data is the cifar10 dataset downloaded from the Internet. The developer converts the data into the input format expected by the Image Sorting - ENAS algorithm (for the conversion code, see transfer_cifar10.py, which runs in a Python 2 environment and depends on the numpy and opencv libraries), i.e. the images are sorted into one directory per category and packed into zip packages. You can download and convert the data yourself, or directly use our public BOS data for training.

          Training data path: bos:/bml-public/autodl-demo/data/cifar10.zip

          Output path: configure your own BOS path.

          Click "OK" to submit the job.

          Notes for Model Output Format:

          • The output model is in Keras HDF5 format and contains both the network structure and the weight parameters.

          Job Management:

          • Terminate: stop a job that is currently queued or running; it will no longer be queued or run. After termination, the job results and logs are not uploaded to the specified BOS path.
          • Clone: copy a job's configuration items into the create-job page.
          • Delete: delete the job. If the job is still queued or running at the time of deletion, it is terminated first and then deleted. After deletion, the job disappears from the job list.
          • View job details: click the job name to open the job details and view job information, parameter information, and cluster information.
          • View operation details: click the job name and select the operation details tab to view the operation status, start and end time, log details, operation curves, etc.

          View Job Results

          After the job runs successfully, the model and logs are stored at the BOS address given by the output path specified during job configuration. Go to BOS to view or download the job's model or logs.

          If the job model or log was not saved, possible reasons include:

          • The job was terminated manually
          • The job ran past its time limit and was terminated automatically
          • The job failed to run

          A job may fail for reasons including:

          • The input data does not match the required data format
          • The BOS address of the input data does not exist or is not accessible
          • The bucket of the output path does not exist or is not accessible
          • Training timed out