          Baidu Machine Learning

          FAQs

          Technical Support

          If you run into any problems while using BML, you are welcome to scan the QR code below with QQ and join the technical exchange group for consultation and discussion.

          Tip: the platform provides many hints on its pages. Click the small question mark on the right to view the related help.

          What is BOS? Why must BOS be enabled when using BML?

          BOS (Baidu Object Storage) provides stable, secure, efficient and highly scalable storage. It supports any type of data, such as text, multimedia and binary, with a maximum of 5 TB per file. To make processing data and code in the cloud more secure and efficient, BML is deeply integrated with BOS, so you need to enable BOS and create a "Bucket" before using BML.

          Where can I view the running logs of training jobs?

          While the training job is running, open "Operation details" and click "Real-time log details" to view the real-time log.

          After the job succeeds or fails, open "Operation details" and click "Log details" to go to your "Bucket" management page. Open the trainer/ folder; trainer.log is the log file of the training job. BOS currently does not support previewing files in this format, so click the download symbol to the right of "File Name: trainer.log" and view the file after downloading it.

          Does BML support installing third-party packages? How do I install them?

          At present, the CPU / GPU instances of both the workarea and training jobs allow users to install third-party packages. Common third-party packages such as “numpy”, “sklearn” and “torch” are preinstalled. Use pip list to view the installed packages and their versions, and use pip install to install any other packages you need. (Pay attention to the Python version when installing a package: use pip for python2 and pip3 for python3.)

          In the workarea there are two options: run pip install xxx directly in the terminal, or run !pip install xxx in a notebook.

          Training jobs differ from the workarea: the installation has to be done by calling the system command from within the training script. The specific installation method is shown in the figure install.jpg.
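Since the figure is not reproduced here, the following is a minimal sketch of one way a training script might call pip as a system command. It is not the platform's documented method; the helper names pip_install_command and install are hypothetical, and "kaggle" is only an example package.

```python
import subprocess
import sys


def pip_install_command(package):
    """Build the pip command for the interpreter running this script."""
    # "python -m pip" installs into the same Python environment
    # (python2 or python3) that executes the training script.
    return [sys.executable, "-m", "pip", "install", package]


def install(package):
    """Install one package; raises CalledProcessError if pip fails."""
    subprocess.check_call(pip_install_command(package))


# Assumed usage at the top of a training script, before the import:
# install("kaggle")
# import kaggle
```

Invoking pip through sys.executable avoids having to choose between pip and pip3 by hand, because the package is installed for whichever interpreter runs the script.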

          When installing a third-party package in the workarea, the error "Could not find a version that satisfies the requirement XXX (from versions: ) No matching distribution found for XXX" is reported. How do I solve it?

          The default “pip” source of the workarea is the Tsinghua mirror (https://pypi.tuna.tsinghua.edu.cn/simple). The error "Could not find a version that satisfies the requirement XXX (from versions: ) No matching distribution found for XXX" indicates that the package is not available on the Tsinghua mirror. In that case, use Python's official source (https://pypi.python.org/pypi) instead.

          Taking “kaggle” as an example, you can run !pip install kaggle -i https://pypi.python.org/pypi to install the package.
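The same fallback can be scripted. The sketch below tries the two index URLs quoted in this FAQ in order and stops at the first one that succeeds. The function name install_with_fallback and its run parameter are hypothetical and exist only to make the logic easy to follow and to test.

```python
import subprocess
import sys

# Index URLs quoted in this FAQ: the workarea default mirror first,
# then the official source.
INDEX_URLS = [
    "https://pypi.tuna.tsinghua.edu.cn/simple",
    "https://pypi.python.org/pypi",
]


def install_with_fallback(package, index_urls=INDEX_URLS, run=subprocess.call):
    """Try each index in order; return the URL that worked, or None."""
    for url in index_urls:
        cmd = [sys.executable, "-m", "pip", "install", package, "-i", url]
        if run(cmd) == 0:  # pip exits with status 0 on success
            return url
    return None


# Assumed usage:
# install_with_fallback("kaggle")
```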

          How do I convert debugged Notebook code into a .py file and submit a training job?

          Code files in Notebook are in .ipynb format. To submit a training job, you need to convert them to .py files. Run jupyter nbconvert --to python xxx.ipynb in the terminal to convert an .ipynb file to a .py file with the same name. In the example shown in the figure, ll is first run to list all files under the /mnt/demo path, then tensorflow-mnist.ipynb is converted to tensorflow-mnist.py with jupyter nbconvert --to python tensorflow-mnist.ipynb, and finally ll is run again to verify that the file was saved successfully.

          During conversion, non-code cells are commented out, and Jupyter-specific statements that are not valid Python may appear in the output; these need to be corrected manually.

          In addition, the input and output paths in the code should be adjusted: change the training data path to "./train_data/", the prediction data path to "./test_data/", and the path for the model and data produced by training to "./output/". See the FAQ "Why is the training / prediction data of deep learning job not downloaded and the output data not saved?"
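To help locate the statements that need manual correction, the following heuristic sketch scans converted source text for leftovers such as get_ipython() calls (which nbconvert emits for magics and shell commands) or lines that still start with ! or %. The function name find_notebook_statements is hypothetical, and the check is only a rough filter, not a full validity test.

```python
def find_notebook_statements(source):
    """Return (line_number, text) for lines that only work inside Jupyter."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out cells are harmless
        if "get_ipython()" in stripped or stripped.startswith(("!", "%")):
            hits.append((number, stripped))
    return hits


# Assumed usage on a converted file:
# with open("tensorflow-mnist.py") as handle:
#     for number, text in find_notebook_statements(handle.read()):
#         print(number, text)
```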

          Why is the training / prediction data of a deep learning job not downloaded and the output data not saved?

          The parameter configuration of a deep learning job (shown in the screenshot) asks for the output path (required), the training data path (optional) and the prediction data path (optional).

          The platform automatically downloads the BOS data under the training data path to the train_data directory in the container, downloads the BOS data under the test data path to the test_data directory in the container, and, while the code is running, uploads the contents of the output directory in the container to the output path.

          Therefore, make sure the code uses the correct relative paths: "./train_data/" for training data, "./test_data/" for test data, and "./output/" for the model and data produced by training. Only then will the code find the downloaded training and test data in the container, and only then will the output data be stored in BOS.
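As an illustration of these conventions, the sketch below (Python 3) keeps the three relative paths in one place and writes results only under the output directory. The helper names list_files and save_result and the file name summary.txt are hypothetical.

```python
import os

# Relative paths expected inside the training container.
TRAIN_DIR = "./train_data/"
TEST_DIR = "./test_data/"
OUTPUT_DIR = "./output/"


def list_files(directory):
    """Return sorted file names in a data directory, or [] if it is absent."""
    if not os.path.isdir(directory):
        return []
    return sorted(os.listdir(directory))


def save_result(filename, text, output_dir=OUTPUT_DIR):
    """Write a text artifact under the output directory and return its path."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, filename)
    with open(path, "w") as handle:
        handle.write(text)
    return path


# Assumed usage in a training script:
# print(list_files(TRAIN_DIR))
# save_result("summary.txt", "training finished")
```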

          How do I troubleshoot the error "invalid region, please check and try again"?

          First, check whether the region of the "Bucket" is North China-Beijing. BML currently only supports reading from or writing to a "Bucket" in the North China-Beijing region. If the "Bucket" is in North China-Beijing and the error is still reported, you have most likely switched to another region in another product, which causes this error when you return to BML. To fix it, open the Baidu Cloud Compute (BCC) product in the console, select "North China-Beijing" as the region, and then return to BML.

          Does BML support sub-accounts?

          BML is not yet connected to the multi-account system, so sub-accounts are not supported at present. Please use a normal account. Support for sub-accounts is being planned.

          How do I download data from an unmounted BOS directory to Notebook?

          The data under the BOS path selected when starting the workarea is mounted directly in the "Data" directory shown on the left. To download data that is not under that BOS path, you can use methods from bos_utility such as download_train_data_from_bos or download_from_bos. You need to supply your own AK/SK when using them. The difference between the two is that download_train_data_from_bos downloads data to the './train_data' directory, while download_from_bos lets you specify a target_path for the download, for example the './mydata' directory. The main usage is shown in the figure image.png.
