
          Multimedia Cloud Processing

          Feature

          Video processing transcodes audio and video files into multiple resolutions and formats to serve users on different network bandwidths and terminal devices. Its core capabilities are as follows:

          • Transcoding format: Covers the mainstream video transcoding formats.
          • Video coding: Supports a large number of video encoding parameters, including multiple video formats, resolutions and bitrates.
          • Audio coding: Supports multiple audio coding formats.
          • Video editing: Supports post-processing operations such as video editing and splicing.
          • Video screenshot: Provides multiple screenshot modes for different scenarios, so you can capture an attractive cover and improve the video click-through rate.
          • Video encryption: Protects copyright and guards against piracy.
          • Intelligent super-definition: Uses video AI to improve video definition while reducing the bitrate.
          • Extreme transcoding: Separates audio from video and uses dynamic segmentation to speed up transcoding, by up to 50x.
          • BD265: Uses more than 50 optimization algorithms together with AI coding technology to deliver higher image quality at a lower bitrate and faster speed.

          Transcoding format

          Type Description
          Input format
          · Container formats: MP4, FLV, MOV, M3U8, 3GP, AVI, MPG, ASF, WMV, MKV, TS, WebM, MXF
          · Video coding formats: H.264/AVC, H.265/HEVC, MPEG-1, MPEG-2, MPEG-4, MJPEG, VP8, VP9, QuickTime, RealVideo, Windows Media Video
          · Audio coding formats: AAC, AC-3, ADPCM, AMR, DSD, MP1, MP2, MP3, PCM, RealAudio, Windows Media Audio
          Output format
          · Video container formats: FLV, MP4, HLS (m3u8+ts), MPEG-DASH (MPD+fMP4)
          · Audio container formats: MP3, MP4, OGG, FLAC, m4a
          · Image formats: JPG, PNG, GIF, WEBP
          · Video coding formats: H.264/AVC, H.265/HEVC
          · Audio coding formats: MP3, AAC, VORBIS, FLAC
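          As an illustration only, the output rows of the table can be turned into a small pre-flight check before a job is submitted. The function name `check_output` and the lower-case spellings are assumptions made for this sketch; the table does not say which container and codec combinations are valid together, so the check treats each column independently.

```python
# Output formats copied from the table above, lower-cased for comparison.
VIDEO_OUTPUT_CONTAINERS = {"flv", "mp4", "hls", "mpeg-dash"}
VIDEO_OUTPUT_CODECS = {"h264", "h265"}
AUDIO_OUTPUT_CODECS = {"mp3", "aac", "vorbis", "flac"}


def check_output(container: str, video_codec: str, audio_codec: str) -> list:
    """Return a list of problems; an empty list means every value is in the table."""
    problems = []
    if container.lower() not in VIDEO_OUTPUT_CONTAINERS:
        problems.append(f"unsupported output container: {container}")
    if video_codec.lower() not in VIDEO_OUTPUT_CODECS:
        problems.append(f"unsupported output video codec: {video_codec}")
    if audio_codec.lower() not in AUDIO_OUTPUT_CODECS:
        problems.append(f"unsupported output audio codec: {audio_codec}")
    return problems


if __name__ == "__main__":
    print(check_output("mp4", "h264", "aac"))
    print(check_output("avi", "vp9", "aac"))
```

          AVI and VP9 appear only in the input rows, so the second call reports two problems.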

          Video coding

          Parameter Description
          codec Coding standard: h264 or h265. The default is h264.
          profile Encoding profile: baseline, main or high. The default is baseline; h265 supports only main.
          bitRateInbps Target bitrate in kbps, range [100, 50000]
          maxFrameRate Maximum frame rate. Options: 10, 15, 23.97, 24, 25, 29.97, 30, 50, 60
          maxWidthInPixel Maximum output width in pixels, range [128, 4096]
          maxHeightInPixel Maximum output height in pixels, range [96, 3072]
          sizingPolicy Scaling policy. Options:
          · keep: When both the width and height of the original video are smaller than the template's, the original resolution is kept. When the width or height of the original video exceeds the template's, the longer side is shrunk to match the template and the other side is scaled proportionally.
          · shrinkToFit: Keeps the aspect ratio of the original video and adds black borders to reach the template resolution.
          · shrinkToFitBlur: Keeps the aspect ratio of the original video and fills the remaining area with a Gaussian blur effect to reach the template resolution.
          · stretch: Stretches the original video to the template resolution.
          crf Bitrate control policy:
          · The default is VBR.
          · A constant-quality crf value can be set, range [1, 51].
          playbackSpeed Playback speed, range [0.05, 20.0]:
          · Values below 1.0 slow playback down.
          · Values above 1.0 speed playback up.
          transMode Transcoding mode. Options: normal, twopass, cae (intelligent super-definition)
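          The sizingPolicy options are easier to compare with numbers. The sketch below is one plausible reading of the descriptions, not the service's actual algorithm: it assumes "shrinking" means scaling the picture so that it fits inside the template box while preserving the aspect ratio. The function names and the returned tuple layout are hypothetical, and the table does not say how sources smaller than the template are handled under shrinkToFit, so this sketch simply scales them to fit.

```python
def fit_inside(src_w, src_h, box_w, box_h):
    """Scale (src_w, src_h) to fit inside the box, preserving aspect ratio."""
    scale = min(box_w / src_w, box_h / src_h)
    return round(src_w * scale), round(src_h * scale)


def output_geometry(policy, src_w, src_h, tpl_w, tpl_h):
    """Return (frame_w, frame_h, picture_w, picture_h).

    frame_*   is the size of the encoded frame,
    picture_* is the area covered by the scaled source picture;
    any difference between the two is border (black or blurred).
    """
    if policy == "keep":
        if src_w <= tpl_w and src_h <= tpl_h:
            return src_w, src_h, src_w, src_h  # source already fits: unchanged
        w, h = fit_inside(src_w, src_h, tpl_w, tpl_h)
        return w, h, w, h  # shrunk, no borders
    if policy in ("shrinkToFit", "shrinkToFitBlur"):
        w, h = fit_inside(src_w, src_h, tpl_w, tpl_h)
        return tpl_w, tpl_h, w, h  # template-sized frame, bordered picture
    if policy == "stretch":
        return tpl_w, tpl_h, tpl_w, tpl_h  # aspect ratio not preserved
    raise ValueError(f"unknown sizingPolicy: {policy}")


if __name__ == "__main__":
    for policy in ("keep", "shrinkToFit", "stretch"):
        print(policy, output_geometry(policy, 1920, 800, 1280, 720))
```

          For a 1920x800 source and a 1280x720 template, keep yields a 1280x533 frame, shrinkToFit yields a 1280x720 frame with a 1280x533 picture plus borders, and stretch fills the full 1280x720 frame.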

          Audio coding

          Parameter Description
          codec Audio coding standard. The default is AAC.
          bitRateInBps Target bitrate, range [0, 1000]
          sampleRateInHz Audio sampling rate in Hz. Options: 22050, 32000, 44100, 48000, 96000
          channels Number of audio channels. Options: 1, 2
          gain Volume gain in dB, range [-60, 60]. Negative values lower the volume and positive values raise it.
          mute Whether to mute the audio
          norm Whether to normalize the volume to avoid volume fluctuations
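          To give a feel for the gain range, the sketch below converts a decibel value into a linear amplitude factor using the standard relation factor = 10^(dB/20). It assumes the gain is an ordinary amplitude gain in decibels; the table does not state how the service applies it, and the function name is hypothetical.

```python
GAIN_MIN_DB = -60.0  # range quoted in the table above
GAIN_MAX_DB = 60.0


def gain_to_amplitude(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    if not GAIN_MIN_DB <= gain_db <= GAIN_MAX_DB:
        raise ValueError(f"gain must be within [{GAIN_MIN_DB}, {GAIN_MAX_DB}] dB")
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    for db in (-60, -6, 0, 6, 20):
        print(f"{db:+d} dB -> x{gain_to_amplitude(db):.3f}")
```

          Under this assumption, 0 dB leaves the signal unchanged, +20 dB multiplies the amplitude by 10, and -60 dB scales it to one thousandth.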

          Video editing

          Function Description
          Video stitching Up to 200 videos can be spliced into one.
          Video clip Supports setting the start time and duration, in seconds or milliseconds
          Watermark removal Supports setting the area to clear (x, y, width, height) and supports automatic watermark detection and removal
          Black edge removal Supports setting the effective picture area (x, y, width, height) that remains after the black edges are cropped
          Gaussian blur Supports filling with a dynamic Gaussian blur effect
          Static/dynamic watermark overlay Supported formats include jpg, png, apng, gif, webp, mov and mp4; the display position and start time can be set
          Subtitle overlay Supports srt subtitle files; the font, font size, display position and start time can be set
          Audio overlay Supported audio formats include mp3 and aac
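          The clipping and stitching rows can be combined into a simple plan check. This is a minimal sketch, assuming each clip is described by a start time and duration in milliseconds; the `Clip` class and its field names are hypothetical and do not reflect the service's request format. Only the 200-video limit comes from the table.

```python
from dataclasses import dataclass

MAX_STITCHED_VIDEOS = 200  # limit quoted in the table above


@dataclass
class Clip:
    source: str        # hypothetical identifier of the input video
    start_ms: int      # clip start within the source, in milliseconds
    duration_ms: int   # clip length, in milliseconds


def total_duration_ms(clips):
    """Validate a stitching plan and return the length of the spliced result."""
    if not 1 <= len(clips) <= MAX_STITCHED_VIDEOS:
        raise ValueError(f"expected 1 to {MAX_STITCHED_VIDEOS} clips, got {len(clips)}")
    for clip in clips:
        if clip.start_ms < 0 or clip.duration_ms <= 0:
            raise ValueError(f"invalid clip range: {clip}")
    return sum(clip.duration_ms for clip in clips)


if __name__ == "__main__":
    plan = [Clip("intro.mp4", 0, 1500), Clip("main.mp4", 2000, 2500)]
    print(total_duration_ms(plan), "ms")
```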

          Video screenshot

          Function Description
          Specified screenshot
          · manual: Captures thumbnails at the specified interval between the specified start and end times.
          · split: Captures the specified number of thumbnails between the specified start and end times.
          · splitss0: Same as split, but guarantees that the first frame is captured.
          Intelligent screenshot
          · auto: Automatically captures frames with higher entropy.
          · shot: Automatically captures scene-change frames based on scene switches.
          · idl: Uses the Baidu IDL (Institute of Deep Learning) AI thumbnail algorithm to capture one thumbnail frame.
          · highlight: Automatically generates a 0.5 s highlight using an AI model; the captured duration can be set.
          CSS Sprite Specify the number of rows and columns, the outer border width and the spacing between sub-images of the sprite image
          Image format
          · Static image formats: jpg, png
          · Animated image formats: gif, webp, mp4
          · Animated images support setting the frame rate and playback speed
          Image width and height
          The width and height range over [10, 2000]; the width defaults to 600 and the height to 450. If the actual resolution of the video is lower than the target resolution, the output uses the actual resolution.
          Scaling policy
          · keep: Maintains the aspect ratio of the original video.
          · shrinkToFit: Maintains the aspect ratio of the original video and adds black edges.
          · stretch: Stretches the original video.
          Watermark removal Specify the area to blur out (x, y, width, height)
          Black edge clipping
          Specify the effective picture area (x, y, width, height) that remains after the black edges are removed; automatic detection of this area is also supported
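          The difference between the manual and split modes can be shown by computing the capture timestamps. The sketch below works in integer milliseconds. The manual mode follows the description directly; for split mode the table does not say where inside the range the thumbnails fall, so sampling the midpoint of each equal sub-range is purely an assumption of this sketch. Function names are hypothetical.

```python
def manual_timestamps(start_ms: int, end_ms: int, interval_ms: int) -> list:
    """One capture every interval_ms, from start_ms up to and including end_ms."""
    if interval_ms <= 0 or end_ms < start_ms:
        raise ValueError("need interval_ms > 0 and end_ms >= start_ms")
    return list(range(start_ms, end_ms + 1, interval_ms))


def split_timestamps(start_ms: int, end_ms: int, count: int) -> list:
    """count captures spread evenly over the range.

    Assumption: the range is cut into count equal parts and the midpoint
    of each part is sampled. The real service may choose differently.
    """
    if count < 1 or end_ms < start_ms:
        raise ValueError("need count >= 1 and end_ms >= start_ms")
    span = end_ms - start_ms
    return [start_ms + (2 * i + 1) * span // (2 * count) for i in range(count)]


if __name__ == "__main__":
    print(manual_timestamps(0, 10_000, 2_500))
    print(split_timestamps(0, 10_000, 4))
```

          With a 10-second range, manual mode at a 2.5-second interval produces five captures including both endpoints, while this reading of split mode with four thumbnails never samples the very first frame, which is the gap splitss0 is described as closing.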

          Media information

          Type Description
          File information Includes file size, duration, container format, file type and MD5 value
          Video information Includes coding standard, resolution (width/height), bitrate and frame rate
          Audio information Includes coding standard, channels, sampling rate and bitrate
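          Two of the file-level fields, size and MD5, can be reproduced locally to cross-check what the service reports. The sketch below uses only the Python standard library and reads the data in chunks so that large media files do not have to fit in memory. The function name and the dictionary keys are hypothetical and are not the service's response fields.

```python
import hashlib
import io


def file_info(stream, chunk_size=1 << 20):
    """Return the byte size and MD5 hex digest of a binary stream."""
    md5 = hashlib.md5()
    size = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        md5.update(chunk)
        size += len(chunk)
    return {"size_bytes": size, "md5": md5.hexdigest()}


if __name__ == "__main__":
    # An in-memory stream stands in for open("video.mp4", "rb").
    print(file_info(io.BytesIO(b"hello")))
```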

          Video encryption

          Encryption mode Description
          fixed Fixed-key encryption: the video is encrypted with a custom key, so aesKey is required.
          open Open-key encryption: the system generates the encryption key automatically, but the key is public and has no access control.
          playerBinding The system generates the encryption key automatically and binds it to the player with access control. This mode offers high security and is recommended.
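          The one hard rule in the table, that fixed mode needs a caller-supplied aesKey while the other modes generate their own key, can be captured in a few lines. This is a sketch only: the function name and the shape of the returned dictionary are hypothetical, the key format is not specified in the table, and rejecting a key in the non-fixed modes is a choice made here rather than documented behavior.

```python
ENCRYPTION_MODES = {"fixed", "open", "playerBinding"}


def encryption_config(mode, aes_key=None):
    """Build an illustrative encryption setting and enforce the aesKey rule."""
    if mode not in ENCRYPTION_MODES:
        raise ValueError(f"unknown encryption mode: {mode}")
    if mode == "fixed":
        if not aes_key:
            raise ValueError("fixed mode requires aesKey")
        return {"mode": mode, "aesKey": aes_key}
    if aes_key is not None:
        # Assumption: a caller-supplied key makes no sense when the
        # system generates the key itself, so flag it early.
        raise ValueError(f"{mode} mode generates its own key; omit aesKey")
    return {"mode": mode}


if __name__ == "__main__":
    print(encryption_config("playerBinding"))
    print(encryption_config("fixed", aes_key="<your key here>"))
```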

          Intelligent super-definition

          Type Description
          Intelligent super-definition 1.0 Analyzes picture complexity at the scene level and dynamically allocates optimized coding parameters. At the same picture quality the bitrate is lower, which saves bandwidth and storage costs.
          Intelligent super-definition 2.0 Targets the best subjective viewing experience: it optimizes the color, brightness, contrast and saturation of the picture and enhances the regions that attract the viewer's attention, saving bitrate while improving picture quality.
          Old film restoration Removes jitter, scratches, noise and mosaic artifacts from old or over-compressed videos. It is a vertical application scenario of intelligent super-definition 2.0.
          Intelligent frame insertion For videos at ordinary frame rates of up to 30 frames per second (inclusive), generates a version at 60 or even 120 frames per second to improve smoothness. It is generally used together with super resolution.
          Super resolution Uses a deep learning model to reconstruct low-resolution video at a higher resolution with improved image detail, for example SD to HD or 2K to 4K.

          Extreme transcoding

          Extreme transcoding includes common extreme transcoding and intelligent extreme transcoding.

          Common extreme transcoding uses audio/video separation to transcode the audio and video streams separately, which reduces merge time and can accelerate long-form video transcoding by up to 10x.

          Building on common extreme transcoding, intelligent extreme transcoding uses an AI model to predict the segmentation policy from the attributes of the input video and output template (codec, B-frames, frame rate, bitrate, resolution, etc.), and can make transcoding up to 50 times faster.
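          Why segmenting a video speeds up transcoding, and why the segmentation policy matters, can be seen in a toy model. The sketch below is not the service's scheduler: it assumes the video is cut into equal segments that are transcoded in parallel rounds by a fixed pool of workers, followed by a fixed merge overhead. All parameter names are hypothetical.

```python
import math


def parallel_speedup(duration_s, segment_s, workers,
                     realtime_factor=1.0, merge_overhead_s=0.0):
    """Estimated speedup of segment-parallel transcoding over a single pass.

    realtime_factor: seconds of compute per second of video on one worker.
    The last segment is treated as full length, so the estimate is
    slightly pessimistic.
    """
    serial = duration_s * realtime_factor
    segments = math.ceil(duration_s / segment_s)
    rounds = math.ceil(segments / workers)  # batches of parallel work
    parallel = rounds * segment_s * realtime_factor + merge_overhead_s
    return serial / parallel


if __name__ == "__main__":
    # One-hour video cut into 60-second segments.
    print(parallel_speedup(3600, 60, workers=10))
    print(parallel_speedup(3600, 60, workers=60))
    print(parallel_speedup(3600, 60, workers=60, merge_overhead_s=12))
```

          In this model the speedup is capped by the worker count and eroded by the merge step, which is why choosing segment boundaries and sizes well has a direct effect on the achievable acceleration.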

          BD265

          The BD265 encoder uses more than 50 coding optimization algorithms together with AI coding technology to provide higher picture quality at a lower bitrate and faster speed.

          BD265 uses a variety of prediction models to reduce the computing cost of motion estimation, mode selection and rate-distortion optimization, which greatly improves coding speed. Compared with the open-source HEVC encoder x265, BD265 saves 30%-40% of the bitrate at the same subjective quality while encoding more than 2 times faster. BD265 combines the latest visual coding technology with a dynamic bitrate allocation algorithm, spending bits where the human eye is most sensitive to deliver better subjective quality. The encoder supports a range of speed presets (fast, common and slow) and multiple rate-control configurations (crf, abr and two-pass), and can be deployed quickly in application scenarios such as live streaming (LVB), VOD and short video.
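          The quoted 30%-40% saving translates into bitrate and storage figures with simple arithmetic. The sketch below only applies the percentages stated above to a reference x265 bitrate; it is not a measurement, and the function names are hypothetical.

```python
SAVING_MIN_PCT = 30  # lower bound of the saving quoted above
SAVING_MAX_PCT = 40  # upper bound of the saving quoted above


def bd265_bitrate_range_kbps(x265_kbps):
    """Return (lowest, highest) expected bitrate for the same subjective quality."""
    lowest = x265_kbps * (100 - SAVING_MAX_PCT) / 100
    highest = x265_kbps * (100 - SAVING_MIN_PCT) / 100
    return lowest, highest


def megabytes_per_hour(kbps):
    """Storage for one hour of video at a constant bitrate (1 MB = 10**6 bytes)."""
    return kbps * 1000 * 3600 / 8 / 1_000_000


if __name__ == "__main__":
    low, high = bd265_bitrate_range_kbps(2000)
    print(f"x265 at 2000 kbps: {megabytes_per_hour(2000):.0f} MB per hour")
    print(f"BD265 at {low:.0f}-{high:.0f} kbps: "
          f"{megabytes_per_hour(low):.0f}-{megabytes_per_hour(high):.0f} MB per hour")
```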
