Baidu AI Cloud

Multimedia Cloud Processing

Feature

Video processing transcodes audio and video files into files with different resolutions and formats, to meet the needs of users on different network bandwidths and terminal devices. The core capabilities are as follows:

  • Transcoding format: Covers the mainstream video transcoding formats.
  • Video coding: Supports a large number of video encoding parameters, including multiple video formats, resolutions and bitrates.
  • Audio coding: Supports multiple audio coding formats.
  • Video editing: Supports secondary processing such as video editing and splicing.
  • Video screenshot: Offers multiple screenshot modes for different scenarios, for example to pick an attractive cover image and improve the video click-through rate.
  • Video encryption: Protects copyright and guards against piracy.
  • Intelligent super-definition: Uses video AI to improve video definition while reducing the bitrate.
  • Extreme transcoding: Processes audio and video separately and uses dynamic segmentation to speed up transcoding, by up to 50x.
  • BD265: Uses more than 50 optimization algorithms and AI coding technology to deliver higher image quality at a lower bitrate and a faster speed.

Transcoding format

Type | Description
Input format | Supported container and coding formats:
  · Container formats: MP4, FLV, MOV, M3U8, 3GP, AVI, MPG, ASF, WMV, MKV, TS, WebM, MXF
  · Video coding formats: H.264/AVC, H.265/HEVC, MPEG-1, MPEG-2, MPEG-4, MJPEG, VP8, VP9, Quicktime, RealVideo, Windows Media Video
  · Audio coding formats: AAC, AC-3, ADPCM, AMR, DSD, MP1, MP2, MP3, PCM, RealAudio, Windows Media Audio
Output format | Supported container and coding formats:
  · Video container formats: FLV, MP4, HLS (m3u8 + ts), MPEG-DASH (MPD + fMP4)
  · Audio container formats: MP3, MP4, OGG, FLAC, m4a
  · Image formats: JPG, PNG, GIF, WEBP
  · Video coding formats: H.264/AVC, H.265/HEVC
  · Audio coding formats: MP3, AAC, VORBIS, FLAC
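
As an illustration of how a client might pre-check a job against the formats listed above, here is a minimal Python sketch. The format sets are copied from the table; the function name `check_job`, the lowercase spellings and the idea of validating locally before submitting a job are assumptions made for this example, not part of the service's API.

```python
# Format lists copied from the table above; names and structure are illustrative only.
INPUT_CONTAINERS = {"mp4", "flv", "mov", "m3u8", "3gp", "avi", "mpg",
                    "asf", "wmv", "mkv", "ts", "webm", "mxf"}
OUTPUT_VIDEO_CONTAINERS = {"flv", "mp4", "hls", "mpeg-dash"}
OUTPUT_VIDEO_CODECS = {"h264", "h265"}


def check_job(input_container, output_container, output_codec):
    """Return a list of problems; an empty list means every value is in the table."""
    problems = []
    if input_container.lower() not in INPUT_CONTAINERS:
        problems.append(f"unsupported input container: {input_container}")
    if output_container.lower() not in OUTPUT_VIDEO_CONTAINERS:
        problems.append(f"unsupported output container: {output_container}")
    if output_codec.lower() not in OUTPUT_VIDEO_CODECS:
        problems.append(f"unsupported output codec: {output_codec}")
    return problems


print(check_job("MKV", "HLS", "h265"))
print(check_job("rm", "mp4", "h264"))
```

Returning a list of problems rather than raising on the first one lets a caller report every mismatch at once.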

Video coding

Parameter | Description
codec | Coding standard: h264 or h265; the default is h264
profile | Encoding profile: baseline, main or high; the default is baseline, and h265 supports only main
bitRateInbps | Target bitrate, range [100, 50000], in kbps
maxFrameRate | Maximum frame rate; options: 10, 15, 23.97, 24, 25, 29.97, 30, 50, 60
maxWidthInPixel | Width of the output resolution in pixels, range [128, 4096]
maxHeightInPixel | Height of the output resolution in pixels, range [96, 3072]
sizingPolicy | Scaling policy, one of:
  · keep: when the width and height of the original video are both smaller than the template's, keep the resolution of the original video; when the width or height of the original video exceeds the template's, shrink the longer side to match the template and scale the other side proportionally
  · shrinkToFit: keep the aspect ratio of the original video and add black borders to reach the template resolution
  · shrinkToFitBlur: keep the aspect ratio of the original video and fill with a Gaussian blur effect to reach the template resolution
  · stretch: stretch the original video to the template resolution
crf | Rate control policy:
  · The default is VBR
  · A constant-quality CRF value may be set instead, range [1, 51]
playbackSpeed | Playback speed, range [0.05, 20.0]:
  · Values below 1.0 slow playback down
  · Values above 1.0 speed playback up
transMode | Transcoding mode: normal, twopass or cae (intelligent super-definition)
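
The sizing policies are easier to compare with concrete numbers. The following Python sketch computes the output frame size and the size of the scaled picture inside it for each policy. It is one reading of the descriptions above, not the service's implementation: the function name `output_size` is hypothetical, `keep` is interpreted as "fit inside the template without ever upscaling", and the fit-inside scaling used for `shrinkToFit` and `shrinkToFitBlur` (including for sources smaller than the template) is an assumption.

```python
def output_size(src_w, src_h, tpl_w, tpl_h, policy):
    """Return ((frame_w, frame_h), (content_w, content_h)) for a sizing policy.

    The frame is the encoded picture; the content is the scaled source inside it.
    """
    name = policy.lower()
    if name == "stretch":
        # The source is stretched to fill the template exactly.
        return (tpl_w, tpl_h), (tpl_w, tpl_h)

    # Largest uniform scale at which the source still fits inside the template.
    scale = min(tpl_w / src_w, tpl_h / src_h)

    if name == "keep":
        scale = min(scale, 1.0)  # assumption: smaller sources are left untouched
        size = (round(src_w * scale), round(src_h * scale))
        return size, size

    if name in ("shrinktofit", "shrinktofitblur"):
        # Frame matches the template; the gap is filled with black borders or blur.
        content = (round(src_w * scale), round(src_h * scale))
        return (tpl_w, tpl_h), content

    raise ValueError(f"unknown sizing policy: {policy}")


for p in ("keep", "shrinkToFit", "stretch"):
    print(p, output_size(720, 1280, 1280, 720, p))
```

The loop runs a portrait 720x1280 source against a 1280x720 template, the case in which the three policies differ most visibly.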

Audio coding

Parameter | Description
codec | Audio coding standard; the default is AAC
bitRateInBps | Target bitrate, range [0, 1000]
sampleRateInHz | Audio sampling rate in Hz; options: 22050, 32000, 44100, 48000, 96000
channels | Number of audio channels; options: 1, 2
gain | Volume gain in dB, range [-60, 60]; negative values lower the volume and positive values raise it
mute | Whether to mute the audio
norm | Whether to normalize the volume to avoid volume fluctuations
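
A small Python sketch of how the ranges above could be checked on the client side, together with the standard conversion from a dB gain to a linear amplitude factor. The function names are hypothetical, the sample-rate key is assumed to be spelled `sampleRateInHz`, and treating every parameter as optional is also an assumption.

```python
ALLOWED_SAMPLE_RATES = {22050, 32000, 44100, 48000, 96000}


def validate_audio(params):
    """Check an audio parameter dict against the documented ranges.

    Missing keys are skipped (assumed optional). Returns a list of error strings.
    """
    errors = []
    bitrate = params.get("bitRateInBps")
    if bitrate is not None and not 0 <= bitrate <= 1000:
        errors.append("bitRateInBps must be in [0, 1000]")
    rate = params.get("sampleRateInHz")  # assumed spelling of the key
    if rate is not None and rate not in ALLOWED_SAMPLE_RATES:
        errors.append("sampleRateInHz must be one of the listed rates")
    channels = params.get("channels")
    if channels is not None and channels not in (1, 2):
        errors.append("channels must be 1 or 2")
    gain = params.get("gain")
    if gain is not None and not -60 <= gain <= 60:
        errors.append("gain must be in [-60, 60] dB")
    return errors


def gain_to_amplitude(gain_db):
    """Convert a gain in dB to a linear amplitude factor (20 dB = 10x)."""
    return 10 ** (gain_db / 20)


print(validate_audio({"bitRateInBps": 128, "channels": 2, "gain": -6}))
print(round(gain_to_amplitude(-6), 3))
```

A gain of -6 dB roughly halves the amplitude, which is why small negative values are a common choice for lowering background audio.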

Video editing

Function | Description
Video splicing | Up to 200 videos can be spliced into one
Video clipping | Set the start time and duration, in seconds or milliseconds
Watermark removal | Set the area to remove (x, y, width, height), or detect and remove the watermark automatically
Black edge removal | Set the effective picture area (x, y, width, height) that remains after the black edges are clipped
Gaussian blur | Fill with a dynamic Gaussian blur effect
Static/dynamic watermark overlay | Supported formats include jpg, png, apng, gif, webp, mov and mp4; the display position and start time can be set
Subtitle overlay | Supports srt subtitle files; the font, font size, display position and start time can be set
Audio overlay | Supported audio formats include mp3 and aac

Video screenshot

Function | Description
Specified screenshot | Capture modes driven by explicit settings:
  · manual: capture thumbnails at a fixed interval between the specified start and end times
  · split: capture the specified number of thumbnails between the specified start and end times
  · splitss0: same as split, but guarantees that the first frame is captured
Intelligent screenshot | Capture modes driven by content analysis:
  · auto: automatically capture frames with higher entropy
  · shot: automatically capture the frame at each scene change
  · idl: use the Baidu IDL (Institute of Deep Learning) AI thumbnail algorithm to capture a single thumbnail
  · highlight: automatically generate a 0.5 s highlight clip using an AI model; the captured duration can be configured
CSS Sprite | Specify the rows, columns and outer border width of the sprite image, as well as the spacing between sub-images
Image format | Supported output formats:
  · Static image formats: jpg, png
  · Animated image formats: gif, webp, mp4
  · Animated images support setting the frame rate and playback speed
Image width and height | Width and height each range over [10, 2000]; the width defaults to 600 and the height to 450. If the actual resolution of the video is lower than the target resolution, the output uses the actual resolution
Scaling policy | One of:
  · keep: maintain the aspect ratio of the original video
  · shrinkToFit: maintain the aspect ratio of the original video and add black edges
  · stretch: stretch the original video
Watermark removal | Specify the area to blur in order to remove the watermark (x, y, width, height)
Black edge clipping | Specify the effective picture area that remains after the black edges are removed (x, y, width, height); automatic detection of this area is also supported
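
To make the difference between the interval-based and count-based capture modes concrete, here is a hedged Python sketch that computes capture timestamps in integer milliseconds. The function names are hypothetical, and the exact placement of the count-based captures is not specified above: this sketch assumes evenly sized segments with a capture at the midpoint of each, and models the "first frame guaranteed" variant by capturing at the start of each segment instead.

```python
def manual_timestamps_ms(start_ms, end_ms, interval_ms):
    """Interval-based mode: one capture every interval_ms from start to end (inclusive)."""
    if interval_ms <= 0 or end_ms < start_ms:
        raise ValueError("need interval_ms > 0 and end_ms >= start_ms")
    count = (end_ms - start_ms) // interval_ms + 1
    return [start_ms + i * interval_ms for i in range(count)]


def split_timestamps_ms(start_ms, end_ms, number, first_frame=False):
    """Count-based mode: `number` captures spread evenly over [start_ms, end_ms].

    Assumption: the range is cut into `number` equal segments. By default the
    capture is the segment midpoint; with first_frame=True it is the segment
    start, so the very first frame of the range is always included.
    """
    if number <= 0 or end_ms < start_ms:
        raise ValueError("need number > 0 and end_ms >= start_ms")
    span = end_ms - start_ms
    if first_frame:
        return [start_ms + i * span // number for i in range(number)]
    return [start_ms + (2 * i + 1) * span // (2 * number) for i in range(number)]


print(manual_timestamps_ms(0, 10_000, 5_000))
print(split_timestamps_ms(0, 60_000, 3))
print(split_timestamps_ms(0, 60_000, 3, first_frame=True))
```

Working in integer milliseconds avoids the rounding drift that repeated floating-point additions of a fractional interval would introduce.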

Media information

Type | Description
File information | File size, duration, container format, file type and MD5 value
Video information | Coding standard, resolution (width/height), bitrate and frame rate
Audio information | Coding standard, channels, sampling rate and bitrate

Video encryption

Encryption mode | Description
fixed | Fixed-key encryption: the video is encrypted with a custom key, so aesKey is required in this mode
open | Open-key encryption: the system generates the encryption key automatically, but the key is public and there is no access control
playerBinding | The system generates the encryption key automatically and binds it to the player with access control; this mode offers high security and is recommended
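
The one rule that is easy to get wrong here is that only the fixed mode takes a caller-supplied key. A minimal Python sketch of that rule follows. The mode names and `aesKey` come from the table; the helper name `build_encryption`, the `strategy` field name and the shape of the returned dict are assumptions for illustration, and the expected format of the key itself is not specified in this section.

```python
from typing import Optional

ENCRYPTION_MODES = {"fixed", "open", "playerBinding"}


def build_encryption(mode: str, aes_key: Optional[str] = None) -> dict:
    """Build a hypothetical encryption config, enforcing the aesKey rule."""
    if mode not in ENCRYPTION_MODES:
        raise ValueError(f"unknown encryption mode: {mode}")
    if mode == "fixed":
        if not aes_key:
            raise ValueError("fixed mode requires aesKey")
        return {"strategy": mode, "aesKey": aes_key}
    if aes_key is not None:
        # In the other modes the system generates the key itself.
        raise ValueError("aesKey is only used with fixed mode")
    return {"strategy": mode}


print(build_encryption("playerBinding"))
```

Rejecting a stray key in the non-fixed modes is a design choice of this sketch: it surfaces a likely configuration mistake early instead of silently ignoring the key.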

Intelligent super-definition

Type | Description
Intelligent super-definition 1.0 | Analyzes picture complexity at the scene level and dynamically allocates optimized coding parameters. At the same picture quality the bitrate is lower, which saves bandwidth and storage costs.
Intelligent super-definition 2.0 | Targets the best subjective experience for the human eye: optimizes the color, brightness, contrast and saturation of the picture and enhances the regions that viewers focus on, improving picture quality while saving bitrate.
Old film restoration | Stabilizes the picture and removes scratches, noise, mosaic artifacts, etc. from old or over-compressed videos. This is a vertical application scenario of intelligent super-definition 2.0.
Intelligent frame interpolation | For ordinary videos at up to 30 frames per second (inclusive), generates a higher frame rate version at 60 or even 120 frames per second to make the picture smoother; it is generally used together with super resolution.
Super resolution | Uses a deep learning model to enhance image detail and reconstruct low-resolution video at a higher resolution, for example SD to HD or 2K to 4K.

Extreme transcoding

Extreme transcoding includes common extreme transcoding and intelligent extreme transcoding.

Common extreme transcoding uses AV separation technology to transcode the audio and video separately, which reduces merge time and can accelerate long-form video transcoding by up to 10 times.

Building on common extreme transcoding, intelligent extreme transcoding uses an AI model to predict the segmentation policy from the attributes of the input video and output template (codec, B-frames, frame rate, bitrate, resolution, etc.), and can make transcoding up to 50 times faster.
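
The general idea behind segment-based acceleration can be sketched in a few lines of Python. This is a deliberately simplified model, not a description of the service: it uses fixed-length segments where the service predicts the policy with an AI model, it assumes every segment costs the same to transcode, and it ignores splitting and merging overhead, so the result is only an upper bound. Both function names are hypothetical.

```python
import math


def plan_segments(duration_s, segment_s):
    """Cut [0, duration_s] into consecutive segments of at most segment_s seconds."""
    if duration_s <= 0 or segment_s <= 0:
        raise ValueError("duration_s and segment_s must be positive")
    count = math.ceil(duration_s / segment_s)
    return [(i * segment_s, min((i + 1) * segment_s, duration_s))
            for i in range(count)]


def ideal_speedup(segment_count, workers):
    """Upper bound on the speedup when equal-cost segments run on parallel workers."""
    if segment_count <= 0 or workers <= 0:
        raise ValueError("segment_count and workers must be positive")
    rounds = math.ceil(segment_count / workers)  # batches of parallel work
    return segment_count / rounds


segments = plan_segments(3600, 60)
print(len(segments), ideal_speedup(len(segments), 20))
```

In this model the speedup is capped by the worker count and drops when the segment count is not a multiple of it, which hints at why the choice of segmentation policy matters.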

BD265

The BD265 encoder uses more than 50 coding optimization algorithms and AI coding technology to provide higher picture quality at a lower bitrate and a faster speed.

BD265 reduces the computing cost of motion estimation, mode selection and rate-distortion optimization by using a variety of predefined prediction models, which greatly improves the coding speed. Compared with the open-source HEVC encoder x265, BD265 saves 30%-40% of the bitrate at the same subjective quality while encoding more than 2 times faster. BD265 combines the latest visual coding technology with a dynamic bitrate allocation algorithm, spending bits where human eyes are most sensitive in order to provide better subjective quality. The BD265 encoder supports a range of coding presets (fast, common and slow) and multiple rate-control configurations (crf, abr and two-pass), and can be deployed quickly in application scenarios such as LVB, VOD and short videos.
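
To put the quoted 30%-40% bitrate saving into concrete terms, the following Python sketch turns it into a bitrate range and an approximate storage size. It is plain arithmetic on the figures above; the 4000 kbps baseline and the one-hour duration are made-up example inputs, and the function names are hypothetical.

```python
def bd265_bitrate_range(x265_kbps):
    """Bitrate range (low, high) implied by a 40% and a 30% saving versus x265."""
    return x265_kbps * 60 / 100, x265_kbps * 70 / 100


def storage_gb(bitrate_kbps, duration_s):
    """Approximate size in GB (10^9 bytes) at a constant average bitrate."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1e9


baseline_kbps = 4000  # example x265 bitrate, not a measured value
low, high = bd265_bitrate_range(baseline_kbps)
print(f"x265:  {baseline_kbps} kbps, {storage_gb(baseline_kbps, 3600):.2f} GB per hour")
print(f"BD265: {low:.0f}-{high:.0f} kbps, "
      f"{storage_gb(low, 3600):.2f}-{storage_gb(high, 3600):.2f} GB per hour")
```

The same ratio applies to delivery bandwidth, since both storage and traffic scale linearly with the average bitrate.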
