Data Preprocessing

Last Updated: 2020-07-20

Create Preprocessing Rules

When a TSDB database stores a large amount of data, filtering it according to user-specified rules takes a long time, which can cause the request to time out and the query to fail.

In this case, you can configure data preprocessing to filter and aggregate the relevant data in advance, so that queries return results quickly.

Note:

Preprocessing rules cannot be created in a free database.

At most 2 preprocessing rules can be created in each paid database.

A preprocessing rule can have at most 5 tags.

A preprocessing rule can select at most 10 metrics, and only Select One or Select All is available.

A preprocessing rule can have at most 3 aggregators.

Preprocessing rules can be modified and reset, and resets must be more than one hour apart.
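The limits in the note above can be checked locally before a rule is submitted. The following is a minimal sketch only: `RuleDraft` and `check_limits` are hypothetical names invented for illustration and are not part of the TSDB console, SDK, or API. It also assumes that a metric selection is represented as a plain list of metric names.

```python
from dataclasses import dataclass

# Limits taken from the note above.
MAX_RULES_PER_PAID_DB = 2
MAX_TAGS = 5
MAX_METRICS = 10
MAX_AGGREGATORS = 3


@dataclass
class RuleDraft:
    """Hypothetical local model of a rule; not a TSDB API type."""
    name: str
    metrics: list
    tags: list
    aggregators: list


def check_limits(rule, existing_rule_count, is_free_database):
    """Return a list of limit violations; an empty list means the draft fits."""
    problems = []
    if is_free_database:
        problems.append("preprocessing rules cannot be created in a free database")
    elif existing_rule_count >= MAX_RULES_PER_PAID_DB:
        problems.append(f"database already has {MAX_RULES_PER_PAID_DB} rules")
    if len(rule.tags) > MAX_TAGS:
        problems.append(f"more than {MAX_TAGS} tags")
    if len(rule.metrics) > MAX_METRICS:
        problems.append(f"more than {MAX_METRICS} metrics")
    if len(rule.aggregators) > MAX_AGGREGATORS:
        problems.append(f"more than {MAX_AGGREGATORS} aggregators")
    return problems


draft = RuleDraft("cpu-5m", ["cpu"], ["host", "region"], ["Avg", "Max"])
print(check_limits(draft, existing_rule_count=1, is_free_database=False))
```

The function returns every violation at once rather than stopping at the first one, so all problems in a draft can be fixed in a single pass.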

Select Product Service > Time Series Database TSDB > Database Name to enter the database details page.
Click the Preprocessing tab to enter the configuration page.
Click Create Rules, and configure the preprocessing rule in the pop-up window.

The following configuration items are included:

Rule name: A name that identifies the rule.
Metric: Select the name of a metric that already exists in the current database from the drop-down list, or select all metrics.
Starting time: The baseline time from which the preprocessing operation starts processing the raw data. For example, if the starting baseline time of a preprocessing rule is 2000-01-01 00:10:00, the rule first aggregates all data points from 2000-01-01 00:10:00 up to the present (aligned to 5 hours).

While a rule is initializing, it has not yet taken effect and query requests do not hit the preprocessed data points. Once initialization is complete, the rule takes effect and matching query requests hit the preprocessed data points. After a rule is modified, it enters a reset state; once the reset is complete, query requests can hit it again.
Tag index: The tags used to filter the data. A preprocessing rule can have at most 5 tags.
Aggregator: The built-in function used to process the data. A preprocessing rule can have at most 3 aggregators. The currently supported aggregators are:
- Avg: The average of the values within each sampling time range
- Dev: The standard deviation of the values within each sampling time range
- Count: The number of points within each sampling time range
- First: The first value within each sampling time range
- Last: The last value within each sampling time range
- LeastSquares: A least-squares fit of the values within each sampling time range
- Max: The maximum value within each sampling time range
- Min: The minimum value within each sampling time range
- Percentile: The p-th percentile of the values within each sampling time range
- Sum: The sum of the values within each sampling time range
- Diff: The difference between each pair of adjacent values
- Div: Each value divided by a given divisor
- Scale: Each value multiplied by a given factor
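To illustrate what "within each sampling time range" means for the simpler aggregators listed above, here is a minimal sketch in Python. It is not the service's implementation: `bucket`, `aggregate`, and `AGGREGATORS` are hypothetical names, and the sketch assumes that sampling ranges are fixed-width windows aligned to multiples of the window length, which the service may do differently. Dev, Percentile, and LeastSquares are left out because their exact definitions (for example, population versus sample deviation) are not stated here.

```python
import statistics
from collections import defaultdict


def bucket(points, window_seconds):
    """Group (timestamp, value) pairs into fixed windows keyed by window start."""
    buckets = defaultdict(list)
    for ts, value in sorted(points):
        # Assumption: windows are aligned to multiples of window_seconds.
        buckets[ts - ts % window_seconds].append(value)
    return dict(buckets)


# Each aggregator reduces the values of one window to a single result.
AGGREGATORS = {
    "Avg": statistics.fmean,
    "Count": len,
    "First": lambda values: values[0],
    "Last": lambda values: values[-1],
    "Max": max,
    "Min": min,
    "Sum": sum,
}


def aggregate(points, window_seconds, name):
    """Apply the named aggregator to every sampling window."""
    fn = AGGREGATORS[name]
    return {
        start: fn(values)
        for start, values in bucket(points, window_seconds).items()
    }


points = [(0, 1.0), (30, 3.0), (60, 5.0), (90, 7.0)]
print(aggregate(points, 60, "Avg"))  # {0: 2.0, 60: 6.0}
```

With a 60-second window, the four raw points collapse into two preprocessed points, which is why a query against preprocessed data has far less to scan.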

View Preprocessing Details

After you create a preprocessing rule, you can view its preprocessing statistics in the rule details.

Select Product Service > Time Series Database TSDB > Database Name to enter the database details page.
Click the Preprocessing tab to enter the configuration page.
Find the preprocessing rule you created and click View Details to see the following information:

Cumulative Hits: The number of times the rule has been hit by queries made through the query panel or the API/SDK.
Cumulative Raw Points: The number of raw data points filtered out of the database based on the metrics and tags in the preprocessing rule.
Cumulative Rule Points: The number of data points obtained after the Cumulative Raw Points are aggregated.

For interpretation of other information, see Create Preprocessing Rules.

View Preprocessing Results

Users can view preprocessed data through the query panel, SDK, or API. For a query to hit the preprocessed data, the following conditions must be met:

The metrics in the query must be a subset of the metrics in the preprocessing rule (a preprocessing rule may specify all metrics or a single metric).
The filtering tags plus the grouping tags in the query must equal the tag index of the preprocessing rule.
The aggregator settings in the query must be consistent with those of the preprocessing rule.
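The three conditions above can be expressed as a small predicate. This is a sketch under stated assumptions, not the service's matching logic: `query_hits_rule` and the dictionary keys are hypothetical, `"*"` is used here to stand for Select All, and "consistent" aggregator settings is interpreted as "every aggregator in the query is configured in the rule", which is an assumption.

```python
def query_hits_rule(query, rule):
    """Return True if a query satisfies all three hit conditions."""
    # Condition 1: the query metric is covered by the rule
    # ("*" is this sketch's stand-in for Select All).
    metric_ok = rule["metrics"] == "*" or query["metric"] in rule["metrics"]

    # Condition 2: filtering tags plus grouping tags equal the tag index.
    query_tags = set(query["filter_tags"]) | set(query["group_by_tags"])
    tags_ok = query_tags == set(rule["tag_index"])

    # Condition 3 (assumed interpretation): every aggregator used by the
    # query is one of the aggregators configured in the rule.
    aggs_ok = set(query["aggregators"]) <= set(rule["aggregators"])

    return metric_ok and tags_ok and aggs_ok


rule = {
    "metrics": "*",
    "tag_index": ["host", "region"],
    "aggregators": ["Avg", "Max"],
}
query = {
    "metric": "cpu",
    "filter_tags": ["host"],
    "group_by_tags": ["region"],
    "aggregators": ["Avg"],
}
print(query_hits_rule(query, rule))  # True
```

Note that the tag condition is an equality, not a subset test: a query that filters only on `host` would miss a rule whose tag index is `host` and `region`.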

For details on generating a chart with the query panel, see Generate Chart.
