Merged
3 changes: 1 addition & 2 deletions src/.vuepress/sidebar/V2.0.x/en-Table.ts
@@ -38,8 +38,7 @@ export const enSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
- { text: 'Common Concepts', link: 'Cluster-Concept_apache' },
- { text: 'Timeseries Data Model', link: 'Navigating_Time_Series_Data_apache' },
+ { text: 'Basic Concepts', link: 'Common-Concepts_apache' },
{ text: 'Modeling Scheme Design', link: 'Data-Model-and-Terminology_apache' },
{ text: 'Data Type', link: 'Data-Type_apache' },
],
6 changes: 1 addition & 5 deletions src/.vuepress/sidebar/V2.0.x/en-Tree.ts
@@ -38,11 +38,7 @@ export const enSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
- { text: 'Common Concepts', link: 'Cluster-Concept_apache' },
- {
- text: 'Timeseries Data Model',
- link: 'Navigating_Time_Series_Data_apache',
- },
+ { text: 'Basic Concepts', link: 'Common-Concepts_apache' },
{
text: 'Modeling Scheme Design',
link: 'Data-Model-and-Terminology_apache',
3 changes: 1 addition & 2 deletions src/.vuepress/sidebar/V2.0.x/zh-Table.ts
@@ -38,8 +38,7 @@ export const zhSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
- { text: '常见概念', link: 'Cluster-Concept_apache' },
- { text: '时序数据模型', link: 'Navigating_Time_Series_Data_apache' },
+ { text: '基础概念', link: 'Common-Concepts_apache' },
{ text: '建模方案设计', link: 'Data-Model-and-Terminology_apache' },
{ text: '数据类型', link: 'Data-Type_apache' },
],
3 changes: 1 addition & 2 deletions src/.vuepress/sidebar/V2.0.x/zh-Tree.ts
@@ -38,8 +38,7 @@ export const zhSidebar = {
collapsible: true,
prefix: 'Background-knowledge/',
children: [
- { text: '常见概念', link: 'Cluster-Concept_apache' },
- { text: '时序数据模型', link: 'Navigating_Time_Series_Data_apache' },
+ { text: '基础概念', link: 'Common-Concepts_apache' },
{ text: '建模方案设计', link: 'Data-Model-and-Terminology_apache' },
{ text: '数据类型', link: 'Data-Type' },
],
@@ -0,0 +1,182 @@
# Basic Concepts

## 1. General Time Series Database Concepts

This section introduces basic concepts commonly used in time series databases, including time series data, time series, devices, timeseries or fields, data points, collection frequency, TTL, schema, encoding, and compression.

### 1.1 Time Series Data

In scenarios such as IoT, industrial production, energy and power, connected vehicles, and infrastructure monitoring, devices usually use sensors to continuously collect status data about themselves or their environment. For example, motors collect voltage and current, wind turbines collect blade speed, angular velocity, and power generation, vehicles collect longitude, latitude, speed, and fuel consumption, and bridges collect vibration frequency, deflection, and displacement.

![](/img/time-series-data-en-01.png)

The common feature of this type of data is that it is related to time: the same collection object continuously generates new records as time passes. Data that is continuously generated and recorded in chronological order is called time series data.

### 1.2 Time Series

In time series data scenarios, a collection point continuously generates data points over time. When these data points are arranged in ascending timestamp order, they form a time series. In table form, a time series can be represented as a data table made up of time and value. In graph form, a time series can be represented as a trend curve that changes over time, and can also be described figuratively as the "electrocardiogram" of a device.

![](/img/time-series-data-en-02.png)

### 1.3 Device

A device, also called an entity or equipment, is a real-world object or apparatus whose physical quantities are measured. It can be a physical device, a measurement apparatus, or a collection of sensors.

Common examples are as follows:

| Scenario | Device Example | Identifier Example |
| --- | --- | --- |
| Energy | Wind turbine | Region, station, line, model, instance, etc. |
| Factory | Robotic arm | Unique ID generated by an IoT platform |
| Connected vehicle | Vehicle | Vehicle identification number (VIN) |
| Monitoring | CPU | Equipment room, rack, hostname, device type, etc. |

### 1.4 Timeseries / Field

A timeseries or field, also called a physical quantity, time series, timeline, signal, metric, point, or measured value, is the measurement information recorded by a detection device in a real-world scenario. Usually, each timeseries or field corresponds to one collection point that periodically samples a physical quantity of the device or its environment. When the data points generated by a timeseries or field are arranged in ascending timestamp order, they form a time series.

Common examples are as follows:

| Scenario | Timeseries / Field Example |
| --- | --- |
| Energy and power | Current, voltage, wind speed, rotational speed |
| Connected vehicle | Fuel level, vehicle speed, longitude, latitude |
| Factory | Temperature, humidity |

### 1.5 Data Point

A data point consists of a timestamp and a value. The timestamp indicates when the data was generated, and the value indicates the collection result of the timeseries or field at that time. The value can be of various types, such as BOOLEAN, FLOAT, and INT32.

A row in a tabular time series, or a point in a trend chart, can be understood as a data point.

![](/img/time-series-data-en-03.png)

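The definitions of a data point and a time series can be sketched in a few lines of Python. This is only an illustration: the `DataPoint` class, the `to_time_series` helper, the millisecond timestamp unit, and the sample values are assumptions made for this example and are not part of IoTDB.

```python
from dataclasses import dataclass
from typing import List, Union

# A value may be of several types (for example BOOLEAN, FLOAT, INT32).
Value = Union[bool, int, float, str]


@dataclass(frozen=True)
class DataPoint:
    """One data point: a timestamp plus the value collected at that time."""
    timestamp_ms: int  # assumed unit: milliseconds since the epoch
    value: Value


def to_time_series(points: List[DataPoint]) -> List[DataPoint]:
    """Arrange data points in ascending timestamp order to form a time series."""
    return sorted(points, key=lambda p: p.timestamp_ms)


# Hypothetical temperature readings that arrived out of order.
raw = [
    DataPoint(1_700_000_002_000, 21.7),
    DataPoint(1_700_000_000_000, 21.5),
    DataPoint(1_700_000_001_000, 21.6),
]
series = to_time_series(raw)
for point in series:
    print(point.timestamp_ms, point.value)
```

Each printed line corresponds to one row of the tabular form described above.
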
### 1.6 Collection Frequency

Collection frequency is the number of data points a physical quantity generates per unit of time. For example, a temperature sensor that collects temperature data once per second has a collection frequency of 1 Hz.

The higher the collection frequency, the more data points are generated per unit of time, and the higher the requirements for write, storage, and query capabilities.

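To make the relationship between collection frequency and data volume concrete, the arithmetic can be sketched as follows. The function name `points_per_day` and the example workload of 100 series at 10 Hz are assumptions chosen for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def points_per_day(frequency_hz: float, series_count: int = 1) -> int:
    """Number of data points generated in one day at a fixed collection frequency."""
    return round(frequency_hz * SECONDS_PER_DAY * series_count)


print(points_per_day(1.0))                     # 86400
print(points_per_day(10.0, series_count=100))  # 86400000
```

A single 1 Hz sensor produces 86,400 points per day, while 100 series sampled at 10 Hz already produce 86.4 million points per day.
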
### 1.7 Data Retention Time (TTL)

TTL specifies the retention time of data. Data beyond the TTL will be automatically deleted.

A well-chosen TTL keeps disk usage under control, prevents failures such as disks becoming full, and helps maintain query performance and reduce memory usage.

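The TTL rule itself can be sketched as a simple filter. This shows only the retention rule, not how IoTDB implements expiry internally; the function `apply_ttl`, the millisecond unit, and the inclusive boundary are assumptions for this example.

```python
from typing import List, Tuple

Point = Tuple[int, float]  # (timestamp in ms, value); assumed representation


def apply_ttl(points: List[Point], ttl_ms: int, now_ms: int) -> List[Point]:
    """Keep only points that are no older than ttl_ms relative to now_ms."""
    cutoff = now_ms - ttl_ms
    # Assumption: a point exactly at the cutoff is still retained.
    return [(ts, value) for ts, value in points if ts >= cutoff]


points = [(6_000, 20.1), (7_000, 20.4), (9_500, 20.9)]
print(apply_ttl(points, ttl_ms=3_000, now_ms=10_000))
# [(7000, 20.4), (9500, 20.9)]
```
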
### 1.8 Schema

Schema is the data model information of a database and is used to describe the structure and definition of data. For time series data, schema usually includes devices, timeseries or fields, data types, and other information.

### 1.9 Encoding and Compression

Encoding is a compression technique that represents data in a compact binary form to improve storage efficiency. Compression is then applied to the encoded binary data to reduce its size further.

> For details about encoding and compression supported by IoTDB, see [Compression and Encoding](../Technical-Insider/Encoding-and-Compression.md).

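The two-stage idea (encode first, then compress) can be illustrated with a small sketch. Delta encoding and `zlib` are used here only as stand-ins that are easy to demonstrate; they are assumptions for this example and are not presented as the algorithms IoTDB uses.

```python
import struct
import zlib
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Store the first value, then only the difference to the previous value."""
    if not values:
        return []
    return [values[0]] + [cur - prev for prev, cur in zip(values, values[1:])]


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original values by accumulating the differences."""
    values = []
    current = 0
    for delta in deltas:
        current += delta
        values.append(current)
    return values


# Hypothetical timestamps collected once per second (in milliseconds).
timestamps = [1_700_000_000_000 + i * 1_000 for i in range(1_000)]

raw_bytes = struct.pack(f"<{len(timestamps)}q", *timestamps)   # 8 bytes per value
deltas = delta_encode(timestamps)                              # stage 1: encoding
encoded_bytes = struct.pack(f"<{len(deltas)}q", *deltas)
compressed = zlib.compress(encoded_bytes)                      # stage 2: compression

print("raw bytes:", len(raw_bytes))
print("encoded then compressed bytes:", len(compressed))
```

Because regularly sampled timestamps yield nearly identical deltas, the encoded stream is highly repetitive, which is what makes the subsequent compression step effective.
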
## 2. Common IoTDB Concepts

This section introduces common concepts in IoTDB data models, distributed architecture, and deployment. These concepts explain how IoTDB organizes, manages, and deploys time series data.

### 2.1 Data Model Concepts

#### 2.1.1 Data Model (sql_dialect)

IoTDB supports two data models: the tree model and the table model. Both models manage devices and timeseries as their core objects, but they organize data differently and use different syntax.

- Tree model: Manages data through hierarchical paths, where one path corresponds to one timeseries of one device.

- Table model: Manages data through relational tables. It is recommended that one table correspond to one type of device.

Both model spaces can exist in the same cluster instance. Different models use different syntax and database naming methods, and are not visible to each other by default.

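The difference between the two organization methods can be sketched with plain Python data structures. This is a conceptual illustration only: the path names, the `device` key, and the `tree_to_rows` helper are hypothetical, and the pivot shown here is not an IoTDB feature.

```python
from typing import Dict, List, Tuple

# Tree-model view: one hierarchical path per timeseries of a device.
tree_model: Dict[str, List[Tuple[int, float]]] = {
    "root.factory.line1.robot1.temperature": [(1000, 36.5), (2000, 36.7)],
    "root.factory.line1.robot1.humidity": [(1000, 40.0), (2000, 41.0)],
}


def tree_to_rows(tree: Dict[str, List[Tuple[int, float]]], device_path: str) -> List[dict]:
    """Pivot the per-path series of one device into table-style rows."""
    rows: Dict[int, dict] = {}
    prefix = device_path + "."
    for path, points in tree.items():
        if not path.startswith(prefix):
            continue
        field = path[len(prefix):]
        for ts, value in points:
            row = rows.setdefault(ts, {"time": ts, "device": device_path})
            row[field] = value
    return [rows[ts] for ts in sorted(rows)]


# Table-model view: one row per timestamp, one column per field.
for row in tree_to_rows(tree_model, "root.factory.line1.robot1"):
    print(row)
```
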
#### 2.1.2 Database

In the table model, a database is the upper-level organizational structure and can manage multiple types of devices and their tables. Before creating tables, writing data, or querying data, you usually need to create a database first.

#### 2.1.3 Table

In the table model, it is recommended that one table correspond to one type of device and be used to organize the time series data of that type of device. Devices of the same type usually have the same or similar sets of fields.

#### 2.1.4 Time Column, Tag Column, Attribute Column, and Field Column

Columns in the table model can be divided by purpose into time columns, tag columns, attribute columns, and field columns.

| Concept | Description |
| --- | --- |
| Time column (TIME) | Each table must contain one time column whose data type is TIMESTAMP |
| Tag column (TAG) | Identifies a device. The tag columns together can serve as the composite primary key of a device, and their values usually do not change over time |
| Attribute column (ATTRIBUTE) | Describes static attributes of a device. Its values do not vary with time, although they can be updated or added |
| Field column (FIELD) | Used to store field values collected by devices. Values change over time |

Filtering is usually most efficient on time columns and tag columns, followed by attribute columns, and least efficient on field columns.

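The four column categories can be modeled as follows. The table layout, the column names, and the helper functions are hypothetical, and the check reads "must contain one time column" as "exactly one", which is an assumption of this sketch.

```python
from enum import Enum
from typing import List, Tuple


class ColumnCategory(Enum):
    TIME = "TIME"
    TAG = "TAG"
    ATTRIBUTE = "ATTRIBUTE"
    FIELD = "FIELD"


Column = Tuple[str, str, ColumnCategory]  # (name, data type, category)

# Hypothetical table for one type of device.
columns: List[Column] = [
    ("time", "TIMESTAMP", ColumnCategory.TIME),
    ("region", "STRING", ColumnCategory.TAG),
    ("device_id", "STRING", ColumnCategory.TAG),
    ("model", "STRING", ColumnCategory.ATTRIBUTE),
    ("temperature", "FLOAT", ColumnCategory.FIELD),
]


def validate_columns(cols: List[Column]) -> None:
    """Check the time-column rule: one time column of type TIMESTAMP."""
    time_cols = [c for c in cols if c[2] is ColumnCategory.TIME]
    if len(time_cols) != 1:
        raise ValueError("a table needs one time column")
    if time_cols[0][1] != "TIMESTAMP":
        raise ValueError("the time column must be of type TIMESTAMP")


def device_key(row: dict, cols: List[Column]) -> tuple:
    """Combine the tag columns into the composite key that identifies a device."""
    return tuple(row[name] for name, _, cat in cols if cat is ColumnCategory.TAG)


validate_columns(columns)
row = {"time": 1000, "region": "north", "device_id": "d01", "model": "X1", "temperature": 21.5}
print(device_key(row, columns))  # ('north', 'd01')
```
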
### 2.2 Distributed Concepts

IoTDB supports cluster deployment. Common concepts in a cluster include nodes, Regions, and multiple replicas. A common cluster deployment mode is 3C3D, that is, 3 ConfigNodes and 3 DataNodes.

![](/img/Cluster-Concept03N.png)

#### 2.2.1 Node

An IoTDB cluster includes three types of nodes: ConfigNode, DataNode, and AINode.

- ConfigNode: Manages node information, configuration information, user permissions, schema, partition information, and other cluster information. It is responsible for scheduling distributed operations and load balancing. All ConfigNodes are full backups of each other.

- DataNode: Serves client requests and is responsible for data storage and computation.

- AINode: Provides machine learning capabilities. It supports registering trained machine learning models and invoking models for inference through SQL.

#### 2.2.2 Data Partition (Region)

In IoTDB, both schema and data are divided into smaller partitions, namely Regions, and are managed by DataNodes in the cluster.

- SchemaRegion: A schema partition used to manage the schema of some devices and timeseries or fields.

- DataRegion: A data partition used to manage the data of some devices within a period of time.

Regions with the same RegionID on different DataNodes are replicas of each other.

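The replica rule above can be sketched by grouping Regions by their RegionID. The placement below is invented for illustration and mirrors a three-DataNode cluster; the `replica_groups` helper is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical placement: DataNode ID -> Regions hosted on that node.
placement: Dict[int, List[Tuple[str, int]]] = {
    1: [("SchemaRegion", 0), ("DataRegion", 1), ("DataRegion", 2)],
    2: [("SchemaRegion", 0), ("DataRegion", 1)],
    3: [("SchemaRegion", 0), ("DataRegion", 2)],
}


def replica_groups(nodes: Dict[int, List[Tuple[str, int]]]) -> Dict[int, List[int]]:
    """Map each RegionID to the DataNodes that hold a replica of it."""
    groups = defaultdict(list)
    for node_id, regions in nodes.items():
        for _region_type, region_id in regions:
            groups[region_id].append(node_id)
    return {region_id: sorted(node_ids) for region_id, node_ids in groups.items()}


print(replica_groups(placement))
# {0: [1, 2, 3], 1: [1, 2], 2: [1, 3]}
```

In this made-up placement, the SchemaRegion has three replicas and each DataRegion has two.
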
#### 2.2.3 Multiple Replicas

The number of replicas for data and schema is configurable. Multiple replicas can provide high-availability services.

| Category | Configuration Item | Recommended Standalone Configuration | Recommended Cluster Configuration |
| --- | --- | --- | --- |
| Schema | schema_replication_factor | 1 | 3 |
| Data | data_replication_factor | 1 | 2 |

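As a rough sketch, the recommended values in the table can be expressed as data and rendered as `key=value` configuration lines. The `render_properties` and `stored_data_gb` helpers are hypothetical, and the storage estimate ignores encoding and compression, so it is only an upper-level approximation.

```python
RECOMMENDED = {
    "standalone": {"schema_replication_factor": 1, "data_replication_factor": 1},
    "cluster": {"schema_replication_factor": 3, "data_replication_factor": 2},
}


def render_properties(mode: str) -> str:
    """Render the recommended replication factors as key=value lines."""
    return "\n".join(f"{key}={value}" for key, value in RECOMMENDED[mode].items())


def stored_data_gb(raw_gb: float, mode: str) -> float:
    """Approximate data volume across all replicas (each replica is a full copy)."""
    return raw_gb * RECOMMENDED[mode]["data_replication_factor"]


print(render_properties("cluster"))
print(stored_data_gb(500.0, "cluster"))  # 1000.0
```
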
### 2.3 Deployment Concepts

IoTDB has two running modes: standalone mode and cluster mode.

#### 2.3.1 Standalone Mode

An IoTDB standalone instance includes 1 ConfigNode and 1 DataNode, that is, 1C1D.

- Features: Easy for developers to install and deploy, with low deployment and maintenance costs and convenient operations.

- Applicable scenarios: Scenarios with limited resources or low high-availability requirements, such as edge servers.

- Deployment method: [Standalone deployment](../Deployment-and-Maintenance/Stand-Alone-Deployment_apache.md).

#### 2.3.2 Cluster Mode

An IoTDB cluster instance consists of 3 ConfigNodes and at least 3 DataNodes; the typical setup has exactly 3 DataNodes, that is, 3C3D. When some nodes fail, the remaining nodes can still provide services externally, ensuring high availability of database services. Database performance can also be improved by adding nodes.

- Features: High availability and high scalability. System performance can be improved by adding DataNodes.

- Applicable scenarios: Enterprise application scenarios that require high availability and reliability.

- Deployment method: [Cluster deployment](../Deployment-and-Maintenance/Cluster-Deployment_apache.md).

#### 2.3.3 Feature Summary

| Dimension | Standalone Mode | Cluster Mode |
| --- | --- | --- |
| Applicable scenarios | Edge deployment; low high-availability requirements | High-availability services; disaster recovery scenarios, etc. |
| Required number of machines | 1 | >= 3 |
| Safety and reliability | Cannot tolerate a single point of failure | High; can tolerate a single point of failure |
| Scalability | Can scale DataNodes to improve performance | Can scale DataNodes to improve performance |
| Performance | Can scale with the number of DataNodes | Can scale with the number of DataNodes |

Standalone mode and cluster mode have similar deployment steps: ConfigNodes and DataNodes are added one by one. They differ only in the number of replicas and in the minimum number of nodes required to provide services.
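
The `1C1D` and `3C3D` shorthand used in this section can be parsed with a short helper. This is a minimal sketch: the function names and the `"other"` label for topologies not described in this section are assumptions for illustration.

```python
import re
from typing import Tuple


def parse_topology(shorthand: str) -> Tuple[int, int]:
    """Parse shorthand such as '3C3D' into (ConfigNode count, DataNode count)."""
    match = re.fullmatch(r"(\d+)C(\d+)D", shorthand)
    if match is None:
        raise ValueError(f"not a valid topology shorthand: {shorthand!r}")
    return int(match.group(1)), int(match.group(2))


def deployment_mode(shorthand: str) -> str:
    """Classify a topology using the definitions given in this section."""
    config_nodes, data_nodes = parse_topology(shorthand)
    if (config_nodes, data_nodes) == (1, 1):
        return "standalone"
    if config_nodes == 3 and data_nodes >= 3:
        return "cluster"
    return "other"  # not covered by the definitions in this section


print(deployment_mode("1C1D"))  # standalone
print(deployment_mode("3C3D"))  # cluster
print(deployment_mode("3C5D"))  # cluster
```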
@@ -25,7 +25,7 @@ This section introduces how to transform time series data application scenarios

## 1. Time Series Data Mode

- Before designing an IoTDB data mode, it's essential to understand time series data and its underlying structure. For more details, refer to: [Time Series Data Mode](../Background-knowledge/Navigating_Time_Series_Data_apache.md)
+ Before designing an IoTDB data mode, it's essential to understand time series data and its underlying structure. For more details, refer to: [Basic Concepts](../Background-knowledge/Common-Concepts_apache.md)

## 2. Tree-Table Twin Mode in IoTDB

2 changes: 1 addition & 1 deletion src/UserGuide/Master/Table/QuickStart/QuickStart_apache.md
@@ -45,7 +45,7 @@ This guide will assist you in quickly installing and deploying IoTDB. You can qu

1. Database Modeling Design: Database modeling is a crucial step in creating a database system, involving the design of data structures and relationships to ensure that the organization of data meets the needs of specific applications. The following documents will help you quickly understand IoTDB's modeling design:

- - Introduction to Time Series Concepts: [Navigating Time Series Data](../Background-knowledge/Navigating_Time_Series_Data_apache.md)
+ - Introduction to Time Series Concepts: [Basic Concepts](../Background-knowledge/Common-Concepts_apache.md)

- Introduction to Modeling Design:[Data Model and Terminology](../Background-knowledge/Data-Model-and-Terminology_apache.md)
