Data Source
Reminder:
Please read Access and Authentication REST API and understand how authentication works.
On Curl command line, don't forget to quote the URL if it contains the special char
&
.
Load Hive Table
Call this API to load Hive table metadata to Kylin. By default, when a new Hive table is added, the table metadata will not be loaded to Kylin.
-
POST http://host:port/kylin/api/tables
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
HTTP Body: JSON Object
-
project
-required
string
, project name -
need_sampling
-required
boolean
, whether to enable table sampling -
sampling_rows
-optional
integer
, indicates the max number of sampling rows and the range is [10,000 - 20,000,000] .Note: if you enable need_sampling, this parameter will be required.
-
databases
-optional
[string]
, load all the tables under this database -
tables
-optional
[string]
, load tables with the formatDB.TABLE
Note:
-
If the loaded table already exists in the system, it will be reloaded.
-
The above two parameters
databases
andtables
cannot be empty at the same time, which means you must use one of them to load tables.
-
-
-
Curl Request Example
curl -X POST \
'http://host:port/kylin/api/tables' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' \
-d '{"project":"ssb","tables":["SSB.LINEORDER"],"need_sampling":false}' -
Response Details
loaded
, successfully loaded tablesfailed
, failed to load tables
-
Response Example
{
"code": "000",
"data": {
"loaded":["SSB.LINEORDER"],
"failed":[]
},
"msg": ""
}
Prepare Reload Hive Table
Call this API to compare the Hive table metadata in Kylin and that in the data source. For Hive table already loaded to Kylin and already used in model and index building, if some columns are deleted, Kylin will return a failure when reading these columns. Use this API to find the metadata differences and evaluate whether to update the metadata in Kylin by reloading the Hive table.
-
GET http://host:port/kylin/api/tables/pre_reload
-
Introduced in: 5.0
-
URL Parameters
-
project
-required
string
,project name -
table
-required
string
, reload table with the formatDB.TABLE
-
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables/pre_reload?project=ssb&table=SSB.LINEORDER' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' -
Response Details
-
has_datasource_changed
, source table structure has changedhas_effected_jobs
, has unfinished jobs related to the tablehas_duplicated_columns
, has duplicated columnsadd_column_count
, number of new columnsremove_column_count
, number of reduce columnsdata_type_change_column_count
, number of column type changesbroken_model_count
, number of broken modelsremove_measures_count
, number of impact measuresremove_dimensions_count
, number of dimensions affectedremove_layouts_count
, number of deleted indexesadd_layouts_count
, increased number of indexesrefresh_layouts_count
, number of indexes refreshedsnapshot_deleted
, snapshot is deletedduplicated_columns
, duplicate column whose format is database.table.columneffected_jobs
,effected Job ID
-
Response Example
{
"code": "000",
"data": {
"has_datasource_changed": false,
"has_effected_jobs": true,
"has_duplicated_columns": true,
"add_column_count": 0,
"remove_column_count": 0,
"data_type_change_column_count": 0,
"broken_model_count": 0,
"remove_measures_count": 0,
"remove_dimensions_count": 0,
"remove_layouts_count": 0,
"add_layouts_count": 0,
"refresh_layouts_count": 0,
"snapshot_deleted": true,
"dumplicated_columns": ["SSB.LINEORDER.PROFIT", "SSB.LINEORDER.LO_DISCOUNT"],
"effected_jobs": ["266c9086-7ffe-44a1-9d5e-f9f9941b891d", "f42e5dd3-78e6-43f8-9bcb-edcb2c09312d"]
},
"msg": ""
}
Reload Hive Table
-
POST http://host:port/kylin/api/tables/reload
-
Request Permission: MANAGEMENT permission and above
-
Introduced in: 5.0
-
HTTP Body: JSON Object
-
project
-required
string
, project name -
table
-required
string
, specify the table, format:DB.TABLE -
need_sampling
-required
boolean
, whether to enable table sampling -
sampling_rows
-optional
integer
, indicates the max number of sampling rows and the range is [10,000 - 20,000,000]Note: if you enable need_sampling, this parameter will be required
-
need_building
-optional
boolean
, whether to build a new index,true
means to build,false
means not to build, default value isfalse
-
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
Curl Request Example
curl -X POST \
'http://host:port/kylin/api/tables/reload' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' \
-d '{"project":"ssb","table":"SSB.LINEORDER","need_sampling":false,"need_building":false}' -
Response Field
sampling_id
, ids of table sampling jobsjob_ids
, ids of the building jobs
-
Response Example
{
"code": "000",
"data": {
"sampling_id":"",
"job_ids":["1234","1234"]
},
"msg": ""
}
Prepare Unload Table
Call this API to evaluate the risks of unloading Hive table metadata. There are cases where you need to offline some Hive tables from Kylin. Use this API to evaluate the impact of unloading Hive table metadata on related Kylin models and jobs.
-
GET http://host:port/kylin/api/tables/{database}/{table}/prepare_unload
-
URL Parameters
-
database
-required
string
,database name of the table to be deleted -
table
-required
string
,table name to be deleted -
project
-required
string
,project name
-
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables/SSB/LINEORDER/prepare_unload?project=ssb' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' -
Response Field
has_job
, whether there are running jobs of sampling or building snapshot in the current tablehas_model
, is the current table used by the modelhas_snapshot
, does the current table have a snapshotstorage_size
, storage size of the current table snapshot (Byte)models
, model list
-
Response Example
{
"code": "000",
"data": {
"has_job": false,
"has_model": true,
"has_snapshot": true,
"storage_size": 16616,
"models": [
"model"
]
},
"msg": ""
}
Unload Table
Call this API to unload Hive table metadata from Kylin. After the API call, Hive table metadata will be unloaded from Kylin, and Kylin can no longer read the table data, or update the index data related to the table. It's recommended calling the "Prepare Unload Table" API before calling this API.
-
DELETE http://host:port/kylin/api/tables/{database}/{table}
-
URL Parameters
-
database
-required
string
, database name of the table to be deleted -
table
-required
string
, table name to be deleted -
project
-required
string
, project name -
cascade
-optional
boolean
, delete all. default value isfalse
- true: Delete this source table with the snapshot, attached Kafka/Hive table, the referenced models, and stop/delete related jobs.
- false: Only delete this source table with the snapshot, and stop related jobs. While the referenced models will be kept (BROKEN, can be fixed by reloading the table).
-
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
Curl Request Example
curl -X DELETE \
'http://host:port/kylin/api/tables/SSB/LINEORDER?project=ssb' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' -
Response Field
date
, deleted table name
-
Response Example
{
"code": "000",
"data": "SSB.LINEORDER",
"msg": ""
}
Table Sampling
Call this API to enable data sampling to reflect the characteristics of Hive table data.
-
POST http://host:port/kylin/api/tables/sampling_jobs
-
Request Permission: MANAGEMENT permission and above
-
Introduced in: 5.0
-
HTTP Body: JSON Object
project
-required
string
, project namequalified_table_name
-required
string
,specify the table,format:DB.TABLErows
-required
integer
, indicates the max number of sampling rows and the range is [10,000 - 20,000,000]priority
-optional
integer
, set job priority with range0-4
which indicates the priority from high to low. Default value is3
yarn_queue
-optional
string
, specify the YARN queue used by the job, it can be set after these two parameters were set: kylin.engine-yarn.queue.in.task.enabled (whether to allow set specified YARN queue for build task, default value is false), kylin.engine-yarn.queue.in.task.available (available YARN queues, separate them with English commas)tag
-optional
object
, job tag, if the field is set, when calling the Get Job List API, the field will be the same back when returning the job. It can be used for system integration, mark the job and deal with it accordingly. By default, the maximum size of value is 1024 KB , which can be set by the configure kylin.job.tag-max-size=1024.
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
Curl Request Example
curl -X POST \
'http://host:port/kylin/api/tables/sampling_jobs' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' \
-d '{"project":"ssb","qualified_table_name":"SSB.LINEORDER","rows":20000,"priority":0}' -
Response Example
{
"code":"000",
"data":"",
"msg":""
}
Get Partition Format of A Column
When a column is used as a partition column in a model in Kylin, get the partition format of the column.
-
GET http://host:port/kylin/api/tables/column_format
-
Request Permission: Operation permission and above
-
Introduced in: 5.0
-
Request Parameters
project
-required
string
, project nametable
-required
string
, table name, format as DB.TABLEcolumn_name
-required
string
, column name
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables/column_format?project=test&table=DEFAULT.KYLIN_SALES&column_name=PART_DT' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' -
Response Field
column_name
, column namecolumn_format
, column format
-
Response Example
{
"code": "000",
"data": {
"partition_column": "PART_DT",
"format": "yyyy-MM-dd"
},
"msg": ""
}
Get Table Information
Call this API to get the metadata of a specified Hive table.
-
GET http://host:port/kylin/api/tables
-
Request Permission: READ permission and above.
-
Introduced in: 5.0
-
Request Parameters
project
-required
string
, project namedatabase
-optional
string
, database name, case sensitivetable
-optional
string
, table name, case sensitiveis_fuzzy
-optional
boolean
, whether to enable fuzzy matching for table names,true
means to enable,false
means to close, default value isfalse
ext
-optional
boolean
, specify whether the table's extension information is returned,true
means to enable,false
means to close, default value istrue
page_offset
-optional
int
, offset of returned result,0
by defaultpage_size
-optional
int
, quantity of returned result per page,10
by default
-
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
-
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables?project=test&database=SSB&table=KYLIN_SALES' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' -
Response Example
{
"code": "000",
"data": {
"value": [
{
"uuid": "6e638305-1a44-42dc-a161-5e06338dcb14",
"last_modified": 1600335525521,
"create_time": 1600335525522,
"version": "4.0.0.0",
"mvcc": 0,
"name": "KYLIN_SALES",
"columns": [
{
"id": "1",
"name": "TRANS_ID",
"datatype": "bigint",
"cardinality": null,
"min_value": null,
"max_value": null,
"null_count": null
}
],
"source_type": 9,
"kafka_bootstrap_servers": null,
"subscribe": null,
"starting_offsets": null,
"table_type": "MANAGED",
"top": false,
"increment_loading": false,
"last_snapshot_path": null,
"database": "DEFAULT",
"exd": {
"owner": "root",
"create_time": "1524213799000",
"total_file_size": "0",
"hive_inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
"hive_outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
"location": "hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/kylin_sales",
"partition_column": "",
"total_file_number": "0",
"last_access_time": "0"
},
"root_fact": false,
"lookup": false,
"primary_key": [],
"foreign_key": [],
"partitioned_column": null,
"partitioned_column_format": null,
"segment_range": null,
"storage_size": -1,
"total_records": 0,
"sampling_rows": [],
"last_build_job_id": null
}
],
"offset": 0,
"limit": 10,
"total_size": 3
},
"msg": ""
}
Note: the
total_size
value is adjusted to the total number of all loaded tables in the project and no longer correlates to the actual table permissions the requesting user has in the project.