Data Source API
Reminder:
- Please read Access and Authentication REST API and understand how authentication works.
- On the curl command line, remember to quote the URL if it contains the special character &.
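As a rough illustration of the two reminders, the sketch below builds a Basic Authorization header and a query string in Python rather than on the curl command line. The helper names basic_auth_header and build_url are hypothetical, and the ADMIN/KYLIN credentials are an assumption taken from decoding the sample header used in the examples on this page; replace them with your own.

```python
import base64
from urllib.parse import urlencode


def basic_auth_header(user: str, password: str) -> str:
    """Return the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_url(base: str, path: str, params: dict) -> str:
    """Join base, path and URL-encoded query parameters.

    Encoding the parameters here avoids the shell-quoting problem with '&'
    that the reminder above describes for curl.
    """
    query = urlencode(params)
    return f"{base}{path}?{query}" if query else f"{base}{path}"


# Assumed sample credentials; they match the header shown in the curl examples.
AUTH = basic_auth_header("ADMIN", "KYLIN")
URL = build_url(
    "http://host:port",
    "/kylin/api/tables/pre_reload",
    {"project": "ssb", "table": "SSB.LINEORDER"},
)
```

The resulting strings can be passed to any HTTP client; nothing in this sketch sends a request.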
Load Hive Table
Call this API to load Hive table metadata to Kylin. By default, when a new Hive table is added, the table metadata will not be loaded to Kylin.
POST http://host:port/kylin/api/tables
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
HTTP Body: JSON Object
- project - required string, project name
- need_sampling - required boolean, whether to enable table sampling
- sampling_rows - optional integer, the maximum number of rows to sample, in the range [10,000 - 20,000,000]. Note: this parameter is required if need_sampling is enabled.
- databases - optional [string], load all the tables under these databases
- tables - optional [string], load the specified tables, in the format DB.TABLE
Note:
- If a loaded table already exists in the system, it will be reloaded.
- The two parameters databases and tables cannot both be empty, which means you must use at least one of them to load tables.
Curl Request Example
curl -X POST \
'http://host:port/kylin/api/tables' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' \
-d '{"project":"ssb","tables":["SSB.LINEORDER"],"need_sampling":false}'
Response Details
- loaded, tables that were loaded successfully
- failed, tables that failed to load
Response Example
{
"code": "000",
"data": {
"loaded":["SSB.LINEORDER"],
"failed":[]
},
"msg": ""
}
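The body rules for loading tables (sampling_rows is required when need_sampling is enabled, and databases and tables cannot both be empty) can be checked on the client before the request is sent. The following is a minimal sketch under those documented rules; build_load_body and failed_tables are hypothetical helper names, not part of Kylin.

```python
import json

MIN_ROWS, MAX_ROWS = 10_000, 20_000_000  # documented sampling_rows range


def build_load_body(project, need_sampling, databases=None, tables=None,
                    sampling_rows=None):
    """Build the JSON body for POST /kylin/api/tables, validating it first."""
    databases = list(databases or [])
    tables = list(tables or [])
    if not databases and not tables:
        raise ValueError("databases and tables cannot both be empty")
    body = {"project": project, "need_sampling": need_sampling}
    if need_sampling:
        if sampling_rows is None or not MIN_ROWS <= sampling_rows <= MAX_ROWS:
            raise ValueError("sampling_rows must be within [10,000 - 20,000,000]")
        body["sampling_rows"] = sampling_rows
    if databases:
        body["databases"] = databases
    if tables:
        body["tables"] = tables
    return body


def failed_tables(response_text):
    """Return the list of tables reported as failed in a response body."""
    return json.loads(response_text)["data"]["failed"]
```

A caller would serialize the returned dictionary with json.dumps and send it with the headers listed above.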
Prepare Reload Hive Table
Call this API to compare the Hive table metadata in Kylin with that in the data source. If columns are deleted from a Hive table that is already loaded to Kylin and used in models and index building, Kylin will fail when reading these columns. Use this API to find the metadata differences and evaluate whether to update the metadata in Kylin by reloading the Hive table.
GET http://host:port/kylin/api/tables/pre_reload
Introduced in: 5.0
URL Parameters
- project - required string, project name
- table - required string, the table to reload, in the format DB.TABLE
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables/pre_reload?project=ssb&table=SSB.LINEORDER' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8'
Response Details
- has_datasource_changed, whether the source table structure has changed
- has_effected_jobs, whether there are unfinished jobs related to the table
- has_duplicated_columns, whether there are duplicated columns
- add_column_count, number of added columns
- remove_column_count, number of removed columns
- data_type_change_column_count, number of columns whose data type changed
- broken_model_count, number of broken models
- remove_measures_count, number of affected measures
- remove_dimensions_count, number of affected dimensions
- remove_layouts_count, number of deleted indexes
- add_layouts_count, number of added indexes
- refresh_layouts_count, number of refreshed indexes
- snapshot_deleted, whether the snapshot is deleted
- duplicated_columns, duplicated columns, in the format database.table.column
- effected_jobs, IDs of the affected jobs
Response Example
{
"code": "000",
"data": {
"has_datasource_changed": false,
"has_effected_jobs": true,
"has_duplicated_columns": true,
"add_column_count": 0,
"remove_column_count": 0,
"data_type_change_column_count": 0,
"broken_model_count": 0,
"remove_measures_count": 0,
"remove_dimensions_count": 0,
"remove_layouts_count": 0,
"add_layouts_count": 0,
"refresh_layouts_count": 0,
"snapshot_deleted": true,
"duplicated_columns": ["SSB.LINEORDER.PROFIT", "SSB.LINEORDER.LO_DISCOUNT"],
"effected_jobs": ["266c9086-7ffe-44a1-9d5e-f9f9941b891d", "f42e5dd3-78e6-43f8-9bcb-edcb2c09312d"]
},
"msg": ""
}
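To decide whether a reload is worth the impact, a client can condense the data object of this response into a short summary. The sketch below is one possible reading, not Kylin's own logic: the function name summarize_pre_reload and the choice of which counters count as an impact are assumptions made for illustration.

```python
# Counters treated as "impact" in this sketch (an assumption, not a Kylin rule).
IMPACT_COUNTERS = (
    "broken_model_count",
    "remove_measures_count",
    "remove_dimensions_count",
    "remove_layouts_count",
)


def summarize_pre_reload(data: dict) -> dict:
    """Condense the 'data' object of a pre_reload response.

    Returns whether the source changed, the unfinished job IDs, and every
    impact counter that is greater than zero.
    """
    impacts = {key: data[key] for key in IMPACT_COUNTERS if data.get(key, 0) > 0}
    return {
        "changed": bool(data.get("has_datasource_changed")),
        "unfinished_jobs": list(data.get("effected_jobs", [])),
        "impacts": impacts,
    }
```

For the response example above, the summary reports no structural change, two unfinished jobs, and no impacted models, measures, dimensions, or indexes.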
Reload Hive Table
POST http://host:port/kylin/api/tables/reload
Request Permission: MANAGEMENT permission and above
Introduced in: 5.0
HTTP Body: JSON Object
- project - required string, project name
- table - required string, the table to reload, in the format DB.TABLE
- need_sampling - required boolean, whether to enable table sampling
- sampling_rows - optional integer, the maximum number of rows to sample, in the range [10,000 - 20,000,000]. Note: this parameter is required if need_sampling is enabled.
- need_building - optional boolean, whether to build new indexes: true means to build, false means not to build. The default value is false.
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
Curl Request Example
curl -X POST \
'http://host:port/kylin/api/tables/reload' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' \
-d '{"project":"ssb","table":"SSB.LINEORDER","need_sampling":false,"need_building":false}'
Response Field
- sampling_id, ID of the table sampling job
- job_ids, IDs of the building jobs
Response Example
{
"code": "000",
"data": {
"sampling_id":"",
"job_ids":["1234","1234"]
},
"msg": ""
}
Prepare Unload Table
Call this API to evaluate the risks of unloading Hive table metadata. When you need to take some Hive tables offline in Kylin, use this API to evaluate the impact of unloading their metadata on related Kylin models and jobs.
GET http://host:port/kylin/api/tables/{database}/{table}/prepare_unload
URL Parameters
- database - required string, database name of the table to be deleted
- table - required string, name of the table to be deleted
- project - required string, project name
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables/SSB/LINEORDER/prepare_unload?project=ssb' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8'
Response Field
- has_job, whether the table has running sampling or snapshot-building jobs
- has_model, whether the table is used by a model
- has_snapshot, whether the table has a snapshot
- storage_size, storage size of the table snapshot, in bytes
- models, model list
Response Example
{
"code": "000",
"data": {
"has_job": false,
"has_model": true,
"has_snapshot": true,
"storage_size": 16616,
"models": [
"model"
]
},
"msg": ""
}
Unload Table
Call this API to unload Hive table metadata from Kylin. After the call, Kylin can no longer read the table data or update the index data related to the table. We recommend calling the "Prepare Unload Table" API before calling this API.
DELETE http://host:port/kylin/api/tables/{database}/{table}
URL Parameters
- database - required string, database name of the table to be deleted
- table - required string, name of the table to be deleted
- project - required string, project name
- cascade - optional boolean, whether to delete everything related to the table. The default value is false.
  - true: Delete this source table together with its snapshot, the attached Kafka/Hive table, and the referenced models, and stop or delete related jobs.
  - false: Delete only this source table and its snapshot, and stop related jobs. The referenced models are kept in BROKEN status and can be fixed by reloading the table.
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
Curl Request Example
curl -X DELETE \
'http://host:port/kylin/api/tables/SSB/LINEORDER?project=ssb' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8'
Response Field
- data, name of the deleted table
Response Example
{
"code": "000",
"data": "SSB.LINEORDER",
"msg": ""
}
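Unlike the other endpoints, this one takes the database and table as separate path segments rather than as a single DB.TABLE value. The sketch below shows one way to derive the DELETE URL from a qualified table name; unload_url is a hypothetical helper, and sending cascade only when it is true is a choice made here because the documented default is false.

```python
from urllib.parse import quote, urlencode


def unload_url(base: str, qualified_name: str, project: str,
               cascade: bool = False) -> str:
    """Build the URL for DELETE /kylin/api/tables/{database}/{table}."""
    database, sep, table = qualified_name.partition(".")
    if not sep or not database or not table:
        raise ValueError("expected a table name in the format DB.TABLE")
    # Percent-encode each path segment so unusual names cannot alter the path.
    path = f"/kylin/api/tables/{quote(database, safe='')}/{quote(table, safe='')}"
    params = {"project": project}
    if cascade:
        params["cascade"] = "true"
    return f"{base}{path}?{urlencode(params)}"
```

Calling unload_url("http://host:port", "SSB.LINEORDER", "ssb") yields the same URL as the curl example above.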
Table Sampling
Call this API to run data sampling on a Hive table. The sampling results reflect the characteristics of the table data.
POST http://host:port/kylin/api/tables/sampling_jobs
Request Permission: MANAGEMENT permission and above
Introduced in: 5.0
HTTP Body: JSON Object
- project - required string, project name
- qualified_table_name - required string, the table to sample, in the format DB.TABLE
- rows - required integer, the maximum number of rows to sample, in the range [10,000 - 20,000,000]
- priority - optional integer, job priority in the range 0-4, from high to low. The default value is 3.
- yarn_queue - optional string, the YARN queue used by the job. It can be set only after these two parameters are configured: kylin.engine-yarn.queue.in.task.enabled (whether a YARN queue can be specified for a build task, default value is false) and kylin.engine-yarn.queue.in.task.available (the available YARN queues, separated by commas).
- tag - optional object, job tag. If this field is set, the Get Job List API returns the same value with the job. It can be used for system integration, to mark a job and handle it accordingly. By default, the maximum size of the value is 1024 KB, which can be changed with the configuration kylin.job.tag-max-size=1024.
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
Curl Request Example
curl -X POST \
'http://host:port/kylin/api/tables/sampling_jobs' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8' \
-d '{"project":"ssb","qualified_table_name":"SSB.LINEORDER","rows":20000,"priority":0}'
Response Example
{
"code":"000",
"data":"",
"msg":""
}
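The documented ranges for rows and priority can be enforced before a sampling job is submitted. The following is a small sketch of that check; build_sampling_body is a hypothetical name, and the optional yarn_queue and tag fields are left out for brevity.

```python
import json


def build_sampling_body(project: str, qualified_table_name: str, rows: int,
                        priority: int = 3) -> dict:
    """Build the JSON body for POST /kylin/api/tables/sampling_jobs."""
    if not 10_000 <= rows <= 20_000_000:
        raise ValueError("rows must be within [10,000 - 20,000,000]")
    if priority not in range(0, 5):
        raise ValueError("priority must be within 0-4")
    return {
        "project": project,
        "qualified_table_name": qualified_table_name,
        "rows": rows,
        "priority": priority,
    }


# Compact separators reproduce the body used in the curl example.
BODY = json.dumps(
    build_sampling_body("ssb", "SSB.LINEORDER", 20000, priority=0),
    separators=(",", ":"),
)
```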
Get Partition Format of A Column
Call this API to get the partition format of a column that is used as a partition column in a Kylin model.
GET http://host:port/kylin/api/tables/column_format
Request Permission: Operation permission and above
Introduced in: 5.0
Request Parameters
- project - required string, project name
- table - required string, table name, in the format DB.TABLE
- column_name - required string, column name
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables/column_format?project=test&table=DEFAULT.KYLIN_SALES&column_name=PART_DT' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8'
Response Field
- column_name, column name
- column_format, column format
Response Example
{
"code": "000",
"data": {
"partition_column": "PART_DT",
"format": "yyyy-MM-dd"
},
"msg": ""
}
Get Table Information
Call this API to get the metadata of a specified Hive table.
GET http://host:port/kylin/api/tables
Request Permission: READ permission and above.
Introduced in: 5.0
Request Parameters
- project - required string, project name
- database - optional string, database name, case sensitive
- table - optional string, table name, case sensitive
- is_fuzzy - optional boolean, whether to enable fuzzy matching for table names: true enables it, false disables it. The default value is false.
- ext - optional boolean, whether to return the table's extended information: true returns it, false does not. The default value is true.
- page_offset - optional int, offset of the returned results, 0 by default
- page_size - optional int, number of results returned per page, 10 by default
HTTP Header
Accept: application/vnd.apache.kylin-v4-public+json
Accept-Language: en
Content-Type: application/json;charset=utf-8
Curl Request Example
curl -X GET \
'http://host:port/kylin/api/tables?project=test&database=SSB&table=KYLIN_SALES' \
-H 'Accept: application/vnd.apache.kylin-v4-public+json' \
-H 'Accept-Language: en' \
-H 'Authorization: Basic QURNSU46S1lMSU4=' \
-H 'Content-Type: application/json;charset=utf-8'
Response Example
{
"code": "000",
"data": {
"value": [
{
"uuid": "6e638305-1a44-42dc-a161-5e06338dcb14",
"last_modified": 1600335525521,
"create_time": 1600335525522,
"version": "4.0.0.0",
"mvcc": 0,
"name": "KYLIN_SALES",
"columns": [
{
"id": "1",
"name": "TRANS_ID",
"datatype": "bigint",
"cardinality": null,
"min_value": null,
"max_value": null,
"null_count": null
}
],
"source_type": 9,
"kafka_bootstrap_servers": null,
"subscribe": null,
"starting_offsets": null,
"table_type": "MANAGED",
"top": false,
"increment_loading": false,
"last_snapshot_path": null,
"database": "DEFAULT",
"exd": {
"owner": "root",
"create_time": "1524213799000",
"total_file_size": "0",
"hive_inputFormat": "org.apache.hadoop.mapred.TextInputFormat",
"hive_outputFormat": "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat",
"location": "hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/kylin_sales",
"partition_column": "",
"total_file_number": "0",
"last_access_time": "0"
},
"root_fact": false,
"lookup": false,
"primary_key": [],
"foreign_key": [],
"partitioned_column": null,
"partitioned_column_format": null,
"segment_range": null,
"storage_size": -1,
"total_records": 0,
"sampling_rows": [],
"last_build_job_id": null
}
],
"offset": 0,
"limit": 10,
"total_size": 3
},
"msg": ""
}
Note: the total_size value is the total number of all loaded tables in the project and no longer reflects the table permissions that the requesting user actually has in the project.
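As a rough sketch of how a client might read this response, the helpers below list the qualified table names on a page and estimate the number of pages from total_size and limit. Both function names are hypothetical. Because total_size counts all loaded tables regardless of permissions, the page count is an upper bound on what a given user may actually see.

```python
import math


def qualified_names(data: dict) -> list:
    """Return DATABASE.NAME for every table on the current page."""
    return [f'{table["database"]}.{table["name"]}' for table in data["value"]]


def page_count(data: dict) -> int:
    """Estimate the number of pages from total_size and limit."""
    limit = data["limit"]
    return math.ceil(data["total_size"] / limit) if limit else 0
```

Both functions expect the data object of the response, not the whole response body.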