InfluxDB Maintenance
This chapter introduces the basic maintenance of InfluxDB.
Connectivity
When InfluxDB is not accessible, check the following:
- Check whether InfluxDB is running by executing service influxdb status. If it is not running, check the log files /var/log/influxdb/influxd.log or /var/log/messages to find the cause. Then run service influxdb restart to restart the InfluxDB service, and watch the logs to confirm that it starts normally. (You should be able to log in to InfluxDB with the influx -host ? -port ? command.)
- If the port turns out to be already in use during startup, run netstat -anp | grep influxdb_port to get the process ID, then run ps -ef | grep pid to identify the process. You can either kill that process if you do not need it, or change the InfluxDB server port to another one.
- If Kylin and InfluxDB are installed on different nodes, run telnet influxdb_ip influxdb_port on the Kylin node to check whether the two nodes can communicate. If they cannot, make sure the firewall is not running on the InfluxDB node by checking service iptables status, or ask the system administrator to check the network.
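The port-conflict check above can be scripted. The following is a minimal sketch, assuming the usual Linux netstat -anp column layout (local address in column 4, state in column 6, PID/program in column 7). The function name listener_pid and the port 8086 in the usage line are illustrative and are not taken from the steps above.

```shell
#!/bin/sh
# listener_pid PORT
# Reads `netstat -anp` output on stdin and prints the PID of each process
# listening on PORT. Assumes the Linux net-tools column layout.
listener_pid() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") {
            split($7, owner, "/")   # column 7 looks like "1234/influxd"
            print owner[1]
        }'
}

# Usage (8086 is a placeholder for influxdb_port; run as root to see PIDs):
#   netstat -anp 2>/dev/null | listener_pid 8086
```

The printed PID can then be passed to ps -ef | grep pid as described in the second step.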
Log Management
- Log Configuration
  - By default, InfluxDB writes its log to standard error, and stderr is redirected to the file /var/log/influxdb/influxd.log when InfluxDB starts. To change the log path, set STDERR=/path/to/influxdb.log in the configuration file /etc/default/influxdb, then restart the service with service influxdb restart.
  - InfluxDB enables the HTTP access log by default. This log is usually large. To disable it, set log-enabled = false in the [http] section of the configuration file.
- Log Clean: InfluxDB does not clean its own logs. It relies on logrotate, which is installed by default on Linux systems. The logrotate configuration file is /etc/logrotate.d/influxdb; the log rotates daily and is retained for 7 days.
Backup and Restore
InfluxDB supports backup and restore.
- Backup
    influxd backup -portable -database KYLIN_METRIC -host 127.0.0.1:8089 /path/to/backup
- Restore: Make sure that the target database does not already exist on the instance you restore to; a portable restore into an existing database fails.
    influxd restore -portable -database KYLIN_METRIC -host 127.0.0.1:8089 /path/to/backup
Note: Replace KYLIN_METRIC with the actual database name, 127.0.0.1:8089 with the actual IP and port, and /path/to/backup with the backup path you want to use.
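For recurring backups it can be convenient to write each backup into its own dated directory. The sketch below only prints the backup command so that it can be reviewed before it is run. The function name backup_cmd and the dated-subdirectory layout are assumptions made for illustration; the influxd flags are the ones used above.

```shell
#!/bin/sh
# backup_cmd DATABASE HOST:PORT BASE_DIR [DATE]
# Prints the influxd backup command for a dated subdirectory of BASE_DIR.
# DATE defaults to today's date in YYYY-MM-DD form.
backup_cmd() {
    db=$1
    host=$2
    base=$3
    day=${4:-$(date +%Y-%m-%d)}
    printf 'influxd backup -portable -database %s -host %s %s/%s\n' \
        "$db" "$host" "$base" "$day"
}

# Usage (values are the placeholders from the note above):
#   backup_cmd KYLIN_METRIC 127.0.0.1:8089 /path/to/backup
```

Printing the command instead of executing it keeps the sketch safe to try on a node where influxd is not installed.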
Monitoring and Diagnosis
- Memory Monitoring
  - Check the runtime: run the following command to check GC, memory usage, and so on.
      influx -database KYLIN_METRIC -execute "show stats for 'runtime'"
    Focus on these fields:
    - HeapAlloc: bytes of allocated heap objects
    - Sys: the total number of bytes of memory obtained from the system
    - NumGC: the number of completed GC cycles
    - PauseTotalNs: the total GC pause time, in nanoseconds
  - Check the memory usage of the InfluxDB index:
      show stats for 'indexes'
  - Monitor InfluxDB memory usage: run the following command:
      pidstat -rh -p PID 5
    If memory usage is too high or GC is too frequent, increase the memory.
    Tip: It is recommended to install InfluxDB on a separate machine with plenty of memory, because read and write speed depends on the indexes, and the indexes are stored in memory.
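The runtime statistics contain many columns, so it can help to print only the fields of interest. The following is a minimal sketch that assumes the influx CLI is run with -format csv and that the output has one header row followed by one data row. The helper name pick_fields is hypothetical.

```shell
#!/bin/sh
# pick_fields FIELD...
# Reads CSV with a header row on stdin and prints "FIELD=value" for each
# requested column that is present, using the first data row.
pick_fields() {
    awk -F, -v want="$*" '
        NR == 1 { for (i = 1; i <= NF; i++) col[$i] = i; next }
        NR == 2 {
            n = split(want, names, " ")
            for (k = 1; k <= n; k++)
                if (names[k] in col) print names[k] "=" $(col[names[k]])
        }'
}

# Usage:
#   influx -database KYLIN_METRIC -format csv \
#          -execute "show stats for 'runtime'" \
#     | pick_fields HeapAlloc Sys NumGC PauseTotalNs
```

Columns are looked up by header name rather than by position, so the helper does not depend on the order in which InfluxDB emits them.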
 
- Disk Monitoring: run the following command to check disk activity:
    pidstat -d -p PID 5
  If the disk read/write load is too high, consider placing the WAL directory and the data directory on different disks to reduce interference between read and write operations.
  - Run vi /etc/default/influxdb to edit the configuration file.
  - Modify the properties dir = "/var/lib/influxdb/data" and wal-dir = "/var/lib/influxdb/wal" in the [data] section so that the WAL directory and the data directory point to different disks.
  - Run 
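Before and after moving the directories, it can be useful to confirm whether the data directory and the WAL directory sit on the same filesystem. The sketch below uses POSIX df for this; the function names fs_of and same_fs are hypothetical. Note that two different filesystems can still be partitions of one physical disk, so this is only a first check.

```shell
#!/bin/sh
# fs_of DIR: prints the filesystem (device) that holds DIR, using POSIX df.
fs_of() {
    df -P "$1" 2>/dev/null | awk 'NR == 2 { print $1 }'
}

# same_fs DIR_A DIR_B: succeeds only when both directories exist and
# are reported by df as living on the same filesystem.
same_fs() {
    a=$(fs_of "$1")
    b=$(fs_of "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Usage with the default paths from the [data] section:
#   if same_fs /var/lib/influxdb/data /var/lib/influxdb/wal; then
#       echo "data and WAL share a filesystem"
#   fi
```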
- Read/Write Response Time: the following queries report the average time per request, in milliseconds, for each hour of the last 10 days.
  - Write:
      SELECT non_negative_derivative(percentile("writeReqDurationNs", 99)) / non_negative_derivative(max("writeReq")) / (1000 * 1000) AS "Write Request"
      FROM "_internal".."httpd"
      WHERE time > now() - 10d
      GROUP BY time(1h) fill(0)
  - Read:
      SELECT non_negative_derivative(percentile("queryReqDurationNs", 99)) / non_negative_derivative(max("queryReq")) / (1000 * 1000) AS "Query Request"
      FROM "_internal".."httpd"
      WHERE time > now() - 10d
      GROUP BY time(1h)
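The write and read queries differ only in the field names and the alias, so both can be generated from one template and passed to the influx CLI. This is a sketch under two assumptions: the helper name latency_query is hypothetical, and fill(0) is applied to both queries, whereas the read query above omits it.

```shell
#!/bin/sh
# latency_query DURATION_FIELD COUNT_FIELD ALIAS
# Prints the InfluxQL statement for the given httpd fields on one line.
latency_query() {
    printf 'SELECT non_negative_derivative(percentile("%s", 99)) / non_negative_derivative(max("%s")) / (1000 * 1000) AS "%s" FROM "_internal".."httpd" WHERE time > now() - 10d GROUP BY time(1h) fill(0)\n' \
        "$1" "$2" "$3"
}

# Usage:
#   influx -execute "$(latency_query writeReqDurationNs writeReq 'Write Request')"
#   influx -execute "$(latency_query queryReqDurationNs queryReq 'Query Request')"
```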