While reviewing the code, we found that Kylin leaves many garbage files in:

  • Local file system of the CLI
  • HDFS
  • Local file system of the Hadoop nodes

A ticket was opened to track this issue:
https://issues.apache.org/jira/browse/KYLIN-926

For future development, please:

  • Whenever you want to create temp files on the local file system, use File.createTempFile or the folder BatchConstants.CFG_KYLIN_LOCAL_TEMP_DIR (/tmp/kylin). Do not pick an arbitrary folder under /tmp; that ends up a mess and looks unprofessional.
  • Whenever you create temp files on the local file system, remember to delete them after use. It’s best to use FileUtils.forceDelete, as it also works for deleting folders. Try to avoid deleteOnExit, because the files are not cleaned up if Kylin exits abnormally.
  • Whenever you want to create files in HDFS, try to create them under kylin.hdfs.working.dir or BatchConstants.CFG_KYLIN_HDFS_TEMP_DIR, and remember to delete them once they are no longer needed. Try to avoid throwing everything into hdfs:///tmp and leaving it there as garbage.
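
The local temp file guidelines above could look roughly like the sketch below. It is only an illustration, not Kylin code: the class name, the TEMP_DIR constant (a stand-in for BatchConstants.CFG_KYLIN_LOCAL_TEMP_DIR) and both method names are hypothetical, and the small forceDelete helper uses only the JDK so that the example is self-contained. In Kylin itself FileUtils.forceDelete would play a similar role.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LocalTempFileSketch {

    // Hypothetical stand-in for BatchConstants.CFG_KYLIN_LOCAL_TEMP_DIR.
    static final Path TEMP_DIR = Paths.get(System.getProperty("java.io.tmpdir"), "kylin");

    // Deletes a single file or a whole directory tree (JDK-only helper,
    // assumed here in place of a library call such as FileUtils.forceDelete).
    static void forceDelete(Path target) throws IOException {
        if (!Files.exists(target)) {
            return;
        }
        List<Path> paths;
        try (Stream<Path> walk = Files.walk(target)) {
            // Reverse order so that children come before their parent folders.
            paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : paths) {
            Files.delete(p);
        }
    }

    // Writes the content to a temp file under TEMP_DIR, reads back its size,
    // and always deletes the file, even if the work in between fails.
    static int countBytesViaTempFile(String content) throws IOException {
        Files.createDirectories(TEMP_DIR);
        File tmp = File.createTempFile("kylin-", ".tmp", TEMP_DIR.toFile());
        try {
            Files.write(tmp.toPath(), content.getBytes(StandardCharsets.UTF_8));
            return (int) Files.size(tmp.toPath());
        } finally {
            // Explicit cleanup instead of deleteOnExit.
            forceDelete(tmp.toPath());
        }
    }
}
```

The try/finally block is the important part: the temp file is removed as soon as the work is done, whether or not an exception was thrown, so cleanup does not depend on a clean JVM shutdown.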