Troubleshooting: Out-of-storage issue with NiFi

The NiFi Content repository archives all FlowFile content in the system. Archiving exists so that users can view or replay that content from the provenance user interface, for troubleshooting purposes only.

Problem

The NiFi Content repository archives all FlowFile content in the system so that users can view or replay it from the provenance user interface.

These archives are kept in the same directory as the Content repository. When NiFi handles a large amount of data, the Content repository can fill up the disk, and if the FlowFile repository is on that same disk, it runs into out-of-storage issues as well. As archived data keeps piling up, it can also degrade performance.

To keep archive growth under control, the nifi.properties file contains properties that govern content archiving in the NiFi Content repository.

Solution

nifi.content.repository.archive.max.retention.period
It specifies how long NiFi keeps archived content before clearing it from the content archive directory. In production, set it short enough that storage does not run out.
nifi.content.repository.archive.max.usage.percentage
It tells NiFi when to start clearing archived content so that overall disk usage stays at or below the configured percentage.
For example, if max.usage.percentage = 40%, NiFi starts removing the oldest archived content once disk usage crosses the 40% mark.
nifi.content.repository.archive.enabled
Set this to true to enable content archiving, which lets the provenance user interface view or replay content that is no longer in a dataflow queue.
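
As a rough sketch, the three properties above might appear together in nifi.properties as follows. The values shown are illustrative assumptions only, not recommendations; choose them based on your disk size and data volume.

```properties
# Illustrative values only (assumed for this example); tune for your environment.

# Keep archived content for at most 12 hours before it becomes eligible for removal.
nifi.content.repository.archive.max.retention.period=12 hours

# Start removing the oldest archived content once disk usage exceeds 50%.
nifi.content.repository.archive.max.usage.percentage=50%

# Keep archiving on so content can still be viewed or replayed from provenance.
nifi.content.repository.archive.enabled=true
```

Changes to nifi.properties are read at startup, so NiFi needs a restart before new values take effect.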

Many other properties are available for tuning the file system Content repository. For a detailed description, refer to the NiFi System Administrator's Guide.