Extract, Transform, Load (ETL)
The Extract, Transform, Load (ETL) is a process in the database usage that combines three database functions that transfer data from one database to another. The first stage, Extract, involves reading and extracting data from various source systems. The second stage, Transform, converts the data from its original format into the format that meets the requirements of the target database. The last stage, Load, saves the new data into the target database, thus finishing the process of transferring the data.
In BigFix Inventory, the Extract stage involves extracting data from the BigFix server. The data includes information about the infrastructure, installed agents, and detected software. ETL also checks whether a new software catalog is available, gathers information about the software scan and files that are present on the endpoints, and collects data from VM managers.
- A large number of small files is retrieved from the BigFix server (Extract).
- Many small and medium files that contain information about installed software packages and process usage data are parsed (Transform).
- The database is populated with the parsed data (Load).
Performance of the ETL process depends on the number of scan files, usage analyses, and package analyses that are processed during a single import. The main bottleneck is storage performance because many small files must be read, processed, and written to the BigFix Inventory database in a short time. By properly scheduling scans and distributing them over the computers in your infrastructure, you can reduce the length of the ETL process and improve its performance.
An important factor that influences the duration of the ETL process is the amount of updates on the file system since the last scan. Such operations as security updates or significant system upgrades can cause ETL to run longer, because it has to process information about all modified files. For example, regular updates released by Microsoft on Tuesdays would significantly lengthen the Wednesday import in environments with many Windows platforms.