Monitoring for capacity
In general, the best metric to monitor for capacity is the Traveler server Availability Index (AI). The AI is a measure of the overall server capacity.
The AI considers the current CPU usage, memory usage, and number of threads in use. An AI of 90 or more implies the server is running under capacity and can handle additional load. An AI of 50-90 implies the server is at or near capacity, and additional load may result in performance issues. An AI of less than 50 implies the server is over capacity, and end users may experience slow sync speeds. For an optimally running server, an AI in the 90s is best.
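The AI ranges above can be expressed as a small helper for use in a monitoring script. This is a minimal sketch: the function name and the returned labels are hypothetical, and because the text lists 90 in both the "90 or more" and "50-90" ranges, the sketch assumes a value of exactly 90 counts as under capacity.

```python
def classify_availability_index(ai: float) -> str:
    """Map a Traveler Availability Index value to a capacity label.

    Thresholds come from the ranges described above; the labels are
    illustrative and not part of any Traveler API.
    """
    if ai >= 90:
        return "under capacity"   # room for additional load
    if ai >= 50:
        return "at capacity"      # additional load may cause performance issues
    return "over capacity"        # users may see slow sync speeds


if __name__ == "__main__":
    for value in (95, 72, 40):
        print(value, "->", classify_availability_index(value))
```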
To monitor for capacity yourself, the main statistics to review are the CPU and memory usage of the Traveler server and the CPU, memory, and disk I/O usage of the server hosting the database. For a standalone system, the database is hosted by the Traveler server. For an HA environment, the database is hosted by an enterprise database management system (DBMS). When monitoring disk I/O, monitor both the physical volume that hosts the actual database and the physical volume that hosts the transaction log. On a VM it is sometimes hard to know which volume to monitor; in this case, enlist the help of your VM team.
In general, steady CPU usage of up to 25% is optimal. CPU spikes to 75% or more may be normal, but consistent CPU usage of 75% or more indicates the server is overloaded, and either additional CPUs should be added or usage should be reduced. This applies to both the Traveler server and the enterprise database server.
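To distinguish occasional spikes from consistent high usage, a monitoring script can look at a window of CPU samples rather than a single reading. The sketch below is illustrative only: the text does not define "consistent", so the sustained_fraction parameter (80% of samples at or above 75%) is an assumption, as are the function name and labels.

```python
from typing import Sequence


def assess_cpu(samples: Sequence[float],
               high: float = 75.0,
               optimal: float = 25.0,
               sustained_fraction: float = 0.8) -> str:
    """Classify a window of CPU usage samples (percent, 0-100).

    sustained_fraction is a hypothetical definition of "consistent":
    the share of samples that must be at or above `high`.
    """
    if not samples:
        raise ValueError("at least one CPU sample is required")

    high_share = sum(1 for s in samples if s >= high) / len(samples)
    if high_share >= sustained_fraction:
        return "overloaded"       # add CPUs or reduce usage

    average = sum(samples) / len(samples)
    if average <= optimal:
        return "optimal"          # isolated spikes are tolerated
    return "elevated"             # above optimal, but not sustained at 75%+
```

For example, a window such as [10, 20, 80, 5, 5] contains one spike but averages 24%, so it is reported as optimal, while [80, 85, 90, 78, 76] is reported as overloaded.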
By default, the Traveler server uses 25% of the installed physical memory of a system, which leaves 75% for the CPU and other applications. This is optimal and in general should not be changed. Changing it may starve other applications, especially the Domino HTTP server, which is responsible for encrypting and decrypting all HTTPS traffic to and from the Traveler server. If memory usage is running at or above the available physical memory, add memory to the system and Traveler will automatically use 25% of the additional memory.
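As a quick worked example of the default 25% allocation, the arithmetic can be sketched as follows. The function name is hypothetical and the figures are illustrative; the only value taken from the text is the 25% default share.

```python
from typing import Tuple


def default_memory_split_gb(physical_gb: float) -> Tuple[float, float]:
    """Return (traveler_gb, remaining_gb) under the default 25% allocation."""
    traveler_gb = physical_gb * 0.25
    return traveler_gb, physical_gb - traveler_gb


if __name__ == "__main__":
    # A 16 GB server: Traveler uses 4 GB, 12 GB remains for everything else.
    print(default_memory_split_gb(16))
    # After adding 16 GB more (32 GB total), Traveler's share grows to 8 GB.
    print(default_memory_split_gb(32))
```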
The enterprise database server should not run out of physical memory. Much of the data stored by the enterprise database is in sorted order or needs to be sorted before the results are returned to the Traveler application. Sorting requires memory. If the database server runs out of physical memory and has to use virtual memory for things like sort buffers and/or runtime cache, database performance suffers as the DBMS begins swapping virtual memory to and from disk. Virtual memory use also increases disk I/O, which is a critical performance metric for any DBMS. If the DBMS system is running at or near 100% of physical memory, add memory to it.
Finally, disk I/O is a critical metric for the physical volume where the database is stored. Due to the nature of the mobile client base, Traveler servers generate a high volume of small read and write database transactions. These transactions are generally measured in tens or hundreds of milliseconds and, due to their short duration, concurrency is usually low. In practice, we have found disk I/O to be the number one factor affecting the performance of the database server, whether it is a standalone environment or an enterprise DBMS. It is important to plan accordingly and monitor disk I/O for the database disk volumes. The main causes of poor disk I/O are using spindle-based disk technology and/or sharing physical disk volumes with other servers and/or VMs. When using VMs, it can be hard to know whether the underlying disk volume is shared with other VMs. For a standalone environment, you can monitor the Domino platform statistics to see if the underlying disk architecture is sufficient for the current load. Check the statistic Platform.LogicalDisk.n.AvgQueLen.Avg (where n corresponds to the volume where the Domino data directory is located). The value should be 0.2 or lower.
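The queue-length check can be automated by scanning captured statistic output for values above 0.2. The sketch below assumes the statistics have been saved as text in "name = value" form, one per line, with a period as the decimal separator; that layout, the function name, and the sample values are assumptions for illustration, so adjust the pattern to match the output in your environment.

```python
import re
from typing import Dict

# Assumed line layout: "Platform.LogicalDisk.<n>.AvgQueLen.Avg = <value>"
_STAT_LINE = re.compile(
    r"Platform\.LogicalDisk\.(\d+)\.AvgQueLen\.Avg\s*=\s*([0-9]+(?:\.[0-9]+)?)"
)


def find_busy_volumes(stat_output: str, limit: float = 0.2) -> Dict[int, float]:
    """Return {volume_number: avg_queue_length} for volumes above `limit`."""
    busy: Dict[int, float] = {}
    for match in _STAT_LINE.finditer(stat_output):
        volume = int(match.group(1))
        value = float(match.group(2))
        if value > limit:
            busy[volume] = value
    return busy


if __name__ == "__main__":
    # Hypothetical sample values, not real measurements.
    sample = (
        "  Platform.LogicalDisk.1.AvgQueLen.Avg = 0.05\n"
        "  Platform.LogicalDisk.2.AvgQueLen.Avg = 0.73\n"
    )
    print(find_busy_volumes(sample))
```

Only the volume that holds the Domino data directory matters for this check, so compare the reported volume number against that volume before acting on the result.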
When planning for HCL Traveler, it is important that your VM team be aware that the database server requires a dedicated solid state disk (SSD) volume to prevent any disk I/O contention that may impact performance. In addition, if you are planning for a large environment and/or planning to enable HADR for your DBMS, it is important to use a dedicated physical disk volume for the transaction log that is separate from the physical volume used to host the database itself. Following these simple best practices will prevent many problems later.
Note that in some cases tuning for performance can increase the overall capacity of a Traveler environment and allow for larger deployments on the same hardware. For more information, see Tuning performance of the server.