Type
|
Instance
|
Metric
|
Description
|
Units
|
Expected Values
|
Role
|
Disk Usage
|
boot
|
used
|
Used space on partition /boot
|
Bytes
|
<99%
|
All
|
reserved
|
Space on /boot partition reserved for root user.
|
Bytes
|
|
free
|
Free space on partition /boot
|
Bytes
|
|
opt
|
used
|
Used space on partition /opt
|
Bytes
|
<99%
|
All
|
reserved
|
Space on /opt partition reserved for root user.
|
Bytes
|
|
free
|
Free Space on /opt partition.
|
Bytes
|
|
root
|
used
|
Used space on partition /
|
Bytes
|
<99%
|
|
reserved
|
Space on / partition reserved for root user.
|
Bytes
|
|
free
|
Free space on / partition.
|
Bytes
|
|
Disk
|
sdb
|
disk_merged read
|
The number of read operations that could be merged into other, already queued operations, i.e. one physical disk access served two or more logical operations.
|
Merged Operations/sec
|
|
Director-Web Message Queue, Search Store, Analytics Store, JSON Store, RDBMS Store, Index Store
|
disk_merged write
|
The number of write operations that could be merged into other, already queued operations, i.e. one physical disk access served two or more logical operations.
|
Merged Operations/sec
|
|
disk_octets read
|
Bytes read from disk per second
|
Bytes/sec
|
|
disk_octets write
|
Bytes written to disk per second
|
Bytes/sec
|
|
disk_ops read
|
Read operations from disk per second
|
Operations/sec
|
|
disk_ops write
|
Write operations to disk per second
|
Operations/sec
|
|
disk_time read
|
Average time an I/O read operation took to complete, comparable to the svctm column of iostat
|
Sec
|
|
disk_time write
|
Average time an I/O write operation took to complete, comparable to the svctm column of iostat
|
Sec
|
|
Interface
|
eth0
|
if_errors rx
|
Rate of errors while receiving data on the network interface.
|
Errors/sec
|
|
All
|
if_errors tx
|
Rate of errors while transmitting data on the network interface.
|
Errors/sec
|
|
if_octets rx
|
Rate of Bytes received by network interface.
|
Bytes/sec
|
|
if_octets tx
|
Rate of bytes transmitted by network interface.
|
Bytes/sec
|
|
if_packets rx
|
Rate of packets received by network interface
|
Packets/sec
|
|
if_packets tx
|
Rate of packets transmitted by network interface
|
Packets/sec
|
|
Load
|
|
longterm
|
Average system load over 15 min period of time.
|
Average number of runnable tasks in the run-queue (15 min)
|
|
All
|
midterm
|
Average system load over 5 min period of time.
|
Average number of runnable tasks in the run-queue (5 min)
|
|
shortterm
|
Average system load over 1 min period of time. Refer to the top, w, or uptime man pages for more details.
|
Average number of runnable tasks in the run-queue (1 min)
|
|
Swap
|
swap
|
cached
|
Memory that was once swapped out and has been swapped back in, but is still also present in the swap file. If memory is needed again, these pages do not need to be swapped out again because they are already in the swap file, which saves I/O. (http://www.redhat.com/advice/tips/meminfo.html/)
|
Bytes
|
|
All
|
free
|
Total amount of swap space available.
|
Bytes
|
|
|
used
|
Total amount of swap space used
|
Bytes
|
|
|
swap_io
|
in
|
Amount of memory swapped in from disk
|
Kilobytes the system has swapped in from disk per second
|
|
All
|
out
|
Amount of memory swapped out to disk
|
Kilobytes the system has swapped out to disk per second
|
|
VMware
|
CPU
|
elapsed_ms
|
Retrieves the number of milliseconds that have passed in the virtual machine since it last started running on the server. The count of elapsed time restarts each time the virtual machine is powered on, resumed, or migrated using VMotion.
|
Milliseconds
|
|
All
|
limit_mhz
|
Retrieves the upper limit of processor use in MHz available to the virtual machine.
|
|
|
reservation_mhz
|
Retrieves the minimum processing power in MHz reserved for the virtual machine.
|
|
|
shares
|
Retrieves the number of CPU shares allocated to the virtual machine.
|
|
|
stolen_ms
|
Retrieves the number of milliseconds that the virtual machine was in a ready state (able to transition to a run state), but was not scheduled to run
|
Milliseconds
|
|
used_ms
|
Retrieves the number of milliseconds during which the virtual machine has used the CPU. This value includes the time used by the guest operating system and the time used by virtualization code for tasks for this virtual machine. The percentage of CPU utilization is used_ms*number_of_core/elapsed_ms
|
Milliseconds
|
|
Memory
|
active_mb
|
Retrieves the amount of memory the virtual machine is actively using—its estimated working set size
|
MegaBytes
|
|
All
|
balooned_mb
|
Retrieves the amount of memory that has been reclaimed from this virtual machine by the vSphere memory balloon driver (also referred to as the vmmemctl driver)
|
MegaBytes
|
|
limit_mb
|
Retrieves the upper limit of memory that is available to the virtual machine.
|
MegaBytes
|
|
mapped_mb
|
Retrieves the amount of memory that is allocated to the virtual machine. Memory that is ballooned, swapped, or has never been accessed is excluded
|
MegaBytes
|
|
reservation_mb
|
Retrieves the minimum amount of memory that is reserved for the virtual machine
|
MegaBytes
|
|
shares
|
Retrieves the amount of physical memory associated with this virtual machine that is copy-on-write (COW) shared on the host.
|
|
|
swapped_mb
|
Retrieves the amount of memory that has been reclaimed from this virtual machine by transparently swapping guest memory to disk
|
MegaBytes
|
|
used_mb
|
Retrieves the estimated amount of physical host memory currently consumed for this virtual machine's physical memory
|
MegaBytes
|
|
Apache
|
apache_connections
|
|
Total number of busy workers (BusyWorkers)
|
|
|
App Server
|
apache_idle_workers
|
|
Total number of idle workers (IdleWorkers)
|
|
|
apache_scoreboard
|
closing
|
Total number of child processes Closing connections
|
|
|
App Server
|
dnslookup
|
Total number of child processes performing DNS lookups
|
|
|
finishing
|
Total number of child processes Gracefully finishing
|
|
|
idle_cleanup
|
Total number of workers in idle cleanup
|
|
|
keepalive
|
Total number of child processes maintaining KeepAlive (read) connections
|
|
|
logging
|
Total number of child processes simultaneously writing to the logs
|
|
|
open
|
Total number of open slots with no current process
|
|
|
reading
|
Total number of child processes Reading Request
|
|
|
sending
|
Total number of child processes Sending Reply to request
|
|
|
starting
|
Total number of child processes Starting up
|
|
|
waiting
|
Total number of child processes Waiting for Connection
|
|
|
State Manager
|
StateManager HTTP Response Code. 200=OK, 500=ERROR
|
activemq-code
|
WxS connectivity status with Message Queue service (ActiveMQ)
|
|
200, 500
|
App Server
|
cache-code
|
WxS connectivity status with Cache service
|
|
200, 500
|
digest-code
|
WxS connectivity status with Digest service
|
|
200, 500
|
graph-code
|
WxS connectivity status with Graph service
|
|
200, 500
|
index-code
|
WxS connectivity status with Index/Search service
|
|
200, 500
|
json-code
|
WxS connectivity status with JSON service
|
|
200, 500
|
notifier-code
|
WxS connectivity status with Notifier service
|
|
200, 500
|
quad-code
|
Overall connectivity status of WxS with critical services (RDBMS, JSON, Message Queue, Search, Index)
|
|
200, 500
|
quad_analytics-code
|
WxS connectivity status with Analytics service
|
|
200, 500
|
rabbitmq-code
|
WxS connectivity status with Message Queue service (RabbitMQ)
|
|
200, 500
|
rdbms-code
|
WxS connectivity status with RDBMS service
|
|
200, 500
|
recommendation-code
|
WxS connectivity status with Recommendation service
|
|
200, 500
|
search-code
|
WxS connectivity status with Search/Index service
|
|
200, 500
|
Processes
|
fork
|
fork_rate
|
Number of new processes forked per second.
|
|
|
All
|
ps_state
|
blocked
|
Count of processes in Blocked state. A consistently high count is an alert condition that needs attention.
|
|
|
All
|
paging
|
Count of processes in Paging state. A consistently high or growing count is an alert condition that needs attention.
|
|
|
running
|
Count of processes in running state. Typically less than or equal to the number of cores.
|
|
|
sleeping
|
Count of processes in sleeping state. Typically most processes are in this state.
|
|
|
stopped
|
Count of processes in Stopped state
|
|
|
zombies
|
Count of processes in Zombie state. A consistently high or growing count is an alert condition that needs attention.
|
|
|
TCP Connection
|
Port 80 - App Server,
Port 61616 - Message Queue,
Port 8983 - Search Store,
Port 7973 - Index Store,
Port 27001 - Analytics Store,
Port 27000 - JSON Store,
Port 11211 - Cache
|
close_wait
|
(both server and client) represents waiting for a connection termination request from the local user
|
number of connections
|
|
App Server, Message Queue, Search Store, Index Store, Analytics Store, JSON Store, Cache
|
closed
|
(both server and client) represents no connection state at all
|
number of connections
|
|
closing
|
(both server and client) represents waiting for a connection termination request acknowledgment from the remote TCP
|
number of connections
|
|
established
|
(both server and client) represents an open connection, data received can be delivered to the user. The normal state for the data transfer phase of the connection
|
number of connections
|
|
fin_wait1
|
(both server and client) represents waiting for a connection termination request from the remote TCP, or an acknowledgment of the connection termination request previously sent
|
number of connections
|
|
fin_wait2
|
(both server and client) represents waiting for a connection termination request from the remote TCP
|
number of connections
|
|
last_ack
|
(both server and client) represents waiting for an acknowledgment of the connection termination request previously sent to the remote TCP (which includes an acknowledgment of its connection termination request)
|
number of connections
|
|
listen
|
(server) represents waiting for a connection request from any remote TCP and port
|
number of connections
|
|
syn_recv
|
(server) represents waiting for a confirming connection request acknowledgment after having both received and sent a connection request
|
number of connections
|
|
syn_sent
|
(client) represents waiting for a matching connection request after having sent a connection request
|
number of connections
|
|
time_wait
|
(either server or client) represents waiting for enough time to pass to be sure the remote TCP received the acknowledgment of its connection termination request. [According to RFC 793, a connection can stay in TIME-WAIT for a maximum of four minutes, known as two MSL (maximum segment lifetime).]
|
number of connections
|
|
Oracle
|
|
blockingLock
|
Locks that are blocking other sessions. Should be as low as possible and should be for shorter durations.
|
|
|
RDBMS Store
|
cacheHitRatio
|
Cache hit ratios should be as high as possible (highest is 100%)
|
%
|
|
dbBlockBufferCacheHitRatio
|
DB block buffer cache hit ratios should be as high as possible (highest is 100%)
|
%
|
|
dictionaryCacheHitRatio
|
Dictionary cache hit ratios should be as high as possible (highest is 100%).
|
%
|
|
diskSortRatio
|
Disk sorting should be minimal
|
|
|
invalidObjects
|
The number of invalid objects should be as low as possible
|
|
|
latchHitRatio
|
Latch hit ratios should be as high as possible (highest is 100%)
|
%
|
|
libraryCacheHitRatio
|
Library Cache hit ratios should be as high as possible (highest is 100%)
|
%
|
|
lock
|
Number of locks. Should be minimal, and locks should be held for short durations.
|
|
|
lockedUserCount
|
The QUADDB and XMPP accounts should be unlocked, as should the DBA and other accounts such as SYS, SYSTEM, and SYSMAN.
|
|
|
offlineDataFiles
|
All data files should be ONLINE
|
|
|
pgaInMemorySortRatio
|
PGA memory sorts should be as high as possible
|
|
|
rollBlockContentionRatio
|
Should be minimal
|
|
|
rollHeaderContentionRatio
|
Should be minimal
|
|
|
rollHitRatio
|
Should be as high as possible
|
|
|
rollbackSegmentWait
|
Should be minimal
|
|
|
sessionPGAMemory
|
PGA memory consumed by a session
|
|
|
sessionUGAMemory
|
UGA memory consumed by a session
|
|
|
sgaDataBufferHistRatio
|
Hit ratios should be as high as possible (highest is 100%)
|
%
|
|
sgaSharedPoolFree
|
Too much free shared pool indicates over-allocation and wasted memory. No free shared pool can indicate memory starvation.
|
|
|
sgaSharedPoolReloadRatio
|
System Global Area shared pool reload ratio
|
|
|
softParseRatio
|
Soft parse ratio of the SQLs
|
|
|
staleStatistics
|
Statistics should be up-to-date
|
|
|
ioPerTableSpace: ecp_data, sysaux, system, undotbs1, users
|
PHY_BLK_R
|
Physical blocks read
|
|
|
RDBMS Store, Graph Store
|
Phy_BLK_W
|
Physical blocks written
|
|
|
oraUsageTablespace: ecp_data, sysaux, system, undotbs1, users
|
free_mb
|
Free Space in MB
|
MegaBytes
|
|
RDBMS Store, Graph Store
|
percent_free
|
% Free Space
|
%
|
|
percent_used
|
% Used
|
%
|
|
size_mb
|
Size in MB
|
MegaBytes
|
|
Solr
|
Search
|
avgRequestsPerSecond
|
Number of requests served per second
|
Requests/sec
|
|
Search Store
|
avgTimePerRequest
|
Average time taken to serve each request
|
Milliseconds
|
|
errors
|
Rate of errors, i.e. requests that returned an error.
|
Number
|
|
requests
|
Rate of requests served by Solr.
|
Number
|
|
timeouts
|
Rate of requests that timed out, i.e. requests that failed due to a timeout error.
|
Number
|
|
Search: documentcache, fieldvaluecache, filtercache, queryresultcache
Index: autocompletefieldvalue, followerfieldvaluecach, postfieldvaluecache, socialfieldvaluecache, videofieldvaluecache
|
cumulative_evictions
|
The number of entries that have been removed from the cache since the Solr server started
|
Number
|
|
Search Store, Index Store
|
cumulative_hits
|
The total number of cache lookups that resulted in a cache hit since the Solr server started
|
Number
|
|
cumulative_inserts
|
The total number of values inserted in the cache since the Solr server started
|
Number
|
|
cumulative_lookups
|
The total number of lookups/reads on the cache since the Solr server started
|
Number
|
|
evictions
|
The number of entries that have been removed from the cache
|
Number
|
|
hitratio
|
The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache
|
Number
|
|
hits
|
The number of lookups that resulted in a cache hit, since the last cache invalidation (or last commit operation)
|
Number
|
|
inserts
|
The number of entries that have been added to the cache
|
Number
|
|
lookups
|
The number of lookups/reads on the cache, since the last cache invalidation (or last commit operation)
|
Number
|
|
size
|
Maximum number of entries in the cache
|
Number
|
|
warmupTime
|
Time to warm up the cache in milliseconds.
|
Milliseconds
|
|
Search: searcher
Index: autocomplete, follower, post, social, video
|
maxDoc
|
maxDoc is the maximum internal document id currently in use. The difference between maxDoc and numDocs gives an idea of how many "deleted" (or replaced) documents are currently still in the index. They gradually get cleaned up as segments get merged or when the index gets optimized.
|
Number
|
|
Search Store, Index Store
|
numDocs
|
numDocs is the number of unique "live" documents in the Solr index. It is how many docs you would get back from a query for *:*.
|
Number
|
|
Java Memory
|
HeapMemoryUsage: Current memory usage of the heap that is used for object allocation. The heap consists of one or more memory pools. The used and committed size of the returned memory usage is the sum of those values of all heap memory pools whereas the init and max size of the returned memory usage represents the setting of the heap memory which may not be the sum of those of all heap memory pools. The amount of used memory in the returned memory usage is the amount of memory occupied by both live objects and garbage objects that have not been collected, if any.
NonHeapMemoryUsage: Current memory usage of non-heap memory that is used by the Java virtual machine. The non-heap memory consists of one or more memory pools. The used and committed size of the returned memory usage is the sum of those values of all non-heap memory pools whereas the init and max size of the returned memory usage represents the setting of the non-heap memory which may not be the sum of those of all non-heap memory pools.
|
HeapMemoryUsage_committed
|
Represents the amount of memory (in bytes) that is guaranteed to be available for use by the Java virtual machine. The amount of committed memory may change over time (increase or decrease). The Java virtual machine may release memory to the system and committed could be less than init. committed will always be greater than or equal to used.
|
Bytes
|
|
Search Store, Index Store, Message Queue, App Server, Worker
|
HeapMemoryUsage_init
|
Represents the initial amount of memory (in bytes) that the Java virtual machine requests from the operating system for memory management during startup. The Java virtual machine may request additional memory from the operating system and may also release memory to the system over time. The value of init may be undefined.
|
Bytes
|
|
HeapMemoryUsage_max
|
Represents the maximum amount of memory (in bytes) that can be used for memory management. Its value may be undefined. The maximum amount of memory may change over time if defined. The amount of used and committed memory will always be less than or equal to max if max is defined. A memory allocation may fail if it attempts to increase the used memory such that used > committed even if used <= max would still be true (for example, when the system is low on virtual memory).
|
Bytes
|
|
HeapMemoryUsage_used
|
Represents the amount of memory currently used (in bytes).
|
Bytes
|
|
NonHeapMemoryUsage_committed
|
Represents the amount of memory (in bytes) that is guaranteed to be available for use by the Java virtual machine. The amount of committed memory may change over time (increase or decrease). The Java virtual machine may release memory to the system and committed could be less than init. committed will always be greater than or equal to used.
|
Bytes
|
|
NonHeapMemoryUsage_init
|
Represents the initial amount of memory (in bytes) that the Java virtual machine requests from the operating system for memory management during startup. The Java virtual machine may request additional memory from the operating system and may also release memory to the system over time. The value of init may be undefined.
|
Bytes
|
|
NonHeapMemoryUsage_max
|
Represents the maximum amount of memory (in bytes) that can be used for memory management. Its value may be undefined. The maximum amount of memory may change over time if defined. The amount of used and committed memory will always be less than or equal to max if max is defined. A memory allocation may fail if it attempts to increase the used memory such that used > committed even if used <= max would still be true (for example, when the system is low on virtual memory).
|
Bytes
|
|
NonHeapMemoryUsage_used
|
Represents the amount of memory currently used (in bytes).
|
Bytes
|
|
Java fd
|
|
OpenFileDescriptorCount
|
Number of file handles currently held by the Java virtual machine, including all created sockets and virtual machine resources. Example notification value: (MaxFileDescriptorCount - OpenFileDescriptorCount) < 100. Monitor this to determine whether the number of files that the VM can open is sufficient.
|
|
|
Search Store, Index Store
|
Non Java Application processes
|
ps_count
|
processes
|
Total number of processes (including children) forked for a particular program.
|
|
|
Analytics Store, JSON Store, Cache, RabbitMQ
|
threads
|
Total number of threads created for a particular program.
|
|
|
ps_code
|
|
Total (in KB) of Shared library code size (VmLib) & Size of text segment (VmExe)
|
KiloBytes
|
|
Analytics Store, JSON Store, Cache
|
ps_data
|
|
Size (in KB) of data segment (VmData)
|
KiloBytes
|
|
Analytics Store, JSON Store, Cache
|
ps_rss
|
|
Number of pages the process has in real memory. This is just the pages which count towards text, data, or stack space. This does not include pages which have not been demand-loaded in, or which are swapped out.
|
|
|
Analytics Store, JSON Store, Cache
|
ps_stacksize
|
|
Stack size. Difference between the address of the start of the stack (startstack) and the current value of the ESP stack pointer, as found in the kernel stack page for the process (kstkesp).
|
|
|
Analytics Store, JSON Store, Cache
|
ps_vm
|
|
Virtual memory size in bytes.
|
Bytes
|
|
Analytics Store, JSON Store, Cache
|
ps_cputime
|
syst
|
Amount of time that this process has been scheduled in kernel mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)).
|
|
|
Analytics Store, JSON Store, Cache
|
user
|
Amount of time that this process has been scheduled in user mode, measured in clock ticks (divide by sysconf(_SC_CLK_TCK)). This includes guest time, guest_time (time spent running a virtual CPU), so that applications that are not aware of the guest time field do not lose that time from their calculations.
|
|
|
ps_disk_octets
|
read
|
I/O counter: chars read
The number of bytes which this task has caused to be read from storage. This is simply the sum of bytes which this process passed to read() and pread().
It includes things like tty IO and it is unaffected by whether or not actual physical disk IO was required (the read might have been satisfied from pagecache).
|
|
|
Analytics Store, JSON Store, Cache
|
write
|
I/O counter: chars written
The number of bytes which this task has caused, or shall cause to be written to disk. Similar caveats apply here as with rchar.
|
|
|
ps_disk_ops
|
read
|
I/O counter: read syscalls
Attempt to count the number of read I/O operations, i.e. syscalls like read() and pread().
|
|
|
Analytics Store, JSON Store, Cache
|
write
|
I/O counter: write syscalls
Attempt to count the number of write I/O operations, i.e. syscalls like write() and pwrite().
|
|
|
ps_pagefaults
|
majflt
|
The number of major faults the process has made which have required loading a memory page from disk.
|
|
|
Analytics Store, JSON Store, Cache
|
minflt
|
The number of minor faults the process has made which have not required loading a memory page from disk.
|
|
|
MongoDB
|
|
cache_misses
|
'serverStatus.indexCounters.accesses' divided by 'serverStatus.indexCounters.misses'
serverStatus.indexCounters.accesses:
accesses reports the number of times that operations have accessed indexes. This value is the combination of the hits and misses. Higher values indicate that your database has indexes and that queries are taking advantage of these indexes. If this number does not grow over time, this might indicate that your indexes do not effectively support your use.
serverStatus.indexCounters.misses:
misses represents the number of times that an operation attempted to access an index that was not in memory. These "misses," do not indicate a failed query or operation, but rather an inefficient use of the index. Lower values in this field indicate better index use and likely overall performance as well
|
|
|
Analytics Store, JSON Store
|
|
|
connections
|
serverStatus.connections.current:
The value of current corresponds to the number of connections to the database server from clients. This number includes the current shell session. Consider the value of available to add more context to this datum.
This figure will include the current shell connection as well as any inter-node connections to support a replica set or sharded cluster.
|
|
|
|
|
page_fault
|
serverStatus.extra_info.page_faults:Reports the total number of page faults that require disk operations. Page faults refer to operations that require the database server to access data which isn't available in active memory. The page_faults counter may increase dramatically during moments of poor performance and may correlate with limited memory environments and larger data sets. Limited and sporadic page faults do not necessarily indicate an issue.
|
|
|
|
|
lock_ratio%
|
Displays the relationship between lockTime and totalTime. Low values indicate that operations have held the globalLock frequently for shorter periods of time. High values indicate that operations have held globalLock infrequently for longer periods of time
serverStatus.globalLock.totalTime:
The value of totalTime represents the time, in microseconds, since the database last started and creation of the globalLock. This is roughly equivalent to total server uptime.
serverStatus.globalLock.lockTime:
The value of lockTime represents the time, in microseconds, since the database last started, that the globalLock has been held. Consider this value in combination with the value of totalTime. MongoDB aggregates these values in the ratio value. If the ratio value is small but totalTime is high the globalLock has typically been held frequently for shorter periods of time, which may be indicative of a more normal use pattern. If the lockTime is higher and the totalTime is smaller (relatively,) then fewer operations are responsible for a greater portion of server's use (relatively.)
|
|
|
|
|
flushes
|
flushes
|
serverStatus.backgroundFlushing.flushes:
flushes is a counter that collects the number of times the database has flushed all writes to disk. This value will grow as database runs for longer periods of time.
|
|
|
|
flushes_avg_ms
|
serverStatus.backgroundFlushing.average_ms: The average_ms value describes the relationship between the number of flushes and the total amount of time that the database has spent writing data to disk. The larger flushes is, the more likely this value is to represent a "normal" time; however, abnormal data can skew this value. Use the last_ms to ensure that a high average is not skewed by a transient historical issue or a random write distribution.
|
|
|
|
memory
|
mapped
|
serverStatus.mem.mapped:
The value of mapped provides the amount of mapped memory, in megabytes (MB), by the database. Because MongoDB uses memory-mapped files, this value is likely to be roughly equivalent to the total size of your database or databases.
|
MegaBytes
|
|
|
resident
|
serverStatus.mem.resident:
The value of resident is roughly equivalent to the amount of RAM, in megabytes (MB), currently used by the database process. In normal use this value tends to grow. In dedicated database servers this number tends to approach the total amount of system memory.
|
MegaBytes
|
|
|
virtual
|
serverStatus.mem.virtual:
virtual displays the quantity, in megabytes (MB), of virtual memory used by the mongod process. In typical deployments this value is slightly larger than mapped. If this value is significantly (i.e. gigabytes) larger than mapped, this could indicate a memory leak. With journaling enabled, the value of virtual is twice the value of mapped.
|
MegaBytes
|
|
|
|
network
|
bytesin
|
serverStatus.network.bytesIn:
The value of the bytesIn field reflects the amount of network traffic, in bytes, received by this database. Use this value to ensure that network traffic sent to the mongod process is consistent with expectations and overall inter-application traffic.
|
Bytes
|
|
|
bytesout
|
serverStatus.network.bytesOut:
The value of the bytesOut field reflects the amount of network traffic, in bytes, sent from this database. Use this value to ensure that network traffic sent by the mongod process is consistent with expectations and overall inter-application traffic.
|
Bytes
|
|
|
|
oplogs
|
difftimesec
|
Time difference between the most recent and the oldest oplog.
|
|
|
|
storagesizemb
|
The total amount of storage (in MB) allocated to this collection for document storage. The storageSize does not decrease as you remove or shrink documents.
|
MegaBytes
|
|
usedsizemb
|
The size (in MB) of the data stored in this collection. This value does not include the size of any indexes associated with the collection.
|
MegaBytes
|
|
|
replication
|
health
|
The health value is only present for the other members of the replica set. This field conveys if the member is up (i.e. 1) or down (i.e. 0.)
|
|
Up=1, Down=0
|
|
optimelagsec
|
Replication lag between secondary node and primary node
|
|
|
state
|
The value of the state reflects state of this replica set member.
|
|
An integer between 0 and 10 represents the state of the member. These integers map to states, as follows:
0 STARTUP Startup, phase 1 (parsing config.)
1 PRIMARY Primary.
2 SECONDARY Secondary.
3 RECOVERING Member is recovering (initial sync, post-rollback, stale members.)
4 FATAL Member has encountered unrecoverable error.
5 STARTUP2 Start up, phase 2 (forking threads.)
6 UNKNOWN Unknown (the set has never connected to the member.)
7 ARBITER Member is an arbiter.
8 DOWN Member is not accessible to the set.
9 ROLLBACK Member is rolling back data.
10 SHUNNED Member has been removed from replica set.
|
|
total_operations
Note: The opcounters data structure provides an overview of database operations by type and makes it possible to analyze the load on the database in more granular manner. These numbers will grow over time and in response to database use. Analyze these values over time to track database utilization.
|
command
|
Provides a counter of the total number of commands issued to the database since the mongod instance last started
|
|
|
|
delete
|
Provides a counter of the total number of delete operations since the mongod instance last started
|
|
|
getmore
|
Provides a counter of the total number of "getmore" operations since the mongod instance last started. This counter can be high even if the query count is low. Secondary nodes send getMore operations as part of the replication process
|
|
|
insert
|
Provides a counter of the total number of insert operations since the mongod instance last started
|
|
|
query
|
Provides a counter of the total number of queries since the mongod instance last started
|
|
|
update
|
Provides a counter of the total number of update operations since the mongod instance last started
|
|
|
MongoDB databases
|
quad, recommendation
|
collections
|
Contains a count of the number of collections in that database
|
|
|
|
indexes
|
Contains a count of the total number of indexes across all collections in the database
|
|
|
num_extents
|
Contains a count of the number of extents in the database across all collections
|
|
|
object_count
|
Contains a count of the number of objects (i.e. documents) in the database across all collections
|
|
|
data file_size
|
The total size of the data held in this database including the padding factor. The dataSize will not decrease when documents shrink, but will decrease when you remove documents
|
Bytes
|
|
index file_size
|
The total size of all indexes created on this database
|
Bytes
|
|
storage file_size
|
The total amount of space allocated to collections in this database for document storage. The storageSize does not decrease as you remove or shrink documents
|
Bytes
|
|
Tomcat
|
|
activeSessions
|
Number of active sessions at this moment
|
|
|
App Server
|
expiredSessions
|
Number of sessions that expired (doesn't include explicit invalidations)
|
|
|
processExpiresFrequency
|
The frequency of the manager checks (expiration and passivation)
|
|
|
processingTime
|
Time spent doing housekeeping and expiration
|
Cumulative milliseconds of wall clock elapsed time
|
|
rejectedSessions
|
Number of sessions rejected due to maxActive being reached
|
|
|
sessionAverageAliveTimes
|
Average time an expired session had been alive
|
Seconds
|
|
sessionCounter
|
Total number of sessions created by this manager
|
|
|
sessionCreateRate
|
Session creation rate in sessions per minute
|
Sessions/min
|
|
sessionExpireRate
|
Session expiration rate in sessions per minute
|
Sessions/min
|
|
RabbitMQ
|
Queue: Activity, Analytics, EMailDigest, Migrate, Polling, Scheduler
|
consumers
|
Number of consumers for the queue
|
|
|
Message Queue
|
memory
|
Bytes of memory consumed by the Erlang process associated with the queue, including stack, heap and internal structures.
|
Bytes
|
|
messages
|
Sum of ready and unacknowledged messages (queue depth).
|
|
|
messages_ready
|
Number of messages ready to be delivered to clients.
|
|
|
messages_acknowledged
|
Number of messages delivered to clients but not yet acknowledged.
|
|
|
node
|
Node associated with the queue
|
|
|
Server
|
fd_total
|
File descriptor count and limit, as reported by the operating system. The count includes network sockets and file handles.
|
|
|
Message Queue
|
fd_used
|
File descriptor count used by RabbitMQ.
|
|
|
mem_limit
|
The memory threshold RabbitMQ will use on the system.
|
Bytes
|
|
mem_used
|
Memory used by RabbitMQ
|
Bytes
|
|
proc_total
|
Maximum number of erlang processes for RabbitMQ
|
|
|
proc_used
|
Number of erlang processes used by RabbitMQ
|
|
|
sockets_total
|
The network sockets count and limit managed by RabbitMQ.
|
|
|
sockets_used
|
The network sockets count used by RabbitMQ.
|
|
|
uptime
|
Uptime of the service
|
Milliseconds
|
|
ActiveMQ
|
Broker
|
TotalEnqueueCount
|
Number of messages sent to queues
|
|
|
Message Queue
|
TotalDequeueCount
|
Number of messages removed from queues & consumed by the clients
|
|
|
TotalConsumerCount
|
Number of clients listening to the queue
|
|
|
TotalMessageCount
|
Number of messages held by the broker. [TotalMessageCount + TotalDequeueCount = TotalEnqueueCount]
|
|
|
MemoryLimit
|
The memory usage limit of the broker
|
Bytes
|
|
MemoryPercentUsage
|
Percentage usage of the memory
|
%
|
|
StoreLimit
|
The upper limit of the store usage of the broker -- we haven't configured any upper limit for WxS queues
|
|
|
StorePercentUsage
|
The actual storage usage of the broker
|
|
|
ActiveMQ
|
Queue: inbound, outbound, portal, search, vdl
|
QueueSize
|
Total number of messages in the queue/store that have not been ack'd by a consumer
|
|
|
Message Queue
|
EnqueueCount
|
Total number of messages sent to the queue since the last restart
|
|
|
DequeueCount
|
Total number of messages removed from the queue (ack'd by consumer) since the last restart
|
|
|
ConsumerCount
|
Number of clients/consumers listening on this queue
|
|
|
DispatchCount
|
Total number of messages sent to consumer sessions (Dequeue + Inflight)
|
|
|
ExpiredCount
|
Number of messages that were not sent to the clients/consumers before reaching the expiry timeout and were cleared by the broker. The expiry timeout is set to 8 hours.
|
|
|
InFlightCount
|
Number of messages sent to a consumer session that have not yet received an ack
|
|
|
CursorMemoryUsage
|
Indicates the memory (heap) used by non-persistent messages. This does not apply to WxSocial because it uses persistent messaging.
|
|
|
CursorPercentUsage
|
Indicates the memory(heap) used by non-persistent messages in percentage
|
%
|
|
MemoryLimit
|
The upper limit of memory usage of a particular queue. No upper limits are configured for the queues in WxS.
|
Bytes
|
|
MemoryPercentUsage
|
The percentage of memory usage of a particular Queue
|
%
|
|
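
Several rows in the table describe values that are derived from two or more raw metrics: the Java file descriptor headroom check, the Solr deleted-document estimate (maxDoc minus numDocs), the ActiveMQ broker count identity, and the MongoDB lock ratio. The following is a minimal Python sketch of those calculations. The function names, argument names, and the default threshold are illustrative assumptions and are not part of any monitoring plugin; the inputs are assumed to be values already collected for the metrics named in the table.

```python
# Illustrative helpers only; names and signatures are hypothetical.

def fd_headroom_low(max_fd_count, open_fd_count, threshold=100):
    """True when (MaxFileDescriptorCount - OpenFileDescriptorCount) < threshold."""
    return (max_fd_count - open_fd_count) < threshold


def deleted_docs(max_doc, num_docs):
    """Estimate of deleted or replaced documents still present in a Solr index."""
    return max_doc - num_docs


def broker_counts_consistent(total_message_count, total_dequeue_count,
                             total_enqueue_count):
    """Check TotalMessageCount + TotalDequeueCount = TotalEnqueueCount."""
    return total_message_count + total_dequeue_count == total_enqueue_count


def lock_ratio_percent(lock_time_us, total_time_us):
    """globalLock.lockTime as a percentage of globalLock.totalTime."""
    if total_time_us <= 0:
        # Avoid division by zero right after a restart.
        return 0.0
    return 100.0 * lock_time_us / total_time_us


if __name__ == "__main__":
    print(fd_headroom_low(4096, 4000))
    print(deleted_docs(1200, 1000))
    print(broker_counts_consistent(5, 95, 100))
    print(lock_ratio_percent(25, 100))
```

These helpers only restate the arithmetic given in the descriptions above; alert thresholds other than the file descriptor example are not specified in the table and would need to be chosen per deployment.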