CF Summit Notes: Monitoring CF: What are my options and why should I?

October 19, 2014

Monitoring CF: What are my options and why should I? -- Charlie Arehart

How do you know which JVM args and server settings are right for YOU?

(slides are available at carehart.org)

the 3 main monitors

CF Enterprise Server Monitor
-built into CF enterprise (not in CF standard)
-CF Admin > Server Monitoring > Server Monitor > Launch Server Monitor
-in CF 8 and above, about the same in all versions 8 thru 11

FusionReactor
not just for monitoring ColdFusion
can install on a machine to monitor CF, Railo, Tomcat, Solr, ANYthing Java
commercial. 14-day trial available
pricing starts at $24/month per server

SeeFusion
commercial. works free for the first 2 hours the CF instance is up. after that you have to pay
$199 for 2 instances

understand / diagnose CF server problems
-- don't have to just "stare at a monitor" all day
can use it only when you have a problem, and only spend a few minutes in there to track down the problem

Be alerted to problems in advance of the crash

3 main ways to use the monitoring tools
1. user interface / graphs / charts / reports
2. alerts (arriving by email)
3. logs

CF Server monitor doesn't do ANY logging.

FusionReactor does lots of logging

3 Most Common Server Problems
High CPU Usage (JRun.exe is really high in Windows, etc)
most people reboot the server when that happens. but there are other solutions that aren't that extreme.
(blog entry on Charlie's site about common causes of this)
often people presume this is because a bad request is taking a lot of CPU. that's not always the case. sometimes there are 0 running requests and CPU usage is still high
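A rough sketch of how you could check which threads are actually burning CPU, using the standard java.lang.management API. The class and method names (ThreadCpuReport, busiestThreads) are made up for illustration and were not part of the talk. Note it reports on the JVM it runs in, so to look at CF it would have to run inside the CF JVM (or the same beans would have to be read remotely over JMX).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not from the talk: lists the threads that have used the most CPU.
public class ThreadCpuReport {

    /** Returns up to `limit` lines of "threadName N ms", busiest thread first. */
    public static List<String> busiestThreads(int limit) {
        List<String> report = new ArrayList<>();
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isThreadCpuTimeSupported()) {
            return report; // this JVM cannot measure per-thread CPU time
        }

        // Collect {threadId, cpuNanos} pairs for every live thread.
        List<long[]> rows = new ArrayList<>();
        for (long id : threads.getAllThreadIds()) {
            long cpuNanos = threads.getThreadCpuTime(id); // -1 if the thread died or measurement is disabled
            if (cpuNanos >= 0) {
                rows.add(new long[] {id, cpuNanos});
            }
        }
        rows.sort((a, b) -> Long.compare(b[1], a[1])); // most CPU first

        for (long[] row : rows) {
            if (report.size() >= limit) {
                break;
            }
            ThreadInfo info = threads.getThreadInfo(row[0]); // null if the thread has since exited
            if (info != null) {
                report.add(info.getThreadName() + " " + row[1] / 1_000_000 + " ms");
            }
        }
        return report;
    }

    public static void main(String[] args) {
        busiestThreads(10).forEach(System.out::println);
    }
}
```

If the busiest threads turn out not to be request threads, that fits the point above: high CPU with 0 running requests.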

High Memory Usage
could be a Garbage Collection thing.
even after a GC has happened, try hitting the "Garbage Collection" button in FusionReactor and memory use will often drop again
(minor vs major garbage collection)
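A minimal sketch of the same before/after check in plain Java, assuming the button amounts to a GC request (the notes don't say how FusionReactor implements it). HeapSnapshot is a hypothetical name. The collector names printed vary by JVM and GC settings. Typically one bean covers the young generation (minor collections) and another the old generation (major collections).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not from the talk: heap use and per-collector GC counts.
public class HeapSnapshot {

    /** Bytes of heap currently in use, as reported by the JVM. */
    public static long usedHeapBytes() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }

    /** One line per collector: name, number of collections, total time spent. */
    public static String collectorSummary() {
        StringBuilder out = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            out.append(gc.getName())
               .append(": count=").append(gc.getCollectionCount())
               .append(", timeMs=").append(gc.getCollectionTime())
               .append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();
        System.gc(); // only a request; the JVM is free to ignore it
        long after = usedHeapBytes();
        System.out.println("used before: " + before + " bytes, after: " + after + " bytes");
        System.out.print(collectorSummary());
    }
}
```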

CF not responding to requests
maybe too many running requests is causing the issue
in CF Admin > Request Tuning > Max Simultaneous Requests
-- like defining how many tellers are in the bank of CF. as long as a teller is available, a new request gets served right away. if requests start taking a long time, they pile up, and that's what you see in "running requests" in FusionReactor (or CF Server Monitor).
could happen with or without CPU/Memory issues
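The teller analogy can be sketched with a fixed-size Java thread pool. This is only an illustration of the idea, not how CF itself handles requests. TellerDemo and runningAndQueued are hypothetical names. Each "request" blocks until released, so the counts are stable when read.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of the "tellers in a bank" analogy, not CF's implementation.
public class TellerDemo {

    /** Submits `requests` slow tasks to `tellers` threads; returns {running, queued}. */
    public static int[] runningAndQueued(int tellers, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                tellers, tellers, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        CountDownLatch started = new CountDownLatch(Math.min(tellers, requests));
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (int i = 0; i < requests; i++) {
                pool.execute(() -> {
                    started.countDown();   // a teller has picked up this request
                    awaitQuietly(release); // simulate a request that takes a long time
                });
            }
            started.await(); // wait until every available teller is busy
            return new int[] {pool.getActiveCount(), pool.getQueue().size()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown(); // let all requests finish
            pool.shutdown();
        }
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        int[] counts = runningAndQueued(3, 5);
        System.out.println("running=" + counts[0] + " queued=" + counts[1]); // prints "running=3 queued=2"
    }
}
```

With 3 tellers and 5 slow requests, 3 are running and 2 wait in line. Raising the teller count only helps if the requests themselves aren't the problem.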

other CF diagnostic tools --
CFStat
command line tool. built in to CF. in /bin directory
reports high level metrics
how many requests are running, how many did run, etc.
only about a dozen stats, but better than nothing.
have to enable it in CF admin
it's on the "debugging" page
can also show you how many requests are QUEUED (if all tellers are busy, how many people are waiting in line? that's the "queue").

Windows PerfMon counters --
checkbox in CF Admin Debugging
same info from CFStat is sent to PerfMon

in CF10, added Metrics Logging (in Debugging / Admin page)
every 5 seconds it will write out how many requests are running, how many sessions, etc.

Other non CF tools to consider --

JVM monitoring tools and logs
(not necessary if you're using one of the main 3 CF monitoring tools discussed above)

OS monitoring tools / logs
Task Mgr, Process List in Linux, etc.
Event logs in Windows
also some 3rd party tools (Windows: SysInternals -- FREE tools from MS)

Web server monitoring tools / logs
sometimes CF is the "victim" and the problem is actually in the web server. CF is waiting for a request that never shows up.

Database server monitoring

Main Admin Settings to tweak based on monitoring tools
Maximum Simultaneous Requests
-- don't just raise it "because". maybe you can fix the problem so you're no longer hitting that limit.
Heap Size
Template and query cache sizes
-- FusionReactor has info on this.

Other Main Uses of Monitoring Tools
to watch queries
all 3 tools let you watch what queries are running and SEE the actual SQL that was executed

Stack tracing
all 3 tools can have the JVM tell you what it's doing right now: THIS line of code in THIS file, etc.
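For a feel of what a stack trace gives you, here is a minimal sketch using the JDK's own Thread.getAllStackTraces(). StackDump is a hypothetical name, and this is not how any of the 3 tools is implemented. It dumps the threads of the JVM it runs in.

```java
import java.util.Map;

// Hypothetical helper, not from the talk: what every live thread is doing right now.
public class StackDump {

    /** Returns each thread's name and state, followed by its current stack frames. */
    public static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append(thread.getName()).append(" [").append(thread.getState()).append("]\n");
            for (StackTraceElement frame : entry.getValue()) {
                // Each frame names the class, method, source file and line number.
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```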

Session tracking --
FusionReactor and CF Server Monitor can both track sessions
people often don't know how many sessions are actively on their server.

All 3 tools also show "uptime"

lots more features too.