I have been exploring the Graphite graphing tool for showing metrics from multiple servers, and it seems that the 'recommended' way is to send all metrics data to StatsD first. StatsD aggregates the data and sends it to Graphite (or rather, Carbon).
In my case, I want to do simple aggregations like sum and average on metrics across servers and plot the results in Graphite. Graphite comes with a Carbon aggregator which can do this.
StatsD does not even provide aggregation of the kind I am talking about.
My question is: should I use StatsD at all for my use case? Is there anything I am missing here?
StatsD operates over UDP, so your application sends each metric and moves on without waiting for a reply. That removes the risk of a slow carbon-aggregator.py introducing latency in your application. In other words, loose coupling.
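To make the fire-and-forget point concrete, here is a minimal sketch of what a StatsD client does when it sends a counter. The helper names (`format_counter`, `send_metric`) and the metric name are made up for illustration; the host and port assume a StatsD daemon on localhost at its usual default of 8125.

```python
import socket


def format_counter(name: str, value: int = 1) -> bytes:
    """Build a StatsD counter datagram, e.g. b"web.requests:1|c"."""
    return f"{name}:{value}|c".encode("ascii")


def send_metric(payload: bytes, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one datagram and return immediately; no reply is awaited.

    host and port are assumptions (a local StatsD on its default port).
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
    except OSError:
        # A metrics failure should never break the application itself.
        pass


if __name__ == "__main__":
    send_metric(format_counter("web.requests"))
```

If nothing is listening on the other end, the datagram is simply dropped and the application carries on, which is the loose coupling described above.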
StatsD supports sampling of inbound metrics, which is useful when you don't want the aggregator to process 100% of the data points just to compute descriptive statistics. For high-volume code sections, it is common to use sample rates of 0.5%-1% so as not to overload StatsD.
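As a rough sketch of how sampling looks from the client side: the client sends only a random fraction of events and tags each datagram with the rate (`|@0.01`), which lets StatsD scale counter values back up by 1/rate. The function name `sampled_counter` and the injectable `rng` parameter are hypothetical, added here only to keep the example testable.

```python
import random
from typing import Callable, Optional


def sampled_counter(
    name: str,
    sample_rate: float,
    value: int = 1,
    rng: Callable[[], float] = random.random,
) -> Optional[bytes]:
    """Return a StatsD counter datagram, or None if this event is skipped."""
    if not 0 < sample_rate <= 1:
        raise ValueError("sample_rate must be in (0, 1]")
    if sample_rate == 1:
        # No sampling: send every event, without a rate suffix.
        return f"{name}:{value}|c".encode("ascii")
    if rng() >= sample_rate:
        # Most events are dropped here and never touch the network.
        return None
    # The @rate suffix tells the server what fraction of events it is seeing.
    return f"{name}:{value}|c|@{sample_rate}".encode("ascii")
```

With `sample_rate=0.01`, roughly one call in a hundred produces a datagram, so a hot code path generates about 1% of the UDP traffic it otherwise would.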
StatsD has broad client-side support.