We have been looking at different monitoring tools for our testing. One that has shown promise is NetData (on the web at my-netdata.io and on Twitter as @linuxnetdata). We just wanted to show off how cool the container looks:
You can see very detailed metrics in a number of areas, all navigable via the pane on the right side.
Monitoring a Single Server with NetData
What drew us to this solution is that it is open source, hosted on GitHub, and it has a unique feature: it can monitor containers. For example, here is a container on a Lenovo X3650 server that is crunching numbers across two CPUs.
With this view you can see the overall system and interrupt picture, and also drill down into each container to see its resource utilization. For example, if a container is compute-, network-, or disk I/O-heavy, My-NetData makes it easy to drill into the system and see what is going on.
Another great feature is that My-NetData actually monitors the resources it is using while providing metrics. Here is what that looks like:
We like the fact that the default theme is dark. It makes static dashboard reporting easy and looks great if you want to have a monitor in the office set up as a dashboard for an engineering team to see.
Monitoring Multiple Servers with NetData
Instead of centralizing metrics, the NetData service creates a local registry that lets you switch between the nodes you are monitoring via a menu in the top left.
That means each server can collect detailed metrics without the overhead of constantly submitting them to a centralized server. The idea is that keeping collection local to each node allows a higher polling frequency.
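Because each node keeps its own metrics, a script that wants numbers from several machines has to ask each one directly. Here is a minimal sketch of that idea in Python. It assumes each node runs the NetData agent on its default port 19999 and exposes the `/api/v1/data` endpoint, and that the JSON response carries a `labels` list and a `data` list of rows with the timestamp first. The helper names and any host names you pass in are our own placeholders, not part of NetData.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def data_url(host, chart="system.cpu", seconds=60, port=19999):
    """Build the query URL for one chart on one node (port 19999 assumed)."""
    query = urlencode({"chart": chart, "after": -seconds, "format": "json"})
    return f"http://{host}:{port}/api/v1/data?{query}"


def latest_row(payload):
    """Pair each label with its value from the most recent row.

    Assumes payload looks like {"labels": [...], "data": [[time, v1, ...], ...]}.
    """
    newest = max(payload["data"], key=lambda row: row[0])
    return dict(zip(payload["labels"], newest))


def poll(hosts, chart="system.cpu"):
    """Ask every node for its own metrics; there is no central server involved."""
    results = {}
    for host in hosts:
        with urlopen(data_url(host, chart), timeout=5) as resp:
            results[host] = latest_row(json.load(resp))
    return results
```

Calling `poll(["node1", "node2"])` with your own host names should return one dictionary per node, keyed by the chart's dimension names. Check the response shape against your installed version before relying on it.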
We really like the My-NetData project, and you should check it out. While there are other, more robust monitoring tools for huge clusters, if your machines number in the dozens, you should give this project a try. It is sleek and fast.