K3s Monitor

The K3s Monitor tool is a comprehensive Python utility designed to collect, analyze and report on resource utilization and performance metrics from a K3s cluster. This tool is particularly useful for diagnosing performance issues, capacity planning and understanding resource consumption patterns in production environments.

Tool Features

  • Cluster Resource Monitoring: Collects various resource metrics from nodes and pods
  • Component-Specific Monitoring: Tracks resource usage for all K3s cluster components
  • Log Collection: Gathers logs from system services and Kubernetes components
  • Automated Analysis: Identifies high resource consumption and potential issues
  • Comparative Reporting: Compares current metrics with previous monitoring runs
  • Comprehensive Summary: Generates detailed reports with recommendations, ready for AI-assisted analysis with tools like Claude

Prerequisites

The K3s Monitor tool requires the following dependencies, which the Provisioning playbook deploys automatically:

  • Python 3.8+
  • python3-kubernetes library
  • python3-yaml library
  • kubectl configured to access the K3s cluster
  • journalctl for log collection
  • jq for JSON processing
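
The prerequisites above can be verified before a run with a short script. The following is a minimal sketch, not part of the K3s Monitor tool itself: the function name `missing_prerequisites` is hypothetical, and the module names `kubernetes` and `yaml` are assumed to be the import names provided by the python3-kubernetes and python3-yaml packages. It only checks that each item is present, not that kubectl is actually configured to reach the cluster.

```python
import importlib.util
import shutil
import sys

# Command-line tools and Python modules named in the prerequisites list.
REQUIRED_COMMANDS = ("kubectl", "journalctl", "jq")
REQUIRED_MODULES = ("kubernetes", "yaml")  # assumed import names


def missing_prerequisites():
    """Return human-readable descriptions of any missing prerequisites."""
    missing = []
    if sys.version_info < (3, 8):
        missing.append("Python 3.8+ (found %d.%d)" % sys.version_info[:2])
    for command in REQUIRED_COMMANDS:
        # shutil.which returns None when the command is not on PATH.
        if shutil.which(command) is None:
            missing.append("command: " + command)
    for module in REQUIRED_MODULES:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append("python module: " + module)
    return missing
```

An empty list from `missing_prerequisites()` suggests that all listed dependencies are installed on the node.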

Generated Reports

The following reports are generated:

  • cilium-metrics.log: Detailed Cilium networking status, endpoints and services information
  • cluster-info.log: Basic information about the cluster
  • comparison.log: Comparison with previous monitoring runs
  • component-metrics.csv: Time-series data for component resource usage
  • etcd-metrics.log: Status of HA clusters, etcd cluster health and metrics
  • k3s-monitor.log: Operational log of the monitoring tool itself, including all actions taken during execution
  • log-summary.txt: Summary of important log events (errors, warnings)
  • pod-metrics.csv: Detailed pod-level resource metrics
  • sysctl.txt: System kernel parameter settings
  • summary.log: Overall resource usage summary and recommendations
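
The CSV reports can be inspected with standard tooling. The sketch below is one possible way to get a quick overview of a report such as pod-metrics.csv or component-metrics.csv. It makes no assumption about column names, since those are not documented here, and the helper name `csv_overview` is hypothetical.

```python
import csv


def csv_overview(path):
    """Return (column names, data row count) for a CSV report."""
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle)
        # Iterating consumes the header line first, then counts data rows.
        row_count = sum(1 for _ in reader)
        return list(reader.fieldnames or []), row_count
```

For example, `csv_overview("pod-metrics.csv")` returns the header columns found in the file together with the number of recorded samples.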

The directory and file structure below contains the generated reports.

Note

For AI-assisted analysis, upload the generated tarball to a chat with Claude and ask for an analysis of your K3s cluster metrics and performance.

      • cilium-metrics.log
      • cluster-info.log
      • comparison.log
      • component-metrics.csv
      • etcd-metrics.log
      • k3s-monitor.log
      • log-summary.txt
      • pod-metrics.csv
          • argo-cd_YYYYMMDD-HHMMSS.log
          • cert-manager_YYYYMMDD-HHMMSS.log
          • cilium_YYYYMMDD-HHMMSS.log
          • coredns_YYYYMMDD-HHMMSS.log
          • external-dns_YYYYMMDD-HHMMSS.log
          • kured_YYYYMMDD-HHMMSS.log
          • longhorn_YYYYMMDD-HHMMSS.log
          • metrics-server_YYYYMMDD-HHMMSS.log
          • victorialogs_YYYYMMDD-HHMMSS.log
          • victoriametrics_YYYYMMDD-HHMMSS.log
        • containerd.log
        • k3s.log
        • kubelet.log
      • summary.log
      • sysctl.txt
    • k3s-monitor-YYYYMMDD-HHMMSS.tar.gz
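
Before sharing a tarball, it can be useful to confirm which reports it contains. The following is a minimal sketch using Python's standard `tarfile` module. The helper name `list_report_files` is hypothetical, and the archive is assumed to be gzip-compressed, as the `.tar.gz` suffix suggests.

```python
import tarfile


def list_report_files(tarball_path):
    """Return the sorted names of regular files stored in a report tarball."""
    with tarfile.open(tarball_path, "r:gz") as archive:
        # Skip directory entries and keep only regular files.
        return sorted(
            member.name for member in archive.getmembers() if member.isfile()
        )
```

The returned list can be compared against the report names described above to spot a missing file.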

Tool Usage

Log in to one of the server nodes and run the tool:

ssh apollo
sudo k3s-monitor --help
usage: k3s-monitor [-h] [-d DURATION] [-i INTERVAL] [-l LOG_DIR] [-m LOG_MAX_SIZE] [-n NAMESPACE] [-v]

K3s Cluster Monitor

options:
  -h, --help            show this help message and exit
  -d DURATION, --duration DURATION
                        Total monitoring duration in seconds (default: 3600)
  -i INTERVAL, --interval INTERVAL
                        Time between metric collections in seconds (default: 300)
  -l LOG_DIR, --log-dir LOG_DIR
                        Directory to store logs and reports (default: /var/log/k3s)
  -m LOG_MAX_SIZE, --log-max-size LOG_MAX_SIZE
                        Maximum log file size in MB (default: 50)
  -n NAMESPACE, --namespace NAMESPACE
                        Default namespace (default: kube-system)
  -v, --verbose         Enable verbose logging (default: False)
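
For readers who want to script around this interface, the help output above can be approximated with Python's `argparse`. This is a sketch reconstructed from the help text only, not the tool's actual source: the function name `build_parser` and the integer types for the numeric options are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options shown in the help output."""
    parser = argparse.ArgumentParser(
        prog="k3s-monitor",
        description="K3s Cluster Monitor",
        # Appends "(default: ...)" to each option's help text.
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-d", "--duration", type=int, default=3600,
                        help="Total monitoring duration in seconds")
    parser.add_argument("-i", "--interval", type=int, default=300,
                        help="Time between metric collections in seconds")
    parser.add_argument("-l", "--log-dir", default="/var/log/k3s",
                        help="Directory to store logs and reports")
    parser.add_argument("-m", "--log-max-size", type=int, default=50,
                        help="Maximum log file size in MB")
    parser.add_argument("-n", "--namespace", default="kube-system",
                        help="Default namespace")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose logging")
    return parser
```

Calling `build_parser().parse_args([])` yields the documented defaults, which makes the sketch convenient for validating arguments in a wrapper script before invoking the real tool.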

The examples below show common K3s Monitor usage patterns.

Examples

Monitor components for 24 hours with 15-minute intervals:

sudo k3s-monitor --duration 86400 --interval 900

Store logs in a custom directory with verbose output:

sudo k3s-monitor --log-dir /home/user/k3s-monitoring --verbose

Monitor components deployed in a different namespace:

sudo k3s-monitor --namespace monitoring

Run a quick 10-minute check with 1-minute intervals:

sudo k3s-monitor --duration 600 --interval 60
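
When choosing values for `--duration` and `--interval`, it helps to estimate how many collection cycles a run will produce. The sketch below assumes one collection per full interval. Whether the tool also collects once at startup is not stated here, so treat the result as an approximation. The helper name `collection_count` is hypothetical.

```python
def collection_count(duration, interval):
    """Approximate number of collection cycles that fit in a run."""
    if duration <= 0 or interval <= 0:
        raise ValueError("duration and interval must be positive")
    # Integer division: only complete intervals are counted.
    return duration // interval
```

Under this assumption, the 24-hour example above (`--duration 86400 --interval 900`) gives 96 cycles, the quick check (`--duration 600 --interval 60`) gives 10, and the defaults (3600 and 300) give 12.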

Best Practices

  • Regular Monitoring: Run the tool periodically (e.g., weekly) to establish baseline metrics
  • After Changes: Run after cluster upgrades or significant workload changes
  • Retention: Keep monitoring results for trend analysis
  • Size Appropriately: Adjust duration and interval based on cluster size:
    • Small clusters: 1-hour duration, 5-minute intervals
    • Large clusters: 6-hour duration, 15-minute intervals
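
The sizing guidance above can be captured as presets in a small wrapper. This is a sketch under stated assumptions: the preset labels `small` and `large` and the helper `monitor_command` are hypothetical, and the document gives no node-count threshold separating the two sizes, so the caller chooses the label.

```python
# Durations and intervals in seconds, taken from the sizing guidance.
PRESETS = {
    "small": {"duration": 3600, "interval": 300},   # 1 hour, 5 minutes
    "large": {"duration": 21600, "interval": 900},  # 6 hours, 15 minutes
}


def monitor_command(cluster_size):
    """Return the k3s-monitor command line for a preset, as an argument list."""
    preset = PRESETS[cluster_size]
    return [
        "sudo", "k3s-monitor",
        "--duration", str(preset["duration"]),
        "--interval", str(preset["interval"]),
    ]
```

The resulting list can be passed to `subprocess.run(monitor_command("small"), check=True)` from a scheduled job to support the periodic runs recommended above.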