K3s Monitor

The K3s Monitor tool is a comprehensive Python utility designed to collect, analyze and report on resource utilization and performance metrics from a K3s cluster. This tool is particularly useful for diagnosing performance issues, capacity planning and understanding resource consumption patterns in production environments.

Tool Features

  • Cluster Resource Monitoring: Collects various resource metrics from nodes and pods
  • Component-Specific Monitoring: Tracks resource usage for all K3s cluster components
  • Log Collection: Gathers logs from system services and Kubernetes components
  • Automated Analysis: Identifies high resource consumption and potential issues
  • Comparative Reporting: Compares current metrics with previous monitoring runs
  • Comprehensive Summary: Generates detailed reports with recommendations, ready for AI-assisted analysis with tools like Claude

Prerequisites

The K3s Monitor tool requires the following dependencies, which are deployed automatically by the Provisioning playbook:

  • Python 3.8+
  • python3-kubernetes library
  • python3-yaml library
  • kubectl configured to access the K3s cluster
  • journalctl for log collection
  • jq for JSON processing
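
The prerequisites above can be verified before a run. The following is a minimal sketch, not part of the K3s Monitor tool: the function name `missing_prerequisites` is hypothetical, and the check assumes that `python3-kubernetes` and `python3-yaml` provide the `kubernetes` and `yaml` modules. It only tests that the commands and modules are present; it does not confirm that kubectl is configured to reach the cluster.

```python
import importlib.util
import shutil
import sys

# Names taken from the prerequisites list; the check itself is illustrative.
REQUIRED_COMMANDS = ("kubectl", "journalctl", "jq")
REQUIRED_MODULES = ("kubernetes", "yaml")  # python3-kubernetes, python3-yaml
MIN_PYTHON = (3, 8)


def missing_prerequisites():
    """Return human-readable names of prerequisites that were not found."""
    missing = []
    if sys.version_info < MIN_PYTHON:
        missing.append("Python %d.%d+" % MIN_PYTHON)
    for command in REQUIRED_COMMANDS:
        # shutil.which returns None when the command is not on PATH
        if shutil.which(command) is None:
            missing.append(command)
    for module in REQUIRED_MODULES:
        # find_spec returns None when a top-level module cannot be imported
        if importlib.util.find_spec(module) is None:
            missing.append("python module: " + module)
    return missing


if __name__ == "__main__":
    problems = missing_prerequisites()
    if problems:
        print("Missing prerequisites: " + ", ".join(problems))
    else:
        print("All prerequisites found")
```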

Generated Reports

The following reports are generated:

  • cilium-metrics.log: Detailed Cilium networking status, endpoints and services information
  • cluster-info.log: Basic information about the cluster
  • comparison.log: Comparison with previous monitoring runs
  • component-metrics.csv: Time-series data for component resource usage
  • etcd-metrics.log: Status of HA clusters, etcd cluster health and metrics
  • k3s-monitor.log: Operational log of the monitoring tool itself, including all actions taken during execution
  • log-summary.txt: Summary of important log events (errors, warnings)
  • pod-metrics.csv: Detailed pod-level resource metrics
  • sysctl.txt: System kernel parameter settings
  • summary.log: Overall resource usage summary and recommendations

Note

For AI-assisted analysis, upload the generated tarball to a chat with Claude and ask for an analysis of your K3s cluster metrics and performance.

The generated reports are stored in the following directory and file structure:

      • cilium-metrics.log
      • cluster-info.log
      • comparison.log
      • component-metrics.csv
      • etcd-metrics.log
      • k3s-monitor.log
      • log-summary.txt
      • pod-metrics.csv
          • argo-cd_YYYYMMDD-HHMMSS.log
          • cert-manager_YYYYMMDD-HHMMSS.log
          • cilium_YYYYMMDD-HHMMSS.log
          • coredns_YYYYMMDD-HHMMSS.log
          • external-dns_YYYYMMDD-HHMMSS.log
          • kured_YYYYMMDD-HHMMSS.log
          • longhorn_YYYYMMDD-HHMMSS.log
          • metrics-server_YYYYMMDD-HHMMSS.log
          • victorialogs_YYYYMMDD-HHMMSS.log
          • victoriametrics_YYYYMMDD-HHMMSS.log
        • containerd.log
        • k3s.log
        • kubelet.log
      • summary.log
      • sysctl.txt
    • k3s-monitor-YYYYMMDD-HHMMSS.tar.gz
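
Before uploading the tarball, it can be useful to confirm which report files it contains. The sketch below uses only the Python standard library; the function name `list_report_files` is hypothetical, and the tarball path is supplied by the caller because the exact location depends on the `--log-dir` setting.

```python
import tarfile


def list_report_files(tarball_path):
    """Return the sorted names of regular files stored in a gzip tarball."""
    with tarfile.open(tarball_path, "r:gz") as archive:
        # Skip directory entries so only report files are listed
        return sorted(
            member.name for member in archive.getmembers() if member.isfile()
        )
```

A call such as `list_report_files("k3s-monitor-YYYYMMDD-HHMMSS.tar.gz")`, with the timestamp replaced by that of an actual run, returns the file names to compare against the structure listed above.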

Tool Usage

    Log in to one of the server nodes and run the tool:

    ssh apollo
    sudo k3s-monitor --help
    usage: k3s-monitor [-h] [-d DURATION] [-i INTERVAL] [-l LOG_DIR] [-m LOG_MAX_SIZE] [-n NAMESPACE] [-v]
    
    K3s Cluster Monitor
    
    options:
      -h, --help            show this help message and exit
      -d DURATION, --duration DURATION
                            Total monitoring duration in seconds (default: 3600)
      -i INTERVAL, --interval INTERVAL
                            Time between metric collections in seconds (default: 300)
      -l LOG_DIR, --log-dir LOG_DIR
                            Directory to store logs and reports (default: /var/log/k3s)
      -m LOG_MAX_SIZE, --log-max-size LOG_MAX_SIZE
                            Maximum log file size in MB (default: 50)
      -n NAMESPACE, --namespace NAMESPACE
                            Default namespace (default: kube-system)
      -v, --verbose         Enable verbose logging (default: False)
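
    The help output above has the shape produced by Python's `argparse` module. As an illustration only, the sketch below defines a parser with the same options and defaults; the tool's actual source may differ, and the integer types and the `build_parser` name are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented k3s-monitor options."""
    parser = argparse.ArgumentParser(
        prog="k3s-monitor",
        description="K3s Cluster Monitor",
        # Appends "(default: ...)" to each option's help text
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
    )
    parser.add_argument("-d", "--duration", type=int, default=3600,
                        help="Total monitoring duration in seconds")
    parser.add_argument("-i", "--interval", type=int, default=300,
                        help="Time between metric collections in seconds")
    parser.add_argument("-l", "--log-dir", default="/var/log/k3s",
                        help="Directory to store logs and reports")
    parser.add_argument("-m", "--log-max-size", type=int, default=50,
                        help="Maximum log file size in MB")
    parser.add_argument("-n", "--namespace", default="kube-system",
                        help="Default namespace")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose logging")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```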

    The examples below show common K3s Monitor invocations.

    Examples

    Monitor components for 24 hours with 15-minute intervals:

    sudo k3s-monitor --duration 86400 --interval 900

    Store logs in a custom directory with verbose output:

    sudo k3s-monitor --log-dir /home/user/k3s-monitoring --verbose

    Monitor components deployed in a different namespace:

    sudo k3s-monitor --namespace monitoring

    Run a quick 10-minute check with 1-minute intervals:

    sudo k3s-monitor --duration 600 --interval 60

    Best Practices

    • Regular Monitoring: Run the tool periodically (e.g., weekly) to establish baseline metrics
    • After Changes: Run after cluster upgrades or significant workload changes
    • Retention: Keep monitoring results for trend analysis
    • Size Appropriately: Adjust duration and interval based on cluster size:
      • Small clusters: 1-hour duration, 5-minute intervals
      • Large clusters: 6-hour duration, 15-minute intervals
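
    The sizing guidance above can be turned into command-line arguments with a small helper. This is a sketch under stated assumptions: the `small` and `large` labels, the `SIZING` table and both function names are hypothetical, since the documentation does not define a node-count threshold between the two sizes. The collection count assumes one collection per interval and may differ by one depending on how the tool schedules its first and last collection.

```python
# Durations and intervals in seconds, taken from the sizing guidance.
SIZING = {
    "small": {"duration": 3600, "interval": 300},   # 1 hour, 5-minute intervals
    "large": {"duration": 21600, "interval": 900},  # 6 hours, 15-minute intervals
}


def build_command(cluster_size):
    """Return the k3s-monitor command line for a hypothetical size label."""
    settings = SIZING[cluster_size]
    return [
        "sudo", "k3s-monitor",
        "--duration", str(settings["duration"]),
        "--interval", str(settings["interval"]),
    ]


def expected_collections(duration, interval):
    """Approximate number of collections, assuming one per interval."""
    return duration // interval


if __name__ == "__main__":
    for size, settings in SIZING.items():
        count = expected_collections(settings["duration"], settings["interval"])
        print(size, " ".join(build_command(size)), "collections:", count)
```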