Find Your Scale

Use this guide to right-size your DevOpsGenie deployment — from a lean development cluster to a multi-tenant production environment handling billions of events per day.

Cluster Tiers

Starter (Development / Evaluation)

Ideal for: small teams evaluating DevOpsGenie, development environments, internal tooling.

Component	Spec	Monthly AWS Cost
EKS Control Plane	Managed	~$73
System Node Group	2× m6i.large (2 vCPU / 8 GiB each)	~$140
Workload Node Group	2× m6i.xlarge (4 vCPU / 16 GiB each)	~$280
NAT Gateway	1×	~$35
Total		~$530/month

# Starter sizing in Terraform
module "eks" {
  # ... (other module arguments omitted)
  eks_managed_node_groups = {
    system = {
      instance_types = ["m6i.large"]
      min_size       = 2
      max_size       = 3
      desired_size   = 2
    }
    workloads = {
      instance_types = ["m6i.xlarge"]
      min_size       = 2
      max_size       = 10
      desired_size   = 2
    }
  }
}

Growth (Small Production)

Ideal for: production workloads with up to 20 microservices, 3–5 engineering teams.

Component	Spec	Monthly AWS Cost
EKS Control Plane	Managed	~$73
System Node Group	3× m6i.xlarge (HA across 3 AZs)	~$420
Workload Nodes (Karpenter)	~8× m6i.2xlarge on-demand	~$2,400
Spot Nodes (Karpenter)	~4× m6i.2xlarge Spot	~$300
ALB	2×	~$40
NAT Gateway	3× (HA)	~$105
Total		~$3,338/month

Scale (Mid-Size Production)

Ideal for: 50+ microservices, 10–20 teams, regulated industry workloads.

Component	Spec	Monthly AWS Cost
EKS Control Plane	Managed	~$73
System Node Group	3× m6i.2xlarge	~$840
Workload Nodes (Karpenter)	~20× m6i.4xlarge mix	~$8,000
Spot Nodes (Karpenter)	~10× m6i.4xlarge Spot	~$1,200
ALB + WAF	4×	~$200
NAT Gateway	3×	~$105
Total		~$10,418/month
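
The tier totals are plain sums of their line items, so it is easy to re-derive them or to see how a total moves when a node count changes. The following is a minimal sketch in Python, assuming the approximate monthly figures from the tables above; the dictionary keys and the helper names (tier_total, with_node_count) are illustrative and not part of any DevOpsGenie tooling. It also assumes node cost scales linearly with node count, which ignores data transfer, storage, and Spot price movement.

```python
# Approximate monthly costs in USD, copied from the tier tables above.
TIERS = {
    "starter": {"eks_control_plane": 73, "system_nodes": 140,
                "workload_nodes": 280, "nat_gateway": 35},
    "growth": {"eks_control_plane": 73, "system_nodes": 420,
               "workload_nodes": 2400, "spot_nodes": 300,
               "alb": 40, "nat_gateway": 105},
    "scale": {"eks_control_plane": 73, "system_nodes": 840,
              "workload_nodes": 8000, "spot_nodes": 1200,
              "alb_waf": 200, "nat_gateway": 105},
}


def tier_total(tier: str) -> int:
    """Sum every line item of one tier."""
    return sum(TIERS[tier].values())


def with_node_count(tier: str, component: str,
                    listed_nodes: int, new_nodes: int) -> float:
    """Tier total after rescaling one node-based line item linearly."""
    items = TIERS[tier]
    rescaled = items[component] * new_nodes / listed_nodes
    return tier_total(tier) - items[component] + rescaled


if __name__ == "__main__":
    for name in TIERS:
        print(f"{name}: ~${tier_total(name):,}/month")
    # Growth tier with 12 on-demand workload nodes instead of the listed 8.
    print(f"growth, 12 workload nodes: ~${with_node_count('growth', 'workload_nodes', 8, 12):,.0f}/month")
```

The Starter items sum to $528, which the table rounds to ~$530; the Growth and Scale totals match their tables exactly.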

Enterprise (Large-Scale Production)

Ideal for: 100+ microservices, 50+ teams, multi-region, high compliance requirements.

Contact sales@devopsgenie.io for enterprise sizing assistance.

Resource Requests Reference

Platform Components (System Node Pool)

Component	CPU Request	Memory Request	Replicas
ArgoCD Server	100m	256Mi	2
ArgoCD Repo Server	250m	512Mi	2
Prometheus	500m	2Gi	1 (HA: 2)
Alertmanager	100m	128Mi	2
Grafana	200m	256Mi	1
Loki	500m	1Gi	1 (HA: 3)
OPA Gatekeeper	200m	512Mi	3
cert-manager	100m	64Mi	1
External Secrets	200m	128Mi	1
Karpenter	200m	256Mi	2
CoreDNS	100m	70Mi	2–5
Total (one replica each)	~2.5 vCPU	~5 GiB	
Total (minimum replicas listed)	~3.6 vCPU	~7.3 GiB	
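
Summing the table's per-replica requests gives about 2.5 vCPU and 5 GiB; multiplying each row by its minimum replica count gives a larger footprint, which is the number that matters when sizing the system node pool. The sketch below reproduces both figures in Python. It assumes the non-HA replica counts (Prometheus and Loki at 1, CoreDNS at 2), and the helper names are illustrative.

```python
def cpu_millicores(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as '250m' or '2' to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def memory_mib(quantity: str) -> int:
    """Convert a memory quantity with an Mi or Gi suffix to MiB."""
    if quantity.endswith("Gi"):
        return int(float(quantity[:-2]) * 1024)
    if quantity.endswith("Mi"):
        return int(quantity[:-2])
    raise ValueError(f"unsupported memory quantity: {quantity}")


# (component, CPU request, memory request, minimum replicas) from the table.
COMPONENTS = [
    ("ArgoCD Server", "100m", "256Mi", 2),
    ("ArgoCD Repo Server", "250m", "512Mi", 2),
    ("Prometheus", "500m", "2Gi", 1),
    ("Alertmanager", "100m", "128Mi", 2),
    ("Grafana", "200m", "256Mi", 1),
    ("Loki", "500m", "1Gi", 1),
    ("OPA Gatekeeper", "200m", "512Mi", 3),
    ("cert-manager", "100m", "64Mi", 1),
    ("External Secrets", "200m", "128Mi", 1),
    ("Karpenter", "200m", "256Mi", 2),
    ("CoreDNS", "100m", "70Mi", 2),
]


def totals(count_replicas: bool = True) -> tuple[int, int]:
    """Return (millicores, MiB) requested across all platform components."""
    cpu = sum(cpu_millicores(c) * (n if count_replicas else 1)
              for _, c, _, n in COMPONENTS)
    mem = sum(memory_mib(m) * (n if count_replicas else 1)
              for _, _, m, n in COMPONENTS)
    return cpu, mem


if __name__ == "__main__":
    for label, flag in (("one replica each", False), ("minimum replicas", True)):
        cpu, mem = totals(flag)
        print(f"{label}: {cpu / 1000:.2f} vCPU, {mem / 1024:.2f} GiB")
```

DaemonSets, kubelet reservations, and the HA replica counts all add to these numbers, so treat the result as a floor.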

Workload Sizing Guidelines

Workload Type	CPU Request	Memory Request	Notes
REST API (Go/Rust)	100–250m	64–256Mi	Efficient runtimes
REST API (Node.js)	250–500m	256–512Mi	Event-loop model
REST API (Java/Spring)	500m–1	512Mi–1Gi	JVM overhead
Background worker	250–500m	256–512Mi	Depends on batch size
ML inference (CPU)	2–4	4–8Gi	Consider GPU nodes
Data pipeline	1–4	2–8Gi	Highly variable
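
To turn these request ranges into a node count, a rough first estimate is how many pods of a given shape fit on one node. The sketch below is a simplified Python calculation under stated assumptions: it reserves a flat 10% of each node for the kubelet, system daemons, and DaemonSets (the real allocatable figure depends on instance type and configuration), and it ignores the per-node pod limit imposed by the VPC CNI. The function name and the reserved_percent parameter are illustrative.

```python
def pods_per_node(cpu_request_m: int, mem_request_mi: int,
                  node_vcpu: int, node_mem_gib: int,
                  reserved_percent: int = 10) -> int:
    """Estimate how many identical pods fit on one node by requests alone."""
    # Assumed allocatable capacity after a flat reservation.
    alloc_cpu_m = node_vcpu * 1000 * (100 - reserved_percent) // 100
    alloc_mem_mi = node_mem_gib * 1024 * (100 - reserved_percent) // 100
    # The scarcer of the two resources decides the pod count.
    return min(alloc_cpu_m // cpu_request_m, alloc_mem_mi // mem_request_mi)


if __name__ == "__main__":
    # Upper end of the Java/Spring range on an m6i.xlarge (4 vCPU / 16 GiB).
    print("Java/Spring 1 CPU / 1Gi:", pods_per_node(1000, 1024, 4, 16))
    # Upper end of the Go/Rust range on the same node.
    print("Go/Rust 250m / 256Mi:", pods_per_node(250, 256, 4, 16))
```

In both examples CPU is the binding resource, which suggests that memory-heavier instance families are unlikely to help for these workload shapes.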

Autoscaling Recommendations

HPA (Horizontal Pod Autoscaler)

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: payments-api
  namespace: team-payments
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payments-api
  minReplicas: 3
  maxReplicas: 50
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
        - type: Percent
          value: 25
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Pods
          value: 4
          periodSeconds: 60
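
To see what this manifest does in practice, it helps to work through the HPA calculation by hand. The controller computes ceil(currentReplicas × currentUtilization / targetUtilization) for each metric, takes the largest result, and then clamps it to minReplicas/maxReplicas and to the behavior policies. The Python sketch below models that under simplifying assumptions: it uses the default 10% tolerance, ignores the stabilization window and pod readiness, and its rounding of the percentage-based scale-down limit is an approximation of the controller's behavior. Function names are illustrative.

```python
import math

# Targets from the manifest above (percent of requested resources).
TARGETS = {"cpu": 60, "memory": 70}


def desired_replicas(current: int, utilization: dict,
                     min_replicas: int = 3, max_replicas: int = 50,
                     tolerance: float = 0.10) -> int:
    """Replica count the HPA formula proposes before rate limiting."""
    proposals = []
    for metric, target in TARGETS.items():
        ratio = utilization[metric] / target
        if abs(ratio - 1.0) <= tolerance:
            proposals.append(current)  # close enough to target: no change
        else:
            proposals.append(math.ceil(current * ratio))
    # With several metrics, the largest proposal wins.
    return max(min_replicas, min(max_replicas, max(proposals)))


def rate_limited(current: int, desired: int,
                 up_pods: int = 4, down_percent: int = 25) -> int:
    """Apply the per-60s scaleUp (Pods) and scaleDown (Percent) policies."""
    if desired > current:
        return min(desired, current + up_pods)
    floor = current - (current * down_percent) // 100
    return max(desired, floor)


if __name__ == "__main__":
    want = desired_replicas(10, {"cpu": 90, "memory": 50})
    print("proposed:", want, "after one 60s period:", rate_limited(10, want))
```

With 10 replicas at 90% CPU, the formula asks for 15, but the scaleUp policy admits only 4 new pods per minute, so the deployment reaches 14 first and 15 in the next period.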

VPA (Vertical Pod Autoscaler)

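Start with VPA in recommendation-only mode (updateMode: "Off") to find the right requests. Keep it in this mode while the HPA above scales on CPU and memory, because VPA should not actively resize a workload that an HPA is scaling on the same metrics: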

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: payments-api
  namespace: team-payments
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: payments-api
  updatePolicy:
    updateMode: "Off"   # Recommendation only — don't restart pods

# View VPA recommendations after 24–48h of production traffic
kubectl get vpa payments-api -n team-payments -o yaml | \
  yq '.status.recommendation'
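
Once recommendations exist, the target values can be copied into the Deployment's resource requests. The Python sketch below shows one way to extract them from the JSON form of the VPA object (kubectl get vpa ... -o json). The sample document and its numbers are made up for illustration; only the status.recommendation.containerRecommendations structure follows the VPA API, and the function name is hypothetical.

```python
import json

# Hypothetical output of:
#   kubectl get vpa payments-api -n team-payments -o json
# The values are illustrative, not real measurements.
SAMPLE = json.loads("""
{
  "status": {
    "recommendation": {
      "containerRecommendations": [
        {
          "containerName": "payments-api",
          "lowerBound": {"cpu": "120m", "memory": "210Mi"},
          "target": {"cpu": "180m", "memory": "300Mi"},
          "upperBound": {"cpu": "400m", "memory": "520Mi"}
        }
      ]
    }
  }
}
""")


def suggested_requests(vpa: dict) -> dict:
    """Map each container name to the VPA target (CPU and memory requests)."""
    recommendation = vpa.get("status", {}).get("recommendation", {})
    containers = recommendation.get("containerRecommendations", [])
    return {c["containerName"]: dict(c["target"]) for c in containers}


if __name__ == "__main__":
    for container, requests in suggested_requests(SAMPLE).items():
        print(f"{container}: cpu={requests['cpu']} memory={requests['memory']}")
```

A VPA that has not yet produced a recommendation has no status.recommendation field, in which case the function returns an empty mapping.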

Cost Optimization Tips

Use Spot/Preemptible instances for stateless workloads — 70–80% savings over on-demand
Enable Karpenter consolidation to right-size nodes automatically
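Set memory limits equal to memory requests so a leaking pod is OOM-killed on its own instead of starving its neighbours; prefer accurate CPU requests over tight CPU limits, because a CPU limit throttles the container even when the node has idle cores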
Use Graviton/ARM nodes (AWS m7g) for compatible workloads: roughly 20% better price-performance than comparable x86 instances
Schedule batch workloads off-peak using the CronJob schedule field, with startingDeadlineSeconds to bound how late a missed run may still start
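
The Spot figure can be checked against the Growth tier table, where an on-demand m6i.2xlarge works out to about $300 per node per month and a Spot node to about $75, a 75% discount. The sketch below is a small Python calculation of the blended cost for a mixed fleet; the per-node prices are derived from that table, the function names are illustrative, and real Spot prices vary by region, availability zone, and time.

```python
def blended_cost(total_nodes: int, spot_nodes: int,
                 on_demand_per_node: float, spot_discount: float) -> float:
    """Monthly cost of a fleet where some nodes run on Spot."""
    spot_per_node = on_demand_per_node * (1 - spot_discount)
    on_demand_nodes = total_nodes - spot_nodes
    return on_demand_nodes * on_demand_per_node + spot_nodes * spot_per_node


def savings_fraction(total_nodes: int, spot_nodes: int,
                     spot_discount: float) -> float:
    """Fraction saved versus running every node on-demand."""
    return spot_nodes / total_nodes * spot_discount


if __name__ == "__main__":
    # Growth tier: 8 on-demand + 4 Spot m6i.2xlarge nodes.
    print(f"blended: ${blended_cost(12, 4, 300, 0.75):,.0f}/month")
    print(f"all on-demand: ${blended_cost(12, 0, 300, 0.75):,.0f}/month")
    print(f"savings: {savings_fraction(12, 4, 0.75):.0%}")
```

Moving a third of the fleet to Spot at a 75% discount trims the node bill by a quarter, so the overall saving depends far more on how much of the workload is safe to interrupt than on the headline discount.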

# Get a cost estimate for your current configuration
devopsgenie sizing estimate \
  --provider aws \
  --region us-east-1 \
  --tier growth
