Monitoring and Cost Optimization

Cost Optimization Framework

Level 1: Basic Monitoring (Day 1-7)

Basic Setup:

  1. AWS Budgets

    • Create budgets that catch unexpected spend early enough to respond:
      • Budget 1: $50/month (Alert at 80%)
      • Budget 2: $25/month (Alert at 50%)
      • Budget 3: $10/day (Alert at 100%)
  2. Billing Alerts

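    • Create CloudWatch alarms on estimated charges (billing metrics are published only in us-east-1 and require billing alerts to be enabled in the account's billing preferences):
      • $25 threshold: email warning
      • $50 threshold: email + SMS
      • $75 threshold: email + SMS + Slack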
  3. Cost Explorer

    • Enable daily cost reports
    • Set up service-level breakdown
    • Monitor top 5 cost drivers
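
The budgets and alarm tiers listed above can be sketched as request payloads for boto3. This is a minimal sketch, not a definitive setup: the budget names, the alarm naming scheme, the `topic_arns` mapping, and the helper functions are all hypothetical, and it assumes each notification channel (email, SMS, Slack) is reachable through its own SNS topic. Slack delivery in particular needs an extra integration subscribed to that topic.

```python
# Budget definitions from the list above: (name, limit in USD, time unit, alert %).
# The names are hypothetical.
BUDGETS = [
    ("budget-1-monthly", 50, "MONTHLY", 80),
    ("budget-2-monthly", 25, "MONTHLY", 50),
    ("budget-3-daily", 10, "DAILY", 100),
]

# Alarm tiers from the list above: threshold in USD -> notification channels.
ALARM_TIERS = {
    25: ["email"],
    50: ["email", "sms"],
    75: ["email", "sms", "slack"],
}


def budget_request(account_id, name, limit_usd, time_unit, alert_pct, email):
    """Build keyword arguments for the Budgets create_budget call."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": name,
            "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
            "TimeUnit": time_unit,
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": float(alert_pct),
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
        ],
    }


def alarm_request(threshold_usd, channels, topic_arns):
    """Build keyword arguments for the CloudWatch put_metric_alarm call."""
    return {
        "AlarmName": f"billing-over-{threshold_usd}-usd",  # hypothetical naming
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours; billing data only updates a few times a day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arns[channel] for channel in channels],
    }


def apply_level1(account_id, email, topic_arns):
    """Create the budgets and alarms; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above work without it

    budgets = boto3.client("budgets")
    # Billing metrics are only available in us-east-1.
    cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
    for name, limit, unit, pct in BUDGETS:
        budgets.create_budget(
            **budget_request(account_id, name, limit, unit, pct, email)
        )
    for threshold, channels in ALARM_TIERS.items():
        cloudwatch.put_metric_alarm(
            **alarm_request(threshold, channels, topic_arns)
        )
```

Keeping the payload builders separate from `apply_level1` means the thresholds can be reviewed or unit-tested without touching an AWS account.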

Level 2: Advanced Analytics (Week 2-4)

Deep Dive Analysis:

  1. Custom CloudWatch Metrics

    import datetime
    
    import boto3
    
    def publish_cost_metrics():
        ce = boto3.client('ce')
        cw = boto3.client('cloudwatch')
    
        # Get yesterday's cost (the End date is exclusive in Cost Explorer)
        end = datetime.date.today()
        start = end - datetime.timedelta(days=1)
        response = ce.get_cost_and_usage(
            TimePeriod={
                'Start': start.isoformat(),
                'End': end.isoformat()
            },
            Granularity='DAILY',
            Metrics=['BlendedCost']
        )
    
        cost = float(response['ResultsByTime'][0]['Total']['BlendedCost']['Amount'])
    
        # Publish to CloudWatch. Custom namespaces must not start with "AWS/",
        # which is reserved for AWS services.
        cw.put_metric_data(
            Namespace='Custom/Billing',
            MetricData=[
                {
                    'MetricName': 'DailyCost',
                    'Value': cost,
                    'Unit': 'None'
                }
            ]
        )
    
  2. Resource Tagging Strategy

    Mandatory Tags:
    - Project: project-name
    - Environment: dev/staging/prod
    - Owner: email@domain.com
    - CostCenter: department
    - AutoShutdown: true/false
    - CreatedDate: YYYY-MM-DD
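
A tagging strategy only helps if it is enforced, so the mandatory tags above could be checked with something like the following. This is a sketch under assumptions: the value patterns (for example, the loose email check for `Owner`) and both helper functions are illustrative and not part of any AWS API.

```python
import re

# Mandatory tag keys from the list above, each with a simple value check.
# The patterns are assumptions about what counts as a valid value.
TAG_RULES = {
    "Project": re.compile(r".+"),
    "Environment": re.compile(r"dev|staging|prod"),
    "Owner": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "CostCenter": re.compile(r".+"),
    "AutoShutdown": re.compile(r"true|false"),
    "CreatedDate": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def tag_violations(tags):
    """Return a sorted list of problems found in a resource's tag dict."""
    problems = []
    for key, pattern in TAG_RULES.items():
        value = tags.get(key)
        if value is None:
            problems.append(f"missing tag: {key}")
        elif not pattern.fullmatch(value):
            problems.append(f"invalid value for {key}: {value!r}")
    return sorted(problems)


def ec2_tags_to_dict(tag_list):
    """Convert the [{'Key': ..., 'Value': ...}] form used by EC2 into a dict."""
    return {tag["Key"]: tag["Value"] for tag in tag_list or []}
```

A resource with an empty result from `tag_violations` satisfies the strategy. Anything else can be reported to the address in its `Owner` tag, or to a default contact when that tag is the one missing.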
    

Emergency Cost Control

Critical Actions (>$150 spent)

  1. Immediate Resource Audit

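    # Break down January's cost by service to find the biggest cost drivers
    # (the End date is exclusive, so 2025-02-01 covers all of January)
    aws ce get-cost-and-usage \
      --time-period Start=2025-01-01,End=2025-02-01 \
      --granularity MONTHLY \
      --metrics BlendedCost \
      --group-by Type=DIMENSION,Key=SERVICE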
    
  2. Emergency Shutdown Protocol

    # Stop all running instances tagged Critical=false
    # (instances without a Critical tag are not matched by this filter)
    aws ec2 stop-instances --instance-ids $(
      aws ec2 describe-instances \
        --filters "Name=tag:Critical,Values=false" \
                  "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].InstanceId' \
        --output text
    )
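
The audit and shutdown steps above can also be sketched in Python, which makes it easier to rank services, handle the case where no instance matches, and rehearse the shutdown with a dry run. The helper names are hypothetical. The sketch assumes the Cost Explorer response was requested with `BlendedCost` grouped by `SERVICE`, as in the audit command, and that instances carry a `Critical` tag.

```python
EMERGENCY_THRESHOLD_USD = 150.0  # from the "Critical Actions" heading


def is_emergency(month_to_date_usd):
    """True once spend exceeds the emergency threshold."""
    return month_to_date_usd > EMERGENCY_THRESHOLD_USD


def top_services(response, n=5):
    """Rank services by cost from a get_cost_and_usage response grouped by SERVICE."""
    totals = {}
    for period in response["ResultsByTime"]:
        for group in period["Groups"]:
            service = group["Keys"][0]
            amount = float(group["Metrics"]["BlendedCost"]["Amount"])
            totals[service] = totals.get(service, 0.0) + amount
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def instance_ids(response):
    """Flatten a describe_instances response into a list of instance IDs."""
    return [
        instance["InstanceId"]
        for reservation in response["Reservations"]
        for instance in reservation["Instances"]
    ]


def emergency_shutdown(dry_run=True):
    """Stop running instances tagged Critical=false; needs boto3 and credentials."""
    import boto3  # imported lazily so the helpers above work without it
    from botocore.exceptions import ClientError

    ec2 = boto3.client("ec2")
    filters = [
        {"Name": "tag:Critical", "Values": ["false"]},
        {"Name": "instance-state-name", "Values": ["running"]},
    ]
    ids = []
    for page in ec2.get_paginator("describe_instances").paginate(Filters=filters):
        ids.extend(instance_ids(page))
    if not ids:
        return []  # stop_instances rejects an empty ID list
    try:
        ec2.stop_instances(InstanceIds=ids, DryRun=dry_run)
    except ClientError as err:
        # A successful dry run is reported as a DryRunOperation error.
        if err.response["Error"]["Code"] != "DryRunOperation":
            raise
    return ids
```

Defaulting `dry_run` to `True` is a deliberate choice here: the function reports which instances would be stopped, and nothing is stopped until it is called with `dry_run=False`.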