AI Agents Integration

Learn how AI agents can interact with Conductor Framework through the REST API.

Introduction

Conductor Framework exposes a REST API, which makes it well suited to automation by AI agents. Through the API, agents can deploy services, monitor health, manage parameters, and troubleshoot issues programmatically.

Benefits of an API-Driven Approach

  • Automation: Deploy and manage Kubernetes applications without manual intervention
  • Orchestration: Coordinate complex multi-service deployments
  • Monitoring: Continuously monitor service health and events
  • Troubleshooting: Automatically diagnose and resolve deployment issues
  • Integration: Easily integrate with CI/CD pipelines and automation tools

Prompt Examples

Example 1: Creating a New Conductor

Prompt: "Create a new conductor for managing a microservices application with Redis and PostgreSQL"

Expected Output Structure:

package main

import (
    "context"
    "embed"
    "log"

    "github.com/garunski/conductor-framework/pkg/framework"
)

//go:embed manifests
var manifestFS embed.FS

func main() {
    ctx := context.Background()
    cfg := framework.DefaultConfig()
    cfg.AppName = "microservices-conductor"
    cfg.ManifestFS = manifestFS
    cfg.ManifestRoot = "manifests"
    
    if err := framework.Run(ctx, cfg); err != nil {
        log.Fatalf("Error: %v", err)
    }
}

Key Components to Include:

  • main.go with framework initialization
  • manifests/ directory with service manifests
  • DeploymentParameters CRD for configuration
  • Template-based manifests with parameter substitution

Example 2: Deploying Services

Prompt: "Deploy the frontend and redis-master services using the conductor API"

API Call:

curl -X POST http://localhost:8081/api/up \
  -H "Content-Type: application/json" \
  -d '{
    "services": ["frontend", "redis-master"]
  }'

Expected Response:

{
  "message": "Successfully deployed all manifests"
}

Error Handling: if the deployment fails, the API returns a structured error such as:

{
  "error": "reconciliation_error",
  "message": "Deployment failed",
  "details": {
    "resource": "default/Deployment/frontend",
    "error": "ImagePullBackOff"
  }
}
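
The same call can be made from an agent's own code. The sketch below uses only the Python standard library and assumes the conductor listens on http://localhost:8081, as in the curl example. The names build_up_request, deploy, and ConductorError are illustrative and not part of the framework, and the sketch assumes that failures arrive with a non-2xx HTTP status and the JSON error shape shown above.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://localhost:8081"  # same address as the curl example


class ConductorError(Exception):
    """Carries the structured error body returned by the API (illustrative name)."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def build_up_request(services, base_url=BASE_URL):
    """Build (but do not send) the POST /api/up request."""
    body = json.dumps({"services": list(services)}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/up",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def deploy(services, base_url=BASE_URL, timeout=30):
    """Send the request and return the decoded JSON reply."""
    request = build_up_request(services, base_url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response)
    except urllib.error.HTTPError as exc:
        # Assumption: error replies use a non-2xx status and a JSON body
        # with "error", "message", and optional "details" keys.
        payload = json.loads(exc.read().decode("utf-8") or "{}")
        raise ConductorError(
            payload.get("error", "unknown"),
            payload.get("message", str(exc)),
            payload.get("details"),
        ) from exc
```

Calling deploy(["frontend", "redis-master"]) against a running conductor would send the same body as the curl command. Keeping request construction separate from sending makes the payload easy to inspect without a live server.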

Example 3: Monitoring and Health Checks

Prompt: "Check the health status of all services and report any errors"

Step 1: Get Service Health

curl http://localhost:8081/api/services/health

Response:

{
  "services": [
    {
      "name": "frontend",
      "namespace": "default",
      "status": "healthy",
      "lastChecked": "2024-01-01T00:00:00Z"
    },
    {
      "name": "redis-master",
      "namespace": "default",
      "status": "unhealthy",
      "lastChecked": "2024-01-01T00:00:00Z"
    }
  ]
}

Step 2: Query Error Events

curl "http://localhost:8081/api/events?type=error&limit=10"

Error Detection Pattern:

  1. Check service health status
  2. If unhealthy, query error events for that service
  3. Analyze error messages and details
  4. Take corrective action based on error type
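
Steps 1 and 2 of this pattern can be sketched in Python as follows. The health payload matches the response shown in Step 1. The function names are illustrative, and fetch_error_events is a caller-supplied function (hypothetical here) that would wrap whichever events endpoint the agent uses. Steps 3 and 4 depend on the agent's own policy and are left out.

```python
def unhealthy_services(health):
    """Return (namespace, name) for every service not reported as healthy."""
    return [
        (svc.get("namespace", "default"), svc["name"])
        for svc in health.get("services", [])
        if svc.get("status") != "healthy"
    ]


def triage(health, fetch_error_events):
    """Map each unhealthy service to the error events fetched for it.

    fetch_error_events(namespace, name) is supplied by the caller, so this
    function makes no assumption about how events are retrieved.
    """
    report = {}
    for namespace, name in unhealthy_services(health):
        report[f"{namespace}/{name}"] = fetch_error_events(namespace, name)
    return report


# The health response from Step 1, used here as sample input.
sample_health = {
    "services": [
        {"name": "frontend", "namespace": "default", "status": "healthy"},
        {"name": "redis-master", "namespace": "default", "status": "unhealthy"},
    ]
}

if __name__ == "__main__":
    # A stand-in fetcher that returns no events; a real agent would call the API.
    print(triage(sample_health, lambda namespace, name: []))
```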

Example 4: Parameter Management

Prompt: "Update the deployment parameters to scale redis to 5 replicas"

Step 1: Get Current Parameters

curl http://localhost:8081/api/parameters

Step 2: Update Parameters

curl -X POST http://localhost:8081/api/parameters \
  -H "Content-Type: application/json" \
  -d '{
    "global": {
      "namespace": "default",
      "replicas": 1
    },
    "services": {
      "redis": {
        "replicas": 5
      }
    }
  }'

Step 3: Redeploy to Apply Changes

curl -X POST http://localhost:8081/api/update \
  -H "Content-Type: application/json" \
  -d '{"services": ["redis"]}'

Parameter Update Workflow:

  1. Retrieve current parameters
  2. Modify parameter values
  3. Update parameters via API
  4. Trigger reconciliation to apply changes
  5. Verify deployment status
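
Step 2 of this workflow, modifying the retrieved parameters, is where an agent can most easily corrupt configuration. A minimal sketch of a safer edit is shown below. It assumes the parameter document has the "global" and "services" shape used in the curl example; with_replicas is an illustrative helper, not a framework function. Because the document does not state whether POST /api/parameters merges or replaces, the sketch returns the full document so that either behaviour yields the same result.

```python
import copy


def with_replicas(params, service, replicas):
    """Return a copy of params with services.<service>.replicas set.

    The input is left untouched so the agent can diff old against new
    before posting the update.
    """
    if isinstance(replicas, bool) or not isinstance(replicas, int) or replicas < 0:
        raise ValueError("replicas must be a non-negative integer")
    updated = copy.deepcopy(params)
    updated.setdefault("services", {}).setdefault(service, {})["replicas"] = replicas
    return updated


# Parameters as they might come back from GET /api/parameters.
current = {"global": {"namespace": "default", "replicas": 1}, "services": {}}
desired = with_replicas(current, "redis", 5)
```

The desired document can then be sent with POST /api/parameters, followed by POST /api/update for the affected service.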

Example 5: Troubleshooting

Prompt: "Diagnose why the frontend service deployment failed"

Step 1: Get Error Events

curl "http://localhost:8081/api/events/errors?limit=20"

Step 2: Filter by Resource

curl "http://localhost:8081/api/events/default/Deployment/frontend?limit=10"

Common Error Patterns:

  • ImagePullBackOff: Image not found or registry access issue
  • CrashLoopBackOff: Container crashing on startup
  • Pending: Resource constraints or scheduling issues
  • RBAC Error: Insufficient permissions

Debugging Workflow:

  1. Query error events for the failing resource
  2. Examine error messages and details
  3. Check related resources (pods, services)
  4. Review manifest configuration
  5. Apply fixes and redeploy
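
Step 2 of this workflow can be partly automated by matching event text against the common error patterns listed above. The sketch below is a simple substring matcher. It assumes that event messages contain the pattern names literally, and it treats the word "forbidden" as a sign of an RBAC problem, since Kubernetes authorization failures are normally reported that way. The diagnose function and the hint table are illustrative.

```python
# Hints copied from the common error patterns listed above.
ERROR_HINTS = {
    "ImagePullBackOff": "Image not found or registry access issue",
    "CrashLoopBackOff": "Container crashing on startup",
    "Pending": "Resource constraints or scheduling issues",
}


def diagnose(message):
    """Return (pattern, hint) for the first known pattern found in message."""
    lowered = message.lower()
    for pattern, hint in ERROR_HINTS.items():
        if pattern.lower() in lowered:
            return pattern, hint
    # Assumption: permission failures mention "forbidden" or "rbac".
    if "forbidden" in lowered or "rbac" in lowered:
        return "RBAC Error", "Insufficient permissions"
    return "Unknown", "No known pattern matched; inspect the event details"


if __name__ == "__main__":
    print(diagnose("Failed to pull image: ImagePullBackOff"))
```

Substring matching is deliberately crude. An agent that has structured event fields available should prefer those over free-text matching.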

API Integration Patterns

REST API Endpoints Summary

Method  Endpoint              Purpose
GET     /healthz              Health check
GET     /readyz               Readiness check
GET     /api/services         List all services
GET     /api/services/health  Get service health status
POST    /api/up               Deploy services
POST    /api/down             Remove services
POST    /api/update           Update services
GET     /api/events           List events with filtering
GET     /api/events/errors    Get recent errors
GET     /api/parameters       Get deployment parameters
POST    /api/parameters       Update deployment parameters

Authentication Considerations

Note: Currently, the API does not require authentication. In production environments, you should:
  • Add authentication middleware (JWT, API keys, etc.)
  • Implement rate limiting
  • Restrict CORS to specific origins
  • Use TLS/HTTPS for all API calls

Error Handling Patterns

All API endpoints return structured error responses:

{
  "error": "error_code",
  "message": "Human-readable error message",
  "details": {
    "key": "value"
  }
}

Common Error Codes:

  • not_found - Resource not found
  • validation_error - Validation failed
  • invalid_yaml - Invalid YAML format
  • kubernetes_error - Kubernetes API error
  • reconciliation_error - Reconciliation failed
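
An agent usually wants to turn this JSON shape into a typed value and decide whether a retry is worthwhile. The sketch below is one way to do that. The ApiError class and parse_error function are illustrative, and treating only kubernetes_error as transient is an assumption that follows the retry example in this document rather than a documented guarantee.

```python
from dataclasses import dataclass, field

# Assumption: only kubernetes_error is worth retrying automatically.
RETRYABLE_CODES = frozenset({"kubernetes_error"})


@dataclass
class ApiError:
    """Typed view of the structured error response."""

    code: str
    message: str
    details: dict = field(default_factory=dict)

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES


def parse_error(payload):
    """Build an ApiError from a decoded JSON error body, tolerating missing keys."""
    return ApiError(
        code=payload.get("error", "unknown"),
        message=payload.get("message", ""),
        details=payload.get("details") or {},
    )


if __name__ == "__main__":
    err = parse_error({"error": "invalid_yaml", "message": "Invalid YAML format"})
    print(err.code, err.retryable)
```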

Best Practices for Agents

  • Idempotency: All deployment operations are idempotent, so they are safe to retry
  • Polling: Poll health endpoints periodically rather than assuming immediate success
  • Error Recovery: Implement retry logic with exponential backoff for transient errors
  • Event Monitoring: Query the events endpoints regularly for near-real-time status updates
  • Resource Validation: Validate parameters before updating to prevent invalid configurations
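
The polling practice can be sketched as a small loop that waits until every service reports healthy or a deadline passes. The health payload shape matches the /api/services/health response shown earlier. The wait_until_healthy name, the default timeout, and the default interval are illustrative choices, and fetch_health is a caller-supplied function that returns the decoded response.

```python
import time


def wait_until_healthy(fetch_health, timeout=120.0, interval=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll until every service is healthy.

    Returns True once all services report "healthy", or False if the
    timeout elapses first. sleep and clock are injectable for testing.
    """
    deadline = clock() + timeout
    while True:
        services = fetch_health().get("services", [])
        # An empty service list is not treated as healthy.
        if services and all(s.get("status") == "healthy" for s in services):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Injecting the sleep and clock functions keeps the loop testable without real delays. An agent would pass a fetch_health function that calls GET /api/services/health.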

Workflow Automation

Multi-Step Agent Workflows

Example: Complete deployment workflow

# 1. Check conductor health
curl http://localhost:8081/healthz

# 2. Get current parameters
curl http://localhost:8081/api/parameters

# 3. Update parameters if needed
curl -X POST http://localhost:8081/api/parameters -H "Content-Type: application/json" -d '{...}'

# 4. Deploy services
curl -X POST http://localhost:8081/api/up -H "Content-Type: application/json" -d '{"services": [...]}'

# 5. Wait and check health
sleep 10
curl http://localhost:8081/api/services/health

# 6. Monitor events
curl "http://localhost:8081/api/events?limit=10"
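
The same six steps can be expressed as one function that an agent calls. In the sketch below, call is a caller-supplied transport function with the hypothetical signature call(method, path, body=None) that returns the decoded response, and wait is a caller-supplied pause that stands in for the sleep in step 5. Neither is part of the framework. The paths are the ones used in the script above.

```python
def run_deployment(call, services, parameters=None, wait=lambda: None):
    """Run the six-step deployment workflow and return (health, events).

    call(method, path, body=None) performs one API request.
    parameters, if given, is posted only when it differs from the
    current parameter document.
    """
    call("GET", "/healthz")                                   # 1. conductor health
    current = call("GET", "/api/parameters")                  # 2. current parameters
    if parameters is not None and parameters != current:
        call("POST", "/api/parameters", parameters)           # 3. update if needed
    call("POST", "/api/up", {"services": list(services)})     # 4. deploy
    wait()                                                    # 5. wait, then check health
    health = call("GET", "/api/services/health")
    events = call("GET", "/api/events?limit=10")              # 6. recent events
    return health, events
```

Passing the transport in as a parameter keeps the workflow independent of any particular HTTP library and lets it be exercised with a recording fake.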

State Management

Agents should track:

  • Deployment state (deployed, updating, failed)
  • Last reconciliation time
  • Error history and patterns
  • Parameter change history
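
One way to hold this state is a small record per service. The sketch below is a minimal in-memory version. The ServiceState class, its field names, and the "unknown" initial status are illustrative; only the three status values come from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

VALID_STATUSES = {"deployed", "updating", "failed"}


@dataclass
class ServiceState:
    """Tracks what an agent knows about one service."""

    name: str
    status: str = "unknown"  # becomes deployed, updating, or failed
    last_reconciled: Optional[datetime] = None
    errors: list = field(default_factory=list)
    parameter_changes: list = field(default_factory=list)

    def record_result(self, status, error=None, when=None):
        """Store the outcome of a reconciliation and any error it produced."""
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.status = status
        self.last_reconciled = when or datetime.now(timezone.utc)
        if error:
            self.errors.append((self.last_reconciled, error))

    def record_parameters(self, old, new, when=None):
        """Append one entry to the parameter change history."""
        self.parameter_changes.append((when or datetime.now(timezone.utc), old, new))
```

An agent that must survive restarts would persist these records rather than keep them in memory.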

Retry Logic Patterns

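import time


class DeploymentError(Exception):
    """Raised when a deployment does not succeed."""


def deploy_with_retry(service, max_retries=3):
    # deploy_service is a placeholder for the agent's own wrapper around
    # POST /api/up; it is assumed to return an object with status and error_code.
    for attempt in range(max_retries):
        response = deploy_service(service)
        if response.status == "success":
            return response
        if response.error_code != "kubernetes_error":
            break  # Don't retry non-transient errors
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # Exponential backoff: 1s, 2s, 4s, ...
    raise DeploymentError(f"Failed to deploy {service}")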

Idempotency Handling

All deployment operations are idempotent. Agents can safely:

  • Retry failed operations without side effects
  • Re-apply configurations that are already applied
  • Call update endpoints even if nothing changed

Next Steps