IBM - Cloud DevOps/SRE Engineer

Designed and maintained resilient infrastructure enabling high uptime for IBM Cloud's Key Management Service.

Project Overview:

Led critical infrastructure initiatives for IBM Cloud's Key Management Service, focusing on high availability, security, and automated operations at enterprise scale.

Key Responsibilities:

  • Architected and maintained CI/CD pipelines for automated testing and deployment
  • Designed observability solutions for containerized services
  • Led secret rotation and security automation initiatives

Technical Environment:

  • Cloud Services: IBM Cloud's Key Management Service, Secrets Manager, LogDNA, IKS (Managed Kubernetes), Postgres DB
  • Tools: Jenkins, Docker, Ansible, Kubernetes, Prometheus, Linux, Terraform
  • Languages: Golang, Python, Bash

Major Achievements:

  • Enterprise-Scale Secret Management Automation
    • Challenge: Manual secret rotation across multiple services created significant security risks and operational burden. Quarterly compliance requirements were time-intensive, pulling team resources from critical development work and increasing risk of human error.
    • Solution & Impact: Architected an automated secret rotation pipeline with built-in compliance checks and manual override capabilities. The system automatically handled quarterly rotations while enabling on-demand updates when needed.
    • This resulted in:
      • 85% reduction in credential management time
      • 100% compliance with quarterly rotation requirements maintained
      • Enhanced security posture through elimination of manual handling
      • Freed engineering resources for strategic initiatives
  • Kubernetes Network Architecture Optimization
    • Challenge: Complex multi-region architecture requiring service-to-service communication through public APIs, creating unnecessary security risks and operational complexity. Multiple authentication layers, region-specific credentials, and both on-prem/cloud deployment options complicated the infrastructure.
    • Solution & Impact: Implemented a pod-to-pod communication strategy through Kubernetes networking configuration, replacing service-to-service calls that previously went through public APIs.
    • This resulted in:
      • Eliminated the need for external API calls between services
      • Removed authentication/authorization complexity
      • Reduced security attack surface
      • Simplified multi-region deployment architecture
      • Decreased operational overhead for credential management
  • Advanced CI/CD Pipeline Evolution
    • Challenge: Need for reliable, secure, and efficient multi-region deployment process supporting extensive testing requirements and compliance checks. Migration from Jenkins to Tekton required maintaining service reliability while enhancing automation capabilities.
    • Solution & Impact: Led the design and implementation of comprehensive CI/CD pipelines, evolving from Jenkins to Tekton while expanding automation coverage:
      • Automated feature testing integration
      • Enhanced compliance verification systems
      • Implemented robust load testing frameworks
      • Created multi-region deployment orchestration
      • Built automated monitoring and logging analysis
      • Reduced deployment issues through improved testing automation
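
The rotation policy described under Enterprise-Scale Secret Management Automation (scheduled quarterly rotation, an on-demand override, and a compliance check) can be sketched roughly as below. This is an illustrative sketch only, not the production pipeline: the 90-day window standing in for "quarterly", the Secret type, and the function and secret names are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumption: "quarterly" is modeled as a 90-day maximum secret age.
MAX_AGE = timedelta(days=90)


@dataclass
class Secret:
    name: str               # hypothetical identifier, not a real service name
    last_rotated: datetime  # timezone-aware timestamp of the last rotation


def rotation_due(secret, now, max_age=MAX_AGE, force=False):
    """Return True when the secret should be rotated.

    force=True models the manual override used for on-demand rotation.
    """
    return force or (now - secret.last_rotated) >= max_age


def compliance_report(secrets, now, max_age=MAX_AGE):
    """Return the names of secrets that have reached the maximum allowed age."""
    return [s.name for s in secrets if (now - s.last_rotated) >= max_age]


if __name__ == "__main__":
    now = datetime(2024, 4, 1, tzinfo=timezone.utc)
    secrets = [
        Secret("db-password", datetime(2024, 1, 1, tzinfo=timezone.utc)),
        Secret("api-key", datetime(2024, 3, 15, tzinfo=timezone.utc)),
    ]
    for s in secrets:
        print(s.name, "rotation due:", rotation_due(s, now))
    print("out of compliance:", compliance_report(secrets, now))
```

Keeping the age check in a pure function makes the schedule easy to unit test, and the same check can serve both the scheduled job and the compliance report.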

Key Learnings:

  • Importance of automation in maintaining security at scale
  • Balance between innovation speed and operational stability
  • Critical role of observability in modern cloud services

Skills Advanced:

  • Container Orchestration
  • Infrastructure as Code
  • CI/CD Pipeline Design
  • Cloud Security
