
shivang@cloud:~$ whoami

DevOps Engineer • SRE mindset • Platform Reliability

I build systems that don’t break.

Building scalable, reliable systems using modern cloud infrastructure, ensuring seamless operations and robust incident readiness.

Current focus: Terraform

System Status: All services operational

Last incident: resolved • postmortem completed • learnings shipped

Platforms handling $4B+ in daily trades


100+ microservices deployed


MTTD (Mean Time to Detect) ↓ 40%


RTO (Recovery Time Objective) < 30m


About

A bit of story + a bit of philosophy.

I’m Shivang, a DevOps/Site Reliability engineer who thrives on building cloud-native platforms that are both reliable and efficient. From designing systems on AWS and Azure to implementing Kubernetes and Terraform solutions, I specialize in creating environments that run seamlessly under pressure.

My approach blends technical expertise with design thinking: I focus on creating systems that not only perform well but also give teams the tools they need to operate with confidence. Whether it’s through clear, actionable alerts or responsive dashboards, my goal is to make systems both powerful and simple to use.

Cloud Infrastructure • Automation & CI/CD • Incident Management • High Availability & Scalability

What I’m working on now:

  • Multi-region infrastructure
  • Advanced SLOs and error budget strategies
  • Serverless architectures

How I think (and build)

Flip the switches. That’s basically my mindset in production.

Automation > Manual Ops


I turn repeatable work into pipelines, scripts, and IaC—so humans focus on decisions, not clicks.

Observability First


Metrics + logs + alerting before scale. If we can’t see it, we can’t own it.

SLO-driven Reliability


Secure by Default


Projects (the real work)

Short, outcome-focused, and production-relevant.

  • Modernized a platform handling $4B+ in daily trades, improving runtime stability and performance.
  • Migrated a monolith to 100+ microservices on AWS EKS for scalability and faster releases.
  • Standardized deployments using ArgoCD + Jenkins across environments.

Toolbox

Tools I’ve used in production (not just “familiar with”).

Cloud

  • AWS (EKS, VPC, IAM, ALB/NLB, EC2, S3, CloudWatch)
  • Azure (DevOps, Backup, DR)

Platform

  • Kubernetes (EKS)
  • Docker
  • Helm
  • Ingress Controllers
  • Autoscaling

Delivery

  • Jenkins
  • ArgoCD
  • GitHub Actions
  • GitOps workflows

IaC

  • Terraform
  • Modular stacks
  • State management

Observability

  • Prometheus
  • Grafana
  • Loki
  • Alertmanager
  • SLI/SLOs

Security

  • RBAC/IAM
  • Audit trails
  • SIEM monitoring
  • WORM retention

Let’s build something reliable

If you’re hiring for DevOps/SRE, I’m happy to share deeper architecture + incident learnings.