From Code to Architecture: Lessons from Six Years of Shipping

The hardest part of backend engineering isn't writing code. It's deciding what code not to write — and what discipline to add around the code you do.


The hardest thing about backend engineering isn't writing code. It's deciding what code not to write — and what discipline to add around the code you do.

I've spent six and a half years building production systems across telecom, fintech, govtech and edtech. The languages changed. The frameworks changed. The cloud providers changed. Five patterns kept showing up.

1. The boundary between services is a contract, not a suggestion

Every microservice migration I've seen go badly started with the same mistake: treating service boundaries as a way to organize code, not as contracts.

The test is simple. If a downstream team can ship a breaking change without your service knowing about it, you don't have a contract. You have an internal call masquerading as a network call — the worst of both worlds. You pay the latency cost of HTTP and the coupling cost of a shared module.

What it looks like when it's right:

  • Schemas live in a versioned, language-agnostic format (OpenAPI, protobuf)
  • Breaking changes get a /v2, never a same-version mutation
  • Consumers can pin a version; producers can't yank one out from under them
  • A test suite proves you serialize what you say you serialize

This is boring. It's also the reason your platform doesn't catch fire on a Friday.
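
The last item in that list, a test that proves you serialize what you say you serialize, can be sketched in a few lines. This is a minimal illustration, not a real contract-testing setup: the `Order` type, the `ORDER_V1_CONTRACT` table, and both function names are hypothetical, and in practice the contract would be derived from the OpenAPI or protobuf definition rather than written by hand.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical published contract for a v1 order payload: field name -> JSON type.
# Assumption: hand-written here; normally generated from the versioned schema.
ORDER_V1_CONTRACT = {"id": str, "amount_cents": int, "currency": str}


@dataclass
class Order:
    id: str
    amount_cents: int
    currency: str


def serialize_order(order: Order) -> str:
    """Producer-side serializer whose output the contract test pins down."""
    return json.dumps(asdict(order))


def contract_violations(payload: str, contract: dict) -> list[str]:
    """Return human-readable differences between a payload and its contract."""
    body = json.loads(payload)
    problems = []
    for field, expected_type in contract.items():
        if field not in body:
            problems.append(f"missing field: {field}")
        elif type(body[field]) is not expected_type:
            problems.append(f"wrong type for {field}: {type(body[field]).__name__}")
    # Fields the producer emits but never declared are also a contract break.
    for field in sorted(body.keys() - contract.keys()):
        problems.append(f"undeclared field: {field}")
    return problems


# What the producer's test suite would assert on every build.
assert contract_violations(serialize_order(Order("o-1", 1250, "EUR")), ORDER_V1_CONTRACT) == []
```

Run in the producer's CI, a check like this turns a same-version mutation into a failing build instead of a consumer outage.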

2. Idempotency is a feature, not a hope

Networks fail. Retries happen. The question isn't whether your service will see the same request twice. The question is whether the second time hurts.

The cheapest implementation is a request-id header: the caller generates the id, and the server caches the result against it. If the same id arrives twice, the server returns the cached response and skips the side effect. Costs you a Redis key and a hash check. Saves you a duplicate charge or a duplicate row.
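
A rough sketch of that request-id pattern, under some loud assumptions: an in-memory dict stands in for Redis, there is no expiry on stored results, and `IdempotentHandler` and `charge_card` are hypothetical names invented for the illustration.

```python
import threading
from typing import Callable, Dict


class IdempotentHandler:
    """Runs a side effect at most once per request id and replays the stored result."""

    def __init__(self, side_effect: Callable[[dict], dict]):
        self._side_effect = side_effect
        # Assumption: a dict stands in for Redis. A real store would also set a TTL.
        self._results: Dict[str, dict] = {}
        self._lock = threading.Lock()

    def handle(self, request_id: str, payload: dict) -> dict:
        # The lock makes "check, run, store" atomic inside one process, at the cost
        # of serializing requests. With a shared store, the check itself would need
        # to be atomic (for example Redis SET with the NX option).
        with self._lock:
            if request_id in self._results:
                return self._results[request_id]  # duplicate: skip the side effect
            result = self._side_effect(payload)
            self._results[request_id] = result
            return result


charges = []  # records every time the side effect really runs


def charge_card(payload: dict) -> dict:
    charges.append(payload["amount_cents"])
    return {"status": "charged", "charge_no": len(charges)}


handler = IdempotentHandler(charge_card)
first = handler.handle("req-123", {"amount_cents": 500})
retry = handler.handle("req-123", {"amount_cents": 500})  # same id, e.g. a client retry
```

The retry gets the same response as the first call, and `charges` holds a single entry: the card was charged once.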

I've watched teams without this build elaborate compensation flows to undo what the second request did. They never work as well as not doing the work twice in the first place.

3. Retries are a contract too

Retries between services without a budget are how cascading failures start. If Service A makes up to three attempts against B, and B makes up to three attempts against C for each of them, one logical call can produce nine attempts on C, because attempts multiply at every hop. C is already struggling; now it's seeing up to 9× load.

Two rules I won't break:

  • Bound the retry budget end-to-end, not per-hop. Pass a deadline header. Each hop checks it before retrying.
  • Circuit-break aggressively. After N failures in a window, stop calling. Surface a 503 fast. The system recovers faster from a clean failure than from slow degradation.
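
The first rule can be sketched as a retry helper that takes an absolute deadline instead of a fixed retry count. Everything named here is an assumption made for illustration: `call_with_budget`, `BudgetExhausted`, the linear backoff, and the choice to retry only on `ConnectionError`. The deadline is handed to the operation so it can be forwarded to the next hop, for instance in a request header.

```python
import time
from typing import Callable, Optional


class BudgetExhausted(Exception):
    """Raised when the deadline or the attempt limit ends the retry loop."""


def call_with_budget(
    operation: Callable[[float], object],
    deadline: float,
    max_attempts: int = 3,
    backoff_s: float = 0.05,
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
):
    """Retry `operation` until it succeeds, attempts run out, or the deadline passes.

    `deadline` is an absolute timestamp on the `now` clock, shared by every hop.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, max_attempts + 1):
        if now() >= deadline:
            break  # the caller has already given up; do not add load
        try:
            # Pass the deadline along so the downstream hop can apply the same check.
            return operation(deadline)
        except ConnectionError as err:
            last_error = err
        wait = backoff_s * attempt  # simple linear backoff, an arbitrary choice
        if attempt == max_attempts or now() + wait >= deadline:
            break  # no budget left for another attempt
        sleep(wait)
    raise BudgetExhausted("retry budget exhausted") from last_error
```

A call might look like `call_with_budget(fetch_profile, deadline=time.monotonic() + 0.8)`, where `fetch_profile` is a hypothetical function that accepts the deadline and forwards it downstream. Note that a monotonic timestamp only makes sense within one process; across services the header would carry a remaining-time value or a wall-clock deadline instead.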

Spring Cloud Circuit Breaker (with Resilience4j as the backing implementation) does this in a few lines of config. Most Node teams hand-roll something with a global counter and call it a day. Both work; the question is whether you're explicit about the budget or just hoping.
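
For a sense of what the hand-rolled version looks like when the budget is explicit, here is a minimal sketch. It is not how Resilience4j works internally; the class, its parameters, and the half-open behaviour are assumptions chosen to keep the example small, and it is not thread-safe.

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised instead of calling a dependency whose circuit is open."""


class CircuitBreaker:
    """Opens after `max_failures` failures within `window_s` seconds."""

    def __init__(self, max_failures=5, window_s=10.0, cooldown_s=30.0, now=time.monotonic):
        self.max_failures = max_failures
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self._now = now
        self._failures = deque()  # timestamps of recent failures
        self._opened_at = None    # None means the circuit is closed

    def call(self, operation):
        now = self._now()
        if self._opened_at is not None and now - self._opened_at < self.cooldown_s:
            # Fail fast: the caller can map this to a 503 without touching the dependency.
            raise CircuitOpenError("dependency circuit is open")
        # Either closed, or the cooldown has elapsed and this is a trial call.
        try:
            result = operation()
        except Exception:
            self._record_failure(now)
            raise
        if self._opened_at is not None:
            # Trial call succeeded: close the circuit and forget old failures.
            self._opened_at = None
            self._failures.clear()
        return result

    def _record_failure(self, now):
        self._failures.append(now)
        # Drop failures that have aged out of the window.
        while self._failures and now - self._failures[0] > self.window_s:
            self._failures.popleft()
        if self._opened_at is not None or len(self._failures) >= self.max_failures:
            self._opened_at = now  # open, or re-open after a failed trial call
```

The three constructor arguments are the budget: how many failures, over what window, and how long to stay away. Writing them down is the difference between an explicit policy and a global counter.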

4. Observability before incidents, not during

The single biggest predictor of how a system survives an incident is whether the operator can see what's happening right now. Not yesterday's logs. Now.

The minimum viable trio:

  • Structured logs with a request-id propagated through every service touched
  • One latency dashboard with p50 / p95 / p99 per route, refreshed in seconds
  • Error rate per dependency, not just per service — so you know which downstream is the problem before users do
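
Two of those three items fit in a short sketch: JSON log lines that always carry the request id, and an error-rate tally keyed by dependency. The names `request_id_var`, `JsonFormatter`, `record_call`, and `error_rate` are hypothetical, and the in-process counters stand in for whatever metrics backend is actually in use.

```python
import contextvars
import json
import logging
from collections import Counter

# Holds the current request id; set once when a request enters the service.
request_id_var = contextvars.ContextVar("request_id", default="-")


class JsonFormatter(logging.Formatter):
    """Emits one JSON object per log line, always tagged with the request id."""

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "level": record.levelname,
            "message": record.getMessage(),
            "request_id": request_id_var.get(),
            # Optional field, supplied via logger.error(..., extra={"dependency": ...}).
            "dependency": getattr(record, "dependency", None),
        }
        return json.dumps(entry)


# Assumption: plain counters instead of a real metrics client.
calls = Counter()
errors = Counter()


def record_call(dependency: str, ok: bool) -> None:
    """Count every outbound call, and separately every failed one, per dependency."""
    calls[dependency] += 1
    if not ok:
        errors[dependency] += 1


def error_rate(dependency: str) -> float:
    """Fraction of recorded calls to `dependency` that failed (0.0 if none recorded)."""
    total = calls[dependency]
    return errors[dependency] / total if total else 0.0
```

Wiring it up takes a handler with `setFormatter(JsonFormatter())` and a `request_id_var.set(...)` at the edge of each request, using the id from the incoming header. Forwarding that id on outbound calls is what lets one search pull up every service a request touched.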

Add this before you have the incident. The pattern I've seen kill teams: rolling out fancy distributed tracing two weeks after a P0, while the lessons are still in postmortem format.

5. If it doesn't move a metric, don't ship it

Every system I've owned shipped with a number attached — throughput lifts, manual-effort reductions, faster cycle times, integrations going live across enterprise systems.

Not because metrics make engineers feel important. Because the discipline of choosing a metric before you start coding forces you to know what "done" means. Every one of those numbers existed in a spec before the first line of code did. The code was the cheap part. The agreement on the metric was the expensive part.

The corollary: if you can't articulate the metric, you don't understand the work yet. Go back and ask.


Six years compresses to this: code is the easy part. The hard part is the discipline you build around the code — boundaries, idempotency, retries, observability, metrics — that lets the code keep working when the world gets noisy.

Everything else is a tool to serve that.