We’re a small consultancy built for the moment when your infrastructure becomes the thing slowing the company down. We don’t do discovery decks. We don’t do junior bait-and-switch. We don’t auto-renew you. Every engagement is led by a senior engineer who’s shipped production AWS and AI infrastructure for years — the same person you’ll talk to on the first call.
Most consultancies hide their position behind soft language. We’d rather state ours flatly. If any of this disqualifies us, we’d rather you find out now.
Not vanity metrics. These are the actual numbers that describe how we work — the kind your procurement team would ask about anyway. We’d rather just publish them.
Three named senior engineers run every engagement. There’s no bench. There’s no "team" hiding behind a logo. These are the humans you’ll be on calls with from week one to handover.
Started Cloudico in 2022 after a decade running production AWS infrastructure at fintechs and AI-native startups. Migrated three companies off of vendor lock-in; the first one is what convinced him to build a consultancy that does only this.
Lead engineer on every AI Infrastructure engagement. The person who decides whether vLLM vs Triton vs TensorRT-LLM is the right call for your workload — usually within the first 20 minutes of looking at your inference logs.
Joined Cloudico in 2023 after running SRE for a Series C SaaS through the on-call hell that comes with 99.5% → 99.99% uptime targets. The reason we publish median MTTR numbers on the Reliability page — she insisted we be specific or not claim anything.
Owns every Reliability Engineering sprint. Designs the SLOs with your product team, tunes the alerts, runs the fire drills, and facilitates the first three postmortems with your engineering org before handover.
The reason Cloudico has an AI Infrastructure practice at all. Spent four years at an AI-native unicorn building the inference platform that cut their Bedrock dependency in nine months. Joined in 2024 to do the same kind of work for other teams.
Specializes in vLLM, Triton, and the operational details of running LLM inference at production scale. The person who’ll explain to your CFO why your Bedrock bill compounds non-linearly — with the actual graph.
Plus three additional senior engineers on rotation · no contractors, no juniors, no offshoring
Not "our values" — we don’t do values posters. These are the operating principles that decide what we will and won’t do, with the real consequences attached.
No staffing changes mid-engagement. No "our principal architect designed the scope, here’s the senior who’ll execute it." The person on the first call is the person who writes the Terraform.
We cap concurrent engagements at 8 firm-wide. We turn work away rather than break this.
Your problem isn’t our problem to win. If a competitor would do better work for your situation — pricing, geography, specialty, urgency — we name two of them on the call and end the meeting on time.
19 engagements sent out as referrals to date. Most of those teams now refer work back when they’re overbooked.
Every engagement ends with a real handover — recorded walkthroughs, runbooks, and 30 days of post-handover Slack support. After that, your team owns it. We’re not building a managed services annuity here.
Most engagements end cleanly. The teams that come back (66% of clients) do so because they have a new problem — not because they can’t run what we built.
Fixed-scope, fixed-price. Written in the proposal. If we underestimated the work, that’s our cost to absorb — not yours. You pay what was quoted, on the date that was quoted.
We’ve eaten ~$80k in overruns across 47 engagements. That’s the price of taking estimation seriously.
The patterns we use in client work end up in public — Terraform modules, blog posts, talks, GitHub repos. If your team finds our content first and never hires us, that’s a successful outcome too.
14 open-source releases. 23 long-form essays. We get hired in part because our public work shows up before our marketing does.
We don’t hide where we’re based or pretend to be a different city than we are. Lahore is home. The work is global.
Cloudico is built in Lahore, Pakistan — not a remote-friendly tagline, not a "global remote team" euphemism. The senior engineers running your engagement work from the same time zone, with the same operational disciplines we’d apply if we were sitting in Brooklyn or Berlin.
What that means in practice: your calls happen on a normal Western business calendar. Asia-Pacific clients get even better coverage. The four-hour-response promise is calibrated to PKT business hours (UTC+5), and our typical overlap with US Pacific is 2–3 hours every workday.
Open-source releases, long-form essays, conference talks, upstream contributions. The patterns we use in client work end up here. If you find this list useful and never hire us, that’s still a win.
14 open-source releases · 23 long-form essays · 7 conference talks · see all
Most "about" pages give you a mission statement written by committee. This is a letter from Hassan, written once, lightly edited, never sent through marketing review.
The consultancy industry has a credibility problem. You know it. I know it. The teams I’ve worked alongside — on both sides — know it. Discovery calls that are really qualification screens. Junior engineers shipping work that was sold by principals. Sequenced emails from people you’ve never met, asking if the proposal "still makes sense for your priorities." Auto-renewing contracts that nobody remembers signing.
I started Cloudico because I wanted to build a different kind of firm — one where the senior engineer who shows up on the first call is the same one who writes the code three months later. Where pricing is on the website. Where if we miss the timeline, we eat the cost. Where if we’re not the right fit, we’ll tell you who is — and mean it.
This sounds idealistic until you realize the math works. We cap engagements at eight. That’s the whole firm. It means we have to be honest about what we can deliver, because we can’t outrun a bad reputation by churning through new logos. The math forces the discipline.
The clients who’ve hired us — 47 engagements across four years — tell us the thing they remember most is the first call. Not because we did anything clever on it. Because we showed up prepared, gave honest answers, and didn’t try to sell. That’s a low bar in this industry, and clearing it is most of what we do.
If your infrastructure is the thing slowing your company down right now — the AWS bill, the on-call rotation, the inference cost, the migration that’s been "next quarter" for two years — we’d like to look at it with you. Worst case, we’re not the right fit and we send you two referrals. Best case, we ship the work and disappear when you don’t need us anymore.
Either way, no CRM sequences. You have my word.