Friday, December 8, 2017

The Making of an Iceberg or: Saying No as an Infrastructure Engineer

An iceberg

I recently convinced a teammate of mine to give an internal talk on the topic of saying no as an SRE. Both the talk and my teammate were just as fantastic as I expected. As an SRE (and manager) I think knowing how to say no productively is a hugely important yet under-discussed part of our job. So in that light, I thought I’d share some thoughts on how and when infrastructure engineers should say no.

First things first - while I hope everyone can find some value in this post, it’s targeted toward engineers/managers in traditional backend technical roles (e.g. SREs, systems engineers, ops engineers, infra engineers, etc.) whom, for practical purposes, I’ll bucket under the umbrella of infrastructure engineers.

Why we're here


To understand why we might say no, we must first understand our role.

The stereotype of the grumpy, change-averse sysadmin is (unfortunately) still alive and well. In the intra-company political landscape, the people running the infrastructure are often seen as the Party of No. Why is this? From a high level we’re working toward the same goals as everyone else. In fact, infrastructure only exists to enable the business. Whether providing a platform for a website, app, analytics pipeline, MMORPG, streaming video service, or making sure people’s calendars stay in sync, infrastructure’s entire raison d’être is supporting the business. So why the negative rep? I believe it stems from the differing risk models between the people building infrastructure and the people building on top of it and that this divergence arises due to a fundamental difference in how infrastructure engineers add value to the business and how others do.

As infrastructure engineers, our job is to support business operations while minimizing operational risk. It’s a tough thing to balance and, since folks operating in the layers above us tend to focus mostly on the former, it’s easy to start leaning toward the latter. Don’t fall into this trap. At the end of the day it’s the business operations, not the infrastructure, that pay your bills.

So why say no?


You probably already know the answer. As the people with the most familiarity with the capabilities of our infrastructure and the environment it operates in, we are often in the best position to articulate the operational risk of a given task (NOTE: this is not the same as being in the best position to decide whether the company should do a thing). This means it’s part of our job to push back when warranted to keep the business operating within its envelope of tolerable risk. Sometimes that pushback begins with a no.

Saying no can be difficult


Surprisingly, saying no is often harder than you think. One of the biggest risks to saying no, ironically exacerbated by the improved relationship between infrastructure and product forged by the DevOps movement, is good inter-team relationships. Turns out it’s hard to say no to people you like. If you’re in this situation you need to be diligent, as letting things slide can cause lasting harm to you, your team, and the business as a whole.

Another major risk is the consistently bad time estimates that are endemic to the software industry. Engineers frequently overestimate how quickly they can accomplish tasks, increasing the likelihood of saying yes to something they shouldn’t.

Getting to no


We know we should say no when warranted, but how do we know when a no is warranted? The first thing to remember is that every no is a judgement call. Some calls are easier to make than others, but there’s no flowchart that’ll guide you to the right answer for every situation. Recognize that every no must come with a reason and having confidence in your reasoning will require due diligence.

While I won’t attempt to cover the myriad reasons why you might say no to a given request (there’s a reason this is more of an art than a science), I’ve outlined a few points that are useful to keep in mind when tasked with making a decision.

When weighing risk remember bias is heavier than data

Recognize that, due to the nature of your job, your unconscious mind may be more risk averse than you realize. If you think something is too risky, prove it to yourself. Understand the business impact and do the math. Is the failure mode acceptable or will it put you out of business?
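To make "do the math" a bit more concrete, here is a rough sketch of the kind of back-of-the-envelope check I have in mind. Everything in it is an assumption for illustration: the function names, the idea of an annual "risk budget", and all of the numbers are hypothetical, and your business will have its own way of pricing failure.

```python
def expected_annual_loss(incidents_per_year: float, cost_per_incident: float) -> float:
    """Expected yearly cost of a failure mode: how often it happens times what it costs."""
    return incidents_per_year * cost_per_incident


def is_tolerable(expected_loss: float, worst_case_cost: float,
                 annual_risk_budget: float, survivable_loss: float) -> bool:
    """A failure mode is tolerable here only if BOTH hold:
    - its expected yearly cost fits inside the (hypothetical) risk budget, and
    - a single worst-case occurrence would not put the company out of business.
    """
    return expected_loss <= annual_risk_budget and worst_case_cost <= survivable_loss


# Hypothetical numbers: one outage every two years, 40k per outage.
loss = expected_annual_loss(incidents_per_year=0.5, cost_per_incident=40_000)

# Hypothetical thresholds: 50k/year of acceptable risk, 1M is the most we could survive.
print(loss, is_tolerable(loss, worst_case_cost=40_000,
                         annual_risk_budget=50_000, survivable_loss=1_000_000))
```

The point is not the formula, which is deliberately crude, but that writing the numbers down forces you to separate "this feels scary" from "this would actually sink us".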

When priorities aren’t clear, escalate

Sometimes you can’t do a thing because you simply have too much on your plate. If this is the case, consider articulating what work you’d have to give up (and the risk associated with it) to get the thing done and escalate to the appropriate level of the business to get a decision.

Take time to understand the big picture

You may be asked to do something that conflicts with your understanding of what the business cares about. Escalate these issues with the aim of resolving the strategic question. It may be that your understanding of the business is wrong.

Find a path to yes

Think about what it would take for you to say yes, no matter how absurd. Being able to put a cost on what is being asked gives the business a chance to evaluate the request in the language it understands best.

Be your own biggest skeptic

Don’t take your inner monologue at face value. For any non-trivial decision you should validate your argument with someone else (preferably someone outside of the direct decision making process) or you risk playing yourself.

Be thoughtful and be respectful

When you say no, you’re putting your reputation on the line. Carelessness can lead to future decisions being routed around you and impact your career trajectory. Assume everyone involved is trying to do what they think is the right thing for the business. Treat these decisions the same way you want your peers to treat your proposals.

Scope your no


Now that we’ve done our homework and decided a no is warranted, how do we communicate it? Consider your role, as I described it, again: support business operations while minimizing operational risk. You may have noticed that nowhere in that description does it mention blocking business operations.

When you say no, be specific about what you’re saying no to. You’re not saying that you can’t do real time sentiment analysis across all internet comments. You’re saying that it isn’t feasible with your current expertise/architecture/budget/headcount/roadmap/timeline/etc. Remember, unless you’re also the CEO it’s not your job to make business decisions. You’re there to articulate the operational risk and costs involved.

Finally, we get to the title of the post. 90% of an iceberg’s mass is underwater. This is approximately the balance you should aim for when communicating the reason behind your no. All the diligence and work you did was to gain confidence in your decision. Your actual communication should be crisp, to the point, and provide the minimum viable amount of supporting information to make your reasoning clear. Stuffing your entire decision tree into your response will just muddy your point. Besides, if you do get asked for more justification, you’ll have it on hand.

And that is how to make an iceberg.