Thoughts from an Operations Wrangler: how we use alerts to monitor Wavefront

Thoughts from an Operations Wrangler; I lead the production engineering team, running one of the largest SaaS observability platforms on the planet. Wavefront started in 2013 (I joined in 2016) and was acquired by VMware in 2017. These thoughts are all mine.


I find myself more and more talking with Wavefront users – both internally and externally – on How Wavefront uses Wavefront to Wavefront.

Or, how Wavefront’s Production Engineers use Wavefront to run Wavefront.

And, in what I hope becomes a series of posts, I plan to go a bit deeper into how we do what we do.

At our core, as production engineers, Reliability is our product. Alerts are fundamental to that and the starting point.

why alerts?

#BeachOps because sometimes it's better at the beach

We don’t look at dashboards until an alert tells us to. We don’t look at charts or create ad-hoc queries until an alert tells us to.

We generally aspire to sit beachside until an alert says otherwise.

Within Wavefront Operations we have a few truisms. Alerts are:

  • always evolving
  • actionable or informative (more on this later)
  • a barrier between us and the beach (and tacos) whenever they page

An alert is the system telling us to go look at a thing. It’s the system telling us something important is outside of some definition of normal.

The majority of our alerts measure the rate of change of a metric or a change in the slope of a line or some other complicated math & science.

Data Ingester SQS Message Processing Variance Detected

variance(rate(ts(dataingester.sqs.processed, context=* and (tag="*-primary" or tag="*-secondary"))), hosttags, context) > 10
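To make the idea behind that query concrete, here is a minimal Python sketch. It is not how Wavefront evaluates the query, just the same shape of math: a per-second rate for each series, then the variance of those rates within each group, compared against the threshold of 10 from the alert above. The group names, the sample data, the helper functions, and the choice of population variance are all assumptions made for illustration.

```python
from statistics import pvariance


def rate(samples):
    """Per-second rate of change between the last two (timestamp, value) samples."""
    (t0, v0), (t1, v1) = samples[-2], samples[-1]
    return (v1 - v0) / (t1 - t0)


def group_variance(series_by_group):
    """Variance of the per-series rates inside each group (population variance assumed)."""
    return {
        group: pvariance([rate(samples) for samples in series])
        for group, series in series_by_group.items()
    }


def firing_groups(series_by_group, threshold=10):
    """Groups whose rate variance exceeds the threshold."""
    variances = group_variance(series_by_group)
    return sorted(group for group, value in variances.items() if value > threshold)


# Hypothetical data: two ingesters per group, counters sampled 60 seconds apart.
series = {
    "cluster-a-primary": [
        [(0, 0), (60, 600)],  # 10 messages/s
        [(0, 0), (60, 660)],  # 11 messages/s
    ],
    "cluster-b-primary": [
        [(0, 0), (60, 600)],  # 10 messages/s
        [(0, 0), (60, 60)],   # 1 message/s, falling behind
    ],
}

print(firing_groups(series))
```

In this made-up data only the second group would fire: one ingester has fallen well behind its peer, so the spread of rates is large even though neither rate is alarming on its own.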

the evolution of an alert

Alerts start as

  • a chart, exploring the data or patterns
  • an alert where we test our hypothesis using Wavefront’s back testing
  • an experimental alert, tagged with an alert tag path “experimental”, with an alert destination to anywhere but PagerDuty. It’s here where we refine the alert – it should not contribute to alert fatigue.

Eventually, we have a “production push” and the alert is live.

But we aren’t done until we have the alert automagically fixed through an integration with something like Jenkins or Stackstorm.
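The experimental stage can be pictured as a routing rule. The sketch below is only an illustration of the policy, not Wavefront configuration: the target names, the function, and the dotted tag-path convention are assumptions.

```python
PAGING_TARGET = "pagerduty"  # hypothetical name for the paging destination


def destinations(alert_tags, targets):
    """Return where an alert may notify; experimental alerts never page."""
    is_experimental = any(
        tag == "experimental" or tag.startswith("experimental.")
        for tag in alert_tags
    )
    if is_experimental:
        return [target for target in targets if target != PAGING_TARGET]
    return list(targets)


print(destinations(["experimental.ingester"], ["pagerduty", "slack"]))
print(destinations(["production"], ["pagerduty", "slack"]))
```

The point of the rule is that an alert still being tuned can be as noisy as it needs to be without waking anyone.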

why get up when Stackstorm can handle this?

when an alert triggers there must be an action

We try to frame things as the “2am problem” – what am I willing to wake up for at 2am? There are many things for which I will and many more that I won’t.

When an alert fires it must have some action. And generally, we obsess about refining each alert so that when it fires, there is little to no debugging. Because alerts & queries in Wavefront can be so precise, we evolve each alert until it represents a singular action.

Singular actions → computer code → an alert that triggers a webhook → leave me alone, I’m in bed sleeping.
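Here is a rough sketch of that chain in Python. Everything in it is hypothetical: the payload fields (`alert_name`, `hosts`), the remediation, and the fallback are placeholders, not Wavefront's webhook format or what we actually run. The shape is what matters: one alert maps to exactly one action, and anything unmapped goes to a human.

```python
import json


def restart_ingester(host):
    """Placeholder remediation; a real one would call Jenkins, Stackstorm, etc."""
    return f"restart dataingester on {host}"


# One alert name maps to exactly one action.
ACTIONS = {
    "Data Ingester SQS Message Processing Variance Detected": restart_ingester,
}


def handle_webhook(body):
    """Turn a webhook body (JSON text) into the list of actions to take."""
    payload = json.loads(body)
    action = ACTIONS.get(payload["alert_name"])
    if action is None:
        return ["page a human"]  # no singular action yet, so someone has to look
    return [action(host) for host in payload["hosts"]]


example = json.dumps({
    "alert_name": "Data Ingester SQS Message Processing Variance Detected",
    "hosts": ["ingester-01"],
})
print(handle_webhook(example))
```

If an alert cannot be reduced to an entry in a table like `ACTIONS`, that is usually a sign it needs more refining before it deserves a 2am page.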

alerts are actionable… sometimes

Not all alerts page out. When we do get a page we want to assimilate as much information as possible. As quickly as possible.

What else is going on in the system?

contextual alerts

We use alerts to do that too. Alerts should also be informative. We label them as INFO or SMOKE but they help bring context. And since alerts are overlaid on Wavefront charts, we get even richer context.

#BeachOps

Ultimately we want to be at the beach. And an alert that fires is an alert that keeps us away from the beach.

Everyone talks about “single pane of glass” but we use Wavefront as our First Pane of Glass, consolidating disparate metrics sources into single charts and into single alerts. You might call this full stack alerting — we call it #BeachOps.

We leverage Wavefront’s analytics engine and query language to build alerts that are actionable by a computer. Or by a human where we use Wavefront to provide as much context as possible. We also constantly evolve alerts. Taken together, these have helped prevent alert fatigue and keep the team size small while the infrastructure has grown by 400%.

And since Wavefront alerts can trigger actions in automation tooling like Jenkins or Stackstorm, we can spend our time at the beach. With tacos.


15

Dear Amelia,

We’ve been married for 15 years today.

I thought about finding a mariachi band to celebrate. I thought about looking at homes and having the realtor tell me I have commitment issues. I thought about going to an orchid show with you.

I even thought about arguing over which dishes to buy.

But none of that felt right. Instead, I wrote this.


15 years ago we started on a path that has led us through three cities, four homes, five cars, two children, one dog (!) and more memories than I can fit in a Facebook post.



15 years ago – in front of 400 people – I shared this story:

In 1848, gold was found at Sutter’s Mill. By 1849, people were flocking to California in search of treasure. “Eureka!” they shouted when they found their treasure.

In 1996 I came to California in search of my treasure. Three years later I met you. 

Eureka!


Man, it’s a hot one
Like seven inches from the midday sun
Well, I hear you whisper and the words melt everyone
But you stay so cool

My muñequita,
My Spanish Harlem Mona Lisa
You’re my reason for reason
The step in my groove, yeah.

And if you said, “This life ain’t good enough.”
I would give my world to lift you up
I could change my life to better suit your mood
Because you’re so smooth

~ “Smooth”, Santana feat. Rob Thomas


15 years, 5 months and 9 days ago we decided we wanted a future together. A future that might be filled with ups, might be filled with downs, might be more of one than the other. But we wanted to dream our dream and do it together.

And 15 years later we continue to stare up into the stars, hold hands, and dream of our future together.

on authenticity


Non-authentic is a virus in anything you do in life.  Non-authentic is not benign. It metastasizes like a tumor. ~ @cookflix


Upside down. Authentic.

When I first started people managing, I was given some simple advice: always find one personal thing from each person’s life and remember to ask them about it every week.

I’ve always remembered that.  I don’t recall how much I put it into practice but as I’m learning how to lean into my strengths, I keep coming back to that.

I’m a little embarrassed but it was probably a forced exercise back then. I was new to the world of people managing and near clueless on coaching.

Contrast that to today where I still reach back out to those I used to work with – my Tribe. Except that it isn’t forced.

Authenticity builds trust.  Authenticity builds connections.


au·then·tic [aw-then-tik]
adjective: not false or copied;


True leaders come from a place of authenticity. They may look like a heretic or a crazy dancing guy but they are always coming from an Authentic Place.

As I delve into leadership – and really coaching those I work with – I try to always come from that place. It’s easy to say this is part of my brand. I’m not sure it always was but my time at Mozilla helped me focus on being as open and transparent as possible.


trust [trəst]
noun: assured reliance on character, ability, strength; one in which confidence is placed


And then you have this trust thing.  And here’s the deal. If you come from an Authentic Place, you generate trust.

Kate Stull (@katestull) says it best in her blog post “Death to top-down leadership models”,

“Teams are no longer content to accept the overarching pronouncements from a shadowy boss figure that they never see, let alone speak to. Instead, people want to be led by someone they know. Someone they trust.”

There’s that word. Trust. I can’t get to that place of trust without being authentic. I can’t build a connection with you if I’m not authentic.

Otherwise you’re just a virus.


the grapefruit (pound cake ed.)

Citrus × paradisi


The Grapefruit is one of my favorite fruits, second only to The Strawberry and tied with The Gelato (which, for the sake of this argument, I’ll consider a fruit). The Grapefruit also has the distinction of holding the title of My Favorite French Word: Pamplemousse.
Le Pamplemousse
Perhaps it’s those memories of The Grapefruit for breakfast ~ a gorgeous half Pamplemousse sprinkled with sugar (and “sprinkled” is likely an understatement), its flesh liberated with that funky, curved, double-edged serrated Grapefruit knife.

Or perhaps it’s the bitter, tangy, sweet taste of The Grapefruit Juice.

Or perhaps it’s my version of an Arnold Palmer with The Grapefruit instead of lemonade (which I guess is called a Leland Palmer?).

Whatever the reason, my heart fluttered and skipped a beat when I saw Deb Perelman’s Grapefruit Olive Oil Pound Cake recipe in her book.

“Must make”, I thought to myself.


grapefruit olive oil pound cake with grapefruit glaze

The recipe I used came from Deb’s “The Smitten Kitchen Cookbook” (p 241) [Note: I swapped out the buttermilk for my wife’s homemade yogurt].

Recipe:

You should get the book, of course, or use this version of the recipe from her website.


Expect the unexpected

Embrace the unexpected.  It might be the best experience ever.

Back in December I was at a leadership retreat. I had the chance to stand in front of a group of people and practice storytelling. Somewhere in the middle of telling that story, the ending changed.

Expect the unexpected.
This was the story of my 2011 trip to Budapest when I lost my passport.

I took my first solo trip to Europe and made my way to Budapest with a friend over the weekend to see Parov Stelar at Balaton Sound. Excellent concert, marred only by the fact that the night before I had lost my passport. On a Friday night, and the US Embassy didn’t open until Monday morning. My flight was Sunday morning.

The story was supposed to be about how utterly horrible it is to lose your passport. About what a major inconvenience it was. About how I hated Budapest.

But the story changed.

I had all of Sunday to myself and if you’ve never been to Budapest in July I’ll tell you it’s one of, if not the, hottest & most humid places on earth. I spent all day Sunday, miserable, hopping from one air conditioned coffee shop to another. Something to drink & free wifi.

As I shared my story I remembered that every coffee shop I visited played music.

Music tells a story
As I wrote back in March, at some point I learned that life without music is a waste. It often bookmarks points in time. It was the soundtrack to family vacations.

As I wandered around Budapest on that Sunday, I built a playlist of the music I heard in random coffee shops – a soundtrack of my Sunday in Budapest. Some I had heard before, some was new to me.

My story changed
As I tried to wrap up my story and how terrible it was to lose a passport, as I told my story of walking around picking up new music, suddenly the end of my story changed.

The playlist of music I collected that Sunday is still on my phone. I listen to that playlist and it takes me back – instantly – to that Sunday.

My storytelling ended with me realizing that that one day in Budapest on my own sits firmly in the list of top life events. Despite the heat. Despite the inconvenience.

Expect the unexpected.

happy birthday son

Son,

A decade ago you entered my life. A decade ago you fundamentally changed me. I thank you for that.

A decade ago you turned one horrific date – September 11 – into a celebration. I thank you for that.

You make me realize every day that I am a father first and everything else is secondary. I thank you for that.

I try my best to be the best for you. Sometimes I’m sure I falter but you appear more forgiving than I am. I thank you for that.

You’ve allowed me to share all my passions with you – biking, computers, baseball, movies. I thank you for that.

I see the way your mind works, the way you create and invent and I stare in awe. Utter awe, even if it means I have to clean up a seemingly endless pile of Legos. And even then, I thank you.

I was so excited to see you that, on the morning you were born, I raced home from the hospital to shave! I wanted to look my very best for my boy.

In fact, I was so excited to see you that shortly after you were born at 5:11p I wrote this short story to add to your website:


Mommy started having contractions Wednesday night around 5pm. An hour later Daddy took us to the hospital and Mommy got comfortable for a long night ahead of her. Nurse Wendy said it was a full moon! Grandpa and Grandma Toosky stopped for a short visit.

On Thursday, after a lot of hard work (and several visits from my new family), I was born! Daddy helped cut my umbilical cord while Nurse Sue checked me over. Daddy says I was very well behaved and very alert – I didn’t cry at all until Nurse Sue gave me a quick bath. But I really liked when she washed my hair! Daddy said I had to get all fancy for Mommy.

At night, I slept, Mommy tried to sleep and Daddy kept watch over his new family.

In the last decade I’ve seen you crawl. I’ve seen you fall. I’ve seen you try. I’ve seen you cry, smile, laugh and bring joy to those around you. I’ve seen you take your first steps (on the kitchen island!), your first train, your first plane, your first everything.

I’ve woken in the middle of the night to comfort you and not thought twice about it. And you’ve rewarded me by reading to me, by holding me, by telling me “Dad, you made me happy today.”

Son: You make me happy every single day. And for that I say thank you and wish you a happy 10th.

~ Dad

on starting @ lookout


Those who don’t follow me on Facebook or LinkedIn probably don’t know this.

I have a new job.

That’s right. I’m joining the team at Lookout on Monday, September 9.

I can’t be more excited!

Getting ready

For the past two weeks I’ve been reading books and putting thought into my “First 100 Days” (and learned to blame FDR for this 100 day notion).

I’ve met with my mentors (you all have mentors, right?), with my friends, and importantly, with my new co-workers. I’ve had breakfast with some, lunch with others and a baseball game with yet others. (My wife’s helped too!)

(I’ve also spent a little time learning chef & OpenStack.)

What I have right now is a very rough agenda. Not really a 100-day plan but an outline, knowing that the first month or so will be spent immersed in learning. Learning a new culture and a new dialect with its own set of acronyms and cadence.

But.

I. can. not. wait. to. start.

Why Lookout?

In “You’re in Charge–Now What?: The 8 Point Plan”, Neff & Citrin suggest that as people begin to size me up and figure out who I am, I should find a way to share the reasons I took this job. So,

I started talking to Lookout nearly 5 weeks ago over a cup of coffee and throughout the interview process, I was struck by this common sense of passion. Struck by a sense of passion and excitement that started from that very first cup of coffee.

Struck that every interview felt like a conversation and not merely an interview.

Lookout, Fort Knox

Struck with the complexity of the technology and the essential question – how do you secure mobile devices? – while knowing that the next billion users won’t be desktop users but mobile, knowing that users have content on mobile devices that define their life, content that would be better stored in Fort Knox than in someone else’s hands.

And somewhere in the middle I distinctly recall knowing that these are people I want to work with.

What am I going to be doing?

At my core, I’m an Operations guy. Always have been. I love building systems that scale. Systems that break but don’t wake me up because they broke.

I also love building and helping lead excellent teams. I love seeing people do amazing things.

I pour my heart and mind into figuring out how to scale infrastructure (and teams!) to help grow to hundreds of millions (make that billions) of users.

So I join Lookout as Director of Operations because these are the challenges that make me tick.