This is Fine! with Colette Alexander and Clint Byrum

What’s happening in the world of SRE and Resilience Engineering? Join us as we catch up with fellow podcast hosts Colette Alexander and Clint Byrum of the “This Is Fine!” podcast live at SREcon in Seattle.

[00:00:00.00] SPEAKER 1: Tele-- Telebot.

[00:00:00.99] [THEME MUSIC]

[00:00:03.43] You missed a page from Telebot.

[00:00:05.88] SPEAKER 2: Welcome to season 6 of the Prodcast, Google's podcast about site-reliability engineering and production software. Earlier this year, The Prodcast team recorded short sessions with participants at SREcon Americas 2026. Here’s what we learned talking with people at the conference …

[00:00:31.02] SPEAKER 1: You missed a page from Telebot.

[00:00:33.10] STEVE MCGHEE: All right, so we're live in Seattle. This is the Prodcast. I'm a podcast host. You're a podcast host. You're a podcast host. Don't we have any guests? What are we doing here?

[00:00:43.14] CLINT BYRUM: Everyone's a podcast host.

[00:00:44.58] STEVE MCGHEE: Oh, this is just what we do. Is this what podcasters do? This is so obnoxious. What have we done?

[00:00:50.64] CLINT BYRUM: Well, there weren't enough strangers to talk to. So we have to talk to each other.

[00:00:53.50] STEVE MCGHEE: We're warming up the podcast. That's what it is. We have to make sure the mics work and things like that. Well, welcome. I'm Steve. Who are you two fine folk? Can you introduce yourself?

[00:01:01.54] CLINT BYRUM: I'm Clint Byrum, and we had you on our podcast.

[00:01:04.65] STEVE MCGHEE: That's true. That was like over a year ago?

[00:01:07.11] CLINT BYRUM: A while ago, yeah.

[00:01:07.55] COLETTE ALEXANDER: Way too long ago.

[00:01:08.69] STEVE MCGHEE: Way too long ago?

[00:01:09.87] CLINT BYRUM: Yeah, which is called This is Fine!

[00:01:11.19] STEVE MCGHEE: That's right.

[00:01:11.93] CLINT BYRUM: A podcast about resilience engineering. And I'm also an SRE, and I am enjoying the SRE therapy session that we have every year around this time.

[00:01:22.71] STEVE MCGHEE: It is a bit of a therapy session. I appreciate that for sure. And you?

[00:01:26.77] COLETTE ALEXANDER: I'm Colette Alexander. I also co-host This is Fine!, a podcast about resilience engineering. And yeah, I'm the president of the Resilience in Software Foundation, too. We're having a big party tonight-- very excited.

[00:01:40.91] CLINT BYRUM: We are.

[00:01:41.23] COLETTE ALEXANDER: Yeah.

[00:01:41.57] STEVE MCGHEE: Can you tell our fair listeners, what is this foundation, and what does it even?

[00:01:47.41] COLETTE ALEXANDER: What does it even?

[00:01:48.51] [LAUGHTER]

[00:01:49.21] Well, so we're a bunch of nerds that hang out and talk about resilience engineering in the software domain, mostly, although there are some folks who are not of the software domain who grace us with their-- really, an awesome presence.

[00:02:03.96] CLINT BYRUM: It's pretty cool.

[00:02:04.66] COLETTE ALEXANDER: Yeah, I love that part about. So we have a Slack, and you can go to resilienceinsoftware.org to become a member. And becoming a member gets you access to that Slack and free access to all of our events. We also do events for the public, too.

[00:02:18.28] So if you have a little money to spare, just to cover our Zoom costs, you can come to-- we have a FRAM training coming up next month, for example. But we do all kinds of things. We also do-- when Clint and I have episodes that are like-- we do like paper-talk episodes, we often do like AMAs for the community and stuff there.

[00:02:35.92] STEVE MCGHEE: Yeah, one thing that I've found about the community is I get a lot of references to papers, and I read them, and then I go, what? So the paper talks really helped me a lot.

[00:02:44.80] COLETTE ALEXANDER: Oh, I'm so glad.

[00:02:45.62] STEVE MCGHEE: I did go to grad school. And I read papers on purpose for a while, but that was years-- at least four years ago, maybe more.

[00:02:53.14] COLETTE ALEXANDER: I had so much cruft to brush off my brain--

[00:02:56.06] STEVE MCGHEE: It's tough.

[00:02:56.26] COLETTE ALEXANDER: --when I went back to London. It's a lot of work to do that.

[00:02:59.08] STEVE MCGHEE: So having buddies to talk about it with is very helpful.

[00:03:02.72] COLETTE ALEXANDER: Absolutely.

[00:03:03.82] CLINT BYRUM: Does the Resilience in Software Foundation have merch, Colette?

[00:03:08.07] COLETTE ALEXANDER: Why, yes, we do. I'm so glad you asked. If you go to resilienceinsoftware.org, you can also see a shop. And you can get things like this hoodie, the Anti Complexity Complexity Club.

[00:03:20.09] STEVE MCGHEE: So we don't like complexity. Is that what you're saying?

[00:03:22.45] COLETTE ALEXANDER: But you can only fight complexity with complexity due to the law of requisite variety. Sorry, that's like super cray.

[00:03:29.63] STEVE MCGHEE: That sounds like a paper-content thing, the opinion--

[00:03:31.35] CLINT BYRUM: Is there a paper for that one, actually?

[00:03:34.07] COLETTE ALEXANDER: There must-- Dr. Woods must have written it, David Woods. Right?

[00:03:38.27] CLINT BYRUM: He's the master of complexity, for sure.

[00:03:40.29] STEVE MCGHEE: I'd just like to point out that a monorail just went by the window just now.

[00:03:43.81] CLINT BYRUM: Monorail?

[00:03:43.99] STEVE MCGHEE: I don't know if you guys saw it. We could sing the song if you'd like.

[00:03:46.65] CLINT BYRUM: It glides as softly as a cloud.

[00:03:48.63] STEVE MCGHEE: That's right. That's right.

[00:03:49.97] COLETTE ALEXANDER: [LAUGHS]

[00:03:51.55] STEVE MCGHEE: So can you give me a couple top moments so far of the conf? Like, anything hit real good? What do you think?

[00:03:58.87] COLETTE ALEXANDER: So I will be-- can I-- you can cut this in the edit. It's fine if you're--

[00:04:04.03] STEVE MCGHEE: It can be up. It can be down.

[00:04:05.24] COLETTE ALEXANDER: Yeah, I was really-- you saw my reaction. I was really upset at the return of MTTR. What the heck, man?

[00:04:12.90] STEVE MCGHEE: It happens.

[00:04:13.00] COLETTE ALEXANDER: I thought we got rid of it, and it came back.

[00:04:15.42] STEVE MCGHEE: Words are sticky. I don't know.

[00:04:16.34] COLETTE ALEXANDER: They really are. I would like to move to-- I mean, I think here's the thing, right? It's like I want us to have real conversations about it. I don't want us to just have lectures about it from the podium. Like, let's talk about what's working, what's not working, but I was a little-- that upset me a little bit.

[00:04:32.12] CLINT BYRUM: Yeah, I thought we had contained that to the expo hall.

[00:04:34.52] COLETTE ALEXANDER: [LAUGHS]

[00:04:36.48] CLINT BYRUM: Like, I expected over there. I understand those things take time, but to have a keynote, not just bring it out, whatever. Some people are still learning how not to say it, but, yeah, it's sort of been a deconstruction process for me. Dr. Forsgren’s work was like seminal in bringing me into a modern--

[00:04:53.24] STEVE MCGHEE: Absolutely.

[00:04:53.64] CLINT BYRUM: --history perspective. I think a lot of people would say Accelerate was important and still is important. But that part, just like all good science, some of it falls to the wayside as new research is done, like Stepan Davidovic killed it, and yet the zombie rises again.

[00:05:13.10] STEVE MCGHEE: Yeah. I mean, I think we have to-- we appreciate, I appreciate-- I believe you also appreciate-- the fact that people still do say it, and they still do make decisions based on it, even though it doesn't make any sense. So presenting it, in some cases, to that audience might be helpful in some way, as long as it's also like-- if we can then talk about it afterwards.

[00:05:34.58] COLETTE ALEXANDER: Also, let's have a conversation about it.

[00:05:36.24] STEVE MCGHEE: Yeah, exactly. So I get why it's still happening, but I would really love it if it was like, we're talking about this. By the way, a better way to think about this is blah, blah, blah, blah, blah. So briefly, if you can do it for our audience again, why not MTTR? Can you? We could also go get all these people out in the world who will come out and yell, but I trust you guys to not be yelling about this and be--

[00:05:59.72] CLINT BYRUM: Yeah, and I'm--

[00:06:00.40] [INTERPOSING VOICES]

[00:06:00.56] CLINT BYRUM: --definitely of the, like, I need something to move my house onto off the sandy beach that I didn't realize it was sandy at first. I did see there was a talk on CAST, which is an interesting adjustment, more attuned to replacing root-cause analysis. I didn't know much about CAST before. I've now got some books to read. So that's an exciting-- like, OK, here's a new model.

[00:06:22.83] But for MTTR, the thing that I've been like sending to executives and sending to engineers is like, no, it's dead. Like, there's a bunch of hyperlinks and documents I write that are like-- and the times in incidents are not related to each other enough to statistically matter. And I link to Stepan Davidovic's very free paper where it explains the math. And he did these Monte Carlo simulations that-- I think if anybody goes and reads those and has the time, which I know many executives do not-- if you have the time to think about it, it presents itself very clearly. You can't move it enough to make it matter statistically.
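
The argument Clint describes can be illustrated with a small simulation. What follows is a minimal sketch in Python, not the method from the paper he mentions. It assumes incident durations follow a log-normal distribution (the parameters MU and SIGMA are invented for illustration), draws two periods of incidents from that same distribution, and counts how often MTTR appears to improve by at least 10% purely by chance. The function name and all default values are hypothetical.

```python
import random
import statistics

# Assumed (hypothetical) log-normal parameters for incident duration in minutes.
# These are illustrative values, not taken from any real incident data.
MU = 4.0      # median duration of roughly exp(4), about 55 minutes
SIGMA = 1.2   # heavy right tail: a few incidents last far longer than most


def simulate_false_improvement_rate(n_incidents=50, trials=2000,
                                    threshold=0.10, seed=0):
    """Return the fraction of trials in which MTTR looks at least
    `threshold` better in the second period, even though both periods
    are drawn from the SAME duration distribution."""
    rng = random.Random(seed)
    apparent_wins = 0
    for _ in range(trials):
        before = [rng.lognormvariate(MU, SIGMA) for _ in range(n_incidents)]
        after = [rng.lognormvariate(MU, SIGMA) for _ in range(n_incidents)]
        mttr_before = statistics.fmean(before)
        mttr_after = statistics.fmean(after)
        # Nothing changed between periods, so any "improvement" is noise.
        if mttr_after <= (1 - threshold) * mttr_before:
            apparent_wins += 1
    return apparent_wins / trials


if __name__ == "__main__":
    rate = simulate_false_improvement_rate()
    print(f"Trials showing a >=10% MTTR 'improvement' by chance: {rate:.1%}")
```

Because both periods come from the same distribution, any apparent improvement here is noise. The rate at which it shows up gives a feel for how hard it is to tell a real change in MTTR from luck at this sample size.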

[00:07:00.61] COLETTE ALEXANDER: I also want to say-- and this is how I explained it to somebody when I was walking in the hallway with them. And I just said, look, there is an intuitive thing here too that you can grab on to, which is you've had an incident that's lasted three hours, three days, three weeks that might not have impacted your business very much. And you've had an incident that's lasted 10 minutes, I bet, that has-- (WHISPERING) --really sucked for your business.

[00:07:24.62] (SPEAKING) And so the other piece that MTTR leaves out is what is the business impact? What are the other elements, besides time, of an incident that actually do materially matter to you? And is chasing the time bunny down the rabbit hole really how you want to spend your time when you're constructing all of the tooling, all of the automation, all of the whatevers you're going to do to try to lessen that time? Is that really how you want to do it?

[00:07:54.14] STEVE MCGHEE: Yeah. I mean, this is another nail in the coffin maybe. We'll hope. I don't know. Do you have a final point?

[00:08:00.14] CLINT BYRUM: Very briefly. I also had discussions where similarly to your talk about DR and the lies we tell. Similarly, we have to go and say these things publicly, and we need to invite people to come and talk to us.

[00:08:12.61] STEVE MCGHEE: Even if they're uncomfortable.

[00:08:13.37] CLINT BYRUM: So we're inviting you to come and talk to us if you see us at the next SREcon or at thisisfinepod.com. You could submit a question about this. Why you still maybe--

[00:08:22.03] STEVE MCGHEE: Good pitch.

[00:08:22.35] CLINT BYRUM: --believe in MTTR.

[00:08:23.75] STEVE MCGHEE: Cool. Well, thank you very much, my fellow podcasting friends. That was great.

[00:08:27.57] COLETTE ALEXANDER: Thanks, Prodcast. We love you.

[00:08:29.05] STEVE MCGHEE: I can't believe you let me wear this hat this whole time. This is very silly.

[00:08:31.69] CLINT BYRUM: [LAUGHS] OK, thanks, Steve.

[00:08:34.35] SPEAKER 2: You've been listening to the Prodcast, Google's podcast on site-reliability engineering. Visit us on the web at sre.google, where you can find books, papers, workshops, videos, and more about SRE. This season is brought to you by our hosts Jordan Greenberg, Steve McGhee, Florian Rathgeber, and Matt Siegler, with contributions from many SREs behind the scenes. The podcast is produced by Paul Guglielmino and Salim Virgi. The Prodcast theme is Telebot by Javi Beltran and Jordan Greenberg.

[00:09:08.55] SPEAKER 1: You missed a page from Telebot.