- Development servers are also a pain in the ass. We have dev, sandbox, and production... and I'm not sure I trust any of them to be aligned.
- Maybe a bit of separation anxiety is good. Absence and growing fonder stuff....
As Twain said: "There isn't time, so brief is life, for bickerings, apologies, heartburnings, callings to account. There is only time for loving, and but an instant, so to speak, for that."
Someone suggested a postmortem report on the last set of incidents. That's a great idea. I'll be doing a blog post soon. Maybe I can weave all of the details into some grand philosophical treatise. Love me a grand philosophical treatise.
But, in the meantime, I've given this a lot of thought and these things seem true:
1. Doing things in an orderly, methodical fashion is a lot easier when you're dealing with proven, documented systems & tools rather than inventing everything as you go.
2. Some aspects of my personal psychology really get in our way.
Here's an example: the normal workflow when developing or updating software is to configure and test everything in a 'sandbox' or development environment, and to only deploy to production (public-facing) servers when you're sure everything is working correctly.
That requires separate servers just for testing and development. In the early days of RP luxuries like that were out of the question. I felt lucky to have even one dedicated database server, and buying or leasing another one would take money I'd rather spend on expanding our stream relay network. So much sexier, y'know.
So I developed what they call a "cowboy coding" approach: trying out new features and fixing bugs on the same servers — and database data — that you depend on to deliver our streams, apps, & website. Frantically typing raw SQL commands and writing nasty BASH scripts as everything melts down because of a semicolon. A FREAKIN' SEMICOLON! The stress of moments like that — so many moments like that — has probably taken a year or ten off my lifespan. Especially when it comes to semicolons. The least useful bit of punctuation in the mongrel tongue being force-fed to the entire world, but the whole web would grind to a halt without 'em.
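If you're wondering how one stray character can wreck a live server: here's a sketch, with made-up step names standing in for real migration commands, of why a script without an abort-on-error flag keeps going after something breaks.

```shell
# Hypothetical stand-in for a live-server migration script; the steps
# just echo, but the failure pattern is the real one.

run_unsafe() {
  # plain sh keeps going after a broken command...
  sh -c '
    ech "step 1: alter table"          # typo ("ech") fails here
    echo "step 2: rewrite live rows"   # ...and this runs anyway, on prod
  '
}

run_safe() {
  # sh -e aborts at the first failure, before step 2 can do damage
  sh -e -c '
    ech "step 1: alter table"
    echo "step 2: rewrite live rows"
  '
}

run_unsafe 2>/dev/null                                  # prints step 2
run_safe 2>/dev/null || echo "aborted before step 2"    # prints the fallback
```

On a dev box, "step 2 ran anyway" is a shrug; on the production database it's the meltdown described above.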
The hopefully-final straw is that, for a variety of reasons, this time around I was more aware than ever of how difficult my working style makes things for others. Cowboy coders are generally a solo act, and RP is much more of a rock band at this point. When a band is in the groove, getting people off their asses and onto the floor, the last thing they need is for the keyboard player to announce between songs that he'd just converted to the Church of A432, and that they needed to stop, ignore their audience, and retune all of their instruments to match.
The audience, all jacked and sweaty just moments ago, is left standing around, bored and disappointed — and the keyboard player's assurances that the disruption is all in service of a glorious greater good don't really help.
So, apologies to all of you — and my bandmates — for the disruptions. We'll be setting up that development server. And I'll be using it.
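The new discipline, reduced to a sketch (the function and environment names here are illustrative, not our actual tooling): nothing touches production until it has passed on the sandbox.

```shell
# Minimal sketch of "sandbox first, production only when green".
# deploy() just echoes; real deployment steps are out of scope here.

deploy() {
  target="$1"
  tests_green="$2"   # "yes" once the sandbox test suite has passed

  if [ "$target" = "production" ] && [ "$tests_green" != "yes" ]; then
    echo "refusing to deploy to production: sandbox tests not green" >&2
    return 1
  fi
  echo "deploying to $target"
}

deploy sandbox no               # fine: the sandbox exists to be broken
deploy production no || true    # blocked until tests pass
deploy production yes           # allowed
```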
3. Something else that I — and several others involved — bring to the table is a kind of irrational optimism that leads us to assume that whatever we're doing will happen smoothly & quickly, without any problems. That leads one to ignore the need to deeply research potential problems in advance, and to treat a superficial understanding of something as good enough.
CarPlay integration in the new iOS app is a good case in point. There were three of us making the key decisions about the app, and none of us had ever actually used CarPlay. We assumed that we knew enough to accurately assess things and make the right choice. Obviously, we were wrong. If we'd taken the time to do our homework we would have known that before we'd disrupted our relationship with thousands of loyal listeners. Like you.
I'm sorry.
Thank you for this detailed description of Cowboy Coding. I remember when you used to have a live-cam in your office so we could see you in action. You are the best at what you do. Sorry about the growing pains but I'm sure it will all be worth it in the long run.
Thanks for your quick response. I listen almost exclusively to cached music on my Samsung Galaxy S20+ 5G running Android 13, RP version 4.11.2.gp. There are no unusual transitions between songs, and normal commercial breaks are fine. I know this happens almost every time William starts to make a comment after a song on the Main Mix. I will pay attention to see if it occurs on the Mellow mix also. There are no long gaps in the music: William starts talking about the song, and a new song just starts ten or fifteen seconds later.
Well, listening to the cached version of the main mix is back to being a delight. William is no longer cut off in the middle of a story. Thanks so much; you are all appreciated!
Thank you so much for what you do! I thank my lucky stars I found your little cowboy project all those years ago; my life has been so much better in so many ways that are completely incalculable.
We love you! You're a real star. As in a Marvel Universe!
You've made sooo many lives more beautiful by what you've been doing!
Thank you, thank you, thank you!
Because... life is often short and fleeting, as we know.
I really appreciate you taking the time to write this — truly. It's obvious you care about Radio Paradise, and we don't take that lightly.
First, I want to be clear: there was no plan to "go offline." This wasn't a scheduled maintenance window where we knowingly shut down the stream. It was backend work intended to strengthen the infrastructure, and unfortunately it triggered unexpected failures. That's on us — but it wasn't a deliberate decision to allow dead air, and there was no way to plan for issues we had no idea the work would trigger.
You're absolutely right that dead air is painful. William knows that better than anyone. We all do. The stream itself is sacred to us. The moment something goes down, it becomes all-hands-on-deck until it's restored.
One thing that might not be visible from the outside is how lean we actually are. While there are people who contribute in various roles, our core technical infrastructure is maintained by essentially two engineers, and one of those engineers is also the boss and main DJ. They are supporting a global, 24/7 streaming service that operates across web, mobile apps, CarPlay, Android Auto, smart speakers, and multiple audio formats — all without corporate backing, venture capital, or a broadcast network behind us.
We really are a mom-and-pop operation serving a massive worldwide audience. That's part of the beauty — and part of the fragility.
The house analogy wasn't meant to imply neglect or known instability. It was meant to acknowledge that after 26 years of continuous evolution, systems accumulate complexity. Modernizing that complexity is necessary to keep RP viable long-term. Sometimes that means carefully touching foundational pieces. And sometimes, despite testing, something behaves differently in the wild than it does in staging.
Could we build more redundancy? Of course. We are actively working toward greater resilience. But that takes time and resources, and we grow those carefully and deliberately.
There were no ignored red flags. There was no cavalier decision-making. There were long hours of careful work that had unintended consequences — and a team that responded as quickly as possible.
As for revenue impact — yes, outages matter. We're fully aware of that. No one here treats this casually. This station is our livelihood and our life's work.
We're constantly balancing:
• Stability
• Innovation
• Limited staffing
• Financial sustainability
• And a global audience that expects perfection
It's not corporate radio. It's not iHeart. It's not Spotify. It's a small, fiercely committed team trying to keep human-curated radio alive in a very automated world.
Your feedback is heard. And your expectations come from a place of wanting RP to thrive — which we share.
We're not perfect. But we are deeply committed.
And we're still here.
Thanks for the reply. And thanks so much for all the hard work. I really appreciate it; I shudder to think what my life would be like without the stream that you hold sacred. I also hold it sacred.
I'm just asking that the next time something happens, come into the thread and post something, anything. Let us know that you know. Radio silence here about what's going on is not good. It doesn't have to be a long screed, just a quick note: "We know there's an issue, we're working on it, and we'll be back up ASAP." That would have been so much better than just silence from the team.
Knock it off Randy. Alanna very thoughtfully responded to his thoughts and they had a wonderful exchange of information. Then you inject yourself where you are not needed. You are the dead horse. Just go away.
Anal, much? Be happy the music is back and just let it go. No need to beat a dead horse.
Thank you for understanding the spirit in which I wrote my comments, and for responding to the concerns about the outage. I also wish to express regret for my initial characterization of the maintenance as cavalier or careless, and I owe you and William an apology.
It sounds like RP didn't really expect things to go sideways. Shit happens. We all get it. Especially those of us who have full time careers in IT, operations, software development, etc.
In that context, I think it would be terrific and truly enlightening if William and Jarred compiled an "incident post-mortem" report.
There are many examples of post-mortems published by larger institutions like Reddit, Facebook, Microsoft, or Cloudflare.
In brief, they are chronological narratives that begin with a summary of the planned changes and go on to include: highlights of work-in-progress and completed tasks; the moment when the system broke down in an unplanned or unexpected way; what work was done to restore service; how long it took to restore service; and performance & reliability data as the system started to come back online. Incident post-mortems often conclude with a "root cause analysis" that narrows down and explains the exact reason for the unexpected outage.
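Sketched as a skeleton (section names paraphrased from the kinds of reports those companies publish, not a fixed standard), such a write-up might look like:

```
INCIDENT POST-MORTEM: <date>
Summary:      what was planned, what actually happened, total downtime
Timeline:     chronological log (work started, first failure observed,
              mitigation steps, service restored)
Impact:       which streams, apps, and regions were affected, for how long
Recovery:     steps taken to restore service; performance & reliability
              data as systems came back online
Root cause:   the exact reason for the unexpected outage
Action items: changes that prevent a repeat
```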
As always, "Thanks for Listening."