LEAN Process, “Classic” Agile and TOGAF.
Hello everyone, I hope all is well with you. If you know me, you know I try to break away from the humbug of social media and give as down-to-earth an answer as I can, free of the usual influencer polish.
There is a specific way of writing “influentially”. When I monitor posts on LinkedIn, what I notice the most is the influence of emotion on folks. Just recently, I read a post that used the phrase “fever dream”. There is a lot of emotion in that statement, exploiting both our visual cortex and adrenal system. Folks reading that particular post can feel the feels dripping off of the virtual pen.
How does this relate to LEAN, or Agile, or TOGAF and our post today? What I see as the core issue in any mindset, methodology or framework is the emotion that comes from working through problems cooperatively within that framework, and how that emotion works both for and against us.
I’m taking a TOGAF certification to reacquire some knowledge that has become fuzzy over years of learning frameworks. I’ve worked on enough teams to have a very hybrid, fused and blended approach to project management.
As I have been going through this coursework, I keep thinking to myself: this is not how the real world operates. And that’s ok! Cooperation is hard.
Lessons Learned
What I have learned (sometimes begrudgingly) is that the fluidity of a situation is really the North Star for solving any problem. I learned this most while working in incident response, where it eventually dawned on me that the immediate solve for a critical issue was not the maximal solution, but the optimal tradeoff.
The incident had to do with a test email being sent out for an application I was working on. We were testing emails localized to different languages based on each user’s preference in the app, so we needed data in the database to test with. We were getting very close to launch, and the lines were starting to blur between pre-prod and prod tasks. Because of this, instead of the emails going to a single test mailbox, they went live.
The setup: a lot of departments were working together, and there was a downstream mistake from one of the teams we were working with. An honest mistake, too, since we were expediting a release.
Thankfully the process we had in place and the timing of the launch meant this was the real “Welcome” message. “Hello, welcome to our updated service, we are so happy you are here, the reason why you are here is X, Y, Z.” It was essentially the planned email meant for delivery down the road.
Now - this email went public - and the domain it linked to was not. The DNS didn’t even resolve, so clicking the link went nowhere. Not a good look.
The effect was that the email lit up the service line with a modest number of calls, since the notification went out to about twenty-five thousand recipients - me being one. My copy was prioritized in the queue and I saw it in my Gmail, literally said: “Oh, that’s not right”, and had folks stop the mail spooler from running. Rule one: stop the bleeding first.
The notification also went to a few high-profile people in the agency. And yes, folks were wondering what the heck, with an f-bomb inserted in place of heck.
Now came the fun part: working with folks and responding to the incident. In came the CIO and a lot of big people in the org. And being the principal meant that the buck stopped with me.
I listened and wrote some notes while folks had their justifiable say; I didn’t say a word in my own defense - took my medicine, and then said: “I have a fix for this.”
We talked through some root causes of the problem, and I laid out why the process was not working, but crucially I worked with the CIO very quickly on a response: “I can tell you right now exactly who received this email; let’s reach out to those affected folks only, let them know this is a legitimate email and that we were excited to get them on board, and give them a due date for launch.” Rule two: contain the damage.
And then I said: “Let’s use the older landing page we already have, and set up the domain now.” Rule three: expedite a fix.
My rationale was that we already had a web page from the older system we were replacing, so we could update the language on that page. It was a quick fix to point the domain temporarily at the old page. New language was added, with an honestly positive spin on the changes. Really quick, really dirty, but well put together. The CIO and my director were total gems, and we had this up in a little under an hour. The folks who ran the domain service gave us a temporary redirect to the legacy landing page.
All in all, it took about three hours to solve. We queued another “oops” email with something like, “We were so excited to let you know that we got out over our skis (this is Colorado, y’know)… more awesomeness to come by X date.”
And that was it. Sure, we had feedback. Some folks were mildly ticked off, others appreciated the heads-up, plus the ubiquitous trolling from that one person. But all in all, of the ~25k recipients, about 10% opened the email (that’s ~2,500 people), and about 250 people clicked the link, roughly 1% of the total send (I had to run the stats). From the standpoint of the org, any failure was a big deal, but in the grand scheme of the millions of emails sent out that day, this was a tempest in the smallest teacup one could imagine. It didn’t even blip on social media.
Within 72 hours the public had forgotten, and those who eventually opened the email a few hours or a few days later had better information to look at as we further refined the CMS pages. Also in English and Spanish, so win-win.
Was it a failure? Yes. However, we found the root cause of where the lines of communication broke down in this test we were running, and the quick and dirty solution was a success. So, more like a “successful failure”. That’s life.
I’m not trying to talk this up, but from a testing standpoint, notifications were sent out. Ironically, that passed the test harness criteria in DevOps.
Here’s the hot take: we followed a good procedure that had a Swiss cheese moment. We had a lot of eyes watching these tests, and the org had thought through a lot of what-if scenarios. That was our anchor: a good process.
Of Anchors and Sails
Sometimes it’s just hard to get a blog to the point, and I do like writing, so TL;DR is just not in my nature.
To make sure that an anchor becomes a sail, I ask this question:
Is it ok to make a mistake, and how big can that mistake be?
In my opinion, this is where an honest conversation begins with any organization. And, also in my opinion, that answer is as vast as any ocean. I don’t feel there is a trick to solving for that; however, one thing I have learned is to time-box how long folks debate sailing on the ocean.
At some point, there has to be action that involves a process. Does that process weigh everything down to the point of failure, or is it aiding in a successful voyage?
It all depends.
A lot of formal processes and ceremonies, in my opinion, become a “mooring” anchor, i.e. they keep a ship in port. There is a big “but” coming with this: it is because they are not tailored to the deliverable.
It would be foolish of me to argue against Waterfall, LEAN, Agile or TOGAF - all are good tools when used pragmatically. In a highly regulated environment, TOGAF especially is a needed process for handling the complexities of larger enterprises. In incident response, Waterfall is going to happen, as it may be the only option in a scripted playbook.
Agile makes sense on paper, although I prefer LEAN process mapping and reducing waste, as I find it culturally difficult to get every member of a team using Agile efficiently.
I feel that where many organizations struggle is in misunderstanding the value of scaling a process appropriately.
For example, I cannot tell you how many times I have walked in and evaluated databases that never got out of the starting gate of good form, especially when it comes to simple constraints on the data.
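To make that concrete, here is a minimal sketch of the kind of “simple constraints” I mean, using SQLite through Python’s standard library. The table and column names are hypothetical; the point is only that the database itself enforces the basics, rather than application code hoping for the best.

```python
import sqlite3

# A minimal, hypothetical example of baseline data constraints:
# NOT NULL, UNIQUE, CHECK, and a foreign key, enforced by the database itself.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this enabled explicitly

conn.executescript("""
CREATE TABLE country (
    code TEXT PRIMARY KEY CHECK (length(code) = 2)   -- ISO 3166-1 alpha-2
);

CREATE TABLE subscriber (
    id            INTEGER PRIMARY KEY,
    email         TEXT NOT NULL UNIQUE,
    locale        TEXT NOT NULL DEFAULT 'en'
                  CHECK (locale IN ('en', 'es')),     -- languages we actually send
    country_code  TEXT NOT NULL REFERENCES country(code)
);
""")

conn.execute("INSERT INTO country (code) VALUES ('US')")
conn.execute(
    "INSERT INTO subscriber (email, locale, country_code) VALUES (?, ?, ?)",
    ("test@example.com", "es", "US"),
)

# This row violates the CHECK constraint on locale and is rejected outright.
try:
    conn.execute(
        "INSERT INTO subscriber (email, locale, country_code) VALUES (?, ?, ?)",
        ("oops@example.com", "fr", "US"),
    )
except sqlite3.IntegrityError as err:
    print("rejected:", err)
```

Nothing fancy, and that is the point: bad rows never make it in, so there is nothing to clean up a decade later.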
It’s survivable, but what it tells me is that the way a thing is being made is not being reinforced with enough process at the appropriate scale for the moment. That goes doubly for folks who leverage toolkits, especially in popular coding frameworks.
Those toolkits, in my opinion, have a tendency to take the critical thinking out of how to plan a thing. Since folks can create so quickly, the process of making a thing becomes all delivery and in many cases lacks substance. Then, a decade later, folks wonder why things haven’t scaled to their expectations.
I advocate for matching the substance of the work to the needs of delivery, and keeping that match as things scale. That includes defining, early on, a vision statement with standards built in - like using ISO 3166 for countries and provinces, which folks do, sometimes without realizing it. Why stop there?
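As a small illustration of “standards built in”, here is a sketch of validating country and subdivision input against ISO 3166. The handful of codes shown is just an excerpt for the example; in real work the full tables would come from the standard itself or a maintained library rather than being hand-typed.

```python
# Hypothetical sketch: validate region input against ISO 3166 codes up front,
# so "Colorado" vs "CO" vs "US-CO" never becomes a data-cleanup project later.

# Tiny excerpt of ISO 3166-1 alpha-2 (countries) and ISO 3166-2 (subdivisions).
# A real system would load the complete, maintained tables.
ISO_3166_1 = {"US", "CA", "MX"}
ISO_3166_2 = {"US-CO", "US-NM", "CA-BC", "MX-CHH"}

def normalize_region(country: str, subdivision: str | None = None) -> str:
    """Return a canonical 'US' or 'US-CO' style code, or raise ValueError."""
    country = country.strip().upper()
    if country not in ISO_3166_1:
        raise ValueError(f"unknown ISO 3166-1 country code: {country!r}")
    if subdivision is None:
        return country
    code = f"{country}-{subdivision.strip().upper()}"
    if code not in ISO_3166_2:
        raise ValueError(f"unknown ISO 3166-2 subdivision code: {code!r}")
    return code

print(normalize_region("us", "co"))   # -> US-CO
print(normalize_region("MX"))         # -> MX
```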
In my own work, I balance how much process I need to get to market quickly and reduce rework. I am working on software right now where I wouldn’t put in as many controls as a massive enterprise. However, there is a crucial difference I try to make: I have good habits in place so that if and when I do need to pivot, a scalable process is already there. Is everything tested right now? No. Can everything be tested in the future? Yes!
I pretty much set up the foundations of Security, ADA, FIPS, Country and State information, GIS and Localization as first steps in everything I do. It means I don’t have to retrofit later to meet a standard. I test whatever code is complex enough to warrant making sure it doesn’t break, and I test integrations with third-party providers.
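Here is a small, hypothetical sketch of what I mean by testing a third-party integration: the provider client and its send method are stand-ins, not any specific vendor’s API, and the test swaps in a fake so the logic around the call can be exercised without sending real mail.

```python
# Hypothetical integration seam: our code wraps a third-party email provider
# behind a small interface so the interesting logic can be tested with a fake.

from dataclasses import dataclass, field

@dataclass
class FakeEmailProvider:
    """Stand-in for a real provider client; records sends instead of mailing."""
    sent: list[dict] = field(default_factory=list)

    def send(self, to: str, template: str, locale: str) -> None:
        self.sent.append({"to": to, "template": template, "locale": locale})

def send_welcome(provider, recipients: list[dict]) -> int:
    """Send the welcome template in each recipient's preferred locale."""
    count = 0
    for r in recipients:
        locale = r.get("locale") or "en"          # missing preference: default, not a crash
        provider.send(to=r["email"], template="welcome", locale=locale)
        count += 1
    return count

def test_send_welcome_uses_each_recipients_locale():
    provider = FakeEmailProvider()
    recipients = [
        {"email": "a@example.com", "locale": "es"},
        {"email": "b@example.com", "locale": None},
    ]
    assert send_welcome(provider, recipients) == 2
    assert provider.sent[0]["locale"] == "es"
    assert provider.sent[1]["locale"] == "en"     # fallback behavior is pinned down
```

The real provider client just has to honor the same `send` signature, so the production code and the test exercise the exact same path around the call.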
Those are my goals, and I influence the conversation by having a reasoned approach to scale: reduce rework, because more rework equals more budget dollars later that don’t equate to profit.
My concern in scaling most operations is not scaling the technology. More often than not, I see that scaling the process around profitable delivery is what is difficult. Anytime folks say “We’re an Agile shop”, I am skeptical, as the situation may or may not call for an Agile methodology.
In my opinion, the best influence any organization can have on its teams is establishing good habits and keeping folks accountable to those good habits in a way that creates leadership.
Establishing good habits is a series of iterative wins. Good habits stop micromanagement and establish trust through success.
Folks want to learn Agile or another process, and in my opinion that is the start of wisdom, not the end. However, asking a team to make an all-in, all-or-nothing change in process may not provide the anchor that is needed.
Folks may just not get it, and that’s ok. The influence one has in this situation is to look at your team and level folks up appropriately. Give them agency and buy-in. Keep them at the table and hungry to succeed. Listen. Persevere.
What is truly needed is engagement, recognition of the need, and leadership that scopes the changes in ways that create success: folks advocating that we can do this together, that we have the skills right here to make it happen.
Scaling the process, to me, is what a good “sea” anchor does. It helps bring stability in heavy weather. Forming good habits over time creates an anchor that is deployed in heavy weather, reduces drift, and then is pulled back in to let the sails unfurl as the storm passes.
Because the ship should always be sailing to a destination. And eventually, any team is going to weather a storm. My advice to any startup: recognize that a storm is coming. Are you ready?
This is how an anchor becomes a sail.
Don’t code tired!