The hero is dead. Long live the hero!

How toxic behaviours in your organisation hinder growth and lower the quality of your work. And what to do about it.

First of all, let’s establish something: being a Hero is a good thing. They live among us; they drive firetrucks, rescue kittens and save lives. They study hard to become doctors and scientists and invent the next treatment for whatever ailment comes our way. Or they become nurses taking care of patients, or teachers forming young minds. They are protesters raising a voice, or journalists amplifying it. But being a hero can be a bad thing, too, especially in software engineering and digital delivery.

Secondly, let’s agree on something else. Every change is an IT Change. Period. Yes, this is a blanket statement, but we have to admit that most customer journeys are moving to digital ecosystems.

Right now, most enterprises rely on IT change to manage their finances, their Human Resources, their marketing, their tools, their metrics, and so on. And this IT change is needed more than ever. The past two years have proven, even to the most opinionated sceptics, that moving certain things quickly to a complete end-to-end digital journey is possible. Prescriptions, healthcare, education, dating, shopping, collaboration… most of them already existed in some digital form or required an IT component, but it’s probably the first time in a long time that people’s ideas about what is possible were challenged and we started talking about a complete end-to-end digital journey. And now we see digital transformation even in the most “traditional” industries like banking, pharmaceuticals and education.

According to McKinsey, in most regions of the world, the digitalization of businesses was accelerated by three years on average. The average share of digital customer interactions increased to 58% by July 2020, from 36% at the start of the COVID-19 crisis.

Source: McKinsey & Co https://mck.co/3GffsAR

Faster, better and more financially stable products and solutions need to be rolled out more than ever. This can mean many things for people working in digital delivery, but most importantly, we will still be employed in the near future. We will probably consume even more coffee to stay up, because the truth is that we simply cannot keep up with the rate of IT change we are trying to bring to the market. We will do it for sure, and we will be the heroes for it! Ok, people will not clap every day at 8 pm for 5 minutes. Still, C-levels will recognise our teams for bringing another digital journey to the market, and the business will shower us with compliments for the transformation and innovation we introduced.

But we will know. It was painful. We will hide our pain or even dress it up. But we will know that we are not extremely proud because we probably messed up somewhere. But we will be the HERO!

But what is a hero, you might ask. Well, a hero is a good thing, right? If you google the definition of Hero, you will find that it is indeed a very positive term. They have been through wars, and they are celebrated for it; they are the centre of attention, and ultimately they are the good guys that we have to sympathise with. They are the most critical team members because they know…

Thankfully, most of us are not in a war zone and we are not Iron Man… but we are in war zones called IT Change.

The real definition of the Hero, at least how I see it, is that in teams dealing with IT Change and delivery:

A hero is a person that will always be there for you. And they will be with you solving a problem over and over again.

And you usually will not have to ask how they solved it, because ultimately they will save you the next time something goes wrong. And yes, everyone will celebrate them for saving you. And also, they might get special treatment. Not because they are better than everyone else in your team, but because if they leave the organization, it is painful for everyone. The gates of hell will open, and panic will ensue until the next hero emerges to save the day.

Heroes come in all shapes and sizes, but they all have similar traits and behaviours:

People interrupt them “only for 5–10 minutes”, but usually it’s for much longer. It could be 20 minutes… or it could turn to three days. 👎
People will involve them in meetings because of the context and historical information that only their local hero will have. And the “hero” will despise them for it. (unless they are a psychopath) 🔪
Deployments cannot happen without them, and they have to be a backup in case something goes wrong. And it usually does, and there is a silent agreement that it will. And when it does, they are the first (and often the last) point of contact to resolve it and put out the fire 🚒
And when things get done and dusted, we celebrate them for always being there to solve a problem. And we could ask, “how did you solve this?” and they would be more than happy to explain and help… but somebody has just called them away to another crisis. 🤷‍♂

Do you recognize this person? I know I do, because I was that person in my career. And I was vocal about being that hero. So vocal that I used to post about it on Facebook 😥

It was me, expressing to the world that I was a Hero….

Probably I was doing a hardcoded production code change on the fly while I was preparing for the go-live of a site.

Many things have changed, but I cannot help but ask myself, “have they?”

I mean, in September of 2021, I had to wake up at 3 am to oversee a deployment. Ok, there was no hardcoding, but still a system had to be taken down, step-by-step instructions had to be written for the operations team on a Confluence page, and regression testing had to be done by a team of 4 within a 2-hour window between deploying and saying… “ok we are fine now… let it be live” or “rollback, we fucked up”.

So asking again, “have you been this person? Or do you have a team member that had to be this person recently?” Answer truthfully.

The reality is that I don’t regret being this person: the person who wrote complicated code without realising that it was hard to maintain, who developed something on the fly because the first time it was tested was in production, who shipped a hotfix (that became permanent!) because the test data wasn’t available until then… or who even worked in a production environment that had been the staging environment until the day it went live, so that the next time something had to change, it was a pain in the neck.

It wasn’t out of malice, and it wasn’t on purpose. It was just the symptom of a systemic problem. Heroes of this kind are the victims of a changing system in which the pressure to deliver increases tenfold by the year while their leads don’t have the steering required to handle it strategically. This has an impact on individuals, teams and organizations. And finally, on the business. The impact can be summarized as follows:

The most obvious one is fatigue and lack of motivation for the individual and the team. In my experience, no software engineer wants to be the hero. They just want to be left alone to code in a dark corner (and occasionally take a break to go to the foosball table).
Inability to plan effectively, due to the constant fire-fighting, which leads to missed business opportunities as trust degrades at the same rate as our quality.
Things take longer to get done, and momentum and context are lost. A release needs more than you thought: initially you planned for a sprint, but it takes a little longer with each release, and the longer it takes, the bigger the team required to do the release.
Domain knowledge is lost but also the opportunity to innovate goes out the window too, because if you don’t have the time due to the “next big fire”, will you have the time for a technical spike?
Attrition is increasing because nobody wants to work like that. And the grass starts to look greener for people and they start to jump ship. Plot twist: it rarely is.

And to innovate you need to be able to close the feedback loop, but we see that most traditional enterprises take a linear approach to IT delivery, also known as “over the wall”. I am not going to explain agile vs waterfall here, but in simple terms...

The process starts from Business, where they say… “I need something”, and because people are visual, in most cases it will go to research and design.
By the time the design is finished, a new delivery team will have been created, and they will get a dump of requirements, a design and hopefully some architectural guidance.
They will work hard to develop something. They will request infrastructure to be able to test it, which might take a couple of weeks, and they will reach a point where they involve the QA team. Hopefully, the QA team will have been involved before this point, but I wouldn’t be surprised if it wasn’t, or if the “QA team” is actually the business seeing things for the first time.
Finally, all of the previous people will have to go to the dreaded CAB (Change Approval Board… 😨), also known as the Seventh Circle of Hell, to ask for approval to put their changes in production.
They will create countless pages of documentation to ensure that all the processes were followed. Compliance and security will audit for a couple of weeks to months. They will all produce evidence and reports. Things will be found and things will be fixed hopefully. And CAB will give their approval… And then they will deploy.
And there is this silent agreement that if it goes bad, we won’t have to worry because we made sure that we have one person around that will save the day. At best, this over the wall approach will work perfectly and we will ask operations to deploy manually and be happy we delivered.

How many times have we seen this scenario working?

And if it appears that it is working, what have we sacrificed for it? Was it long hours? Was it manual work and manual testing? Was it budget? Was it mental health? Did we say to our Change Approval Board that “oh it will be grand… it’s only a low-risk change you know…”? Have we intentionally lied to them?… Did we lie to ourselves? Are we the Hero? Oh, we are.

There is nothing good here. You sacrificed something, and the problem always lies in connecting the developers and the operations. It’s not a technical problem. It’s a cultural one.

And you are definitely not the Hero. A DevOps culture can be. And to change towards that culture, we have to solve the most common challenges that organizations face when trying to implement DevOps for real.

Organisational Structure

Organisational structure issues are probably the “easiest” ones to solve, but a joint agreement has to be reached. DevOps is an evolution, not just a toolkit, and in organisations, we observe that:

Too much middle management is in place, and the engineers are not close to the business, so ideas are lost in translation.
Too many silos and no incentive to break them
Lack of knowledge, and of the much-needed education for the C-suite, about how digital delivery happens

This transformation is not just a matter of getting the Development and Operations streams working together but also of transforming the organisational culture, processes, and technologies. So any change must involve all levels of an organisation, with full support from the organisation’s leadership and management.

Existing processes

After buy-in and the necessary structural changes, processes about how we approach change in an organization have to be addressed and audited. What is observed:

Lack of data, data collection or historical information
Complexity of existing processes that generate “waste”
Daily operations that do not allow improvement
Regulated industries, where the need for regulation is used as an excuse not to challenge the status quo
Conflicting best practices (ITIL vs DevOps vs SRE)

Culture

Culture in organisations is what DevOps is trying to change by being the enabler. Changing culture is the most difficult part, due to:

Learned behaviours
Resistance to change and a risk-averse approach
Project vs product mindset

And to change culture you first need sponsorship and a success story that you can ride as the vehicle of change.

Now what?

In a perfect world, everything would be automated, easy, secure, independent and data-driven. Testing of any kind would be an easy habit; merging code would happen confidently. A security review wouldn’t be an afterthought; it would shift left and not happen in isolation. Releases would be frequent and fast, and getting something live would be a code commit, as simple as turning on a feature toggle. Finally, our work would be traceable, and incidents would be easy to pinpoint because we would have the tools to see precisely where the failure is. And all this would be supported with dashboards, tools and APIs that the team would use directly, without necessarily having to raise service requests to a remote operations team.
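To make the feature toggle idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `FeatureToggles` class, the `new_checkout` flag and the `checkout` function are hypothetical names, and a real team would more likely load toggles from configuration or a dedicated toggle service than keep them in memory.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FeatureToggles:
    """Hypothetical in-memory toggle store: flag name -> rollout percentage (0..100)."""
    rollout: Dict[str, int] = field(default_factory=dict)

    def is_enabled(self, flag: str, user_id: str) -> bool:
        # Unknown flags default to 0%, i.e. switched off.
        percentage = self.rollout.get(flag, 0)
        # Hash flag + user into a stable bucket between 0 and 99, so the same
        # user always gets the same answer for the same flag.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percentage


def checkout(toggles: FeatureToggles, user_id: str) -> str:
    # The new code path is already deployed; the toggle decides who sees it.
    if toggles.is_enabled("new_checkout", user_id):
        return "new checkout flow"
    return "old checkout flow"
```

With a setup along these lines, going live means raising `new_checkout` from 0 to 100 (or to a small percentage first), and rolling back means setting it to 0 again, with no 3 am deployment in either direction.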

And if we take a step back, to achieve and prepare ourselves for a DevOps culture change, we have to:

Make procurement easy and not a bureaucratic hell as it usually is for large organizations. Empower engineers and operations to work together on this and make Cloud-enablement a priority.
Automation needs to be in our hearts. Having a CI/CD pipeline and a build server is common sense, but we have to introduce automation in every possible step of the process: from how we create new environments and how our Jira integrates with our source control to how we close the loop and communicate back to the team.
We have to take a hard look at our architecture. I am not referring only to introducing microservices, but we have to review our software engineering principles. Microservices are not a panacea, but good practices around configuration and scalability are.
We have to be data-informed and data-driven. We need to introduce data capturing across our SDLC in a meaningful way to know the impact of our choices and know the facts, and finally move away from our risk-averse approach of release plans that last days, if not weeks.
Cultivate a quality-first mindset. Infuse in your team pride and ownership and eliminate the over-the-wall approach. Take care of the team, they will take care of the software, and the business will be taken care of as a result.

This is the change that a DevOps mindset is asking for: not to make developers act as operations, but to better connect the Operations and Development teams and give both the tools to work efficiently, whether they are developing something for the first time or responding to an incident.

What should you do next?

Every organisation has a different level of maturity. Assuming you are leading an organisation with some tools already, you have to have a strategy if you want to improve. So first things first:

Get your Dev & Ops to agree.
The messaging can be confusing sometimes with Dev teams and operations teams. We ask the first to “break things and move fast” to keep delivering with flexibility while we ask the latter to ensure that nothing breaks. So flexibility vs stability is a contention point for both.
So we need to ensure that those two bodies are in alignment.
Include Security & Compliance
In most organisations, we tend to avoid those entities because they represent even more work for us. Still, we need to include them in this transformation as soon as possible because we have to shift left the activities we keep for last and include them early in our delivery process.
Assess your current efficiency & maturity
We need to have metrics of our current delivery approach because, after the first transformation, we will need to compare ourselves against those metrics again. This will help us with the narrative and get additional sponsorship while motivating us.
Get business buy-in
We need to educate our C-level and come with a strategy that is easy to digest, with a clear narrative and an implementation plan that shows the business value we will get.
Implement a small change, and keep it simple
Focus on a small change with a tangible result and immediate impact on the team. Do you have CI/CD already? Improve the code coverage. Do you have that too? Focus on automated testing & regression testing. Do you have that too? Focus on Infrastructure as Code. Do you have that too? Well… I think you are in the wrong article!
Tell your story
Don’t forget to advocate for what you are doing and always playback with facts and numbers about the improvements you are bringing.
Branch out & experiment
Change your structure in a way that allows experiments. Investigate concepts like chaos engineering.
Rinse and repeat
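
For the “Assess your current efficiency & maturity” step, a baseline does not need a big tool to get started. The sketch below, in Python, assumes you can export a list of past deployments with a commit time, a deploy time and whether the release needed a rollback or hotfix. The choice of metrics (deployments per week, median lead time, change failure rate) and all names such as `Deployment` and `baseline` are assumptions for illustration, not something prescribed above.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Dict, List


@dataclass
class Deployment:
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when it reached production
    failed: bool            # True if it needed a rollback or hotfix


def baseline(deployments: List[Deployment], period_days: int) -> Dict[str, float]:
    """Summarise a period of deployments into a few comparable numbers."""
    if not deployments:
        raise ValueError("need at least one deployment to compute a baseline")
    # Lead time per deployment, in hours, from commit to production.
    lead_times_hours = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600 for d in deployments
    ]
    failures = sum(1 for d in deployments if d.failed)
    return {
        "deployments_per_week": len(deployments) / (period_days / 7),
        "median_lead_time_hours": median(lead_times_hours),
        "change_failure_rate": failures / len(deployments),
    }
```

Run it once before the first change and again after it, over the same period length, and you have the facts and numbers to play back when you tell your story.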

Heroes are everywhere, and it’s not because they want to be, but usually because they are in the right place at the right time. But in software engineering, what teams and the business like the most is predictability. So how might we turn our organisations into environments that thrive in digital delivery with DevOps as the enabler? And how might we change our organisations into places that don’t need heroes?

Vasilis Avlonitis

Senior Delivery Manager at EPAM Systems

I am with EPAM and I am looking forward to building teams and contributing to the strategy of organizations by driving innovation and excellence through delivery with well-engineered creative solutions.

I consider myself a problem-solver with a can-do attitude, and my motto is "everything can be done, if you try".