Software Operability article in DevOpsFriday

DevOps star Benjamin Wooton (@benjaminwooton) has published the latest installment of his DevOpsFriday newsletter – Insight from DevOps Thought Leaders – including articles by David Mytton of @serverdensity, Matt Watson of @Stackify, Sandy Walsh (@TheSandyWalsh) and the RethinkDB team (@rethinkdb).

I contributed the following article on software operability and why it is so important for today’s software systems; it takes the form of an interview, with Benjamin Wooton asking the questions.

Let’s Talk About Operational Features, not Non-Functional Requirements

Using the term ‘non-functional requirements’ to describe aspects of software systems which are invisible to the end-user but essential for effective service operation is counter-productive; we should instead use ‘operational requirements’ or ‘operational features’, and schedule these for delivery alongside end-user features.

Update: the Experience DevOps workshop series now has sessions on Software Operability.


Who Owns My Operability?

Operability is not something which can be ‘bolted on’ or retrofitted to software after it goes live; we need to design and build our software with operability as a first-class concern. You don’t build a bridge, then try to add load-bearing capabilities at the end of the project — but most software projects try to do exactly that, typically with costly results.

Ultimately, the product owner should be responsible for ensuring that operational requirements are prioritized alongside end-user features. If you are responsible for the software product or service, there is only one answer to the question:

Who Owns My Operability?

Update: the site now shows selected recommended reading on each page load.

(With a nod to

“Focus on Application Support and Maintainability” to Produce Operable Software

The book 97 Things Every Software Architect Should Know is a useful collection of personal recommendations from experienced software practitioners around the world, and it contains some excellent advice for any thinking person engaged in software systems engineering.

However, it’s clear that even as recently as 2009 (when the book was published), there was very little focus amongst those who identify as software architects on making sure the software we design and build is operable by the operations teams on the “front line”. In fact, the only contributors who directly touch on software operability (aside from Michael Nygard, naturally) are Rebecca Parsons, Mncedisi Kasper, and Dave Anderson; four contributions out of 97.

Which sections of Release It! help to make software operable?

The book Release It! by Michael Nygard (@mtnygard) is essential reading for anyone concerned with the operability of software. “What about the tl;dr version?”, you ask. There is no tl;dr version of Release It! – it’s all hugely valuable, so if you’re serious about software operability, read the whole book.

Once you’ve done that, here are some page numbers for quick reference which relate to software operability:

  • p.212 – Multiple NICs and multiple IP addresses
  • p.240 – Keeping configuration out of the application and in version control
  • p.252 – The importance of human eyeballs on monitoring systems
  • p.261 – Recovery-Oriented Computing (and by implication MTTR)
  • p.263 – Sensing changes in the application
  • p.267 – The importance of data trends
  • p.274-281 – Logging, including logging levels, log message formats, and log message semantics
  • p.318-322 – The architecture of the organisation, and how this affects operability

Arguably the most important theme of Release It! in terms of software operability is that we should treat logging and metrics as first-class cross-functional aspects of our applications. We can write all the fancy circuit-breaker or exponential backoff code we like, but if the system operators do not know what is happening, the system as a whole is not operable.
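To make that point concrete, here is a minimal sketch in Python of exponential backoff code that tells operators what it is doing. It is not taken from Release It!; the function name call_with_backoff, the in-process metrics counter, and the event=... log message format are all assumptions made for illustration. A real system would export the counters to whatever monitoring tool the operations team actually uses.

```python
import logging
import time
from collections import Counter

logger = logging.getLogger("operability.retry")

# Hypothetical in-process metrics store (an assumption for this sketch);
# in practice these counts would be exported to a monitoring system.
metrics = Counter()


def call_with_backoff(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Retry `operation` with exponential backoff, logging and counting every attempt."""
    for attempt in range(1, max_attempts + 1):
        metrics["attempts"] += 1
        try:
            result = operation()
        except Exception as exc:
            metrics["failures"] += 1
            if attempt == max_attempts:
                # Final failure: log at ERROR so it stands out to operators.
                logger.error("event=retry_exhausted attempts=%d error=%r", attempt, exc)
                raise
            # Delay doubles on each failed attempt: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            logger.warning(
                "event=retry_scheduled attempt=%d delay_s=%.2f error=%r",
                attempt, delay, exc,
            )
            sleep(delay)
        else:
            metrics["successes"] += 1
            logger.info("event=call_succeeded attempt=%d", attempt)
            return result
```

The retry logic itself is only a few lines; most of the function is there to make each attempt, delay, and outcome visible. The sleep parameter is injectable so the behaviour can be checked without waiting for real delays.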