Sunday, November 13, 2011

"Uptime" - The New Metric for IT Competitiveness

Having spent the last couple of years dealing with multiple aspects of Cloud Computing - definitions, vendor implementations, business adoption vs. pushback, available public services, etc. - I'm now starting to conclude that it all really comes down to one metric: "Uptime".

Before anyone thinks I'm talking about the 99.xxxx% metric that has been getting thrown around for decades, think again. That "uptime", the measure of how often the system is available (or unavailable), is no longer relevant. That metric is now measured like this:
  • Start with some assumed downtime that your business can afford vs. an assumed loss of revenue (let's say 99.99%).
  • Bring in some architects to figure out how much hardware and software is required to meet that goal.
  • Build the system. Run the system. Periodically update/maintain the system.
  • Annually, measure if you achieved that level of "uptime".
Now, throw all that away the first time something unexpected happens to your system, it's publicly off-line for 4-8 hours, and it becomes a topic of conversation on Twitter. Or one of your disgruntled employees mentions on a blog how they couldn't get their job done because your system was down.
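
To put rough numbers on that, here is a minimal sketch of the arithmetic behind the old metric. It assumes a 365-day year and reuses the 99.99% target from the example above; the function names are just illustrative, not from any particular tool.

```python
# Sketch: what an availability target allows per year, and what one outage does to it.
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


def achieved_availability_pct(downtime_minutes: float) -> float:
    """Availability for the year, given the total minutes of downtime."""
    return 100 * (1 - downtime_minutes / MINUTES_PER_YEAR)


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.99)
    print(f"Annual downtime budget at 99.99%: {budget:.2f} minutes")

    # The 4-8 hour outage from the scenario above.
    for outage_hours in (4, 8):
        outage_minutes = outage_hours * 60
        print(
            f"One {outage_hours}-hour outage: "
            f"{achieved_availability_pct(outage_minutes):.3f}% for the year, "
            f"{outage_minutes / budget:.1f}x the budget"
        )
```

In other words, a 99.99% target leaves under an hour of downtime for the whole year, so a single 4-hour outage spends that budget several times over before anyone on Twitter has finished complaining.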

"uptime" (lower-case "u") is now expected to be 100%. Engineering might scope it for something else, but non-engineers done know/care about your scoping. They just know they can't work. They don't care if there was a vendor bug, or traffic demand surged, or the ISP had a routing issue. Blah, blah, technobabble...

So what is this new "Uptime" (upper-case "U") I mentioned earlier? This is the "How long does it take to have the new functionality 'Up'?" measurement. Translated a different way:

  • How long does it take to go from new business idea to new business execution?
  • How long before we can respond to our competition?
  • How much do I have to alter this ROI calculation before this investment makes financial sense?
  • How quickly could we respond to a radical shift in the market (positive, negative, etc.)?
Ultimately "Uptime" is focused on a level of market competitiveness for any IT service, whether it's delivered via internal or external sources. And it doesn't automatically mean that IT organizations have to be a credit-card accepting service for basic functionality (VMs, block storage, etc.). Rather, it asks IT organizations to measure themselves against well-defined benchmarks for similar services. Some of those benchmarks exist today via the public IaaS or SaaS markets; others are much more complicated to measure.

"Uptime" ties back to the idea of a CIO being the Cloud Concierge, where they deliver a buffet of options back to the business, sourced from different IT providers (internal, external, etc.) at various price points and "Uptimes". 

But "Uptime" also means that IT organizations will have to start thinking about a different funding model. Those funding changes may include:
  • Funding for a centralized system to allow IT to provide access to that buffet of services through different models (IT created and managed systems; user-driven self-service created systems; password accessed external SaaS systems, etc.). 
  • Funding for on-demand capacity, via shared pools of server / network / storage / virtualization / application resources. The capacity may reside internally, or funding may be pooled from multiple sources to acquire external capacity (on-demand or pre-paid).
  • Funding to monitor external (market, industry) best-practices for "Uptime" of various types of service and associated costs / skills required. 
  • Funding to actively look at how to decommission or retire legacy systems that consume too many resources for the value they still provide back to the company (how to migrate, how to retrain on new systems, etc.)
  • Funding for cross-functional technology skills (DevOps, converged infrastructure, open-source) as well as new operational skills (internal + external systems management)
I believe we have the technology in place - between virtualization, automation, service-catalogs, multi-cloud management - to make this type of market-competitive metric a reality today. But it will require some changes on the side of IT organizations looking to play a different role within their business. Are they ready to measure themselves against an "Uptime" metric in 2012? 

I'd be very interested in feedback from anyone who does this today, or from organizations that believe this is an important metric for them to remain IT-competitive going forward.