Links List 7.3.08

Posted July 3rd, 2008 by Joe Pendry

Microsoft finally released its key feature of Windows Server 2008, Hyper-V. The download allows people to run multiple OSes on one physical Windows Server. Although many were excited about this release, some still question whether Microsoft can stay afloat against virtualization frontrunner VMware. Only time will tell.

Data centers are going green – because they need to. According to an article published by the Wall Street Journal, the EPA states, “data center energy use could double by 2011, amounting to $7.4 billion in U.S. electricity costs and requiring the equivalent of 10 new power plants.” Many companies are addressing the need for data centers to be energy efficient and acknowledging that green is the new black.

Facebook’s Jonathan Heiliger, vice president of technical operations, discusses the growth of the popular social networking site and how their servers stay on top despite the 250,000 additional users per day.

Are enterprises ready for the cloud? Here are ten reasons why enterprises are not ready to trust the cloud.

Filed Under: IT Operations, Virtualization

Disaster Recovery and Virtualization at Gartner IOM

Posted July 1st, 2008 by Joe Pendry

Today’s post wraps up our thoughts on the Gartner Infrastructure and Operations Management conference. One of the more interesting presentations we attended focused on the advantages of using virtualization for disaster recovery testing. John Morency, Research Director, led a session on this topic.

As we heard from a number of presenters throughout the week, John noted that IT is getting more complex. Organizations must deal with more business processes, more applications, and more data each year. This is particularly problematic for organizations that are trying to test for disaster recovery because the testing is largely manual and labor-intensive. At any meaningful scope, the process becomes very difficult to scale.

As a result, most organizations tend to focus their testing efforts towards supporting the higher priority, “tier 1 and 2” areas. Because they can’t do everything at once, they focus on testing to get the most important stuff up and running quickly.

Interestingly, when John asked the audience about the success of the most recent disaster recovery test, not one attendee reported that all service level goals for the test were met. The audience was roughly split as to whether this was because they ran into “minor” or “major” problems.

So what are the major problems in the way of disaster recovery?

  • Challenge No. 1 – Dependency management. Tests don’t always include the data or applications they depend on. And change management may be out of sync between the data center and the recovery center.
  • Challenge No. 2 – Clear and complete Testing Runbooks. It is difficult to answer all of the questions with regard to the testing procedure. What should I test? What should I do? In what order should startup procedures run? For each application, which transactions need to be tested and why? Today this is largely a paper or MS Project process rather than an automated one.
  • Challenge No. 3 – Race against the clock. It is difficult to scope the length of the test and coordinate appropriately with services. A mismatch sometimes exists between the classic disaster recovery service provider model and what organizations need to do, especially when testing frequencies need to increase.
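
To make the first two challenges a little more concrete, here is a minimal sketch of how a runbook’s startup order and missing dependencies could be derived automatically rather than tracked on paper. This is our own illustration, not something shown in the session: the service names and the DEPENDENCIES map are hypothetical, and the code assumes Python 3.9 or later for the standard-library graphlib module.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map (an assumption for illustration):
# each service lists the services that must be running before it starts.
DEPENDENCIES = {
    "database": [],
    "message_queue": [],
    "app_server": ["database", "message_queue"],
    "web_frontend": ["app_server"],
}


def missing_dependencies(deps, in_scope):
    """Return direct dependencies that in-scope services need but the test omits."""
    missing = set()
    for service in in_scope:
        for dep in deps.get(service, []):
            if dep not in in_scope:
                missing.add(dep)
    return sorted(missing)


def startup_order(deps, in_scope):
    """Return in-scope services ordered so each starts after its dependencies."""
    graph = {
        service: [d for d in deps.get(service, []) if d in in_scope]
        for service in in_scope
    }
    # static_order() yields every node after all of its predecessors.
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    scope = {"web_frontend", "app_server"}
    print("Missing from test scope:", missing_dependencies(DEPENDENCIES, scope))
    print("Startup order:", startup_order(DEPENDENCIES, set(DEPENDENCIES)))
```

Several startup orders can be valid for the same map, so the sketch only guarantees that a service never starts before something it depends on.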

John then touched on some best practices for organizations to follow when trying to ramp up disaster recovery plans:

  • Aim to test the end-to-end disaster recovery plan twice a year, which is considered best practice. (Well over half of organizations fail to meet this, however.)
  • Get all stakeholders in the room to (at a minimum) discuss scenarios and needs.
  • Don’t presume that you need to start with planning for cataclysmic events. When one vendor looked at the incidents they experienced over a ten-year period, they found two of the top three most common incidents had nothing to do with major disasters (they were brown-outs and equipment malfunction).
  • Realize that every exercise doesn’t need to be a full production cut-over.
  • Gain a senior sponsor to get the right people into the room – and push for continuous sponsorship.

Lastly, John focused on ways that virtualization can help with disaster recovery testing:

  • Consolidate pre- and post-recoveries into a single set of tests.
  • More easily call out missing dependencies in testing.
  • Gain a resource that can be used more frequently for testing – making tests more efficient and thorough.
  • Lower risk of unintentional data updates as a result of tests.

The interesting point of the presentation is that it confirms the idea that starting somewhere is better than waiting for perfection. Deciding on small ways to improve testing results – and perhaps using some of the benefits of virtualization – can put organizations well ahead of the average company in terms of testing. This should help when disaster inevitably strikes.

Filed Under: Business Continuity, Change Management, Downtime, Testing, Virtualization

Links List 6.27.08

Posted June 27th, 2008 by Joe Pendry

Cloud computing is a seriously hot topic. Alistair Croll at Bitcurrent (and GigaOm) breaks it down into two simple thoughts though. The first one’s pretty basic: Don’t use someone who can’t keep their cloud running. The second one is less obvious: The value of a cloud service isn’t just what it does; it’s also how many people use it.

Red Hat has adopted KVM. The company will offer a new lightweight distribution which integrates KVM, selling it as the Red Hat virtualization platform (Embedded Linux Hypervisor). Additionally, the company will offer a new enterprise-wide management solution called oVirt, which is based on the standardized libvirt APIs and is designed to scale up to thousands of virtual machines. The question is, how does this impact Xen?

Speaking of competitive impact, HP and VMware are teaming up. They will jointly develop hardware to manage VMware hypervisor technology.

RedMonk praises downtime, saying that without it, Twitter would not be the hot commodity it is today.

A positive effect from downtime is not always possible though, as evidenced by Data Center Knowledge’s post about an extended data center outage at hosting firm Atlantic.net.

Filed Under: Downtime, IT Operations, Virtualization

Gartner IOM Conference: CMDB Success

Posted June 26th, 2008 by Joe Pendry

Another popular topic at the Gartner Infrastructure and Operations Management show this week was the configuration management database, or CMDB. Patricia Adams, Gartner Research Director, and Ronni Colville, Gartner VP and Distinguished Analyst, hosted a session titled “Ensuring Your CMDB Success”.

Because there are many views and statistics being thrown around about the CMDB these days, it was interesting to see Patricia and Ronni use a couple of survey questions to get a sense of where things stand among conference attendees.

First off, it appears that most companies at the conference are going down the CMDB path – eventually, at least. The session attendees were asked where they are in the CMDB process. Here is a breakdown of the answers:

  • 33% are currently in progress with a CMDB
  • 19% will start a CMDB effort in 6 months
  • 12% will start a CMDB in 6-12 months
  • 20% will start a CMDB by the end of 2009
  • 9% are not planning a CMDB

There seems to be quite a bit of traction (both now and in the future), which also maps to the results of the IT Skeptic’s survey. A number of companies are already working on a CMDB and a good number plan to start work.

Another survey question showed additional interesting results. About half of the attendees were implementing ITIL v2 and about ten percent were implementing v3. This would seem to indicate that the deployment of a CMDB is actually outpacing ITIL adoption. This is an ironic occurrence since the CMDB term actually stems from ITIL in the first place.

Regardless, Ronni also had some points of consideration for companies looking to implement a CMDB:

  1. Ensure that CMDB data is accurate. Whatever data you include in the CMDB needs to be reliable to be effective.
  2. Test to make sure the CMDB is trusted. Is IT using the data regularly and to proper effect?
  3. Agree on why you are implementing CMDB. It is important to quantify the value of the effort given the length of time that implementation may take.
  4. Choose your CMDB vendor wisely. There are no standards for CMDB today, so different vendors may be using different methodologies.

One goal of the CMDB is to improve visibility into the impact of a change. To the extent that this will reduce downtime, we think this is a good thing. However, as Patricia and Ronni pointed out, CMDB projects are very complicated and require considerable effort to provide value. It seems the work will continue for some time.
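
As a rough illustration of what “visibility into the impact of a change” can mean in practice, the sketch below walks a set of dependency relationships to list everything that could be affected when one item changes. It is our own simplified example, not a description of any vendor’s CMDB: the configuration item names and the DEPENDS_ON map are hypothetical, and a real CMDB would hold far richer relationship data.

```python
from collections import deque

# Hypothetical CMDB relationships (an assumption for illustration):
# configuration item -> items it directly depends on.
DEPENDS_ON = {
    "billing_app": ["app_server_1", "orders_db"],
    "reporting_app": ["orders_db"],
    "app_server_1": ["vm_host_a"],
    "orders_db": ["vm_host_b"],
}


def impacted_by(changed_item, depends_on):
    """Return every item that directly or indirectly depends on changed_item."""
    # Invert the map so we can walk from an item to the things that rely on it.
    dependents = {}
    for item, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(item)

    # Breadth-first walk over the inverted relationships.
    seen = set()
    queue = deque([changed_item])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


if __name__ == "__main__":
    print(impacted_by("vm_host_b", DEPENDS_ON))
```

The answer is only as good as the relationship data behind it, which is why the points above about accuracy and trust matter so much.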

Filed Under: Change Impact Analysis, Change Management, Downtime, IT Operations, ITIL

Doug McClure: Thoughts on BSM, ITSM, Change and Release Management

Posted June 25th, 2008 by Dennis Powell

A couple of weeks ago, I had a chance to speak with Doug McClure about his perceptions of Business Service Management (BSM), IT Service Management (ITSM), and their relationship to Change and Release Management. Doug is a Senior Managing Consultant for Business and IT Service Management within the IBM Tivoli Lab Services (ISST) organization, and believe me when I say that he lives and breathes this stuff. See Doug’s blog for more insight.

I’ll start by summarizing Doug’s comments: Everything in the IT environment rolls up to BSM. By “everything” Doug is referring to the people, process, components, systems, services, and technology that make up an IT organization. While BSM and ITSM are becoming more and more mainstream in today’s computing industry, Doug believes that most organizations still need to engage in a fundamental organizational breakdown to instill a true business service management perspective within the IT organization. This would replace IT’s own lingering view that IT is responsible for “managing the Windows servers” (i.e., the organization’s technology).

To begin our conversation, I asked Doug where he would advise organizations to start in bringing IT under the BSM umbrella. Doug responded with a sentiment that seems to be growing among those in the industry who advise and consult on such matters: the organizations that are most successful at adopting BSM and ITSM are those that drive adoption from the top, i.e., the executive office. These types of organizations build a true partnership between the business analysts who truly understand business, speak business language, and operate within business context, and the IT personnel who understand the technology, its power, and how best to use it.

One other aspect to this – when organizations “personalize” BSM and ITSM (in other words put it in their own terms of what BSM means to their business, rather than simply repeating what vendors or analysts say it means), the organizations regularly meet the goals and satisfy the objectives about BSM and ITSM that they put forth. It’s because they understand in their own terms what they are striving to accomplish.

I was curious to know whether Doug viewed the adoption of Web 2.0 technologies, and social/collaborative networking as a bane or boon to the adoption of BSM. There was no uncertainty in his answer. The adoption of social networking technology is absolutely a requirement for next generation BSM, to attract and keep people engaged.

This is not just because collaborative communication is the growing trend of communication. It is much more a requirement because organizations that universally adopt wikis, blogs, etc. tend to express things in a much more transparent and honest manner. For example, when business people “talk about the pain of an IT outage” using public collaborative means, that information reaches the “deck plate” where those on the front lines can be exposed to this unique business perspective. This can be more impactful than hiding this information in a slide deck, an executive report, or behind the intranet firewall.

Doug and I went on to discuss several other topics of interest, including his perspective on the importance of a Configuration Management DataBase (CMDB) to BSM, the importance of Change and Release Management to BSM, and the importance of Data Center Automation (DCA) to BSM. However, instead of giving all the highlights away here, I’ll invite you to listen to the conversation online.

My thanks to Doug for providing his time and invaluable insight. I look forward to speaking with him in the future about a variety of subjects related to BSM and ITSM.

Filed Under: Change Management, IT Operations, Interviews, Interviews-Bloggers

Gartner IT Operations & Management Summit: Changes Mean We’re a Long Way from Nirvana

Posted June 24th, 2008 by Joe Pendry

This week, we are attending the Gartner IT Infrastructure, Operations & Management Summit 2008. The sessions at this summit have proven to be full of interesting information for the IT’s About Uptime Team, so we’ll be sending in posts from Orlando all week.

On Monday, I attended a keynote address from Donna Scott, Gartner VP and Distinguished Analyst. She touched on the quest by IT Operations teams to achieve business alignment, and how this is often disrupted by “turbulence”…issues like compliance, new architectures and new technologies. Too often, IT Operations must take care of managing their environment at the expense of more strategic endeavors because – as she stated – “change is never ending” and it is hard to get to business alignment nirvana.

We couldn’t agree more. As we have posted in the past, change management maturity offers a host of business benefits. The most mature organizations can gain real business benefits such as fewer problems, greater confidence in changes and fewer emergency changes. This, in turn, can allow them to focus on running IT as a business service.

Donna also mentioned a few key trends that are particularly prevalent these days:

  • Cost pressures. This is becoming more and more of an issue, requiring smart organizations to infuse a culture of continuous optimization that allows them to keep an eye on cost containment.
  • Alternate service delivery models. Strategies to support IT are varied and numerous. You can insource, outsource, leverage SaaS, utilize selective sourcing, or outsource only business processes to name a few. There is good and bad here – while these strategies offer flexibility, they can create business risk and alignment issues. According to Donna, there is nothing wrong with more diversity, but it needs to be managed with a common process. And, as we have blogged about (although we were focusing on virtualization), complexity can cause problems.
  • Business demands. Business wants IT to operate like a utility. Plug it in and let it run without interruption. Unfortunately IT can’t operate this way. No standard processes exist for IT Operations the way they do for utilities. There is no standard method, for example, to architect for or achieve availability.

One other interesting point. A survey of participants revealed that the top three pressures for attendees were:

  • 24×7 availability
  • Cost containment
  • Business continuity

Lower down the list (almost last) was “Preparing for virtualization.” It seems that most organizations no longer need to prepare because the majority are already using virtualization in their organization. As we have discussed, this will put even more pressure on IT Operations to handle and manage change.

It seems IT Operations will be very busy in the near term. Business alignment nirvana might be the endpoint, but we have lots of road to travel before we get there.

Filed Under: Change Management, IT Operations

Links List 6.20.08

Posted June 20th, 2008 by Joe Pendry

Google and Firefox both saw downtime this week. Google App Engine went down for a brief period on Tuesday, causing developers much angst as they could not access their management consoles. Interestingly enough, there was no mention of the downtime on their blog. Firefox also saw a brief period of downtime on Tuesday, due to the volume of people trying to access the download platform. As with Twitter, it appears that scalability and testing are key factors to prevent downtime.

Input/output (I/O) virtualization gets a nod from the Burton Group’s Data Center Strategies, noting, “in general these solutions do reduce complexity when trying to manage the unruly forest of physical connections that go into a rack of virtual server or blade hosts.”

Managed Objects discusses CMDB and the vision for integration. 70% of the data needed for a CMDB already exists in the enterprise. What’s needed is the mechanism by which to tie this data together, and present it in a manner that can be easily updated by the enterprise and consumed by the business.

Judith Hurwitz asks if Microsoft can manage to pull together five opportunities for service. Virtualization, managing a combined physical and virtual world, creating the next generation dynamic platform, SaaS (and more), and SOA are all mentioned as a unique opportunity to take Microsoft’s traditional customer base of programmers and move them to a new level of knowledge so they can participate in their vision of Dynamic IT.

Filed Under: Change Management, Downtime, IT Operations, Virtualization

The Hidden Costs of Heterogeneous Operating System Environments

Posted June 19th, 2008 by Jonah Paransky

Often we praise the advantages of a heterogeneous operating system environment to support multi-tier business applications and IT services. We can pick the best operating system for each component of the software infrastructure stack. We have better security, because we are using different operating systems for different components of our application. We manage costs better, because we won’t be locked in by a single vendor.

All these advantages are true. There are also hidden costs associated with this approach.

In our recent study, IT Operations Research Report: Testing Maturity Part II Applications and Operating Systems, we discovered several costs associated with heterogeneous environments. They included:

Increased Total Cost of Downtime

Heterogeneous environments show a greater total cost of unplanned downtime than organizations that use one operating system for all tiers of the stack.

Increased IT Labor Hours Due to Unplanned Downtime

Companies with multiple operating system environments devote over 60% more in IT staff time to address unplanned downtime emergencies.

Increased Cost of Changes Due to Production Problems

Companies with multiple operating systems show a greater cost of changes due to production problems compared to single operating system environments.

Greater Total Number of Changes

Companies make 26% more changes to heterogeneous operating system environments than to single operating system environments.

How Does This Impact the Data Center?

We therefore shouldn’t be surprised that so many organizations go down a single operating system path. A significant number of companies choose to standardize across individual stack tiers. A majority of companies also choose one operating system, such as Windows, for all tiers of their software infrastructure stacks as we discussed in depth in a previous blog post on Windows dominance in the data center.

Download the Report

A full copy of the IT Operations Research Report: Testing Maturity Part II – Applications and Operating Systems can be downloaded here.

Filed Under: Change Impact Analysis, Downtime, IT Operations, IT Operations Research

Virtualization Security: What Is It and Where Is It Going?

Posted June 18th, 2008 by Joe Pendry

We mentioned that many virtualization and security experts joined together for a call to discuss virtualization security and what it means in the industry.

Kris Buytaert of virtualization.com has posted some teaser tidbits and quotes from the call as well. Chris Hoff is also interested in getting things going on his blog Rational Survivability.

Check back for additional discussion on the definition and impact of virtualization security, open source and virtualization, challenges, standardization questions and more virtsec news and analysis.

The entire conversation has been posted on StackSafe’s website for your convenience. Download the Virtualization Security Webinar from StackSafe’s website here.

Filed Under: Interviews, Security, Virtualization

Links List 6.13.08

Posted June 13th, 2008 by Joe Pendry

In light of Amazon’s latest downtime issues, Gigaom explains why Amazon went down and why it matters. In a thorough explanation, Gigaom bets that the problem lies with the CDN or AFE. The moral of their story is to look into global server load balancing and to make sure you have geographically distributed data centers.

InfoWorld reports that Symantec is releasing a virtualization package that includes the vendor’s own management software with the Citrix Xen hypervisor. The tool was announced this week in conjunction with the Symantec Vision 2008 conference.

Will the 3G iPhone release cause websites to clog up and go down? The Rogers site had some issues recently, and we’ll check in next week to see how it all turned out.

Cloud computing continues to be a hot topic, and Alan Shimel’s recent post on SYS-CON brings up questions of infrastructure and SaaS. He points out that, “it is easy to dismiss Don Dodge’s piece today asking ‘Do You Really Want Your Data in the Cloud?’ as a Microsoft guy defending their turf. He uses some recent uptime problems at Amazon, Twitter, Disqus and Typepad to show that keeping your information in the cloud and relying on the net to deliver your applications gives you less control, less security, less scalability and less reliability.”

Transitioning to enterprise storage? Make sure that you are testing the configuration. According to TechRepublic, “when you provision a system, you should go through a testing process that checks I/O performance on the shared storage and accounts for the loss of a path or link down in the connection. You should also know what tools you can use to add, remove, and modify storage while a system is online. You do not want to go through the discovery process on a live system.” A virtual environment will be a lifesaver in this case.

Filed Under: Downtime, IT Operations, Testing, Virtualization