Why Information Lifecycle Management (ILM) Failed But Needs an Updated Look


If you know me, you know I advocate something called Information Lifecycle Governance (ILG) as the proper model for managing information over its lifespan. During a recent conversation at IOD with Sheila Childs, a top Gartner analyst in this subject area, I was reminded of our running dialogue on the differences between governance at the storage layer and the alternative approach of using records management and retention models. This got me thinking about the origins of the ILG model, so I decided to take a trip in the “way-back” machine for this posting.

 

According to Wikipedia as of this writing, Information Lifecycle Management refers to a wide-ranging set of strategies for administering storage systems on computing devices. Searchstorage.com (an online storage magazine) offers the following explanation: Information life cycle management (ILM) is a comprehensive approach to managing the flow of an information system’s data and associated metadata from creation and initial storage to the time when it becomes obsolete and is deleted. Unlike earlier approaches to data storage management, ILM involves all aspects of dealing with data, starting with user practices, rather than just automating storage procedures, as for example, hierarchical storage management (HSM) does. Also in contrast to older systems, ILM enables more complex criteria for storage management than data age and frequency of access. ILM products automate the processes involved, typically organizing data into separate tiers according to specified policies, and automating data migration from one tier to another based on those criteria. As a rule, newer data, and data that must be accessed more frequently, is stored on faster, but more expensive storage media, while less critical data is stored on cheaper, but slower media. However, the ILM approach recognizes that the importance of any data does not rely solely on its age or how often it’s accessed. Users can specify different policies for data that declines in value at different rates or that retains its value throughout its life span. A path management application, either as a component of ILM software or working in conjunction with it, makes it possible to retrieve any data stored by keeping track of where everything is in the storage cycle.

If you were able to get all the way through that (I had to read it 3 times), you probably concluded that it (1) was way too complicated, (2) was very storage-centric and likely too costly, and (3) was incomplete. These are all reasons why ILM never took hold and is widely considered a failed concept.

But hold on … let’s not throw the baby out with the bath water quite yet.  The underlying idea is sound but needs modification.  In my opinion, here is what was wrong with the notion of ILM when it came to prominence in 2002 or so:

It’s incomplete: Frequency of access does not determine the usefulness of information. Any set of policies needs to include the value of the information to the business itself, along with the legal and regulatory obligations attached to it. Only calculating how recently files were accessed and used is an incomplete approach. Wouldn’t it make sense to understand all of the relevant facets of information value (and obligations) along with frequency of access?
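To make that point concrete, here is a minimal Python sketch of a decision that weighs legal hold, retention obligations and business value alongside access recency. Everything in it is an assumption made for illustration: the `InfoAsset` fields, the two-level "high"/"low" value scale, the one-year staleness threshold and the action labels do not come from any product or standard.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class InfoAsset:
    name: str
    last_accessed: date
    business_value: str       # "high" or "low" (hypothetical two-level scale)
    retention_expires: date   # end of the legal/regulatory retention period
    under_legal_hold: bool = False


def recommend_action(asset: InfoAsset, today: date, stale_after_days: int = 365) -> str:
    """Suggest an action using obligations and value, not just access recency."""
    # Obligations come first: nothing under legal hold may be disposed of.
    if asset.under_legal_hold:
        return "retain: legal hold"
    # Past retention and of low business value: a candidate for disposal.
    if today >= asset.retention_expires and asset.business_value == "low":
        return "dispose"
    # Access recency only chooses a tier for low-value data that must be kept.
    days_idle = (today - asset.last_accessed).days
    if days_idle > stale_after_days and asset.business_value == "low":
        return "retain: cheaper tier"
    return "retain: primary tier"


today = date(2024, 1, 1)
old_memo = InfoAsset("old memo", date(2020, 1, 1), "low", date(2022, 1, 1))
held_memo = InfoAsset("held memo", date(2020, 1, 1), "low", date(2022, 1, 1), under_legal_hold=True)
contract = InfoAsset("contract", date(2020, 1, 1), "high", date(2030, 1, 1))
audit_log = InfoAsset("audit log", date(2020, 1, 1), "low", date(2030, 1, 1))

for asset in (old_memo, held_memo, contract, audit_log):
    print(asset.name, "->", recommend_action(asset, today))
```

All four sample assets were last touched on the same day, yet they get four different answers. An access-only rule would have treated them identically.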

It’s inefficient and leads to error: Managing policies at the device level is a bad idea. As an example, many storage devices require setting the retention policy at the device itself. This seems crazy to me as a general principle. Laws and obligations change, policies change, humans make errors … all of which leads to a very manual, time-consuming and error-prone policy administration process. Wouldn’t a centrally managed policy layer make more sense?
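As a rough illustration of what a centrally managed policy layer could look like, the Python sketch below keeps retention periods in one registry that every repository consults, rather than configuring each device separately. The class names, record classes and retention periods are all hypothetical and are not taken from any vendor's product.

```python
from dataclasses import dataclass, field


@dataclass
class PolicyRegistry:
    """Single source of truth for retention periods, keyed by record class."""
    retention_years: dict = field(default_factory=dict)

    def set_policy(self, record_class: str, years: int) -> None:
        self.retention_years[record_class] = years

    def retention_for(self, record_class: str) -> int:
        return self.retention_years[record_class]


@dataclass
class Repository:
    """A storage system that looks policy up centrally instead of storing its own copy."""
    name: str
    registry: PolicyRegistry

    def retention_for(self, record_class: str) -> int:
        return self.registry.retention_for(record_class)


registry = PolicyRegistry()
registry.set_policy("invoice", 7)          # hypothetical retention period

archive = Repository("archive", registry)
file_share = Repository("file share", registry)

# A change in the law is recorded once, in one place ...
registry.set_policy("invoice", 10)

# ... and every repository sees the new period without per-device changes.
print(archive.retention_for("invoice"), file_share.retention_for("invoice"))
```

The design choice is simply that repositories hold a reference to the policy, not a copy of it, so there is nothing to drift out of sync when obligations change.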

It’s not well understood and can be too costly: This model has led to the overbuying of storage. Many organizations have purchased protected storage when it was not necessary. These devices are referred to as NENR (Non Erasable, Non Rewritable) or WORM (Write Once, Read Many). They come in multiple flavors (WORM Optical, WORM Tape and Magnetic Disk WORM (Subsystem)) and can include multiple disks with tiered tape support. Sample products include EMC Centera, Hitachi HCAP, IBM DR550, NetApp SnapLock and IBM Information Archive. This class of storage costs more than other forms of storage, primarily because of the perception of safety. Certain storage vendors (who will remain nameless) have latched onto this market confusion and even today try to “oversell” storage devices as a substitute for good governance, often to uninformed or ill-advised buyers. The fact is, only the SEC 17a-4 regulation requires WORM storage. Using WORM for applications other than SEC 17a-4 usually means you are paying too much for storage and creating retention conflicts (more on this in a future posting). The point is … only buy protected storage when appropriate to your requirements or obligations.

If we could just fix those issues, is the ILM concept worth revisiting? It’s really not that hard a concept. Over 90% of information is born digital. Over 95% eventually expires and needs to be disposed of. Here is a simple model to consider:

A simple model for governing information over its lifespan

I will go deeper into this very concept (and model) in my next posting. In the meantime, leave me your thoughts on the topic.

I am also curious to know if you have been approached by an overzealous vendor trying to sell you WORM-based storage as a replacement for good governance or records management. I will publish the results.

14 thoughts on “Why Information Lifecycle Management (ILM) Failed But Needs an Updated Look”

  1. Good points, and I hope you expand on this topic going forward. I would also like to hear your thoughts on how ILM fits into the Information Agenda – strategic planning for information in the enterprise.

  2. I agree that the definition of ILM can be a bit unwieldy. In my role I’m regularly asked to define ILM, and I prefer to use a combination of the SNIA definition as well as the goals I’ve set for my organization with regard to ILM:

    * Definition: ILM is comprised of the policies, processes, practices and tools used to align the business value of information with the most appropriate and cost-effective IT infrastructure from the time information is conceived through its final disposition
    * Goal: Reduce the complexity, footprint and cost of the hardware, software and maintenance within our storage environment while maintaining data security and ensuring data availability.
    * Goal: Determine the business value of all the data in our organization and use this business value as a guide to determine where data should be stored (fast disk, slower disk, tape archive), how quickly we need to be able to access data during regular business or recover it in a disaster, and how to dispose of data when it no longer has business value

    Our ILM program is still young, but I believe we are much farther ahead than many other organizations of our size (2PB online, 2.6PB offline). A key requirement for the success of an ILM program is to recognize that it is a process that should be SLA driven. Once an organization comes to terms with this (and realizes it is more than just tiered storage as many of the vendors would lead you to believe) they have a much better chance of deploying an effective ILM solution. Would you agree?

    • Eric – Thanks for the thoughtful posting. In general I agree with your point … in that any organization should understand its information assets and the associated business value and obligations … and should match its SLAs to all of those. My challenge with SNIA’s definition is that it seems to be missing the point about legal and regulatory obligations. A very simple example: you can’t dispose of any form of information if you don’t know what your legal hold or eDiscovery obligations are. Not being able to dispose of information undermines any ILM or governance initiative.

  3. Craig,

    This is a thoughtful piece you’ve written although it seems to me it should really be two separate blogs: One about ILM or ILG and the other about storage vendors overselling WORM to “protect” data.

    Regarding ILM, I agree with you that ILM definitions tend to be too storage-centric and that a centralized, more holistic information governance strategy is needed that transcends cross-functional or departmental policies and solutions. I also agree with you that additional criteria beyond frequency of access and age are needed to properly govern information. However, storage features such as auto-tiering are becoming essential for understaffed, overburdened organizations that are already swamped with data, with more to come. They provide some welcome and affordable relief until ILG solutions become more mature.

    Regarding WORM, I wrote a blog post last year that I believe dovetails with your thoughts here. http://wikibon.org/blog/hey-sec-time-to-de-hype-worm-and-other-technology-requirements/ Ironically, I have subsequently interviewed many WORM clients whose needs go beyond 17a-4 and who are quite happy with the data protection and compliance benefits they receive. I think it behooves storage vendors to move beyond purpose-built storage offerings and take into account your salient point about how automated storage policy can only take ILG so far and how ILM today fails to take into account critical human and organizational factors. Meanwhile, there is still a market for WORM in many sectors, including financial, medical and government.

    Regards,

    Gary

    • Craig – good points regarding legal and regulatory. As we develop classes of service for our ILM program we must create special classes into which this data would be moved. Having someone from legal on your ILM governance team is critical to ensure this compliance is in place. I consider legal and regulatory compliance a business issue, and business requirements should drive system requirements in any good solution, including an ILM program. Good discussions here!

  4. Craig, excellent posting and discussion. I appreciate the clarification between ILM and ILG and now more fully appreciate the value of ILG and the differentiation.

    Something that ILM does not address is the unnecessary cost, risk and burden of keeping transient information around beyond its retention period. I often use the AIIM term ROT (Redundant, Outdated and Trivial) for this information. The recent CGOC study (http://cgoc.com), which put the cost of eDiscovery at $3M per case and found that 70% of the content reviewed by expensive attorneys was past its retention period, really calls this out.

    The point to me is, ILM would be keeping this info on certain storage types, whereas ILG addresses the disposition of this content. I can see there is value in both ILM and ILG, but as Eric pointed out above, ILG is driven more by the business than by IT.

    I look forward to more on this from you.

    Scott

    • In our environment we aren’t drawing a distinction between ILM and ILG. Rather, we are incorporating disposal into our model with the understanding that keeping everything forever (as we are doing now) is beginning to get pretty expensive. But outside of our backup solution which makes disposal easy via an expiration process, we haven’t identified a tool or suite of tools to help manage this disposal. We’ve found it pretty easy to classify and tier data. Learning how to throw stuff away, despite how badly we want to do it, looks like the really hard part.

      • Eric … you have hit the nail on the head. It is the hardest part. Keeping everything forever and simply moving it around the storage infrastructure isn’t very helpful, especially when you consider that everything you have is potentially discoverable. Defensibly disposing of what is no longer needed is the key reason for the word “governance”. Backup has its own pitfalls if used as a substitute for an archive, but that’s a whole separate issue. I won’t write my next ILG posting here, but you said it pretty well yourself.

  5. Craig, I read this a few days ago and certainly agree, especially that ‘storage ILM’ was always too complex, never got down to a well-defined prescriptive ‘what-to-do-when’ proscriptive policy level, and could be too costly. Am forwarding the Gartner report I pulled out at dinner at IOD for Craig Butler related to this (you may have seen it) and the URL to the whitepaper I wrote on ‘ECM & ILM’ in 2007, with RM being the higher-order ECM-based capability that truly illustrated the limitations of storage ILM.
