Top 10 ECM Pet Peeve Predictions for 2011

It’s that time of the year when all of the prognosticators, futurists and analysts break out the crystal balls and announce their predictions for the coming year.  Not wanting to miss the fun, I am taking a whack at it myself but with a slightly more irreverent approach … with a Top 10 of my own.  I hope this goes over as well as the last time I pontificated about the future with Crystal Ball Gazing … Enterprise Content Management 2020.

I don’t feel the need to cover all of the cool or obvious technology areas that my analyst friends would. A number of social media, mobile computing and cloud computing topics would be on any normal ECM predictions list for 2011. I do believe that social media, combined with mobile computing and delivered from the cloud, will forever change the way we interact with content, but this list is more of my own technology pet peeve list. I’ve decided to avoid those topics as there is plenty being written about all three already. I’ve also avoided all of the emerging fringe ECM technology topics such as video search, content recommendation engines, sentiment analysis and many more. There is plenty of time to write about those in the future. Getting this list to just 10 items wasn’t easy … I really wanted to write something more specific on how lousy most ECM meta data is, but decided to keep the list to these 10 items. As such, ECM meta data quality is on the cutting room floor. So without further ado … Craig’s Top 10 Pet Peeve Predictions for 2011:

 
Number 10:  Enterprise Search Results Will Still Suck
Despite a continuing increase in software sales and an overall growing market, many enterprises haven’t figured out that search is the ultimate garbage in, garbage out model. Most end-users are frustrated at their continued inability to find what they need when they need it. Just ask any room full of people. Too many organizations simply decide to index everything, thinking that’s all you need to do … bad idea. There is no magic pill here; search results will ultimately improve when organizations (1) eliminate the unnecessary junk that keeps cluttering up search results and (2) consistently classify information, based on good meta data, to improve findability. Ultimately, enterprise search deployments with custom relevance models can deliver high-quality results, but that’s a pipedream for most organizations today. The basics need to be done first and there is a lot of ignorance on this topic. Unfortunately, very little changes in 2011, but we can hope.
 
Number 9:  Meaning Based Technologies Are Not That Meaningful
Meaningful to whom? It’s the user, business or situation context that determines what is meaningful. Any vendor with a machine-based technology claiming it can figure out meaning without understanding the context of the situation is stretching the truth. Don’t be fooled by this brand of snake oil. Without the ability to customize to specific business and industry situations, these “meaning” based approaches don’t work … or are of limited value. Vendors currently making these claims will “tone down” their rhetoric in 2011 as the market becomes more educated and sophisticated on this topic. People will realize that the emperor has no clothes in 2011.
 
Number 8:  Intergalactic Content Federation Is Exposed As A Myth
The ability to federate every ECM repository for every use case is wishful thinking. Federation works very well when trying to access, identify, extract and re-use content for applications like search, content analytics, or LOB application access. It works poorly or inconsistently when trying to directly control content in foreign repositories for records management and especially eDiscovery. There are too many technology hurdles, such as security models, administrator access, lack of API support and incompatible data models, that make this very hard. For use cases like eDiscovery, many repositories don’t even support placing a legal hold. Trying to do unlimited full records federation or managing enterprise legal holds in place isn’t realistic yet … and may never be. It works well in certain situations only. I suppose all of this could be solved with enough time and money, but you could say that about anything – it’s simply not practical to try to use content federation for every conceivable use case, and that won’t change in 2011. This is another reason why we need the Content Management Interoperability Services (CMIS) standard.
 
Number 7:  CMIS Adoption Grows, Will Be Demanded From All Content, Discovery and Archive Vendors
Good segue, huh? If federation is the right approach (it is), but current technology prevents it from becoming a reality, then we need a standard we can all invest in and rely on. CMIS already has significant market momentum and adoption. Originally introduced and sponsored by IBM, EMC, Alfresco, OpenText, SAP and Oracle, it is now an OASIS standard, and the list of members has expanded to many other vendors. IBM is already shipping CMIS-enabled solutions and repositories, as are many others. However, some vendors still need encouragement. None of the archiving or eDiscovery point solution vendors have announced support for CMIS yet. I expect to see market pressure in 2011 on any content-related vendor not supporting CMIS … so get ready Autonomy, Symantec, Guidance Software and the rest. The days of closed proprietary interfaces are over.
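Part of CMIS’s appeal is that the same client code works against any compliant repository. Here is a minimal sketch, assuming Apache Chemistry’s cmislib client for Python; the service URL, credentials and query are made-up placeholders, and the exact client calls will vary a bit by toolkit and binding.

```python
# A minimal, vendor-neutral CMIS sketch using Apache Chemistry's cmislib
# (pip install cmislib). The service URL, credentials and query below are
# hypothetical placeholders, not a real deployment.
from cmislib import CmisClient

client = CmisClient('http://ecm.example.com/cmis/atom', 'cmisuser', 'secret')
repo = client.defaultRepository

# The same CMIS query runs unchanged against any compliant repository,
# whether that repository is FileNet, Documentum, Alfresco or something else.
results = repo.query(
    "SELECT cmis:name, cmis:objectId FROM cmis:document "
    "WHERE cmis:lastModificationDate > TIMESTAMP '2010-01-01T00:00:00.000Z'"
)
for doc in results:
    print(doc.getName())
```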
 
Number 6:  ACM Blows Up BPM (in a good way)
Advanced Case Management will forever change the way we build, deploy and interact with process- and content-centric (or workflow, if you are stuck in the ’90s) applications. Whether you call it Advanced Case Management, Adaptive Case Management or something else, it’s only a matter of time before the old “wait months for your application” model is dead. Applications will be deployed in days and customized in hours or even minutes. IT and business will have a shared success model in the adoption and use of these applications. This one is a no-brainer. ACM takes off in a big way in 2011.
 
Number 5:  Viral ECM Technologies without Adequate Governance Models Get Squeezed
In general, convenience seems to trump governance, but not this year. The viral deployment model is both a blessing and a curse. IT needs to play a stronger role in governing how these collaborative sites get deployed, used and eventually decommissioned. There is far too much cost associated with eDiscovery and the inability to produce documents when needed for this not to happen. There are way too many unknown collaborative sites containing important documents and records, and many of them have been abandoned, driving up infrastructure costs and risk. The headaches associated with viral deployments force IT to put its foot down in 2011, and the lack of governance around these viral collaborative sites becomes a major blocker to their deployment.
 
Number 4:  Scalable and Trusted Content Repositories Become Essential
Despite my criticism of AIIM’s labeling of the “Systems of Engagement” concept in my last blog, they’ve nailed the basic idea. “Systems or Repositories of Record” will be recognized as essential starting in 2011. We expect 44-fold growth of information over the next 10 years, with 85% of it unstructured … yikes! We’re going to need professional, highly scalable, trusted, defensible repositories of record to support the expected volume and governance requirements, especially as ECM applications embrace content outside the firewall. Check out my two postings earlier this year on Trusted Content Repositories for more on this topic (Learning How To Trust … and Step 1 – Can You Trust Your Repository?).
 
Number 3:  Classification Technology Is Recognized As Superior To Human Based Approaches
For years, I’ve listened to many, many debates on human classification versus machine-based classification. Information is growing so out of control that it’s simply not possible to even read it all … much less decide how it should be classified and actually do it correctly. The facts are simple: studies show humans are 92% accurate at best. The problem is that humans opt out sometimes. We get busy, get sick, have to go home or simply refuse to do certain things. When it comes to classification, we participate about 33% of the time on average. Overall, that makes our effective accuracy more like 30%, not 92%. Technology-based approaches that use context have consistently hit 70-80% over the years, and recently we’ve seen accuracy levels as high as 98.7%. Technology approaches cost less too. 2011 is the year of auto-classification.
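The arithmetic behind that 30% figure is worth spelling out, since it surprises people. Here is a quick back-of-the-envelope check using only the percentages quoted above; the 100% machine participation rate is my assumption, since software classifies whatever it is pointed at.

```python
# Effective accuracy = accuracy when you classify * how often you actually classify.
# The 92%, 33% and 70-80% figures come from the paragraph above; the 100%
# machine participation rate is an assumption for the comparison.
human_accuracy = 0.92
human_participation = 0.33
print(f"Human effective accuracy:   {human_accuracy * human_participation:.0%}")    # ~30%

machine_accuracy = 0.75        # mid-point of the 70-80% range cited above
machine_participation = 1.00   # software doesn't get busy, sick or opt out
print(f"Machine effective accuracy: {machine_accuracy * machine_participation:.0%}")  # 75%
```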
 
Number 2:  Business Intelligence Wakes Up – The Other 85% Does Matter
It’s a well known fact that ~85% of the information being stored today is unstructured. Most BI or data warehouse deployments focus on structured data (or only 15% of the available information to analyze). What about the rest of it? The explosion of content analysis tools over the last few years has made the 85% more understandable and easier to analyze than ever before, and that will continue into 2011. BI, data warehouse and analytics solutions will increasingly include all forms of enterprise content, whether inside or outside the firewall.
 
Number 1:  IT Waste Management Becomes a Top Priority
The keep-everything-forever model has failed. Too many digital dumpsters litter the enterprise. It’s estimated that over 90% of the information being stored today is duplicated at least once and 70% is already past its retention date. It turns out buying more storage isn’t cheaper once you add in the management staff, admin costs, training, power and so forth. One customer told me they’d have to build a new data center every 18 months just to keep storing everything. In 2011, I expect every organization to more aggressively start assessing and decommissioning unnecessary content as well as the associated systems. The new model is keep what you need to keep … for only as long as you need to keep it based on value and/or obligation … and defensibly dispose of the rest.
 
I hope you enjoyed reading this as much as I enjoyed writing it.  I hope you agree with me on most of these.  If not, let me know where you think I am wrong or list a few predictions or technology pet peeves of your own.  
 

It’s Back to the Future, Not Crossing the Chasm When it Comes to AIIM’s “Systems of Record” and “Systems of Engagement”

Pardon the interruption from the recent Information Lifecycle Governance theme of my postings, but I felt the need to comment on this topic. I even had to break out my flux capacitor for this posting, because I was certain I had seen this before.

Recently at the ARMA Conference and currently in the AIIM Community at large, there is a flood of panels, webinars, blog postings and tweets on a “new” idea from Geoffrey Moore (noted author and futurist) differentiating “Systems of Record” from “Systems of Engagement.” This idea results from a project at AIIM where Geoffrey Moore was hired as a consultant to give the ECM industry a new identity among other things. One of the drivers of the project has been the emergence and impact of social media on ECM. The new viewpoint being advocated is that there is a new and revolutionary wave of spending emerging on “Systems of Engagement” – a wave focused directly on knowledge worker effectiveness and productivity.

Let me start by saying that I am in full agreement with the premise behind the idea that there are separate “Systems of Record” and “Systems of Engagement.” I am also a big fan of Geoffrey Moore. I’ve read most of his books and have drunk the Chasm, Bowling Alley, Tornado and Gorilla flavors of his Kool-Aid. In fact, Crossing the Chasm is mandatory reading on my staff.

Most of the work from the AIIM project involving Moore has been forward thinking, logical and on target. However, this particular outcome does not sit well with me. My issue isn’t whether Moore and AIIM are right or wrong (they are right). My issue is that this concept isn’t a new idea. At best, Geoffrey has come up with a clever new label. The concept of “System of Record” is nothing new and a “System of Engagement” is a catchy way of referring to those social media systems that make it easier to create, use, and interact with content.

Here is where AIIM and Moore are missing the point. Social media is just the most recent “System of Engagement,” not the first. Like the engagement systems that came before it, it is not capable of also being a “System of Record” … so we need both … we’ve always needed both. It’s been this way for years. Apparently, though, we needed a new label, as everyone seems to have jumped on the bandwagon except me.

Let me point out some of the other “Systems of Engagement” over the years. For years, we’ve all been using something called Lotus Notes and/or Microsoft Exchange as a primary system to engage with our inner and outer worlds. This engagement format is called email … you may have heard of it. Kidding aside, we use email socially and always have. We use email to engage with others. We use email as a substitute for content management. Ever send an email confirming a lunch date? Ever communicate project details in the body of an email? Ever keep your documents in your email system as attachments so you know where they are? You get the idea. Email is not exactly a newfangled idea and no one can claim these same email systems also serve any legitimate record keeping purpose. There is enough case law and standards to fill a warehouse on that point (pardon the paper pun). More recently, instant messaging has even supplanted email for some of those same purposes especially as a way to quickly engage and collaborate to resolve issues. No one is confused about the purpose of instant messaging systems. It can even be argued that certain structured business systems like SAP are used in the same model when coupled with ECM to manage key business processes such as accounts payable. The point being, you engage in one place and keep records or content in another place. Use the tool best suited to the purpose.

Using technology like email and instant messaging to engage with, collaborate and communicate on content related topics with people is not a new idea. Social media is just the next thing in the same model. On one hand, giving social media and collaboration systems a proper label is a good thing. On the other hand, give me a break … any Records Manager doing electronic records embraced the concept of “record making applications” and “record keeping systems” a long time ago. It’s a long standing proven model for managing information. Let’s call it what it is.

I applaud AIIM and Moore for putting this idea out there, but I also think they have both missed the mark. “Systems of Engagement” is a bigger, different and proven idea than the way both are currently talking about it. Maybe I am a Luddite, but this seems to me like a proven idea that simply got a fresh coat of paint.

As AIIM and Moore use words like “revolution” and “profound implications” in their promotional materials, I think I’ll break out my Back to the Future DVD and stay a little more grounded. Like a beloved old movie, I am still a fan of both Moore and AIIM. However, I recommend you see this particular movie for yourself and try to separate the hype from the idea itself. If you do, let me know whether you agree … is this an original idea or simply a movie sequel?

Why Information Lifecycle Management (ILM) Failed But Needs an Updated Look

If you know me, you know I advocate something called Information Lifecycle Governance (ILG) as the proper model for managing information over its lifespan. I was reminded recently (at IOD), during a conversation with Sheila Childs, a top Gartner analyst in this subject area, of a running dialogue we have on the differences between governance at the storage layer and using records management and retention models as an alternative approach. This got me thinking about the origins of the ILG model, and I decided to take a trip in the “way-back” machine for this posting.

 

According to Wikipedia as of this writing, Information Lifecycle Management refers to a wide-ranging set of strategies for administering storage systems on computing devices. Searchstorage.com (an online storage magazine) offers the following explanation: Information life cycle management (ILM) is a comprehensive approach to managing the flow of an information system’s data and associated metadata from creation and initial storage to the time when it becomes obsolete and is deleted. Unlike earlier approaches to data storage management, ILM involves all aspects of dealing with data, starting with user practices, rather than just automating storage procedures, as, for example, hierarchical storage management (HSM) does. Also in contrast to older systems, ILM enables more complex criteria for storage management than data age and frequency of access. ILM products automate the processes involved, typically organizing data into separate tiers according to specified policies, and automating data migration from one tier to another based on those criteria. As a rule, newer data, and data that must be accessed more frequently, is stored on faster, but more expensive storage media, while less critical data is stored on cheaper, but slower media. However, the ILM approach recognizes that the importance of any data does not rely solely on its age or how often it’s accessed. Users can specify different policies for data that declines in value at different rates or that retains its value throughout its life span. A path management application, either as a component of ILM software or working in conjunction with it, makes it possible to retrieve any data stored by keeping track of where everything is in the storage cycle.

If you were able to get all the way through that (I had to read it three times), you probably concluded that (1) it was way too complicated, (2) it was very storage-centric and likely too costly, and (3) it was incomplete. These are all reasons why this concept never took hold and is widely considered a failed concept.
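To see concretely just how narrow that model is, here is a toy sketch of the kind of age-driven tiering policy the definition describes. The tier names, thresholds and example path are invented for illustration, and notice that it considers nothing but file age … which is exactly the incompleteness I get to below.

```python
# Toy sketch of the age/frequency-driven tiering described in the ILM
# definition above. Tier names, thresholds and paths are invented for
# illustration; real ILM products drive this from storage-layer policies.
import os
import time

TIERS = [
    ("fast_disk", 30),      # keep on fast, expensive media for ~30 days
    ("cheap_disk", 365),    # then demote to cheaper disk for up to a year
    ("tape_archive", None), # then off to tape (or equivalent) indefinitely
]

def pick_tier(path, now=None):
    """Choose a storage tier based only on how recently the file was modified."""
    now = now or time.time()
    age_days = (now - os.path.getmtime(path)) / 86400
    for tier, max_age_days in TIERS:
        if max_age_days is None or age_days <= max_age_days:
            return tier

# Example (hypothetical file):
# print(pick_tier("/shares/finance/q3_forecast.xls"))
```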

But hold on … let’s not throw the baby out with the bath water quite yet.  The underlying idea is sound but needs modification.  In my opinion, here is what was wrong with the notion of ILM when it came to prominence in 2002 or so:

It’s incomplete:  Frequency of access does not determine the usefulness of information.  Any set of policies needs to include the value of the information to the business itself and the legal and regulatory obligations.  Only calculating how recently files were accessed and used is an incomplete approach.  Wouldn’t it make sense to understand all of the relevant facets of information value (and obligations) along with frequency of access?

It’s inefficient and leads to error:  Managing policies at the device level is a bad idea.  As an example, many storage devices require setting the retention policy at the device itself.  This seems crazy to me as a general principle.  Laws and obligations change, policies change, humans make errors … all of which leads to a very manual, time-consuming and error-prone policy administration process.  Wouldn’t a centrally managed policy layer make more sense?

It’s not well understood and can be too costly:  This model has led to the overbuying of storage.  Many organizations have purchased protected storage when it was not necessary.  These devices are referred to as NENR (Non-Erasable, Non-Rewritable) or WORM (Write Once, Read Many).  They come in multiple flavors:  WORM optical, WORM tape and magnetic disk WORM (subsystem), and can include multiple disks with tiered tape support.  Sample products include EMC Centera, Hitachi HCAP, IBM DR550, NetApp SnapLock and IBM Information Archive.  This class of storage costs more than other forms of storage, primarily because of the perception of safety.  Certain storage vendors (who will remain nameless) have latched onto this market confusion and even today try to “oversell” storage devices as a substitute for good governance, often to uninformed or ill-advised buyers.  The fact is, only the SEC 17a-4 regulation requires WORM storage.  Using WORM for applications other than SEC 17a-4 usually means you are paying too much for storage and creating retention conflicts (more on this in a future posting).  The point is … only buy protected storage when appropriate to your requirements or obligations.

If we could just fix those issues, is the ILM concept worth re-visiting?  It’s really not that hard of a concept.  Over 90% of information today is born digital, and over 95% of it eventually expires and needs to be disposed of.  Here is a simple concept to consider:

A simple model for governing information over its lifespan

I will go deeper into this concept (and model) in my next posting.  In the meantime, leave me your thoughts on the topic.

I am also curious to know if you have been approached by an overly zealous vendor trying to sell you WORM based storage as a replacement for good governance or records management.  I will publish the results.

IBM Acquires PSS Systems – You Might Be Asking Why?

In case you missed it, IBM announced today the acquisition of PSS Systems.

You might be asking why?  Organizations are striving for rigorous discovery, more effective information retention, and legally defensible data disposal because of rising eDiscovery pressures and exponential information growth.  According to InformationWeek, a whopping 17% – and rising – of organizations’ IT budgets is now spent on storage.  A new Compliance, Governance and Oversight Council (CGOC) benchmark report on information governance revealed that fewer than 25% of organizations are able to dispose of data because they lack rigorous legal hold practices or effective record retention programs.  eDiscovery costs average over $3 million per case, yet an estimated 70% of information is needlessly retained; as with IT costs, the root cause of escalating eDiscovery cost is the inability to dispose of information when it is no longer needed.

Organizations struggle with these issues.  What has been missing until now are two things: 1) a way to coordinate policy decisions for legal hold and retention management across stakeholders; and 2) a way to systematically execute those policy decisions on high volumes of information that often reside in disparate systems.  To effectively determine what is eligible for disposal, organizations must determine the legal obligations and specific business value of information and associate them with information assets.  With multiple stakeholders, litigation intensity and information diversity across the enterprise, it is essential to coordinate and formalize policy decisions in real time, as they are made by legal, records and business groups, and to automate the execution of those policies on information across the enterprise.

These problems are of high importance to legal and IT executives; 57% have established executive committees to drive better legal and lifecycle governance outcomes, but fewer than one-third of organizations have achieved the desired cost and risk reduction results.

Organizations lack sufficient internal competency or resources to quantify the cost and risk business case and to define the program structures necessary to achieve their defensible disposal goals.  While 98% of organizations cite defensible disposal as the result they are seeking, only 17% believe they have the right people at the table.  Analysts predict that the market for these kinds of governance solutions will experience significant growth through 2014; they also point out that internal cooperation and competencies are barriers today.

Now with the acquisition of PSS Systems, only IBM provides a comprehensive and integrated enterprise solution for legal and information lifecycle governance, along with the business expertise that customers need to reduce legal risk and lower discovery, information and content management costs. The PSS Atlas legal information governance solutions complement and extend IBM’s existing Information Lifecycle Governance strategy and integrated suite of solutions.  This joint solution and approach is unlike others that address only a single silo, such as legal, which fail to systematically link legal decisions to corresponding information assets and therefore don’t fully mitigate risk, or actually increase the cost of compliance.

Until now, organizations’ choices were limited and reinforced their problems by failing to systematically link legal obligations and business value to information assets. Often, the initial selection of tactical eDiscovery applications left organizations with high risk, high compliance cost and no path forward to defensible disposal, because these tactical applications don’t integrate holistically with records and retention management, email archiving, advanced classification and enterprise content management systems and infrastructure.

Those days are over!  If you can’t tell … I am excited about the future of how we plan to help customers tackle these problems in concert with our new colleagues from PSS Systems.

Adding Storage or Enforcing Retention: The Debate is Over

I did a joint webcast this week with InformationWeek on strategies to deal with information overload (which made me feel guilty about my recent lull in blogging).  On the webcast we conducted a quick poll and I was fascinated by the results.  The poll consisted of two questions:

The first question was …

What is your organization’s current, primary strategy for dealing with its information overload?

The choices and audience responses were:

  1. Adding more storage  35.2%
  2. Developing new enterprise retention policies to address information growth  29.6%
  3. Enforcing enterprise retention policies more vigorously  9.3%
  4. Don’t know  25.9%

The second question was the same, except asked in the future tense:

What is your organization’s future, primary strategy for dealing with its information overload?

It had the same choices but far different audience responses:

  1. Adding more storage  19.1%
  2. Developing new enterprise retention policies to address information growth  29.8%
  3. Enforcing enterprise retention policies more vigorously  25.5%
  4. Don’t know  25.5%

Holy smokes, Batman! … I think we are coming out of the dark ages.  Keep in mind that InformationWeek serves an IT-centric audience and generally not the RIM or Legal stakeholders who are already passionate about retention and disposition of records and information.  From this survey data, I concluded the following about this IT-centric audience:

  • 29.6% are already developing retention policies today, in addition to those that already have them – this is progress.
  • Adding storage as a primary strategy will decrease from 35.2% to 19.1% – this is amazing … and may be the first time “adding storage” wasn’t the automatic answer.
  • Enforcing retention as a primary strategy will increase from 9.3% to 25.5% – IT professionals clearly understand that enforcing retention is “the” answer to controlling information growth; see Spring Cleaning for Information and How Long Do I Keep Information?
  • 55.3% will develop or enforce retention policies as a primary strategy in the future – nearly three times as many as will rely on adding storage.
  • Developing and enforcing retention policies is now the clear choice for a primary strategy to address information overload and growth over simply adding storage.

This isn’t the only data that supports this of course.  According to Osterman Research, 70% of organizations share the same concern.  A number of related resources can be found at http://tinyurl.com/2fayjwf including a webinar from Osterman and others.

Here is the replay link to the information overload webinar Content Assessment: The Critical First Steps to Gaining Control that serves as the backdrop for this posting … I hope you check it out.

In any case, rejoice with me … Ding Dong, the Witch is Dead!

Developing and enforcing retention policies, not simply adding storage, is now the clear primary strategy across all stakeholders … IT, Legal and RIM.  Are you seeing the same change in thought and action in your organization?  Let me know by sharing your thoughts.

How Long Do I Keep Information?

In case you are wondering how the garage cleaning went last weekend (see my last posting) … I filled several trash bags and boxes with items to donate and several more for the trash.  In order to ensure I was only disposing of unnecessary stuff and not valued items, I secured the approval of my family stakeholders before disposing of anything.  The results were fantastic … I cleared several shelves worth of storage space, which allowed me to reorganize for better findability.  I now have plenty of room to store more items, and everything is properly organized so I can find things in the future, including the lost flashlight, which is no longer lost.  Best of all, it didn’t cost me anything except a little time.

It’s exactly the same with information.  Like the unnecessary stuff I was keeping in my garage, information has a useful lifespan that ultimately requires disposition.  In simple terms, information is created, used, stored and should ultimately be disposed of.  It should be obvious from my previous posting why information disposal is probably the most important step in this “information lifecycle”.

Many people get confused by this notion though.  The confusion comes in when deciding how long (and why) they need to keep things.  There are two primary schools of thought on this:

  • Keep information based on how often it is used or accessed – the frequency of access model … or …
  • Keep information based on actual value or obligation – the business value (and obligation) model.

The frequency-of-access paradigm gave us the term “information lifecycle management” or “ILM” a couple of years ago.  This was a vendor-driven idea that moved information between storage tiers based on frequency of access.  It never really caught on, as it didn’t address the core issues, especially the disposal of information.  It’s an interesting concept if your motive is to sell storage.  Moving information around to optimize storage infrastructure is a good idea, but only part of the answer.  Business need, relevance and usage, combined with regulatory and legal obligations, truly determine how long information must be managed, retained and governed.
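To make the contrast concrete, here is a hedged sketch of the value-and-obligation model: the retention period is the longer of the business-value period and any legal or regulatory obligation, and nothing is disposed of while a legal hold applies. The record classes and periods below are purely illustrative; real schedules come from your legal and RIM stakeholders.

```python
# Sketch of the business-value/obligation retention model described above.
# Record classes, retention periods and the hold flag are illustrative
# assumptions, not a recommended retention schedule.
from datetime import date, timedelta

RETENTION_SCHEDULE = {
    # record class: (business-value retention, legal/regulatory obligation), in days
    "invoice":         (3 * 365, 7 * 365),
    "project_plan":    (2 * 365, 0),
    "marketing_draft": (180, 0),
}

def disposal_date(record_class, created, on_legal_hold=False):
    """Earliest date the item may be defensibly disposed of, or None if on hold."""
    if on_legal_hold:
        return None  # never dispose while a legal hold is in place
    business_days, obligation_days = RETENTION_SCHEDULE[record_class]
    return created + timedelta(days=max(business_days, obligation_days))

print(disposal_date("invoice", date(2010, 6, 1)))          # obligation (7 years) wins
print(disposal_date("marketing_draft", date(2010, 6, 1)))  # business value (180 days) wins
```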

In simple terms, we should keep information because it is an asset (business value) and/or because we have an obligation to do so (legal and regulatory).  Debra Logan (Vice President at Gartner) has been publishing excellent research on this topic.  Best practices exist as well.  The new Information Management Reference Model (IMRM), from the same organization that gave us The Electronic Discovery Reference Model (EDRM), aligns the key stakeholders (IT, Business and RIM/Legal) with the key issues (asset, value and duty) and the key benefits (efficiency, profit and reduced risk).  There are a number of other approaches as well, notably The Generally Accepted Recordkeeping Principles (GARP) from ARMA.

Best of all, optimizing systems and storage infrastructure based on business context of usage, not just frequency of access, is much easier to do when things are properly organized (classified) based on actual need/value.

In summary, the business value of information changes over time, requiring Information Lifecycle Governance and, eventually, defensible disposition (more on that next time).  I hope you manage and govern your information based on business value and your obligations.  If not, check out some of the links above to get started.  I also hope your information spring cleaning is coming along as well as my garage is.  I am so motivated by my results that the attic is next for me.

Spring Cleaning for Information

I find myself wondering (as I plan to clean out the garage today) what time of year we’re supposed to throw out all that unnecessary information we keep around.  Since cleaning out the garage doesn’t qualify as fun in my book, it would sure be easier just to add space to my garage.  That way, I’d never have to throw anything away.  It would cost a lot … and make it much harder to find important stuff among all of the clutter, but it would be easier.  Maybe I should just call a contractor (5 minutes) rather than actually clean out the garage (at least an hour or more).  Hhhmmm …

It’s funny that when it comes to this aspect of information management, we seem to always take the path of least resistance.  I’ve lost count of how many times I’ve heard “storage is cheap” or other reasons why organizations don’t properly manage the lifespan of their information.  Most organizations don’t have a responsible program to properly dispose of electronically stored information.  How is this possible when those same organizations usually have good control over, and properly dispose of, paper-based information?

Sure, it’s harder to properly organize, retain and dispose of electronically stored information, but the keep-everything-forever model has failed.  Buying more storage is not the answer.  Storage already consumes (on average) 17% of IT budgets, and information will continue to explode … eventually gobbling up increasing percentages of IT budgets.  When does it end?  It won’t by itself.  Left unattended, this information explosion will eventually consume all remaining IT budget dollars and cripple or prevent any strategic investments by IT.

If that weren’t sobering enough, valued information is already buried beneath too much unnecessary information.  Much of it is over-retained, irrelevant and duplicated.  This is causing runaway storage and infrastructure costs and exacerbating power, space and budget challenges.  It’s also creating an inability to find and produce critical information, especially under punitive scenarios and deadlines.  How can anyone find and leverage the useful and trustworthy information lost among all the junk?

This sounds exactly like my garage … the power went out the other night and I was desperate to find that really cool flashlight I bought last year in case of a power outage.  Couldn’t find it, which ended up being my motivation to clean out the garage and throw out all of the unnecessary stuff that is piling up.  No garage extension for me!  No offsite storage facility either!  The fact is, I don’t want to spend more money on simply storing random unnecessary stuff.  I have higher value activities to spend my budget on … like golf 🙂

Isn’t it time every organization did their own information spring cleaning?  It would reduce storage/infrastructure costs, improve findability of information, reduce legal risks and increase usefulness and re-use of information.

Maybe you are already planning to clean out your garage of enterprise information.  Leave me your thoughts on the topic or visit us at the upcoming National Conference on Managing Electronic Records in Chicago.  We’ll be doing a special session on Content Assessment and how to use Content Analytics to identify and defensibly decommission and routinely dispose of unnecessary information.

Creating and Managing Trusted Content

I remember that when I was in elementary school (don’t laugh), my best friend Danny tried to change one of the grades on his quarterly report card.  We used to walk home from school together, and on this day we stopped at the corner drug store where he bought some office supplies and went about “altering” his report card.  Ahhh … the things you think you can get away with in 5th grade … so foolish.  It was a great plan for Danny right up until the point his Mom spotted the obvious change.  Needless to say, Danny’s report cards could no longer be trusted as an accurate representation of his school performance.  It completely backfired, and his report cards got more scrutiny than he could have ever wanted, all the way through high school.  I think he makes fake passports today (kidding).  He actually works for a large financial institution (not kidding).

This one incident made Danny’s parents suspicious of the entire school grade reporting process, and they never trusted report cards again.  He ruined it for his younger sister too.  It’s the same with documents.  We need a better process (and technology) to ensure our documents and records can be trusted for business decision making.  The implications in business are far more catastrophic.

Consider the large distributor who has multiple versions of contracts and supplier agreements.  The business fails to reference the correct version of a contract addendum that materially changes key terms and conditions between the parties.  This results in a dispute and has the trickle-down effect of disrupting shipments, which leads to customer complaints and cancelled orders … all because someone used the wrong content.  In short, it’s paramount to have trust in our content.

Here are three strategies you can take to bring trust to your content:

Clean up the backlog … assess and separate trusted content from suspect content.  Decommission and dispose of what is not necessary to keep.  Preserve and exploit the trusted content in your trusted content repositories (discussed in a previous posting).

Instrument ad-hoc and controlled document creation and approval processes … establish event and process based steps (or KPIs) to measure, trigger, review and monitor the accuracy of content that is designated as trusted.

Enhance meta data and leverage master data … clean up dirty document meta data and reference trusted data sources within the enterprise.  Ensure an accurate, 360-degree view of all information assets and meta data.
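To make that third strategy a little less abstract, here is a minimal sketch of normalizing one dirty meta data field against a master data source; the field name, master values and matching rule are all invented for illustration.

```python
# Minimal sketch of cleaning document meta data against master data.
# The field name, master supplier list and matching rule are invented
# for illustration only.
MASTER_SUPPLIERS = {"ACME CORPORATION", "GLOBEX INC", "INITECH LLC"}

def normalize_supplier(raw_value):
    """Map a free-text supplier field onto the trusted master record, if possible."""
    cleaned = " ".join(raw_value.strip().upper().split())
    for master in MASTER_SUPPLIERS:
        if cleaned == master or cleaned in master or master in cleaned:
            return master, True    # trusted: matches master data
    return cleaned, False          # suspect: flag for manual review

doc_metadata = {"supplier": "  acme corporation ", "contract_id": "C-1138"}
doc_metadata["supplier"], trusted = normalize_supplier(doc_metadata["supplier"])
print(doc_metadata, "trusted" if trusted else "needs review")
```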

Obviously there are a number of ways to improve content quality and document-based decision making.  The trick is … how to do it without burdening the business users.  Manual methods are thought to be easy but always fail, as human beings are inconsistent, sometimes inaccurate and can refuse to cooperate.  In some rare cases … humans take matters into their own hands.  Don’t take a “Danny” approach to trusting your content.

Choose one or more of the above paths and increase the accuracy of content based decisions in your organization.  If you don’t, I may have to send Danny’s Mom out to have a talk with you.

Learning How To Trust …

Before my digression last posting into a perspective on ECM systems integrators … I was describing the characteristics of trusted ECM repositories (see Step 1 – Can You Trust Your Repository?).  Picking up from there …

Since choosing the right repository or content storage location is so important, how can we objectively evaluate the repositories we have?  Use this scoring model to assess and designate your content storage options (including ECM repositories) as Trusted Content Repositories (TCRs):

 

Level 0 – Missing key capabilities like security, basic content services and APIs.  This category represents file shares, CDs and other relatively unsecure locations.  These environments are flexible and useful, but the missing capabilities cause us to lose confidence (or trust) in the content we keep there.  Imagine building an application that delivers critical documents only to have an end-user delete the underlying files.

Level 1 – Missing key capabilities like repository governance and lineage.  This category represents SharePoint, wikis, blogs and other environments with user-controlled governance.  These environments are fantastic for collaboration and easy to deploy, but they are missing essential capabilities when the environment itself can’t be properly governed and secured in accordance with IT standards (including the ability to meet SLAs).  Imagine building an application that depends on critical documents only to have an end-user retire the SharePoint site that used to contain the needed documents or records.

Level 2 – Missing a few key capabilities to instrument and automate workflows, like event management and content federation.  This category represents most ECM repositories from major vendors like IBM, EMC, OpenText and selected others.  These missing capabilities (event management and content federation) are what enable us to be confident that the right documents are designated as “trusted” so they can be found, automated and consumed with confidence.

Level 3 – Has all of the key capabilities.  This is the optimal level for trusted content applications.  Only IBM FileNet P8 has all of these characteristics today.
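If you want to turn these four levels into a repeatable assessment, a simple capability checklist works. Here is an illustrative sketch where the capability names mirror the level descriptions above; the sample repositories and their capability sets are assumptions you would replace with your own evaluation.

```python
# Illustrative scoring sketch for the Trusted Content Repository levels above.
# Capability names mirror the level descriptions; the sample repositories and
# their capability sets are assumptions for the example.
LEVEL_REQUIREMENTS = [
    (1, {"security", "basic_content_services", "apis"}),
    (2, {"repository_governance", "lineage"}),
    (3, {"event_management", "content_federation"}),
]

def tcr_level(capabilities):
    """Return the highest level whose cumulative requirements are all met."""
    level, required = 0, set()
    for lvl, needed in LEVEL_REQUIREMENTS:
        required |= needed
        if required <= capabilities:
            level = lvl
        else:
            break
    return level

file_share = {"apis"}
collab_site = {"security", "basic_content_services", "apis"}
print(tcr_level(file_share))   # 0 - missing security and basic content services
print(tcr_level(collab_site))  # 1 - collaborative but ungoverned, so not level 2
```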

Remember … if you can’t trust your repository, you can’t trust what is in it, can you?  Critical content must be stored in Trusted Content Repositories … it’s that simple.  Next time we’ll explore what it takes to create and maintain trusted content.  In the meantime, leave me your feedback on the model.

Step 1 – Can You Trust Your Repository?

The 4 steps to enable ECM to participate in information governance start with choosing the right repository.  In short, Trusted Information needs to reside in trusted environments.  In the case of content, this means Trusted Content Repositories.  By definition, if you can’t trust the environment, you can’t trust the information itself.  Imagine building an application that delivers critical information (including documents or images) only to have an end-user delete the underlying files.  This can easily happen if your reference documents live on file systems or in improperly governed environments.  Unlike structured applications and databases, users have most of the control over content storage environments.  Imagine an end-user decommissioning a SharePoint team room only to have applications “break” because they need content that was residing in the now-missing environment.  This can happen when important content is stored on file systems, wikis and other systems with inadequate governance and security controls.  Critical content must be stored in Trusted Content Repositories.

Some Key Actions to Take:

  • Evaluate your candidate content repositories to determine viability for use as a repository of record or Trusted Content Repository (TCR)
  • Designate Trusted Content Repositories for use in essential applications and only store critical content in TCRs.
  • Update operational practices to increase confidence and assurance of trusted information including a Trusted Content strategy supported by TCRs.

But how do we define a Trusted Content Repository (TCR)?  The following are characteristics of TCRs:

Performance, Scalability and HA/DR

  • Support for billions of objects and thousands of users across the enterprise.
  • Support for SLA levels of disaster recovery and business continuity.

Preservation and Lineage Capabilities

  • Confident and assured immutability of content, structure, lineage and context over time.

Interoperability and Extensibility

  • Support for industry leading RDBMSs, application servers and operating systems.
  • Open and robust APIs including support for CMIS.

Content Capabilities

  • Basic ECM capabilities including versioning, meta data management, classification, content based retrieval, content transformation, etc.

Repository Governance

  • Deployments can be managed and controlled to protect against information supply chain breakdowns.

Information Lifecycle Governance

  • Support for all lifecycle events and processes including eDiscovery and records disposition.

Security, Access and Monitoring

  • Controls to promote access to authorized users and controls to prevent unauthorized access.
  • Auditing and monitoring for all related activities.

Physical Capabilities

  • Ability to support references to physical objects and entities.

Federation and Replication

  • Federation capabilities to provide a common meta data catalog across multiple repositories.

Business Process Management

  • Integrated business process management.

Events Based Architecture

  • Internal and external event support with trigger and subscription model.

All of these capabilities are required to enable content to participate in meeting Information Governance requirements.  Next we’ll explore what it takes to create and maintain trusted content.  In the meantime, do you agree with these characteristics?