Playing The Healthcare Analytics Shell Game

When I think of how most healthcare organizations are analyzing their clinical data today … I get a mental picture of the old Depression-era shell game – one that takes place in the shadows and back alleys. For many who were down and out, those games were their only means of survival.

The shell game (also known as Thimblerig) is a game of chance. It requires three walnut shells (or thimbles, plastic cups, whatever) and a small round ball, about the size of a pea, or even an actual pea. It is played on almost any flat surface. This conjures images of Depression-era men huddled together … each hoping to win some money to buy food … or support their vices. Can you imagine playing a shell game just to win some money so you could afford to eat? A bit dramatic, I know – but not too far off the mark.

The person perpetrating the game (called the thimblerigger, operator, or shell man) started the game by putting the pea under one of the shells. The shells were quickly shuffled or slid around to confuse and mislead the players as to which shell the pea was actually under … and the betting ensued. We now know that the games were usually rigged. Many people were conned and never had a chance to win at all. The pea was often palmed or hidden, and not under any of the shells … in other words, there were no winners.

Many healthcare analytics systems and projects are exactly like that today – lots of players and no pea. The main component needed to win (or gain the key insight) is missing. The “pea” … in this case, is unstructured data. And while it’s not a con game, finding the pea is the key to success … and can literally be the difference between life and death. Making medical decisions about a patient’s health is pretty important stuff. I want my caregivers using all of the available and relevant information (medical evidence) as part of my care.

In healthcare today, most analytics initiatives and research efforts are done using structured data alone (which represents just 20% of the available data). I am not kidding.

This is like betting on a shell game played without the pea – it’s not possible to win and you are just wasting your money. In healthcare, the critical clinical information (the pea) is trapped in unstructured data: free text, images, recordings and other forms of content. Nurses’ notes, lab results and discharge summaries are just a few examples of unstructured information that should be analyzed but in most cases … are not.

The reasons for not doing this used to be … it’s too hard, too complicated, too costly, not good enough, or some combination of the above. This was a show stopper for many.

Well guess what … those days are over. The technology needed to do this is available today and the reasons for inaction no longer apply.

In fact – this is now a healthcare imperative! Consider that over 80% of information is unstructured. Why would you even want to do analysis on only 1/5th of your available information?

I’ve written about the value of analyzing unstructured data in the past with Healthcare and ECM – What’s Up Doc? (part 1) and Healthcare and ECM – What’s Up Doc? (part 2).

Let’s look at the results from an actual project involving the analysis of both structured and unstructured data to see what is now possible (when you play “with the pea”).

Seton Family Healthcare is analyzing both structured and unstructured clinical (and operational) data today. Not surprisingly, they are ranked as the top health care system in Texas and among the top 100 integrated health care systems in the country. They are currently featured in a Forbes article describing how they are transforming healthcare delivery with the use of IBM Content and Predictive Analytics for Healthcare. This is a new “smarter analytics” solution that leverages unstructured data with the same natural language processing technology found in IBM Watson.

Seton’s efforts are focused on preventing hospital readmissions of Congestive Heart Failure (CHF) patients through analysis and visualization of newly created evidence based information. Why CHF?  (see the video overview)

Heart disease has long been the leading cause of death in the United States. The most recent data from the CDC shows that heart disease accounted for over 27% of overall mortality in the U.S. The overall costs of treating heart disease are also on the rise – estimated to have been $183 billion in 2009. This is expected to increase to $186 billion in 2023. In 2006 alone, Medicare spent $24 billion on heart disease. Yikes!

Combine those staggering numbers with the fact that CHF is the leading cause of hospital readmissions in the United States. One in five patients suffers a preventable readmission, according to the New England Journal of Medicine. Preventable readmissions also represent a whopping $17.4 billion of the current $102.6 billion Medicare budget. Wow! How can they afford to pay for everything else?

They can’t … beginning in 2012, those hospitals with high readmission rates will be penalized. Given the above numbers, it shouldn’t be a shock that the new Medicare penalties will start with CHF readmissions. I imagine every hospital is paying attention to this right now.

Back to Seton … the work at Seton really underscores the value of analyzing your unstructured data. Here is a snapshot of some of the findings:

The Data We Thought Would Be Useful … Wasn’t

In some cases, the unstructured data is more valuable and more trustworthy than the structured data:

  • Left Ventricular Ejection Fraction (LVEF) values are found in both places but originate in text-based lab results/reports. This is a measurement of how much blood your left ventricle pumps out with each contraction. Values of less than 50% can be an indicator of CHF. These values were found in just 2% of the structured data from patient encounters and 74% of the unstructured data from the same encounters.
  • Smoking Status indicators are also found in both places. I’ve written about this exact issue before in Healthcare and ECM – What’s Up Doc? (part 2). Indicators that a patient was smoking were found in 35% of the structured data from encounters and 81% of the unstructured data from the same encounters. But here’s the kicker … the structured data values were only 65% accurate and the unstructured data values were 95% accurate.

You tell me which is more valuable and trustworthy.
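
To make the LVEF example concrete … here is a minimal, hypothetical sketch (in Python) of how a value like that can be pulled out of free text. The pattern and the sample note are invented for illustration – production engines like IBM Content Analytics use full NLP, not a single regular expression:

```python
import re

# Toy sketch: pull LVEF mentions out of free-text lab reports/notes.
# The pattern is illustrative only; real clinical NLP handles far more
# variation, negation and context than one regular expression ever could.
LVEF_PATTERN = re.compile(
    r"(?:LVEF|ejection fraction|EF)\D{0,15}?(\d{1,2})\s*(?:-|to)?\s*(\d{1,2})?\s*%",
    re.IGNORECASE,
)

def extract_lvef(note: str):
    """Return LVEF percentages found in a note (midpoint if a range)."""
    values = []
    for match in LVEF_PATTERN.finditer(note):
        low = int(match.group(1))
        high = int(match.group(2)) if match.group(2) else low
        values.append((low + high) / 2)
    return values

note = "Echo performed. LVEF 30-35%. Patient denies chest pain."
for ef in extract_lvef(note):
    if ef < 50:
        print(f"LVEF {ef}% - below 50%, a possible CHF indicator")
```

Even this toy version shows the point … if the value only appears in text, no amount of querying the structured fields will ever find it.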

In other cases, the key insights could only be found in the unstructured data … as there was either no structured data at all, or not enough to be meaningful. This is equally powerful.

  • Living Arrangement indicators were found in <1% of the structured data from the patient encounters. It was the unstructured data that revealed these insights (in 81% of the patient encounters). These unstructured values were also 100% accurate.
  • Drug and Alcohol Abuse indicators … same thing … 16% and 81% respectively.
  • Assisted Living indicators … same thing … 0% and 13% respectively. Even though only 13% of the encounters had a value, it was significant enough to rank in the top 18 of all predictors for CHF readmissions.

What this means … is that without including the unstructured data in the analysis, the ability to make accurate predictions about readmissions is highly compromised. In other words, it significantly undermines (or even prevents) the identification of the patients who are most at risk of readmission … and the most in need of care. HINT – Don’t play the game without the pea.

New, Unexpected Indicators Emerged … and the CHF Model is Highly Predictive

We started with 113 candidate predictors from structured and unstructured data sources. This list was expanded when new insights were surfaced, like those mentioned above (and others). With the “right” information being analyzed, the accuracy is compelling … the predictive accuracy was 49% at the 20th percentile and 97% at the 80th percentile. This means predictions about CHF readmissions should be pretty darn accurate.
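
If you are curious what the modeling step looks like, here is a toy sketch of how text-derived indicators can sit right alongside structured fields in a readmission model. It assumes pandas and scikit-learn are installed, and every feature name and value is invented … this is emphatically not Seton’s model or data:

```python
# Hypothetical sketch: structured fields plus NLP-derived indicators in
# one readmission model. The tiny data set is invented for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

encounters = pd.DataFrame({
    # structured data
    "age":                       [72, 65, 80, 58, 77, 69],
    "prior_admissions":          [2, 0, 4, 1, 3, 0],
    # indicators surfaced from unstructured notes by NLP
    "low_lvef":                  [1, 0, 1, 0, 1, 0],
    "lives_alone":               [1, 0, 1, 0, 0, 1],
    "jugular_venous_distention": [1, 0, 1, 0, 1, 0],
    # outcome
    "readmitted_30d":            [1, 0, 1, 0, 1, 0],
})

X = encounters.drop(columns="readmitted_30d")
y = encounters["readmitted_30d"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, stratify=y, random_state=0
)

model = LogisticRegression().fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```

The design point is simple: once the NLP step turns text into columns, the text-derived indicators are first-class features just like age or prior admissions.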

18 Top CHF Readmission Predictors and Some Key Insights

The goal was not to find the top 18 predictors of readmissions … but to find the ones where taking a coordinated care approach makes sense and can change an outcome. Even though these predictors are specific to Seton’s patient population, they can serve as a baseline for others to start from.

  • Many of the highest indicators of CHF are not high predictors of 30-day readmissions. One might think LVEF values and Smoking Status are also high indicators of the probability of readmission … they are not. This could  only be determined through the analysis of both structured and unstructured data.
  • Some of the 18 predictors cannot impact the ability to reduce 30-day readmissions. At least six fall into this category and examples include … Heart Disease History, Heart Attack History and Paid by Medicaid Indicator.
  • Many of the 18 predictors can impact the ability to reduce 30-day readmissions and represent an opportunity to improve care through coordinated patient care. At least six fall into this category and examples include … Self Alcohol / Drug Use Indicator, Assisted Living Indicator, Lack of Emotional Support Indicator and Low Sodium Level Indicator. Social factors weigh heavily in determining those at risk of readmission and represent the best opportunity for coordinated/transitional care or ongoing case management.
  • The number one indicator came out of left field … Jugular Venous Distention Indicator. This was not one of the original 113 candidate indicators and only surfaced through the analysis of both structured and unstructured data (or finding the pea). For the non-cardiologists out there … this is when the jugular vein protrudes due to the associated pressure. It can be caused by a fluid imbalance or being “dried out”. This is a condition that would be observed by a clinician and would now be a key consideration in deciding when to discharge a patient. It could also factor into any follow-up transitional care/case management programs.

But Wait … There’s More

Seton also examined other scenarios including resource utilization and identifying key waste areas (or unnecessary costs). We also studied Patient X – a random patient with 6 readmission encounters over an eight-month period. I’ll save Patient X for my next posting.

Smarter Analytics and Smarter Healthcare

It’s easy to see why Seton is ranked as the top health care system in Texas and among the top 100 integrated health care systems in the country. They are a shining example of an organization on the forefront of the healthcare transformation. The way they have put their content in motion with analytics to improve patient care, reduce unnecessary costs and avoid the Medicare penalties is something all healthcare organizations should strive for.

Perhaps most impressively, they’ve figured out how to play the healthcare analytics shell game and find the pea every time.  In doing so … everyone wins!

As always, leave me your comments and thoughts.

ECM Systems: Is Yours A Five Tool Player?

I grew up in Baltimore and baseball was my sport. I played Wiffle Ball in my backyard and Little League with my friends. It was all we ever talked and thought about. I played on all-star teams, destroyed my knees catching and worshipped the Orioles. And while I think Billy Beane’s use of analytics in “Moneyball” was absolute genius (read the book) … every good Orioles fan knows that starting pitching and three-run homers win baseball games … at least according to the Earl of Baltimore (sorry for the obscure Earl Weaver reference).

Brooks Robinson (Mr. Hoover) was my favorite player (only the greatest 3rd baseman of all time). I still have an autographed baseball he signed for me, as a kid, on prominent display in my office. I stood in line at the local Crown gas station for several hours with my Dad to get that ball.

But alas, baseball has fallen on hard times in Baltimore and even I had drifted away from the game. Good ole Brooksie was a fond nostalgic memory for me until the other day. This posting is not about baseball … it’s about ECM … really it is.

The recently concluded World Series was one of the most remarkable ever played. The late inning heroics in game six were amazing. Though neither team would give up, one had to prevail. Watching the end of that game got me thinking about ECM … no, really!

Baseball is a game that transfixes you when the ball is put into play … or in motion. And quite frankly, the game is pretty boring in between the action … or when things are at rest. So much so that the game is almost unwatchable unless things are in motion. The game comes alive with the tag-up on a sacrifice fly … or the stolen base … or a runner stretching a single into a double … or best of all, the inside-the-park homer. What do they all have in common? Action! Excitement! Motion!

No one really cares what happens between the pitches. Everyone wants the action. That’s why you pay the ticket price … to sit on the edge of your seat and wait for the ball to be put into play. The same is true for your enterprise content. It’s much more valuable when you put it into play … or in action. Letting your content sit idle is just driving up your costs (and risks too). Your goal should be to put it in motion. I recently wrote about this in Content at Rest or Content in Motion? Which is Better?.

However … putting your content in motion requires having the right tools. In baseball, the most coveted players are five tool players. They hit for average, hit for power, run the bases with speed, throw well, and field well.

The best ECM systems are also five tool players. They have five key capabilities. If you want the maximum value from your content, your ECM system must be able to:

1) Capture and manage content

2) Socialize content with communities of interest

3) Govern the lifecycle of content

4) Activate content through case centric processes

5) Analyze and understand content

I was lucky enough to have recently been interviewed by Wes Simonds, who wrote a nice piece on these same five areas of value for ECM. These five tools are coveted, just like in baseball. Why? Think about it … no one buys an ECM system unless they want to put their content in motion in one way or another.

Here’s the rub … far too often I see ECM practitioners who are only using one, or two, or maybe three, of their ECM capabilities even though they could be doing more. Why is this? It’s like being content to hit .220 in baseball (a one or two tool player). No one is getting a fat contract or going to the Hall of Fame by hitting .220 and just keeping your head above the Mendoza line (another obscure baseball reference). Like in baseball, you need to use all five skills to get the big contracts … or get the maximum value from your ECM based information.

Brooks Robinson didn’t win a record 16 straight Gold Gloves, the Most Valuable Player Award or play in 18 consecutive All-Star Games because he had one or two skills. He was named to the All-Century Team and elected to the Hall of Fame on the first ballot with a landslide 92% of the votes because he put the ball in motion and made the most of the skills and tools he had.

It’s simple … those new to ECM should only consider systems with all five capabilities.

And today’s existing ECM practitioners should be promoting, using and benefiting from all five tools, not just a few. Putting content in motion with all five tools benefits your career and maximizes your ECM program. It enables your organization to get the maximum value from the 80% of your data that is unstructured content.

As always, leave your thoughts and comments here.

Healthcare and ECM – What’s Next Doc? (part 2 of 2)

In my last blog posting Healthcare and ECM – What’s Up Doc?, I wrote about using ECM based content analytics technology to help accelerate decision making in an industry in transition.

But why stop there … how powerful would it be to turn those new insights (from unstructured information) into action by combining content analytics with predictive analytics or other business analytics?

This is transformational … by unlocking the 80% of information not currently being leveraged (explained in part 1), we unlock new ways to use information. More compellingly, we unlock never-before-seen trends and patterns in both clinical and operational data.

Think about it … do we know everything we need to know about healthcare and how to identify and treat diseases? Or can we benefit from new insights? The answer is obvious.

Combining content and predictive analytics enables:

  • Accurate extraction of medical facts and relationships from unstructured data in clinical and operational sources – not easy, cost-effective, or even possible in the past.
  • Never-before-seen trends, patterns and anomalies are revealed – connections or relationships between diseases, patients and outcomes (and even costs) can now be surfaced and acted upon. Think of the medical research possibilities!
  • The ability to predict future outcomes based on past and present scenarios – optimizing resource allocation and patient outcomes. One organization reduced cardiac surgery patient morbidity from 2.9% to 1.3% by doing this.
  • New insights can be surfaced to any clinical or operational knowledge worker based on their respective role – this could be through dashboards, case management/care coordination systems, EMR, claims processing or any number of other ways – enabling better decision making and action across the organization.
  • The ability to leverage these new insights with other systems such as data warehouses and master patient data – maximizing and benefiting from the use of those other systems.

In my last posting, I commented that it is now an imperative to leverage clinical information and operational data in new ways … and that doing so is an obvious way to improve quality of care, patient satisfaction and business efficiency.

There are at least nine areas where this opportunity exists. The clinical scenarios are:

  • Diagnostic Assistance: Highly correlated symptom-to-disease analysis, visualized with predictive guidance on diagnosis, to improve treatment and outcomes … with predicted or forecasted costs.
  • Clinical Treatment Effectiveness: Examine patient-specific factors against the effectiveness of a healthcare organization’s specific treatment options and protocols … including comparisons to industry-wide outcomes and best practices.
  • Critical Care Intervention: Early detection of unmanageable or high risk cases in the hospital that leads to interventions to reduce costs and maintain or improve clinical conditions … including case based interventions.
  • Research for Improved Disease Management: Perform analysis and predict outcomes by extracting discrete facts from text, such as: patient smoking status, patient diet and patient exercise regimen, to find new and better treatment options … and use as a mechanism for differentiation or to secure research grants.

Operational scenarios include:

  • Claims Management: All claims involve unstructured data and manually intensive analysis. Analyze claims information documented in cases, forms and web content to understand new trends and patterns and identify problem areas … perfect for process improvement, cost reduction and optimal service delivery.
  • Fraud Detection and Prevention: Uncover eligibility issues, false assertions and fraud patterns trapped in the unstructured data to reduce risk before payments are made … fraud is usually represented by a word or combination of words in text that can’t be detected with just structured data (see the toy sketch after this list).
  • Voice of the Patient: Include unstructured data and sentiment analysis from surveys and web forms in analysis of patient and member satisfaction … this will be key as the industry moves to a value based model.
  • Prevention of Readmissions: Discover key indicators of readmission and alert healthcare organizations so that protocols can be altered to avoid readmission … this is key as new Medicare payment penalties go into effect in 2012.
  • Patient Discharge and Follow-up Care: Understand and monitor patient behavior to proactively inform preventative and follow-up care coordinators before situations get worse.
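
As promised in the fraud bullet, here is a toy sketch of flagging a claim based on word combinations in its narrative text. The phrase list is invented for illustration … real fraud detection models are far richer than this:

```python
# Toy sketch: flag a claim when its free-text narrative contains a
# suspicious combination of words. The phrase list is invented.
SUSPICIOUS_COMBOS = [
    {"staged", "accident"},
    {"backdated", "policy"},
    {"prior", "undisclosed"},
]

def flag_claim(narrative: str) -> bool:
    """True if any suspicious word combination appears in the narrative."""
    words = set(narrative.lower().replace(".", " ").split())
    return any(combo <= words for combo in SUSPICIOUS_COMBOS)

print(flag_claim("Adjuster suspects a staged accident near the depot."))  # True
print(flag_claim("Routine fender bender, photos on file."))               # False
```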

According to the New England Journal of Medicine, one in five patients suffers a preventable readmission. This represents $17.4 billion of the current $102.6 billion Medicare budget. Beginning in 2012, hospitals will be penalized for high readmission rates with reductions in Medicare discharge payments. Seton Healthcare Family is already ahead of the game.

“IBM Content and Predictive Analytics for Healthcare uses the same type of natural language processing as IBM Watson, enabling us to leverage our unstructured information in new ways not possible before,” said Charles J. Barnett, FACHE, President/Chief Executive Officer, Seton Healthcare Family. “With this solution, we can access an integrated view of relevant clinical and operational information to drive more informed decision making. For example, by predicting readmission candidates, we can reduce costly and preventable readmissions, decrease mortality rates, and ultimately improve the quality of life for our patients.”

This week at IOD … IBM is launching a new solution specifically designed to reveal clinical and operational insights in the high-impact overlap between clinical and operational use cases – enabling low-cost accountable care.

IBM Content and Predictive Analytics for Healthcare, a synergistic solution to IBM Watson, helps transform healthcare clinical and operational decision making for improved outcomes by uniquely applying multiple analytics services to derive and act on new insights in ways not previously possible … which is exactly what Seton Healthcare Family is doing.  Dr. David Ramirez, Medical Director at Seton shares his perspective here.

IBM Content and Predictive Analytics for Healthcare (ICPA) is Watson Ready and is designed to complement and leverage IBM Watson for Healthcare through the ability to analyze and visualize the past, understand the present, and predict future outcomes.

ICPA, as the first Watson Ready offering, not only provides assurance of Watson solution interoperability but extends the value ultimately delivered to clients. For example, using input from ICPA outcomes, IBM Watson will be able to provide better diagnostic recommendations and treatment protocols as well as learn from the confidence-based responses.

The press release is available here for those seeking more information. I will be doing a high level main stage demo of ICPA on Wednesday which will be streamed live. I will post the replay when available.

But it’s not just healthcare … every industry is impacted by the explosion of information and has the same opportunity to leverage the 80+ percent that is unstructured to turn insights into action.

As always, leave me your thoughts and comments here.

Healthcare and ECM – What’s Up Doc? (part 1 of 2)

This is one of those industry centric topics everyone can relate to … we all need healthcare and we’ll all use it at some point in our lives.  I plan to do a couple of postings on Healthcare and ECM … here is the first.

The healthcare industry is undergoing a major transformation. We have a legacy health system that is fee-for-service based, resulting in a care system that is high cost with inconsistent quality. Healthcare provider consolidation is accelerating; competitors as well as payers/providers are merging. Clinical transformation is already occurring … disease management, health and wellness management, and behavioral health are integrating. The industry is moving to a more patient centric, evidence based and competitive care system where the players are held accountable and will have to compete on the value they deliver and not rely solely on quantity based reimbursements.

This transformation is driving new thinking, new business models and a restructuring of clinical and operational care models. The expectation of value is changing and healthcare organizations have to adjust their business models to deliver value, not just volume. This type of transformation requires innovation … the kind of innovation that improves productivity and competitive advantage … and not just advancing medical technology for technology’s sake. The main consideration must be for total well-being and cost, and not one for the sake of the other.

As the backbone for a transformed healthcare system, leveraging clinical information and operational data in new ways is an obvious thing to do to improve quality of care, patient satisfaction and business efficiency. This places a premium on making this information accessible and actionable to optimize outcomes … which is where ECM comes in!

There are many ways ECM technologies are being applied to solve problems in healthcare. Obvious ones are document capture and conversion of paper-based patient records, and advanced case management for care coordination. I am going to focus on content analytics and leveraging unstructured information to reveal insights currently trapped in documents, records and other content. I believe this has significant transformative potential as an ECM based information technology.

Studies show that healthcare information is growing at 35% per year and that over 80% of information is unstructured data (or content).  The explosion of information makes accessing and leveraging it a harder task, but this is now an imperative.

Unstructured data resides in many sources:  physician notes, registration forms, discharge summaries, text messages, documents and more.  Because this content lacks structure, it is arduous for healthcare enterprises to include it in business analysis and therefore it is routinely left out.

The impact of this is staggering.  If you had a choice – would you choose to leverage all of your available information or just the 20% that is structured data and found in databases?  This is exactly the type of thing that can accelerate transformation.  We need to leverage the remaining 80% of available information.  After all … would you want your Doctor making decisions about your health on 1/5th of the available information?

It’s such a simple premise but the reality is that until recently, the technology wasn’t available to easily and accurately analyze and unlock insights contained in the unstructured information.  This is where natural language processing (NLP) and breakthrough technologies like IBM Watson and IBM Content Analytics come in.  So let’s apply this to the real world.

Smoking has long been known as a habit that contributes to poor health and diseases like Congestive Heart Failure, but how accurately do the healthcare systems of today reflect the patient’s current smoking status? To understand a patient’s smoking status … it cannot be just a yes/no checklist question found in structured data. How can a check box know if you quit 3 years ago … or started again last year and just recently quit again … or that you recently took up casual cigar smoking … or that you cut down from 2 packs to 1 pack a day? A structured data field can’t understand these nuances. This is natural language based information found only in text. These text based descriptions are often captured in registration forms, history and physical reports, progress reports and other update reports. Most systems have not factored in this kind of information due to the cost and time taken to manually extract it. It’s often too costly and too late. Yet it is exactly this kind of information that could be most critical in improving care.
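
To illustrate (and only to illustrate), here is a toy rule-based sketch of deriving a smoking status from note text. The patterns and categories are invented; real NLP engines like IBM Content Analytics handle negation, temporality and grammar far beyond what a few regular expressions can do:

```python
import re

# Toy rule-based sketch of smoking status. Patterns and categories are
# invented; real clinical NLP handles negation, time and grammar properly.
RULES = [
    (re.compile(r"quit smoking|former smoker", re.I), "former smoker"),
    (re.compile(r"never smoked|denies (tobacco|smoking)", re.I), "never smoker"),
    (re.compile(r"cut down.*pack", re.I), "current smoker (reduced)"),
    (re.compile(r"smok|cigar", re.I), "current smoker"),
]

def smoking_status(note: str) -> str:
    """Return the first matching status, or 'unknown'."""
    for pattern, status in RULES:
        if pattern.search(note):
            return status
    return "unknown"

print(smoking_status("Pt quit smoking 3 years ago."))            # former smoker
print(smoking_status("Recently took up casual cigar smoking."))  # current smoker
print(smoking_status("Cut down from 2 packs to 1 pack a day."))  # current smoker (reduced)
```

Notice that the checkbox answer for all three of those patients could be identical … the nuance only lives in the text.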

In a recent private IBM customer data study, we found 40% of the total population of smoking patients were identified in the text of unstructured physician notes, and not the structured data.  This is huge!  Can you imagine doing research on smoking without including this kind of information? … or not including 40% of the total smoking population?

BJC Healthcare has figured out the value of leveraging unstructured data. They found that structured data alone was not enough when doing research, which often resulted in reading documents … many, many documents … one by one. You can imagine how fun and helpful that was. They are now using IBM Content Analytics to extract key medical facts and relationships from more than 50 million documents in medical records, speeding up research to ultimately provide better care for patients worldwide … see the recent case study.

I feel strongly that ECM technologies, and especially Content Analytics, can make a huge impact in both the clinical and operational healthcare transformation underway.  I’ll be back in two weeks with more on this topic … which is now published as Healthcare and ECM – What’s Next Doc?

As always, leave me your thoughts and comments here.

TV Re-runs, Watson and My Blog

When I was a wee lad … back in the 60s … I used to rush home from elementary school to watch the re-runs on TV.  This was long before middle school and girls.  HOMEWORK, SCHMOMEWORK !!!  … I just had to see those re-runs before anything else.  My favorites were I Love Lucy, Batman, Leave It To Beaver and The Munsters.  I also watched The Patty Duke Show (big time school boy crush) but my male ego prevents me from admitting I liked it.  Did you know the invention of the re-run is credited to Desi Arnaz?  The man was a genius even though Batman was always my favorite.  Still is.  I had my priorities straight even back then.

I am reminded of this because I have that same Batman-like re-run giddiness as I think about the upcoming re-runs of Jeopardy! currently scheduled to air September 12th – 14th.

You’ve probably figured out why I am so excited, but in case you’ve been living in a cave, not reading this blog, or both … IBM Watson competed (and won) on Jeopardy! in February against the two most accomplished Grand Champions in the history of the game show (Ken Jennings and Brad Rutter).  Watson (DeepQA) is the world’s most advanced question answering machine that uncovers answers by understanding the meaning buried in the context of a natural language question.  By combining advanced Natural Language Processing (NLP) and DeepQA automatic question answering technology, IBM was able to demonstrate a major breakthrough in computing.

Unlike traditional structured data, human natural language is full of ambiguity … it is nuanced and filled with contextual references.  Subtle meaning, irony, riddles, acronyms, idioms, abbreviations and other language complexities all present unique computing challenges not found with structured data.  This is precisely why IBM chose Jeopardy! as a way to showcase the Watson breakthrough.

Appropriately, I’ve decided that this posting should be a re-run of my own Watson and content analysis related postings. So in the spirit of Desi, Lucy, Batman and Patty Duke … here we go:

  1. This is my favorite post of the bunch.  It explains how the same technology used to play Jeopardy! can give you better business insight today.  “What is Content Analytics?, Alex”
  2. I originally wrote this a few weeks before the first match was aired to explain some of the more interesting aspects of Watson.  10 Things You Need to Know About the Technology Behind Watson
  3. I wrote this posting just before the three day match was aired live (in February) and updated it with comments each day.  Humans vs. Watson (Programmed by Humans): Who Has The Advantage?
  4. Watson will be a big part of the future of Enterprise Content Management and I wrote this one in support of a keynote I delivered at the AIIM Conference.   Watson and The Future of ECM  (my slides from the same keynote are posted here).
  5. This was my most recent posting.  It covers another major IBM Research advancement in the same content analysis technology space.  TAKMI and Watson were recognized as part of IBM’s Centennial as two of the top 100 innovations of the last 100 years.  IBM at 100: TAKMI, Bringing Order to Unstructured Data
  6. I wrote a similar IBM Centennial posting about IBM Research and Watson.  IBM at 100: A Computer Called Watson
  7. This was my first Watson related post.  It introduced Watson and was posted before the first match was aired.  Goodbye Search … It’s About Finding Answers … Enter Watson vs. Jeopardy!

Desi Arnaz may have been a genius when it came to TV re-runs, but the gang at IBM Research has made a compelling statement about the future of computing. Jeopardy! shows what is possible and my blog postings show how this can be applied already. The comments from your peers on these postings are interesting to read as well.

Don’t miss either re-broadcast.  Find out where and when Jeopardy! will be aired in your area.  After the TV re-broadcast, I will be doing some events including customer and public presentations.

On the web …

  • I will be presenting IBM Watson and the Future of Enterprise Content Management on September 21, 2011 (replay here).
  • I will be speaking on Content Analytics in a free upcoming AIIM UK webinar on September 30, 2011 (replay here).

Or in person …

You might also want to check out the new Smarter Planet interview with Manoj Saxena (IBM Watson Solutions General Manager).

As always, your comments and thoughts are welcome here.

A 124-Year Odyssey Involving Cases and Records Finally Ends

I first became aware of this matter about 10 years ago when I read a story about a woman named Josephine Wild Gun (yes, that is her name) who then lived in a small run-down house on the Blackfeet reservation in Montana. Like most of her Native American neighbors, she owned several parcels of reservation land that were being held in trust by the U.S. Government (Indian Trust Fund).  The Indian Trust Fund was created in 1887, as part of the Dawes Act, to oversee payments to Native Americans.  This fund managed nearly 10,000 acres on Josephine’s behalf, leasing the property to private interests for grazing and oil drilling fees.  In return, she was supposed to receive royalties from the trust fund.

Despite the lucrative leases, Josephine had allegedly never received more than $1,500 a year from the trust fund.  According to the story, the payments trickled off and one check totaled only 87 cents.  When her husband died, she even had to borrow money to pay for the funeral.  Josephine’s story is compelling … and it stuck with me.   This story, along with some research I was doing on the Cobell v. Salazar lawsuit (involving the same Indian Trust Fund) and the government’s inability to produce records documenting the income accounting of the payments to Josephine and about 300,000 other Native Americans, caused me to wonder how and why something like this could happen.

The 15-year-old class action (Cobell v. Salazar) lawsuit was recently settled for $3.4 billion. I am writing about this today because hundreds of thousands of notices went out this week to American Indians who are affected by the $3.4 billion settlement, bringing an end to a 124-year odyssey involving The Department of the Interior, The Bureau of Indian Affairs and many Native Americans and their descendants. In this suit, Elouise Cobell (a Native American and member of the Blackfeet tribe) sued the federal government over the mismanagement of the trust fund. In her suit, Cobell claimed that the U.S. Government failed to provide a historical accounting of the money the government held in trust for Native American landowners in exchange for the leasing of tribal lands. Ultimately, the case hinged on the government’s ability to produce these accounting records showing how the money was managed on behalf of the original landowners. I find myself wondering if the whole thing could have been avoided with better case management and recordkeeping practices. This 15-year court battle is the culmination of events going all the way back to the 19th Century! The landowners had a right to expect proper case management, proper records management and proper distribution of funds. Apparently, none of those things happened.

As a history buff, I find the whole back story fascinating … so here we go …

It all starts with Henry Dawes (1816 – 1903) who was a Yale graduate from Massachusetts.  He was an educator, a newspaper editor, a lawyer and perhaps, somewhat infamously, a Congressman who was both a member of the U.S. House of Representatives (1857 to 1875) and the U.S. Senate (1875 to 1893).

During his time in public service, he had his ups and his downs. In 1868, he received a large number of shares of stock from a railroad construction company as part of the Union Pacific railway’s influence-buying efforts. On the positive side, Dawes was both a supporter of and involved with the creation of Yellowstone National Park. He also had a role in promoting anti-slavery and reconstruction measures during and after the Civil War. In the Senate, he was chairman of the Committee on Indian Affairs, where he concentrated on the enactment of laws that he believed were for the benefit of American Indians.

Dawes’s most noteworthy achievement was the passage of The General Allotment Act of 1887 (known as The Dawes Act referenced earlier).  The Dawes Act authorized the government to survey and inventory Indian tribal land and to divide the area into allotments for individual Indians.  Although later amended twice, it was this piece of legislation that set the stage for 124 years of alleged mismanagement and eventually the Cobell v. Salazar lawsuit.

I see this as a cautionary tale … reminding us of the need for enterprise content and case management as well as records management (but more on that later). I wasn’t around, but I would imagine PCs ran pretty slowly back in 1887 (chuckle) … but I digress, as manual paper-based practices did exist.

Back to the story … The Dawes Commission was established under the Office of Indian Affairs to persuade American Indians to agree to the allotment plan. Dawes himself later oversaw the commission for a period of time after his time as a Senator. It was this same commission that registered and documented the members of the Five Civilized Tribes. Eventually, The Curtis Act of 1898 abolished tribal jurisdiction over the tribes’ land and the landowners became dependent on the government. Native Americans lost about 90 million acres of treaty land, or about two-thirds of the 1887 land base, over the lifespan of the Dawes Act. Roughly 90,000 Indians were made landless and the Act forced Native people onto small tracts of land … in many cases, it separated families. The allotment policy depleted the land base and also ended hunting as a means of subsistence. In 1928, a Calvin Coolidge Administration study determined that The Dawes Act had been used to illegally deprive Native Americans of their land rights. Today, The United States Department of the Interior is responsible for the remnants of The Dawes Act and the Office of Indian Affairs is now known as the Bureau of Indian Affairs.

There is a pretty big taxpayer bill about to finally be paid out ($3.4 billion) to the surviving Native American descendants and for other purposes.  Throughout the lifecycle of this case, there were multiple contempt charges, fines and embarrassing mandates resulting in the government’s reputation taking a significant hit.  Interior Secretary Bruce Babbitt and Treasury Secretary Robert Rubin were found in contempt of court for failing to produce documents and slapped with a $625,000 fine.  And while time went by and Administrations changed, not much else did when Interior Secretary Gale Norton and Assistant Interior Secretary of Indian Affairs Neal McCaleb were also held in contempt.  At one point, the judge also ordered the Interior Department to shut down most of its Internet operations after an investigator discovered that the department’s computer system allowed unauthorized access to Indian trust accounts.  During this time, many federal employees could not receive or respond to emails, and thousands of visitors to national parks were unable to make online reservations for campsites.  The shutdown also prevented the trust fund from making payments to more than 43,000 Indians, many of whom depended on the quarterly checks to make ends meet. In Montana and Wyoming, some beneficiaries were forced to apply for tribal loans to help them through the holidays.

There was plenty of mudslinging as well:

“Federal officials have spent more than 100 years mismanaging, diverting, and losing money that belongs to Indians,” says John Echohawk of the Native American Rights Fund, which directed the lawsuit.  “They have no idea how much has been collected from the companies that use our land and are unable to provide even a basic, regular statement to most Indian account holders.”

Again I ask … where was the accountability for these landowner cases and the associated records?  Could all of this have been prevented with better policies and processes?

The damage was already done, but we know that the government invested in an array of systems such as the Integrated Records Management System (IRMS), Trust Funds Accounting System (TFAS), Land Records Information System (LRIS) and Trust Asset and Accounting Management System (TAAMS). These systems were intended to collect, manage and distribute trust funds in support of the 1994 Indian Trust Fund Management Reform Act. They were used for historical accounting purposes and contained land ownership records and financial records for the associated cases. A major premise of the government’s accounting effort was that the transition from paper to electronic records took the accuracy, completeness and reliability of the trust data to a level that far surpassed the “paper ledger era” … it seems like it was too little, too late.

I guess we’ll never know for sure, but I firmly believe that much, if not most, of this could have been avoided. It was alleged during the case that as much as 90 percent of the Indian Trust Fund’s records were missing, and the few that were available were in comically bad condition. An Interior Department report provided to the court refers to storage facilities plagued by problems ranging from “poisonous spiders in the vicinity of stored records” to “mixed records strewn throughout the room with heavy rodent activity.”

It’s a tragic story and I am glad it’s finally ending.  It’s disheartening that Josephine Wild Gun and many others had to suffer the way they did for the past 124 years.  It’s amazing the number of people that this impacted starting with Henry Dawes and ending with ~300,000 Native Americans (and everyone in between).  It’s encouraging to know that technologies like Enterprise Content Management, Advanced Case Management and Records Management can all be used with great impact in the future to improve processes and outcomes like this.

As always, leave me your thoughts and opinions here.

IBM at 100: TAKMI, Bringing Order to Unstructured Data

As most of you know … I have been periodically posting some of the really fascinating top 100 innovations of the past 100 years as part of IBM’s Centennial celebration.

This one is special to me as it represents what is possible for the future of ECM. I wasn’t around for tabulating machines and punch cards but have long been fascinated by the technology developments in the management and use of content. As impressive as Watson is … it is only the most recent step in a long journey IBM has been pursuing to help computers better understand natural language and unstructured information.

As most of you probably don’t know … this journey started over 50 years ago in 1957 when IBM published the first research on this subject, entitled A Statistical Approach to Mechanized Encoding and Searching of Literary Information. Finally … something in this industry older than I am!

Unstructured Information Management Architecture (UIMA)

Another key breakthrough by IBM in this area was the invention of UIMA.  Now an Apache Open Source project and OASIS standard, UIMA is an open, industrial-strength platform for unstructured information analysis and search.  It is the only open standard for text based processing and applications.  I plan to write more on UIMA in a future blog but I mention it here because it was an important step forward for the industry, Watson and TAKMI (now known as IBM Content Analytics).

TAKMI

In 1997, IBM researchers at the company’s Tokyo Research Laboratory pioneered a prototype for a powerful new tool capable of analyzing text. The system, known as TAKMI (for Text Analysis and Knowledge Mining), was a watershed development: for the first time, researchers could efficiently capture and utilize the wealth of buried knowledge residing in enormous volumes of text. The lead researcher was Tetsuya Nasukawa.

Over the past 100 years, IBM has had a lot of pretty important inventions but this one takes the cake for me.  Nasukawa-san once said,

“I didn’t invent TAKMI to do something humans could do, better.  I wanted TAKMI to do something that humans could not do.”

In other words, he wanted to invent something humans couldn’t see or do on their own … and isn’t that the whole point and value of technology anyway?

By 1997, text was searchable, if you knew what to look for. But the challenge was to understand what was inside these growing information volumes and know how to take advantage of the massive textual content that you could not read through and digest.

The development of TAKMI quietly set the stage for the coming transformation in business intelligence. Prior to 1997, the field of analytics dealt strictly with numerical and other “structured” data—the type of tagged information that is housed in fixed fields within databases, spreadsheets and other data collections, and that can be analyzed by standard statistical data mining methods.

The technological clout of TAKMI lay in its ability to read “unstructured” data—the data and metadata found in the words, grammar and other textual elements comprising everything from books, journals, text messages and emails, to health records and audio and video files. Analysts today estimate that 80 to 90 percent of any organization’s data is unstructured. And with the rising use of interactive web technologies, such as blogs and social media platforms, churning out ever-expanding volumes of content, that data is growing at a rate of 40 to 60 percent per year.

The key to its success was natural language processing (NLP) technology. Most data mining researchers were treating English text as a bag of words, extracting words from character strings based on white space. However, since Japanese text does not contain white spaces as word separators, IBM researchers in Tokyo applied NLP to extract words, analyze their grammatical features, and identify relationships among words. Such in-depth analysis led to better results in text mining. That’s why the leading-edge text mining technology originated in Japan.
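
A toy example makes the whitespace problem concrete. The mini dictionary and the greedy longest-match segmenter below are invented for illustration … real Japanese segmenters (including whatever TAKMI actually used) are far more sophisticated:

```python
# Toy illustration of the tokenization problem TAKMI's NLP had to solve:
# English splits on whitespace; Japanese has no spaces, so words must be
# segmented against a dictionary. Greedy longest-match is a toy approach.
DICTIONARY = {"東京", "研究", "所", "東京研究所", "の", "発明"}

def tokenize_ja(text: str):
    tokens, i = [], 0
    while i < len(text):
        # try the longest dictionary entry starting at position i
        for j in range(len(text), i, -1):
            if text[i:j] in DICTIONARY:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: emit as-is
            i += 1
    return tokens

print("English: ", "text analysis in Tokyo".split())
print("Japanese:", tokenize_ja("東京研究所の発明"))
# -> ['東京研究所', 'の', '発明']
```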

The complete article on TAKMI can be found at http://www.ibm.com/ibm100/us/en/icons/takmi/

Fast forward to today. IBM has since commercialized TAKMI as IBM Content Analytics (ICA), a platform to derive rapid insight. It can transform raw information into business insight quickly without building models or deploying complex systems, enabling all knowledge workers to derive insight in hours or days … not weeks or months. It helps address industry specific problems such as healthcare treatment effectiveness, fraud detection, product defect detection, public safety concerns, customer satisfaction and churn, crime and terrorism prevention and more.

I’d like to personally congratulate Nasukawa-san and the entire team behind TAKMI (and ICA) for such an amazing achievement … and for making the list.  Selected team members who contributed to TAKMI are Tetsuya Nasukawa, Kohichi Takeda, Hideo Watanabe, Shiho Ogino, Akiko Murakami, Hiroshi Kanayama, Hironori Takeuchi, Issei Yoshida, Yuta Tsuboi and Daisuke Takuma.

It’s a shining example of the best form of innovation … the kind that enables us to do something not previously possible. Being recognized along with other amazing achievements like the UPC code, the floppy disk, magnetic stripe technology, laser eye surgery, the scanning tunneling microscope, fractal geometry and human genome mapping is quite an honor.

This type of enabling innovation is the future of Enterprise Content Management.  It will be fun and exciting to see if TAKMI (Content Analytics) has the same kind of impact on computing as the UPC code has had on retail shopping … or as laser eye surgery has had on vision care.

What do you think? As always, leave me your thoughts and comments.

Other similar postings:

Watson and The Future of ECM

“What is Content Analytics?, Alex”

10 Things You Need to Know About the Technology Behind Watson

Goodbye Search … It’s About Finding Answers … Enter Watson vs. Jeopardy! 

It’s a Bird … It’s a Plane … It’s ACM! (Advanced Case Management)

ECM and BPM evildoers beware! The days of creeping requirements … endless application rollout delays … one-size-fits-all user experiences … and blaming IT for all of it are over!

Advanced Case Management is here to save us.  Long before this superhero capability arrived from a smarter planet, we’ve had to use a bevy of workflow and BPM technologies to address the needs of case-centric processes.  In most cases, this has not worked well.  That’s because case-centric processes are different.

Traditional BPM processes tend to be straight-through and transactional with the objective of completing the process in the most efficient way and at the lowest possible cost and risk.

Case centric processes are not straight-through. They are ad-hoc, collaborative and involve exceptions … sometimes, lots of exceptions. In certain cases, these processes are so ad-hoc or collaborative that it is not realistic or possible to map them. That’s because the objective is to make the best decision (within the context of the case) and the path to the right decision may not be known. Speed and cost are always important but take a backseat to achieving the best outcome … which usually involves customers, partners, employees or even citizens / patients. You get the idea.

Why should you care? Most “C” level surveys these days list Reinventing Customer Relationships as a top priority. The same goals are seen again and again:

  • Get closer to customers (top theme)
  • Better understand what our customers need
  • Deliver unprecedented customer service

From a technology perspective … this means we need new tools to build those solutions that enable us to get closer, better understand and deliver optimal service to our customers.  Most customer oriented processes are case centric involving human interactions.  They tend not to be straight-through.

The traditional BPM model which depends on (1) process modeling, (2) process automation and (3) process optimization works fine for the straight-through processes … not so much for case management.

As such, a big gap exists today to build solutions that drive better case outcomes.  To close this gap, new tools that bring people, process and information together in the context of a case are needed when:

  • Processes are collaborative and ad-hoc
  • Activities are event-driven
  • Work is knowledge intensive
  • Content is essential for decision making
  • Outcomes are goal-oriented
  • The judgment of people impacts how the goal is achieved
  • Process is often not predetermined

The discipline of case management is deeply rooted in industries like healthcare, public sector and the legal profession.  Case management concepts are being applied across all industries – and though organizations describe case management differently – they consistently describe the lack of tools needed for their knowledge workers to get their jobs done.  Some organizations may describe their challenges as complaint / dispute management, investigations, interventions, claims processing or other forms of business functions that have a common pattern or problem but not a straight-through process.  Cases also typically involve invoices, contracts, employees, vendors, customers, projects, change requests, exceptions, incidents, audits, electronic discovery and more.

Faster than a speeding bullet!

Yesterday’s BPM development tools simply don’t work for case management applications. By the time you build the application, too much time has passed, requirements have changed and IT usually gets the blame. Time-to-value suffers. I have nothing against BPM application development tools. I just wouldn’t use a screwdriver to hammer a nail … and neither should you. Case management solutions require a new kind of development environment and tools. We need tools that are easy to use and allow a business user (not just IT) to very quickly build a solution. They should be able to address the comprehensive nature of all case assets and provide a 360 degree view of a case. They should leverage templates for a fast start and represent industry best practices. In the end, they need to significantly shorten time-to-value relative to other approaches.

More powerful than a locomotive!

Since the objective is to empower case based decision making, we need user experiences that are more robust and flexible than those of the past. We need those experiences to be role-based and personalized so the end user gets exactly the information they need to progress the case. The user experience needs to be flexible and extensible … not to mention configurable, to meet unique business, case or user requirements. The user experience should provide deep contextual data for case work and eliminate disjointed jumping between applications. It must bring people, process and information together to drive case progression and optimal outcomes. That way, a single case worker has all the information they need to improve case outcomes.

Able to leap tall buildings in a single bound!

Proactively advising case workers of best practices, historical outcomes, fraud indicators and other relevant insight is also needed.  Leveraging analytics to detect and surface trends, patterns and deviations contributes to better and more consistent outcomes.  In other words, we need powerful analytics for better case outcomes.  Comprehensive reporting and analysis gives case managers visibility across all information types to assess and act quickly.  Real-time dashboards help understand issues before they become a problem.  Unique content analytics can discover deeper case insight.  Bottom line … case managers need insight in order to impact results.

Anatomy of a superhero

Before being rocketed to Earth as some new problem-solving superhero technology … a combination of capabilities is needed to address the needs of case management solutions. Under the cape and tights of any case management superhero technology, you will find six core capabilities in a seamlessly integrated environment (a toy sketch of how a few of them fit together appears below):

1 – Content.  By placing the case model in the content repository, information and other artifacts associated with cases are not only selected and viewed but also managed in the context of the case over its lifecycle.  These include collaborations, process steps, and the other associated case elements.

2 – Process.  Cases may follow static processes that are prescribed for certain business situations.  They may also follow more dynamic paths based on changes to information associated with a case.  Straight through, transactional processes can be called as can more collaborative processes.

3 – Analytics.  Analytics help case workers make the right decisions in cases involving fraudulent insurance claims, social benefit coverage, eligibility for welfare programs and more. Analytics help detect patterns within or across cases, or simply improve the overall case handling to optimize case outcomes.

4 – Rules.  Many decisions in a case depend on set values, e.g. interest rates for loans based on credit rating, approval authority for transaction amounts, etc. By separating rules from process, the case handling becomes much more agile, as rules can change in lockstep with market changes.

5 – Collaboration.  Finding the right subject matter expert is often critical to making the ad-hoc decisions required to bring a case to an optimal closure. Collaboration in the form of instant messaging, presence awareness, and team rooms enables an organization and its case workers to work together to drive outcomes.

6 – Social Software.  Dynamic to-do lists that are role based help case workers establish the conversations and actions that must take place to close cases, and link to information about the people who can help.  Users can brainstorm on appropriate solutions and actions and create wikis linked to particular case types to assist colleagues in their case work.

If you can’t do those six things … seamlessly … you aren’t very super … or advanced … and you certainly can’t meet the demands of case management solutions.
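
As promised above, here is a minimal, hypothetical sketch of how a few of those capabilities (content, process state and externalized rules) can fit together. All of the names and the rule threshold are invented … this is not any product’s actual object model:

```python
from dataclasses import dataclass, field
from typing import List

# Illustrative only: a minimal case model showing content artifacts,
# process state and an externalized rule. All names are hypothetical.

@dataclass
class CaseDocument:
    doc_id: str
    doc_type: str            # e.g. "claim form", "police report"

@dataclass
class Case:
    case_id: str
    case_type: str
    state: str = "open"                               # no fixed process path
    documents: List[CaseDocument] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)    # ad-hoc to-do items

# Capability 4: the rule lives outside the process, so it can change
# without touching or redeploying the process itself.
APPROVAL_LIMIT = 10_000

def route_claim(case: Case, amount: float) -> str:
    """Approve small claims; escalate large ones as an ad-hoc task."""
    if amount > APPROVAL_LIMIT:
        case.tasks.append("escalate to senior adjuster")
        case.state = "escalated"
    else:
        case.state = "approved"
    return case.state

claim = Case("C-001", "insurance claim")
claim.documents.append(CaseDocument("D-17", "claim form"))
print(route_claim(claim, 14_500))   # -> escalated
print(claim.tasks)                  # -> ['escalate to senior adjuster']
```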

Advanced Case Management is now saving the world one case and solution at a time.

So “up, up and away” to better case management solutions and outcomes.  As always leave me your thoughts and comments here.

Content in Motion: The Voice of Your Customer

Do you listen to your customers?

No, really!  Of course, everyone answers “yes” when asked this question.  So much so … that the question really isn’t worth asking anymore.  The real question to ask is “What are you doing about it?”

Your customers write about your services, prices, product quality and their experiences with you in social media.  They write you letters (yes, letters on paper do exist), they send you emails, they call your call centers and even participate in surveys you conduct … Again I ask, what are you doing about it?

How are you translating all that information across all those input channels into action?  All of that content (you already have) in the form of customer interactions is just waiting to be leveraged (hhmmmm).

In three separate “C” Level studies (CIO, CFO, CEO) … the number one executive imperative was to “Reinvent Customer Relationships”.  Across the three studies, key findings were to:

  • Get closer to customers (top need)
  • Better understand what customers need
  • Deliver unprecedented customer service

Can anyone think of a better way to accomplish this than by examining all of that customer interaction based content to enable you to do something about it?  I bet there are loads of trends, patterns and new insights just waiting to be explored and discovered in those interactions … something demanding your attention and needing action.  This is one of the thoughts I had in mind when I blogged about “Content at Rest or Content in Motion? Which is Better?” a few weeks ago.  Clearly, identifying customer satisfaction trends about products, services and personnel is critical to any business.

The Hertz Corporation is doing this today.  They are using IBM Content Analytics software to examine customer interaction based content to better identify car and equipment rental performance levels for pinpointing and making the necessary adjustments to improve customer satisfaction levels.  Insights derived from enterprise content enable companies like Hertz to drive new marketing campaigns or modify their products and services to meet the demands of their customers.

“Hertz gathers an amazing amount of customer insight daily, including thousands of comments from web surveys, emails and text messages. We wanted to leverage this insight at both the strategic level and the local level to drive operational improvements,” said Joe Eckroth, Chief Information Officer, the Hertz Corporation.

Hertz isn’t just listening … they are taking action … by putting their content in motion.

Again I ask, what are you doing about it?  Why not test drive Hertz’s idea in your business?  You’ve already got the content to do so.

I welcome your input as always.  I recently bylined articles on Hertz and IBM Content Analytics for ibm.com and CIO.com entitled  “Insights into Action – Improving Service by Listening to the Voices of your Customers”.  For a more detailed profile on ICA at Hertz visit: http://www-03.ibm.com/press/us/en/pressrelease/32859.wss

IBM … 100 Years Later

Nearly all the companies our grandparents admired have disappeared.  Of the top 25 industrial corporations in the United States in 1900, only two remained on that list at the start of the 1960s.  And of the top 25 companies on the Fortune 500 in 1961, only six remain there today.  Some of the leaders of those companies that vanished were dealt a hand of bad luck.  Others made poor choices. But the demise of most came about because they were unable simultaneously to manage their business of the day and to build their business of tomorrow.

IBM was founded in 1911 as the Computing Tabulating Recording Corporation through a merger of four companies: the Tabulating Machine Company, the International Time Recording Company, the Computing Scale Corporation, and the Bundy Manufacturing Company.  CTR adopted the name International Business Machines in 1924.  The distinctive culture and product branding has given IBM the nickname Big Blue.

As you read this, IBM begins its 101st year.  As I look back at the last century, the path that led us to this remarkable anniversary has been both rich and diverse.  The innovations IBM has contributed include products ranging from cheese slicers to calculators to punch cards – all the way up to game-changing systems like Watson.

But what stands out to me is what has remained unchanged.  IBM has always been a company of brilliant problem-solvers.  IBMers use technology to solve business problems.  We invent it, we apply it to complex challenges, and we redefine industries along the way.

This has led to some truly game-changing innovation.  Just look at industries like retail, air travel, and government.  Where would we be without UPC codes, credit cards and ATM machines, SABRE, or Social Security?  Visit the IBM Centennial site to see profiles on 100 years of innovation.

We haven’t always been right though … remember OS/2, the PCjr and Prodigy?

100 years later, we’re still tackling the world’s most pressing problems.  It’s incredibly exciting to think about the ways we can apply today’s innovation – new information based systems leveraging analytics to create new solutions, like Watson – to fulfill the promise of a Smarter Planet through smarter traffic, water, energy, and healthcare.  This promise of the future … is incredibly exciting and I look forward to helping IBM pave the way for continued innovation.

Watch the IBM Centennial film “Wild Ducks” or read the book.  IBM officially released a book last week celebrating the Centennial, “Making the World Work Better: The Ideas that Shaped a Century and a Company”.  The book consists of three original essays by leading journalists. They explore how IBM has pioneered the science of information, helped reinvent the modern corporation and changed the way the world actually works.

As for me … I’ve been with IBM since the 2006 acquisition of FileNet and am proud to be associated with such an innovative and remarkable company.