Healthcare Data is the New Oil: Delivering Smarter Care with Advanced Analytics

It has been said that “data” is the new “oil” of the 21st century.  That is certainly true in healthcare where a unique opportunity exists to leverage data – as fuel for better health outcomes.  Everything that happens with our health is documented … initially this was on paper … and more recently, in the form of electronic medical records.

Despite billions of incentive dollars being doled out by the federal government to purchase Electronic Medical Record (EMR) systems and use them in meaningful ways, there continues to be significant dissatisfaction with these systems.

In a recent Black Book Rankings survey, 80% of those surveyed claim their EMR solution does not meet the practice’s individual needs.  This is consistent with my own observations, where many express frustration that “the information goes in … but rarely, if ever, comes out”.

If the information never comes out, or it’s too hard to access, are we really maximizing its value?

It all boils down to our ability to leverage years and years of longitudinal patient population data to surface currently hidden insights … and put those insights to work to improve care.

It’s incredibly powerful to combine years of clinical patient population data (longitudinal patient histories) with other types of data such as social and lifestyle factors to surface new trends, patterns, anomalies and deviations.  These complex medical relationships (or context) trapped in the data are the key to identifying new ways to achieve better health outcomes.  Some organizations are already empowering physicians with these new insights.

Context can be critical in a lot of situations—but in healthcare, especially, it can be the difference between preventing a hospital readmission or not. It’s not enough, for example, to know that a patient has diabetes and smokes a pack of cigarettes each week. These factors are only part of the whole picture. Does she live on her own, with family or in a care facility? Does she have a knee injury that prevents her from an active exercise program? Has she been treated for any other illnesses recently? Did she experience a recent life-changing event, such as moving homes, getting a new job or having a baby? Is she able to cook meals for herself, does she rely on someone else to cook, or does she frequent cafeterias, restaurants or take-out windows?

All of these things and more can, and should, influence a patient’s care plan, because these are the factors that help determine which treatments will be most successful for each individual. And as our population grows and ages, a greater focus on individual wellness and increasing economic pressures are forcing providers, insurers, individuals and government agencies to find new ways to optimize healthcare outcomes while controlling costs.

Today’s data-driven healthcare environment provides the raw materials (or “oil”) to fuel this kind of personalized care, and make it cost-effective as well. But it takes savvy analysis to turn that data into the kind of reports and recommendations providers, patients and communities need to make informed decisions.

The good news: IBM is uniquely positioned to help organizations and individuals achieve these goals. The IBM® Smarter Care initiative draws on a comprehensive portfolio of advanced IBM technologies and services to help generate new patient insights that can improve the quality of care; facilitate collaboration among organizations, patients, government agencies and other groups; and promote wellness through a range of public health and social programs.

IBM Patient Care and Insights is a key component of the Smarter Care initiative. By incorporating advanced analytics with care management capabilities, Patient Care and Insights can produce valuable insights and enable holistic, individualized care.

Advanced analytics: Leading the way to Smarter Care

Several leading healthcare organizations are already on the path to Smarter Care and demonstrating the real-world benefits of advanced analytics from IBM. For example, in St. Louis, Missouri, BJC HealthCare, one of the largest nonprofit healthcare systems in the United States, is using natural language processing (NLP) and content analytics capabilities from IBM to extract information from patient records that is valuable for clinical research. By tapping into unstructured data, such as text-based doctors’ notes, BJC HealthCare is surfacing important social factors, demographic information and behavioral patterns that would otherwise be hidden from researchers.

BJC HealthCare is also using IBM technologies to reduce hospital readmissions for chronic heart failure (CHF). The organization is analyzing clinical data such as ejection fraction metrics (which represent the percentage of blood pumped out of the heart’s ventricle with each beat) to better predict which patients are most likely to be readmitted. These insights enable providers to implement tailored interventions that can avoid some readmissions.

The University of North Carolina (UNC) Health Care is using Patient Care and Insights for three new pilot projects. First, UNC is employing NLP and content analytics on free-text clinical notes to discover predictors of hospital readmission, identifying patients at risk and improving pre-admission prediction models.

UNC is also using IBM technology to empower patients. IBM NLP technology is helping to transform clinical data contained in electronic medical records (EMRs) into a format that can be presented to patients through an easy-to-use portal. Streamlined access to information will help patients make more informed decisions and encourage deeper participation in their own care.

Finally, UNC is using NLP to help generate alerts and reminders for physicians. With NLP, the organization is extracting key unstructured data from EMRs, such as abnormal cancer test results, and then storing this data in a structured form within a data warehouse. The structured data can then be used to produce alerts for prompt follow-up care.
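
To make that extract, store and alert pattern a little more concrete, here is a toy sketch in Python. To be clear, this is not IBM’s NLP technology or UNC’s implementation. A single regular expression stands in for real clinical NLP, and the note text, the field names, the choice of a BI-RADS mammography category as the “abnormal result” and the alert threshold of 4 are all my own assumptions for illustration.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical pattern: a toy stand-in for clinical NLP, not a product API.
BIRADS_PATTERN = re.compile(r"BI-?RADS\s*(?:category\s*)?(\d)", re.IGNORECASE)

# Assumed threshold for this sketch only; real alert rules are a clinical decision.
ALERT_THRESHOLD = 4


@dataclass
class StructuredResult:
    """One structured row, as it might be stored in a data warehouse."""
    patient_id: str
    birads_category: int
    needs_follow_up: bool


def extract_result(patient_id: str, note: str) -> Optional[StructuredResult]:
    """Pull a BI-RADS category out of free text, or return None if absent."""
    match = BIRADS_PATTERN.search(note)
    if match is None:
        return None
    category = int(match.group(1))
    return StructuredResult(patient_id, category, category >= ALERT_THRESHOLD)


def follow_up_alerts(notes: Dict[str, str]) -> List[str]:
    """Return the patient IDs whose extracted result calls for follow-up."""
    results = (extract_result(pid, text) for pid, text in notes.items())
    return [r.patient_id for r in results if r is not None and r.needs_follow_up]


# Invented example notes; not real patient data.
SAMPLE_NOTES = {
    "patient-001": "Screening mammogram reviewed. BI-RADS 4, suspicious finding.",
    "patient-002": "Mammogram unremarkable, BI-RADS category 1.",
    "patient-003": "Patient seen for knee pain. No imaging ordered.",
}

print(follow_up_alerts(SAMPLE_NOTES))  # prints ['patient-001']
```

The point of the sketch is the shape of the flow rather than the pattern matching: free text goes in, a structured record comes out, and the alert is driven by the structured record. Real clinical notes need far more than a regular expression (negation, abbreviations, context), which is exactly where NLP earns its keep.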

This is just the beginning. As organizations continue to launch new projects that capitalize on advanced analytics, case management and other technologies from IBM, we expect to see some very innovative approaches to delivering Smarter Care.

As always, share your comments or questions below.

Playing The Healthcare Analytics Shell Game

When I think of how most healthcare organizations are analyzing their clinical data today … I get a mental picture of the old Depression-era shell game – one that took place in the shadows and back alleys. For many who were down and out, those games were their only means of survival.

The shell game (also known as Thimblerig) is a game of chance. It requires three walnut shells (or thimbles, plastic cups, whatever) and a small round ball, about the size of a pea, or even an actual pea. It is played on almost any flat surface. This conjures images of Depression-era men huddled together … each hoping to win some money to buy food … or support their vices. Can you imagine playing a shell game just to win some money so you could afford to eat? A bit dramatic I know – but not too far off the mark.

The person perpetrating the game (called the thimblerigger, operator, or shell man) started the game by putting the pea under one of the shells. The shells were quickly shuffled or slid around to confuse and mislead the players as to which of the shells the pea was actually under … and the betting ensued. We now know that the games were usually rigged. Many people were conned and never had a chance to win at all. The pea was often palmed or hidden, and not under any of the shells … in other words, there were no winners.

Many healthcare analytics systems and projects are exactly like that today – lots of players and no pea. The main component needed to win (or gain the key insight) is missing.  The “pea” … in this case, is unstructured data. And while it’s not a con game, finding the pea is the key to success … and can literally be the difference between life and death. Making medical decisions about a patient’s health is pretty important stuff. I want my caregivers using all of the available and relevant information (medical evidence) as part of my care.

In healthcare today, most analytics initiatives and research efforts are done by using structured data only (which only represents 20% of the available data). I am not kidding.

This is like betting on a shell game without playing with the pea – it’s not possible to win and you are just wasting your money. In healthcare, critical clinical information (or the pea) is trapped in the unstructured data, free text, images, recordings and other forms of content. Nurses’ notes, lab results and discharge summaries are just a few examples of unstructured information that should be analyzed but in most cases … are not.

The reasons for not doing this used to be … it’s too hard, too complicated, too costly, not good enough or some combination of the above. This was a showstopper for many.

Well guess what … those days are over. The technology needed to do this is available today and the reasons for inaction no longer apply.

In fact – this is now a healthcare imperative! Consider that over 80% of information is unstructured. Why would you even want to do analysis on only 1/5th of your available information?

I’ve written about the value of analyzing unstructured data in the past with Healthcare and ECM – What’s Up Doc? (part 1) and Healthcare and ECM – What’s Up Doc? (part 2).

Let’s look at the results from an actual project involving the analysis of both structured and unstructured data to see what is now possible (when you play “with the pea”).

Seton Family Healthcare is analyzing both structured and unstructured clinical (and operational) data today. Not surprisingly, they are ranked as the top health care system in Texas and among the top 100 integrated health care systems in the country. They are currently featured in a Forbes article describing how they are transforming healthcare delivery with the use of IBM Content and Predictive Analytics for Healthcare. This is a new “smarter analytics” solution that leverages unstructured data with the same natural language processing technology found in IBM Watson.

Seton’s efforts are focused on preventing hospital readmissions of Congestive Heart Failure (CHF) patients through analysis and visualization of newly created evidence-based information. Why CHF?  (see the video overview)

Heart disease has long been the leading cause of death in the United States. The most recent data from the CDC shows that heart disease accounted for over 27% of overall mortality in the U.S. The overall costs of treating heart disease are also on the rise – estimated to have been $183 billion in 2009. This is expected to increase to $186 billion in 2023. In 2006 alone, Medicare spent $24 billion on heart disease. Yikes!

Combine those staggering numbers with the fact that CHF is the leading cause of readmissions in the United States. One in five patients suffers a preventable readmission, according to the New England Journal of Medicine. Preventable readmissions also represent a whopping $17.4 billion in expenditures from the current $102.6 billion Medicare budget. Wow! How can they afford to pay for everything else?

They can’t … beginning in 2012, those hospitals with high readmission rates will be penalized. Given the above numbers, it shouldn’t be a shock that the new Medicare penalties will start with CHF readmissions. I imagine every hospital is paying attention to this right now.
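
For anyone who wants to see what that Medicare figure means as a share of the budget, here is the arithmetic as a quick Python sketch. The two dollar amounts are the ones quoted above; the variable names are simply my own.

```python
# Figures quoted in this post, in billions of US dollars.
preventable_readmissions_billion = 17.4
medicare_budget_billion = 102.6

# Share of the Medicare budget going to preventable readmissions.
share = preventable_readmissions_billion / medicare_budget_billion

print(f"{share:.1%}")  # prints 17.0%
```

In other words, roughly 17 cents of every Medicare dollar in that budget goes to readmissions that could have been prevented.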

Back to Seton … the work at Seton really underscores the value of analyzing your unstructured data. Here is a snapshot of some of the findings:

The Data We Thought Would Be Useful … Wasn’t

In some cases, the unstructured data is more valuable and more trustworthy than the structured data:

  • Left Ventricular Ejection Fraction (LVEF) values are found in both places but originate in text-based lab results/reports. This is a test measurement of how much blood your left ventricle is pumping. Values of less than 50% can be an indicator of CHF. These values were found in just 2% of the structured data from patient encounters and 74% of the unstructured data from the same encounters.
  • Smoking Status indicators are also found in both places. I’ve written about this exact issue before in Healthcare and ECM – What’s Up Doc? (part 2). Indicators that a patient was smoking were found in 35% of the structured data from encounters and 81% of the unstructured data from the same encounters. But here’s the kicker … the structured data values were only 65% accurate and the unstructured data values were 95% accurate.

You tell me which is more valuable and trustworthy.

In other cases, the key insights could only be found from the unstructured data … as there was no structured data at all, or not enough to be meaningful. This is equally powerful.

  • Living Arrangement indicators were found in <1% of the structured data from the patient encounters. It was the unstructured data that revealed these insights (in 81% of the patient encounters). These unstructured values were also 100% accurate.
  • Drug and Alcohol Abuse indicators … same thing … 16% and 81% respectively.
  • Assisted Living indicators … same thing … 0% and 13% respectively. Even though only 13% of the encounters had a value, it was significant enough to rank in the top 18 of all predictors for CHF readmissions.

What this means … is that without including the unstructured data in the analysis, the ability to make accurate predictions about readmissions is highly compromised. In other words, it significantly undermines (or even prevents) the identification of the patients who are most at risk of readmission … and the most in need of care. HINT – Don’t play the game without the pea.
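
If you like to see the numbers side by side, here is a small Python sketch that lines up the coverage figures quoted above and computes the gap between the two sources for each indicator. The percentages come straight from this post; the dictionary layout and the function name are my own, and I have left out Living Arrangement because its structured coverage is only given as “<1%”.

```python
# Coverage figures quoted in this post: percentage of patient encounters in
# which each indicator was found, as (structured, unstructured).
# "Living Arrangement" is omitted because its structured value is only "<1%".
COVERAGE = {
    "LVEF": (2, 74),
    "Smoking Status": (35, 81),
    "Drug and Alcohol Abuse": (16, 81),
    "Assisted Living": (0, 13),
}


def coverage_gaps(coverage):
    """Return (indicator, gap in percentage points), largest gap first."""
    gaps = [
        (name, unstructured - structured)
        for name, (structured, unstructured) in coverage.items()
    ]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Print one row per indicator: structured %, unstructured %, and the gap.
for name, gap in coverage_gaps(COVERAGE):
    structured, unstructured = COVERAGE[name]
    print(f"{name:<24}{structured:>4}%{unstructured:>5}%{gap:>+5} pts")
```

LVEF comes out on top with a 72-point gap, followed by Drug and Alcohol Abuse at 65 points. Keep in mind this only measures how often a value is present, not how accurate it is or how well it predicts a readmission.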

New Unexpected Indicators Emerged … CHF is a Highly Predictive Model

We started with 113 candidate predictors from structured and unstructured data sources. This list was expanded when new insights were surfaced like those mentioned above (and others). With the “right” information being analyzed the accuracy is compelling … the predictive accuracy was 49% at the 20th percentile and 97% at the 80th percentile. This means predictions about CHF readmissions should be pretty darn accurate.

18 Top CHF Readmission Predictors and Some Key Insights

The goal was not to find the top 18 predictors of readmissions … but to find the ones where taking a coordinated care approach makes sense and can change an outcome. Even though these predictors are specific to Seton’s patient population, they can serve as a baseline for others to start from.

  • Many of the highest indicators of CHF are not high predictors of 30-day readmissions. One might think LVEF values and Smoking Status are also high indicators of the probability of readmission … they are not. This could only be determined through the analysis of both structured and unstructured data.
  • Some of the 18 predictors are not factors that care teams can act on to reduce 30-day readmissions. At least six fall into this category and examples include … Heart Disease History, Heart Attack History and Paid by Medicaid Indicator.
  • Many of the 18 predictors are actionable and represent an opportunity to reduce 30-day readmissions through coordinated patient care. At least six fall into this category and examples include … Self Alcohol / Drug Use Indicator, Assisted Living Indicator, Lack of Emotional Support Indicator and Low Sodium Level Indicator. Social factors weigh heavily in determining those at risk of readmission and represent the best opportunity for coordinated/transitional care or ongoing case management.
  • The number one indicator came out of left field … Jugular Venous Distention Indicator. This was not one of the original 113 candidate indicators and only surfaced through the analysis of both structured and unstructured data (or finding the pea). For the non-cardiologists out there … this is when the jugular vein protrudes due to the associated pressure. It can be caused by a fluids imbalance or being “dried out”. This is a condition that would be observed by a clinician and would now be a key consideration in deciding when to discharge a patient. It could also factor into any follow-up transitional care/case management programs.

But Wait … There’s More

Seton also examined other scenarios including resource utilization and identifying key waste areas (or unnecessary costs). We also studied Patient X – a random patient with six readmission encounters over an eight-month period. I’ll save Patient X for my next posting.

Smarter Analytics and Smarter Healthcare

It’s easy to see why Seton is ranked as the top health care system in Texas and among the top 100 integrated health care systems in the country. They are a shining example of an organization at the forefront of the healthcare transformation. The way they have put their content in motion with analytics to improve patient care, reduce unnecessary costs and avoid the Medicare penalties is something all healthcare organizations should strive for.

Perhaps most impressively, they’ve figured out how to play the healthcare analytics shell game and find the pea every time.  In doing so … everyone wins!

As always, leave me your comments and thoughts.