Innovation in the Cognitive Era

Throughout the industrial age, there have been breakthrough innovations that changed everything.

Way back when … most manufactured products were made by hand. A craftsman, or a team of craftsmen, would use their skills and hand tools to create the individual parts before assembling them into the final product. The process was very labor intensive. The assembly line, institutionalized by Henry Ford, changed everything. It also catapulted Ford to market leadership with the Model T and re-shaped the automobile industry … as well as the way all products are manufactured.

American railroads were originally planned to serve cities and their surrounding areas. It didn’t initially occur to city planners, or the railroad builders, that these networks might eventually need to connect with one another. This led to a mishmash of track and rail sizes (or gauges, in railroad speak) … none of which were compatible. While some standardization was inevitable, by the 1870s there were still over twenty different gauges in use in America. This stalled growth and hurt the industry’s ability to expand. Railroad standardization did eventually change everything, and traveling by train became the de facto way people traveled long distances … at least until other travel innovations disrupted the railroad industry. Many supplemental innovation opportunities were created (such as luxury Pullman rail cars) as the transportation industry innovated its way forward, eventually adding new ways to travel (airplanes and automobiles).

Before the telephone and wireless radio (yes, this was also before fax machines), the only way to send messages was by telegraph. The telegraph was a hard-wired connection of send/receive points. I guess the use of smoke signals and homing pigeons had other limitations … like being too messy. Ever clean up after a bunch of pigeons?

The telegraph was the primary form of long-distance communication for the better part of a century … and Morse code was the language of the telegraph. Morse code was a system of dots (shown as asterisks below) and dashes that, when combined, formed letters, words and sentences. As you can imagine, it was highly inefficient. One of the last messages sent from the Titanic was:

**-* ** -* *

*** **** ** *--*,

**-* ** -* *

***- --- -*-- *- --* *.

Translated to English … it says, “FINE SHIP, FINE VOYAGE.”

I wonder, when the ship started sinking, how many messages didn’t get sent because the Morse coding and reassembly process was so cumbersome, time-consuming and error-prone.
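
To make the point concrete, here is a minimal sketch of Morse encoding in Python (my own illustration, not anything from the telegraph era); the lookup table only covers the letters needed for this example.

```python
# Minimal Morse encoder sketch: dots and dashes per letter,
# spaces between letters, " / " between words.
MORSE = {
    "F": "..-.", "I": "..", "N": "-.", "E": ".",
    "S": "...", "H": "....", "P": ".--.",
    "V": "...-", "O": "---", "Y": "-.--", "A": ".-", "G": "--.",
}

def encode(message: str) -> str:
    words = message.upper().split()
    return " / ".join(
        " ".join(MORSE[ch] for ch in word if ch in MORSE) for word in words
    )

print(encode("FINE SHIP FINE VOYAGE"))
# ..-. .. -. . / ... .... .. .--. / ..-. .. -. . / ...- --- -.-- .- --. .
```

Even a four-word pleasantry takes dozens of taps to send and decode … now imagine doing that by hand, under pressure, while the ship is going down.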

When Alexander Graham Bell invented the telephone … you guessed it … it changed everything. Many supplemental innovation opportunities were also created including wireless radio, broadcast television and every facet of communications.

These kinds of “change everything” opportunities don’t happen that often. So recognizing one as an innovation and business opportunity … may be more important than the innovation itself. After all, Ford, Bell and the railroads only reaped a small fraction of the spoils. Entire massive industries in transportation and communications were created. The savvy innovators and intrapreneurs understood the follow-on opportunities being created and capitalized on them.

Well … most did. William Orton was not among them. He was the CEO of the Western Union Telegraph Company, which in 1876 had a monopoly on the most advanced communications technology available (the telegraph). Western Union was offered the patent on Bell’s invention (the telephone) for $100,000 (or about $2M in 2014 dollars). Orton considered the whole idea ridiculous and wrote directly to Alexander Graham Bell:

“After careful consideration of your invention, while it is a very interesting novelty, we have come to the conclusion that it has no commercial possibilities … What use could this company make of an electrical toy?”

Two years later, after the telephone began to take off, Orton realized the magnitude of his mistake and spent years (unsuccessfully) challenging Bell’s patents. Ooops!!!

The computing industry is about to undergo a once-in-a-generation innovation opportunity. We are entering the Cognitive Era of computing.

The current computing model has reached its limits. The next level of value can’t be unlocked by current approaches. Data already flows from every device, replacing guessing and approximations with precise information … yet 80% of this data is unstructured … and therefore invisible to computers and of limited use to business. There is simply too much information, and most of it is noise. We need the ability to know what is relevant and useful.

More importantly, why don’t computers learn, adapt, reason and apply information the way I do?

We got here by first building computers that solved basic problems and/or enabled us to do new things … before moving on to solving more advanced problems and/or raising the bar on our own ambition. It all got started in the late 19th century.

1890s – 1950s: The Tabulating Systems Era

  • Massive growth in people and things demanded single purpose systems that could count.
  • For the first time, a program like the US Social Security system was possible.

1950s – Today: The Programmable Systems Era

  • The increasing complexity of business and society demanded multipurpose systems that can apply logic to perform pre-programmed tasks.
  • For the first time, landing on the moon was possible.

Today: The Cognitive Systems Era

  • Continually changing scale and complexity now require real-time judgment from systems that can sense, learn and understand to help humans make decisions and take action.
  • With technology augmenting and extending human intelligence, it’s difficult to imagine what is not possible.

Welcome to the Cognitive Era … where cognitive systems can understand the world through sensing and interaction, reason using hypotheses and arguments, and learn from experts and through data:

  • Understand unstructured data, through sensing and interaction.
  • Reason about it by generating hypotheses, considering arguments, and making recommendations.
  • Learn from training by experts, from every interaction, and from continually ingesting data. In fact, they never stop learning.

This changes everything!

And like those other once in a generation innovation opportunities, now is the time to get involved. Start by educating yourself and begin experimenting. Look for cognitive innovation in these five areas:

  1. Deeper Human Engagement: Cognitive businesses create more fully human interactions with people—based on the mode, form and quality each person prefers. They take advantage of what is available today to create a fine-grained picture of individuals—geo-location data, web interactions, transaction history, loyalty program patterns, EMRs, data from wearables—and add to that picture details that have been hard or impossible to detect: tone, sentiment, emotional state, environmental conditions, strength and nature of a person’s relationships. They reason through the sum total of all this structured and unstructured data to find what really matters in engaging a person. By continuously learning, these engagements deliver greater and greater value, and become more natural, anticipatory and emotionally meaningful.
  2. Elevated Expertise: Every industry and profession’s knowledge is expanding at a rate faster than any professional can keep up with—journals, new protocols, new legislation, new practices, and entire new fields.
  3. Cognitive Products and Services: Cognition enables new classes of products and services to sense, reason and learn about their users and the world around them. This allows for continuous improvement and adaptation, and augments their ability to deliver products and services not previously imagined.
  4. Cognitive Processes and Operations: Cognition also transforms how a company operates and functions. Business processes infused with cognitive capabilities capitalize on the phenomenon of data, from internal and external sources. This gives them heightened awareness of workflows, context and environment—leading to continuous learning, better forecasting and operational effectiveness—along with decision-making at the speed of today’s data.
  5. Intelligent Exploration and Discovery: Ultimately, the most powerful tool that cognitive businesses will possess is far better “headlights” into an increasingly volatile and complex future. Such headlights are becoming more important, as leaders in all industries are compelled to place big bets—on drug development, on complex financial modeling, on materials science innovation, and on launching a startup. By applying cognitive technologies to vast amounts of data, leaders can uncover patterns, opportunities and actionable hypotheses that would be virtually impossible to discover using traditional research or programmable systems alone.

Innovators, entrepreneurs and intrapreneurs should all be licking their chops at the numerous ways to capitalize on this.

This does change everything! … hopefully you agree. For more from IBM on The Cognitive Era … check out the short video.

As always, leave me your thoughts and ideas here.

Why Bigger Should Always Be Faster … and Better

Let me say upfront that I was rooting for Goliath, not David.

I was recently asked to speak about some of IBM’s intrapreneurship initiatives at the upcoming Intrapreneurship Conference in New York during October 21-23. I have been conducting my own research on corporate entrepreneurship and have gotten to know the folks behind this organization … it should be a good event. I will be speaking on the first day and shared some thoughts on this in a recent interview.

As I reflected on the interview … it occurred to me how tired I am of all of the rhetoric in business publications these days about it being easy and commonplace for small innovative companies to disrupt large established ones. Some articles and books even pretend there’s a formula for doing this. It’s as if these much larger and proven companies are incompetent, have lost their way and are filled with unmotivated, slow-witted human zombie idiot robots. To think that David always slays Goliath is too idealistic. It might help sell books or increase readership … but it’s not a predictor of business success or outcomes. It’s also foolish to underestimate any competitor by reducing them to a cliché … especially the ones who can squash you. Has anyone noticed that Gillette didn’t lie down and die when Dollar Shave Club and Harry’s started subscription services to try to disrupt Gillette’s core profit source of razor blades?

In an entrepreneurial and intrapreneurial career that has spanned both start-ups and large corporations … I have held key roles in both types of companies, and can tell you that there are advantages and disadvantages to each.

Being fast, nimble and adaptable are essential traits when starting a new business or bringing a new innovative offering to market. These are even definitional attributes of start-ups. But no matter how nimble you are, you can’t birth a baby in one month by putting nine pregnant women on the job.

Large companies have significant and undeniable advantages over smaller (and allegedly more nimble) companies … notably resources and customers. The larger, the better. When mobilized properly, these advantages can be leveraged and rapidly applied in ways not possible by smaller, less resourced would-be competitors. Sure … large companies can be complex and have too much politics and red tape … but bring it on.  I’ll take money and customers every time.

The fact is, nothing can be as productive as working on an important initiative with a highly motivated and excited team of the most talented people you can imagine.

BOOM … and there it is. An Intrapreneurial Business Team. Done right, it’s like being on an all-star team … even exhilarating. You get to work with the best people and have access to subject matter expertise that start-ups can only dream of.

Where do you find Intrapreneurial Business Teams? In large companies, of course. It’s really the only way that a large matrixed organization can operate in a “start-up” like mode.

A small, empowered team approach is essential when siloed reporting, resource allocation and decision-making models are the norm. The typical large company model fundamentally disables a single person’s ability to lead all aspects of an innovation commercialization project.

The fundamental goals and skills are the same for both types and sizes of companies … but the execution model and processes needed are completely different.


Here is an overview of the similarities and differences of the two approaches:

Intrapreneurs – Similarities

•   Requires vision and strategy.

•   Needs leadership and strong execution to succeed.

•   Needs internal funding.

•   Similar “learning” process of validate, plan, build, launch and grow.

•   Opportunity driven.

Entrepreneurs – Similarities

•   Requires vision and strategy.

•   Needs leadership and strong execution to succeed.

•   Needs external funding.

•   Similar “learning” process of validate, plan, build, launch and grow.

•   Opportunity driven.


Intrapreneurs – Differences

•   Mostly fearful, including fear of failure, peer perception, embarrassment and confrontation.

•   Stakeholders’ motives are not just financial and include NIH syndrome, lack of alignment, skills, priorities or reward systems.

•   Has to navigate existing culture and processes … and may have little or no influence over this.

•   Has advantages/starting points – ability to leverage customers, assets, brand and track record.

•   Funding is NOT guaranteed once secured.

•   Depends on Team Based Leadership


Entrepreneurs – Differences

•   Mostly fearless and more likely to take risks, start over or adopt a pivot mentality.

•   Stakeholder motives are almost exclusively financial or performance related (keeping investors happy is a top priority).

•   Has to create a new culture from scratch and must build teams, processes and more.

•   Starting from a blank page without a track record – must secure customers and build trust.

•   Funding is guaranteed once obtained.

•   Depends on a Strong Individual Leader


By embracing these similarities and differences, large organizations can move as fast as or faster than start-ups. Importantly, start-ups should study the large companies they are taking aim at … before taking them on. Avoid the ones who are operating intrapreneurially as shown above.

Lastly, if you are a publicly traded company … your organization must be committed to these principles (from the top down). Public companies have a fundamental conflict of interest in that innovation projects are usually longer-term investments, often with unclear ROI. There is a natural tension between organic innovation investment and fiduciary shareholder budget responsibility … where innovation projects almost always lose out. Quarter-to-quarter financial decisions (cutbacks) have unintended downstream innovation consequences. Projects without a clear ROI, or without committed revenue, are usually the first place that cuts get made when the belt needs to be tightened. The larger the company, the more acute the problem. Watch out for this dynamic. It’s difficult to overcome without a top-down commitment to change and innovation commercialization. Shareholders are always sitting in the first chair. They are the people paying for Goliath’s projects and they expect a return (and soon).

I am definitely looking forward to speaking at the Intrapreneurship Conference. I’ll talk more about Intrapreneurial Business Teams and will feature the Intrapreneurship@IBM program … a program designed to foster corporate entrepreneurship and help bring IBM’s innovation to market.

I founded the Intrapreneurship@IBM program and community as well as the associated 8 Minute Pitch program. I will cover some successes, challenges and failures as well as our future plans for these programs. I will also cover a deeper set of findings from a benchmark survey I recently conducted with over 500 innovation professionals (both non-IBM and IBM respondents).

Lastly, IBM has set out on a “moonshot” attempt at transforming healthcare. Bringing our innovation to market is part of that strategy and Intrapreneurship is a key success factor of this initiative. I plan to cover some of our innovation in healthcare, including the innovative and world-renowned IBM Watson family of healthcare solutions.

I hope to see you in New York at the conference … and as always leave your thoughts and comments below.

Playing The Healthcare Analytics Shell Game

When I think of how most healthcare organizations are analyzing their clinical data today … I get a mental picture of the old Depression-era shell game – one that takes place in the shadows and back alleys. For many who were down and out, those games were their only means of survival.

The shell game (also known as Thimblerig) is a game of chance. It requires three walnut shells (or thimbles, plastic cups, whatever) and a small round ball, about the size of a pea, or even an actual pea. It is played on almost any flat surface. This conjures images of Depression-era men huddled together … each hoping to win some money to buy food … or support their vices. Can you imagine playing a shell game just to win some money so you could afford to eat? A bit dramatic I know – but not too far off the mark.

The person perpetrating the game (called the thimblerigger, operator, or shell man) started the game by putting the pea under one of the shells. The shells were quickly shuffled or slid around to confuse and mislead the players as to which shell the pea was actually under … and the betting ensued. We now know that the games were usually rigged. Many people were conned and never had a chance to win at all. The pea was often palmed or hidden, and not under any of the shells … in other words, there were no winners.

Many healthcare analytics systems and projects are exactly like that today – lots of players and no pea. The main component needed to win (or gain the key insight) is missing. The “pea” … in this case, is unstructured data. And while it’s not a con game, finding the pea is the key to success … and can literally be the difference between life and death. Making medical decisions about a patient’s health is pretty important stuff. I want my caregivers using all of the available and relevant information (medical evidence) as part of my care.

In healthcare today, most analytics initiatives and research efforts are done by using structured data only (which only represents 20% of the available data). I am not kidding.

This is like betting on a shell game without playing with the pea – it’s not possible to win and you are just wasting your money. In healthcare, critical clinical information (or the pea) is trapped in the unstructured data, free text, images, recordings and other forms of content. Nurse’s notes, lab results and discharge summaries are just a few examples of unstructured information that should be analyzed but in most cases … are not.

The reasons (for not doing this) used to be … it’s too hard, too complicated, too costly, not good enough or some combination of the above. This was a show-stopper for many.

Well guess what … those days are over. The technology needed to do this is available today and the reasons for inaction no longer apply.

In fact – this is now a healthcare imperative! Consider that over 80% of information is unstructured. Why would you even want to do analysis on only 1/5th of your available information?

I’ve written about the value of analyzing unstructured data in the past with Healthcare and ECM – What’s Up Doc? (part 1) and Healthcare and ECM – What’s Up Doc? (part 2).

Let’s look at the results from an actual project involving the analysis of both structured and unstructured data to see what is now possible (when you play “with the pea”).

Seton Family Healthcare is analyzing both structured and unstructured clinical (and operational) data today. Not surprisingly, they are ranked as the top health care system in Texas and among the top 100 integrated health care systems in the country. They are currently featured in a Forbes article describing how they are transforming healthcare delivery with the use of IBM Content and Predictive Analytics for Healthcare. This is a new “smarter analytics” solution that leverages unstructured data with the same natural language processing technology found in IBM Watson.

Seton’s efforts are focused on preventing hospital readmissions of Congestive Heart Failure (CHF) patients through analysis and visualization of newly created evidence based information. Why CHF?  (see the video overview)

Heart disease has long been the leading cause of death in the United States. The most recent data from the CDC shows that heart disease accounted for over 27% of overall mortality in the U.S. The overall costs of treating heart disease are also on the rise – estimated to have been $183 billion in 2009. This is expected to increase to $186 billion in 2023. In 2006 alone, Medicare spent $24 billion on heart disease. Yikes!

Combine those staggering numbers with the fact that CHF is the leading cause of readmissions in the United States. One in five patients suffers a preventable readmission, according to the New England Journal of Medicine. Preventable readmissions also represent a whopping $17.4 billion in expenditures from the current $102.6 billion Medicare budget. Wow! How can they afford to pay for everything else?

They can’t … beginning in 2012, those hospitals with high readmission rates will be penalized. Given the above numbers, it shouldn’t be a shock that the new Medicare penalties will start with CHF readmissions. I imagine every hospital is paying attention to this right now.

Back to Seton … the work at Seton really underscores the value of analyzing your unstructured data. Here is a snapshot of some of the findings:

The Data We Thought Would Be Useful … Wasn’t

In some cases, the unstructured data is more valuable and more trustworthy than the structured data:

  • Left Ventricle Ejection Fraction (LVEF) values are found in both places but originate in text based lab results/reports. This is a test measurement of how much blood your left ventricle is pumping. Values of less than 50% can be an indicator of CHF. These values were found in just 2% of the structured data from patient encounters and 74% of the unstructured data from the same encounters.
  • Smoking Status indicators are also found in both places. I’ve written about this exact issue before in Healthcare and ECM – What’s Up Doc? (part 2). Indicators that a patient was smoking were found in 35% of the structured data from encounters and 81% of the unstructured data from the same encounters. But here’s the kicker … the structured data values were only 65% accurate and the unstructured data values were 95% accurate.

You tell me which is more valuable and trustworthy.
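
To show what “finding the pea” can look like in practice, here is a minimal sketch of pulling LVEF values out of free-text note snippets with a regular expression. This is my own illustration with made-up snippets and thresholds, not Seton’s or IBM’s actual extraction pipeline (which uses full natural language processing rather than simple patterns).

```python
import re

# Hypothetical free-text snippets, like those found in lab reports or notes.
notes = [
    "Echo today. LVEF estimated at 35%, mild mitral regurgitation.",
    "Patient denies chest pain. Ejection fraction 60-65 percent.",
    "Follow-up in two weeks; no LVEF documented.",
]

# Match "LVEF" or "ejection fraction" followed by a number (optionally a range).
LVEF_PATTERN = re.compile(
    r"(?:LVEF|ejection fraction)\D{0,20}?(\d{1,2})(?:\s*-\s*\d{1,2})?\s*(?:%|percent)",
    re.IGNORECASE,
)

for note in notes:
    match = LVEF_PATTERN.search(note)
    if match:
        lvef = int(match.group(1))
        flag = "possible CHF indicator" if lvef < 50 else "within normal range"
        print(f"LVEF {lvef}% -> {flag}")
    else:
        print("No LVEF value found in this note")
```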

In other cases, the key insights could only be found in the unstructured data … as there was no structured data at all, or not enough to be meaningful. This is equally as powerful.

  • Living Arrangement indicators were found in <1% of the structured data from the patient encounters. It was the unstructured data that revealed these insights (in 81% of the patient encounters). These unstructured values were also 100% accurate.
  • Drug and Alcohol Abuse indicators … same thing … 16% and 81% respectively.
  • Assisted Living indicators … same thing … 0% and 13% respectively. Even though only 13% of the encounters had a value, it was significant enough to rank in the top 18 of all predictors for CHF readmissions.

What this means … is that without including the unstructured data in the analysis, the ability to make accurate predictions about readmissions is highly compromised. In other words, it significantly undermines (or even prevents) the identification of the patients who are most at risk of readmission … and the most in need of care. HINT – Don’t play the game without the pea.

New Unexpected Indicators Emerged … CHF is a Highly Predictive Model

We started with 113 candidate predictors from structured and unstructured data sources. This list was expanded when new insights were surfaced, like those mentioned above (and others). With the “right” information being analyzed, the accuracy is compelling … the predictive accuracy was 49% at the 20th percentile and 97% at the 80th percentile. This means predictions about CHF readmissions should be pretty darn accurate.
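
For readers who want a feel for how structured fields and unstructured notes can feed a single predictive model, here is a minimal sketch using scikit-learn. The data, features and model choice are all assumptions for illustration … this is not the actual IBM Content and Predictive Analytics pipeline or Seton’s model.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# Toy encounter data: structured fields plus a free-text note (all made up).
encounters = pd.DataFrame({
    "age": [71, 64, 80, 58],
    "prior_admits": [3, 0, 2, 1],
    "notes": [
        "LVEF 30 percent, lives alone, reports alcohol use",
        "LVEF 60 percent, lives with spouse, no tobacco",
        "jugular venous distention noted, assisted living facility",
        "no acute distress, good family support",
    ],
})
readmitted_30d = [1, 0, 1, 0]  # toy labels

# Combine numeric fields (passed through) with TF-IDF features from the notes.
features = ColumnTransformer([
    ("structured", "passthrough", ["age", "prior_admits"]),
    ("unstructured", TfidfVectorizer(), "notes"),
])

model = Pipeline([
    ("features", features),
    ("classifier", LogisticRegression(max_iter=1000)),
])

model.fit(encounters, readmitted_30d)
print(model.predict_proba(encounters)[:, 1])  # toy readmission risk scores
```

The point of the sketch is simply that the free-text column sits alongside the structured columns in one model … leave the notes out and the “pea” never makes it onto the table.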

18 Top CHF Readmission Predictors and Some Key Insights

The goal was not to find the top 18 predictors of readmissions … but to find the ones where taking a coordinated care approach makes sense and can change an outcome. Even though these predictors are specific to Seton’s patient population, they can serve as a baseline for others to start from.

  • Many of the highest indicators of CHF are not high predictors of 30-day readmissions. One might think LVEF values and Smoking Status are also high indicators of the probability of readmission … they are not. This could  only be determined through the analysis of both structured and unstructured data.
  • Some of the 18 predictors cannot impact the ability to reduce 30-day admissions. At least six fall into this category and examples include … Heart Disease History, Heart Attack History and Paid by Medicaid Indicator.
  • Many of the 18 predictors can impact the ability to reduce 30-day admissions and represent an opportunity to improve care through coordinated patient care. At least six fall into this category and examples include … Self Alcohol / Drug Use Indicator, Assisted Living Indicator, Lack of Emotion Support Indicator and Low Sodium Level Indicator. Social factors weigh heavily in determining those at risk of readmission and represent the best opportunity for coordinated/transitional care or ongoing case management.
  • The number one indicator came out of left field … Jugular Venous Distention Indicator. This was not one of the original 113 candidate indicators and only surfaced through the analysis of both structured and unstructured data (or finding the pea). For the non-cardiologists out there … this is when the jugular vein protrudes due to the associated pressure. It can be caused by a fluid imbalance or being “dried out”. This is a condition that would be observed by a clinician and would now be a key consideration in when to discharge a patient. It could also factor into any follow-up transitional care/case management programs.

But Wait … There’s More

Seton also examined other scenarios including resource utilization and identifying key waste areas (or unnecessary costs). We also studied Patient X – a random patient with 6 readmission encounters over an eight-month period. I’ll save Patient X for my next posting.

Smarter Analytics and Smarter Healthcare

It’s easy to see why Seton is ranked as the top health care system in Texas and among the top 100 integrated health care systems in the country. They are a shining example of an organization on the forefront of the healthcare transformation. The way they have put their content in motion with analytics to improve patient care, reduce unnecessary costs and avoid the Medicare penalties is something all healthcare organizations should strive for.

Perhaps most impressively, they’ve figured out how to play the healthcare analytics shell game and find the pea every time.  In doing so … everyone wins!

As always, leave me your comments and thoughts.

ECM Systems: Is Yours A Five Tool Player?

I grew up in Baltimore and baseball was my sport. I played Wiffle Ball in my backyard and Little League with my friends. It was all we ever talked and thought about. I played on all-star teams, destroyed my knees catching and worshipped the Orioles. And while I think Billy Beane’s use of analytics in “Moneyball” was absolute genius (read the book) … every good Orioles fan knows that starting pitching and three-run homers win baseball games … at least according to the Earl of Baltimore (sorry for the obscure Earl Weaver reference).

Brooks Robinson (Mr. Hoover) was my favorite player (only the greatest 3rd baseman of all time). I still have an autographed baseball he signed for me, as a kid, on prominent display in my office. I stood in line at the local Crown gas station for several hours with my Dad to get that ball.

But alas, baseball has fallen on hard times in Baltimore and even I had drifted away from the game. Good ole Brooksie was a fond nostalgic memory for me until the other day. This posting is not about baseball … it’s about ECM … really it is.

The recently concluded World Series is one of the most remarkable ever played. The late inning heroics in game six were amazing. Though neither team would give up, one had to prevail. Watching the end of that game got me thinking about ECM … no, really!

Baseball is a game that transfixes you when the ball is put into play … or in motion. And quite frankly, the game is pretty boring in between the action … or when things are at rest. So much so that the game is almost unwatchable unless things are in motion. The game comes alive with the tag-up on a sacrifice fly … or the stolen base … or a runner stretching a single into a double … or best of all, the inside-the-park homer. What do they all have in common? Action! Excitement! Motion!

No one really cares what happens between the pitches. Everyone wants the action. That’s why you pay the ticket price … to sit on the edge of your seat and wait for the ball to be put into play. The same is true for your enterprise content. It’s much more valuable when you put it into play … or in action. Letting your content sit idle is just driving up your costs (and risks too). Your goal should be to put it in motion. I recently wrote about this with Content at Rest or Content in Motion? Which is Better?.

However … putting your content in motion requires having the right tools. In baseball, the most coveted players are five tool players. They hit for average, hit for power, run the bases with speed, throw well, and field well.

The best ECM systems are also five tool players. They have five key capabilities. If you want the maximum value from your content, your ECM system must be able to:

1) Capture and manage content

2) Socialize content with communities of interest

3) Govern the lifecycle of content

4) Activate content through case centric processes

5) Analyze and understand content

I was lucky enough to have recently been interviewed by Wes Simonds who wrote a nice piece on these same five areas of value for ECM. These five tools are coveted, just like baseball. Why? Think about it … no one buys an ECM system unless they want to put their content in motion in one way or another.

Here’s the rub … far too often I see ECM practitioners who are only using one, or two, or maybe three, of their ECM capabilities even though they could be doing more. Why is this? It’s like being happy with a .220 batting average in baseball (or being a one or two tool player). No one gets a fat contract or goes to the Hall of Fame by hitting .220 and just keeping their head above the Mendoza line (another obscure baseball reference). Like in baseball, you need to use all five skills to get the big contracts … or get the maximum value from your ECM based information.

Brooks Robinson didn’t win a record 16 straight Gold Gloves, the Most Valuable Player Award or play in 18 consecutive All Star games because he had one or two skills. He was named to the All Century team and elected to the Hall of Fame on the first ballot with a landslide 92% of the votes because he put the ball in motion and made the most of the skills and tools he had.

It’s simple … those new to ECM should only consider systems with all five capabilities.

And today’s existing ECM practitioners should be promoting, using and benefiting from all five tools, not just a few. Putting content in motion with all five tools benefits your career and maximizes your ECM program. It enables your organization to get the maximum value from the 80% of your data that is unstructured content.

As always, leave your thoughts and comments here.

TV Re-runs, Watson and My Blog

When I was a wee lad … back in the 60s … I used to rush home from elementary school to watch the re-runs on TV.  This was long before middle school and girls.  HOMEWORK, SCHMOMEWORK !!!  … I just had to see those re-runs before anything else.  My favorites were I Love Lucy, Batman, Leave It To Beaver and The Munsters.  I also watched The Patty Duke Show (big time school boy crush) but my male ego prevents me from admitting I liked it.  Did you know the invention of the re-run is credited to Desi Arnaz?  The man was a genius even though Batman was always my favorite.  Still is.  I had my priorities straight even back then.

I am reminded of this because I have that same Batman-like re-run giddiness as I think about the upcoming re-runs of Jeopardy! currently scheduled to air September 12th – 14th.

You’ve probably figured out why I am so excited, but in case you’ve been living in a cave, not reading this blog, or both … IBM Watson competed (and won) on Jeopardy! in February against the two most accomplished Grand Champions in the history of the game show (Ken Jennings and Brad Rutter).  Watson (DeepQA) is the world’s most advanced question answering machine that uncovers answers by understanding the meaning buried in the context of a natural language question.  By combining advanced Natural Language Processing (NLP) and DeepQA automatic question answering technology, IBM was able to demonstrate a major breakthrough in computing.

Unlike traditional structured data, human natural language is full of ambiguity … it is nuanced and filled with contextual references.  Subtle meaning, irony, riddles, acronyms, idioms, abbreviations and other language complexities all present unique computing challenges not found with structured data.  This is precisely why IBM chose Jeopardy! as a way to showcase the Watson breakthrough.

Appropriately, I’ve decided that this posting should be a re-run of my own Watson and content analysis related postings.  So in the spirit of Desi, Lucy, Batman and Patty Duke … here we go:

  1. This is my favorite post of the bunch.  It explains how the same technology used to play Jeopardy! can give you better business insight today.  “What is Content Analytics?, Alex”
  2. I originally wrote this a few weeks before the first match was aired to explain some of the more interesting aspects of Watson.  10 Things You Need to Know About the Technology Behind Watson
  3. I wrote this posting just before the three day match was aired live (in February) and updated it with comments each day.  Humans vs. Watson (Programmed by Humans): Who Has The Advantage?
  4. Watson will be a big part of the future of Enterprise Content Management and I wrote this one in support of a keynote I delivered at the AIIM Conference.   Watson and The Future of ECM  (my slides from the same keynote are posted here).
  5. This was my most recent posting.  It covers another major IBM Research advancement in the same content analysis technology space.  TAKMI and Watson were recognized as part of IBM’s Centennial as two of the top 100 innovations of the last 100 years.  IBM at 100: TAKMI, Bringing Order to Unstructured Data
  6. I wrote a similar IBM Centennial posting about IBM Research and Watson.  IBM at 100: A Computer Called Watson
  7. This was my first Watson related post.  It introduced Watson and was posted before the first match was aired.  Goodbye Search … It’s About Finding Answers … Enter Watson vs. Jeopardy!

Desi Arnaz may have been a genius when it came to TV re-runs but the gang at IBM Research have made a compelling statement about the future of computing.  Jeopardy! shows what is possible and my blog postings show how this can be applied already.  The comments from your peers on these postings are interesting to read as well.

Don’t miss either re-broadcast.  Find out where and when Jeopardy! will be aired in your area.  After the TV re-broadcast, I will be doing some events including customer and public presentations.

On the web …

  • I will be presenting IBM Watson and the Future of Enterprise Content Management on September 21, 2011 (replay here).
  • I will be speaking on Content Analytics in a free upcoming AIIM UK webinar on September 30, 2011 (replay here).

Or in person …

You might also want to check out the new Smarter Planet interview with Manoj Saxena (IBM Watson Solutions General Manager).

As always, your comments and thoughts are welcome here.

A 124 Year Odyssey Involving Cases and Records Finally Ends

I first became aware of this matter about 10 years ago when I read a story about a woman named Josephine Wild Gun (yes, that is her name) who then lived in a small run-down house on the Blackfeet reservation in Montana. Like most of her Native American neighbors, she owned several parcels of reservation land that were being held in trust by the U.S. Government (Indian Trust Fund).  The Indian Trust Fund was created in 1887, as part of the Dawes Act, to oversee payments to Native Americans.  This fund managed nearly 10,000 acres on Josephine’s behalf, leasing the property to private interests for grazing and oil drilling fees.  In return, she was supposed to receive royalties from the trust fund.

Despite the lucrative leases, Josephine had allegedly never received more than $1,500 a year from the trust fund.  According to the story, the payments trickled off and one check totaled only 87 cents.  When her husband died, she even had to borrow money to pay for the funeral.  Josephine’s story is compelling … and it stuck with me.   This story, along with some research I was doing on the Cobell v. Salazar lawsuit (involving the same Indian Trust Fund) and the government’s inability to produce records documenting the income accounting of the payments to Josephine and about 300,000 other Native Americans, caused me to wonder how and why something like this could happen.

The 15-year-old class action lawsuit (Cobell v. Salazar) was recently settled for $3.4 billion.  I am writing about this today because hundreds of thousands of notices went out this week to American Indians who are affected by the $3.4 billion settlement, bringing an end to a 124-year odyssey involving The Department of the Interior, The Bureau of Indian Affairs and many Native Americans and their descendants.  In this suit, Elouise Cobell (a Native American and member of the Blackfeet tribe) sued the federal government over the mismanagement of the trust fund.  In her suit, Cobell claimed that the U.S. Government failed to provide a historical accounting of the money the government held in trust for Native American landowners in exchange for the leasing of tribal lands.  Ultimately, the case hinged on the government’s ability to produce these accounting records showing how the money was managed on behalf of the original landowners.  I find myself wondering if the whole thing could have been avoided with better case management and recordkeeping practices.  This 15-year court battle is the culmination of events going all the way back to the 19th century!  The landowners had a right to expect proper case management, proper records management and proper distribution of funds.  Apparently, none of those things happened.

As a history buff, I find the whole back story fascinating … so here we go …

It all starts with Henry Dawes (1816 – 1903), who was a Yale graduate from Massachusetts.  He was an educator, a newspaper editor, a lawyer and, perhaps somewhat infamously, a Congressman who served in both the U.S. House of Representatives (1857 to 1875) and the U.S. Senate (1875 to 1893).

During his time in public service, he had his ups and his downs.  In 1868, he received a large number of shares of stock from a railroad construction company as part of the Union Pacific railway’s influence-buying efforts.  On the positive side, Dawes was both a supporter of and involved with the creation of Yellowstone National Park.  He also had a role in promoting anti-slavery and reconstruction measures during and after the Civil War.  In the Senate, he was chairman of the Committee on Indian Affairs, where he concentrated on the enactment of laws that he believed were for the benefit of American Indians.

Dawes’s most noteworthy achievement was the passage of The General Allotment Act of 1887 (known as The Dawes Act referenced earlier).  The Dawes Act authorized the government to survey and inventory Indian tribal land and to divide the area into allotments for individual Indians.  Although later amended twice, it was this piece of legislation that set the stage for 124 years of alleged mismanagement and eventually the Cobell v. Salazar lawsuit.

I see this as a cautionary tale … reminding us of the need for enterprise content and case management as well as records management (but more on that later).  I wasn’t around, but I would imagine PCs ran pretty slowly back in 1887 (chuckle) … but I digress, as manual paper-based practices did exist.

Back to the story … The Dawes Commission was established under the Office of Indian Affairs to persuade American Indians to agree to the allotment plan.  Dawes himself later oversaw the commission for a period of time after his time as a Senator.  It was this same commission that registered and documented the members of the Five Civilized Tribes.  Eventually, The Curtis Act of 1898 abolished tribal jurisdiction over the tribes’ land and the landowners became dependent on the government.  Native Americans lost about 90 million acres of treaty land, or about two-thirds of the 1887 land base, over the lifespan of the Dawes Act.  Roughly 90,000 Indians were made landless and the Act forced Native people onto small tracts of land … in many cases, it separated families.  The allotment policy depleted the land base and also ended hunting as a means of subsistence.  In 1928, a Calvin Coolidge Administration study determined that The Dawes Act had been used to illegally deprive Native Americans of their land rights.  Today, The United States Department of the Interior is responsible for the remnants of The Dawes Act and the Office of Indian Affairs is now known as the Bureau of Indian Affairs.

There is a pretty big taxpayer bill about to finally be paid out ($3.4 billion) to the surviving Native American descendants and for other purposes.  Throughout the lifecycle of this case, there were multiple contempt charges, fines and embarrassing mandates resulting in the government’s reputation taking a significant hit.  Interior Secretary Bruce Babbitt and Treasury Secretary Robert Rubin were found in contempt of court for failing to produce documents and slapped with a $625,000 fine.  And while time went by and Administrations changed, not much else did when Interior Secretary Gale Norton and Assistant Interior Secretary of Indian Affairs Neal McCaleb were also held in contempt.  At one point, the judge also ordered the Interior Department to shut down most of its Internet operations after an investigator discovered that the department’s computer system allowed unauthorized access to Indian trust accounts.  During this time, many federal employees could not receive or respond to emails, and thousands of visitors to national parks were unable to make online reservations for campsites.  The shutdown also prevented the trust fund from making payments to more than 43,000 Indians, many of whom depended on the quarterly checks to make ends meet. In Montana and Wyoming, some beneficiaries were forced to apply for tribal loans to help them through the holidays.

There was plenty of mudslinging as well:

“Federal officials have spent more than 100 years mismanaging, diverting, and losing money that belongs to Indians,” says John Echohawk of the Native American Rights Fund, which directed the lawsuit.  “They have no idea how much has been collected from the companies that use our land and are unable to provide even a basic, regular statement to most Indian account holders.”

Again I ask … where was the accountability for these landowner cases and the associated records?  Could all of this have been prevented with better policies and processes?

The damage was already done but we know that the government invested in an array of systems such as Integrated Records Management System (IRMS), Trust Funds Accounting System (TFAS), Land Records Information System (LRIS) and Trust Asset and Accounting Management System (TAAMS).  These systems were to collect, manage and distribute trust funds in support of the 1994 Indian Trust Fund Management Reform Act.  They were used for historical accounting purposes and contained land ownership records and financial records for the associated cases.  A major premise of the government’s accounting effort was that the transition from paper to electronic records took the accuracy, completeness and reliability of the trust data to a level that far surpassed the “paper ledger era” … seems like it was too little too late.

I guess we’ll never know for sure, but I firmly believe that much, if not most, of this could have been avoided.  It was alleged during the case that as much 90 percent of the Indian Trust Fund’s records were missing, and the few that were available were in comically bad condition. An Interior Department report provided to the court refers to storage facilities plagued by problems ranging from “poisonous spiders in the vicinity of stored records” to “mixed records strewn throughout the room with heavy rodent activity.”

It’s a tragic story and I am glad it’s finally ending.  It’s disheartening that Josephine Wild Gun and many others had to suffer the way they did for the past 124 years.  It’s amazing the number of people that this impacted starting with Henry Dawes and ending with ~300,000 Native Americans (and everyone in between).  It’s encouraging to know that technologies like Enterprise Content Management, Advanced Case Management and Records Management can all be used with great impact in the future to improve processes and outcomes like this.

As always, leave me your thoughts and opinions here.

IBM at 100: TAKMI, Bringing Order to Unstructured Data

As most of you know … I have been periodically posting some of the really fascinating top 100 innovations of the past 100 years as part of IBM’s Centennial celebration.

This one is special to me as it represents what is possible for the future of ECM.  I wasn’t around for tabulating machines and punch cards but have long been fascinated by the technology developments in the management and use of content.  As impressive as Watson is … it is only the most recent step in a long journey IBM has been pursuing to help computers better understand natural language and unstructured information.

As most of you probably don’t know … this journey started over 50 years ago in 1957, when IBM published the first research on this subject, entitled A Statistical Approach to Mechanized Encoding and Searching of Literary Information.  Finally … something in this industry older than I am!

Unstructured Information Management Architecture (UIMA)

Another key breakthrough by IBM in this area was the invention of UIMA.  Now an Apache Open Source project and OASIS standard, UIMA is an open, industrial-strength platform for unstructured information analysis and search.  It is the only open standard for text-based processing and applications.  I plan to write more on UIMA in a future blog but I mention it here because it was an important step forward for the industry, Watson and TAKMI (now known as IBM Content Analytics).

TAKMI

In 1997, IBM researchers at the company’s Tokyo Research Laboratory pioneered a prototype for a powerful new tool capable of analyzing text. The system, known as TAKMI (for Text Analysis and Knowledge Mining), was a watershed development: for the first time, researchers could efficiently capture and utilize the wealth of buried knowledge residing in enormous volumes of text. The lead researcher was Tetsuya Nasukawa.

Over the past 100 years, IBM has had a lot of pretty important inventions but this one takes the cake for me.  Nasukawa-san once said,

“I didn’t invent TAKMI to do something humans could do, better.  I wanted TAKMI to do something that humans could not do.”

In other words, he wanted to invent something humans couldn’t see or do on their own … and isn’t that the whole point and value of technology anyway?

By 1997, text was searchable, if you knew what to look for. But the challenge was to understand what was inside these growing information volumes and know how to take advantage of the massive textual content that you could not read through and digest.

The development of TAKMI quietly set the stage for the coming transformation in business intelligence. Prior to 1997, the field of analytics dealt strictly with numerical and other “structured” data—the type of tagged information that is housed in fixed fields within databases, spreadsheets and other data collections, and that can be analyzed by standard statistical data mining methods.

The technological clout of TAKMI lay in its ability to read “unstructured” data—the data and metadata found in the words, grammar and other textual elements comprising everything from books, journals, text messages and emails, to health records and audio and video files. Analysts today estimate that 80 to 90 percent of any organization’s data is unstructured. And with the rising use of interactive web technologies, such as blogs and social media platforms, churning out ever-expanding volumes of content, that data is growing at a rate of 40 to 60 percent per year.

The key to this success was natural language processing (NLP) technology. Most data mining researchers were treating English text as a bag of words, extracting words from character strings based on white space. However, since Japanese text does not contain white space as a word separator, IBM researchers in Tokyo applied NLP to extract words, analyze their grammatical features, and identify relationships among words. Such in-depth analysis led to better results in text mining. That’s why the leading-edge text mining technology originated in Japan.
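
To illustrate the difference, here is a toy comparison (my own sketch, not the TAKMI implementation) of whitespace tokenization, which works for English, with a simple longest-match dictionary lookup of the kind a real morphological analyzer builds on for Japanese text, which has no spaces.

```python
# Whitespace tokenization works for English because spaces separate words.
english = "ejection fraction below normal"
print(english.split())  # ['ejection', 'fraction', 'below', 'normal']

# Japanese has no spaces, so splitting on whitespace returns one long string.
japanese = "東京研究所はテキストマイニングを開発した"
print(japanese.split())  # ['東京研究所はテキストマイニングを開発した']

# Toy longest-match segmentation against a tiny hand-made dictionary.
# Real morphological analyzers use large dictionaries plus grammar models.
DICTIONARY = {"東京", "研究所", "は", "テキスト", "マイニング", "を", "開発", "した"}

def segment(text: str) -> list[str]:
    tokens, i = [], 0
    while i < len(text):
        # Take the longest dictionary entry that matches at position i,
        # falling back to a single character if nothing matches.
        for length in range(len(text) - i, 0, -1):
            candidate = text[i:i + length]
            if candidate in DICTIONARY or length == 1:
                tokens.append(candidate)
                i += length
                break
    return tokens

print(segment(japanese))
# ['東京', '研究所', 'は', 'テキスト', 'マイニング', 'を', '開発', 'した']
```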

The complete article on TAKMI can be found at http://www.ibm.com/ibm100/us/en/icons/takmi/

Fast forward to today.  IBM has since commercialized TAKMI as IBM Content Analytics (ICA), a platform to derive rapid insight.  It can transform raw information into business insight quickly, without building models or deploying complex systems, enabling all knowledge workers to derive insight in hours or days … not weeks or months.  It helps address industry-specific problems such as healthcare treatment effectiveness, fraud detection, product defect detection, public safety concerns, customer satisfaction and churn, crime and terrorism prevention and more.

I’d like to personally congratulate Nasukawa-san and the entire team behind TAKMI (and ICA) for such an amazing achievement … and for making the list.  Selected team members who contributed to TAKMI are Tetsuya Nasukawa, Kohichi Takeda, Hideo Watanabe, Shiho Ogino, Akiko Murakami, Hiroshi Kanayama, Hironori Takeuchi, Issei Yoshida, Yuta Tsuboi and Daisuke Takuma.

It’s a shining example of the best form of innovation … the kind that enables us to do something not previously possible.  Being recognized along with other landmark achievements like the UPC code, the floppy disk, magnetic stripe technology, laser eye surgery, the scanning tunneling microscope, fractal geometry and human genome mapping is really amazing.

This type of enabling innovation is the future of Enterprise Content Management.  It will be fun and exciting to see if TAKMI (Content Analytics) has the same kind of impact on computing as the UPC code has had on retail shopping … or as laser eye surgery has had on vision care.

What do you think?  As always, leave me your thoughts and comments.

Other similar postings:

Watson and The Future of ECM

“What is Content Analytics?, Alex”

10 Things You Need to Know About the Technology Behind Watson

Goodbye Search … It’s About Finding Answers … Enter Watson vs. Jeopardy!