Bayes Watch

October 31, 2014

Say a coin is tossed 10 times, and each time it comes up heads.  What is the probability of heads on the next toss?  It might be tempting to say that the probability is low, since surely 11 heads in a row is extremely unlikely.  But the correct answer is 50%.

Or is it?  It turns out, it depends on what kind of statistics you rely on.

Mark Twain talked about three kinds of falsehood: lies, damned lies, and statistics.  What he didn’t point out is that there are actually different kinds of statistics, and they sometimes give different answers!  The traditional school, known as “frequentist” statistics, is based on the independence of events.  The chance of heads on any toss is completely unrelated to the prior results.  While 11 heads in a row is indeed extremely unlikely (about 1 in 2000), the chance of any one of those tosses being heads – even the last one after a string of other heads – is still 1 in 2.  Yet it still feels counterintuitive.  After all, I just saw 10 heads come up – how could there possibly be another?

An increasingly popular approach to statistics attempts to answer this.  Bayesian statistics, named for the Reverend Thomas Bayes[1], does not assume that the probability of an event is completely independent of prior events.  Instead, the expected probability of an event incorporates other known information, including other results up to that point.  In this case, if I am asked to estimate the chance of an 11th head, I would look at the string of 10 heads in a row and reasonably wonder if perhaps this is not a fair coin.  The answer to that question, in turn, would be based on other information.  How well do I know the person tossing the coin?  Did I have a chance to examine it beforehand?  If there is good reason to believe that the coin may in fact be biased, then I would have to conclude that the probability of a head coming up on the next toss is indeed higher than 50%.  Which is what it feels like intuitively.
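For readers who like to see the arithmetic, here is a minimal sketch of the coin example in Python.  The numbers in it are my own illustrative assumptions, not data: I suppose that before any tosses there is a 1-in-1000 chance the coin is a trick coin that always lands heads, and that otherwise it is fair.  The function names are made up for this sketch.

```python
from fractions import Fraction

FAIR = Fraction(1, 2)

def prob_all_heads(n_tosses, p_heads=FAIR):
    """Chance of n_tosses heads in a row when each toss is independent."""
    return p_heads ** n_tosses

def prob_next_head(prior_biased, n_heads_seen, p_heads_if_biased=Fraction(1)):
    """Chance the next toss is heads after seeing n_heads_seen heads in a row.

    prior_biased is the assumed (hypothetical) chance, before any tosses,
    that the coin is biased; p_heads_if_biased is how often such a coin
    lands heads. Pass Fractions to keep the arithmetic exact.
    """
    like_fair = FAIR ** n_heads_seen
    like_biased = p_heads_if_biased ** n_heads_seen
    evidence = prior_biased * like_biased + (1 - prior_biased) * like_fair
    # Bayes' theorem: updated belief that the coin is biased
    post_biased = prior_biased * like_biased / evidence
    # Average the two kinds of coin, weighted by the updated belief
    return post_biased * p_heads_if_biased + (1 - post_biased) * FAIR

print(prob_all_heads(11))                              # 1/2048
print(float(prob_next_head(Fraction(0), 10)))          # 0.5
print(float(prob_next_head(Fraction(1, 1000), 10)))
```

If I am certain the coin is fair (a prior of zero), the answer stays at exactly 50% no matter how many heads I see, which is the frequentist result.  With even a 1-in-1000 suspicion, ten straight heads push the estimate for the next toss to roughly 75%.  Change the assumed prior and the answer changes with it, which is exactly the point.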

While frequentist statistics is what is most commonly taught, most of us in reality behave like Bayesians.  We don’t simply ignore the string of 10 heads as being irrelevant.  While a frequentist would assume the coin is fair, the Bayesian at least asks the question when confronted with evidence that it might not be.

This shouldn’t be an excuse for complete subjectivity.  A true Bayesian approach is just as analytic and quantitative as a frequentist one, and in practice can be more complicated.  But it does more closely mirror the way our minds actually work.  Most physicians are Bayesians when it comes to diagnostic decision-making.  Here’s an example.  I see a child with abdominal pain, and am concerned about appendicitis.  At first, all I know is the age and gender – say, an 11 year old boy.  I know that approximately 10% of all 11 year old boys who come to the ER for belly pain will have appendicitis.  That seems high enough to worry about, but not high enough to go ahead and remove his appendix just yet.  So I go ahead and examine him.  He has a Pediatric Appendicitis Score of 3.  According to the research, half of patients with appendicitis would have a score of at least 3, and only 17% of the patients without appendicitis have a score that high.  Using Bayes’ theorem (look it up if you want, but trust me on the math), I can revise my estimate of the probability of this patient having appendicitis, knowing not only that he is an 11 year old boy, but an 11 year old boy with a PAS of 3.  His chance of appendicitis is no longer 10%, it is 25%, high enough that I should probably not just ignore it but do further tests.  Based on the results of those tests, I would again update my estimate of probability upward or downward.  On the other hand, with a score of 1, this boy’s chance goes from 10% to less than 2%, and I can reassure the family that we do not need to worry about it unless something changes.
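If you would rather not just trust me on the math, the appendicitis calculation can be sketched in a few lines of Python.  The three inputs are the figures quoted above (a 10% starting probability, a score of at least 3 in half of patients with appendicitis and in 17% of those without); the function name is simply one I chose for this illustration.

```python
def posterior(prior, p_pos_given_disease, p_pos_given_no_disease):
    """Bayes' theorem for a yes/no finding.

    prior: probability of disease before the finding is known
    p_pos_given_disease: how often patients with the disease show the finding
    p_pos_given_no_disease: how often patients without it show the finding
    """
    true_pos = prior * p_pos_given_disease
    false_pos = (1 - prior) * p_pos_given_no_disease
    return true_pos / (true_pos + false_pos)

# 11 year old boy with belly pain and a score of at least 3
p = posterior(prior=0.10, p_pos_given_disease=0.50, p_pos_given_no_disease=0.17)
print(f"{p:.1%}")  # 24.6%
```

That rounds to the 25% quoted above.  I have not reproduced the score-of-1 figure here, because it depends on how often a score that low is seen with and without appendicitis, and I did not give those numbers.  The same function can be applied again after each new test, with the previous answer becoming the new prior, provided the test results can reasonably be treated as independent of one another.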

This example illustrates something that may be surprising to non-medical professionals, who may think that tests can tell you whether someone does or does not have a disease.  This is almost never the case.  Every test can have false positive or false negative results.  They are simply one more piece of information that must be interpreted in light of what else we know.  A positive test generally increases the likelihood that someone has the condition we are checking for.  A good test will indicate a high enough probability to take action, while a not-so-good test leaves us sufficiently uncertain that we need more information.  And there are a lot of not-so-good tests out there.

Perhaps the biggest challenge to Bayesian analyses is the need for prior information.  Sometimes this might be based on good research or our own prior experience.  However, we must often make an educated guess.  In those cases, our judgment may be biased by many factors, including what I have previously referred to as availability bias (the tendency to be overly influenced by recent experience or information).  A child comes to the emergency department with a fever.  His mother recently returned from Africa.  What is the chance the child has Ebola?  Your first reaction might be a small but measurable number, say, a 1% or even a 5% chance.  In reality, we know very little: for example, whether or not the mother has been in a part of Africa affected by Ebola, when she was there, and whether she has had any symptoms and could therefore have transmitted the disease to the child.  Based on what we do know, our highest possible estimate (assuming she had fever and that her travel was in the past 21 days) would be the number of known Ebola patients in Africa (about 10,000) divided by the total population of Africa (a little over a billion), or 0.001%.  If we found out, for example, she had been in Guinea, we would change our estimate to a higher chance, while if she had been in South Africa it would be far lower.
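The back-of-the-envelope estimate above is easy to check.  This is only a sketch using the rounded figures from the paragraph (about 10,000 known cases, and a population I round down to one billion); the function name is hypothetical.

```python
def base_rate(known_cases, population):
    """Crude upper-bound prior: known cases divided by total population."""
    return known_cases / population

cases = 10_000
people = 1_000_000_000  # "a little over a billion", rounded down

rate = base_rate(cases, people)
print(f"{rate:.3%}")                  # 0.001%
print(f"1 in {people // cases:,}")    # 1 in 100,000
```

Compare that with a gut-feeling guess of 1%: the guess is a thousand times too high.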

And Mark Twain never heard of Ebola.


[1] An 18th century Presbyterian minister in England.  The apocryphal story is that he developed his theorem in an effort to prove the existence of God.


What Happened to Marcus Welby?

October 27, 2014

A few years ago, when a poll was released showing that Congress’ approval rating was at an all-time low of 9%, several commentators pointed out that it was substantially lower than Stalin’s approval rating in Russia.  Which provides a little context around a recent study supported by the Robert Wood Johnson Foundation and reported in the New England Journal of Medicine.  It showed that in response to the question, “All things considered, doctors in [your country] can be trusted,” only 58% of Americans agreed.  This placed us 24th of 29 countries surveyed, just above the former Stalinist countries of Bulgaria, Russia, and Poland.


This represents a significant decline from the 1960s.  It parallels a general distrust of the health care system (23% approval).  Interestingly, Americans are much more satisfied with their own doctor (ranked 3rd internationally) than with doctors in general, which mirrors the pattern of opinions about Congress as a whole (low approval) vs. one’s own representative (much higher).  Why the growing distrust of the medical profession?  Some of it is undoubtedly reflective of a generalized trend away from traditional deference to authority.  (I have no data to support this, but I’d guess there is an erosion of trust in the Encyclopedia Britannica, too.)  There has also been a de-deification of physicians.  Think of the contrast between Dr. Welby and Dr. House.  But another clue may come from looking at the results broken down by income.  Americans with above-average income rated doctors’ trustworthiness higher than those with below-average incomes.  This pattern was not seen in other countries.  Another study in the same issue of NEJM showed that access to healthcare is also sharply lower for low-income Americans, another disparity not seen elsewhere in the world.

Unfair as it may be, it seems likely that as Americans face more barriers to care due to rising cost and our lack of universal coverage, they are taking it out on all those who are seen to be benefitting from the system, including, unfortunately, doctors.  This is exacerbated by stories like the one in the New York Times about the unexpected bill for $117,000 for an assistant surgeon, or one of my buddies getting an $18,000 bill for his wife’s cataract surgery from a doctor he had never heard of.  Unfortunately, we physicians are seen as no better than those Party leaders who benefitted under Communism.  Or – gasp – than Congress.

The Price of a Scar

October 17, 2014

If we needed any more evidence that yes, cost matters to patients and families, a new study from Annals of Surgery should be a wake-up call.  Researchers from Primary Children’s Hospital in Utah approached the families of 100 children about to undergo surgery for acute appendicitis.  They were offered 2 options: the open procedure, or laparoscopic surgery.  Based on published evidence, they were told that the complication rates were similar, but that open appendectomy would result in a larger scar.  One half of the families were only given this clinical outcome information.  But the other half were also given information on the charges of the two different procedures: $2172 less for the open procedure.

We pediatric providers have long assumed that, when it comes to their children, particularly for something serious or with potential lasting consequences, parents would always pick what was best.  Cost would be no object.  Well, guess again.  In this study, families not shown the charge information chose the open procedure 35% of the time, while those who were aware of the charges chose the less expensive, bigger scar option 63% of the time.  Interestingly, this difference was independent of insurance type, deductible, or income.

I’m not suggesting that parents will take their kids to Walgreens for a heart transplant, or that many parents wouldn’t make extraordinary efforts to get what they think is best for their child.  But this study demonstrates that cost is a major factor.  When told the cost, they are willing to trade off some significant possible negatives – in this case, a larger permanent scar.  As one family member said, “Cost saving measures are a priority for me when it does not impact the safety of the patient.”  And this is an acute, potentially life-threatening condition, where the parents may weigh cost less due to the pressure of making a decision without time to really consider the alternatives or “comparison shop.”  Imagine how this might look for something completely elective like ear tubes.

A few important caveats.  First, the surgeons in this study could convincingly claim the complications would be expected to be similar because it would be the same team – surgeons, anesthesiologists, nurses, etc. – doing either procedure.  In the real world, parents need to choose between more dissimilar alternatives, such as a specialized children’s center with a full complement of sub-specialists vs. a lower volume community hospital with non-pediatric providers.  Second, parents were provided with the full cost of having the appendectomy: a bundled price for everything.  In reality, most people have a hard enough time finding out the price of each item or service used.  The move to price transparency can only work if hospitals and providers can show what the total cost to the family will be.  For example, our hospital has a reputation for being expensive, based on the price of some of our services.  But analyzing data reported to the state of Wisconsin, I was able to show that for children in the Milwaukee area, the least expensive average charge for an emergency department visit was at Children’s Hospital of Wisconsin!  I didn’t have the data to figure out why, but a very reasonable hypothesis based on other research on differences between general and pediatric EDs is that we do less testing and treatment than at other hospitals because of our greater expertise in dealing with children.  (I have long held that the key to being an excellent pediatric emergency physician is as much in knowing what not to do as what to do.)  Even if Children’s charges more (and I don’t know if this is actually true) for a CT scan of the head, a parent wanting to know the cost needs to understand that their child is far less likely to get one unnecessarily in our ED.

It’s all back to the value proposition.  People paying for health care – and increasingly that is families themselves – want a good outcome and good experience at a reasonable cost.  If we want to attract children to our hospital – and kids do deserve the best – we need to be able to demonstrate all parts of that value equation.  And what this Annals of Surgery study shows is that we can’t assume we know what parents will value.  Many of us would pay more for the smaller scar.  But what matters isn’t what we would do.  We can provide information, we can provide guidance; only the family can decide.

Common Ground  

October 14, 2014

Farmers vs. ranchers.  Jets vs. Sharks.  Arabs vs. Israelis.  Bourgeoisie vs. proletariat.  Packer fans vs. Viking fans.  Examples of seemingly unbridgeable gulfs abound in literature and life.  It’s sometimes difficult to picture these groups even talking to each other, much less connecting.  In the 1990s, books like Men Are From Mars, Women Are From Venus, and You Just Don’t Understand, popularized the notion that, because men and women see and process the world so differently, it creates inherent barriers to effective communication.  While criticized in some circles for over-generalization and stereotyping, the research behind these books supports the idea that differences in life experience can undermine meaningful dialogue and relationship-building between people.

New evidence shows that this is particularly true about class background.  In a series of studies, Stephanie Cote and Michael Kraus showed that interactions between people of different socioeconomic status were marked by verbal and non-verbal indications of lower degrees of engagement and emotional connection.

Think about the implications.  Many in the healthcare professions are at least in the middle class, while a large number of our patients and families are significantly less advantaged.  Does this interfere with our ability to bond with them, to empathize?  At times we have to admit it does.  Who hasn’t heard (and at times made) disparaging comments about “frequent flyers,” patients who are “non-compliant,” folks abusing the system?  This happens all too often.  Yet by and large, even those of us near the top of the economic ladder show amazing cognitive and emotional connection to those we care for.  How do we do it?

The answer, I think, comes from some of the same studies.  When participants were asked to interact with others of different background, their engagement and connectedness increased when they were first asked to identify points of commonality.  We see this when people of widely varying status come together in fellowship in places of worship (shared faith), or sports leagues or clubs (shared interests), or life-threatening emergencies (shared mortality and fate).  For us, I believe it is the kids, our value of purpose.  We caregivers and providers on the one hand, and families on the other, share an interest first and foremost in the child.  It’s when we forget that commonality that we fail to make a real connection, moving from curious to judgmental.

One of my favorite books, The Lemon Tree, tells the story of a Palestinian and an Israeli who bond over a shared love of a piece of property.  It shouldn’t be hard for each of us to try to find that one piece of common ground when we deal with families or colleagues who may be from such different circumstances that connecting is a challenge.  Even Packer and Viking fans can agree about the Bears.

Rolling Right Along

October 3, 2014

The results are in, and Wisconsin is the winner! The 2014 National Bike Challenge just ended, and our state edged out last year’s winner, Nebraska, with over 7800 participants (including 40 from Children’s Hospital!) pedaling 3.9 million miles, of which 70% were for recreation and 30% were for transport.  That’s 1.3 million miles of commuting and errands that might otherwise have required a car.  In Wisconsin alone, we kept 3.6 million pounds of carbon dioxide out of the atmosphere.

The environmental impact is one of many reasons I and others choose to try to get around as much as possible on two wheels.  Much of the year it’s just nice to be outside, and it can be a really relaxing way to unwind at the end of the day.  Of course, it’s also a good way to get in some exercise while also doing something useful (spoken like a true multi-tasker).  The Wisconsin contingent burned a collective 213 million calories cycling during the five months of the challenge.  Just think of all the deep fried cheese curds we could eat afterward….

People are catching on.  I’ve noticed the bike racks here at the hospital getting more and more full.  Nationally, miles driven are down, and the number of people bicycling to work increased in 85 of the 100 largest metro areas between 2000 and 2010.  According to the Guardian, not only are individual workers recognizing the benefits and switching, but businesses are finding that promoting cycling actually improves their bottom line.  Businesses with access to protected bike lanes (such as you find everywhere in Denmark and the Netherlands) have higher sales per parking spot (car vs. bike); real estate values are higher; and workers are healthier.

Some recent local developments could make the picture even brighter for cyclists.  The city of Milwaukee has been adding bike lanes, and Wauwatosa, as part of a comprehensive cycling and pedestrian plan, is adding high-visibility green bike lanes to North Ave.  And we are finally catching up, albeit slowly, with the bike-sharing trend in many cities.  Bublr, a Milwaukee bike share start-up, currently has 10 stations around the city, with plans to increase that to 100.  (Several locations on the Milwaukee Regional Medical Center campus and in the village of Wauwatosa are being considered.)

Yes, I know winter will be here before we know it (or want it), but there are still plenty of fall days left.  (And don’t rule out winter commuting.)  Give it a try.  We don’t want those Cornhuskers to catch us.

What’s the Value of Trainees?  

September 29, 2014

There are two especially awkward phases of life for most physicians: adolescence and residency.  Both are sort of in-between states, where you are not quite what you left behind but not yet fully what you are moving toward.  Is a resident a learner or a worker?  Depends on who you ask, and the answer has changed over time.  For example, when I was a resident we belonged to a union (!) – the Committee of Interns and Residents (CIR).  Except the CIR wasn’t a true union, because we were considered students rather than employees, and therefore not able to unionize.  At the same time, we were able to continue to defer payments on student loans because we were still “in school.”  Since then, the National Labor Relations Board has ruled that residents are actually employees and therefore entitled to organize (the CIR is now affiliated with the SEIU), while the IRS has ruled similarly, and residents must begin making student loan payments.  Win some, lose some.

The uncertainty carries over to the issue of federal funding for graduate medical education.  Currently Medicare, Medicaid, Veterans Affairs, and the states pay approximately $16 billion annually to hospitals to offset the cost of having residents and fellows.  Part of that covers the salaries and benefits of the trainees (direct GME), while the majority offsets the additional costs associated with medical training (indirect GME), such as lower productivity for supervising physicians, additional testing ordered by trainees, etc.  (I should note that this generally does not include pediatric residents and fellows, as children’s hospitals do not treat Medicare patients.  A separate, much smaller [$265 million] stream of Children’s Hospital GME funding is available, but unlike the Medicare money, it must be approved annually during the budget process.)

The rationale for this funding is that the training of physicians benefits society.  Teaching hospitals would have no financial incentive to train physicians who can, after all, go work anywhere when they are done.  Therefore, government should help pay for ensuring a supply of trained medical professionals.

Buried in a recent Institute of Medicine report on the state of graduate medical education, a small but notable group of health economists questioned that rationale.  They argue that residents provide a greater economic benefit to their hospitals than the salaries they receive; therefore, government GME funding is simply a subsidy of those hospitals.  The fact that most hospitals actually have more residents than they get funding for (the number was capped in the 1990s) is evidence that the hospitals must see them as a good investment.

If true, this might argue for using that $16 billion for other purposes, as those economists urge.  However, as I’ve already indicated, it’s not all that clear cut.  It is true that residents provide work that is of benefit to the hospitals that employ them as well as to the attending physician staff.  But much of this work takes the form of documenting and performing other tasks that can be – and in non-teaching hospitals, is – done by nurses or advanced practice providers.  And it isn’t clear that the work done by a resident provides more value than what could be done by these others, as the economists imply.  For one thing, residents rotate to different areas of the hospital each month, and often between hospitals.  There is a constant learning curve that in most cases sharply limits the benefit of the work compared with what you would get with a stable staff.  Moreover, the ratio of useful work increases with years of residency, but once residents enter their last (and most “productive”) year of training and really hit their stride, they leave.  In simple economic terms, most hospitals would actually be better off hiring non-residents for those tasks.

I do believe there is a unique value to a hospital of having physicians-in-training.  It’s not, as these economists argue, cheap labor.  Rather, it takes the form of the academic, intellectually challenging and stimulating environment that residents create.  It’s part of the reason I and many of my colleagues have always wanted to be at a teaching hospital.  That, however, is difficult to quantify.  In the current health care environment, with ever greater economic pressure, hospitals may be less willing to invest in such an intangible benefit without the GME funding.

Also, while it may be partly coincidental, teaching hospitals tend to be the care provider of last resort in a community.  The mission of caring for everyone regardless of ability to pay tends to go hand in hand with the education mission.  Part of the indirect cost of a teaching program is the large percentage of patients for whom the actual costs of care are not covered (Medicare, Medicaid, uninsured).  Yes, it’s a subsidy, but not for the bottom line of the hospital.  It’s a subsidy of the safety net we provide, masked as a subsidy for training future physicians.

There are certainly improvements we can make in the way GME is paid for.  For example, the program could do a better job of prioritizing undersupplied primary care fields (including pediatrics).  But arguing that GME funding is a form of corporate welfare for hospitals, and that the costs of training residents should be left to the marketplace, is not going to get us more of the right kinds of doctors, or better care for patients.

Picture This

September 22, 2014

A neurosurgeon, a minister, and a nurse walk into a bar…. (There’s no punch line, though I invite suggestions for a good joke.)  I’m willing to bet that in picturing the scenario the vast majority of you imagined two men and a woman.  I’d even go so far as to say that the woman was the nurse.  It’s an example of how our thinking is influenced by our most recent experiences.  If you work at Children’s Hospital of Wisconsin, the last nurse you met was very likely female, and the last neurosurgeon was certainly a male.  It is what is referred to in the psychology of decision making as the “availability heuristic”: when we make judgments without complete information, we tend to refer to our most recent experiences, relying on the information we have available by easy recall to fill in for the information that is missing.  (A heuristic is a mental shortcut – there are many types, this being just one.)  Not knowing the sex of the characters, I draw on the most recent prior information I have about the sex of a neurosurgeon or a nurse.

Shortcuts like this evolved as a way for our minds to function more efficiently.  When asked “Think of a common man’s name that starts with P,” it is far easier for me to conjure up the last man with that name that I interacted with (Peter) than to call up in my mind the complete list of men’s names beginning with P (Paul, Philip, Patrick, Pedro, Pradeep, etc.) and think about how many people have each of them.  In many circumstances, the availability heuristic works well and allows us to act on incomplete information.

You could argue that it’s simply a matter of playing the odds.  In the US, the majority of neurosurgeons and ministers are men, and the majority of nurses are women.  But research shows that we actually are not all that good at thinking statistically, and that playing the odds is often trumped by recent experience.  When recent experience is not representative of reality, this mental shortcut leads to bias.  For example, we recently had a patient in the ED who had just arrived from Liberia with high fever and upper respiratory symptoms.  Which is the most likely diagnosis: a) malaria, b) a cold, c) Ebola?  If Ebola even crossed your mind then you are displaying the availability bias; a cold is several orders of magnitude more likely based on actual prevalence.

Non-representative recent experience can steer us wrong in many ways.  It’s a common problem in medical diagnostic decision making, especially among non-experts.  I remember as a fellow seeing a teen with severe abdominal pain, to the point that he was irrational.  I had recently read about acute intermittent porphyria, which can cause abdominal pain and altered mental status, and promptly ordered a urine porphobilinogen level to test for it.  Never mind that it has an incidence of around 1 in 50,000.  Not only was I wrong, it delayed me from treating his pain and making the actual diagnosis (kidney stone, incidence about 1 in 10, though less common in teens).  I suspect the availability bias explains a good deal of the higher cost of care provided by medical trainees.  The first time a resident sees someone with a rare illness, they start to evaluate more patients for that problem.  It’s also a culprit in driving some utilization by patients.  When the media run sensational reports about uncommon conditions, people overestimate their risk and often seek unnecessary medical care.

The availability heuristic also leads to broader bias in society.  For instance, young blacks are arrested for marijuana possession at much higher rates than young whites, despite having a similar frequency of drug use.  Blacks thus have higher rates of incarceration, and news stories about drug arrests are much more likely to feature African-Americans.  As a result, people (both blacks and whites) overestimate the proportion of criminals that are black.  In one study, 60% of viewers of a crime story without a picture of the suspect falsely recalled seeing one, and 70% believed that the suspect was African-American.  After all, the last news story they saw about crime was likely to have featured a black suspect: availability bias.  Similarly, low income individuals are more likely to be prosecuted for child abuse, leading us to believe – incorrectly – that those who are more well off are unlikely to maltreat their children, and potentially missing an opportunity to intervene when necessary.

There are many examples of how our use of this mental shortcut can lead us not only to misrepresent how common or uncommon something is across a group, but also to misapply the most readily recalled information about groups to individuals.  Even when the most recent image is truly representative (e.g., most nurses at Children’s are female), it may not apply to a given individual.  (Just ask any of the 3 male nurses I worked with in the ED yesterday!)

The availability heuristic is just one of the filters we all see the world through.  Like other filters, it’s not necessarily either good or bad, but it is something to be aware of.  When we make a snap judgment without having all the information, we need to be aware that we are overly influenced by our most recent experience and by the way things are portrayed – correctly or not – in society at large, and be willing to reshape our initial image as we get more information.  And while some people cry “political correctness” when we use gender-neutral language or multiracial images, a non-biased environment is an important way to make our mental images more accurate.  I know more than a few women neurosurgeons, female ministers, and male nurses who would appreciate it.

