A Quick Confidence Heuristic

Let’s say you have well-informed opinions on a variety of topics. Without information about your long term accuracy in each given area, how confident should you be in those opinions?

Here’s a quick heuristic, for any area where other people have well-informed opinions about the same topics; your confidence should be a function of the distance of your estimate from the average opinion, and the standard deviation of those opinions. I’ll call this the wisdom-of-crowds-confidence level, because it can be justified based on the empirical observation that the average of even uninformed guesses is typically a better predictor than most individual predictions.

Why does this make sense?

The Aumann agreement theorem implies that rational discussants can, given enough patience and introspection, pass messages about their justifications until they eventually converge. Given that informed opinions share most evidence, the differential between the opinions is likely due to specific unshared assumptions or evidence. If that evidence were shared, unless the vast majority of the non-shared assumptions were piled up on the same side, the answer would land somewhere near the middle. (This is why I was going to call the heuristic Aumann-confidence, but I don’t think it quite fits.)

Unless you have a strong reason to assume you are a privileged observer, trading on inside information or much better calibrated than other observers, there is no reason to expect this nonshared evidence will be biased. And while this appears to contradict the conservation of expected evidence theorem, it’s actually kind-of a consequence of it, because we need to update on the knowledge that there is unshared evidence leading the other person to make their own claim.

This is where things get tricky — we need to make assumptions about joint distributions on unshared evidence. Suffice it to say that unless we have reason to believe our unshared evidence or assumptions is much stronger than theirs, we should end up near the middle. And that goes back to a different, earlier assumption – that others are also well informed.

Now that we’ve laid out the framework, though, we can sketch the argument.

We can expect that our opinion should shift towards the average, once we know what the average is, even without exploring the other people’s unshared assumptions and data. The distance it should shift depends on how good our assumptions and data are compared to theirs.
Even if we have strong reasons for thinking that we understand why others hold the assumptions they do, they presumably feel the same way about us.
And why do you think your unshared evidence and assumptions are so great anyways, huh? Are you special or something?

Anyways, those are my thoughts.

Comments?

(Originally posted here on LessWrong)

Chasing Superior Good Syndrome vs. Baumol’s (or Scott’s) Cost Disease

Slatestarcodex had an excellent (as always) piece on “Considerations on Cost Disease.” It goes over a number of reasons, aside from Baumol’s cost disease, for why everything in certain sectors, namely healthcare and have gotten much more expensive. I think it misses an important dynamic, though, that I’d like to lay out.

First, though, he has a list of eight potential answers, each of which he partly dismisses. Cost increases are really happening, and markets mostly work, so it’s not simply a market failure. Government inefficiency and overregulation doesn’t really explain large parts of the problem, nor do fear of lawsuits. Risk tolerance has decreased, but that seems not to have been the sole issue. Cost shirking by some people might increase costs a bit, but that isn’t the whole picture. Finally, not on that list but implicitly explored when Scott refers to “politics,” is Moloch.

I think it’s a bit strange to end a piece with a long list of partial answers, which plausibly explain the vast majority of the issue with “ What’s happening? I don’t know and I find it really scary.” But I think there is another dynamic that’s being ignored — and I would be surprised if an economist ignored it, but I’ll blame Scott’s eclectic ad-hoc education for why he doesn’t discuss the elephant in the room — Superior goods.

Superior Goods

For those who don’t remember their Economics classes, imagine a guy who makes $40,000/year and eats chicken for dinner 3 nights a week. He gets a huge 50% raise, to $60,000/year, and suddenly has extra money to spend — his disposable income probably tripled or quadrupled. Before the hedonic treadmill kicks in, and he decides to waste all the money on higher rent and nicer cars, he changes his diet. But he won’t start eating chicken 10 times a week — he’ll start eating steak. When people get more money, they replace cheap “inferior” goods with expensive “superior” goods. And steak is a superior good.

But how many times a week will people eat steak? Two? Five? Americans as a whole got really rich in the 1940s and 1950s, and needed someplace to start spending their newfound wealth. What do people spend extra money on? Entertainment is now pretty cheap, and there are only so many nights a week you see a movie, and only so many $20/month MMORPGs you’re going to pay for. You aren’t going to pay 5 times as much for a slightly better video game or movie — and although you might pay double for 3D-Imax, there’s not much room for growth in that 5%.

The Atlantic had a piece on this several years ago, with the following chart;

Food, including rising steak consumption, decreased to a negligible part of people’s budgets, as housing started rising.In this chart, the reason healthcare hasn’t really shot up to the extent Scott discussed, as the article notes, is because most of the cost is via pre-tax employer spending. The other big change the article discusses is that after 1950 or so, everyone got cars, and commuted from their more expensive suburban houses — which is effectively an implicit increase in housing cost.

And at some point, bigger houses and nicer cars begin to saturate; a Tesla is nicer than my Hyundai, and I’d love one, but not enough to upgrade for 3x the cost. I know how much better a Tesla is — I’ve seen them.

Limitless Demand, Invisible Supply

There are only a few things that we have a limitless demand for, but very limited ability to judge the impact of our spending. What are they?

I think this is one big missing piece of the puzzle; in both healthcare and education, we want improvements, and they are worth a ton, but we can’t figure out how much the marginal spending improves things. So we pour money into these sectors.

Scott thinks this means that teachers’ and doctors’ wages should rise, but they don’t. I think it’s obvious why; they supply isn’t very limited. And the marginal impact of two teachers versus one, or a team of doctors versus one, isn’t huge. (Class size matters, but we have tons of teachers — with no shortage in sight, there is no price pressure.)

What sucks up the increased money? Dollars, both public and private, chasing hard to find benefits.

I’d spend money to improve my health, both mental and physical, but how? Extra medical diagnostics to catch problems, pricier but marginally more effective drugs, chiropractors, probably useless supplements — all are exploding in popularity. How much do they improve health? I don’t really know — not much, but I’d probably try something if it might be useful.

I’m spending a ton of money on preschool for my kids. Why? Because it helps, according to the studies. How much better is the $15,000/year daycare versus the $8,000 a year program a friend of mine runs in her house? Unclear, but I’m certainly not the only one spending big bucks. Why spend less, if education is the most superior good around?

How much better is Harvard than a subsidized in-state school, or four years of that school versus 2 years of cheap community college before transferring in? The studies seem to suggest that most of the benefit is really because the kids who get into the better schools. And Scott knows that this is happening.

We pour money into schools and medicine in order to improve things, but where does the money go? Into efforts to improve things, of course. But I’ve argued at length before that bureaucracy is bad at incentivizing things, especially when goals are unclear. So the money goes to sinkholes like more bureaucrats and clever manipulation of the metrics that are used to allocate the money.

As long as we’re incentivized to improve things that we’re unsure how to improve, the incentives to pour money into them unwisely will continue, and costs will rise. That’s not the entire answer, but it’s a central dynamic that leads to many of the things Scott is talking about — so hopefully that reduces Scott’s fears a bit.

Deceptive Dataviz and Confusing Data; Uncomparables in Education

I don’t actually want to talk about dataviz here, I want to talk about the data that is visualized. I routinely see graphs that are not (necessarily) bad as misleading graphs, but is bad data to be presented in a graph. There are plenty of examples of unreasonably constrained axes, or simply incorrect bar heights — but that’s not the problem for today.

Today, I want to give an example of data that is displayed as if the information is comparable, when it isn’t – like dollars and scores, or percentage improvement versus totals. What do I mean? I have a great example!

It might be worth noting that scores are the only data being normalized to the improving international standard…pic.twitter.com/88sbi1nG3y

— David Manheim (@davidmanheim) February 8, 2017

This graph is a masterpiece of the errors I am talking about. And it seems the very recently deceased Dr. Coulson is being maligned by a wiki article on Cato attributing this graph to him. (At the very least, the original seems to have kept dollars and percentages separate.) This graph tries hard to make incomparable data comparable, by displaying percentage change of a variety of incomparable datasets — which is better than showing the comparable raw data, right?

Well, no. At least not here. But why are they incomparable?

First, we have NAEP scores, which are inconsistently measured over the time period; the meaning of the metric changed repeatedly over the time period displayed, as academic standards have been altered to reflect the changing abilities and needs of students.

They are also scores, and as I’m sure everyone is aware, the difference between a 1300 and an 1400 on the SAT is much smaller than the difference between a 1500 and a 1600. Percentage improvements on these tests are not a great comparison. They are also a range-bound number; the scores are in the range 0–500, so that doubling the math scores is not only not linear, but in most cases literally impossible; it’s already around 300.

Next, the basis for all of these numbers is non-constant, in an interesting way. The chart presents enrollment as a total, but ignores the changing demographic mix — and no, this isn’t about the soft bigotry of low expectations, it’s about the expanding school population. Expanding? Yes — because the number is constant, but the total is shrinking. (Chart by Bill McBride)

The 1970s were the height of the baby boom — and the percentage of people who were going to school was still on an upwards trend;

The totals were flat, but the demographic split wasn’t, and the percentage of low achievers, who are the least likely to attend, is increasing. And the demographic composition of schools matters. But I won’t get into divergent birth rates and similar demographic issues any further for now.

But what about cost? I mean, clearly that can’t be deceptive — we’re spending more, because we keep hiring more teachers, like the chart seems to show! But we aren’t —teachers only increased by 50% in that time, not nearly 100%. But the chart isn’t wrong — they’re hiring more staff (largely to deal with regulations, as I’m sure Cato would agree.)

And this also explains why total cost went up — we have way more non-teacher staff, many of whom are much more expensive. We’re also neglecting the fact that the country is richer, and as a share of GDP, we’re way behind, because we pay teachers the same amount, but the economy as a whole grew. But that’s a different issue.

So yes, we can show a bunch of numbers correctly on a chart, but it won’t mean what it looks like if we’re sloppy — or purposefully misleading.