The “Truth Value” of Evidence

Exploring the use of statistical evidence in law and other forms of truth-finding

Teng Rong
Jul 5, 2019 · 10 min read

Suppose a crowd of 100 people were involved in a riot. All 100 were arrested for their involvement. Of the 100, we know that 99 people willfully damaged property. No other identifying information could be found.

Suppose that you are the judge for the trial of person X. The prosecutor presents the proof that 99 out of 100 people arrested that day damaged property, and infers from this fact that there is a 99% chance defendant X is guilty of the crime. Do you convict based on this evidence?

Your intuition might lean in either direction. If you lean towards a “yes”, then the reasons seem self-explanatory: there is an overwhelming body of evidence, “beyond a reasonable doubt” that X participated in the riot. After all, rarely in life can we be 99% sure of the truth of a claim. How can we justify not convicting given such strong probability? (We’ll call this Intuition 1)

However, what if your intuition leans in the direction of “no”? This intuition against statistical evidence exists despite the near-certainty that the claim is correct. Some may present this intuition in terms of falsifiability: the statistical evidence cannot be falsified. In this form, it follows that if the statistical evidence were allowed and all 100 defendants stood trial, all 100 would be convicted on the same evidence, which shows that only 99 of the 100 are guilty. You would be right to point out that this is a contradiction and thus the evidence should not be allowed. (We’ll call this Intuition 2)

Robert Nozick called this form of intuition against statistical evidence “sensitivity”:

S’s belief that p is sensitive if and only if, (if p were false, S would not believe that p).

In not-so-philosophical terms, this approximates the claim that a sensitive belief is not merely correct, but also responsive to the truth across different scenarios. Here, defendant X is either guilty or not guilty. If you use the 99% statistical evidence as proof to convict X, the conviction would be equally valid regardless of his actual guilt. This means that X’s conviction is not sensitive to his actual guilt.

So, which intuition is correct?

This actually turns out to be a very difficult and interesting problem. Consider this alternative scenario: there is one eyewitness watching the same riot. Eyewitness Y has agreed to testify at the trials of each of the defendants. Independent testing shows that eyewitness Y has an accuracy of 99% in positively identifying people that she saw one week prior. This is the only available identifying information.

Once again, you are presiding over the trial of defendant X. The prosecution questions the eyewitness and she testifies that she saw X destroying a mailbox. Absent any other evidence, do you convict?

Let us compare the responses to both examples from those who hold either intuition from the first example:

Intuition 1: This intuition convicts in the 1st example based on the likelihood. It also convicts in the 2nd example because the witness’s accuracy leaves no reasonable doubt as to the guilt of the accused.

Intuition 2: This intuition acquits in the 1st example based on the lack of falsifiability/sensitivity/individualization of the statistical evidence. It convicts on the witness testimony in the 2nd example, because the witness would not have testified against defendant X had he not destroyed property.

Through these examples we can see where these intuitions align and where they diverge. For the first intuition, the basis for conviction is the bare probability of the act having actually occurred. This intuition discounts the importance of non-trivial alternatives: in the first example, with the statistical evidence, there is a factually true, non-trivial alternative that X is the 1 in 100 who did not commit any crime. We can claim that intuition 1 selects for false positives.

For the second intuition, the basis for conviction is the absence of non-trivial, unfalsifiable alternatives. This intuition discounts the importance of bare probabilities: if a group of 10,000 rioters is rounded up, and we know only that one person committed no crimes, then despite the 99.99% chance that any single person is a criminal, intuition 2 would acquit everyone. We can claim that intuition 2 selects for false negatives.

I use the riot as an extreme example of the Paradox of the Gatecrasher to show how the intuition against statistical evidence can be strong even in extreme cases. L. Jonathan Cohen’s Paradox of the Gatecrasher is a famous paradox in legal philosophy that states (with some modification):

  1. There is a concert with 1000 attendees.
  2. Only 300 tickets have been sold, and ticket stubs are not issued, so there can be no separate proof of purchase.
  3. Management picks a person at random and sues them for not paying to see the concert.
  4. The balance of probabilities is essentially a p > 0.5 standard. The bare probability of the concertgoer having not paid, or “gatecrashed”, is 70% (700 of 1000).
  5. Therefore, a court should find this person liable to pay.

Like our two scenarios above, the gatecrasher paradox irks our intuitions about what types of evidence should be allowed in law and other forms of truth-finding. The only difference is that this paradox is designed for maximum irking. If these statistics are the only evidence available about whether a concertgoer paid their due, and the court accepted this evidence, then management can sue every person and win every time. Is this acceptable? Most will say no.

If you found the conclusion of the paradox, as presented, to be intuitively wrong, consider the alternative: an eyewitness with 70% reliability testifies against the defendant. If the court found for the concert management based on this testimony (as courts are almost certain to do), is this acceptable? Most will say yes.
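One way to see the asymmetry numerically is that testimony, unlike the bare statistic, can be combined with the base rate rather than replacing it. A rough Bayesian sketch of my own, assuming the witness’s 70% reliability applies symmetrically to payers and gatecrashers:

```python
# Posterior probability that the defendant gatecrashed, given that a
# 70%-reliable witness says so, starting from the 70% base rate.
# Assumption (mine, not the article's): the witness is equally likely
# to misidentify a payer as a gatecrasher and vice versa.

prior = 700 / 1000    # base rate of gatecrashers among attendees
accuracy = 0.7        # witness reliability

# P(says "gatecrashed" | gatecrashed) = accuracy
# P(says "gatecrashed" | paid)        = 1 - accuracy
posterior = (accuracy * prior) / (
    accuracy * prior + (1 - accuracy) * (1 - prior)
)

print(f"{posterior:.3f}")  # about 0.845
```

Under these assumptions the testimony plus the base rate yields roughly an 84.5% probability of liability, higher than the bare 70% statistic, yet the intuitive split between the two verdicts does not seem to be about that number.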

Why?

The root of this intuition-split, in my opinion, lies in the ability of evidence to determine the truth of an event. In other words, it depends on our intuitive concept of the “truth value” of evidence. The “truth value” of evidence, coupled with a situational and temperamental preference for false positives or false negatives, helps us get to the bottom of the Gatecrasher Paradox.

Evidence can be roughly split into two categories: statistical evidence, which describes the degree of certainty of a claim, and individual-specific evidence, which ascribes some deterministic truth value to a claim. The evidence of 700 gatecrashers per 1000 attendees is statistical in nature, and only allows us to know that any claim of “X snuck in” is 70% probable. The evidence of the eyewitness, despite also being 70% probable, is intuitively attached to the individual instance.

Individual-specific evidence is, by its nature, evidence that relates to the event in question. Statistical evidence, in its bare form (i.e. as a Bayesian prior), can only describe the relation of the event to its surrounding circumstances. Bare statistical evidence holds no connection to the event itself, and thus conveys no meaningful information about the event in question. The truth of the event can only be determined by an encounter with the event.

In other words, the statistical evidence has no ability to peer beyond the veil at the actual event to which we are trying to ascribe “true or false”. Mr. X either did sneak into the concert without paying, or he did not. Probability is useful as a heuristic tool to help us estimate truth, but probability neither contains the truth nor provides meaningful information to actually determine the truth.

Let me illustrate this with an example. I buy a lottery ticket with the numbers 2, 14, 23, 26, 31, 37. There are a maximum of 42 numbers to pick from, and I have to pick 6 and predict them all correctly to win. I ask you to find out whether or not I won this lottery.

You can easily find that the odds of winning this lottery are 1 in 5,245,786. There is a lot of information in that statement. It means that any particular ticket in this scheme has a 99.9999809% chance of not winning. However, as we can see, this statistical evidence contains only information about the lottery ticket in relation to all the other lottery tickets in this lotto scheme. It contains no information about the individual ticket I am holding in my hand.
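The 1-in-5,245,786 figure is just the number of ways to choose 6 numbers from 42, which is easy to verify:

```python
import math

# Number of ways to choose 6 numbers from 42, order irrelevant:
# C(42, 6) = 42! / (6! * 36!)
combinations = math.comb(42, 6)
print(combinations)  # 5245786

# Chance that one particular ticket is NOT the winner
p_lose = 1 - 1 / combinations
print(f"{p_lose * 100:.7f}%")  # 99.9999809%
```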

You would be wrong to claim, on the basis of the statistics, that I did not win. Even if later it turned out that I indeed did not win, your prior claim lacks “truth value” in the sense that the evidence you used lacked the ability to examine the actual event. The only way to find out the truth of whether my lottery ticket is a winner is to check and match numbers with the lotto corporation for that particular draw.

Sticking to this example, suppose that I send a 4-year-old kid with pretty bad memory to check the ticket. The child has a 30% chance of relaying the result incorrectly (i.e. a 0.7 certainty, versus the 0.9999998 statistical certainty). He relays the information and tells me that I won. Which evidence should I accept? Let us look at the two possibilities:

  1. The statistical evidence tells me, with 99.99998% certainty that I lost, or
  2. The kid-evidence tells me, with 70% certainty that I won.

The difference between 1 and 2, again, lies in the “truth value” of the claim. If I am interested in the actual truth of whether I can retire early on a big jackpot, I will not be throwing away the ticket despite the odds in #1. Indeed, if we reasoned as if statistical evidence contained “truth value”, we would be throwing away lottery tickets as soon as we bought them, based on the sheer overwhelming unlikelihood of winning.

Yet, we insist on checking. We are much more likely to take the word of the kid as truth over the word of the statistic, despite the significantly lower certainty. I believe this is because our intuitions tend towards a deterministic view of the world. We do not see the world in terms of probability: things either did happen, or did not happen. This intuition is reflected in our expectations for the “truth value” of evidence. Evidence that did not take part in observing an event cannot tell us “what happened”; it can only tell us “what might have happened” — and “what might have happened” is not truth.
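It is worth noting how far this deterministic intuition departs from the strictly probabilistic calculus, which would not pick one piece of evidence over the other but combine them. A sketch of my own, assuming the kid misreports either outcome with probability 0.3:

```python
import math

# Combine the lottery prior with the kid's report via Bayes' rule,
# rather than choosing between them.
# Assumption (mine): the kid misreports a win as a loss, and vice
# versa, with probability 0.3.

prior = 1 / math.comb(42, 6)  # P(ticket won) before any report
reliability = 0.7             # P(kid relays the result correctly)

# P(won | kid says "won")
posterior = (reliability * prior) / (
    reliability * prior + (1 - reliability) * (1 - prior)
)

print(posterior)  # still roughly 4.4e-07, about 1 in 2.2 million
```

On a pure Bayesian reading, the kid’s report barely moves the needle; our willingness to take his word as truth, rather than update by a factor of two or so, is exactly the non-probabilistic treatment of evidence described above.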

There is, of course, the obvious horror of error in this example: a false negative (a Type II error), treating a winning ticket as a loser — the default condition — and thereby discarding $20 million. Given the sheer undesirability of that false negative, it makes sense that we lean towards Intuition 2 in our treatment of statistical evidence. Analogous to the intuition of “not convicting an innocent man based on statistical evidence” is the intuition of “not throwing out a winning ticket based on probability”.

These two factors interplay to produce our intuitive leanings towards and away from statistical evidence. In a situation where making a false negative is unacceptable, the evidence that guarantees the most positives will likely be accepted. If, in this type of situation, statistical evidence provides a greater certainty, then it becomes more intuitively acceptable.

Consider this: you are walking in a forest when you see a silhouette that resembles a large cat. You have just recently learned that silhouettes of large cats have a 99% probability of being a man-eating tiger in the local forest. This bare statistic contains no information about this instance of silhouette, and thus, per our prior analysis, contains no “truth value” in relation to this instance. Do you:

  1. React as if there is a tiger, or
  2. React as if there is no tiger?

The answer here seems quite obvious. No reasonable person would ever treat the shadow as not-a-tiger on the basis that the statistic contains no “truth value”. Here, we are geared towards false positives because the cost of the false negative (believing in “no tiger” when there is in reality a tiger) is our lives. Here, perhaps, we can say that we do not care about the truth of the particular event: we only care about avoiding the most extreme undesirable outcome of becoming tiger food. In other words, we are forced to reason probabilistically.

If we revisit the initial examples of this article, we can see how each intuition relates to a preference for one type of error over the other. Those less concerned about the actual event and more concerned about casting a wide net would be more intuitively open to using the 99% as evidence to convict defendant X. Those more concerned about the truth of X’s actual guilt would resist the use of statistical evidence and yet be intuitively open to using the eyewitness evidence. I suspect that if enough people were surveyed, the distribution of each intuition would change if the crime were not “destruction of property” but murder.

At some point, we inevitably encounter the problem of extreme outcomes. Suppose that instead of 100 rioters, there were 100 Thanoses, and only one of them did not wish to exterminate half the life in the universe. Are we still willing to forgo statistical evidence for its lack of “truth value”? If we are truly concerned with finding truth, then the statistical evidence alone cannot suffice. However, I think that everyone reading this article would be more than willing to convict all 100 based on statistical evidence.

Final Note:

All of this reminds me of Sindell v. Abbott Laboratories, 26 Cal. 3d 588 (1980). In that case, the court decided to use market share, a form of statistical evidence, to apportion liability. Drugmakers were held liable even though the court could not peer behind the veil and determine which drugmaker actually injured the plaintiff.

Sindell is quite easily explained by this discussion (although legal scholars won’t like it). The idea is that it is better to make a false positive error and make negligent corporations pay than to make a false negative error and leave the injured plaintiff without any recourse. Is this perhaps the future of law? Away from the truth value of the event and the rights of the plaintiff and defendant, and towards the social desirability of Type I vs Type II errors?

All my articles are dedicated to the public domain under the terms of the Creative Commons Zero licence. Please translate, copy, excerpt, share, disseminate and otherwise spread it far and wide. You don’t need to ask me, you don’t need to tell me. Just do it!
