Truthocracy – Part II – Discovering Truth and Experts

PROBLEMS
Our economic system hasn’t been self-correcting through arbitrage, because markets have stayed irrational longer than one could stay solvent. Our legal system has not been just, because precedents have been bluffed into existence using legal costs instead of legal arguments. Our regulatory systems have been infiltrated by biased, colluding agents representing the interests of a powerful minority (whom they’re often supposed to be regulating). Our political system has been overly influenced by a misinformed and/or manipulated majority.

Evolutionary flaws in these systems can cross-contaminate each other with negative (un)intended consequences, and if those consequences are not outweighed by the positives, then we need to improve our systems. Democracy is merely the best system, in terms of the balance between growth and stability, that humanity has been able to come up with so far. Free markets are necessary but not sufficient for efficient allocation of resources. Our development doesn’t need to be a slave to a misdirected, less sophisticated majority to the degree that it currently is. We are making simple behavioral mistakes on different levels (consumer, executive, and regulator) that are Nash rational for our own benefit, but have free-rider, tragedy-of-the-commons, agency, and conflict-of-interest consequences written all over them. Our systems can be improved if we only stop pretending that we’ve already perfected them!

 

SOLUTION
How, then, can we strip away these distortions and get to the core of what a better system “should” be? How can we efficiently filter for and give more weight to unbiased experts and good ideas without appealing to authority, seniority, or majority? Enter the Bayesian Truth Serum (BTS). I cannot summarize it better than the author of the article:

“BTS is a survey scoring method that provides truthtelling incentives for respondents answering multiple-choice questions about intrinsically private matters: opinions, tastes, past behavior. The method requires respondents to supply not only their own answers, but also percentage estimates of others’ answers. The formula then assigns high scores to answers that are surprisingly common, i.e. whose actual frequency exceeds their predicted frequency…The scoring system transforms a survey into a competitive, zero-sum contest, in which truthtelling is a strict Bayesian Nash equilibrium (Prelec 2004).

We conduct a general knowledge questionnaire in which we ask respondents if they recognize various items: electronics brand names, historical figures, philosophy terms, etc… By including nonexistent foils alongside the real items, we can measure the degree of deception. When significant bonus payments are awarded to the survey takers with the highest BTS scores, people claim to recognize fewer foils than when bonuses are awarded randomly. The study also validates our claim that truthtelling is in the respondents’ interest: people do in fact achieve higher scores, and earn more money, when they deny knowledge of foils. 

In our second study, we investigate whether it is possible for survey takers to exploit the BTS system by engaging in strategic deception that they hope will be more profitable than answering truthfully. In four surveys, with content chosen to be neutral enough that we can plausibly treat actual answers as truthful, we compare information scores from actual responses to those resulting from various deception strategies. For example, we test whether respondents can score higher by giving the answers they believe will be most popular, rather than their true opinions. We also examine whether respondents do better by misrepresenting their demographic characteristics (gender), or by simulating the answers of some other person they know well. We find that genuine answers reliably outperform every deception policy we test, and that no identifiable subgroup of respondents can expect to benefit from deception.”
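To make the "surprisingly common" idea concrete, here is a minimal sketch of the scoring rule as I understand it from Prelec (2004): each respondent gets an information score, log(actual share / geometric mean of predicted shares) for the answer they endorsed, plus alpha times a prediction score that penalizes inaccurate forecasts of the answer distribution. The function name `bts_scores`, the `eps` clipping, and the toy data are my own assumptions for illustration, not taken from the paper or from the spreadsheet model.

```python
import math

def bts_scores(answers, predictions, alpha=1.0, eps=1e-9):
    """Score every respondent on one multiple-choice question.

    answers[r]     -- index of the option respondent r endorsed
    predictions[r] -- respondent r's predicted share for each option
    eps            -- assumed clipping constant so that log(0) never occurs
    """
    n = len(answers)
    m = len(predictions[0])
    # Actual share of respondents endorsing each option.
    x_bar = [max(sum(1 for a in answers if a == k) / n, eps) for k in range(m)]
    # Log of the geometric mean of the predicted shares for each option.
    log_y_bar = [
        sum(math.log(max(p[k], eps)) for p in predictions) / n for k in range(m)
    ]
    scores = []
    for a, p in zip(answers, predictions):
        # Information score: positive when the endorsed answer is more
        # common than the group predicted ("surprisingly common").
        info = math.log(x_bar[a]) - log_y_bar[a]
        # Prediction score: zero for a perfect forecast, negative otherwise.
        pred = sum(x_bar[k] * math.log(max(p[k], eps) / x_bar[k]) for k in range(m))
        scores.append(info + alpha * pred)
    return scores

# Hypothetical toy survey: option 0 = "yes", option 1 = "no".
answers = [0, 0, 1, 1, 1]
predictions = [
    [0.8, 0.2], [0.8, 0.2],              # "yes" respondents expect "yes" to dominate
    [0.6, 0.4], [0.6, 0.4], [0.6, 0.4],  # "no" respondents expect many "yes" answers
]
scores = bts_scores(answers, predictions)
```

In this toy data "no" turns out more common than the group predicted, so the "no" respondents score higher than the "yes" respondents, and with alpha = 1 the scores sum to zero, which matches the zero-sum description in the quote.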

 

BTS KILLED THE DEMOCRACY STAR
In one of a handful of different experiments, MIT and Princeton students were asked to identify US state capitals where the named city was always the most populous in the state (e.g. ‘Is Seattle the capital of Washington?’). Here are the results:

[Figure: Expertise BTS Correlation]

Expertise correlates with BTS scores but not with Conventional Wisdom.
“Conventional wisdom is defined as the number of consensus answers (answers consistent with majority). The two top panels are taken from the MIT study; the bottom panels from the Princeton study. The y-axis is the number of correct answers out of fifty. The x-axis is: (left panels) the number of states where a respondent’s answer matches majority opinion; (right) for each subject r, the sum over 50 states of the BTS score for the answer that they endorsed (averaged across individuals who endorsed that answer)” Source
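The "conventional wisdom" measure on the left panels is just a count, and a rough sketch may help. The function name `consensus_counts` and the three-person, three-question toy data below are hypothetical; ties for the most common answer are broken arbitrarily here, which is an assumption of mine and not something the caption specifies.

```python
from collections import Counter

def consensus_counts(responses):
    """responses[r][q] is respondent r's answer to question q.

    Returns, for each respondent, the number of questions on which
    their answer matches the most common answer in the group.
    """
    n_questions = len(responses[0])
    # Most common answer per question (ties broken arbitrarily).
    majority = [
        Counter(r[q] for r in responses).most_common(1)[0][0]
        for q in range(n_questions)
    ]
    return [
        sum(1 for q in range(n_questions) if r[q] == majority[q])
        for r in responses
    ]

# Hypothetical toy data: three respondents, three yes/no questions.
toy = [
    ["yes", "no", "yes"],
    ["yes", "yes", "yes"],
    ["no", "yes", "yes"],
]
counts = consensus_counts(toy)
```

A high count only says that someone agrees with the crowd. According to the caption, it is the summed BTS score, not this count, that tracks the number of correct answers.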

We can cultivate an ability to maneuver within any given system to discover the most surprisingly common answers. If you can predict what others can’t predict, those who “lost” should trust your “expertise”, because you are best at discovering truth (the surprisingly common answer).

How can we discover truth by applying the BTS scoring methodology to change our industries, press, entertainment, and politics into institutions that will breed good ideas instead of merely focusing on short-term profit potential? A BTS “forum” provides a possible answer. The people in the forum would have “competitions” in which they ask each other questions. If someone asks a predictable question, where most others are able to accurately predict what the answer distribution will be, the information value of the question will be 0 (and the person asking won’t have anything to gain either). If, on the other hand, someone can ask a question that is unpredictable, it will hold BTS information value and, most importantly, the people answering will be able to score positive BTS points and display their expertise, while nonexperts (negative BTS scores) are filtered out.
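A quick way to see why a predictable question carries no information value: the information score of an answer is log(actual share / geometric mean of predicted shares), so it is zero whenever the group forecasts the answer distribution exactly. The sketch below assumes every option is chosen at least once and every predicted share is positive; the function name and the numbers are made up for illustration.

```python
import math

def information_scores(answers, predictions):
    """Information score of each option for one question.

    Assumes each option is endorsed by at least one respondent and
    every predicted share is strictly positive.
    """
    n = len(answers)
    m = len(predictions[0])
    scores = []
    for k in range(m):
        x_bar = sum(1 for a in answers if a == k) / n                 # actual share
        log_y_bar = sum(math.log(p[k]) for p in predictions) / n      # log geometric mean
        scores.append(math.log(x_bar) - log_y_bar)
    return scores

answers = [0, 0, 0, 1]  # three pick option 0, one picks option 1

# Predictable question: everyone forecasts the 75/25 split exactly.
predictable = information_scores(answers, [[0.75, 0.25]] * 4)

# Unpredictable question: everyone expects 90/10, but the split is 75/25.
surprising = information_scores(answers, [[0.9, 0.1]] * 4)
```

In the predictable case both options score (numerically) zero, so nobody can gain or lose. In the second case option 1 is more common than predicted and scores positive, while option 0 scores negative.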

For example, ten people ask each other ten questions. At the end of the game, the scores are tabulated and the people with more BTS points are deemed to have more expertise than the losers. The top three experts move on to the higher division, and the bottom three move down to a lower division. Ten games are played and the league divisions are then changed using a similar rule, with experts moved up and the “losers” moved down. This is quite similar to promotion/relegation in football leagues and to tennis and chess rankings. (The forum idea is not mine and I am speculating as to what the best way to set it up would be. There’s a working paper on this that the author has requested not be cited, but it doesn’t go into much on the topic yet.)
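The promotion/relegation rule above could be sketched roughly as follows. This is only my speculation about one way to set it up: the function name `promote_relegate`, the player labels, and the scores are all hypothetical, and the sketch assumes each division has at least 2k players so the promoted and relegated groups never overlap.

```python
def promote_relegate(divisions, scores, k=3):
    """Move the top k of each division up and the bottom k down.

    divisions -- list of lists of player names, index 0 is the top division
    scores    -- dict mapping player name to total BTS points
    Assumes every division has at least 2 * k players.
    """
    new_divisions = [[] for _ in divisions]
    last = len(divisions) - 1
    for i, players in enumerate(divisions):
        # Rank players within their division, best score first
        # (ties keep their original order because sorted() is stable).
        ranked = sorted(players, key=lambda p: scores[p], reverse=True)
        for rank, player in enumerate(ranked):
            if rank < k and i > 0:
                target = i - 1      # promoted
            elif rank >= len(ranked) - k and i < last:
                target = i + 1      # relegated
            else:
                target = i          # stays put
            new_divisions[target].append(player)
    return new_divisions

# Hypothetical league: two divisions of ten players each.
top = [f"A{i}" for i in range(10)]
bottom = [f"B{i}" for i in range(10)]
points = {name: 9 - i for i, name in enumerate(top)}
points.update({name: 9 - i for i, name in enumerate(bottom)})

new_top, new_bottom = promote_relegate([top, bottom], points)
```

Here A7, A8, and A9 drop to the lower division and B0, B1, and B2 move up, so both divisions stay at ten players.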

This forum would simulate the evolutionary or emergent (?) learning environment, similar to sports and other competitions. We pose hypothetical problems to each other and single out the people who agreed on the reply the most where the majority wasn’t able to predict that reply. It’s a way to test “self awareness” of your own group/sample: your ability to predict the “average” (and possibly even the SHAPE, i.e. other statistical parameters, of the distribution).

Please understand that we don’t merely identify experts, which is a grand accomplishment in itself; we also discover “surprisingly common answers” (even to the experts themselves).  For example, people consistently underestimated the degree to which others would find the Humor questionnaire funny (read the first study).  So the BTS methodology allowed us to discover that the Humor questionnaire is indeed surprisingly funny!

I modeled the BTS scoring methodology in Excel. Knock yourself out. Think of different deception strategies and see if you can break it (I can only think of one that may work, but it’s easily identifiable and thus can probably be corrected for). I highly suggest reading the paper first, because they unsuccessfully test quite a few deception strategies and truth-telling tends to win the vast majority of the time.

  • http://www.overcomingbias.com/2009/12/majoritarian-philosophy.html Overcoming Bias : Majoritarian Philosophy

    [...] surprisingly popular in #2, which is a Bayesian Truth Serum [...]

  • http://rationalmechanisms.com DWCrmcm

    A datum is a container. A container is any given constraint on flow. The RMCM also asserts that all containers are both nouns and verbs. Vise complexity is encapsulated as polymorphic multi behavioralism.
    The difficulty in taming statistics arises out of inexpression. The value of any attribute bound to any container depends on the density of the aggregates within.
    Twenty years ago the density of computer displays would have made the graphics above impossible. Hence, “evolutionary or emergent” is a speculative encapsulation that simply falls away in light of the power of density.
    Have you considered density and a given density’s capacity to encapsulate a statistical metaphor?

    I don’t know what a statistical metaphor means, I only know that it must exist.

    Sample size is a speculative encapsulation. Density matters. To increase density you must increase the sample size. Then you must ask questions that exploit attribution – aka deviation. I suspect that statistics’ untapped power lies in the most ignored of its capacities, the elucidation of deviations arising from lack of density.
    A standard deviation is only a standard within a given density – misdiagnosis is abundant.

  • Alex Golubev

    DWCrmcm, thank you for your thoughts! hope you keep contributing cause it sounds like you have a lot to say on the topic.

    You may be suggesting that BTS simply measures DENSITY of opinion, because “the most surprising common answer” is another way of saying “unpredicted majority”. That’s a great point.

    The strong version of arguing for AI and BTS is that it will allow us to know everything ASAP. The weak version suggests that a human with AI/BTS is more fit than one without it. I’m definitely undecided on the strong version and lean toward being “trapped” in a Memento type cycle. The weak version however is where I’d argue that our current Artificial Collective Intelligence of democracy, seniority, etc… is archaic and is putting limits on what is knowable, so we’re very far from achieving peak density as a civilization/species.

    I think our “disagreement” or potential discovery lies in the space between the weak and the strong versions. I don’t think we can make an assumption that density of opinion is 100% objective, so it isn’t a sufficient condition to reach full knowledge. I think emergence and evolution will still play a role at these levels. But don’t get me wrong, I think knowledge can grow by a few factors before we reach those levels. So the weak version is quite “strong” in absolute quantification.

  • http://rationalmechanisms.com DWCrmcm

    Your reply precipitated the idea that I need to deal with juxtapositions.
    Purity / Diversity
    Acquisition / Expression
    Form / Function

    Thank you

    “Memento” is a problem when quantifying abstractions.
    Maybe some mechanism for breadcrumbs would be helpful.

    Knowing and experience?

    I prefer the encapsulation: matching experience to capacity.

    There is also the difficulty of recall and context.

    Consider, if you will, polymorphic multi-behavioral containers. One can increase density through behavior by utilizing an orchestrated or polymorphic density.
    There is a density “of kind”.

    Consider, if you will, juxtaposing like / dislike: these two are primary polymorphic multi-behavioral containers.
    The density could become quickly overwhelming.

    This is where I suspect a hierarchical statistical method might find useful expression.

    Densities in “three dimensions”. Maybe sets of sets. I don’t know really just a method of building and behaving within what is built.
    Query the query?

  • Rafe Furst

    Still speaking in code, dude. Bring it down to Earth so us earthlings can understand you :-)

  • Alex Golubev

    Density CAN become overwhelming, but it’s the complete lack of density that is nonexistence. We cannot know if we’re human or AI and thus while nonexistence is theoretically possible and imminent, the road to it is infinite. 1/0. However if you’re the only one on the journey, then you are nonexistent, so slow down your teachers and speed up your students while shortening the lead time to as close to the present as possible… Balancing the Future and the Present is the key to infinite existence. Primer “game theory”?

  • http://rationalmechanisms.com DWCrmcm

    :-)
    Not code, NeoRationalist Jargon.

    http://rationalmechanisms.com/lexicon

    AI cannot be programmed. To approach AI we need to provide an inorganic metabolism in which it can flourish.

    There is no digital equivalent to organic intelligence, because “code” lacks Causal Constraints.

    Nonetheless, we can use “gate” “bits” technology to mimic both the electro and chemical values passed around the brain.
    For instance 128 bit pathways may be large enough to encapsulate meaningful analogs to the electrochemical metabolism of the brain. The micro processor can be simplified to add, compare, divert, aggregate, and create pathways when it is bound to each 128 bit aggregate.

  • http://rationalmechanisms.com DWCrmcm

    Beautifully encapsulated. I agree completely.

    and my vote goes to the pendulum as the starting metaphor.

    Really you took the words out of my mouth.

    Game theory bound to aggregates of pendulums.

    Sound weird?

    As flow increases through such an aggregate you will see emergent behavior.

    This I Promise.

    I would love to see such a construct in action.

  • http://www.facebook.com/profile.php?id=704300847 Al Mendoza

    “we can measure the degree of deception.” Pretty sadistically massive problem if that’s part of the solution you are programming into the system no?
