Learning and Improvement in Kleros Mechanism Design

Kleros Director of Research Dr. William George explains key learnings from the Kleros protocol and upcoming improvements.

Kleros is a decentralized protocol that produces justice in a peer-to-peer manner. This allows one to have confidence in the system without having to trust individual operators to be honest; however, decentralization creates some challenges in how the system can be designed.

Historically, every breakthrough innovation has initially come with significant limitations and been met with skepticism. This was the case with Wikipedia, where people initially didn’t believe that crowds would be able to create an encyclopedia that is more detailed, unbiased, and expansive than the Encyclopedia Britannica. Such new dominant designs appear, find their place in the market, and ultimately surpass their predecessors by being radically different, but “good enough” for their users.

Henry Ford is famously quoted as saying, “If I had asked people what they wanted, they would have said faster horses.” Producing automobiles at a time when the vast majority traveled by carriage, road networks were in a poor state, and engines were nowhere near as reliable as the trusted steed was a bold move, to say the least.

Through a combination of great vision, perseverance and continuous learning, Ford's products improved incrementally and are now an integral part of society as we know it.

Kleros is walking the same path. With the help of our lead users and great community, we are learning how crowdsourced justice should operate and how our engine should be refined.

In this article, we will respond to some frequently asked questions about design choices in Kleros, many of which tend to arise from aspects of Kleros that seem unorthodox when compared to centralized systems. Additionally, we will see that these choices are necessary for Kleros to operate properly in the absence of a trusted operator.

On some of these points, we are researching creative solutions that incorporate what we have learned from Kleros cases so far. These solutions will be built into future versions of the Kleros court, bringing its behaviour closer to what users of traditional systems expect while remaining decentralized.

Commit and Reveal

"Why are votes public? Isn’t it a problem that people can copy how others voted?"

In order for the Kleros contracts to determine which option won the dispute and to reward and penalize jurors for their votes accordingly, these contracts must be able to eventually “see” all of the votes. One approach that would allow the contract to have access to this information at the end of a dispute, while still keeping votes private during the voting period, is commit and reveal.

Here, a juror Alice “commits” to her vote by taking a random salt, computing the hash of her vote taken together with the salt, and submitting the hash to the contract. Later, once everyone has committed their vote, Alice can “reveal” her vote by submitting it and the salt on-chain. The contract can then verify that Alice’s vote actually corresponds to what she committed by hashing the revealed vote together with the salt and comparing the result to the value that was previously received.

(In order for Alice to convince the contract that she had committed to a different option, she would have to be able to provide a different salt that together with a different vote still gives the same hash that she had previously committed to. This means that she would have found a hash collision. Cryptographic hash functions are designed so that this should be practically impossible.)
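The commit and reveal steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SHA-256 stands in for whichever hash function the contract actually uses, the vote is encoded as a 32-byte integer, and the function names `commit` and `verify_reveal` are hypothetical rather than taken from the Kleros contracts.

```python
import hashlib
import secrets


def commit(vote: int, salt: bytes) -> bytes:
    """Return the commitment: a hash of the vote taken together with the salt.

    Assumption: SHA-256 and a 32-byte big-endian vote encoding are
    illustrative choices, not the encoding used on-chain.
    """
    return hashlib.sha256(vote.to_bytes(32, "big") + salt).digest()


def verify_reveal(commitment: bytes, vote: int, salt: bytes) -> bool:
    """Check that a revealed (vote, salt) pair matches the earlier commitment."""
    return secrets.compare_digest(commitment, commit(vote, salt))


# Commit phase: Alice picks a random salt and submits only the hash.
alice_salt = secrets.token_bytes(32)
alice_commitment = commit(1, alice_salt)

# Reveal phase: Alice submits her vote and salt, and the check passes.
assert verify_reveal(alice_commitment, 1, alice_salt)
# Claiming a different vote with the same salt fails the check.
assert not verify_reveal(alice_commitment, 2, alice_salt)
```

For Alice to pass the check with a different vote, she would need a second salt that produces the same hash, which is the collision discussed above.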

Note that there are limitations to commit and reveal. It doesn’t prevent malicious jurors from prematurely publishing their salt in a Telegram group or on a forum somewhere, publicly revealing their votes while others are still voting. While one could have a system where jurors are penalized if someone submits their salt to the contract before the reveal period, you could imagine jurors then proving that they voted in a given way without revealing their salt using a zero-knowledge proof.

Ultimately, there are aspects of a cat-and-mouse game in preventing jurors who are trying to reveal their vote from doing so, though there are some potential general approaches. Nonetheless, from a purely game-theoretic point of view, using commit and reveal is probably generally better than not doing so as most jurors are likely not malicious and will not reveal their votes, so less vote information will circulate compared to pure public votes.

In Kleros, commit and reveal is part of the current court contract. However, it is activated, or not, by a parameter that can vary from one court to another. The reason that commit and reveal is not just activated by default for all courts, despite its game-theoretic advantages, is that it creates user experience issues. Jurors need to perform a second action, coming back to reveal their vote.

If a juror forgets to do this, her vote cannot be counted and she is penalized as if she had not voted at all. In a centralized system, one could just have a trusted operator keep a private list of all of the votes that the other jurors would not be able to see. However, when using commit and reveal in a decentralized system, only the individual juror knows her salt and is capable of revealing her vote.

"Why not have the vote 'reveal itself' at an appointed time? Wouldn’t this solve the problem?"

There is no viable way, using the current tools available in cryptography, for a vote that is not already known by a trusted third party to just “reveal itself” at an appointed time. In future releases, it might be possible to have an app on a user’s phone or in a user’s browser store the salt and automatically reveal it during the reveal period. This would require users to install specific software and would still have issues with votes not being revealed, particularly if users’ devices are not turned on during the relevant periods, but might nonetheless be a viable future approach.

"Then how can Kleros prevent users just copying other people’s votes instead of carefully evaluating the evidence?"

Kleros’ main defences against vote copying come from the incentive system and the possibility for appeal. Even if you see that enough other jurors have already cast their votes for “No” to win a given voting round, there is generally the possibility that there will be some future appeal round that will reverse this result, and whether you are rewarded or penalized for your vote depends on whether it agrees with the result of the juror vote in the final round. Moreover, Kleros’ incentive structure is built on the premise that the stakes of jurors whose vote does not agree with the eventual outcome of the last appeal round are redistributed to other jurors on a per-round basis.

Thus, if you as a juror see most other jurors in your voting round voting for an outcome that you believe is incorrect and will eventually be overturned by an appeal, your incentive to vote for the (honest) answer that you believe will eventually win is that much greater; each additional juror in your round whose vote is incoherent (i.e. that does not agree with the vote in the final appeal round) increases your payoff as a coherent juror.
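To make this incentive concrete, here is a minimal sketch of a per-round redistribution. It rests on simplifying assumptions that are not the contract's exact rules: every incoherent juror loses the same amount `stake_at_risk`, the lost stakes and the round's fees are split equally among coherent jurors, and stakes and fees are expressed in one common unit. The function name `round_payoffs` is hypothetical.

```python
def round_payoffs(votes, final_outcome, stake_at_risk, total_fees):
    """Simplified payoff of each juror in one voting round.

    Assumptions for illustration: each incoherent juror loses
    `stake_at_risk`, and the lost stakes plus `total_fees` are shared
    equally among the coherent jurors of the same round.
    """
    coherent = [vote == final_outcome for vote in votes]
    n_coherent = sum(coherent)
    if n_coherent == 0:
        # In this sketch there is nobody to redistribute to.
        return [-stake_at_risk] * len(votes)
    pot = (len(votes) - n_coherent) * stake_at_risk + total_fees
    return [pot / n_coherent if ok else -stake_at_risk for ok in coherent]


# You vote "Yes" and "Yes" wins the final appeal round.
alone = round_payoffs(["Yes", "No", "No"], "Yes", stake_at_risk=100, total_fees=30)
shared = round_payoffs(["Yes", "Yes", "No"], "Yes", stake_at_risk=100, total_fees=30)
print(alone[0], shared[0])  # 230.0 65.0
```

In this toy model, being the only coherent juror in the round pays more than being one of two, which is the effect described above.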

Proof of Humanity

"Why can an address get more than one vote? Isn’t this anti-democratic?"

Ideally, each individual human being participating as a juror would only have a single vote. However, one person on Ethereum may have many addresses. Namely, Ethereum addresses are not “Sybil resistant”. Limiting each address to one vote is not an effective means of preventing large holders of PNK from getting multiple votes; doing so would incentivize large holders to split up their PNK over many addresses to continue to be selected at higher rates.

Indeed, under a one vote per address policy, malicious participants that have large holdings of PNK would likely split up their holdings while honest holders might not. This would have the consequence of giving the malicious participants artificially greater weight.
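A small numerical sketch illustrates the effect. It compares two hypothetical drawing rules: one where the chance of being drawn is proportional to staked PNK, and a one-vote-per-address rule where every staked address is equally likely to be drawn. Both rules, the function names, and the figures are simplifying assumptions made for illustration.

```python
def stake_weighted_share(whale_stakes, other_stakes):
    """Share of draws going to the whale when draws are proportional to stake."""
    whale_total = sum(whale_stakes)
    return whale_total / (whale_total + sum(other_stakes))


def per_address_share(whale_stakes, other_stakes):
    """Share of draws going to the whale when every address counts equally."""
    return len(whale_stakes) / (len(whale_stakes) + len(other_stakes))


honest = [100] * 99          # 99 honest jurors, one address each
whale_single = [5000]        # whale keeps everything in one address
whale_split = [100] * 50     # same holdings split over 50 addresses

# Stake-weighted drawing: splitting changes nothing.
assert stake_weighted_share(whale_single, honest) == stake_weighted_share(whale_split, honest)

# One vote per address: splitting multiplies the whale's weight.
print(per_address_share(whale_single, honest))  # 0.01
print(per_address_share(whale_split, honest))   # roughly 0.34
```

Under the per-address rule, only participants willing to split their holdings gain weight, which is why such a rule would favour malicious holders over honest ones.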

While a centralized system might be able to perform identity checks on its jurors to guarantee that each person controls at most one account, for Kleros, as a decentralized system, this is more challenging. Eventually, Proof of Humanity (still in development) could be used to provide these kinds of controls in a decentralized fashion.

"If an address was drawn in round 1, shouldn’t it be disallowed to vote in subsequent rounds?"

Ultimately, this is another form of the same question above of why an address can have more than one vote. If addresses were ineligible to vote in multiple rounds, then this would again just incentivize holders to split up their PNK over several addresses. An eventual Proof of Humanity system is again a possible solution to this issue.

Automated Parameter Tuning

"There have been times when the token value changed significantly (e.g. during case 360) and the minimum juror stakes were slow to adjust."

Parameter changes of the Court go through a decentralized governance procedure. This can be a fairly long process, as it gives the community time to notice and respond in the event of a malicious actor submitting hostile parameter changes.

First, changes are proposed for discussion on the Kleros forum and voted on. Then, even after those steps, there is a waiting period during which the resulting changes are submitted to the Kleros Governor in a format that is readable by the Court smart contract. During this period, submissions can be challenged if they do not match what the community voted on. (Thanks to this extra step, the proposals in the actual community votes can be human readable, which makes submitting and voting on changes more accessible.)

All of this takes time, so the parameters might not immediately adjust to very recent changes. For example, during the period that covered case 360, the value of PNK had recently increased significantly and the corresponding parameter change was still making its way through the governance process.

This led to juror stakes that were abnormally high. It should nevertheless be remembered that jurors play a positive-sum game; stakes lost by incoherent jurors are redistributed to coherent jurors. So in situations like this, the jurors are not worse off on average; the expected value of their payoff is still positive.

Nevertheless, this effect may have caused some jurors to take on more risk than they were comfortable with. In a future release, some of the court parameters will likely be able to adjust automatically using oracles or price information from decentralized exchanges. For example, the required juror stakes could automatically adjust to changes in the value of PNK with respect to ETH, limiting these kinds of effects.
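One way to picture such an automatic adjustment is a minimum stake that is defined by a target value in ETH and converted to PNK at the current price. This is a hypothetical sketch, not a description of any planned implementation: the function name `min_stake_pnk`, the idea of a fixed ETH target, and the example prices are all assumptions, and a real design would also need to deal with unreliable or manipulated price feeds.

```python
def min_stake_pnk(target_stake_eth: float, pnk_price_eth: float) -> float:
    """Minimum juror stake in PNK that keeps its ETH value at the target.

    Assumption: `pnk_price_eth` is the price of one PNK in ETH, as reported
    by some oracle or decentralized exchange.
    """
    if pnk_price_eth <= 0:
        raise ValueError("price must be positive")
    return target_stake_eth / pnk_price_eth


# If the price of PNK doubles, the required PNK stake halves,
# so the value at risk per juror stays the same in ETH terms.
before = min_stake_pnk(target_stake_eth=0.5, pnk_price_eth=0.0001)
after = min_stake_pnk(target_stake_eth=0.5, pnk_price_eth=0.0002)
assert after < before
```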

On Recent Cases

"In case 360, a whale presented many arguments to vote for 'No' and voted for 'No' in round 1. But then that whale was drawn again in a subsequent round and voted for 'Yes'. Because of this, the 'Yes' side won. What do you make of this sudden change of mind? Isn’t this behaviour manipulative?"

It is difficult to speculate on the thought process of any individual juror. However, there are various reasons that a juror might change her mind over the course of a case, notably because the arguments that were presented in the appeal period convinced her that “Yes” was the honest response and the response that was most likely to eventually win.

Indeed, if you as a juror realize that your vote in a previous round was a mistake and that some other option is more likely to be seen by the average juror as honest and win the dispute, it is generally in your interest to vote for what you now see as the honest outcome.

This would, of course, mean that you would lose a part of your stake for your previous “No” vote. However, to the degree that you cannot change the outcome of the case and “No” would eventually lose anyway, this lost stake is a sunk cost.

Even if you think you have the decisive vote in the appeal round and by again voting “No” you can save your stake from the first round, you should remember that there is generally a possibility for future appeals. If the community believes that “No” is a dishonest resolution, there is a good chance that there will be another appeal, and if a future appeal round ultimately rules for “Yes” you would then lose stakes from incoherent votes in two rounds.

On the other hand, consider a related situation that is more clearly manipulative. Suppose that, in a given voting round, Eve is a (whale) juror who concludes that “Yes” is the honest outcome and will ultimately win and as a result, she plans to vote “Yes” herself. However, she presents arguments for “No”.

Eve might hope that by doing so, a few jurors will vote “No”, either because they are swayed by her arguments or simply because they think that “No” has a better chance of winning due to there being a whale juror who supports it. Then, if and when “Yes” eventually wins, the jurors who voted for “No” will lose part of their stake to the jurors in their round who voted “Yes”, including Eve. Namely, Eve will get a higher return because she managed to “trick” these jurors into voting for “No”.

Generally, jurors should keep in mind the fact that in the game they are playing with the other jurors, if one juror receives a lower payout, the others receive a higher payout. Thus, while it makes sense to look at the arguments and evidence provided by others, including other jurors, as you decide which outcome seems to have the most solid arguments, you should take what other jurors say with a grain of salt as those other jurors’ interests are not necessarily aligned with yours.

"I was one of three jurors in a case where 'Yes' was the obviously correct answer, so I voted for it. When I saw that the other two jurors had voted for 'No', I was glad because I reasoned that the case would be appealed and ultimately, in the last round, the jury would vote for 'Yes', so I would get the lost stakes of those two jurors who voted for 'No'. However, the appeal fees were crowdfunded for 'Yes', but not for 'No'. So 'Yes' is going to win without another juror vote! While that means that the honest choice will have won, I will be ruled incoherent because the last juror round will have been 2–1 for 'No'. What should I do?"

Jurors are currently incentivized based on their coherence with the final juror vote, rather than with the actual outcome of the case, which also takes into account which options are crowdfunded. This is because of the way that roles are divided between the arbitrator and arbitrable contracts in our arbitration standard. This standard is designed so that the arbitrator contract should not have to trust any information received from an arbitrable contract. Indeed, we want the arbitrator (Kleros) to be used by a wide range of people/third parties who develop different arbitrable contracts in a permissionless fashion. Hence, some of these arbitrable contracts could be buggy or malicious.

Thus the arbitrator doesn’t actually know what the arbitrable contract considers to be the final outcome; in a case like this, it does not know that “Yes” was crowdfunded and “No” was not. The last thing it sees is the 2–1 vote for “No”.

Note that, in some situations, incentivizing jurors in terms of the final juror vote, rather than the outcome as seen by the arbitrable contract, can have positive game-theoretic effects. For example, imagine a scenario where a whale is attacking a Kleros case by making frivolous appeals on behalf of a dishonest outcome. If jurors were rewarded for voting for the dishonest option in a situation where it wins by being funded last, even though it loses every juror vote, then jurors might anticipate the whale’s chances of being able to out-fund honest actors and vote for the dishonest option in an effort to be “coherent”. This effect would make such an attack easier for the whale: if enough jurors voted in this way, the attacker would win the juror vote(s) and would not even need to appeal.

Then, such an attacker might only need to make a semi-credible threat of being able to carry out a frivolous appeal attack to sway an outcome. With the current design, jurors in each appeal round can vote for the honest outcome and as long as they believe that other jurors will do the same, they don’t have to worry about which party to the dispute is more capable of crowdfunding future appeals.

However, this design choice can also sometimes have undesirable consequences in cases where the honest outcome loses in a voting round, but because it is seen as the winner by the arbitrable contract, no party is incentivized to appeal and honest jurors are seen by the arbitrator contract as being incoherent. In a future version of the court, we will likely take the crowdfunding mechanism, which currently only exists as part of arbitrable contracts, and put it as a default in the arbitrator contract.

This will allow juror redistribution to take into account which outcomes were funded. For example, one could then have special logic so that if the winner of the final juror vote loses a case because it is not crowdfunded, then there is no redistribution of PNK for that case. Such a compromise approach would allow for a more nuanced balance of the tradeoffs in incentivizing around the types of situations considered above.

In the meantime, there are a few things to keep in mind if you find yourself as a juror in this type of situation.

First, if you are sufficiently convinced that “Yes” would win an appeal and the value of the PNK stakes is sufficiently high compared to the appeal fees, it might make sense to pay the appeal fees for “No” yourself as a last resort if no one else will. The difference in your payoff between being coherent on a 1–2 vote and being incoherent on a 2–1 vote is the difference between receiving the entire round’s worth of juror fees plus the lost stakes of the other two jurors and losing your own stake.

Denote by F the total juror fees in your round, by K the value of the PNK at stake per juror in your round, and by A the total appeal fees required to fund “No” and trigger an appeal. Then the swing in PNK redistribution between being incoherent on a 2–1 vote and being coherent on a 1–2 vote is 3K: instead of losing your own stake K, you receive the stakes of the other two jurors, 2K. If you fund “No” and “Yes” wins, then compared to not appealing you lose A in appeal fees but gain 3K + F from the redistribution in your round. Depending on the crowdfunding stake multipliers (and keeping in mind that “No” is the winner of the previous voting round), A is typically between 3F and 4F. So if 3K + F - A > 0, namely if three times the value of the PNK at stake per juror is more than A - F, which is typically between 2F and 3F, you might make enough from being coherent in your round to make up for paying the appeal fees of “No”.
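The break-even condition above can be checked with a short calculation. The sketch below simply evaluates 3K + F - A, assuming that all three quantities are expressed in the same unit and that “Yes” does win the appeal. The function names and the example numbers are hypothetical, and the sketch ignores gas costs and the risk that the appeal goes the other way.

```python
def self_funding_gain(stake_per_juror, round_fees, appeal_fees):
    """Net gain 3K + F - A from funding the appeal yourself, if "Yes" then wins."""
    return 3 * stake_per_juror + round_fees - appeal_fees


def break_even_stake(round_fees, appeal_fees):
    """Smallest K for which 3K + F - A is no longer negative, i.e. (A - F) / 3."""
    return (appeal_fees - round_fees) / 3


# Example with F = 100 and A = 350 (that is, A = 3.5F), in arbitrary units.
print(self_funding_gain(stake_per_juror=90, round_fees=100, appeal_fees=350))  # 20
print(self_funding_gain(stake_per_juror=80, round_fees=100, appeal_fees=350))  # -10
```

With these example numbers, funding the appeal only pays off when the stake per juror is worth more than (350 - 100) / 3, or about 83 units.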

Moreover, note that the incentive system of Kleros is based on the idea that jurors that act in good faith will win rewards on average over the long term. However, there is some intrinsic variability in these rewards when considered over a small number of cases. Indeed, in some cases, a good faith juror will vote for the losing side just because she misjudges the situation.

Hence, a juror acts in a way that is similar to an insurance company. An insurer might get unlucky on any given client and have to pay out more than it raised in premiums. However, in the long run, as the insurer designs its policies so that the expected value of its return is positive on each client, the insurer has a very high probability of being profitable on average.

Frankly, a juror being on the losing side of a case because of a misjudgment is generally going to be a significantly more frequent occurrence than the relatively rare type of situation described above, where no one pays the previous round winner’s appeal fees (and the chance to win the PNK stakes from other jurors is not sufficient to make it worth doing so yourself).

A good faith juror might get unlucky and land in such a situation, and it is particularly frustrating to be penalized when you are right even if this happens only rarely. However, this phenomenon will not prevent good faith participation from generally being the winning strategy.

"I am a juror for a case where I strongly believe that 'Yes' is the right choice. However, a substantial chunk of the community seems to be interpreting the primary document for the case in a different way than I would, and these people are convinced that 'No' is the right choice. People are having heated arguments on Telegram. What should I make of this?"

It is the nature of any dispute resolution system that there will be cases where reasonable people will disagree. This is particularly the case if the primary document on which jurors are basing their decision is unclear or is open to interpretation in multiple different ways. Parties who write primary documents when setting up Kleros to eventually resolve their future disputes should be careful to write criteria that are clear to minimize the occurrence of such situations.

Nonetheless, it is inevitable that there will sometimes be cases on which the community is divided. If you find yourself as a juror in such a case, you should resist the urge to think of the people on the “other side” from your viewpoint as being necessarily “dishonest”.

Indeed, as a juror, you want to predict which position is a stronger Schelling point and will win out in the end; keeping an open mind and trying to understand the arguments being made by other segments of the community should help you in this goal.

A comparison can be made with common law systems: a judge can have a strong a priori opinion about a case. However, it is the judge’s duty to understand how other judges have ruled on similar cases historically and what the legal community’s consensus seems to be on the case at hand.

Solving disputes is inherently challenging, as it requires the jurors to appreciate the subjective evidence and empathize with different points of view.

Ultimately, we would encourage jurors in situations like this to:

  • Remain calm. 
  • Try their best to vote honestly according to the evidence and (how the majority of the community seems to be interpreting) the rules, as that option is most likely to win in the end.
  • Remember that at the end of the day, jurors in any given case are playing a positive-sum game, so even if a given case has a high degree of uncertainty, in the long run, the average payoff is on your side.

Key Learnings for the Future

So far, over 350 cases have passed through Kleros courts and ultimately arrived at a satisfactory outcome. While there have certainly been some disputes that were particularly contentious, this is the case in all dispute resolution systems.

As we gather more data, Kleros will continue to adapt and evolve. We appreciate the great involvement of our dedicated lead users and look forward to implementing what we have learned in future versions of the Court.

Where Can I Find Out More?

Join the community chat on Telegram.

Visit our website.

Follow us on Twitter.

Join our Slack for developer conversations.

Contribute on Github.

Download our Book