Ripped from the other site for posterity
It’s supposed to work like this: there are 5 rounds of debate. In round 1 each model gives a hot take. In rounds 2 and 3 they react to each other (shared chat history). In rounds 4 and 5 they vote.
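For anyone who wants to poke at the idea without reading the PR, here is a minimal sketch of that round layout. It is not the code from the PR: `run_debate`, `PHASES`, and the `ask` callable are hypothetical names made up for illustration, and the per-round history snapshot is an assumption about how models running in parallel would see the shared chat.

```python
from typing import Callable, Dict, List

# Round layout as described above: 1 = hot take, 2-3 = react, 4-5 = vote.
PHASES: Dict[int, str] = {1: "hot_take", 2: "react", 3: "react", 4: "vote", 5: "vote"}


def run_debate(
    question: str,
    models: List[str],
    ask: Callable[[str, str, List[str]], str],
) -> List[str]:
    """Run the 5 rounds and return the shared chat history.

    ask(model, prompt, history) is a hypothetical stand-in for a real model call.
    """
    history: List[str] = []
    for round_no in range(1, 6):
        phase = PHASES[round_no]
        # Assumption: models answer in parallel, so each one sees the history
        # as it stood at the start of the round, not replies from this round.
        snapshot = list(history)
        for model in models:
            prompt = f"[round {round_no}: {phase}] {question}"
            reply = ask(model, prompt, snapshot)
            history.append(f"{model} (round {round_no}, {phase}): {reply}")
    return history
```

Swapping `ask` for a function that calls a real endpoint is all that is needed to try it; with a stub that returns a fixed string, two models produce ten history entries.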
This was meant to be a demo of the parallel models capability, but people seem interested in the debate idea itself… I think the actual debate performance could be improved significantly!
Source code is in this PR here if anyone wants to hack on it: https://github.com/lemonade-sdk/lemonade/pull/648
I tried pausing and reading, and their “arguments” were so yes-man-like that they didn’t seem to want to debate or lean one way or the other. They were basically saying that it depends on context, or that it could be seen either way. Which is fine, but meaningless when the goal is to come up with an answer. Any question can be answered with “it depends” without really answering it in a satisfying way.
I think it would make more sense to either use an odd number of LLMs, or let them abstain if they are undecided, to try to force them to come up with a clear-cut answer.
Then there is also the issue of swarm intelligence, which does not get used here at all, because it only works if the voters DO NOT discuss their thinking before the vote and thereby influence each other. One LLM could be confidently wrong, but because they are all such yes-men, the strongest, most confident-sounding voice might outweigh the correct “thinking”.
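To make the voting suggestion concrete, here is a rough sketch of a tally that allows abstentions and refuses to declare a winner on a tie. The `tally` function and the `"abstain"` label are hypothetical, not something from the PR, and the sketch assumes each vote was collected from a model that saw only the question, before any shared discussion.

```python
from collections import Counter
from typing import Iterable, Optional


def tally(votes: Iterable[str], abstain: str = "abstain") -> Optional[str]:
    """Return the winning option, or None if there is no clear winner.

    Assumption: votes were gathered independently, one per model.
    """
    # Normalize so "Yes" and "yes " count as the same option.
    counts = Counter(v.strip().lower() for v in votes)
    counts.pop(abstain, None)  # abstentions do not count for any side
    if not counts:
        return None  # everyone abstained
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # top two options are tied
    return ranked[0][0]
```

With an odd number of voters and a yes/no question, a tie can only happen if someone abstains, which is why the two ideas work best together.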
So yeah, this seems like a bad approach to a really interesting problem.
Here are some interesting reads on this topic:
I agree it’s a bad approach, but also fairly entertaining. I’d love to see it done in a more rigorous fashion though.
I think it has the opposite problem: they all start with the same question at the same time, and most vote immediately.
What do you mean by immediately? The votes happen in the last third of the video.


