Testing Qwen3 MAX
Qwen3 MAX Release
- Qwen3 MAX has been released, a behemoth with 1T parameters!
- The various benchmark results are very strong, so I couldn’t resist testing it out.
The 12-Ball Weighing Puzzle
- One day, I was browsing Zhihu and came across the 12-ball weighing puzzle. The author said that if you could solve this problem independently, you had a talent for math. I was on a business trip at the time, and I thought about it the whole way on the high-speed train. That night, I lay in bed and thought about it for a while longer before I solved it. It seems I have a little bit of a talent for math, but not a lot 😅
- The problem is: You have 12 balls, exactly one of which has a different weight, but you don’t know if it’s lighter or heavier. How do you find this ball and determine if it’s lighter or heavier using a balance scale at most three times?
- The solution I came up with: Label the balls 1-12 and divide them into three groups: Group 1: 1, 2, 3, 4; Group 2: 5, 6, 7, 8; Group 3: 9, 10, 11, 12.
- First weighing: Group 1 vs. Group 2
- If Group 1 = Group 2, the abnormal ball is in Group 3. For the second weighing, use (9, 10, 11) vs. (1, 2, 3).
- If (9, 10, 11) = (1, 2, 3), the abnormal ball is 12. For the third weighing, use 1 vs. 12 to determine if 12 is lighter or heavier. Task complete.
- If (9, 10, 11) > (1, 2, 3), the abnormal ball is among 9, 10, 11 and is heavier. For the third weighing, use 9 vs. 10. Whichever is heavier is the abnormal ball. If 9 = 10, then 11 is the abnormal ball and is heavier. Task complete. (If (9, 10, 11) < (1, 2, 3), the same steps apply with “lighter” in place of “heavier.”) Here, I found a key point: If you have 3 balls left and you know whether the abnormal ball is lighter or heavier, one more weighing is enough to find it.
- If Group 1 > Group 2, the abnormal ball could be in Group 1 and is heavier, or in Group 2 and is lighter. For the second weighing, use (1, 2, 3, 5) vs. (9, 10, 11, 4).
- If (1, 2, 3, 5) = (9, 10, 11, 4), the abnormal ball is in 6, 7, 8 and is lighter. Referencing the key point above, one more weighing is enough to complete the task.
- If (1, 2, 3, 5) > (9, 10, 11, 4), since 4 is in Group 1 it can only be a heavy ball, and 5 is in Group 2 it can only be a light ball, the abnormal ball can only be among 1, 2, 3 and is heavier. Referencing the key point above, one more weighing is enough to complete the task.
- If (1, 2, 3, 5) < (9, 10, 11, 4), since 1, 2, 3 can only be heavy balls, the abnormal ball can only be 4 (heavy) or 5 (light). For the third weighing, weigh a normal ball, 9, against 4: if they balance, 5 is the light ball; otherwise 4 is the heavy ball. Task complete.
- If Group 1 < Group 2, this is symmetric to the case where Group 1 > Group 2. The second weighing can also use (1, 2, 3, 5) vs. (9, 10, 11, 4).
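The decision procedure above can be checked by brute force. The following is a minimal sketch, not part of the original solution: it assumes a hypothetical normal weight of 10 and a deviation of ±1 (only the sign matters), and the names `Scale`, `pick_of_three`, `solve`, and `verify` are invented for illustration. It runs the strategy against all 24 scenarios (12 balls, each heavy or light) and counts the weighings used.

```python
from itertools import product

NORMAL = 10  # hypothetical unit weight; only the sign of the deviation matters

class Scale:
    """Balance scale hiding one odd ball; counts how often it is used."""

    def __init__(self, odd, delta):
        self.odd, self.delta, self.uses = odd, delta, 0

    def weigh(self, left, right):
        self.uses += 1
        def total(side):
            return sum(NORMAL + (self.delta if b == self.odd else 0) for b in side)
        l, r = total(left), total(right)
        return ">" if l > r else "<" if l < r else "="

def pick_of_three(scale, a, b, c, heavy):
    """Key point: three suspects with a known direction need one weighing."""
    res = scale.weigh([a], [b])
    if res == "=":
        return c
    # `a` is the odd ball if the scale tips the way its direction predicts.
    return a if (res == ">") == heavy else b

def solve(scale):
    """Return (odd ball, is_heavy) following the weighings described above."""
    first = scale.weigh([1, 2, 3, 4], [5, 6, 7, 8])
    if first == "=":
        second = scale.weigh([9, 10, 11], [1, 2, 3])
        if second == "=":
            return 12, scale.weigh([12], [1]) == ">"
        heavy = second == ">"
        return pick_of_three(scale, 9, 10, 11, heavy), heavy
    # If Group 1 was heavier, its balls can only be heavy and Group 2's only
    # light; if Group 1 was lighter, the directions swap.
    g1_heavy = first == ">"
    second = scale.weigh([1, 2, 3, 5], [9, 10, 11, 4])
    if second == "=":
        return pick_of_three(scale, 6, 7, 8, not g1_heavy), not g1_heavy
    if second == first:
        return pick_of_three(scale, 1, 2, 3, g1_heavy), g1_heavy
    # The tilt flipped, so the culprit is 4 (Group 1) or 5 (Group 2).
    if scale.weigh([9], [4]) == "=":
        return 5, not g1_heavy
    return 4, g1_heavy

def verify():
    """Check all 24 scenarios and that no run needs more than 3 weighings."""
    for odd, delta in product(range(1, 13), (+1, -1)):
        scale = Scale(odd, delta)
        if solve(scale) != (odd, delta > 0) or scale.uses > 3:
            return False
    return True

print(verify())  # True
```

Handling the Group 1 < Group 2 case with the same second weighing and flipped directions mirrors the symmetry argument in the last bullet.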
Qwen3 MAX’s Solution
I gave the problem to Qwen3 MAX. Below is its solution:
This is a classic “12-ball weighing puzzle.” The goal is to use a balance scale three times to find the one ball out of 12 with an abnormal weight (without knowing if it’s heavier or lighter) and determine if it’s heavier or lighter.
✅ Problem-solving strategy: Grouping + Maximizing information + Elimination
We’ll number the 12 balls: 1, 2, 3, …, 12
🧩 First weighing:
Left side: 1, 2, 3, 4
Right side: 5, 6, 7, 8
Case one: Balanced (=) → This means the abnormal ball is in 9, 10, 11, 12
Second weighing:
Left side: 9, 10, 11
Right side: 1, 2, 3 (known to be normal balls)
- If balanced → The abnormal ball is 12
- Third weighing: 12 vs. 1 → If 12 is heavy, then 12 is the heavy one; if 12 is light, then 12 is the light one.
- If the left side is heavy (9, 10, 11) > (1, 2, 3) → The abnormal ball is among 9, 10, 11, and it’s a heavy one.
- Third weighing: 9 vs. 10
- If 9 = 10 → 11 is the heavy one
- If 9 > 10 → 9 is the heavy one
- If 9 < 10 → 10 is the heavy one
- If the left side is light (9, 10, 11) < (1, 2, 3) → The abnormal ball is among 9, 10, 11, and it’s a light one.
- Third weighing: 9 vs. 10
- If 9 = 10 → 11 is the light one
- If 9 < 10 → 9 is the light one
- If 9 > 10 → 10 is the light one
Case two: Left side is heavy (1, 2, 3, 4) > (5, 6, 7, 8) → The abnormal ball is in 1-8, and 9-12 are normal → Either there’s a heavy one in 1-4, or there’s a light one in 5-8.
Second weighing: Left side: 1, 2, 5; Right side: 3, 6, 9 (9 is a normal ball)
Analysis:
- If balanced → The suspects are 4, 7, 8. 4 could be heavy, and 7 or 8 could be light.
- Third weighing: 7 vs. 8
- If 7 = 8 → 4 is the heavy one
- If 7 < 8 → 7 is the light one
- If 7 > 8 → 8 is the light one
- If the left side is heavy (1, 2, 5) > (3, 6, 9) → It could be 1 or 2 is heavy, or 6 is light (because 6 is on the right side, becoming lighter would make the left side appear heavier).
- Third weighing: 1 vs. 2
- If 1 = 2 → 6 is the light one
- If 1 > 2 → 1 is the heavy one
- If 1 < 2 → 2 is the heavy one
- If the left side is light (1, 2, 5) < (3, 6, 9) → It could be 3 is heavy (3 is on the right, becoming heavy would make the left side appear lighter), or 5 is light (5 is on the left, becoming light would make the left side lighter).
- Third weighing: 3 vs. 9 (9 is normal)
- If 3 = 9 → 5 is the light one
- If 3 > 9 → 3 is the heavy one (3 cannot be < 9, because there is only one abnormal ball and it’s either heavy or light)
Case three: Left side is light (1, 2, 3, 4) < (5, 6, 7, 8) → Symmetric to “Left side is heavy,” just with roles reversed → The abnormal ball is in 1-8, and 9-12 are normal → Either there’s a light one in 1-4, or a heavy one in 5-8.
The following content is similar to case two, abbreviated for length…
✅ Summary:
Through three weighings, no matter which group the abnormal ball is in, we can uniquely determine which ball it is and whether it’s heavier or lighter.
Key points:
- The first weighing divides the balls into three groups (weighing 4 vs. 4, leaving 4), narrowing the scope.
- The second weighing cleverly combines the suspect balls with known normal balls to create information differences.
- The third weighing uses a one-on-one comparison or a comparison with a normal ball to pinpoint the target.
📌 Core idea: Each weighing must maximize the information gained, distinguishing the state of the balls (normal / heavy / light).
This is a classic application of information theory and a divide-and-conquer strategy!
✅ Problem solved!
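The second weighing in case two can be checked by grouping the eight suspects according to the result each would produce. This is a rough sketch with invented helper names (`SUSPECTS`, `outcome`, `partition`); labels such as `1H` and `6L` are shorthand for “ball 1 heavy” and “ball 6 light.”

```python
from collections import defaultdict

# Suspects after the first weighing tips left: 1-4 can only be heavy (+1),
# 5-8 can only be light (-1). Balls 9-12 are known to be normal.
SUSPECTS = [(b, +1) for b in (1, 2, 3, 4)] + [(b, -1) for b in (5, 6, 7, 8)]

def outcome(left, right, ball, sign):
    """Result of weighing `left` vs `right` if `ball` deviates by `sign`."""
    if ball in left:
        return ">" if sign > 0 else "<"
    if ball in right:
        return "<" if sign > 0 else ">"
    return "="  # the odd ball is off the scale

def partition(left, right):
    """Group the suspects by the result they would produce."""
    groups = defaultdict(list)
    for ball, sign in SUSPECTS:
        label = f"{ball}{'H' if sign > 0 else 'L'}"
        groups[outcome(left, right, ball, sign)].append(label)
    return dict(groups)

print(partition({1, 2, 5}, {3, 6, 9}))
# {'>': ['1H', '2H', '6L'], '<': ['3H', '5L'], '=': ['4H', '7L', '8L']}
```

The three groups match the suspects listed in Qwen3 MAX's analysis, and none holds more than three candidates, which is what the single remaining weighing can resolve.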
My follow-up question
I found its solution was different from mine, so I asked it:
Case two: Left side is heavy (1, 2, 3, 4) > (5, 6, 7, 8) → The abnormal ball is in 1-8, and 9-12 are normal → Either there’s a heavy one in 1-4, or a light one in 5-8. For the second weighing, can we use: Left side 1, 2, 3, 5 vs. right side 4, 9, 10, 11?
Qwen3 MAX’s answer to the follow-up question
Case two: Left side is heavy (1, 2, 3, 4) > (5, 6, 7, 8) → The abnormal ball is in 1-8, and 9-12 are normal → Either there’s a heavy one in 1-4, or a light one in 5-8.
Can the second weighing use: Left side 1, 2, 3, 5 vs. right side 4, 9, 10, 11?
🧩 Current premise:
- First: 1, 2, 3, 4 > 5, 6, 7, 8 → The abnormal ball is in 1-8
- Possibilities:
- One of 1, 2, 3, 4 is a heavy one
- Or one of 5, 6, 7, 8 is a light one
- 9-12 are normal balls (✅ known)
Your proposed second weighing:
Left: 1, 2, 3, 5 Right: 4, 9, 10, 11
Where 9, 10, 11 are known normal balls → Great, they can be used as “weights.”
🔍 Analyzing the three possible outcomes of the second weighing:
✅ Case A: Second weighing is balanced (=) → Left side equals right side → This means 1, 2, 3, 5, 4 are all normal (because the right side has three normal balls; if the left side had an abnormal ball, it would be unbalanced) → So the abnormal ball is in 6, 7, 8, and because they were on the “light side” in the first weighing, one of them is a light one.
Third weighing: 6 vs. 7
- If 6 = 7 → 8 is the light one
- If 6 < 7 → 6 is the light one
- If 6 > 7 → 7 is the light one
✅ Feasible!
✅ Case B: Second weighing is left-heavy (1, 2, 3, 5) > (4, 9, 10, 11) → The left side is heavy, and the right side has three normal balls (9, 10, 11), so the right side’s weight is “standard” → The reason the left side is heavy could be:
- One of 1, 2, 3 is a heavy one (because they are on the left side)
- Or 4 is a light one (because 4 is on the right side, becoming lighter would make the right side lighter → make the left side appear heavier)
- 5 is on the left side. If it were a light one, it would make the left side lighter → It wouldn’t cause “left-heavy,” so 5 cannot be the abnormal ball (because in the first weighing, 5 was in the “light group,” it can only be a light one, but here the left side is heavy, and 5 being light would weaken the left side, a contradiction) → So the possibilities are narrowed to:
- One of 1, 2, 3 is a heavy one
- Or 4 is a light one
Third weighing: 1 vs. 2
- If 1 = 2 → The abnormal ball is 3 (heavy) or 4 (light) → ❗Cannot distinguish! → There’s a problem!
At this point, we don’t know if it’s 3 that’s heavy or 4 that’s light, and we only have one weighing opportunity! → If we weigh 1 vs. 2:
- 1 = 2 → Problem: Is 3 heavy? Or is 4 light? → Cannot be determined
- 1 > 2 → 1 is the heavy one
- 1 < 2 → 2 is the heavy one → So when 1 = 2, we are left with ambiguity!
❌ Failure! Cannot be uniquely determined.
🚫 Root of the problem:
In the second weighing, you put “suspect heavy candidates” (1, 2, 3) and “suspect light candidates” (4) on opposite sides of the scale, which means that when they “cancel out their abnormal effects” (e.g., 3 heavy vs. 4 light), they cannot be distinguished in the third weighing.
My second follow-up question
- Seeing its solution, I double-checked my own solution to see if there were any issues.
- After confirming my solution was correct, I checked its second solution.
- Then I asked a second follow-up question:
Your analysis is incorrect. In ✅ Case B: When the second weighing is left-heavy ((1, 2, 3, 5) > (4, 9, 10, 11)), 4 cannot be light.
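The claim that 4 cannot be light can also be checked mechanically. The sketch below is an illustration under assumed weights (normal = 10, deviation = ±1) with hypothetical helper names: it keeps only the hypotheses that reproduce every observed result.

```python
NORMAL = 10  # hypothetical unit weight; only the sign of the deviation matters

def weigh(left, right, odd, delta):
    """Simulate one weighing when ball `odd` weighs NORMAL + delta."""
    def total(side):
        return sum(NORMAL + (delta if b == odd else 0) for b in side)
    l, r = total(left), total(right)
    return ">" if l > r else "<" if l < r else "="

def consistent(observations):
    """All (ball, 'heavy'/'light') hypotheses matching every observed result."""
    result = []
    for odd in range(1, 13):
        for delta, name in ((+1, "heavy"), (-1, "light")):
            if all(weigh(l, r, odd, delta) == seen for l, r, seen in observations):
                result.append((odd, name))
    return result

first = ([1, 2, 3, 4], [5, 6, 7, 8], ">")
second = ([1, 2, 3, 5], [4, 9, 10, 11], ">")

print(consistent([first, second]))
# [(1, 'heavy'), (2, 'heavy'), (3, 'heavy')]
print(weigh(first[0], first[1], odd=4, delta=-1))
# '<'  (a light ball 4 would have made the first weighing left-light)
```

Only “1, 2, or 3 is heavy” survives both observations, so the 3-heavy versus 4-light ambiguity never arises.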
Qwen3 MAX’s answer to the second follow-up question
You are absolutely right! 👏
My previous analysis here was indeed incorrect—in “Case B: The second weighing is left-heavy,” 4 cannot be light!
Let’s re-analyze this key point rigorously.
🧩 Reviewing the premise:
First weighing: 1, 2, 3, 4 > 5, 6, 7, 8 → Left side is heavy → The abnormal ball is in 1-8 → Either one of 1, 2, 3, 4 is a heavy one → Or one of 5, 6, 7, 8 is a light one → 9-12 are normal balls ✅
🆕 Your proposed second weighing:
Left pan: 1, 2, 3, 5 Right pan: 4, 9, 10, 11
Where 9, 10, 11 are known normal balls → Right pan’s baseline weight = weight of 4 normal balls (because of 4 + 3 normal ones).
🎯 Focus analysis: Case B — Second weighing result: Left side is heavy (1, 2, 3, 5) > (4, 9, 10, 11)
Let’s check which balls could cause this result, one by one:
Assume one of 1, 2, 3 is a heavy one: → They are in the left pan → Cause the left side to be heavier → ✅ Matches “left-heavy”
Assume 5 is a light one: → 5 is in the left pan → The left pan becomes lighter → Would cause the left side to be lighter or balanced → ❌ Contradicts “left-heavy” → 5 cannot be the abnormal ball
Assume 4 is a light one: → 4 is in the right pan → The right pan becomes lighter → This makes the left side relatively heavier → ✅ Seems to match?
⚠️ But there’s a key logical trap here! → If 4 is a light one, then in the first weighing, it was on the left pan (1, 2, 3, 4), and it becoming lighter would cause the left side to be lighter, but the first weighing result was left-heavy → ❗Contradiction!
✅ Core reasoning:
If a ball caused the “left-heavy” result in the first weighing, then if it’s the abnormal ball, it can only be:
- On the left side (1-4) and a heavy one
- Or on the right side (5-8) and a light one
→ Therefore:
- 4 was on the left side of the first weighing → If 4 is the abnormal ball, it can only be a heavy one (not a light one!)
- Otherwise, if 4 were a light one → The first weighing’s left side would become lighter → Contradicts “left-heavy” → ❌ Impossible!
🔍 So, in the case of the second weighing being “left-heavy,” the only possible cause is:
- One of 1, 2, 3 is a heavy one (in the left pan, making the left side heavier) ✅
- 4 being light? → ❌ Impossible! Because it contradicts the first weighing!
- 5 being light? → In the left pan, it would make the left side lighter → Contradicts “left-heavy” → ❌ Impossible
- Other balls? → 6, 7, 8 weren’t on the scale, so they don’t affect this result; 9-11 are normal.
→ So the only possibility is: one of 1, 2, 3 is a heavy one!
✅ Third weighing plan:
Weigh directly: 1 vs. 2
- If 1 = 2 → 3 is the heavy one
- If 1 > 2 → 1 is the heavy one
- If 1 < 2 → 2 is the heavy one
✅ Perfectly solved! No need to consider 4 again!
✅ Corrected full analysis (for your proposed solution):
Second weighing: 1, 2, 3, 5 vs. 4, 9, 10, 11
Case A: Balanced → The abnormal ball is in 6, 7, 8, and it’s a light one → Third weighing: 6 vs. 7 → Find out which is light, or if 8 is light
Case B: Left-heavy → The abnormal ball is in 1, 2, 3, and it’s a heavy one → Third weighing: 1 vs. 2 → Find out which is heavy, or if 3 is heavy
Case C: Left-light → Left side being light means: → Possible causes:
- 5 is a light one (in the left pan, becomes lighter → left side is lighter) ✅
- 4 is a heavy one (in the right pan, becomes heavier → right side is heavier → makes the left side appear lighter) ✅
→ Note: If 1, 2, 3 were heavy, they would make the left side heavier → Contradicts “left-light” → ❌ Excluded
→ So it can only be: 5 is a light one OR 4 is a heavy one
→ Third weighing: 4 vs. 9
- If 4 > 9 → 4 is the heavy one
- If 4 = 9 → 5 is the light one
✅ Perfectly distinguished!
✅ Final conclusion:
Your proposed second weighing plan:
Left: 1, 2, 3, 5 Right: 4, 9, 10, 11
✅ Is completely correct and efficient!
Combined with the correct third weighing:
- Balanced → Weigh 6 vs. 7
- Left-heavy → Weigh 1 vs. 2
- Left-light → Weigh 4 vs. 9
→ Within three weighings, we are 100% able to find the abnormal ball and determine its weight!
🎉 Added value:
Your plan is more “symmetric” and intuitive than the classic plan (1, 2, 5 vs. 3, 6, 9) — putting 1, 2, 3 together makes the logic clearer. As long as the reasoning is rigorous, it can completely replace the classic plan!
Thank you again for the correction — it’s this kind of rigorous scrutiny that leads to perfect solutions! 🧠💯
✅ Problem fully solved!
GPT5’s Solution
This is the classic “12 balls, three weighings” problem. Here is an actionable solution that can also determine if the ball is heavy or light (using numbers 1–12 for the balls):
First weighing
Put 1, 2, 3, 4 on the left pan; put 5, 6, 7, 8 on the right pan.
Case A: The scale is balanced
This means the abnormal ball is in 9, 10, 11, 12, and its weight difference is unknown. Second weighing: Put 9, 10, 11 (left) vs. 1, 2, 3 (right, known to be normal).
- If balanced: The abnormal ball is 12. Third weighing: 12 vs. 1 → The side that is heavier is the “heavy ball”; the side that is lighter is the “light ball.”
- If left is heavy: The abnormal ball is in {9, 10, 11} and is a heavy ball. Third weighing: 9 vs. 10
- If 9=10 → 11 is heavy; If 9>10 → 9 is heavy; If 9<10 → 10 is heavy.
- If left is light: The abnormal ball is in {9, 10, 11} and is a light ball. Third weighing: 9 vs. 10
- If 9=10 → 11 is light; If 9<10 → 9 is light; If 9>10 → 10 is light.
Case B: First weighing is left-heavy (noted as “left-heavy”)
In this case, the abnormal ball can only be a heavy ball in {1, 2, 3, 4} or a light ball in {5, 6, 7, 8}; 9, 10, 11, 12 are all normal. Second weighing: Put 1, 2, 5 (left) vs. 3, 6, 9 (right; 9 is a normal reference).
- If balanced: The suspects are narrowed down to {4 heavy, 7 light, 8 light}.
- Third weighing: 7 vs. 8 → If balanced, then 4 is heavy; If 7<8, then 7 is light; If 7>8, then 8 is light.
- If the second weighing is left-heavy: The suspects are {1 heavy, 2 heavy, 6 light}.
- Third weighing: 1 vs. 2 → If unequal, you can determine which is heavy; if equal, then 6 is light.
- If the second weighing is left-light: The suspects are {3 heavy, 5 light}.
- Third weighing: 3 vs. 9 (9 is normal) → If equal, then 5 is light; If 3>9, then 3 is heavy.
Case C: First weighing is left-light
Completely symmetric to Case B (just swap “left/right” and “heavy/light”): Second weighing: Still use 1, 2, 5 (left) vs. 3, 6, 9 (right).
- If balanced: The suspects are {4 light, 7 heavy, 8 heavy}.
- Third weighing: 7 vs. 8 → If equal, then 4 is light; If 7>8, then 7 is heavy; If 7<8, then 8 is heavy.
- If the second weighing is left-heavy: The suspects are {3 light, 5 heavy}.
- Third weighing: 3 vs. 9 → If equal, then 5 is heavy; If 3<9, then 3 is light.
- If the second weighing is left-light: The suspects are {1 light, 2 light, 6 heavy}.
- Third weighing: 1 vs. 2 → If unequal, you can determine which is light; if equal, then 6 is heavy.
Following the above procedure, you will certainly determine the unique abnormal ball and whether it’s heavier or lighter within 3 weighings.
GPT5’s answer to the follow-up question
My follow-up question: For Case B and Case C, would using the method of left 1, 2, 3, 5 and right 4, 9, 10, 11 be feasible?
GPT5's answer:
Yes—this second weighing method is feasible for both Case B (first weighing left-heavy) and Case C (first weighing left-light). I will explain the logic separately, step-by-step, verifying that the three possible weighing results (balanced, left-heavy, left-light) can all narrow down the suspect to 1 and determine if it’s heavy/light.
Let’s call the second weighing: Left pan 1, 2, 3, 5 vs. Right pan 4, 9, 10, 11 (and 9–11, 12 are known normal references in Case B/C).
Case B (First weighing is left-heavy: suspects are 1–4 heavy OR 5–8 light)
- Second weighing is balanced: Then no heavy ball from 1–4 can make the two sides equal (because 1/2/3 on the left would make the left side heavy, and 4 on the right would make the right side heavy; if 5 were light, it would also make the left side lighter). Therefore, a balance can only come from 6, 7, or 8 being light (none of which are on the scale), so the suspects become {6,7,8} (and they are light). Third weighing: 6 vs. 7 → Determine which is light, or if balanced then 8 is light.
- Second weighing is left-heavy: Only 1, 2, or 3 being heavy would make the left side heavy (5 being light would only make the left side light, and 4 being heavy on the right would make the right side heavy). Therefore, the suspects are {1 heavy, 2 heavy, 3 heavy}. Third weighing: 1 vs. 2 → If unequal, you can determine which is heavy; if equal, then 3 is heavy.
- Second weighing is left-light (i.e., right-heavy): Only two possibilities: 4 is heavy (on the right) or 5 is light (on the left) would make the right side heavy/left side light. Therefore, the suspects are {4 heavy, 5 light}. Third weighing: Weigh 4 vs. 9 (normal) → If 4 is heavy, the right (or 4) side will be heavy, confirming 4 is heavy; if balanced, it means 5 is light.
Conclusion: Case B is completely covered; the ball can be determined and its weight difference found within three weighings.
Case C (First weighing is left-light: suspects are 1–4 light OR 5–8 heavy) — Symmetric to B
Analyze using the same second weighing setup:
- Second weighing is balanced: Can only be from 6, 7, or 8 being heavy (not on the scale) → Suspects {6,7,8} (heavy). Third weighing: 6 vs. 7 can distinguish which is heavy, or if balanced then 8 is heavy.
- Second weighing is left-heavy (left is heavier than right): Could be 4 is light (on the right, making the right side lighter and thus the left relatively heavier) or 5 is heavy (on the left, making the left side heavier) → Suspects {4 light, 5 heavy}. Third weighing: Weigh 4 vs. 9 → If 4 is light, it will be determined; if balanced, then 5 is heavy.
- Second weighing is left-light: Only 1, 2, or 3 being light would make the left side lighter → Suspects {1 light, 2 light, 3 light}. Third weighing: 1 vs. 2 → If unequal, it can be determined; if equal, then 3 is light.
Conclusion: Case C is also covered; three weighings are enough to determine and classify the ball.
Summary
The second weighing method you provided (Left 1, 2, 3, 5 — Right 4, 9, 10, 11) is feasible when the first weighing is “left-heavy” or “left-light.” Each of the three possible results from the second weighing can narrow the suspects down to 1–2 balls, and the third weighing can uniquely determine which ball is abnormal and whether it’s heavier or lighter.
Gemini 2.5 pro’s Solution
Okay, this is a very classic logic puzzle. Here are the detailed steps, with the balls numbered 1 to 12.
The core idea is: a balance scale has three possible outcomes (left heavy, right heavy, balanced), so three weighings can distinguish up to 3^3 = 27 situations. We need to distinguish between the possibilities of each of the 12 balls being “heavy” or “light” (12 × 2 = 24 situations), so it is theoretically possible.
Preparation
- Divide the 12 balls into three groups:
- Group A: 1, 2, 3, 4
- Group B: 5, 6, 7, 8
- Group C: 9, 10, 11, 12
First Weighing
Place Group A (1, 2, 3, 4) on the left pan and Group B (5, 6, 7, 8) on the right pan.
There are three possible outcomes for this weighing:
Case One: Scale is balanced (Group A = Group B)
- Analysis: This means the problem ball is not in Group A or Group B, so it must be in the unweighed Group C (9, 10, 11, 12). We also now know that balls 1 to 8 are all of standard weight.
- Second Weighing: Take 3 balls from Group C (9, 10, 11) and place them on the left pan. Take 3 balls from the standard-weight balls (1 to 8, any 3, for example 1, 2, 3) and place them on the right pan.
- If the scale is balanced: This means balls 9, 10, 11 are normal, so the problem ball is the remaining ball 12.
- Third Weighing: Weigh ball 12 against any standard ball (like ball 1). If 12 is heavier, it’s the heavy ball; if 12 is lighter, it’s the light ball.
- If the left pan is heavy (9, 10, 11 > 1, 2, 3): This means the problem ball is in 9, 10, 11 and is a heavy ball.
- Third Weighing: Place 9 on the left pan and 10 on the right pan. If the left is heavy, then 9 is the heavy ball; if the right is heavy, then 10 is the heavy ball; if they are balanced, then 11 is the heavy ball.
- If the left pan is light (9, 10, 11 < 1, 2, 3): This means the problem ball is in 9, 10, 11 and is a light ball.
- Third Weighing: Place 9 on the left pan and 10 on the right pan. If the left is light, then 9 is the light ball; if the right is light, then 10 is the light ball; if they are balanced, then 11 is the light ball.
Case Two: Left pan is heavy (Group A > Group B)
- Analysis: This means the problem ball is in Group A or Group B. The possibilities are: a heavy ball in Group A (1,2,3,4), or a light ball in Group B (5,6,7,8). Group C (9,10,11,12) are all standard balls.
- Second Weighing: Re-group. Place 1, 2, 5 (2 from Group A, 1 from Group B) on the left pan; place 3, 6, 9 (1 from Group A, 1 from Group B, 1 from Group C standard balls) on the right pan. 4, 7, 8 are set aside.
- If the scale is balanced: The problem ball must be among the three balls set aside: 4, 7, 8. Based on the first weighing, ball 4 can only be a heavy ball, and balls 7 and 8 can only be light balls.
- Third Weighing: Weigh 7 against 8. If the left is light, 7 is the light ball; if the right is light, 8 is the light ball; if they are balanced, it means the problem ball is 4, and it is a heavy ball.
- If the left pan is heavy (1, 2, 5 > 3, 6, 9): The problem ball must be among 1, 2, 6. The possibilities are: 1 or 2 is the heavy ball, or 6 is the light ball. (If 5 were the problem ball, it must be light, so the left pan wouldn’t be heavy; if 3 were the problem ball, it must be heavy, so the right pan wouldn’t be light).
- Third Weighing: Weigh 1 against 2. If the left is heavy, 1 is the heavy ball; if the right is heavy, 2 is the heavy ball; if they are balanced, then the problem ball is 6, and it is a light ball.
- If the left pan is light (1, 2, 5 < 3, 6, 9): The problem ball must be among 5, 3. The possibilities are: 5 is the light ball, or 3 is the heavy ball. (If 1 and 2 were the problem balls, they must be heavy, and the left pan wouldn’t be light; if 6 were the problem ball, it must be light, and the right pan wouldn’t be heavy).
- Third Weighing: Weigh 5 against a standard ball (like 9). If 5 is light, it is the light ball; if they are balanced, then the problem ball is 3, and it is a heavy ball.
Case Three: Right pan is heavy (Group A < Group B)
- Analysis: This is completely symmetric to Case Two. The possibilities are: a light ball in Group A (1,2,3,4), or a heavy ball in Group B (5,6,7,8).
- Second Weighing: The weighing setup is exactly the same as in Case Two. Place 1, 2, 5 on the left pan; place 3, 6, 9 on the right pan.
- If the scale is balanced: The problem ball is among the three balls set aside: 4, 7, 8. 4 can only be a light ball, and 7 and 8 can only be heavy balls.
- Third Weighing: Weigh 7 against 8. The heavier one is the heavy ball; if they are balanced, then 4 is the light ball.
- If the left pan is heavy (1, 2, 5 > 3, 6, 9): The problem ball is among 5, 3. The possibilities are: 5 is the heavy ball, or 3 is the light ball.
- Third Weighing: Weigh 5 against a standard ball (like 9). If 5 is heavy, it is the heavy ball; if they are balanced, then 3 is the light ball.
- If the left pan is light (1, 2, 5 < 3, 6, 9): The problem ball is among 1, 2, 6. The possibilities are: 1 or 2 is the light ball, or 6 is the heavy ball.
- Third Weighing: Weigh 1 against 2. The lighter one is the light ball; if they are balanced, then 6 is the heavy ball.
Through these three steps, no matter which situation arises, you can accurately find the special ball and determine if it’s heavier or lighter within three weighings.
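The counting argument at the start of Gemini's answer can be written out as a quick check. This is only the information bound: it shows that three weighings are not ruled out, not that a working strategy exists (the case analysis above does that).

```python
weighings = 3
balls = 12

outcomes = 3 ** weighings   # each weighing: left-heavy, right-heavy, balanced
hypotheses = balls * 2      # each ball could be the heavy one or the light one

print(outcomes, hypotheses, outcomes >= hypotheses)  # 27 24 True
```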
Gemini 2.5 pro’s answer to the follow-up question
Excellent question! This is a great exploration, and we’ll analyze whether the method you proposed is feasible.
A quick review of where we are:
- First weighing result: Left pan (1, 2, 3, 4) > Right pan (5, 6, 7, 8).
- Our known deductions: The abnormal ball is in 1,2,3,4 and is heavy, or it is in 5,6,7,8 and is light. Balls 9,10,11,12 are all standard weight.
Your proposed second weighing method:
- Left pan: 1, 2, 3, 5
- Right pan: 4, 9, 10, 11
- Off the scale: 6, 7, 8, 12
Let’s analyze the three possible outcomes of this weighing:
Case 2.1: Scale is balanced
- Analysis: If the scale is balanced, it means all the balls on the scale (1,2,3,5 and 4,9,10,11) are standard weight. The abnormal ball must be one of the balls that were off the scale: 6, 7, 8. (We know 12 is a standard ball).
- Based on the conclusion from the first weighing, we know that if 6, 7, 8 are the abnormal balls, they must be light balls.
- Third Weighing: Weigh ball 6 and ball 7.
- If they are balanced, then ball 8 is the light ball.
- If the left pan (ball 6) is light, then ball 6 is the light ball.
- If the right pan (ball 7) is light, then ball 7 is the light ball.
- Conclusion: This path is feasible.
Case 2.2: Left pan is heavy
- Analysis: Left pan (1,2,3,5) > Right pan (4,9,10,11). Let’s analyze the cause:
- It cannot be a problem with ball 5, because if it’s abnormal, it must be light, and the left pan would only become lighter.
- It cannot be a problem with ball 4, because if it’s abnormal, it must be heavy, and it would make the right pan heavier, which contradicts the “left pan is heavy” result.
- Therefore, the problem must be with balls 1, 2, 3, and they are heavy balls.
- Third Weighing: Weigh ball 1 and ball 2.
- If they are balanced, then ball 3 is the heavy ball.
- If the left pan (ball 1) is heavy, then ball 1 is the heavy ball.
- If the right pan (ball 2) is heavy, then ball 2 is the heavy ball.
- Conclusion: This path is feasible.
Case 2.3: Right pan is heavy (i.e., left pan is light)
- Analysis: Left pan (1,2,3,5) < Right pan (4,9,10,11). Let’s analyze the cause:
- It cannot be a problem with balls 1, 2, 3, because if they were abnormal, they must be heavy balls, and they would make the left pan heavy.
- Therefore, there are only two possibilities: either ball 5 is abnormal, and it is a light ball; or ball 4 is abnormal, and it is a heavy ball.
- Third Weighing: Weigh ball 4 against a known standard ball (like 9).
- If the right pan (ball 4) is heavy, then ball 4 is the heavy ball.
- If they are balanced, it means ball 4 is a standard ball, so the problem ball is 5, and it is a light ball.
- Conclusion: This path is feasible.
Final Answer
Yes, your proposed method of left pan (1,2,3,5) vs. right pan (4,9,10,11) is completely feasible!
It is very clever because no matter which of the three outcomes occurs—balanced, left-heavy, or right-heavy—it can narrow down the remaining possibilities to a range that can be solved in the third weighing. This proves that there is more than one classic solution to this problem.
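The remark that more than one second weighing works can be explored with a small search. The sketch below is an illustration with invented names, for the left-heavy case only. It relies on one observation: a group of at most three suspects with known directions can always be settled by a single weighing (compare two suspects that share a direction, or compare one suspect with a known normal ball). A second weighing therefore passes the check if no result leaves more than three suspects.

```python
from collections import Counter
from itertools import combinations

BALLS = range(1, 13)
# After the first weighing tips left: 1-4 may only be heavy, 5-8 only light.
SUSPECTS = [(b, +1) for b in (1, 2, 3, 4)] + [(b, -1) for b in (5, 6, 7, 8)]

def outcome(left, right, ball, sign):
    """Result of weighing `left` vs `right` if `ball` deviates by `sign`."""
    if ball in left:
        return ">" if sign > 0 else "<"
    if ball in right:
        return "<" if sign > 0 else ">"
    return "="

def is_valid_second(left, right):
    """True if every result leaves at most 3 suspects for the last weighing."""
    sizes = Counter(outcome(left, right, b, s) for b, s in SUSPECTS)
    return max(sizes.values()) <= 3

def all_valid_seconds():
    """Enumerate equal-sized, disjoint pan loadings that pass the check."""
    found = []
    for k in range(1, 7):
        for left in combinations(BALLS, k):
            rest = [b for b in BALLS if b not in left]
            for right in combinations(rest, k):
                # `left < right` skips the mirror image of each weighing.
                if left < right and is_valid_second(set(left), set(right)):
                    found.append((left, right))
    return found

valid = all_valid_seconds()
print(((1, 2, 5), (3, 6, 9)) in valid)          # True
print(((1, 2, 3, 5), (4, 9, 10, 11)) in valid)  # True
print(len(valid), "second weighings pass the check")
```

Both second weighings discussed in this post appear in the list, alongside other loadings that split the suspects just as evenly.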
Conclusion
- When I asked AI models this question before, they would usually give nonsensical answers. Now it seems that as large models’ training data becomes more complete, this problem has been included in the training data, so all three models gave a classic standard answer.
- GPT5’s training data seems to be the most comprehensive; it appears to also know other solutions to this problem.
- Gemini’s final sentence, “This proves that there is more than one classic solution to this problem,” could mean it knows other solutions, or it could have inferred this on its own after the follow-up question. If it’s the latter, it would indicate that its reasoning ability is stronger than Qwen3 MAX’s.
- Qwen3 MAX still lags behind these two leading AIs, but it really seems like it didn’t know the other solution. Its later answers were genuine attempts at reasoning, and when prompted by the user, it could immediately realize its mistake and correct it. That’s still very good!
Advanced Test
I asked all three AIs another question: Besides the two solutions above, are there any other solutions?
This question exposed flaws in GPT5 and Gemini 2.5 pro!
GPT5’s Answer
Yes, there are many equivalent or different solutions to the 12-ball three-weighing problem. Below is a complete alternative solution (derived from a construction method that views the “three weighings” as a 3-digit ternary code). I’ll number the 12 balls 1..12, perform three weighings, and then provide a table on how to directly read which ball is abnormal and whether it’s “heavy” or “light” based on the three results (left-heavy L, right-heavy R, balanced =).
The Solution (Directly actionable)
First weighing (W1): Place balls 2, 3, 4, 5, 6, 7, 8 on the left pan; place ball 1 on the right pan. Second weighing (W2): Place balls 2, 9, 10, 11 on the left pan; place balls 1, 6, 7, 8 on the right pan. Third weighing (W3): Place balls 3, 6, 9, 12 on the left pan; place balls 2, 5, 8, 11 on the right pan.
(For each weighing, put the listed balls on the corresponding pan; other balls are not on the scale.)
The Interpretation (Write the three results in order as a ternary sequence, e.g., L R = means first is left-heavy, second is right-heavy, third is balanced)
The table below directly maps each ternary result to “which ball is abnormal + whether it’s heavy or light”:
- L L L → Ball 2 is heavy.
- R R R → Ball 2 is light.
- L L = → Ball 3 is heavy.
- R R = → Ball 3 is light.
- L L R → Ball 4 is heavy.
- R R L → Ball 4 is light.
- L = L → Ball 5 is heavy.
- R = R → Ball 5 is light.
- L = = → Ball 6 is heavy.
- R = = → Ball 6 is light.
- L = R → Ball 7 is heavy.
- R = L → Ball 7 is light.
- L R L → Ball 8 is heavy.
- R L R → Ball 8 is light.
- L R = → Ball 9 is heavy.
- R L = → Ball 9 is light.
- L R R → Ball 10 is heavy.
- R L L → Ball 10 is light.
- = L L → Ball 11 is heavy.
- = R R → Ball 11 is light.
- = L = → Ball 12 is heavy.
- = R = → Ball 12 is light.
- = L R → Ball 1 is heavy.
- = R L → Ball 1 is light.
(The above 24 cases cover all possible combinations of unequal results from the three weighings, because 3 weighings with three outcomes each give a total of 27 combinations, where === means all weighings were balanced, which means there is no abnormal ball, a case that cannot happen here.)
Brief Explanation (Why there are many solutions)
- The essence is to view the “presence of a ball in the three weighings” as a three-digit code (each digit taking a value: left +1, right -1, not on the scale 0). For each ball, a ternary vector (a combination of 1, 0, -1) is chosen, and a vector and its opposite vector cannot be chosen simultaneously (otherwise heavy/light would be confused). The three weighings are performed by placing the balls with the corresponding numbers on the left or right pans, or not at all, according to these three digits. The three-digit symbol of the weighing results directly corresponds to a certain vector or its negative, thereby determining which ball it is and its weight difference.
- Because there are many sets of ternary vectors to choose from (as long as there are balls on both sides of the scale each time, and the 12 chosen codes are not opposites of each other), more than one feasible solution can be obtained. The solutions you and I provided are different instances of this type of construction.
If you’d like, I can:
- Make the above interpretation table into a more compact table (for easier printing/use in an experiment);
- Provide another known classic solution (for example, the most common “textbook solution”), or explain how to construct such codes yourself (to generate more variations).
Which one would you like to see?
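GPT5’s three fixed weighings can be checked by brute force. The following is a minimal sketch, not part of GPT5’s answer: it simulates all 24 cases (each of the 12 balls heavy or light) and reports any weighing with unequal ball counts and any two cases that produce the same three results. The function names, the weights 10/11/9, and `reference_scheme` (a 4v4 fixed scheme picked only for comparison) are my own assumptions.

```python
def outcome(left, right, weights):
    """Return 'L' if the left pan is heavier, 'R' if the right is, '=' if balanced."""
    left_sum = sum(weights[b] for b in left)
    right_sum = sum(weights[b] for b in right)
    return "L" if left_sum > right_sum else "R" if left_sum < right_sum else "="


def check_fixed_scheme(weighings, n_balls=12):
    """Simulate all 2 * n_balls cases for a fixed list of (left, right) weighings.

    Returns a list of problems; an empty list means every case is uniquely identified.
    """
    problems = []
    for i, (left, right) in enumerate(weighings, start=1):
        if len(left) != len(right):
            problems.append(f"W{i}: {len(left)} balls vs {len(right)} balls")
    seen = {}
    for ball in range(1, n_balls + 1):
        for kind, weight in (("heavy", 11), ("light", 9)):
            weights = {b: 10 for b in range(1, n_balls + 1)}  # assumed normal weight
            weights[ball] = weight
            signature = " ".join(outcome(l, r, weights) for l, r in weighings)
            if signature in seen:
                problems.append(
                    f"{signature}: {seen[signature]} and {(ball, kind)} look identical"
                )
            else:
                seen[signature] = (ball, kind)
    return problems


# The three weighings exactly as GPT5 stated them.
gpt5_scheme = [
    ([2, 3, 4, 5, 6, 7, 8], [1]),
    ([2, 9, 10, 11], [1, 6, 7, 8]),
    ([3, 6, 9, 12], [2, 5, 8, 11]),
]

# Hypothetical comparison scheme (not from any of the answers): 4 balls per pan.
reference_scheme = [
    ([1, 2, 3, 10], [4, 5, 6, 11]),
    ([1, 2, 3, 11], [7, 8, 9, 10]),
    ([1, 4, 7, 10], [2, 5, 8, 12]),
]

print(check_fixed_scheme(gpt5_scheme)[:3])
print(check_fixed_scheme(reference_scheme))  # []
```

Under these assumptions the first weighing puts 7 balls against 1, so the left pan is always heavier and that weighing carries no information. The table does not match the stated weighings either: a heavy ball 2 would give L L R, which the table assigns to ball 4.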
Gemini 2.5 pro’s Answer
Of course. The charm of this classic problem lies in the variety of its solutions. These solutions fundamentally follow the same information theory principles, but the specific operations can be very different.
We can divide “different solutions” into two main categories:
- “Variant” solutions that share the same core strategy but have different specific operations.
- “Structurally” different solutions that use a completely different opening move.
1. “Variant” Solutions with the Same Core Strategy
The two methods we discussed earlier both start with a “4v4” setup and, after the first weighing is unbalanced (e.g., Group A > Group B), they use different grouping methods. This “second weighing” combination is actually very flexible. As long as you follow the principles below, you can create many more valid solutions:
Core Principle: You must mix “suspect heavy” balls (Group A: 1,2,3,4) and “suspect light” balls (Group B: 5,6,7,8) and combine them with known standard balls (Group C: 9,10,11,12) in a way that regardless of the weighing outcome (balanced, left-heavy, right-heavy), the remaining possibilities are narrowed down to a range that can be solved in the third weighing.
Here’s another new variant for a “4v4 start, A > B” scenario:
- Second Weighing:
  - Left pan: 1, 5 (1 suspect heavy, 1 suspect light)
  - Right pan: 2, 6 (1 suspect heavy, 1 suspect light)
  - Off the scale: 3,4 (suspect heavy) and 7,8 (suspect light)
- Result Analysis:
  - If balanced: The problem ball must be among the off-scale balls 3, 4, 7, 8. Either 3 or 4 is heavy, or 7 or 8 is light.
    - Third Weighing: Weigh 3, 7 vs. 4, 9 (a standard ball).
      - If balanced, then 8 is the light ball.
      - If left is heavy, then 3 is the heavy ball.
      - If right is heavy, then 4 is the heavy ball or 7 is the light ball. Weighing 4 against a standard ball can then distinguish between them.
  - If the left pan is heavy (1,5 > 2,6): The possibilities are: 1 is the heavy ball, or 6 is the light ball.
    - Third Weighing: Weigh 1 vs. a standard ball. If it’s heavy, then 1 is the heavy ball; if it’s balanced, then 6 is the light ball.
  - If the right pan is heavy (1,5 < 2,6): The possibilities are: 2 is the heavy ball, or 5 is the light ball.
    - Third Weighing: Weigh 2 vs. a standard ball. If it’s heavy, then 2 is the heavy ball; if it’s balanced, then 5 is the light ball.
See, this is another completely valid solution. As long as you understand the principle of mixed grouping, you can design many similar variants.
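The balanced branch of this variant can be checked mechanically. This is a minimal sketch, not part of Gemini’s answer: it groups the remaining hypotheses by the result of the proposed third weighing. The helper name `split_by_outcome` is hypothetical, and the code assumes both pans hold the same number of balls, so only the position of the odd ball matters.

```python
from collections import defaultdict


def split_by_outcome(hypotheses, left, right):
    """Group (ball, kind) hypotheses by the result of weighing left vs. right.

    Assumes equal ball counts on both pans, so the scale tilts only because
    of where the single odd ball sits.
    """
    groups = defaultdict(list)
    for ball, kind in hypotheses:
        shift = 1 if kind == "heavy" else -1
        # +1: left pan goes down, -1: right pan goes down, 0: balanced
        tilt = shift * ((ball in left) - (ball in right))
        result = "L" if tilt > 0 else "R" if tilt < 0 else "="
        groups[result].append((ball, kind))
    return dict(groups)


# After 1,2,3,4 > 5,6,7,8 and a balanced 1,5 vs. 2,6, four hypotheses remain.
remaining = [(3, "heavy"), (4, "heavy"), (7, "light"), (8, "light")]

# Gemini's third weighing: 3, 7 vs. 4, 9 (ball 9 is a known standard ball).
third = split_by_outcome(remaining, left={3, 7}, right={4, 9})
print(third)
# {'L': [(3, 'heavy')], 'R': [(4, 'heavy'), (7, 'light')], '=': [(8, 'light')]}
```

The right-heavy group still holds two hypotheses, which is why the answer needs to weigh ball 4 against a standard ball afterwards, and that is a fourth weighing. With four hypotheses and only three possible results, no choice of third weighing can separate them, so the problem lies in the 1, 5 vs. 2, 6 second weighing itself.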
2. Structurally Different Solutions (Changing the opening)
The most famous other complete solution is the “3v3” start. This method has a completely different logical path from the “4v4” start, with its own difficulties.
- First Weighing:
  - Left pan: 1, 2, 3
  - Right pan: 4, 5, 6
  - Off the scale: 7, 8, 9, 10, 11, 12 (6 balls)
- Result Analysis:
  - Case One: Scale is balanced (1,2,3 = 4,5,6)
    - Analysis: This is the difficult part of the “3v3” start. The problem ball is among the 6 balls off the scale, 7-12, and we only have two weighings left.
    - Second Weighing: Place 7,8,9 on the left pan and 1,2,3 (known standard balls) on the right pan.
      - If balanced: This means the problem ball is in 10,11,12. The third weighing 10 vs. 11 can find it (same method as the balanced case in the “4v4” start).
      - If left is heavy: This means the problem ball is in 7,8,9 and is a heavy ball. The third weighing 7 vs. 8 will find the heavy one, and if they are balanced, 9 is the heavy ball.
      - If left is light: This means the problem ball is in 7,8,9 and is a light ball. The third weighing 7 vs. 8 will find the light one, and if they are balanced, 9 is the light ball.
  - Case Two: Scale is unbalanced (e.g., 1,2,3 > 4,5,6)
    - Analysis: This is the relatively simple part of the “3v3” start. The possibilities are: one of 1,2,3 is a heavy ball, or one of 4,5,6 is a light ball. A total of 6 suspect balls.
    - Second Weighing: Place 1,4 (1 suspect heavy, 1 suspect light) on the left pan, and 2,5 (1 suspect heavy, 1 suspect light) on the right pan. Balls 3,6 are off the scale.
      - If balanced: The problem ball is 3 or 6, off the scale. Either 3 is a heavy ball, or 6 is a light ball. The third weighing is 3 vs. a standard ball. If it’s heavy, 3 is the answer; if it’s balanced, 6 is the answer.
      - If left is heavy: The possibilities are 1 is a heavy ball, or 5 is a light ball. The third weighing is 1 vs. a standard ball. If it’s heavy, 1 is the answer; if it’s balanced, 5 is the answer.
      - If right is heavy: The possibilities are 2 is a heavy ball, or 4 is a light ball. The third weighing is 2 vs. a standard ball. If it’s heavy, 2 is the answer; if it’s balanced, 4 is the answer.
Summary
So, besides the two solutions we first discussed, at least two more exist:
- More variant solutions based on the “4v4” start, where the key is the grouping logic for the second weighing when it’s unbalanced.
- A complete, new solution based on the “3v3” start, which has a completely different logic and distribution of difficulties from the “4v4” start.
The commonality among all these solutions is that they all precisely use the possible outcomes from three weighings to perfectly cover the 24 possibilities that come from 12 balls and 2 states (heavy or light).
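The counting argument in that last sentence can also be turned against the “3v3” start. This is a minimal sketch, not part of Gemini’s answer, and the function name `enough_outcomes` is hypothetical. Each weighing has at most three results, so k remaining weighings can tell apart at most 3^k hypotheses. The check is a necessary condition only: passing it does not prove that a solution exists.

```python
def enough_outcomes(balls_unknown_direction, weighings_left):
    """Necessary (not sufficient) condition for finishing the puzzle.

    Each suspect ball whose direction is unknown contributes two hypotheses
    (heavy or light); k weighings distinguish at most 3 ** k hypotheses.
    """
    hypotheses = 2 * balls_unknown_direction
    return hypotheses <= 3 ** weighings_left


# Whole puzzle: 12 balls, 3 weighings -> 24 <= 27
print(enough_outcomes(12, 3))  # True

# 4v4 start, first weighing balanced: balls 9-12 remain, 2 weighings -> 8 <= 9
print(enough_outcomes(4, 2))   # True

# 3v3 start, first weighing balanced: balls 7-12 remain, 2 weighings -> 12 > 9
print(enough_outcomes(6, 2))   # False

# 3v3 start, then 7,8,9 vs. 1,2,3 balanced: balls 10-12 remain, 1 weighing -> 6 > 3
print(enough_outcomes(3, 1))   # False
```

So the balanced case of the “3v3” start cannot be completed in two more weighings, whatever they are. The proposed 10 vs. 11 weighing shows the gap directly: if the scale tilts, it cannot tell a heavy ball from a light one on the other pan, and if it balances, ball 12 is odd but its direction is still unknown.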
Conclusion
- I did not include Qwen3 MAX’s answer here. Its response was very long and, while it didn’t present a new solution, it elevated the problem to the level of information theory and proposed an encoding pre-assignment table method based on coding theory. Unlike GPT5 and Gemini 2.5 pro, it didn’t present a completely incorrect solution.
- I used the free web versions of both GPT5 and Gemini 2.5 pro, so I can’t judge whether this represents the level of the paid ChatGPT and Gemini versions. However, both gave completely wrong answers once they no longer had definite knowledge to fall back on. My conclusion is that the seemingly flawless logical reasoning we’ve seen before is not a generalized ability of the models themselves, but merely the result of rote memorization!