Solving easy problems the hard way

There’s a charming little brain teaser that’s going around the Interwebs. It’s got various forms, but they all look something like this:

This problem can be solved by pre-school children in 5-10 minutes, by programer – in 1 hour, by people with higher education … well, check it yourself! :)

8809=6
7111=0
2172=0
6666=4
1111=0
3213=0
7662=2
9313=1
0000=4
2222=0
3333=0
5555=0
8193=3
8096=5
7777=0
9999=4
7756=1
6855=3
9881=5
5531=0
2581=?

SPOILER ALERT…

The answer has to do with how many circles are in each number. So the number 8 has two circles in its shape so it counts as two. And 0 is one big circle, so it counts as 1. So 2581=2. Ok, that’s cute, it’s an alternative mapping of values with implied addition.

What bugged me was how might I solve this if the mapping of values was not based on shape. So how could I program a computer to solve this puzzle? I gave it a little thought and since I like to pretend I’m an econometrician, this looked a LOT like a series of equations that could be solved with an OLS regression. So how can I refactor the problem and data into a trivial OLS? I really need to convert each row of the training data into a frequency of occurrence chart. So instead of 8809=6 I need to refactor that into something like:

1,0,0,0,0,0,0,0,2,1 = 6

In this format the independent variables are the digits 0-9 and their value is the number of times they occur in each row of the training data. I couldn’t figure out how to do the freq table so, as is my custom, I created a concise simplification of the problem and put it on StackOverflow.com which  yielded a great solution. Once I had the frequency table built, it was simple a matter of a linear regression with 10 independent variables and a dependent with no intercept term.

My whole script, which you should be able to cut and paste into R, if you are so inclined, is the following:

## read in the training data
## more lines than it should be because of the https requirement in Github
temporaryFile <- tempfile()
download.file("https://raw.github.com/gist/2061284/44a4dc9b304249e7ab3add86bc245b6be64d2cdd/problem.csv",destfile=temporaryFile, method="curl")
series <- read.csv(temporaryFile)

## munge the data to create a frequency table
freqTable <- as.data.frame( t(apply(series[,1:4], 1, function(X) table(c(X, 0:9))-1)) )
names(freqTable) <- c("zero","one","two","three","four","five","six","seven","eight","nine")
freqTable$dep <- series[,5]

## now a simple OLS regression with no intercept
myModel <- lm(dep ~ 0 + zero + one + two + three + four + five + six + seven + eight + nine, data=freqTable)
round(myModel$coefficients)

Created by Pretty R at inside-R.org

The final result looks like this:

> round(myModel$coefficients)
zero   one   two three  four  five   six seven eight  nine
   1     0     0     0    NA     0     1     0     2     1
So we can see that zero, six, and nine all get mapped to 1 and eight gets mapped to 2. Everything else is zero. And four is NA because there were no fours in the training data.
There. I’m as smart as a preschooler. And I have code to prove it.

44 Comments

  1. Tal Galili says:

    Brilliant post JD!

    Thanks,
    Tal

  2. Mashkur says:

    Oh no……my intellect level is @ pre school level.

  3. Micah says:

    Did the preschoolers have complete information about the problem? It’s obviously meant to be confusing…

  4. Gok says:

    Thank for you this puzzel to make morning sick good to eat. It is liked to be in boat with numbers floating in eyez and for to have itchy penis.

  5. Shon says:

    Not sure I buy the assertion that a (typical) pre-schooler could solve this problem in 5-10 minutes. I’m sure that there are some that could

  6. Ken says:

    I doubt a second grader would figure out this series without a hint. (Hint Just like above sequence to value, but second graders can spell.)
    8809=7
    7111=8
    2172=6
    6666=0
    1111=8
    3213=7
    7662=3
    9313=7
    0000=8
    2222=4
    3333=8
    5555=4
    8193=7
    8096=5
    7777=8
    9999=4
    7756=5
    6855=4
    9881=7
    5531=6
    2581=6
    I didn’t use an algorithm, just the following function:
    CREATE FUNCTION textToVal(@Parm varchar(10))
    RETURNS VARCHAR(15)
    AS
    BEGIN
    DECLARE @Val int=0,@Res varchar(15)=@Parm + ‘=’, @len int = len(@Parm), @pos int = 1
    DECLARE @t TABLE(num char PRIMARY KEY, val int)
    INSERT INTO @t VALUES (0,2),(1,2),(2,1),(3,2),(4,1),(5,1),(6,0),(7,2),(8,2),(9,1)
    WHILE @pos <= @len
    BEGIN
    SELECT @Val=@Val+val, @pos=@pos+1 FROM @t WHERE num=SUBSTRING(@Parm,@pos,1)
    IF @@ROWCOUNT = 0 SET @pos=@pos+1
    END
    SET @Res=@Res + CAST(@Val as varchar(5))
    RETURN @Res
    END
    GO

  7. Ken says:

    I take it back, it is an algorithm, just not using regression logic.

  8. Andrew Kennedy says:

    Almost as smart as a preschooler. They would have been able to infer the value for 4.

  9. Arun says:

    Thats a hard problem solved easily

  10. Chris says:

    Very nice. Thank you for posting. As I try to wrap my mind around the nested function within the freqTable assignment, I’m intrigued by how apply applies the function table(c(X, 0:9)))-1) to do what it does. I understand table, and I think I understand c(), but how the constituent pieces come together–with the help of apply with the help of -1–to produce the final product is sheer data munging magic! I’d love it if someone could spoon feed me the hows and whys of this coding sorcery.

  11. Brendan says:

    I must be 5, I figured it was the circles almost right away when you said kids figure it out quickly. :)

  12. John S says:

    Brilliant problem, and technique for solving it. Preschoolers know shapes, and I guess as adults we know shapes as well, but only developers are going to know how to solve it with a lookup table, which is essentially what the script is.
    Thanks.

  13. [...] The hard way to solve the 8809=6 puzzle (with regression!); I solved it here. (I actually thought about using regression. But I did not.) [...]

  14. John Smith says:

    The real answer! Just count circles from digits. It`s realy pre-school exercise :)

  15. etrain says:

    I love the regression summary – p-values exactly where you want them.

    And the tabulate trick is a keeper.

    > summary(myModel)

    Call:
    lm(formula = dep ~ 0 + zero + one + two + three + four + five +
    six + seven + eight + nine, data = freqTable)

    Residuals:
    Min 1Q Median 3Q Max
    -2.927e-15 -1.213e-16 2.759e-17 2.363e-16 1.578e-15

    Coefficients: (1 not defined because of singularities)
    Estimate Std. Error t value Pr(>|t|)
    zero 1.000e+00 2.578e-16 3.878e+15 <2e-16 ***
    one 1.054e-16 2.023e-16 5.210e-01 0.613
    two 2.320e-17 2.319e-16 1.000e-01 0.922
    three 2.475e-17 2.166e-16 1.140e-01 0.911
    four NA NA NA NA
    five 1.116e-16 2.181e-16 5.110e-01 0.619
    six 1.000e+00 2.300e-16 4.348e+15 <2e-16 ***
    seven 0.000e+00 2.308e-16 0.000e+00 1.000
    eight 2.000e+00 3.650e-16 5.480e+15 <2e-16 ***
    nine 1.000e+00 2.559e-16 3.908e+15 <2e-16 ***

    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 1.067e-15 on 11 degrees of freedom
    Multiple R-squared: 1, Adjusted R-squared: 1
    F-statistic: 1.541e+31 on 9 and 11 DF, p-value: < 2.2e-16

  16. isomorphisms says:

    I was actually thinking something similar; since I recently read the words “ℓ₀ norm” which is equivalent to your occurrence count.

    I would have started looking for constants with constants c₀..c₉ — but also in OLS regressions we know we’re supposed to throw in a few squared terms and interaction terms! The dimensions would overwhelm the data and my regression would get nowhere.

    The funny thing is, that viewpoint is actually just a more edu-ma-fied version of problems I had with these “find the pattern” puzzles as a kid. Why am I supposed to assume that the pattern takes a sequential (simple) counting form? Maybe it’s something to do with the relationship between the first digit and the third digit, or every other time it’s the fourth digit rather than those two. Maybe some parts of the puzzle are thrown in as noise. I used to read the “secret code” column in Boys’ Life and the deceptive codes I would come up with would always use convoluted schemes like that — so why wouldn’t the puzzle maker?

  17. isomorphisms says:

    So I ran your myModel and was surprised to find the errors are super low! Every single ∫F < 2e−16 — that's three stars, which are very valuable in preschool.


    anova(myModel)
    Analysis of Variance Table

    Response: dep
    Df Sum Sq Mean Sq F value Pr(>F)
    zero 1 40.500 40.500 5.3263e+31 < 2.2e-16 ***
    one 1 2.613 2.613 3.4363e+30 < 2.2e-16 ***
    two 1 0.059 0.059 7.7220e+28 < 2.2e-16 ***
    three 1 0.410 0.410 5.3886e+29 < 2.2e-16 ***
    five 1 1.563 1.563 2.0549e+30 < 2.2e-16 ***
    six 1 31.727 31.727 4.1725e+31 < 2.2e-16 ***
    seven 1 0.158 0.158 2.0717e+29 < 2.2e-16 ***
    eight 1 63.581 63.581 8.3617e+31 < 2.2e-16 ***
    nine 1 17.391 17.391 2.2871e+31 < 2.2e-16 ***
    Residuals 11 0.000 0.000
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
    Warning message:
    In anova.lm(myModel) :
    ANOVA F-tests on an essentially perfect fit are unreliable

    This is like a neckbearded redditor talking about Victoria’s Secret models: “They’re essentially perfect. There probably are some flaws so I’m always reluctant to give out a perfect 10, but this is as close as real life ever comes to perfection.”

    With n=20 that’s very surprising; is it because the data are all perfect, or because the form of the model already contained the answer?

    First let’s see how well the model does with only half the data (randomly selected, of course!):

    #sample.integer(out of total, how many, replace=FALSE)
    > freqTable[sample.int(20,10),]
    zero one two three four five six seven eight nine dep
    15 0 0 0 0 0 0 0 4 0 0 0
    4 0 0 0 0 0 0 4 0 0 0 4
    2 0 3 0 0 0 0 0 1 0 0 0
    14 1 0 0 0 0 0 1 0 1 1 5
    10 0 0 4 0 0 0 0 0 0 0 0
    3 0 1 2 0 0 0 0 1 0 0 0
    16 0 0 0 0 0 0 0 0 0 4 4
    1 1 0 0 0 0 0 0 0 2 1 6
    6 0 1 1 2 0 0 0 0 0 0 0
    20 0 1 0 1 0 2 0 0 0 0 0

    > c.mast$coef
    zero one two three four
    1.000000e+00 1.252795e-16 -7.056736e-16 4.476346e-17 NA
    five six seven eight nine
    5.074747e-17 NA 1.364642e-15 2.000000e+00 1.000000e+00

    I didn’t round so you can see that the rounding isn’t clipping a .3 estimate down to zero: it’s truncating a trillionth of a billionth of a wrong.

    And yet again the ANOVA table spits out three stars on every one.

    This was too amazing so I tried to really cut the hamstrings off of the OLS: you five observations, basically boiled cabbage with a dollop of ketchup. This seemed to distract the OLS hammer enough to make it hit the wrong mark, so I ran it a few times:


    > gimpy round(gimpy$coef)
    zero one two three four five six seven eight nine
    NA 3 -3 0 NA -5 NA 3 NA NA ound(gimpy$coef)

    zero one two three four five six seven eight nine
    NA 0 0 NA NA NA NA 0 NA 1

    zero one two three four five six seven eight nine
    NA 0 0 0 NA NA NA 0 2 NA

    zero one two three four five six seven eight nine
    NA -1 -1 1 NA NA 1 3 NA NA

    zero one two three four five six seven eight nine
    6 NA 4 NA NA 2 -1 0 NA NA

    zero one two three four five six seven eight nine
    NA 0 0 0 NA 0 1 NA NA NA

    zero one two three four five six seven eight nine
    5 0 2 3 NA 1 NA NA NA NA
    > anova(gimpy)
    Analysis of Variance Table

    Response: dep
    Df Sum Sq Mean Sq F value Pr(>F)
    zero 1 25.0000 25.0000
    one 1 0.5294 0.5294
    two 1 4.0000 4.0000
    three 1 8.4706 8.4706
    five 1 9.0000 9.0000
    Residuals 0 0.0000
    Warning message:
    In anova.lm(gimpy) :
    ANOVA F-tests on an essentially perfect fit are unreliable

    It so happens that my only hammer looks slightly more “M”-shaped than yours, and that’s what finally led me to the resolution. Only when I tried MASS::rlm(dep ~ 0 + one + two + three + four + five + six + seven + eight + nine, data=freqTable) that I finally figured out why these estimates are so perfect. It threw an error saying “singular matrix” so I looked back at freqTable and looked back at lm. Here’s what I found:


    ?lm

    singular.ok: logical. If ‘FALSE’ (the default in S but not in R) a
    singular fit is an error.

    Yup — take a look again at the

    > freqTable
    zero one two three four five six seven eight nine dep
    1 1 0 0 0 0 0 0 0 2 1 6
    2 0 3 0 0 0 0 0 1 0 0 0
    3 0 1 2 0 0 0 0 1 0 0 0
    4 0 0 0 0 0 0 4 0 0 0 4
    5 0 4 0 0 0 0 0 0 0 0 0
    6 0 1 1 2 0 0 0 0 0 0 0
    7 0 0 1 0 0 0 2 1 0 0 2
    8 0 1 0 2 0 0 0 0 0 1 1
    9 4 0 0 0 0 0 0 0 0 0 4
    10 0 0 4 0 0 0 0 0 0 0 0
    11 0 0 0 4 0 0 0 0 0 0 0
    12 0 0 0 0 0 4 0 0 0 0 0
    13 0 1 0 1 0 0 0 0 1 1 3
    14 1 0 0 0 0 0 1 0 1 1 5
    15 0 0 0 0 0 0 0 4 0 0 0
    16 0 0 0 0 0 0 0 0 0 4 4
    17 0 0 0 0 0 1 1 2 0 0 1
    18 0 0 0 0 0 2 1 0 1 0 3
    19 0 1 0 0 0 0 0 0 2 1 5
    20 0 1 0 1 0 2 0 0 0 0 0

    Yup — dep is a linear combination of the other columns (with the constants c₀..c₉ being the known preschool kids’ solution). It was never about what OLS was finding: it was about what error-handling due to singular.ok=TRUE was finding.

    Maybe that’s my takeaway: lm–the tool you use to find out when your left-hand side is a linear combination of the right-hand side.

  18. Vijayan Padmanabhan says:

    ###Just a thought_Using another logic.. why not this way?
    df<-as.character(c(8809,7111,2172,6666,1111,25816))
    l<-data.frame()

    for (i in 1:length(df))
    {
    x<-noquote(df[i])
    for( j in 1:nchar(x))
    {
    yl[i,j]
    }
    }

    for(i in 1:dim(l)[2])
    {
    l[,i]<-as.numeric(l[,i])
    }

    library(Deducer)
    l[1:dim(l)[2]] ’0′;c(8) -> ’2′;c(6,9) -> ’1′;”)

    for(i in 1:dim(l)[2])
    {
    l[,i]<-as.numeric(l[,i])
    }

    l$rowsum<-rowSums(l, na.rm = TRUE)
    l

  19. Carl Witthoft says:

    Just thinking: the numeral “4″ is a troublemaker. Does it count because there’s enclosed white space, or does it *not* count because the enclosed space is bounded by a finite-sided polygon (i.e. not a circle or ellipse)?

  20. [...] a recent blog post by CMastication, a little meme puzzle is presented with the introduction that a preschooler could [...]

  21. Andrew says:

    @KEN

    I give up. To what are you referring with your similar “spelling” problem? Are you referring to the number of vowels? If you are, I think you should review how to spell fOUr, fIvE, sIx, and nInE. If not, please tell me.

  22. G says:

    I have worked this out in a completly different way….. And was shocked when my friend worked it out by the circles. I cracked the puzzle by the answers alone without the circles…….. Take a close look at the answers…
    The first answer is 6……. The second is 4 .. 6-4 =2
    1-4 turn it around 4-1= 3 5-1=4 4-1=3 3-5 turn it around 5-3=2
    that how I got the answer 2

  23. Just count the number of dividers that is not 1 or the number itself (so prime numbers excluded), and add the number of 0′s.

    8809 = 6
    8 dividable by 2 and 4, so 2 dividers
    8 dividable by 2 and 4, so 2 dividers
    9 dividable by 3, so 1 divider
    1 zero

    2581=?
    8 dividable by 2 and 4, so the solution is 2

    This works, not the official explanation, but works for all the numbers.

  24. Just count the number of dividers that is not 1 or the number itself (so prime numbers excluded), and add the number of 0′s.

    8809 = 6
    8 dividable by 2 and 4, so 2 dividers
    8 dividable by 2 and 4, so 2 dividers
    9 dividable by 3, so 1 divider
    1 zero

    2581=?
    8 dividable by 2 and 4, so the solution is 2

    This works, not the official explanation, but works for all the numbers !

  25. Ken says:

    @Andrew, yes I was referring to spelling.
    VALUES (0,2),(1,2),(2,1),(3,2),(4,1),(5,1),(6,0),(7,2),(8,2),(9,1)
    zero zr=0,eo=2 one n=0, oe=2, two tw=0, o=1 three thr=0, ee=2
    four fur=0, o=1, five fiv=0, e=1 six=0 seven svn=0, ee=2
    eight iht=0, eg=2 nine nin=0, e=1. It’s corallary sequence would be
    (0,2),(1,1),(2,2),(3,3),(4,3),(5,3),(6,3),(7,3),(8,3),(9,3) (Hint, I just GAVE you the corallary sequence!)

    Unless the number starts a sentence or if you are trying to make a point, numbers are spelled in lower case, not upper. Of course using upper case does produce a different set of values. (R=1, E=0) :)

    Additional hint, the original problem deals with the SHAPE, not the VALUE of the number. Like the value, consonants and vowels has nothing to do with it.

    Sorry about the slow reply, I don’t lurk on older web sites and I just got three emails TODAY!!! Wow, talk about snail mail!

  26. Brian says:

    I actually think the “pre-schooler” information is a helpful hint. It demonstrated to me that the solution had to be something in the digits themselves and not in the values of the numbers. Made it pretty easy to spot after a few minutes perusal.

    Or maybe I’m just a big 4 year old!

  27. Alix says:

    My preschooler got it in 3 seconds. I tried to have him explain how he got it. So confused. But I’m thinking that the question mark looks like a 2 so it must be 2.

  28. Ramy says:

    To know the answer to (2581=?), you only need to know the values of 2, 5, 8 and 1, so:
    (A)
    If (2222 = 0) then the value of 2 = 0
    If (5555 = 0) then the value of 5 = 0
    If (1111 = 0) then the value of 1 = 0

    And If we take any number that contains 8 as (8809 = 6) and as long as
    (0000 = 4) then the the value 0 = 1
    (9999 = 4) then the the value 9 = 1
    so:
    (B)
    8+8+0+9 = (value of 8) + (value of 8) + 1 + 1 = 6 then value of 8 = 2
    from (A) and (B) then (2581=2)

  29. Kyle Dailey says:

    You are all wrong.

    The real answer is 2581.

    Nowhere did it ask any other question than, “2581=”.

    Nowhere in the text does it tell you to find a pattern and apply it to the only unanswered problem.

    Yes, in the past, there have been questions that ask you to find a pattern, but not in this one.

  30. Dragnucs says:

    Why all this complicated algorithms ?
    Why not a simple one like this :

    solution = 0
    equationSTR = str(equation)
    for number in equation:
    if number == 0 or number == 6:
    solution += 1
    elif number == 8:
    solution += 2
    #…

    print solution

  31. JD Long says:

    Dragnucs, you didn’t present an algo for solving the problem. You presented an algo for printing out a pre-calculated solution. That’s a considerably simpler problem.

  32. Dragnucs says:

    Yes, but we have a solution, right ?
    We can’t even say it’s an algo, its a simple procedure.

  33. Ken says:

    @Dragnucs, the problem with your solution is that you were handed the algorithm to begin with, the author took the data and recursively came up with the algorithm. On the other hand, the author made an assumption in order to come up with the algorithm. I doubt the following could be solved by a 3 year old and it certainly can’t be solved using the assumptions the author originally made. It would be pretty interesting could find the “hard way” to solve this:
    8809=6
    7111=9
    2172=6
    6666=5
    1111=6
    3213=6
    7662=10
    9313=5
    0000=0
    2222=6
    3333=5
    5555=8
    8193=2
    8096=7
    7777=7
    9999=8
    7756=7
    6855=8
    9881=7
    5531=8
    2581=?
    Hint, it is definitly a mathematical algorithnm that gets these solutions. The algorithm is simpler than your solution for the author’s problem.

  34. t says:

    You just count how many circles like 0=1 8=2 6=1

  35. [...] alert: my source gives the solution.  Thanks to CM for posting this puzzle. Share this:TwitterFacebookLike [...]

  36. [...] alert: my source gives the solution.  (Thanks to CM for posting this puzzle!)  Solving it the way he did [...]

  37. GS test demo says:

    Cerebral Mastication » Blog Archive » Solving easy problems the hard way

  38. Assaf says:

    I’d expect a pre-schooler to just google things, since that’s how they were brought up in this world. I therefore immediately googled the answer and managed to beat both programmers and people with higher education, despite belonging to both groups :).

  39. Ross says:

    @ Christophe Meert:

    Your solution is wrong – it fails on 6666 = 4

    6 is divisible by 3 and 2, thus it would be 6666 = 8 for your theory.

  40. Christophe Meert says:

    Yes I know, add number of zeros and substract number of 6 and it works.

  41. Christophe Meert says:

    Or just count the damn circles ;-)

  42. m vargas says:

    my way to solve it was econometrics, was assuming there’s a linear function so
    8809=6 2f(8) + f(0) + f(9) = 6
    and therefore the same for every equation so I just need a few of them to solve the matrix system

    my solution is here: https://dl.dropboxusercontent.com/u/12152177/ejercicio6.jpg

  43. JD Long says:

    M.Vargas, you are correct. If you look in my example I solved it using the lm() function in R which is the linear model function. I believe we ended up in the same place.

  44. [...] JD Long There’s a charming little brain teaser that’s going around the Interwebs. It’s [...]

Leave a Reply