## Solving easy problems the hard way

There’s a charming little brain teaser that’s going around the Interwebs. It’s got various forms, but they all look something like this:

This problem can be solved by pre-school children in 5-10 minutes, by programer – in 1 hour, by people with higher education … well, check it yourself!

8809=6

7111=0

2172=0

6666=4

1111=0

3213=0

7662=2

9313=1

0000=4

2222=0

3333=0

5555=0

8193=3

8096=5

7777=0

9999=4

7756=1

6855=3

9881=5

5531=0

2581=?

SPOILER ALERT…

The answer has to do with how many circles are in each number. So the number 8 has two circles in its shape so it counts as two. And 0 is one big circle, so it counts as 1. So 2581=2. Ok, that’s cute, it’s an alternative mapping of values with implied addition.

What bugged me was how might I solve this if the mapping of values was not based on shape. So how could I program a computer to solve this puzzle? I gave it a little thought and since I like to pretend I’m an econometrician, this looked a LOT like a series of equations that could be solved with an OLS regression. So how can I refactor the problem and data into a trivial OLS? I really need to convert each row of the training data into a frequency of occurrence chart. So instead of 8809=6 I need to refactor that into something like:

1,0,0,0,0,0,0,0,2,1 = 6

In this format the independent variables are the digits 0-9 and their value is the number of times they occur in each row of the training data. I couldn’t figure out how to do the freq table so, as is my custom, I created a concise simplification of the problem and put it on StackOverflow.com which yielded a great solution. Once I had the frequency table built, it was simple a matter of a linear regression with 10 independent variables and a dependent with no intercept term.

My whole script, which you should be able to cut and paste into R, if you are so inclined, is the following:

## read in the training data ## more lines than it should be because of the https requirement in Github temporaryFile <- tempfile() download.file("https://raw.github.com/gist/2061284/44a4dc9b304249e7ab3add86bc245b6be64d2cdd/problem.csv",destfile=temporaryFile, method="curl") series <- read.csv(temporaryFile) ## munge the data to create a frequency table freqTable <- as.data.frame( t(apply(series[,1:4], 1, function(X) table(c(X, 0:9))-1)) ) names(freqTable) <- c("zero","one","two","three","four","five","six","seven","eight","nine") freqTable$dep <- series[,5] ## now a simple OLS regression with no intercept myModel <- lm(dep ~ 0 + zero + one + two + three + four + five + six + seven + eight + nine, data=freqTable) round(myModel$coefficients)

Created by Pretty R at inside-R.org

The final result looks like this:

> round(myModel$coefficients)zero one two three four five six seven eight nine1 0 0 0 NA 0 1 0 2 1

Brilliant post JD!

Thanks,

Tal

Oh no……my intellect level is @ pre school level.

Did the preschoolers have complete information about the problem? It’s obviously meant to be confusing…

Thank for you this puzzel to make morning sick good to eat. It is liked to be in boat with numbers floating in eyez and for to have itchy penis.

Not sure I buy the assertion that a (typical) pre-schooler could solve this problem in 5-10 minutes. I’m sure that there are some that could

I doubt a second grader would figure out this series without a hint. (Hint Just like above sequence to value, but second graders can spell.)

8809=7

7111=8

2172=6

6666=0

1111=8

3213=7

7662=3

9313=7

0000=8

2222=4

3333=8

5555=4

8193=7

8096=5

7777=8

9999=4

7756=5

6855=4

9881=7

5531=6

2581=6

I didn’t use an algorithm, just the following function:

CREATE FUNCTION textToVal(@Parm varchar(10))

RETURNS VARCHAR(15)

AS

BEGIN

DECLARE @Val int=0,@Res varchar(15)=@Parm + ‘=’, @len int = len(@Parm), @pos int = 1

DECLARE @t TABLE(num char PRIMARY KEY, val int)

INSERT INTO @t VALUES (0,2),(1,2),(2,1),(3,2),(4,1),(5,1),(6,0),(7,2),(8,2),(9,1)

WHILE @pos <= @len

BEGIN

SELECT @Val=@Val+val, @pos=@pos+1 FROM @t WHERE num=SUBSTRING(@Parm,@pos,1)

IF @@ROWCOUNT = 0 SET @pos=@pos+1

END

SET @Res=@Res + CAST(@Val as varchar(5))

RETURN @Res

END

GO

I take it back, it is an algorithm, just not using regression logic.

Almost as smart as a preschooler. They would have been able to infer the value for 4.

Thats a hard problem solved easily

Very nice. Thank you for posting. As I try to wrap my mind around the nested function within the freqTable assignment, I’m intrigued by how apply applies the function table(c(X, 0:9)))-1) to do what it does. I understand table, and I think I understand c(), but how the constituent pieces come together–with the help of apply with the help of -1–to produce the final product is sheer data munging magic! I’d love it if someone could spoon feed me the hows and whys of this coding sorcery.

I must be 5, I figured it was the circles almost right away when you said kids figure it out quickly.

Brilliant problem, and technique for solving it. Preschoolers know shapes, and I guess as adults we know shapes as well, but only developers are going to know how to solve it with a lookup table, which is essentially what the script is.

Thanks.

[...] The hard way to solve the 8809=6 puzzle (with regression!); I solved it here. (I actually thought about using regression. But I did not.) [...]

The real answer! Just count circles from digits. It`s realy pre-school exercise

I love the regression summary – p-values exactly where you want them.

And the tabulate trick is a keeper.

> summary(myModel)

Call:

lm(formula = dep ~ 0 + zero + one + two + three + four + five +

six + seven + eight + nine, data = freqTable)

Residuals:

Min 1Q Median 3Q Max

-2.927e-15 -1.213e-16 2.759e-17 2.363e-16 1.578e-15

Coefficients: (1 not defined because of singularities)

Estimate Std. Error t value Pr(>|t|)

zero 1.000e+00 2.578e-16 3.878e+15 <2e-16 ***

one 1.054e-16 2.023e-16 5.210e-01 0.613

two 2.320e-17 2.319e-16 1.000e-01 0.922

three 2.475e-17 2.166e-16 1.140e-01 0.911

four NA NA NA NA

five 1.116e-16 2.181e-16 5.110e-01 0.619

six 1.000e+00 2.300e-16 4.348e+15 <2e-16 ***

seven 0.000e+00 2.308e-16 0.000e+00 1.000

eight 2.000e+00 3.650e-16 5.480e+15 <2e-16 ***

nine 1.000e+00 2.559e-16 3.908e+15 <2e-16 ***

—

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.067e-15 on 11 degrees of freedom

Multiple R-squared: 1, Adjusted R-squared: 1

F-statistic: 1.541e+31 on 9 and 11 DF, p-value: < 2.2e-16

I was actually thinking something similar; since I recently read the words “ℓ₀ norm” which is equivalent to your occurrence count.

I would have started looking for constants with constants c₀..c₉ — but also in OLS regressions we know we’re supposed to throw in a few squared terms and interaction terms! The dimensions would overwhelm the data and my regression would get nowhere.

The funny thing is, that viewpoint is actually just a more edu-ma-fied version of problems I had with these “find the pattern” puzzles as a kid. Why am I supposed to assume that the pattern takes a sequential (simple) counting form? Maybe it’s something to do with the relationship between the first digit and the third digit, or every other time it’s the fourth digit rather than those two. Maybe some parts of the puzzle are thrown in as noise. I used to read the “secret code” column in

Boys’ Lifeand the deceptive codes I would come up with would always use convoluted schemes like that — so why wouldn’t the puzzle maker?So I ran your

`myModel`

and was surprised to find the errors aresuper low! Every single ∫F < 2e−16 — that's three stars, which are very valuable in preschool.anova(myModel)

Analysis of Variance Table

`Response: dep`

Df Sum Sq Mean Sq F value Pr(>F)

zero 1 40.500 40.500 5.3263e+31 < 2.2e-16 ***

one 1 2.613 2.613 3.4363e+30 < 2.2e-16 ***

two 1 0.059 0.059 7.7220e+28 < 2.2e-16 ***

three 1 0.410 0.410 5.3886e+29 < 2.2e-16 ***

five 1 1.563 1.563 2.0549e+30 < 2.2e-16 ***

six 1 31.727 31.727 4.1725e+31 < 2.2e-16 ***

seven 1 0.158 0.158 2.0717e+29 < 2.2e-16 ***

eight 1 63.581 63.581 8.3617e+31 < 2.2e-16 ***

nine 1 17.391 17.391 2.2871e+31 < 2.2e-16 ***

Residuals 11 0.000 0.000

---

Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Warning message:

In anova.lm(myModel) :

ANOVA F-tests on

an essentially perfect fitare unreliableThis is like a neckbearded redditor talking about Victoria’s Secret models: “They’re

essentiallyperfect. There probably are some flaws so I’m always reluctant to give out a perfect 10, but this is as close as real life ever comes to perfection.”With

`n=2`

0 that’s very surprising; is it because the data are all perfect, or because the form of the model already contained the answer?First let’s see how well the model does with only half the data (randomly selected, of course!):

#sample.integer(out of total, how many, replace=FALSE)

> freqTable[sample.int(20,10),]

zero one two three four five six seven eight nine dep

15 0 0 0 0 0 0 0 4 0 0 0

4 0 0 0 0 0 0 4 0 0 0 4

2 0 3 0 0 0 0 0 1 0 0 0

14 1 0 0 0 0 0 1 0 1 1 5

10 0 0 4 0 0 0 0 0 0 0 0

3 0 1 2 0 0 0 0 1 0 0 0

16 0 0 0 0 0 0 0 0 0 4 4

1 1 0 0 0 0 0 0 0 2 1 6

6 0 1 1 2 0 0 0 0 0 0 0

20 0 1 0 1 0 2 0 0 0 0 0

`> c.mast$coef`

zero one two three four

1.000000e+00 1.252795e-16 -7.056736e-16 4.476346e-17 NA

five six seven eight nine

5.074747e-17 NA 1.364642e-15 2.000000e+00 1.000000e+00

I didn’t round so you can see that the rounding isn’t clipping a

`.3`

estimate down to zero: it’s truncating a trillionth of a billionth of a wrong.And yet again the ANOVA table spits out three stars on every one.

This was too amazing so I tried to really cut the hamstrings off of the OLS: you

fiveobservations, basically boiled cabbage with a dollop of ketchup. This seemed to distract the OLS hammer enough to make it hit the wrong mark, so I ran it a few times:> gimpy round(gimpy$coef)

zero one two three four five six seven eight nine

NA 3 -3 0 NA -5 NA 3 NA NA ound(gimpy$coef)

zero one two three four five six seven eight nine

NA 0 0 NA NA NA NA 0 NA 1

zero one two three four five six seven eight nine

NA 0 0 0 NA NA NA 0 2 NA

zero one two three four five six seven eight nine

NA -1 -1 1 NA NA 1 3 NA NA

zero one two three four five six seven eight nine

6 NA 4 NA NA 2 -1 0 NA NA

zero one two three four five six seven eight nine

NA 0 0 0 NA 0 1 NA NA NA

zero one two three four five six seven eight nine

5 0 2 3 NA 1 NA NA NA NA

> anova(gimpy)

Analysis of Variance Table

`Response: dep`

Df Sum Sq Mean Sq F value Pr(>F)

zero 1 25.0000 25.0000

one 1 0.5294 0.5294

two 1 4.0000 4.0000

three 1 8.4706 8.4706

five 1 9.0000 9.0000

Residuals 0 0.0000

Warning message:

In anova.lm(gimpy) :

ANOVA F-tests on an essentially perfect fit are unreliable

It so happens that my only hammer looks slightly more “M”-shaped than yours, and that’s what finally led me to the resolution. Only when I tried

`MASS::rlm(dep ~ 0 + one + two + three + four + five + six + seven + eight + nine, data=freqTable)`

that I finally figured out why these estimates are so perfect. It threw an error saying “singular matrix” so I looked back at`freqTable`

and looked back at`lm`

. Here’s what I found:?lm

`singular.ok: logical. If ‘FALSE’ (the default in S but not in R) a`

singular fit is an error.

Yup — take a look again at the

> freqTable

zero one two three four five six seven eight nine dep

1 1 0 0 0 0 0 0 0 2 1 6

2 0 3 0 0 0 0 0 1 0 0 0

3 0 1 2 0 0 0 0 1 0 0 0

4 0 0 0 0 0 0 4 0 0 0 4

5 0 4 0 0 0 0 0 0 0 0 0

6 0 1 1 2 0 0 0 0 0 0 0

7 0 0 1 0 0 0 2 1 0 0 2

8 0 1 0 2 0 0 0 0 0 1 1

9 4 0 0 0 0 0 0 0 0 0 4

10 0 0 4 0 0 0 0 0 0 0 0

11 0 0 0 4 0 0 0 0 0 0 0

12 0 0 0 0 0 4 0 0 0 0 0

13 0 1 0 1 0 0 0 0 1 1 3

14 1 0 0 0 0 0 1 0 1 1 5

15 0 0 0 0 0 0 0 4 0 0 0

16 0 0 0 0 0 0 0 0 0 4 4

17 0 0 0 0 0 1 1 2 0 0 1

18 0 0 0 0 0 2 1 0 1 0 3

19 0 1 0 0 0 0 0 0 2 1 5

20 0 1 0 1 0 2 0 0 0 0 0

Yup —

`dep`

is a linear combination of the other columns (with the constants c₀..c₉ being the known preschool kids’ solution). It was never about what OLS was finding: it was about what error-handling due to`singular.ok=TRUE`

was finding.Maybe that’s my takeaway:

`lm`

–the tool you use to find out when your left-hand side is a linear combination of the right-hand side.###Just a thought_Using another logic.. why not this way?

df<-as.character(c(8809,7111,2172,6666,1111,25816))

l<-data.frame()

for (i in 1:length(df))

{

x<-noquote(df[i])

for( j in 1:nchar(x))

{

yl[i,j]

}

}

for(i in 1:dim(l)[2])

{

l[,i]<-as.numeric(l[,i])

}

library(Deducer)

l[1:dim(l)[2]] ’0′;c(8) -> ’2′;c(6,9) -> ’1′;”)

for(i in 1:dim(l)[2])

{

l[,i]<-as.numeric(l[,i])

}

l$rowsum<-rowSums(l, na.rm = TRUE)

l

Just thinking: the numeral “4″ is a troublemaker. Does it count because there’s enclosed white space, or does it *not* count because the enclosed space is bounded by a finite-sided polygon (i.e. not a circle or ellipse)?

[...] a recent blog post by CMastication, a little meme puzzle is presented with the introduction that a preschooler could [...]

@KEN

I give up. To what are you referring with your similar “spelling” problem? Are you referring to the number of vowels? If you are, I think you should review how to spell fOUr, fIvE, sIx, and nInE. If not, please tell me.

I have worked this out in a completly different way….. And was shocked when my friend worked it out by the circles. I cracked the puzzle by the answers alone without the circles…….. Take a close look at the answers…

The first answer is 6……. The second is 4 .. 6-4 =2

1-4 turn it around 4-1= 3 5-1=4 4-1=3 3-5 turn it around 5-3=2

that how I got the answer 2

Just count the number of dividers that is not 1 or the number itself (so prime numbers excluded), and add the number of 0′s.

8809 = 6

8 dividable by 2 and 4, so 2 dividers

8 dividable by 2 and 4, so 2 dividers

9 dividable by 3, so 1 divider

1 zero

2581=?

8 dividable by 2 and 4, so the solution is 2

This works, not the official explanation, but works for all the numbers.

Just count the number of dividers that is not 1 or the number itself (so prime numbers excluded), and add the number of 0′s.

8809 = 6

8 dividable by 2 and 4, so 2 dividers

8 dividable by 2 and 4, so 2 dividers

9 dividable by 3, so 1 divider

1 zero

2581=?

8 dividable by 2 and 4, so the solution is 2

This works, not the official explanation, but works for all the numbers !

@Andrew, yes I was referring to spelling.

VALUES (0,2),(1,2),(2,1),(3,2),(4,1),(5,1),(6,0),(7,2),(8,2),(9,1)

zero zr=0,eo=2 one n=0, oe=2, two tw=0, o=1 three thr=0, ee=2

four fur=0, o=1, five fiv=0, e=1 six=0 seven svn=0, ee=2

eight iht=0, eg=2 nine nin=0, e=1. It’s corallary sequence would be

(0,2),(1,1),(2,2),(3,3),(4,3),(5,3),(6,3),(7,3),(8,3),(9,3) (Hint, I just GAVE you the corallary sequence!)

Unless the number starts a sentence or if you are trying to make a point, numbers are spelled in lower case, not upper. Of course using upper case does produce a different set of values. (R=1, E=0)

Additional hint, the original problem deals with the SHAPE, not the VALUE of the number. Like the value, consonants and vowels has nothing to do with it.

Sorry about the slow reply, I don’t lurk on older web sites and I just got three emails TODAY!!! Wow, talk about snail mail!

I actually think the “pre-schooler” information is a helpful hint. It demonstrated to me that the solution had to be something in the digits themselves and not in the values of the numbers. Made it pretty easy to spot after a few minutes perusal.

Or maybe I’m just a big 4 year old!

My preschooler got it in 3 seconds. I tried to have him explain how he got it. So confused. But I’m thinking that the question mark looks like a 2 so it must be 2.

To know the answer to (2581=?), you only need to know the values of 2, 5, 8 and 1, so:

(A)

If (2222 = 0) then the value of 2 = 0

If (5555 = 0) then the value of 5 = 0

If (1111 = 0) then the value of 1 = 0

And If we take any number that contains 8 as (8809 = 6) and as long as

(0000 = 4) then the the value 0 = 1

(9999 = 4) then the the value 9 = 1

so:

(B)

8+8+0+9 = (value of 8) + (value of 8) + 1 + 1 = 6 then value of 8 = 2

from (A) and (B) then (2581=2)

You are all wrong.

The real answer is 2581.

Nowhere did it ask any other question than, “2581=”.

Nowhere in the text does it tell you to find a pattern and apply it to the only unanswered problem.

Yes, in the past, there have been questions that ask you to find a pattern, but not in this one.

Why all this complicated algorithms ?

Why not a simple one like this :

solution = 0

equationSTR = str(equation)

for number in equation:

if number == 0 or number == 6:

solution += 1

elif number == 8:

solution += 2

#…

print solution

Dragnucs, you didn’t present an algo for solving the problem. You presented an algo for printing out a pre-calculated solution. That’s a considerably simpler problem.

Yes, but we have a solution, right ?

We can’t even say it’s an algo, its a simple procedure.

@Dragnucs, the problem with your solution is that you were handed the algorithm to begin with, the author took the data and recursively came up with the algorithm. On the other hand, the author made an assumption in order to come up with the algorithm. I doubt the following could be solved by a 3 year old and it certainly can’t be solved using the assumptions the author originally made. It would be pretty interesting could find the “hard way” to solve this:

8809=6

7111=9

2172=6

6666=5

1111=6

3213=6

7662=10

9313=5

0000=0

2222=6

3333=5

5555=8

8193=2

8096=7

7777=7

9999=8

7756=7

6855=8

9881=7

5531=8

2581=?

Hint, it is definitly a mathematical algorithnm that gets these solutions. The algorithm is simpler than your solution for the author’s problem.

You just count how many circles like 0=1 8=2 6=1

[...] alert: my source gives the solution. Thanks to CM for posting this puzzle. Share this:TwitterFacebookLike [...]

[...] alert: my source gives the solution. (Thanks to CM for posting this puzzle!) Solving it the way he did [...]

Cerebral Mastication » Blog Archive » Solving easy problems the hard way

I’d expect a pre-schooler to just google things, since that’s how they were brought up in this world. I therefore immediately googled the answer and managed to beat both programmers and people with higher education, despite belonging to both groups :).

@ Christophe Meert:

Your solution is wrong – it fails on 6666 = 4

6 is divisible by 3 and 2, thus it would be 6666 = 8 for your theory.

Yes I know, add number of zeros and substract number of 6 and it works.

Or just count the damn circles

my way to solve it was econometrics, was assuming there’s a linear function so

8809=6 2f(8) + f(0) + f(9) = 6

and therefore the same for every equation so I just need a few of them to solve the matrix system

my solution is here: https://dl.dropboxusercontent.com/u/12152177/ejercicio6.jpg

M.Vargas, you are correct. If you look in my example I solved it using the lm() function in R which is the linear model function. I believe we ended up in the same place.

[...] JD Long There’s a charming little brain teaser that’s going around the Interwebs. It’s [...]