motemen puzzle - sudofox's journal

I became curious about id:motemen's profile image today. Is it a puzzle?

f:id:austinburk:20180817095930p:plain

What if we interpret each color as an HTML color code?

#BCCFDE #FCFFF5 #15357C #9AA6B8
#754826 #FEDBCE #F6BFAE #D32925
#9AA2B8 #FDBCB0 #FDAA9A #DB6055
#A4AAAE #30363C #481E11 #4F5054

aburk@aburk:~/Research/motemen-puzzle$ file motemen.png 
motemen.png: PNG image data, 400 x 400, 8-bit/color RGBA, non-interlaced
aburk@aburk:~/Research/motemen-puzzle$ cat color_codes.txt 
#BCCFDE #FCFFF5 #15357C #9AA6B8
#754826 #FEDBCE #F6BFAE #D32925
#9AA2B8 #FDBCB0 #FDAA9A #DB6055
#A4AAAE #30363C #481E11 #4F5054

But which order is it in?

Left to right, then top to bottom

aburk@aburk:~/Research/motemen-puzzle$ cat color_codes.txt | tr -d "#" |tr -d " "|tr -d "\n"|xxd -r -p|xxd
00000000: bccf defc fff5 1535 7c9a a6b8 7548 26fe  .......5|...uH&.
00000010: dbce f6bf aed3 2925 9aa2 b8fd bcb0 fdaa  ......)%........
00000020: 9adb 6055 a4aa ae30 363c 481e 114f 5054  ..`U...06<H..OPT

That doesn't resemble anything particularly useful.

Top to bottom, then left to right

aburk@aburk:~/Research/motemen-puzzle$ for position in $(seq 1 4); do awk -v position="$position" '{print $position}' color_codes.txt; done|tr -d '#'|tr -d '\n'
BCCFDE7548269AA2B8A4AAAEFCFFF5FEDBCEFDBCB030363C15357CF6BFAEFDAA9A481E119AA6B8D32925DB60554F5054

Hm..

aburk@aburk:~/Research/motemen-puzzle$ for position in $(seq 1 4); do awk -v position="$position" '{print $position}' color_codes.txt; done|tr -d '#'|tr -d '\n'|xxd -r -p|xxd
00000000: bccf de75 4826 9aa2 b8a4 aaae fcff f5fe  ...uH&..........
00000010: dbce fdbc b030 363c 1535 7cf6 bfae fdaa  .....06<.5|.....
00000020: 9a48 1e11 9aa6 b8d3 2925 db60 554f 5054  .H......)%.`UOPT

Clearly I get the same sixteen 3-byte words either way, just in a different order. The color codes give us 16 x 3 = 48 bytes (384 bits), which is not a power of 2. Could it be a hash?

I looked up the hash functions that would produce a 48-byte hash.

The only hash function I could find that produces a 48-byte digest is SHA-384.
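
As a quick sanity check on the digest sizes, here is a minimal sketch that measures how many bytes each of the common coreutils checksum tools emits. The helper name digest_bytes is made up for this example, and it assumes the GNU coreutils tools listed in the loop are installed.

```shell
#!/bin/bash
# digest_bytes (hypothetical helper): print the digest size, in bytes,
# produced by the checksum tool named in $1.
digest_bytes() (
    # Hash empty input; the first field of the output is the hex digest.
    hex=$(printf '' | "$1" | awk '{print $1}')
    # Two hex characters encode one byte.
    echo $(( ${#hex} / 2 ))
)

# Assumes these GNU coreutils tools are available on the PATH.
for tool in md5sum sha1sum sha224sum sha256sum sha384sum sha512sum; do
    printf '%s\t%s bytes\n' "$tool" "$(digest_bytes "$tool")"
done
```

Of the tools in this list, only sha384sum reports 48 bytes, which matches the 96 hex characters we get from the 16 color codes.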

Let's write a quick script we can use to test it out:

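#!/usr/bin/env perl
# Sudofox - motemen puzzle tester
# Renders a hex string as a grid of colored HTML cells, one per 3 bytes.
use strict;
use warnings;

if (!@ARGV) {
    print "Usage: ./generate.pl <96 hex characters (48 bytes)>\n";
    exit();
}
my $hex        = $ARGV[0];
# Split the input into 6-character chunks, one HTML color code each.
my @colorCodes = unpack("(A6)*", $hex);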

print <<'HTML';
<style>
.motemen-cell { width: 50px; height: 50px; word-break:break-all; text-transform: uppercase; font-size:21px; text-align: center; font-family:monospace; line-height:initial;}
.motemen-holder { display: flex; flex-wrap:wrap; width: 200px;}
</style>

<div class="motemen-holder">
HTML
foreach my $color(@colorCodes) {
    print '<div class="motemen-cell" style="background-color: #' .$color. ';">'.$color.'</div>';
}
print "</div>\n";

First, the original image:

f:id:austinburk:20180817095930p:plain

How about "motemen"?

printf motemen|sha384sum|awk '{print $1}'|xargs ./generate.pl

a2acaa da0a3a 20c41e b28a08
9c510b 59280d d2b967 5e9454
dda0bf 4fc087 29cb61 960a7a
cec04c 637bde f1b95f a8e7e5

I also tried the following inputs:

Hatena
hatena
Hironao OTSUBO
(motemen's publicly-listed email address)
Kana: ひろなおオツボ

However, none of these produced a matching image.

My next thought was brute force. We can pull a bunch of webpages related to motemen (GitHub code, HTML, bios, et cetera) and extract every word from them as a token. We can then check whether the SHA-384 hash of each token contains any of the 3-byte HTML color codes as a substring.
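
A minimal sketch of that check, assuming the tokens arrive one per line on stdin. The helper names sha384_hex, count_color_hits, and report_hits are made up for this example; only sha384sum comes from the commands used elsewhere in this post.

```shell
#!/bin/bash
# The 16 color codes from the image, lowercased.
COLORS="bccfde fcfff5 15357c 9aa6b8 754826 fedbce f6bfae d32925
        9aa2b8 fdbcb0 fdaa9a db6055 a4aaae 30363c 481e11 4f5054"

# sha384_hex (hypothetical helper): print the SHA-384 hex digest of $1.
sha384_hex() (
    # printf avoids the trailing newline that echo would add to the input.
    out=$(printf '%s' "$1" | sha384sum)
    # sha384sum prints "<digest>  -"; keep only the digest.
    printf '%s\n' "${out%% *}"
)

# count_color_hits (hypothetical helper): print how many of the 16 color
# codes occur anywhere in the hex digest given as $1.
count_color_hits() (
    hits=0
    for color in $COLORS; do
        case $1 in
            *"$color"*) hits=$((hits + 1)) ;;
        esac
    done
    echo "$hits"
)

# report_hits (hypothetical helper): read tokens from stdin, one per line,
# and print "hits<TAB>token" for every token with at least one hit.
report_hits() (
    while IFS= read -r line; do
        hits=$(count_color_hits "$(sha384_hex "$line")")
        if [ "$hits" -gt 0 ]; then
            printf '%s\t%s\n' "$hits" "$line"
        fi
    done
)
```

Something like `report_hits < tokens.txt | sort -rn` (with tokens.txt as a placeholder file name) would then list the tokens with the most hits first. A real answer would have to score 16.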

Unfortunately, I couldn't find an effective way to extract words from HTML documents as tokens. I tried HTML::Extract, HTML::TreeBuilder, and HTML::TokeParser, without much success.

I had an idea, though! There's another source of keywords, right from Hatena!

Hatena Keyword Documentation

It comes in EUC-JP, but everyone uses UTF-8, come on..

First, there are some weird encoding issues we need to fix up, since the file seems a bit broken.

aburk@aburk:~$ cat keywordlist_furigana.csv|tr '\t' '\n' > a.csv; iconv a.csv -f EUC-JP -t UTF-8//IGNORE|sort|uniq > b.csv
aburk@aburk:~$ wc -l b.csv
337408 b.csv

Now, let's try each keyword against our two different ways of reading the color codes.

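#!/bin/bash
# Motemen Puzzle Tester
# Reads candidate words from stdin, one per line, and reports any word whose
# SHA-384 digest equals either reading of the color codes.

function test_word () {
    # COMBO1: left to right, then top to bottom
    local COMBO1="bccfdefcfff515357c9aa6b8754826fedbcef6bfaed329259aa2b8fdbcb0fdaa9adb6055a4aaae30363c481e114f5054"
    # COMBO2: top to bottom, then left to right
    local COMBO2="bccfde7548269aa2b8a4aaaefcfff5fedbcefdbcb030363c15357cf6bfaefdaa9a481e119aa6b8d32925db60554f5054"
    local INPUT=$1
    # Hash once; printf avoids the trailing newline that echo would append
    local HASH
    HASH=$(printf '%s' "$INPUT" | sha384sum | awk '{print $1}')

    if [[ $COMBO1 == "$HASH" || $COMBO2 == "$HASH" ]]; then
        echo "MATCH: $INPUT";
    fi
}

while IFS= read -r line
do
    test_word "$line";
done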

This turned out to be way too slow, so I converted the list of keywords into a list of hashes instead.
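
The exact command used for the conversion isn't shown here, so the following is only a sketch of one way to do it. The helper name hash_list is made up, and unlike the b_hashed.txt used in the greps, this version keeps the keyword next to its digest so that a hit can be traced back to its input.

```shell
#!/bin/bash
# hash_list (hypothetical helper): read one keyword per line from stdin and
# print "digest<TAB>keyword" for each.
hash_list() (
    while IFS= read -r line; do
        # printf avoids hashing a trailing newline along with the keyword.
        out=$(printf '%s' "$line" | sha384sum)
        # sha384sum prints "<digest>  -"; keep only the digest.
        printf '%s\t%s\n' "${out%% *}" "$line"
    done
)

# Usage, with the file names from this post:
#   hash_list < b.csv > b_hashed.txt
```

This still spawns one sha384sum process per keyword, so it is not fast, but the cost is paid only once and every later search is a plain grep over the output file.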

aburk@aburk:~/Research/motemen-puzzle$ grep bccfdefcfff515357c9aa6b8754826fedbcef6bfaed329259aa2b8fdbcb0fdaa9adb6055a4aaae30363c481e114f5054 ~/b_hashed.txt
aburk@aburk:~/Research/motemen-puzzle$ grep bccfde7548269aa2b8a4aaaefcfff5fedbcefdbcb030363c15357cf6bfaefdaa9a481e119aa6b8d32925db60554f5054 ~/b_hashed.txt
aburk@aburk:~/Research/motemen-puzzle$ grep bccfde ~/b_hashed.txt
5f5ec82739f9daba961803aab4d6f78382c0c5b4e519994e98bcd1c3589f31a95bccfde25fab41cffe92e730d080c743
9902899010abde2bcd5aea716c2ffa284639fbe22e3843f7a9f0d53bb1bbccfdeedf0b85c23685ba52f4debdce0f17b9
be1ebeb4a84d65f467e2e5545ab0991dfbccfdec7d5c80292210570a2f186cf1801c3cdb6a4478856011a162991e82fe
c637e83f6e129054270ce1ae97ab24a0260526a4fc57d60c87da7fb94bccfde319d07b4ac26aee1b2d68208b530497b0
aburk@aburk:~/Research/motemen-puzzle$ grep bccfde ~/b_hashed.txt --color
5f5ec82739f9daba961803aab4d6f78382c0c5b4e519994e98bcd1c3589f31a95bccfde25fab41cffe92e730d080c743
9902899010abde2bcd5aea716c2ffa284639fbe22e3843f7a9f0d53bb1bbccfdeedf0b85c23685ba52f4debdce0f17b9
be1ebeb4a84d65f467e2e5545ab0991dfbccfdec7d5c80292210570a2f186cf1801c3cdb6a4478856011a162991e82fe
c637e83f6e129054270ce1ae97ab24a0260526a4fc57d60c87da7fb94bccfde319d07b4ac26aee1b2d68208b530497b0
aburk@aburk:~/Research/motemen-puzzle$ grep bccfde ~/b_hashed.txt --color|grep fcfff5
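
One caveat with these greps: they match a color code at any offset in the hex string, whereas a cell in the image would have to start at a multiple of 6 hex digits. A stricter check, sketched below with a made-up helper named aligned_match, splits the digest into 6-character chunks first and only accepts a whole-chunk match.

```shell
#!/bin/bash
# aligned_match (hypothetical helper): succeed only if the color code in $2
# equals one of the 6-character cells of the hex digest in $1.
aligned_match() (
    # fold cuts the digest into 6-character lines; grep -x requires a match
    # on a whole line, and -F treats the color code as a fixed string.
    printf '%s\n' "$1" | fold -w 6 | grep -qxF -- "$2"
)

# Example: the first grep hit above contains bccfde, but not on a cell boundary.
hit=5f5ec82739f9daba961803aab4d6f78382c0c5b4e519994e98bcd1c3589f31a95bccfde25fab41cffe92e730d080c743
if aligned_match "$hit" bccfde; then
    echo "aligned"
else
    echo "not aligned"
fi
```

For that first hit the substring starts in the middle of a cell, so even a hash containing several of the color codes would not necessarily render as the right squares.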

Alas, I was once again unsuccessful. At this point, I'm going to guess that it's not a puzzle! But I did give it my best shot and learned some things along the way, so I can say that I'm satisfied :)