I lost the disk partition that held the tools for building the mozc-ut dictionary. For mozc-ut1 I sorted words by Yahoo and Google hit numbers, but I can't do that again because they no longer provide a free search API. So I wrote mozc-ut2 from scratch: I split Wikipedia's articles into 1 million files and got hit numbers with Hyper Estraier. mozc-ut2 adds over 500,000 words.
My big thanks go to the authors/maintainers.
Type "いんたーねっと" and press space ⇨ Internet
If you don't want to use it, run
and uncheck "Katakana to English conversion" in "Dictionary" tab.
Press Caps Lock, type "dolphin" and press Tab.
If you need a dictionary for humans, check this page.
I think we can redistribute hatena's yomigana-hyouki pairs, but I doubt we can redistribute niconico's pairs. If you want to make a redistributable mozc-ut, leave #NICODIC="true" commented out in generate-dictionary.sh.
See mozc's official Build Instructions. If you are using Arch Linux (tested on Antergos Linux), you can make and install packages as follows:
Get the latest Mozc.
Choose optional entries.
If you want to use an English-Japanese dictionary, uncomment the following line.
If you want to use a niconico dictionary, uncomment the following line.
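The option can also be switched on without opening an editor. This is a minimal sketch, assuming the line appears in generate-dictionary.sh exactly as #NICODIC="true" at the start of a line; the helper name enable_option is hypothetical and not part of mozcdic-ut2.

```shell
#!/bin/sh
# enable_option FILE NAME: uncomment a line of the form #NAME="true" in FILE.
# Hypothetical helper; assumes the option starts at column 1.
enable_option() {
    file=$1
    name=$2
    # Edit into a temporary file first so FILE is only overwritten on success.
    tmp=$(mktemp) || return 1
    sed "s/^#${name}=\"true\"/${name}=\"true\"/" "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (run in the directory that holds generate-dictionary.sh):
#   enable_option generate-dictionary.sh NICODIC
```

Writing back with cat keeps the script's original permissions, so it stays executable.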
You need 35 GB of disk space (use an SSD), and it will take 8 hours.
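Since the build runs for hours, it may be worth checking free space before starting. A small sketch using POSIX df; the function name has_free_space is hypothetical, and it counts 1 GB as 1024*1024 KB.

```shell
#!/bin/sh
# has_free_space DIR MIN_GB: succeed if the filesystem holding DIR
# has at least MIN_GB available (1 GB counted as 1024*1024 KB).
has_free_space() {
    dir=$1
    min_gb=$2
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte
    # blocks; the fourth column is the available space.
    avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Usage (run in the directory where you will build):
#   has_free_space . 35 || echo "Less than 35 GB free here" >&2
```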
This will download the latest edict/hatena/niconico/skk-jisyo files, and refresh hit numbers with the latest Japanese Wikipedia articles.
Install ruby and gcc-6.4.1.
estcmd built with gcc-7.2.0 caused a segfault. I emailed the author, but I couldn't get a reply.
Install QDBM and Hyper Estraier.
I use Hyper Estraier to get hit numbers.
wget http://fallabs.com/qdbm/qdbm-1.8.78.tar.gz
tar xf qdbm-1.8.78.tar.gz
cd qdbm-1.8.78/
./configure --prefix=/usr --enable-zlib
make -j4 CC=/usr/bin/gcc-6
sudo make install
wget http://fallabs.com/hyperestraier/hyperestraier-1.4.13.tar.gz
tar xf hyperestraier-1.4.13.tar.gz
cd hyperestraier-1.4.13/
./configure --prefix=/usr --enable-zlib
make -j4 CC=/usr/bin/gcc-6
sudo make install
cd ../..
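After installing, a quick check that the required commands are on PATH can save a failed run later. A minimal sketch; the function name missing_tools is hypothetical, while ruby and estcmd are the commands named in these instructions.

```shell
#!/bin/sh
# missing_tools NAME...: print each command that cannot be found on PATH.
# Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage:
#   missing_tools ruby estcmd
```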
Put mozcdic-ut2 into mozc-tmp.
mkdir -p mozc-tmp
mv mozcdic-ut2-date.tar.bz2 mozc-tmp/
cd mozc-tmp/
tar xf mozcdic-ut2-date.tar.bz2
mv alt-cannadic-110208.tar.bz2 mozcdic-ut2-date/alt-cannadic/
Change SEEDVER of mecab-user-dict-seed.
Check mecab-user-dict-seed.yyyymmdd.csv.xz and change SEEDVER in neologd/generate-dictionary.sh.
cd mozcdic-ut2-date/neologd/
leafpad generate-dictionary.sh
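The same change can be scripted instead of made in an editor. This is only a sketch: it assumes SEEDVER is assigned on its own line starting with SEEDVER= in generate-dictionary.sh, which I have not confirmed against the actual script, and the helper names seed_date and set_seedver are hypothetical.

```shell
#!/bin/sh
# seed_date FILENAME: print the yyyymmdd part of a name like
# mecab-user-dict-seed.yyyymmdd.csv.xz (prints nothing if it does not match).
seed_date() {
    printf '%s\n' "$1" |
        sed -n 's/^.*mecab-user-dict-seed\.\([0-9]\{8\}\)\.csv\.xz$/\1/p'
}

# set_seedver FILE DATE: rewrite the SEEDVER assignment in FILE.
# Assumption: FILE contains a line that starts with SEEDVER= at column 1.
set_seedver() {
    tmp=$(mktemp) || return 1
    sed "s/^SEEDVER=.*/SEEDVER=\"$2\"/" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Usage (the file name is a placeholder for the seed file you checked):
#   set_seedver generate-dictionary.sh "$(seed_date mecab-user-dict-seed.yyyymmdd.csv.xz)"
```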
Change MOZCVER and DICVER.
cd ../
leafpad generate-dictionary.sh
cd src/
leafpad generate-release.sh
Refresh hit numbers with the latest Japanese Wikipedia articles.