Installing Jakarta Lucene

To build a full-text search program with Lucene + JapaneseAnalyzer, you need the following:

■Apache Ant 
■Perl 
■sen 
■Lucene-ja 

/**
* Prerequisites
*/

//////////////////////
// 1.Ant
//////////////////////
	
$ which ant
/cygdrive/d/apache-ant-1.6.0/bin/ant

//////////////////////
// 2.Perl
//////////////////////

$ which perl
/usr/local/bin/perl
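
The two `which` checks above can be folded into one step. The following is only a minimal sketch: the function name check_tools is made up for illustration, and it assumes a POSIX shell where `command -v` is available.

```shell
#!/bin/sh
# check_tools TOOL...: print where each tool lives, or flag it as missing.
# Returns non-zero if at least one tool is not on the PATH.
# (check_tools is a hypothetical helper name, not part of Ant, Perl, or sen.)
check_tools() {
  missing=0
  for tool in "$@"; do
    if path=$(command -v "$tool" 2>/dev/null); then
      echo "$tool: $path"
    else
      echo "$tool: NOT FOUND" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Ant and Perl are the two tools checked in steps 1 and 2.
check_tools ant perl || echo "Install the missing tools before continuing." >&2
```

If either tool is reported as missing, fix the PATH before moving on to the sen dictionary build, which calls both.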

//////////////////////
// 3.sen
//////////////////////
Download from the following site:
https://sen.dev.java.net/servlets/ProjectDocumentList?folderID=755&expandFolder=755&folderID=0

$ wget https://sen.dev.java.net/files/documents/1373/31864/sen-1.2.2.1.zip

$ ls -l sen-1.2.2.1.zip
-rw-r--r--  1 KIOSK mkgroup-l-d 468390 Dec 9 07:58 sen-1.2.2.1.zip

$ unzip sen-1.2.2.1.zip
Archive:  sen-1.2.2.1.zip
   creating: sen-1.2.2.1/
   creating: sen-1.2.2.1/.settings/
   creating: sen-1.2.2.1/bin/
   creating: sen-1.2.2.1/conf/
   creating: sen-1.2.2.1/demo/
   creating: sen-1.2.2.1/dic/
   creating: sen-1.2.2.1/docs/
   creating: sen-1.2.2.1/docs/api/
   creating: sen-1.2.2.1/docs/api/class-use/
   creating: sen-1.2.2.1/docs/api/net/
   creating: sen-1.2.2.1/docs/api/net/java/
   creating: sen-1.2.2.1/docs/api/net/java/sen/
   creating: sen-1.2.2.1/docs/api/net/java/sen/class-use/
   creating: sen-1.2.2.1/docs/api/net/java/sen/io/
	(remaining output omitted)

//////////////////////
// 4.Lucene-ja
//////////////////////

Download from the following site:
https://sen.dev.java.net/servlets/ProjectDocumentList?folderID=755&expandFolder=755&folderID=0

$ wget https://sen.dev.java.net/files/documents/1373/11260/lucene-ja-1.4.3sen1.2-2.zip

$ ls -l lucene-ja-1.4.3sen1.2-2.zip
-rw-r--r--  1 KIOSK mkgroup-l-d 1002447 Feb  8  2005 lucene-ja-1.4.3sen1.2-2.zip

$ unzip lucene-ja-1.4.3sen1.2-2.zip
Archive:  lucene-ja-1.4.3sen1.2-2.zip
   creating: lucene-ja/
   creating: lucene-ja/bin/
  inflating: lucene-ja/bin/mkhtmlindex.bat
  inflating: lucene-ja/bin/mkhtmlindex.sh
  inflating: lucene-ja/bin/mktextindex.bat
  inflating: lucene-ja/bin/mktextindex.sh
  inflating: lucene-ja/bin/search.bat
  inflating: lucene-ja/bin/search.sh
  inflating: lucene-ja/bin/simplelog.properties
   creating: lucene-ja/docs-ja/
  inflating: lucene-ja/docs-ja/demo.html
  inflating: lucene-ja/docs-ja/demo2.html
  inflating: lucene-ja/docs-ja/demo3.html
  inflating: lucene-ja/docs-ja/gettingstarted.html
  inflating: lucene-ja/docs-ja/index.html
  inflating: lucene-ja/docs-ja/powered.html
  inflating: lucene-ja/docs-ja/resources.html
  inflating: lucene-ja/docs-ja/whoweare.html
   creating: lucene-ja/lib/
  inflating: lucene-ja/lib/commons-logging.jar
  inflating: lucene-ja/lib/lucene-1.4.3.jar
  inflating: lucene-ja/lib/lucene-demos-1.4.3.jar
  inflating: lucene-ja/lib/lucene-ja.jar
  inflating: lucene-ja/lib/sen.jar
  inflating: lucene-ja/LICENSE.txt
  inflating: lucene-ja/lucene-ja-src.jar
  inflating: lucene-ja/readme.txt
   creating: lucene-ja/webapp/
  inflating: lucene-ja/webapp/configuration.jsp
  inflating: lucene-ja/webapp/footer.jsp
  inflating: lucene-ja/webapp/header.jsp
  inflating: lucene-ja/webapp/index.jsp
  inflating: lucene-ja/webapp/results.jsp
   creating: lucene-ja/webapp/WEB-INF/
   creating: lucene-ja/webapp/WEB-INF/lib/
  inflating: lucene-ja/webapp/WEB-INF/lib/commons-logging.jar
  inflating: lucene-ja/webapp/WEB-INF/lib/lucene-1.4.3.jar
  inflating: lucene-ja/webapp/WEB-INF/lib/lucene-demos-1.4.3.jar
  inflating: lucene-ja/webapp/WEB-INF/lib/lucene-ja.jar
  inflating: lucene-ja/webapp/WEB-INF/lib/sen.jar
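
A program that uses these libraries needs the jars under lucene-ja/lib on its classpath. The following is a rough sketch of one way to assemble that string. The helper name join_jars is hypothetical, and the default ":" separator assumes a Unix-style Java; a Windows Java (including one launched from Cygwin) expects ";" instead.

```shell
#!/bin/sh
# join_jars DIR [SEP]: print every .jar in DIR joined by SEP (default ":").
# (join_jars is a hypothetical helper, not something shipped with lucene-ja.)
join_jars() {
  dir=$1
  sep=${2:-:}
  classpath=""
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # skip the unexpanded pattern when DIR has no jars
    classpath="${classpath:+$classpath$sep}$jar"
  done
  printf '%s\n' "$classpath"
}

# Classpath for the jars unpacked above (path relative to the unzip directory).
join_jars lucene-ja/lib
```

The result can then be passed to java with -cp, for example `java -cp "$(join_jars lucene-ja/lib)" ...`, where the class to run depends on your own program.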

////////////////////////////////
// 5.Installing sen
////////////////////////////////

■ Move to the sen dictionary directory

$ pwd
/cygdrive/d/MyDevelopment/Lucene/sen-1.2.2.1/dic

$ ls
build.xml  compound.pl  dictionary.properties  ipa2mecab.pl

$ ant -Dperl.bin=/usr/local/bin/perl

This failed under Cygwin, so I ran it from the DOS prompt as follows:

D:\MyDevelopment\Lucene\sen-1.2.2.1\dic>ant -Dperl.bin=D:\perl\bin\perl.exe
Buildfile: build.xml

prepare-proxy:

prepare-archive:

prepare-dics0:

prepare-dics:

download:

melt:

prepare:

dics0:
     [exec] ipadic-2.6.0/Adj.dic ...
     [exec] ipadic-2.6.0/Adnominal.dic ...
     [exec] ipadic-2.6.0/Adverb.dic ...
     [exec] ipadic-2.6.0/Auxil.dic ...
     [exec] ipadic-2.6.0/Conjunction.dic ...
     [exec] ipadic-2.6.0/Filler.dic ...
     [exec] ipadic-2.6.0/Interjection.dic ...
     [exec] ipadic-2.6.0/Noun.adjv.dic ...
     [exec] ipadic-2.6.0/Noun.adverbal.dic ...
     [exec] ipadic-2.6.0/Noun.demonst.dic ...
     [exec] ipadic-2.6.0/Noun.dic ...
     [exec] ipadic-2.6.0/Noun.nai.dic ...
     [exec] ipadic-2.6.0/Noun.name.dic ...
     [exec] ipadic-2.6.0/Noun.number.dic ...
     [exec] ipadic-2.6.0/Noun.org.dic ...
     [exec] ipadic-2.6.0/Noun.others.dic ...
     [exec] ipadic-2.6.0/Noun.place.dic ...
     [exec] ipadic-2.6.0/Noun.proper.dic ...
     [exec] ipadic-2.6.0/Noun.verbal.dic ...
     [exec] ipadic-2.6.0/Others.dic ...
     [exec] ipadic-2.6.0/Postp-col.dic ...
     [exec] ipadic-2.6.0/Postp.dic ...
     [exec] ipadic-2.6.0/Prefix.dic ...
     [exec] ipadic-2.6.0/Suffix.dic ...
     [exec] ipadic-2.6.0/Symbol.dic ...
     [exec] ipadic-2.6.0/Verb.dic ...



create:
     [java] [INFO] MkSenDic - (1/7): reading connection matrix ...
     [java] [INFO] MkSenDic - connection file = connect.csv
     [java] [INFO] MkSenDic - charset = EUC_JP
     [java] [INFO] MkSenDic - (2/7): building type dictionary ...
     [java] [INFO] MkSenDic - (3/7): writing conection matrix (5 x 1281 x 701 = 4489905) ...
     [java] [INFO] MkSenDic - (4/7): reading morpheme information ...
     [java] [INFO] MkSenDic - load dic: dic.csv
     [java] [INFO] MkSenDic - 50000...
     [java] [INFO] MkSenDic - 100000...
     [java] [INFO] MkSenDic - 150000...
     [java] [INFO] MkSenDic - 200000...
     [java] [INFO] MkSenDic - 250000...
     [java] [INFO] MkSenDic - 300000...
     [java] [INFO] MkSenDic - 350000...
     [java] [INFO] MkSenDic - (5/7): sorting lex...
     [java] [INFO] MkSenDic - (6/7): writing token...
     [java] [INFO] MkSenDic - key size = 378227
     [java] [INFO] MkSenDic - (7/7): building Double-Array (size = 325254) ...
     [java] [INFO] DoubleArrayTrie - save time = 0.571[s]
     [java] [INFO] MkSenDic - total time = 375[ms]



BUILD SUCCESSFUL
Total time: 7 minutes 39 seconds

* To see why it failed under Cygwin, check the perl-related parts of build.xml:

$ grep -n perl build.xml
11:  <property name="perl.bin" value="c:/usr/cygwin/bin/perl.exe"/>
13:  <property name="perl.bin" value="/usr/bin/perl"/>
88:    <exec executable="${perl.bin}">
136:    <exec executable="${perl.bin}">

The perl path was simply wrong. After correcting it to /usr/local/bin/perl in build.xml, just run:

$ ant
Buildfile: build.xml

prepare-proxy:

prepare-archive:

prepare-dics0:

prepare-dics:

download:

melt:

prepare:

dics0:

create:
     [java] [INFO] MkSenDic - (1/7): reading connection matrix ...
     [java] [INFO] MkSenDic - connection file = connect.csv
     [java] [INFO] MkSenDic - charset = EUC_JP
     [java] [INFO] MkSenDic - (2/7): building type dictionary ...
     [java] [INFO] MkSenDic - (3/7): writing conection matrix (5 x 1281 x 701 = 4489905) ...
     [java] [INFO] MkSenDic - (4/7): reading morpheme information ...
     [java] [INFO] MkSenDic - load dic: dic.csv
     [java] [INFO] MkSenDic - 50000...
     [java] [INFO] MkSenDic - 100000...
     [java] [INFO] MkSenDic - 150000...
     [java] [INFO] MkSenDic - 200000...
     [java] [INFO] MkSenDic - 250000...
     [java] [INFO] MkSenDic - 300000...
     [java] [INFO] MkSenDic - 350000...
     [java] [INFO] MkSenDic - (5/7): sorting lex...
     [java] [INFO] MkSenDic - (6/7): writing token...
     [java] [INFO] MkSenDic - key size = 378227
     [java] [INFO] MkSenDic - (7/7): building Double-Array (size = 325254) ...
     [java] [INFO] DoubleArrayTrie - save time = 0.811[s]
     [java] [INFO] MkSenDic - total time = 399[ms]



BUILD SUCCESSFUL
Total time: 6 minutes 42 seconds

This time it succeeded!
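
The build.xml correction described above was done by hand, but it could also be scripted. The following is only a sketch: set_perl_bin is a made-up helper name, and it assumes the property lines look exactly like the two shown in the grep output (name="perl.bin" followed by value="..."). It rewrites both lines, since the notes above do not say which one applies on a given platform.

```shell
#!/bin/sh
# set_perl_bin FILE PERL_PATH: rewrite the value of every
# <property name="perl.bin" value="..."/> line in FILE to PERL_PATH.
# Assumption: PERL_PATH contains no '|', '&', or backslash
# (write Windows paths with forward slashes).
set_perl_bin() {
  file=$1
  perl_path=$2
  tmp=$(mktemp) || return 1
  # '|' is the sed delimiter so slashes in the path need no escaping.
  sed "s|\(<property name=\"perl.bin\" value=\"\)[^\"]*\"|\1$perl_path\"|" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage (run inside sen-1.2.2.1/dic, after backing up build.xml):
#   set_perl_bin build.xml /usr/local/bin/perl
```

Lines that only reference the property, such as the exec elements using ${perl.bin}, are left untouched.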



////////////////////////////////
// 6.Installing Lucene
////////////////////////////////
	The archive has already been downloaded and extracted,
	so no further work is needed.
