Lucene&solr 4 实践（1）-阿里云开发者社区

实践（1）关于solr-core pom和schema部分特性解读
本部分是入口，对pom依赖是工程的启动点，对schema的认识是应用的起点。

注意: pom 中配置信息要完全正确，另外，依赖的工程要求一一正确，后面才可以执行
build.bat > mvnlog.txt //mvn输出的控制台结果直接输出到 mvnlog.txt文件中
子工程pom中依赖的jar 版本会从父pom中寻找，同时除了版本不一样外，其他的信息要保持一致

（1）solr-core
pom 文件参考：http://repo1.maven.org/maven2/org/apache/solr/solr-core/4.0.0/solr-core-4.0.0.pom
版本信息实例参考如下，
注意：
--》版本号有比较强的依赖，例如jetty server的8的才可以，6、7、9的都不行。
--》schema的结构与性能的配合
<dependencies>
<dependency>
<groupId>com.taobao.terminator</groupId>
<artifactId>terminator-solrj</artifactId>
<version>4.0.0-SNAPSHOT</version>
</dependency>

<dependency>
<groupId>junit</groupId>
<artifactId>junit</artifactId>
<version>4.10</version>
<type>jar</type>
<scope>test</scope>
</dependency>

<dependency>
<groupId>easymock</groupId>
<artifactId>easymock</artifactId>
<version>2.0</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-core</artifactId>
<version>4.0.0</version>
</dependency>

<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-test-framework</artifactId>
<version>4.0.0</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.solr</groupId>
<artifactId>solr-test-framework</artifactId>
<version>4.0.0</version>
<scope>test</scope>
</dependency>

<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-analyzers-common</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-highlighter</artifactId>
<version>4.0.0</version>
</dependency>

<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queryparser</artifactId>
<version>4.0.0</version>
</dependency>


<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-codecs</artifactId>
<version>4.0.0</version>
</dependency>

<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-memory</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-misc</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-queries</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-spatial</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-suggest</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>org.apache.lucene</groupId>
<artifactId>lucene-grouping</artifactId>
<version>4.0.0</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>1.6</version>
</dependency>


<dependency>
<groupId>commons-cli</groupId>
<artifactId>commons-cli</artifactId>
<version>1.2</version>
</dependency>

<dependency>
<groupId>commons-fileupload</groupId>
<artifactId>commons-fileupload</artifactId>
<version>1.2.2</version>
</dependency>

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>jcl-over-slf4j</artifactId>
<version>1.6.4</version>
</dependency>

<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-jdk14</artifactId>
<version>1.6.4</version>
</dependency>

<dependency>
<groupId>commons-io</groupId>
<artifactId>commons-io</artifactId>
<version>2.3</version>
</dependency>

<dependency>
<groupId>commons-lang</groupId>
<artifactId>commons-lang</artifactId>
<version>2.4</version>
</dependency>

<dependency>
<groupId>com.google.guava</groupId>
<artifactId>guava</artifactId>
<version>r05</version>
</dependency>
<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-server</artifactId>
<optional>true</optional>


<version>8.1.2.v20120308</version>
</dependency>

<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-util</artifactId>
<optional>true</optional>


<version>8.1.2.v20120308</version>
</dependency>

<dependency>
<groupId>org.eclipse.jetty</groupId>
<artifactId>jetty-webapp</artifactId>
<optional>true</optional>


<version>8.1.2.v20120308</version>
</dependency>

<dependency>
<groupId>org.codehaus.woodstox</groupId>
<artifactId>wstx-asl</artifactId>
<scope>runtime</scope>
<version>4.0.6</version>
<exclusions>
<exclusion><groupId>stax</groupId>
<artifactId>stax-api</artifactId>
</exclusion></exclusions>
</dependency>

<dependency>
<groupId>javax.servlet</groupId>
<artifactId>servlet-api</artifactId>
<version>2.5</version>
<scope>provided</scope>
</dependency>

<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>4.2.1</version>
<exclusions>
<exclusion>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpmime</artifactId>
<version>4.2.1</version>
</dependency>
</dependencies>
（2）solr-schema
注意：
schema version信息与老版本的升级、bytes类型的发挥、boolean类型的发挥、int pint tint和区间和排序的问题、地理数据结构、全语言分词的支持、随机域的使用

this schema includes many optional features and should not be used for benchmarking.
To improve performance one could
- set stored="false" for all fields possible (esp large fields) when you only need to search on the field
but don't need to return the original value.
- set indexed="false" if you don't need to search on the field, but only
return the field as a result of searching on other indexed fields.
- remove all unneeded copyField statements
- for best index size and searching performance, set "index" to false for all general text fields,
use copyField to copy them to the catchall "text" field, and use that for searching.
- For maximum indexing performance, use the StreamingUpdateSolrServer java client.
- Remember to run the JVM in server mode, and use a higher logging level
that avoids logging every request

attribute "name" is the name of this schema and is only used for display purposes.
version="x.y" is Solr's version number for the schema syntax and
semantics. It should not normally be changed by applications.
1.0: multiValued attribute did not exist, all fields are multiValued
by nature
1.1: multiValued attribute introduced, false by default
1.2: omitTermFreqAndPositions attribute introduced, true by default
except for text fields.
1.3: removed optional field compress feature
1.4: autoGeneratePhraseQueries attribute introduced to drive QueryParser
behavior when a single string produces multiple tokens. Defaults
to off for version >= 1.4
1.5: omitNorms defaults to true for primitive field types
(int, float, boolean, string...)
2. sortMissingLast 和 sortMissingFirst对的属性之一。只在对该进行排序时，才起作用。
sortMissingLast = true时，那些在该上没有值的documents将被排在那些在该上有值的documents之后。
sortMissingFirst = true时的情况正好相反。
如果两者都设为false，则使用Lucene的排序。
3. positionIncrementGap=100 只对multiValue = true 的fieldType有意义。

<fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
<fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>


<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tfloat" class="solr.TrieFloatField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0"/>
<fieldType name="tdouble" class="solr.TrieDoubleField" precisionStep="8" positionIncrementGap="0"/>


<fieldtype name="binary" class="solr.BinaryField"/>


<fieldType name="pint" class="solr.IntField"/>
<fieldType name="plong" class="solr.LongField"/>
<fieldType name="pfloat" class="solr.FloatField"/>
<fieldType name="pdouble" class="solr.DoubleField"/>
<fieldType name="pdate" class="solr.DateField" sortMissingLast="true"/>


<fieldType name="random" class="solr.RandomSortField" indexed="true" />




<CJK bigram >

Lucene&solr 4 实践（1）

热门文章

最新文章

相关课程

相关电子书

热门

活动广场

任务中心

开发者评测

高校计划

乘风者计划

训练营

阿里云MVP

话题

直播

下载

镜像站

技术资料

插件

Lucene&solr 4 实践（1）

热门文章

最新文章

相关课程

相关电子书