First, here is the architecture of the Segmentation and Translit processes, which I worked out by analyzing the classes in Segmenter.jar, written by the endeca.com group.
Segmentation process:
Translit process:
We can see that the two processes use the same design pattern. An adapter modifies the interface so that it can be wrapped in a Runnable thread, and in the inner implementation it delegates the actual work to the AdapterHandler.
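To make the pattern concrete, here is a minimal sketch of an adapter that exposes a unit of work as a Runnable and delegates to a handler. Every name in it (WorkHandler, RunnableAdapter, AdapterDemo) is hypothetical and chosen only for illustration; none of them comes from Segmenter.jar, and the real AdapterHandler may look quite different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical handler that performs the real work (not the Endeca AdapterHandler). */
interface WorkHandler {
    void handle(String input, List<String> out);
}

/** Adapter: presents a WorkHandler as a Runnable so that a Thread can run it. */
final class RunnableAdapter implements Runnable {
    private final WorkHandler handler;
    private final String input;
    private final List<String> out;

    RunnableAdapter(WorkHandler handler, String input, List<String> out) {
        this.handler = handler;
        this.input = input;
        this.out = out;
    }

    @Override
    public void run() {
        // The adapter contains no logic of its own; it only delegates.
        handler.handle(input, out);
    }
}

final class AdapterDemo {
    /** Runs the handler on a separate thread and returns what it produced. */
    static List<String> runInThread(WorkHandler handler, String input) {
        List<String> out = Collections.synchronizedList(new ArrayList<>());
        Thread worker = new Thread(new RunnableAdapter(handler, input, out));
        worker.start();
        try {
            worker.join(); // wait until the delegated work has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return out;
    }
}
```

Under this assumption, the Segmenter and Translit processes would differ only in the handler they pass in, while the threading wrapper stays the same.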
The Segmenter process performs segmentation, for example turning ABCDE into A B C D E.
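The ABCDE example can be reproduced with a naive sketch that puts a space between every character. This only illustrates the input and output shape; the class name CharSegmenter is hypothetical, and the real Segmenter most likely uses language-specific rules or a dictionary rather than splitting on every character.

```java
/** Hypothetical, naive segmenter: separates every Unicode code point with a space. */
final class CharSegmenter {
    static String segment(String text) {
        StringBuilder sb = new StringBuilder();
        // Iterate over code points so that supplementary characters stay intact.
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.appendCodePoint(cp);
        });
        return sb.toString();
    }
}
```

For example, `CharSegmenter.segment("ABCDE")` returns `"A B C D E"`.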
The Translit process reads the feed file and builds a dictionary.
AdapterConfig: I cannot see the source code or the decompiled code, because ENDECA is not open source. Judging from how it is used, I think it is a configuration wrapper class that maps to some adapter configuration written in XML.
AdapterHandler: again, I cannot see the source code. I think it wraps some callback methods and is used in both the Segmenter and Translit processes.
There are some functions I do not fully understand, which prevents me from understanding the whole process:
1. The first() method in the AdapterConfig class. I believe it retrieves the first element that matches the tag name passed as the parameter, but I do not know which file it parses.
2. The emit() method in the AdapterHandler class.
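To show what I mean in point 1, here is a sketch of how a first() lookup over an XML configuration could work. This is purely my guess written with the standard JDK XML parser; the class XmlConfigSketch, the element names, and the behavior are all assumptions and are not taken from the Endeca AdapterConfig class.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

/** Hypothetical configuration wrapper; NOT the real AdapterConfig. */
final class XmlConfigSketch {
    private final Document doc;

    XmlConfigSketch(String xml) {
        try {
            doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid XML configuration", e);
        }
    }

    /** Returns the text of the first element with the given tag, or null if there is none. */
    String first(String tag) {
        NodeList nodes = doc.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
    }
}
```

With the made-up configuration `<adapter><input>a.txt</input><input>b.txt</input></adapter>`, calling `first("input")` on this sketch returns `"a.txt"`. Which file the real first() reads remains the open question.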