Lucene in Action
Erik Hatcher and Otis Gospodnetić
2004 | 456 pages | ISBN: 1932394281
Index
A
abbreviation, handling 355
accuracy 360
Ackley, Ryan 250
Adobe Systems 235
agent, distributed 349
AliasAnalyzer 364
Alias-i 361
Almaer, Dion 371
alternative spellings 354
analysis 103
Analyzers 19
Ant
Antiword 264
ANTLR 100, 336
Apache Jakarta 7, 9
Apache Software Foundation 9
Apache Software License 7
Arabic 359
architecture
ASCII 142
Asian language analysis 142
B
Bakhtiar, Amir 320
Beagle 318
Bell, Timothy C. 26
Berkeley DB, storing index 307
Bialecki, Andrzej 271
biomedical, use of Lucene 352
BooleanQuery 85
boosting 79
BrazilianAnalyzer 282
C
C++ 10
CachingWrapperFilter 171, 177
Cafarella, Michael 326
Carpenter, Bob 351
cell phone, T9 WordNet interface 297
ChainedFilter 177, 304
Chandler 307, 322
charades 125
Chinese analysis 142–143, 282
CJK (Chinese Japanese Korean) 142
CJKAnalyzer 143, 145, 282
Clark, Andy 245
Clark, Mike 214
CLucene 314, 317
color
command-line interface 269
compound index
converting native files to ASCII 142
coordination, query term 79
Cozens, Simon 318
CPAN 318
crawler 372
crawling alternatives 330
CSS in highlighting 301
Cutting, Doug 9
CVS
CyberNeko. See NekoHTML
CzechAnalyzer 282
D
database 8
date, indexing 216
DateField 39
DateFilter 171–173
DbDirectory 308
debugging, queries 94
DefaultSimilarity 79
deleting documents 375
Digester
Directory 19
directory in Berkeley DB 308
DMOZ 27
DNA 354
Docco 265
DocSearcher 264
Document 20, 71
document boosting 377
document frequency
document handler
document type handling
documentation 388
dotLucene 317–318
downloading Lucene 388
Dutch 354
DutchAnalyzer 282
E
Egothor 24
encoding
Etymon PJ 264
Explanation 80
F
Field 20–22
file handle
Filter 76
FilteredQuery 178, 212
filtering
foreign language analysis 140
Formatter 300
Fragmenter 300
FrenchAnalyzer 282
fuzzy string similarity 351
FuzzyEnum 350
FuzzyQuery 92
G
GCJ 308
German analysis 141
Giustina, Fabrizio 242
Glimpse 26
GNOME 318
Google 6, 27
government intelligence, use of Lucene 352
H
Harvest 26
Harvest-NG 26
Harwood, Mark 300
highlighting, query terms 300–303, 343
Hindi 354
HitCollector 76, 201–203
Hits 24, 70–71, 76
ht://Dig 26
HTML 8
HTMLParser 264
HTTP
HTTP request
I
I18N. See internationalization
index optimization 56–59
index structure
IndexFiles 389
IndexHTML 390
indexing
IndexReader 199
IndexSearcher 23, 70, 78
IndexWriter 19
information overload 6
Information Retrieval (IR) 7
Installing Lucene 387–392
intelligent agent 6
internationalization 141
inverse document frequency 79
inverted index 404
IR. See Information Retrieval (IR)
ISO-8859-1 142
J
Jakarta Commons Digester 230–235
Jakarta POI 249–250
Japanese analysis 142
Java Messaging Service 352
Java, keyword 331
JavaCC 100
JavaScript
JDOM 264
jGuru 341
JGuruMultiSearcher 339
Jones, Tim 150
JPedal 264
jSearch 7
JTidy 242–245
JUnitPerf 213
JWordNet 297
K
keyword analyzer 124
Konrad, Karsten 344
Korean analysis 142
L
language
LARM 7, 372
Levenshtein distance algorithm 92
lexicon, definition 331
LIMO 279
LingPipe 353
linguistics 353
Litchfield, Ben 236
Lookout 6, 318
Lucene
Lucene ports 312–324
Lucene Wiki 7
Lucene.Net 6
lucli 269
Luke 271, 391
Lupy 308, 320–322
M
Managing Gigabytes 26
Matalon, Dror 269
Metaphone 125
MG4J 26
Michaels.com 361–371
Microsoft 6, 318
Microsoft Index Server 26
Microsoft Outlook 6, 318
Microsoft Windows 14
Microsoft Word 8
Miller, George 292
misspellings 354
mock object 131, 211
Moffat, Alistair 26
morphological variation 355
Movable Type 320
MSN 6
MultiFieldQueryParser 160
multifile index, creating 398
multiple indexes 331
MultiSearcher 178–185
multithreaded searching. See ParallelMultiSearcher
Multivalent 264
N
Namazu 26
native2ascii 142
natural language with XM-InformationMinder 345
NekoHTML 245–248, 329, 352
.NET 10
n-gram TokenStream 357
NGramQuery 358
NGramSearcher 358
Nioche, Julien 279
noisy-channel model 355
normalization
numeric
Nutch 7, 9, 329
O
OLE 2 Compound Document format 249
open files formula 401
OpenOffice SDK 264
optimize 340
orthographic variation 354
Overture 6
P
paging
ParallelMultiSearcher 180
Parr, Terence 329
ParseException 204, 379
parsing 73
partitioning indexes 180
PDF 8
PDF Text Stream 264
PDFBox 236–241
PerFieldAnalyzerWrapper
performance
Perl 10
pharmaceutical, uses of Lucene 347
PhrasePrefixQuery 157–159
PhraseQuery 87
Piccolo 264
Plucene 318–320
POI 264
Porter stemming algorithm 136
Porter, Dr. Martin 25, 136, 283
position, increment offset in SpanQuery 161
precision 11, 360
PrefixQuery 84
Properties file, encoding 142
PyLucene 308, 322–323
Python 10
Q
Query 23, 70, 72

