I recently merged a whole lot of indexes into one big index for testing
purposes. However, the programs I use to search the index are now taking
much longer. This may be a stupid question (or
Hi all
I've got a question about the delete feature. I have a very large
collection of XML documents. Each document contains a classification,
and one document can be in different classific
Hello
I am new to OpenCms and Lucene technology.
I want to index PDF files and index the content of these files.
I work in this way:
Make a PDFDocument class like the JspDocument class.
Is there a formal grammar available that describes
the latest query syntax? And where can I get it?
A new Lucene release is available.
It can be downloaded from
Release notes are at
Does Lucene support exact matching on a tokenized field?
So for example... if I add these three phrases to the index
- "The quick brown fox"
- "The quick brown fox jumped"
- "brown fox"
What could cause such a weird exception?
RAMInputStream.<init> java.lang.NullPointerException
at org.apache.lucene.store.RAMInputStream.<init>(RAMDirectory
I've found something about expression extraction (the ability to detect
that two words form an expression when they frequently appear side by side)
http://www.miv.t.u-tokyo.ac.jp/pape
Hi all
For my job, in the indexing stage I would like to keep stop words such as "the", "with", "of", "by", etc. as normal words. I did this by instantiating a StandardAnalyzer object (in the IndexHTML program) wit
Hi.
I'm trying to extract expressions from the term position information, i.e.
if two words frequently appear side by side, then we can consider that the
two words are really one. For instance
Is anyone doing anything interesting with the
Token.setPositionIncrement during analysis?
Just for fun I've written a simple stop filter that bumps the position
increments to account for the s
Hi
Wonder if anyone can help. Has anyone used Lucene in a Windows environment?
Anyone know of any documentation specifically focused on doing that?
Or anyone know of any gotchas to avoid?
I have a very hierarchical document structure where each level of the
hierarchy contains indexable information. It looks like this
Study ->
Section ->
DataFile ->
This is pretty much off topic but...
ZOE has been nominated as one of the candidate projects to go to the Open
Source Innovation Area on the COMDEX Exhibit Floor.
Can the Lucene search engine index and search through PDF documents?
What are the file format limits for the Lucene search engine?
Thanks in Advance
Is there a recommended strategy for searching an index
that is updated continuously?
One idea that I thought of is to have two indexes, one for searching and
one for inde
I am trying to index UTF-8 encoded HTML files with content in various
languages with Lucene. So far I always receive a message
"Parse Aborted: Lexical error at line 146, column 79.
I have a field "VOLUME" of type "keyword". When I search for "VOLUME 1" the
expected hits are returned, but when I search for "VOLUME 2" I get an
ArrayIndexOutOfBoundsException with message 101
Hi all.
I need to define my own tokenizer so as to detect accented characters.
So as not to modify the Lucene classes, I made a copy of
StandardTokenizer.jj in another package.
Then I modifie
Hello
When I search for "MS-Word" I get all the documents that contain exactly
that word, which is good. If however I search for MS-Word (without the
quotes) then the MultiFieldQueryParser rest
Moving to lucene-user list.
If not the author maybe some users of this code can tell us how this
uppercase/lowercase business should work.
And the issue even includes patches. I don't use the Germ
BroadVision tell me this is far better than their 2 attempts at using
Now there is absolutely no reason for any BroadVision site not to have a
pretty damn good search facility.
I know you
Hi
When I use the multiple-index search on one large index and one small index,
it looks like the documents from the small index sometimes get a higher score
compared to the documents from the big index. But w
As with many people, I want the default query behavior to be AND (instead
of OR). However, I'm also (always) creating multi-field queries. I don't
see a way to accomplish this cleanly in the API.
hi
Does anyone know of a way to get the similarity between two documents, as
opposed to between a document and a query? At the moment I'm forced to
make a term-frequency vector for each document an
Hi
The index directory that Lucene created has 2,322 files in it. When I
try to open it I get the dreaded "Too Many Open Files" problem
java.io.FileNotFoundException: C:\Index\_1lvq.f107 (Too
Hello
I'm playing around with Struts to see if I should build my search web app using the Struts framework. I began by making an Action which performs the search and places the Hits object on the se
I'm running Lucene 1.2 and when I do the following query I get the
at org.apache.lucene.queryParser.QueryParser.Term(Unknown So
Hi
I'm having quite a bit of success with Lucene designing a new search
tool for our website -- the only problem is that I've had to drop down
to Java 1.3.6 (all our production systems are Java 1.4.x
Hi!
Somebody wrote a SQLDirectory for Lucene 1.2 (only) but discontinued it
because of performance issues.
Well, I would really like to store that index in the same place as the data