
Mailing List Archive: Lucene: Java-User
Estimate index filesystem requirements


lothar at mountproc

Nov 18, 2007, 7:15 AM



Hi,

I'm wondering whether there is some kind of formula for estimating the
size of a Lucene index. Searching the list, I did not find any pointers.

Does anybody have a hint?

What I figured out from the file format description and some empirical
tests is the following estimate for each index file:

Field files:
  field data   .fdt: NumberOfDocs * NumberOfFieldsPerDoc
  field index  .fdx: NumberOfDocs * 8
  field info   .fnm: ignored

Term files:
  term data    .tis: NumberOfTerms * 8
  term index   .tii: no idea so far
  term freq    .frq: estimated as NumberOfDocs * NumberOfTerms

Normalization:
  norm file    .nrm: NumberOfDocs

This concerns only unstored fields, of course.

I estimate the total NumberOfTerms of my document collection at 10% of
NumberOfDocs. Does anyone have similar experience?
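
To make the arithmetic concrete, here is a minimal sketch that just adds up
the rules of thumb above. It is only an illustration: the function name
estimate_index_size and its parameters are made up for this example, the
results are presumably in bytes (the formulas above do not state a unit),
and .fnm and .tii are left out because there is no estimate for them.

```python
def estimate_index_size(num_docs, fields_per_doc, num_terms=None, terms_percent=10):
    """Add up the per-file rules of thumb from this post (rough guesses only)."""
    if num_terms is None:
        # Heuristic from this post: NumberOfTerms is about 10% of NumberOfDocs.
        num_terms = num_docs * terms_percent // 100

    sizes = {
        ".fdt": num_docs * fields_per_doc,  # field data
        ".fdx": num_docs * 8,               # field index
        ".tis": num_terms * 8,              # term data
        ".frq": num_docs * num_terms,       # term frequencies
        ".nrm": num_docs,                   # norms
    }
    # .fnm is ignored and .tii is unknown, so neither is counted here.
    sizes["total"] = sum(sizes.values())
    return sizes


if __name__ == "__main__":
    for name, size in estimate_index_size(1000, 5).items():
        print(f"{name:>6}: {size}")
```

With 1000 documents and 5 fields per document, the .frq term dominates the
total, because it is the only one that grows with NumberOfDocs times
NumberOfTerms.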

lofi


