
Mailing List Archive: Lucene: General

Nutch 1.3 & Solr 3.4

 

 



ib.bakayoko at gmail

Sep 20, 2011, 8:22 PM


I am connecting Drupal and Nutch to the same Apache Solr index.
I index Drupal content and push it to Solr, and I also crawl a website
with Nutch and push that index to Solr.

I get the following error while pushing the Nutch index to Solr:

SolrIndexer: starting at 2011-09-20 22:23:03
SolrIndexer: finished at 2011-09-20 22:23:05, elapsed: 00:00:01
SolrDeleteDuplicates: starting at 2011-09-20 22:23:05
SolrDeleteDuplicates: Solr url: http://{servername}:8080/solr/intranetd/
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:362)
at org.apache.nutch.crawl.Crawl.run(Crawl.java:152)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:54)

-----------------------------------------------------------------------------------

hadoop.log

2011-09-20 22:23:05,566 INFO solr.SolrIndexer - SolrIndexer: finished at 2011-09-20 22:23:05, elapsed: 00:00:01
2011-09-20 22:23:05,567 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2011-09-20 22:23:05
2011-09-20 22:23:05,567 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://{servername}:8080/solr/intranetd/
2011-09-20 22:23:07,695 WARN mapred.LocalJobRunner - job_local_0020
java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:388)
at org.apache.hadoop.io.Text.set(Text.java:178)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:271)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:242)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)

--------------------------------
I would appreciate your help.

When I use separate indexes and then merge the Drupal and Nutch indexes,
it works fine.

I get the error when I try to push both the Nutch index and the Drupal
index to an existing Drupal index.

Can two applications write to the same index?
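
One possible reading of the trace, offered as an assumption and not confirmed against the Nutch source: the NullPointerException is raised in Text.set while the dedup job reads records back from Solr, which would be consistent with some documents in the shared index (for example the ones written by Drupal) lacking a field that the dedup step expects on every document. The sketch below only illustrates that failure mode and a null guard around it. It models Solr documents as plain dictionaries; the field name "digest", the helper names, and the sample data are all hypothetical, and the duplicate rule is simplified to "first document with a given digest wins".

```python
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical field name: assumed to be written by the crawler
# but absent from documents indexed by the other application.
DIGEST_FIELD = "digest"

Doc = Dict[str, str]


def records_with_digest(docs: Iterable[Doc],
                        field: str = DIGEST_FIELD) -> Iterator[Tuple[str, Doc]]:
    """Yield (digest, doc) pairs, skipping documents that lack the field.

    Without this guard, a reader that assumes the field is always present
    would hand a missing value (None) to the next step and fail there.
    """
    for doc in docs:
        digest = doc.get(field)
        if digest is None:
            # Document written by another application: not a dedup candidate.
            continue
        yield digest, doc


def duplicate_ids(docs: Iterable[Doc]) -> List[str]:
    """Return ids of documents whose digest was already seen (simplified rule)."""
    seen: Dict[str, str] = {}
    duplicates: List[str] = []
    for digest, doc in records_with_digest(docs):
        if digest in seen:
            duplicates.append(doc["id"])
        else:
            seen[digest] = doc["id"]
    return duplicates


# Hypothetical mixed index: two crawled pages plus one document with no digest.
docs = [
    {"id": "http://example.com/a", "digest": "abc"},
    {"id": "node/1"},
    {"id": "http://example.com/b", "digest": "abc"},
]
```

With this sample data, duplicate_ids(docs) reports only the second crawled page and ignores the document without a digest. If the assumption holds, the same idea (skip or filter documents that lack the expected field) would apply wherever the dedup step queries the shared index.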


--
View this message in context: http://lucene.472066.n3.nabble.com/Nutch-1-3-Solr-3-4-tp3354294p3354294.html
Sent from the Lucene - General mailing list archive at Nabble.com.
