python - KeyError: 'id' when trying to index documents to Solr using sunburnt
I am trying to index a few text files to Solr using sunburnt. Below is my code:
    import hashlib

    import httplib2
    import sunburnt

    solr_url = "http://localhost:8983/solr"
    h = httplib2.Http(cache="/var/tmp/solr_cache")
    solr_instance = sunburnt.SolrInterface(url=solr_url, http_connection=h)

    # webpages is an iterable of (url, title, webpage) tuples
    for url, title, webpage in webpages:
        # Use the MD5 hash of the URL as the document id
        html_id = hashlib.md5(url).hexdigest()
        doc = {"id": html_id, "content": webpage, "title": title}
        solr_instance.add(doc)

    try:
        solr_instance.commit()
    except Exception:
        print "Could not commit changes to Solr, check the log files."
    else:
        print "Successfully committed changes"
But when I run it, I get the error below.
      File "/users/ananya/desktop/dbms project/code/extracttext/extracttext.py", line 94, in index_to_solr
        solr_instance = sunburnt.SolrInterface(url=solr_url, http_connection=h)
      File "/users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/sunburnt.py", line 166, in __init__
        self.init_schema()
      File "/users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/sunburnt.py", line 177, in init_schema
        self.schema = SolrSchema(schemadoc, format=self.format)
      File "/users/ananya/anaconda/lib/python2.7/site-packages/sunburnt/schema.py", line 417, in __init__
        if self.unique_key else None
    KeyError: 'id'
I am new to Solr. Please help me. Do I need to make changes to the schema file? If yes, please let me know how.
Thanks.
If you're using Solr 4.8 or greater, this is a bug in sunburnt 0.6.
The fork of sunburnt by arafalov has a patch that fixed it for me.
Try:
    git clone git@github.com:arafalov/sunburnt.git
    cd sunburnt
    python setup.py install  # optionally --user
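For anyone wondering why the schema lookup fails: as far as I understand it (this is my assumption, the answer does not spell it out), the example schema.xml shipped with Solr 4.8 and later no longer wraps its field definitions in a <fields> element, while sunburnt 0.6 still looks for them there. It then finds no fields at all, and looking up the uniqueKey field raises KeyError: 'id'. The sketch below is not sunburnt's actual code. It only imitates that lookup with the standard library, using two made-up, trimmed-down schema documents (LEGACY_SCHEMA and FLAT_SCHEMA are hypothetical) and two hypothetical helper functions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down schema documents, for illustration only.
LEGACY_SCHEMA = """<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>"""

FLAT_SCHEMA = """<schema name="example" version="1.5">
  <field name="id" type="string" indexed="true" stored="true"/>
  <uniqueKey>id</uniqueKey>
</schema>"""


def describe_layout(schema_xml):
    """Return 'legacy' if fields sit inside <fields>, otherwise 'flat'."""
    root = ET.fromstring(schema_xml)
    return "legacy" if root.find("fields") is not None else "flat"


def find_unique_key_field(schema_xml, field_path):
    """Look up the uniqueKey field among the <field> elements at field_path.

    Raises KeyError when no field with that name is found at that path,
    which mirrors the failure mode in the traceback above.
    """
    root = ET.fromstring(schema_xml)
    fields = dict((f.get("name"), f) for f in root.findall(field_path))
    unique_key = root.findtext("uniqueKey")
    return fields[unique_key]
```

Calling find_unique_key_field(FLAT_SCHEMA, "fields/field") raises KeyError, while the same call with the path "field" succeeds. So you should not need to edit your schema file if the patched fork installs cleanly. Running describe_layout on your own schema.xml contents is a quick way to see which layout your Solr core uses.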