How to use the Solr Data Import Handler to index a MySQL table?

user967451 picture user967451 · Apr 27, 2012 · Viewed 13.8k times · Source

When I try to import a mysql table by loading this in the browser:

http://192.168.136.129:8983/solr/dataimport?command=full-import

I get this error:

HTTP ERROR 404

Problem accessing /solr/dataimport. Reason:

    NOT_FOUND

Powered by Jetty://

I'm following this tutorial from the official Solr wiki to get started with the DIH:

http://wiki.apache.org/solr/DIHQuickStart

As per the tutorial I added this to my solrconfig.xml:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>  

in data-config.xml I have the following:

<dataConfig>
  <dataSource type="JdbcDataSource" 
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/site" 
              user="root" 
              password="123"/>
  <document>
    <entity name="profiles" 
            query="select user_id,about,music,movies,occupation from profiles">
    </entity>
  </document>
</dataConfig>

And these are the fields defined in my schema.xml:

  <fields>
    <field name="user_id" type="string" indexed="true" stored="true" required="true" />
    <field name="about" type="string" indexed="true" stored="true" />
    <field name="music" type="string" indexed="true" stored="true" />
    <field name="movies" type="string" indexed="true" stored="true" />
    <field name="occupation" type="string" indexed="true" stored="true" />  
    <field name="text" type="text_general" indexed="true" stored="false" multiValued="true"/>
  </fields>

  <uniqueKey>user_id</uniqueKey>

So what am I doing wrong? I imagine it may have something to do with the data-config.xml file. In it I don't know if a certain path to the MySQL driver is being assumed. I downloaded the MySQL JDBC driver from here:

http://dev.mysql.com/downloads/connector/j/3.1.html

and put it in my /solr/lib directory.

When I downloaded the driver and extracted it there was a bunch of folders inside one folder called "mysql-connector-java-3.0.17-ga".

I do notice that inside that there is a dir called: com and inside that mysql and inside that jbdc and inside that there is a file called Driver.class.

Is this what is being referenced from data-config.xml? If so why isn't the initial directory not mentioned.

Basically I have no idea what the issue is, can someone help please.

Answer

Anand Khatri picture Anand Khatri · Feb 5, 2013

just make sure you have these lines of code in solrconfig.xml file

<lib dir="../../../contrib/dataimporthandler/lib/" regex=".*\.jar" />
<lib dir="../../../dist/" regex="solr-dataimporthandler-\d.*\.jar" />

make sure the path of those jar files and those jar files should be available physically at that path. If you don't have then please add that and try to restart the tomacat server and hopefully it will be resolved.