Teguh Arief

Importing MySQL Databases Using Solr DataImportHandler

An illustration depicting the process of importing MySQL databases using Solr DataImportHandler.

Teguh Arief

Published on: September 24, 2020

Share:

While Solr functions as both a search engine and a distributed document database, its DataImportHandler (DIH) provides an efficient mechanism for importing MySQL databases using Solr DataImportHandler and subsequently indexing their content. The primary advantage of this approach for data ingestion is the elimination of additional software development and the rapid integration of your data source.

Table of structure:

  • virtual-host/docker-composer.yml
  • etc/opt/solr/lib/...
  • mysql-connector-java-5.1.46.jar
  • mysql-connector-java-5.1.46-bin.jar
  • ...
  • solr-dataimporthandler-4.10.1.jar
  • solr-dataimporthandler-extras-4.10.1.jar
  • ...
  • mycore/...
  • data-config.xml
  • ...
  • schema.xml
  • ...
  • solrconfig.xml
  • ...
  • solr.xml

Docker Container

To begin importing MySQL databases using Solr DataImportHandler, first pull the necessary Docker images:

$ docker pull mysql:5.7.19
$ docker pull 2degrees/solr4

Setup

docker-compose.yml

Configure your `docker-compose.yml` to define both MySQL and Solr services, facilitating the process of importing MySQL databases using Solr DataImportHandler:

version: '2'
services:
    mysql:
        image: mysql:5.7.19
        environment:
            - MYSQL_USER=root
            - MYSQL_ROOT_PASSWORD=123456
        ports:
            - "3306:3306"
    solr:
        image: 2degrees/solr4
        volumes:
            - "./etc/opt/solr:/etc/opt/solr"
            - "./etc/opt/solr/mycore:/etc/opt/solr/mycore"
            - "./etc/opt/solr/lib:/etc/opt/solr/lib"
        ports:
            - "8983:8983"
        links:
            - mysql

Download Apache Solr DataImportHandler and MySQL connector

Copy `solr-dataimporthandler-*.jar` from the downloaded archive to the `lib` folder. This is crucial for importing MySQL databases using Solr DataImportHandler. Copy `mysql-connector-java-*.jar` from its downloaded archive to the `lib` folder.

solr.xml

To enable the DIH, add the following lines to `solr.xml`:

<solr>
  <cores adminPath="/admin/cores" defaultHost="127.0.0.1" />
</solr>

solrconfig.xml

Edit `solrconfig.xml` by adding the following configuration for the DataImportHandler:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

And add these lines to `solrconfig.xml` to specify your `data-config.xml`:

<config>
  <dataDir>data-config.xml</dataDir>
</config>

data-config.xml for MySQL database

The `data-config.xml` file will define the data to import from your MySQL data store, central to importing MySQL databases using Solr DataImportHandler:

<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://mysql:3306/your_database_name"
              user="root"
              password="123456" />
  <document>
    <entity name="your_table" query="SELECT * FROM your_table">
      <field column="id" name="id" />
      <field column="your_column1" name="your_column1" />
      <field column="your_column2" name="your_column2" />
      <!-- Add more fields as needed -->
    </entity>
  </document>
</dataConfig>

schema.xml

Edit `schema.xml` to align with the fields defined in `data-config.xml` for proper indexing:

<!-- Basic field types -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
<fieldType name="int" class="solr.IntPointField" docValues="true" />
<!-- Define fields that match your data-config.xml -->
<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
<field name="your_column1" type="string" indexed="true" stored="true"/>
<field name="your_column2" type="string" indexed="true" stored="true"/>
<!-- Add more fields as needed -->

<!-- Unique Key -->
<uniqueKey>id</uniqueKey>

Execute

Now, a Full Import operation can be initiated either through the Solr admin screen or by accessing a URL similar to `http://192.168.99.100:8983/solr/mycore/dataimport?command=full-import`. Solr will display the DataImportHandler's configuration and allow you to start and monitor the status of import commands, as defined by the options selected on the Solr screen and within your configuration files, thereby successfully importing MySQL databases using Solr DataImportHandler.

Related Posts

Illustrated guide for creation search engine on Laravel sites using Solr.

Creation Search Engine on Laravel Sites Using Solr

Build powerful search on Laravel sites using Solr! This guide covers Solr configuration, Solarium integration, and best practices for efficient search.

Read More
Building a NestJS CRUD application with MongoDB and Mongoose for a superior backend.

NestJS CRUD with MongoDB using Mongoose

Learn to build a robust backend with NestJS, covering CRUD operations using MongoDB and Mongoose, from setup to creating, reading, and updating.

Read More
An illustration demonstrating the process of running Nginx virtual host for Laravel sites using Docker.

Running Nginx Virtual Host for Laravel Sites Using Docker

Run Laravel sites on Nginx virtual hosts using Docker. Step-by-step guide for setting up your local development environment efficiently.

Read More