
Teguh Arief
Published on: September 24, 2020
While Solr functions as both a search engine and a distributed document database, its DataImportHandler (DIH) provides an efficient mechanism for importing MySQL databases using Solr DataImportHandler and subsequently indexing their content. The primary advantage of this approach for data ingestion is the elimination of additional software development and the rapid integration of your data source.
Table of structure:
- virtual-host/docker-composer.yml
- etc/opt/solr/lib/...
- mysql-connector-java-5.1.46.jar
- mysql-connector-java-5.1.46-bin.jar
- ...
- solr-dataimporthandler-4.10.1.jar
- solr-dataimporthandler-extras-4.10.1.jar
- ...
- mycore/...
- data-config.xml
- ...
- schema.xml
- ...
- solrconfig.xml
- ...
- solr.xml
Docker Container
To begin importing MySQL databases using Solr DataImportHandler, first pull the necessary Docker images:
$ docker pull mysql:5.7.19 $ docker pull 2degrees/solr4
Setup
docker-compose.yml
Configure your `docker-compose.yml` to define both MySQL and Solr services, facilitating the process of importing MySQL databases using Solr DataImportHandler:
version: '2' services: mysql: image: mysql:5.7.19 environment: - MYSQL_USER=root - MYSQL_ROOT_PASSWORD=123456 ports: - "3306:3306" solr: image: 2degrees/solr4 volumes: - "./etc/opt/solr:/etc/opt/solr" - "./etc/opt/solr/mycore:/etc/opt/solr/mycore" - "./etc/opt/solr/lib:/etc/opt/solr/lib" ports: - "8983:8983" links: - mysql
Download Apache Solr DataImportHandler and MySQL connector
Copy `solr-dataimporthandler-*.jar` from the downloaded archive to the `lib` folder. This is crucial for importing MySQL databases using Solr DataImportHandler. Copy `mysql-connector-java-*.jar` from its downloaded archive to the `lib` folder.
solr.xml
To enable the DIH, add the following lines to `solr.xml`:
<solr> <cores adminPath="/admin/cores" defaultHost="127.0.0.1" /> </solr>
solrconfig.xml
Edit `solrconfig.xml` by adding the following configuration for the DataImportHandler:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">data-config.xml</str> </lst> </requestHandler>
And add these lines to `solrconfig.xml` to specify your `data-config.xml`:
<config> <dataDir>data-config.xml</dataDir> </config>
data-config.xml for MySQL database
The `data-config.xml` file will define the data to import from your MySQL data store, central to importing MySQL databases using Solr DataImportHandler:
<dataConfig> <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://mysql:3306/your_database_name" user="root" password="123456" /> <document> <entity name="your_table" query="SELECT * FROM your_table"> <field column="id" name="id" /> <field column="your_column1" name="your_column1" /> <field column="your_column2" name="your_column2" /> <!-- Add more fields as needed --> </entity> </document> </dataConfig>
schema.xml
Edit `schema.xml` to align with the fields defined in `data-config.xml` for proper indexing:
<!-- Basic field types --> <fieldType name="string" class="solr.StrField" sortMissingLast="true" /> <fieldType name="int" class="solr.IntPointField" docValues="true" /> <!-- Define fields that match your data-config.xml --> <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" /> <field name="your_column1" type="string" indexed="true" stored="true"/> <field name="your_column2" type="string" indexed="true" stored="true"/> <!-- Add more fields as needed --> <!-- Unique Key --> <uniqueKey>id</uniqueKey>
Execute
Now, a Full Import operation can be initiated either through the Solr admin screen or by accessing a URL similar to `http://192.168.99.100:8983/solr/mycore/dataimport?command=full-import`. Solr will display the DataImportHandler's configuration and allow you to start and monitor the status of import commands, as defined by the options selected on the Solr screen and within your configuration files, thereby successfully importing MySQL databases using Solr DataImportHandler.
Related Posts

Creation Search Engine on Laravel Sites Using Solr
Build powerful search on Laravel sites using Solr! This guide covers Solr configuration, Solarium integration, and best practices for efficient search.
Read More
NestJS CRUD with MongoDB using Mongoose
Learn to build a robust backend with NestJS, covering CRUD operations using MongoDB and Mongoose, from setup to creating, reading, and updating.
Read More
Running Nginx Virtual Host for Laravel Sites Using Docker
Run Laravel sites on Nginx virtual hosts using Docker. Step-by-step guide for setting up your local development environment efficiently.
Read More