HBase & Solr Search Integration

HBase & Solr – Near Real-Time Indexing and Search


A. HBase Table

B. Solr collection on HDFS

C. Lily HBase Indexer

D. Morphline Configuration file

Once the Solr server is ready, we can configure our collection (in SolrCloud), which will be linked to the HBase table.

  • Add the required properties to the hbase-site.xml file.
  • Add the required properties to /etc/hbase-solr/conf/hbase-indexer-site.xml. This enables the Lily indexer to reach the HBase cluster for indexing. Substitute your own values for the properties: replace hbase-cluster-zookeeper with the ZooKeeper quorum given in hbase-site.xml; in a local environment the value is localhost.
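
The property lists for these two files are not reproduced in this text, so the following is only a sketch under assumptions: it uses the replication and ZooKeeper settings commonly shown for a Lily HBase Indexer setup and writes them to two scratch files (the file names are hypothetical) so they can be reviewed before being merged into the real configuration. hbase-cluster-zookeeper is a placeholder for your own ZooKeeper host.

```shell
# Fragment to merge inside <configuration> in hbase-site.xml.
# Assumption: these are the usual Lily HBase Indexer replication settings,
# not values taken from this document.
cat > hbase-site-replication.xml <<'EOF'
<property>
  <name>hbase.replication</name>
  <value>true</value>
</property>
<property>
  <name>replication.source.ratio</name>
  <value>1.0</value>
</property>
<property>
  <name>replication.source.nb.capacity</name>
  <value>1000</value>
</property>
<property>
  <name>replication.replicationsource.implementation</name>
  <value>com.ngdata.sep.impl.SepReplicationSource</value>
</property>
EOF

# Fragment to merge into /etc/hbase-solr/conf/hbase-indexer-site.xml.
# Replace hbase-cluster-zookeeper with your quorum (localhost for a local setup).
cat > hbase-indexer-site-fragment.xml <<'EOF'
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>hbase-cluster-zookeeper</value>
</property>
<property>
  <name>hbaseindexer.zookeeper.connectstring</name>
  <value>hbase-cluster-zookeeper:2181</value>
</property>
EOF
```

Port 2181 is the ZooKeeper default; adjust it if your cluster uses a different client port.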

  • Restart the affected services
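
The list of services is not preserved in this text. As a hedged sketch, assuming a package-based installation where HBase and the indexer run as init services, the restart could look like this. The service names are assumptions; on a cluster managed by Cloudera Manager, restart the services from the Cloudera Manager UI instead.

```shell
# Assumed service names for a package-based install; adjust to the roles on each host.
SERVICES="hbase-master hbase-regionserver hbase-solr-indexer"

for svc in $SERVICES; do
  if [ -x "/etc/init.d/$svc" ]; then
    # Restart only services that are actually installed on this host.
    sudo service "$svc" restart
  else
    echo "skipping $svc: not installed on this host"
  fi
done
```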

  • Create an HBase table with replication enabled
Since the HBase Indexer works by acting as a replication sink, we need to make sure that replication is enabled in HBase. You can activate replication using Cloudera Manager by clicking HBase Service->Configuration->Backup and ensuring that “Enable HBase Replication” and “Enable Indexing” are both checked.
In addition, the column family in the HBase table that needs to be replicated must itself have replication enabled. This is done by setting the REPLICATION_SCOPE flag to 1 when the column family is created.
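
A minimal sketch of creating such a table from the command line follows. The table name record is a hypothetical example; the column family data is the one used later in this document. The statement is piped into the HBase shell when it is available and simply printed otherwise.

```shell
TABLE=record    # hypothetical table name
FAMILY=data     # column family referred to in this document

# REPLICATION_SCOPE => 1 marks the column family for replication.
HBASE_CMD="create '$TABLE', {NAME => '$FAMILY', REPLICATION_SCOPE => 1}"

if command -v hbase >/dev/null 2>&1; then
  echo "$HBASE_CMD" | hbase shell
else
  # No HBase client on this host: show the statement to run in the HBase shell.
  echo "$HBASE_CMD"
fi
```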

  • Create the SolrCloud collection configuration
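
The command for this step is missing from this text. Assuming the solrctl tool from Cloudera Search is installed, a template configuration can be generated into the directory used in this document roughly as follows.

```shell
# Instance directory path as used in this document.
INSTANCE_DIR="$HOME/hbase-collection1"

if command -v solrctl >/dev/null 2>&1; then
  # Writes a template configuration, including conf/schema.xml, into INSTANCE_DIR.
  solrctl instancedir --generate "$INSTANCE_DIR"
else
  echo "solrctl not found; run this on a host with the Solr client tools installed"
fi
```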

Once the configuration has been generated, go to $HOME/hbase-collection1/conf, which contains the Solr configuration files. Edit the schema.xml file there to define your own schema; for this use case, add a field tag for the HBase column family (data).
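
The exact tag is not preserved in this text. A plausible field definition, with the type and the indexed/stored attributes as assumptions, is shown by the sketch below; it prints the line to add among the other field definitions in schema.xml and reports whether a data field is already present.

```shell
SCHEMA="$HOME/hbase-collection1/conf/schema.xml"

# Assumed definition: a stored, indexed string field named after the column family.
FIELD='<field name="data" type="string" indexed="true" stored="true"/>'

echo "Add to the field definitions in $SCHEMA:"
echo "  $FIELD"

# After editing, confirm that the field is present.
if [ -f "$SCHEMA" ] && grep -q '<field name="data"' "$SCHEMA"; then
  echo "data field found in schema.xml"
fi
```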

  • Create a SolrCloud collection with the above schema.xml
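
As a sketch, assuming solrctl is available and that the collection is named after its instance directory (the name hbase-collection1 is an assumption), the edited configuration can be uploaded and the collection created like this.

```shell
COLLECTION=hbase-collection1          # assumed collection name
INSTANCE_DIR="$HOME/$COLLECTION"      # directory holding the edited conf/schema.xml

if command -v solrctl >/dev/null 2>&1; then
  # Upload the configuration to ZooKeeper, then create the collection from it.
  solrctl instancedir --create "$COLLECTION" "$INSTANCE_DIR" &&
    solrctl collection --create "$COLLECTION"
else
  echo "solrctl not found; run this on a host with the Solr client tools installed"
fi
```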

Creating a Lily HBase Indexer configuration
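
The indexer configuration itself is not included in this text. A minimal sketch follows, assuming the hypothetical table name record and a morphline file at /etc/hbase-solr/conf/morphlines.conf; it writes the configuration to a scratch file named morphline-hbase-mapper.xml (also an assumed name). The configuration names the HBase table to index and the mapper class that hands each change to a morphline.

```shell
# Indexer definition: which table to index and which morphline file to use.
# Assumptions: table name "record", morphline path /etc/hbase-solr/conf/morphlines.conf.
cat > morphline-hbase-mapper.xml <<'EOF'
<?xml version="1.0"?>
<indexer table="record"
         mapper="com.ngdata.hbaseindexer.morphline.MorphlineResultToSolrMapper">
  <!-- Path to the morphline configuration that maps HBase cells to Solr fields -->
  <param name="morphlineFile" value="/etc/hbase-solr/conf/morphlines.conf"/>
</indexer>
EOF
```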

Creating a Morphline Configuration File
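
The morphline file is likewise missing here. The sketch below is an assumption based on the column family data used earlier: it maps every cell in that family to a Solr field of the same name and logs each output record at trace level. The morphline id morphline1 and the scratch file name morphlines.conf are illustrative.

```shell
# Morphline that copies all cells of column family "data" into the Solr field "data".
cat > morphlines.conf <<'EOF'
morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.morphline.**", "com.ngdata.**"]

    commands : [
      {
        extractHBaseCells {
          mappings : [
            {
              inputColumn : "data:*"
              outputField : "data"
              type : string
              source : value
            }
          ]
        }
      }

      { logTrace { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
EOF
```

The outputField value has to match a field defined in the collection's schema.xml, otherwise Solr rejects the documents.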