In this post we give a basic introduction to Apache Solr and describe the procedure for installing Apache Solr on an Ubuntu machine.
Apache Solr Overview:
What is Apache Solr?
Apache Solr is another project from the Apache Software Foundation. It is an open source enterprise search platform built on Apache Lucene. Because Solr is based on the open source Lucene search library, the two names are sometimes used together as Lucene/Solr.
- Solr provides a blazing-fast search platform
- Supports full-text search, hit highlighting, faceted search and dynamic clustering.
- Solr integrates easily with databases and offers rich document handling. We can easily index XML, Word, PDF and other documents into Solr.
- Solr is highly scalable through its distributed search and index replication features.
- Solr is written in Java and runs inside a Java servlet container such as Tomcat, Jetty, or Resin.
- Solr is highly reliable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, and centralized configuration.
- Supports near real-time indexing, for example of data collected via MorphlineSolrSink in a Flume setup.
- Optimized for high-volume traffic and built on open standard interfaces: XML, JSON and HTTP. Search results can be returned in several formats, such as JSON, XML and text.
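As a rough illustration of the HTTP interface and the selectable response formats, the sketch below builds a select query URL that could be requested with curl. The base URL http://localhost:8080/solr, the core name collection1 and the helper name solr_select_url are assumptions for illustration, not values taken from this post; wt is Solr's response-writer parameter.

```shell
# Build a Solr select URL. Arguments: base URL, core name, query, response format.
# Note: the query is inserted as-is, so it must already be URL-safe.
solr_select_url() {
  printf '%s/%s/select?q=%s&wt=%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical usage once Solr is running (not executed here):
#   curl "$(solr_select_url http://localhost:8080/solr collection1 '*:*' json)"
#   curl "$(solr_select_url http://localhost:8080/solr collection1 '*:*' xml)"
```

Changing only the last argument switches the response format, which is why the same query can be read as JSON or XML.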
Apache Solr Installation on Ubuntu:
In this section we will install Apache Solr 4.10.2 (the latest stable release of Solr at the time of writing this post) on an Ubuntu 14.04 machine.
- As Solr is written in Java, it requires at least JDK 1.7 to be available on the machine. Check the Java version with the $ java -version command in a terminal.
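The version check can be scripted along these lines. This is a minimal sketch: the function names java_version_string and is_java7_or_later are made up for illustration, and it assumes the usual version line format such as java version "1.7.0_79".

```shell
# Extract the quoted version from `java -version` (which prints to stderr).
java_version_string() {
  java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}'
}

# Succeed when a version such as "1.7.0_79" or "11.0.2" is Java 7 or later.
is_java7_or_later() {
  _ver="$1"
  _major="${_ver%%.*}"   # text before the first dot
  _rest="${_ver#*.}"     # text after the first dot
  _minor="${_rest%%.*}"  # second version component
  case "$_major" in ''|*[!0-9]*) return 1 ;; esac
  if [ "$_major" -eq 1 ]; then
    # Old scheme: 1.6, 1.7, 1.8 mean Java 6, 7, 8.
    case "$_minor" in ''|*[!0-9]*) return 1 ;; esac
    [ "$_minor" -ge 7 ]
  else
    [ "$_major" -ge 7 ]
  fi
}

if command -v java >/dev/null 2>&1; then
  found="$(java_version_string)"
  if is_java7_or_later "$found"; then
    echo "Java $found is recent enough for Solr 4.10.2"
  else
    echo "Java $found is older than JDK 1.7"
  fi
else
  echo "java was not found on the PATH"
fi
```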
As Solr runs inside a Java servlet container, a servlet container (Tomcat, Jetty or Resin) must be available for Solr to run. By default the Apache Solr distribution contains a working Jetty servlet container, with settings optimized for Solr, inside the example directory. For the example tutorials we can use the Jetty servlet container, but in this post we use the popular Tomcat as the servlet container.
- First check whether the Java version is JDK 1.7 or later; if not, re-install the JDK after purging the older version.
- Tomcat is an open source implementation of the Java Servlet and JavaServer Pages technologies, released by the Apache Software Foundation. Install the latest stable version of Tomcat if it is not already available on the Ubuntu machine. Below are the installation instructions for Tomcat 7 on Ubuntu 14.04.
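A minimal sketch of the installation, assuming the stock Ubuntu 14.04 repositories, where the packages are named tomcat7 and tomcat7-admin (the latter provides the manager web application). The function name install_tomcat7 is only for illustration.

```shell
# Install Tomcat 7 and its admin web applications from the Ubuntu repositories.
install_tomcat7() {
  sudo apt-get update &&
  sudo apt-get install -y tomcat7 tomcat7-admin
}

# Run the installation with:
#   install_tomcat7
```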
Verify the status of the tomcat7 service with the service command, for example sudo service tomcat7 status.
We can also verify the Tomcat installation at http://<hostname or IP address>:8080. If the Tomcat default page loads, the installation is successful.
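The same check can be done from a terminal. The sketch below is one way to do it with curl; the helper names tomcat_url and tomcat_http_status are hypothetical, and port 8080 is taken from the URL above.

```shell
# Build the Tomcat URL for a host; the port defaults to 8080.
tomcat_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-8080}"
}

# Print the HTTP status code returned by Tomcat.
# 200 means the page loaded; 000 usually means no connection was made.
tomcat_http_status() {
  curl -s -o /dev/null -w '%{http_code}\n' "$(tomcat_url "$@")"
}

# Example (not executed here):
#   tomcat_http_status localhost
```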
We can start, stop, restart or check the status of the tomcat7 service with the commands below:
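A small wrapper for these actions might look as follows. It assumes the service is registered under the name tomcat7, as it is with the Ubuntu package; the function name tomcat7_service is made up for this sketch.

```shell
# Run a tomcat7 service action: start, stop, restart or status.
tomcat7_service() {
  case "$1" in
    start|stop|restart|status)
      sudo service tomcat7 "$1"
      ;;
    *)
      echo "usage: tomcat7_service {start|stop|restart|status}" >&2
      return 2
      ;;
  esac
}

# Examples (not executed here):
#   tomcat7_service status
#   tomcat7_service restart
```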
There are three important directories for Tomcat in an Ubuntu installation:
- /etc/tomcat7/Catalina : for configuration of new apps
- /usr/share/tomcat7/lib : library containing the required jar files or properties files
- /usr/share/tomcat7-root : for webapps
The alternative path to Tomcat, called CATALINA_BASE, is
Configure Tomcat Web Interface
By default, no user is included in the “manager-gui” role, which is required to operate the “/manager/html” web application. As we will open Solr in the web UI, we must define such a user by adding a login to our Tomcat server. We will do this by editing the tomcat-users.xml file.
Add the lines below between the <tomcat-users> and </tomcat-users> tags to add a new user.
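The lines to add typically have the shape printed by the sketch below. The username and password are placeholders, and the helper name tomcat_manager_user_xml is hypothetical; only the manager-gui role name comes from the text above.

```shell
# Print the XML lines that grant a user the manager-gui role.
# Arguments: username, password (values are not XML-escaped).
tomcat_manager_user_xml() {
  printf '  <role rolename="manager-gui"/>\n'
  printf '  <user username="%s" password="%s" roles="manager-gui"/>\n' "$1" "$2"
}

# Placeholder credentials; replace them before use.
tomcat_manager_user_xml "admin" "change-me"
```

Paste the printed lines between the tags, and restart Tomcat afterwards so that the new user is picked up.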
Save and quit the tomcat-users.xml file.
- Download the latest stable binary tarball of Apache Solr from the Apache download mirrors. In this post we are using solr-4.10.2.tgz.
- Copy the gzipped tarball into our preferred installation directory (usually /usr/lib/solr) and extract its contents.
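The two steps above can be sketched as follows. The download URL on archive.apache.org is an assumption (old releases are normally moved there from the mirrors), and the function name fetch_and_extract_solr is only for illustration.

```shell
SOLR_VERSION="4.10.2"
SOLR_TARBALL="solr-${SOLR_VERSION}.tgz"
# Assumed archive location; current mirrors may no longer carry this release.
SOLR_URL="https://archive.apache.org/dist/lucene/solr/${SOLR_VERSION}/${SOLR_TARBALL}"
SOLR_DIR="/usr/lib/solr"

# Download the tarball, copy it into SOLR_DIR and extract it there.
fetch_and_extract_solr() {
  wget "$SOLR_URL" &&
  sudo mkdir -p "$SOLR_DIR" &&
  sudo cp "$SOLR_TARBALL" "$SOLR_DIR/" &&
  sudo tar -xzf "$SOLR_DIR/$SOLR_TARBALL" -C "$SOLR_DIR"
}

# Run both steps with:
#   fetch_and_extract_solr
```

After extraction the contents should sit in /usr/lib/solr/solr-4.10.2, since the tarball unpacks into a directory named after the release.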