Solr is the open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world’s largest internet sites.
To learn more about Apache Solr's features in detail, and how it compares with other popular cloud search engines, refer to this article: http://harish11g.blogspot.in/2013/01/amazon-cloudsearch-vs-apache-solr_16.html
A single Solr server can handle only a certain number of requests before performance starts to suffer. In such not-so-uncommon scenarios, it is best to set up a Solr master-slave cluster so that the load can be balanced effectively across the nodes. The master usually takes up the task of index updates, while the slaves poll the master for updates and handle the ever-increasing search requests.
This article explains Solr's index replication, which works over HTTP, and how to set it up using Solr 3.6. Let’s get started!
Note: Solr has since released 4.x with better features for replication, sharding and high availability. Please check the related SolrCloud articles listed at the end of this post to understand more about SolrCloud 4.x.
Index Replication
A master-slave replication setup includes both index replication and (optionally) configuration file replication. Index replication, as the phrase indicates, is the replication of the Lucene index from the master to the slaves. The slaves poll the master for any updates, and the master sends a delta of the index so that everyone stays in sync.
Setting up master-slave replication
Open the file solrconfig.xml and add the following -

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <!-- Replicate on 'startup' and 'commit'. 'optimize' is also a valid value for replicateAfter. -->
    <str name="enable">${enable.master:false}</str>
    <str name="replicateAfter">startup</str>
    <str name="replicateAfter">commit</str>
    <str name="commitReserveDuration">00:00:10</str>
  </lst>
  <lst name="slave">
    <str name="enable">${enable.slave:false}</str>
    <str name="masterUrl">http://master_server_ip:solr_port/solr/replication</str>
    <str name="pollInterval">00:00:20</str>
  </lst>
</requestHandler>
Note that ${enable.master:false} and ${enable.slave:false} default to false, indicating that this machine is currently set up as neither a master nor a slave. These settings HAVE to be overridden by specifying the values in the file solrcore.properties, which is located under the conf directory of each core’s instance directory.
On the master server, open the file solrcore.properties and add the following -
enable.master=true
enable.slave=false
On the slave server, open the file solrcore.properties and add the following -
enable.master=false
enable.slave=true
Fire up these machines and you have a master-slave Solr cluster ready!
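Once both servers are up, you can sanity-check the setup over HTTP via the ReplicationHandler itself. Below is a minimal Python sketch (not from the original article) that compares the index version reported by the master and a slave; the host names, port 8983 and the /solr core path are assumptions, so adjust them to your deployment, and verify the exact JSON keys against your Solr version.

import json
from urllib.request import urlopen

# Assumed hosts, port and core path -- adjust to your deployment.
MASTER = "http://master_server_ip:8983/solr"
SLAVE = "http://slave_server_ip:8983/solr"

def index_version(base_url):
    # The ReplicationHandler reports the latest replicable index version.
    with urlopen(base_url + "/replication?command=indexversion&wt=json") as resp:
        return json.load(resp)["indexversion"]

master_v = index_version(MASTER)
slave_v = index_version(SLAVE)
print("master indexversion:", master_v)
print("slave  indexversion:", slave_v)
print("in sync" if master_v == slave_v else "slave is still catching up")

If the slave is healthy, its version should converge to the master's within one pollInterval of an update.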
Repeater
A master may be able to serve only so many slaves without adversely affecting performance. Some organizations have deployed slave servers across multiple data centers. If each slave downloads the index from a remote data center, the resulting download may consume too much network bandwidth. To avoid performance degradation in cases like this, you can configure one or more slaves as repeaters. A repeater is simply a node that acts as both a master and a slave (enable.master=true and enable.slave=true).
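In other words, a repeater’s solrcore.properties carries both flags, following the same convention as above:

enable.master=true
enable.slave=true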
Note: Be sure to have replicateAfter 'commit' set up on the repeater even if replicateAfter is set to 'optimize' on the main master. This is because on a repeater (or any slave), only a commit is called after the index is downloaded; optimize is never called on slaves.
As in our lives, nothing is certain in the lives of machines either! Any machine can go down at any time, and there is nothing we can do about it except plan for such inevitable cases and have a mitigation strategy in place.
Mitigation Strategies when master is down
Since master-slave replication is done pull-style, there are always inconsistencies between the indices of the master and the slaves.
When some loss of updates is acceptable -
Mitigation Plan 1: Every machine is either a master or a slave and not BOTH
1. Nominate one of the slaves as master
2. Stop the Solr server on the new master
3. Change the solrcore.properties to promote it as master (see the example after this list)
4. Start the Solr server on the new master
5. Detach the EIP from the failed master and associate it with the newly nominated master
6. That’s it!
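For step 3, the promoted node’s solrcore.properties simply flips the flags, mirroring the master configuration shown earlier:

enable.master=true
enable.slave=false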
Mitigation Plan 2: Every machine is both master and slave (concept of Repeater)
1. Nominate one of the instances as master
2. Detach the EIP from the failed master and associate it with the newly nominated master
3. That’s it!
In each of the mitigation plans, the first step is to nominate a slave. The obvious question arises: how do we decide which slave is the best fit? We have to choose the slave whose index is closest to the master’s. To carry out this operation, use the LukeRequestHandler (enabled by default) and query the version parameter. This parameter shows the timestamp, in milliseconds, of the last index operation. Pick the slave that satisfies the following conditions (a small automation sketch follows the list) -
1. Retrieve the version attribute of the master from S3. (Aside: Since the master is currently down, there is no way to get the version from the master itself. Hence, you have to query and store the master’s version in S3 periodically while the master is running!)
2. Query the version on all Solr slaves.
3. Among the slaves, pick the slave that has the highest version. That is the best nomination.
4. As a double check, verify that the nominated slave’s version is closest or equal to that of the master. (Replicating a master index means copying the index as-is from the master to the slaves. That’s why lastModified and version are the same on a slave once replication is successful. This is also why a slave’s version can never be greater than the master’s.)
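Steps 2 and 3 are easy to automate. The sketch below is a minimal illustration, not part of the original article: it assumes the default port 8983, a core served at /solr, a hypothetical list of slave hosts, and that the LukeRequestHandler’s JSON response exposes the version under its index section (verify this against your Solr version).

import json
from urllib.request import urlopen

# Hypothetical slave hosts -- replace with your real slave IPs or hostnames.
SLAVES = ["http://slave1:8983/solr", "http://slave2:8983/solr"]

def luke_index_version(base_url):
    # The LukeRequestHandler reports index metadata, including version:
    # the timestamp in milliseconds of the last index operation.
    with urlopen(base_url + "/admin/luke?show=index&wt=json") as resp:
        return json.load(resp)["index"]["version"]

versions = {url: luke_index_version(url) for url in SLAVES}
best = max(versions, key=versions.get)
print("slave index versions:", versions)
print("best nomination:", best)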
However, in production environments any loss of updates is not acceptable, and hence more robust mitigation plans need to be in place.
Mitigation Plan 1
1. Detach the EBS volume from the master and mount it on any slave in the same AZ (this is because EBS volumes can only be attached to instances within the same Availability Zone)
2. Reattach the EIP from the master to the slave
3. That’s it!
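A rough automation sketch of this failover, using boto3, is shown below; it is not from the original article. The volume ID, instance IDs, device name, Elastic IP and region are hypothetical placeholders, and for a VPC Elastic IP you would pass AllocationId instead of PublicIp.

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

# Hypothetical identifiers -- substitute your own.
VOLUME_ID = "vol-0123456789abcdef0"    # EBS volume holding the Solr index
FAILED_MASTER = "i-0aaaaaaaaaaaaaaaa"
NEW_MASTER = "i-0bbbbbbbbbbbbbbbb"     # a slave in the same AZ as the volume
ELASTIC_IP = "203.0.113.10"

# 1. Detach the index volume from the failed master and attach it to the slave.
ec2.detach_volume(VolumeId=VOLUME_ID, InstanceId=FAILED_MASTER, Force=True)
ec2.get_waiter("volume_available").wait(VolumeIds=[VOLUME_ID])
ec2.attach_volume(VolumeId=VOLUME_ID, InstanceId=NEW_MASTER, Device="/dev/sdf")

# 2. Move the Elastic IP so clients keep talking to the same address.
ec2.associate_address(InstanceId=NEW_MASTER, PublicIp=ELASTIC_IP)

After attaching the volume, you would mount it and point the new master’s Solr data directory at it before starting Solr.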
Mitigation Plan 2
1. Use GlusterFS as the network file system. The index is automatically replicated across AZs and regions.
2. Reattach the EIP to the secondary master.
3. That’s it!
Mitigation Plan 3
1. Use the SolrCloud feature of Solr 4.0! To know more about SolrCloud deployment strategies, check: http://harish11g.blogspot.com/2013/03/Apache-Solr-cloud-on-Amazon-EC2-AWS-VPC-implementation-deployment.html
The original article was authored by Vijay. He can be reached at in.linkedin.com/in/vijayolety/
Related Articles:
Introduction to Apache SolrCloud on AWS
Apache SolrCloud Implementation on Amazon VPC
Configuring Apache SolrCloud on Amazon VPC
Apache SolrCloud on AWS FAQ
Part 1: Comparison Analysis: Amazon CloudSearch vs Apache Solr