SearchStax Help Center


Replicas Missing on some Cluster Nodes

In the SearchStax Managed Search service, the easiest mistake to make when creating a new Solr collection is to specify the wrong number of replicas. The number of replicas should equal the number of servers in the cluster. (See What is a collection/core/shard/replica? for help in sorting out these terms.)

The Problem

We showed you how to create a new collection in the Quick Start lesson:

curl 'https://ss123456-us-west-1-aws.searchstax.com/solr/admin/collections?action=CREATE&name=testcollection&collection.configName=test1&numShards=1&replicationFactor=1&maxShardsPerNode=1'

This example works perfectly for creating a collection on a single-node system. However, when clients scale up to a cluster, they sometimes forget to raise replicationFactor=1 to match the number of nodes: replicationFactor=2 for a two-node cluster, replicationFactor=3 for a three-node cluster.

This mistake creates a collection that resides on one node only. That node does all the work while the rest of the cluster is idle.
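One way to avoid the mismatch is to derive replicationFactor from the node count instead of typing it by hand. The following is a minimal sketch, not an official SearchStax tool: the function name build_create_url is hypothetical, and the other parameters are copied from the Quick Start command above.

```shell
#!/bin/sh
# Sketch: build a Collections API CREATE URL whose replicationFactor
# equals the number of nodes in the cluster.
# build_create_url is a hypothetical helper name, not part of Solr.
# Arguments: base URL, collection name, config name, node count.
build_create_url() {
  base=$1
  collection=$2
  config=$3
  nodes=$4
  printf '%s/solr/admin/collections?action=CREATE&name=%s&collection.configName=%s&numShards=1&replicationFactor=%s&maxShardsPerNode=1\n' \
    "$base" "$collection" "$config" "$nodes"
}

# Example for a three-node cluster (not executed here):
# curl "$(build_create_url 'https://ss123456-us-west-1-aws.searchstax.com' testcollection test1 3)"
```

The function only prints the URL, so you can inspect it before passing it to curl.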

The Symptoms

A replicationFactor error comes to our attention in one of several ways:

  • A client complains about high query latency. One server gets overloaded handling a high volume of queries.
  • A client notices that some servers don’t have the right number of cores (replicas).
  • A client complains that a backup/restore operation missed an index.
  • Query service is interrupted when one server stops running. A high-availability cluster should continue to serve queries when a single server is down.
  • A client adds nodes to a cluster. The replica count must be manually changed.

We confirm a replicationFactor error by looking in the deployment’s Solr Dashboard.

  1. From the SearchStax Cloud Dashboard, click the name of the deployment.
  2. On the Deployment Servers page, click the Solr HTTP Endpoint.
  3. On the Solr Dashboard, click Cloud. The graph shows each collection as <collection> → <shard> → <list of replicas>. For a three-node cluster, you should see three replicas.

If you see only one replica where you expected three, you’ll have to add two replicas to the collection. Fortunately, this is easy to do.
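The same check can be made from the command line with the Collections API CLUSTERSTATUS action, which returns the cluster layout as JSON. The sketch below counts replicas by counting "node_name" fields in that response. It assumes each replica entry carries exactly one such field, and the helper name count_replicas is hypothetical. For a collection with several shards, the number is the total across all shards.

```shell
#!/bin/sh
# Sketch: count the replicas listed in a CLUSTERSTATUS JSON response
# read from standard input.
# Assumption: every replica entry contains exactly one "node_name" field.
# count_replicas is a hypothetical helper name.
count_replicas() {
  # grep exits non-zero when nothing matches; "|| true" keeps the
  # pipeline usable under "set -e" so that the result is simply 0.
  { grep -o '"node_name"' || true; } | wc -l | tr -d ' '
}

# Example (not executed here):
# curl -s 'https://ss123456-us-west-1-aws.searchstax.com/solr/admin/collections?action=CLUSTERSTATUS&collection=testcollection' | count_replicas
```

If the count is lower than the number of nodes in the cluster, some nodes are missing a replica.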

The Solution

The solution is to manually add replicas to the nodes that lack them.

The Solr API lets us add a new replica directly to a collection by sending a message to the Solr HTTP Endpoint. The syntax of the message is:

curl '<Solr HTTP Endpoint>/solr/admin/collections?action=ADDREPLICA&collection=<collection-name>&shard=shard1&node=<node-name>'

You already know the values of the Solr HTTP Endpoint and the collection-name. The node-name is not as obvious.

Make a note of the IP address of the single replica that appeared in the graph above. That’s the IP of the server that already has a replica. Now we need the names of the two servers that need to have a replica added.

Click Tree under Cloud. Open the list of live_nodes.

Those are the node-names. Identify the two nodes that need replicas, and plug in all the required values in the ADDREPLICA message.
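Identifying the nodes that need replicas amounts to subtracting the nodes that already hold a replica from the live_nodes list. Here is a minimal sketch of that comparison. The helper name nodes_missing_replica is hypothetical, and both lists are assumed to be space-separated node-names that you copied from the Solr Dashboard.

```shell
#!/bin/sh
# Sketch: print each live node that does not yet hold a replica.
# nodes_missing_replica is a hypothetical helper name.
# $1: space-separated list of live nodes
# $2: space-separated list of nodes that already hold a replica
nodes_missing_replica() {
  for node in $1; do
    case " $2 " in
      *" $node "*) ;;                 # already holds a replica: skip it
      *) printf '%s\n' "$node" ;;     # needs a replica: print it
    esac
  done
}

# Example with placeholder node-names (substitute your own):
# nodes_missing_replica '<node-1> <node-2> <node-3>' '<node-1>'
```

Each line of output is a node-name that can be used as the node value in the ADDREPLICA message.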

This is a practical example that we used with Solr 7.3.1:

curl 'https://ss123456-us-west-1-aws.searchstax.com/solr/admin/collections?action=ADDREPLICA&collection=testcollection&shard=shard1&node=10.0.1.164:8983_solr'

The successful creation of a new replica returns a message similar to this one:

{
  "responseHeader":{
    "status":0,
    "QTime":582},
  "success":{
    "10.0.1.164:8983_solr":{
      "responseHeader":{
        "status":0,
        "QTime":435},
      "core":"testcollection_shard1_replica_n3"}}}

Repeat this step for any remaining nodes.
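For clusters with more than a few nodes, the repetition can be scripted. The sketch below prints one ADDREPLICA command per node so you can review the commands before running them. It also includes a rough check for a successful response. Both helper names are hypothetical, shard1 is assumed as in the example above, and the check is only a heuristic: it looks for a "success" key and the absence of "failure", "exception", or "error" keys in the JSON.

```shell
#!/bin/sh
# Sketch: print one ADDREPLICA curl command per node-name.
# print_addreplica_commands is a hypothetical helper name.
# Arguments: base URL, collection name, then one or more node-names.
print_addreplica_commands() {
  base=$1
  collection=$2
  shift 2
  for node in "$@"; do
    printf "curl '%s/solr/admin/collections?action=ADDREPLICA&collection=%s&shard=shard1&node=%s'\n" \
      "$base" "$collection" "$node"
  done
}

# Heuristic check of an ADDREPLICA response read from standard input.
# Returns 0 when the JSON has a "success" key and no "failure",
# "exception", or "error" key; returns 1 otherwise.
addreplica_succeeded() {
  response=$(cat)
  case $response in
    *'"failure"'*|*'"exception"'*|*'"error"'*) return 1 ;;
  esac
  case $response in
    *'"success"'*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example (not executed here):
# print_addreplica_commands 'https://ss123456-us-west-1-aws.searchstax.com' testcollection '<node-2>' '<node-3>'
```

Printing the commands instead of executing them is a deliberate choice in this sketch: it lets you confirm each node-name against the live_nodes list first.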

When you return to the Cloud graph display, you will see that your collection now has three replicas.

Solr Dashboard displays wrong Replication Factor!

Solr has a known oddity on the Collections page. When you select a collection, it displays the original Replication Factor even if you have added replicas to the collection.

[Screenshot: Solr Collections page showing the Replication Factor of the selected collection]

This screenshot shows a collection that has two replicas, but the Replication Factor says “1”. That “replication factor” is not used for anything and cannot be reset. You may ignore it. If you see the right number of active replicas, there is nothing to be concerned about.

Questions?

Do not hesitate to contact the SearchStax Support Desk.

