Ever since upgrading our deployment to Federate 7.17.3-27.2, we have started to see the following response from time to time from the siren/_search endpoint.
{
  "error" : {
    "root_cause" : [
      {
        "type" : "i",
        "reason" : "Unable to acquire a search lock on indices MY_INDEX_NAME",
        "suppressed" : [
          {
            "type" : "index_not_found_exception",
            "reason" : "no such index [MY_INDEX_NAME]",
            "index_uuid" : "XXXXXXXX",
            "index" : "MY_INDEX_NAME"
          }
        ]
      }
    ],
    "type" : "i",
    "reason" : "Unable to acquire a search lock on indices MY_INDEX_NAME",
    "suppressed" : [
      {
        "type" : "index_not_found_exception",
        "reason" : "no such index [MY_INDEX_NAME]",
        "index_uuid" : "XXXXXXXXXXXX",
        "index" : "MY_INDEX_NAME"
      }
    ]
  },
  "status" : 500
}
It does not occur every single time. Retrying the request can sometimes make the issue go away. I’m not sure why it thinks the index isn’t found because it’s definitely there and I have no issues querying it from the normal Elasticsearch /_search endpoints.
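For reference, this is roughly the shape of the two requests I’m comparing (placeholder index name and a trivial match_all body; the real queries are more involved):

# Plain Elasticsearch search -- this always succeeds
curl -s -H 'Content-Type: application/json' \
  'http://localhost:9200/MY_INDEX_NAME/_search' -d '{"query": {"match_all": {}}}'

# Same index through the Federate endpoint -- this is the one that intermittently returns the 500 above
curl -s -H 'Content-Type: application/json' \
  'http://localhost:9200/siren/MY_INDEX_NAME/_search' -d '{"query": {"match_all": {}}}'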
My current workaround is to clone the index, which seems to solve the problem, but within hours or days the same complaint starts appearing for a different index.
Is this a known issue with this version? Any ideas how to fix this?
Can you please confirm how you performed the Siren Federate upgrade? Was it a rolling upgrade, and what was the previous version of Siren Federate?
Also, please cross-check the disk space: generally, when disk usage reaches 90%-95%, Elasticsearch locks indices (marks them read-only) because its disk watermarks protect against the server running out of space.
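A few quick checks you can run (standard cat/cluster APIs, nothing Federate-specific; localhost:9200 is a placeholder for your cluster endpoint):

# Disk usage per data node
curl -s 'http://localhost:9200/_cat/allocation?v'

# Configured disk watermarks (defaults: low 85%, high 90%, flood_stage 95%)
curl -s 'http://localhost:9200/_cluster/settings?include_defaults=true&flat_settings=true&pretty' | grep watermark

# Whether the index has picked up a read-only block from the flood-stage watermark
curl -s 'http://localhost:9200/MY_INDEX_NAME/_settings?flat_settings=true&pretty' | grep blocks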
- These indexes are all part of an alias managed by ILM
- There are 6+ nodes and each index has 1 primary and 1 replica shard
- All nodes have 1+ TB of free disk space
- I have not noticed any shard relocations beyond the usual ILM ones, but the specific indexes in question had been rolled over already
I’ll stress that normal Elasticsearch search requests have no issues on the same indexes. It’s only when I switch to the /siren/INDEX/_search endpoint that I see these errors, so it feels plugin-specific rather than a misconfiguration of the underlying index. I can’t think of any configuration change we have made beyond updating Elasticsearch and the Federate plugin.
Happy to provide any other info you think would be helpful.
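For what it’s worth, this is roughly how I checked the points above (index name is a placeholder):

# Shard layout for one of the affected indexes: node, state, primary/replica
curl -s 'http://localhost:9200/_cat/shards/MY_INDEX_NAME?v'

# Any recoveries/relocations currently in flight
curl -s 'http://localhost:9200/_cat/recovery?active_only=true&v'

# Where the index sits in its ILM lifecycle (phase, action, step)
curl -s 'http://localhost:9200/MY_INDEX_NAME/_ilm/explain?pretty'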
I wanted to add that my team just ran into this exact issue. We were upgrading from ES 6.8.2 to 7.17.3-27.2 on a 21 node cluster.
The issue only occurred when running siren queries on indices that had also been written to after the upgrade.
We tried restarting the Elasticsearch service on each node, and that seemed to resolve the problem, though we are eager to hear if there is any more information about this issue.
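For anyone else trying this, what we did was just the standard rolling restart, roughly the following per node (endpoint and service manager are placeholders for your setup):

# 1. Keep the cluster from reshuffling shards while the node is down
curl -s -X PUT -H 'Content-Type: application/json' \
  'http://localhost:9200/_cluster/settings' \
  -d '{"persistent": {"cluster.routing.allocation.enable": "primaries"}}'

# 2. Restart Elasticsearch on the node
sudo systemctl restart elasticsearch

# 3. Re-enable allocation and wait for green before moving to the next node
curl -s -X PUT -H 'Content-Type: application/json' \
  'http://localhost:9200/_cluster/settings' \
  -d '{"persistent": {"cluster.routing.allocation.enable": null}}'
curl -s 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=120s'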
@microsen Just tried restarting the problem cluster and you’re right, restarting seems to have resolved the problem. I’ll report back here if the issue crops back up, but restarting seems to be a valid workaround in the meantime.
I spoke too soon… It seemed to help initially, but the errors eventually showed back up. We’re seeing a pattern where it happens after ILM moves an index from a hot node to a warm node and force-merges it.
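For context, the policies involved are ordinary hot/warm ones. A rough sketch of the shape (policy name, ages, sizes, and the node attribute are illustrative, not our actual settings):

# "data": "warm" assumes custom node attributes; clusters on built-in data tiers
# rely on the implicit migrate action instead of an explicit allocate
curl -s -X PUT -H 'Content-Type: application/json' \
  'http://localhost:9200/_ilm/policy/my-hot-warm-policy' -d '
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_size": "50gb", "max_age": "7d" }
        }
      },
      "warm": {
        "min_age": "1d",
        "actions": {
          "forcemerge": { "max_num_segments": 1 },
          "allocate": { "require": { "data": "warm" } }
        }
      }
    }
  }
}'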
I’ve been running more tests on my end and I believe it has to do with shard location. One of our clusters uses ILM but does not relocate shards to warm nodes and is not giving us these errors.
So I ran a test: I first confirmed that siren searches were responding without errors, then manually rolled over the index. As soon as the shards moved to their warm nodes, siren started throwing the index-not-found errors. I manually moved the shards back to their original nodes and the siren requests started succeeding again.
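Concretely, the test looked roughly like this (alias, index, and node names are placeholders for ours):

# 1. Baseline: siren search on the freshly written index succeeds
curl -s 'http://localhost:9200/siren/my-index-000001/_search'

# 2. Manually roll the ILM-managed alias over
curl -s -X POST 'http://localhost:9200/my-alias/_rollover?pretty'

# 3. Watch the old index's shards land on the warm nodes
curl -s 'http://localhost:9200/_cat/shards/my-index-000001?v'

# 4. At that point the same siren search fails with the lock error; manually moving
#    a shard back to its original node makes it succeed again
curl -s -X POST -H 'Content-Type: application/json' \
  'http://localhost:9200/_cluster/reroute' \
  -d '{"commands": [{"move": {"index": "my-index-000001", "shard": 0, "from_node": "warm-node-1", "to_node": "hot-node-1"}}]}'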
I guess the workaround would be to ensure shards don’t move, but that only works as a temporary solution because then a single node will start to be overloaded.
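If anyone needs to buy time, something like this keeps an index’s shards on the hot tier (shown with built-in data tiers; clusters using custom node attributes would set index.routing.allocation.require.* instead). Note that ILM’s warm-phase migrate/allocate step will override it unless the policy is adjusted, so it really is only a stopgap:

curl -s -X PUT -H 'Content-Type: application/json' \
  'http://localhost:9200/my-index-000001/_settings' \
  -d '{"index.routing.allocation.include._tier_preference": "data_hot"}'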
@Manu_Agarwal Could you please try replicating the above scenario? We are deployed in Elastic Cloud and are unable to roll back to prior versions of ES + plugin so we’re a bit stuck here.
Thank you for these useful details, which allowed us to identify a malfunction related to index rerouting. We are still investigating it and will provide a patch release as soon as possible.
We were able to replicate the scenario you described; the bug has been fully identified and is now fixed.
Thanks again for taking the time to report this issue.
The fix will be available in the next patch release coming shortly.
Looks like the link is working now. We deployed this and ran a few tests with relocating shards and all seems to be resolved! Thank you for your open communication about this issue and quick resolution.