Whitelist / Blacklist searching in Elasticsearch
May 2018
How do we match a large number of documents against a dynamic whitelist/blacklist in Elasticsearch?
You can use the terms query
Suppose we have an index of page access logs like so:
PUT /mybeat-2018/_doc/1
{
"host" : "elastic.co",
"ttl" : 40
}
PUT /mybeat-2018/_doc/2
{
"host" : "elastic.co",
"ttl" : 666
}
PUT /mybeat-2018/_doc/3
{
"host" : "google.com",
"ttl" : 55
}
and an independent whitelist that can shrink or grow, with a bunch of hosts:
PUT /whitelist/_doc/1
{
"hosts" : [
{
"name" : "elastic.co"
},
{
"name" : "twitter.com"
}
]
}
Then a search on the mybeat-*
for whatever is in the whitelist should reference the whitelist document (in our case the document with id: 1) like so:
GET /mybeat-*/_search
{
"query" : {
"terms" : {
"host" : {
"index" : "whitelist",
"type" : "_doc",
"id" : "1",
"path" : "hosts.name"
}
}
}
}