According to Searchkick’s documentation, you can control what data is indexed with the search_data method. What is less obvious is the mechanism: if a model defines a search_data instance method, its return value is what gets sent to Elasticsearch when the record is indexed.

With this in mind, we can use it to preprocess data before it goes into Elasticsearch. Searchkick’s documentation shows a trivial example:

class Product < ActiveRecord::Base
  searchkick

  def search_data
    # Only the last expression is returned, so pick one of the two forms.
    # as_json only: [:name, :active]
    # or equivalently
    {
      name: name,
      active: active
    }
  end
end

But we can go further. For example, my example Post model has a serialized field called metadata, which might take the following form (don’t go hating on the data modelling - sometimes one has no choice when taking on legacy code):

> Post.first.metadata
=> [{type: "title", content: "some title"}, {type: "category", content: "a category"}]

We can’t just chuck the entire serialized metadata in - it would include noise like type: "title", which would really mess up the text search.

Instead, we can do something like:

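def search_data
  {
    content: content,
    body: body,
    # Keep only each entry's content value; use x["content"] instead
    # if your serialized hashes come back with string keys.
    metadata: metadata.map { |x| x[:content] }.join(" ")
  }
end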

which joins the content values into a single string before assigning it to the metadata field. For the sample above, the indexed value would be "some title a category".
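To try the flattening in isolation, here is a minimal plain-Ruby sketch that runs without Rails or Elasticsearch. The Post struct is only a hypothetical stand-in for the Active Record model, and the tolerant key lookup (symbol or string, nil-safe) is my own assumption, since a serialized column can come back with either kind of key depending on how it was stored.

```ruby
# Hypothetical stand-in for the Active Record model, so the sketch runs
# without Rails. Only the attributes used below are defined.
Post = Struct.new(:content, :metadata) do
  def search_data
    {
      content: content,
      metadata: flattened_metadata
    }
  end

  # Keep only the content value of each metadata entry and join the values
  # into one string. Accepts symbol or string keys, and a nil metadata.
  def flattened_metadata
    Array(metadata)
      .map { |entry| entry[:content] || entry["content"] }
      .compact
      .join(" ")
  end
end

post = Post.new(
  "Post body text",
  [{ type: "title", content: "some title" }, { type: "category", content: "a category" }]
)

p post.search_data
```

The type values never reach the hash that would be indexed, so a search for "title" only matches posts whose actual content mentions it.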

This is a trivial example. Because search_data is an instance method, it has access to everything any other Active Record instance method does (attributes, associations, other methods on the model), which gives you a lot of latitude.
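As a rough illustration of that latitude, the sketch below builds search_data from associated records and a derived value. Everything here is hypothetical: Article, Author, Tag and their attributes are made-up names, and plain Structs stand in for Active Record models and associations so the snippet runs on its own.

```ruby
# Hypothetical stand-ins for Active Record models and their associations.
Author  = Struct.new(:name)
Tag     = Struct.new(:label)

Article = Struct.new(:title, :author, :tags, :published_at) do
  def search_data
    {
      title: title,
      author_name: author&.name,            # from a belongs_to-style association
      tag_labels: Array(tags).map(&:label), # from a has_many-style association
      published: !published_at.nil?         # derived boolean
    }
  end
end

article = Article.new(
  "Hello",
  Author.new("Ann"),
  [Tag.new("ruby"), Tag.new("search")],
  Time.now
)

p article.search_data
```

In a real model, reaching into associations like this means the index can go stale when the associated record changes, so it is worth deciding when those records should trigger a reindex.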