How to connect Elasticsearch to Django REST Framework
This blog starts with a topic outside its core focus. But the task is interesting, and perhaps my approach will help someone.

A customer contacted me with a request to improve the search for an online store. The project is based on Django and exposes a REST API (built with Django REST framework) to the client; search was implemented with PostgreSQL full-text search. Because the product database was large, the search logic was expected to be complex, and the API had to stay fast, we chose Elasticsearch. As it turned out, adding it to the project can be quite simple.
Installing Elasticsearch
The project uses Docker, so first we add a new Elasticsearch container to Docker Compose. Of course, you can run it separately.
elasticsearch:
  image: elasticsearch:7.17.8
  environment:
    - discovery.type=single-node
    - cluster.name=es-docker
    - node.name=node1
    - bootstrap.memory_lock=true
    - "ES_JAVA_OPTS=-Xms4g -Xmx4g"
  deploy:
    resources:
      limits:
        memory: 8G  # Limited the amount of RAM
  restart: always
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - esdata:/usr/share/elasticsearch/data  # Volume for data storage
  depends_on:
    - server  # Starting the Container After Running Our API (Django)
  networks:
    - prnetwork  # Specify the network for the container to work
Since the server has a limited amount of free RAM, the container had to be capped at 8 GB. And so that the data is not lost when the container is recreated, we store it in a volume.
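Because the compose file above starts Elasticsearch after the API container, Django may come up before the search node accepts connections. Below is a minimal sketch of a readiness check using only the standard library. The function names `fetch_health` and `wait_until_ready` are hypothetical helpers, not part of any package used here, and the retry counts are arbitrary assumptions.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_health(base_url, timeout=2.0):
    """Return the parsed JSON from the cluster health endpoint."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def wait_until_ready(base_url, fetch=fetch_health, attempts=30, delay=1.0):
    """Poll the node until it reports yellow or green, or give up.

    `fetch` is injectable so the loop can be exercised without a live node.
    """
    for _ in range(attempts):
        try:
            health = fetch(base_url)
        except (urllib.error.URLError, OSError, ValueError):
            health = None  # Node not reachable yet, or response was not valid JSON
        if health and health.get("status") in ("yellow", "green"):
            return True
        time.sleep(delay)
    return False
```

A single-node cluster whose indices request replicas typically reports `yellow` rather than `green`, which is why the sketch accepts both. It could be called as `wait_until_ready("http://elasticsearch:9200")` before indexing.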
Server Tuning
To talk to the search engine from Django REST Framework, we will use django-elasticsearch-dsl-drf. This package allows detailed configuration and lets us tailor the API to our needs. We add the following packages to the project:
elasticsearch==7.17.8
elasticsearch-dsl==7.4.0
django-elasticsearch-dsl==7.2.2
django-elasticsearch-dsl-drf==0.22.5
In the settings.py file, we write the packages in INSTALLED_APPS, and specify the host to connect to:
INSTALLED_APPS = [
    # .........
    'rest_framework',  # REST framework
    'django_elasticsearch_dsl',  # Integration with Elasticsearch
    'django_elasticsearch_dsl_drf',  # API Package
    # .........
]

ELASTICSEARCH_DSL = {
    'default': {
        'hosts': os.environ.get("ELASTICSEARCH_HOST")
    },
}
In our case, the work was carried out within the same Docker network, so the following was added to the .env file:
ELASTICSEARCH_HOST=http://elasticsearch:9200
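If the `ELASTICSEARCH_HOST` variable is missing, `os.environ.get` returns `None` and the connection settings end up without a host. A minimal sketch of one way to guard against that is shown below. The helper name `elasticsearch_dsl_settings` and the fallback `http://localhost:9200` are assumptions for illustration, not something the project itself uses.

```python
import os


def elasticsearch_dsl_settings(environ=os.environ, default_host="http://localhost:9200"):
    """Build the ELASTICSEARCH_DSL dict, falling back to a default host.

    `default_host` is an assumed local address; adjust it to your setup.
    """
    host = environ.get("ELASTICSEARCH_HOST") or default_host
    return {"default": {"hosts": host}}


# In settings.py this would replace the literal dict
ELASTICSEARCH_DSL = elasticsearch_dsl_settings()
```

Passing the environment in as an argument keeps the function easy to check without touching the real process environment.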
Setting up models
I assume you already have your own model, so I will skip the process of creating it. As an example, here is my model, simplified for clarity:
class Product(models.Model):
    """Product Model"""
    name = models.CharField(max_length=255, db_index=True, verbose_name="Name")
    slug = models.SlugField(max_length=255, db_index=True, unique=True, verbose_name="Url address")
    barcode = models.CharField(max_length=300, db_index=True, blank=True, null=True, verbose_name="Barcode")
    brand = models.ForeignKey(Brand, on_delete=models.CASCADE, related_name='products', verbose_name="Manufacturer")
    category = models.ForeignKey(Category, on_delete=models.CASCADE, related_name='products', verbose_name="Category")
    price = models.IntegerField(default=0, blank=True, null=True, verbose_name="Price")
    visible = models.BooleanField(default=True, verbose_name="Visibility")

    class Meta:
        ordering = ['name']
        verbose_name = 'product'
        verbose_name_plural = 'Products'

    def __str__(self):
        return self.name

    @property
    def get_brand(self):
        return self.brand.name

    @property
    def get_category(self):
        return self.category.name
The get_brand and get_category properties are needed so that we can index the names of brands and categories as simple text fields.
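To make the idea concrete, here is a simplified, framework-free sketch of how a related object's name can be flattened into plain text through a property. The dataclasses stand in for the Django models, and `flatten` is a hypothetical helper that only imitates an attribute lookup by name. It is not the library's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Brand:
    name: str


@dataclass
class Category:
    name: str


@dataclass
class Product:
    name: str
    brand: Brand
    category: Category

    @property
    def get_brand(self) -> str:
        # Expose the related brand as a plain string
        return self.brand.name

    @property
    def get_category(self) -> str:
        # Expose the related category as a plain string
        return self.category.name


def flatten(product, field_to_attr):
    """Resolve each document field by reading the named attribute."""
    return {field: getattr(product, attr) for field, attr in field_to_attr.items()}


doc = flatten(
    Product("Kettle", Brand("Acme"), Category("Kitchen")),
    {"name": "name", "brand": "get_brand", "category": "get_category"},
)
```

Because `get_brand` and `get_category` are properties, they are read as attributes without parentheses, which is what a lookup by attribute name relies on.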
Now you need to create documents (the search-side analogue of models) for the product model. To do this, in the folder with our API, we create the documents.py file, import the necessary libraries, and describe our entities.
from django_elasticsearch_dsl import Document, fields
from elasticsearch_dsl import analyzer, Index
from django_elasticsearch_dsl.registries import registry
from shop.models import Product

product = Index('products')  # Index name in Elasticsearch

# See the Elasticsearch documentation for what these settings mean
product.settings(
    number_of_shards=1,
    number_of_replicas=1
)

# Text field analyzer
html_strip = analyzer(
    'html_strip',
    tokenizer="standard",
    filter=["lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)


@registry.register_document
@product.document
class ProductDocument(Document):
    """Product Document"""

    id = fields.IntegerField(attr='id')

    name = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )

    barcode = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )

    brand = fields.TextField(
        attr='get_brand',  # Value of the get_brand property
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )

    category = fields.TextField(
        attr='get_category',  # Value of the get_category property
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    # The slug is a plain string, so a keyword field fits better than FileField
    slug = fields.KeywordField(attr='slug')
    price = fields.IntegerField(attr='price')
    visible = fields.BooleanField(attr='visible')

    class Django:
        model = Product  # Our Django model
I will not go through document creation in detail: the package documentation is very good, so it is better to read it directly. You may also need the Elasticsearch documentation to tune the analyzers and search behavior properly.
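One way to see what an analyzer chain like html_strip does to a piece of text is the Elasticsearch `_analyze` API, which accepts a tokenizer, filters, and char filters in the request body. The sketch below only builds such a request with the standard library. The helper name `build_analyze_request` and the `http://localhost:9200` address are assumptions, and the sample text is made up.

```python
import json
import urllib.request


def build_analyze_request(base_url, text):
    """Build a POST request that runs `text` through the same chain as html_strip."""
    body = {
        "tokenizer": "standard",
        "filter": ["lowercase", "stop", "snowball"],
        "char_filter": ["html_strip"],
        "text": text,
    }
    return urllib.request.Request(
        f"{base_url}/_analyze",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_analyze_request("http://localhost:9200", "<b>Running</b> shoes")

# To send it against a running node (not executed here):
# with urllib.request.urlopen(request) as resp:
#     print(json.load(resp)["tokens"])
```

Inspecting the returned tokens is a quick way to check whether stemming and stop-word removal behave as expected for your product names before reindexing.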
Now that the documents are ready, we can index the data:
python manage.py search_index --create -f
python manage.py search_index --populate -f

If you later need to remove the index, run:

python manage.py search_index --delete
Getting data via REST API
It remains for us to create a serializer, describe the search algorithm, and everything will be ready. For simplicity, we will display only 4 fields:
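from django_elasticsearch_dsl_drf.serializers import DocumentSerializer

from .documents import ProductDocument


class ProductDocumentSerializer(DocumentSerializer):
    """Product serializer for search"""

    class Meta:
        document = ProductDocument

        fields = (
            'id',
            'name',
            'slug',
            'price',
        )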
For this project, it was necessary to change the pagination so that the maximum and minimum prices were displayed along with the results in a certain format. Therefore, we additionally create our pagination class:
from django_elasticsearch_dsl_drf.pagination import PageNumberPagination


class SearchPagination(PageNumberPagination):
    """Setting up pagination for product search"""

    def get_paginated_response_context(self, data):
        min_price = 0
        max_price = 0

        # Aggregation results attached to the current page, if any
        facets = self.get_facets()
        if facets is not None:
            if facets['min']['value'] is not None:
                min_price = int(facets['min']['value'])
            if facets['max']['value'] is not None:
                max_price = int(facets['max']['value'])

        # Redefining the answer for our needs
        return [
            ('page', self.page.number),
            # Total number of hits (page.count is a method, so ask the paginator)
            ('count', self.page.paginator.count),
            ('min_price', min_price),
            ('max_price', max_price),
            ('results', data)
        ]
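The price extraction in the pagination class can be looked at in isolation. Below is a minimal sketch of the same logic as a pure function. It assumes the facets arrive as a plain dict shaped like `{'min': {'value': ...}, 'max': {'value': ...}}`, and `price_bounds` is a hypothetical name used only for this illustration.

```python
def price_bounds(facets):
    """Return integer min/max prices, defaulting to 0 when data is missing."""
    bounds = {"min_price": 0, "max_price": 0}
    if not facets:
        return bounds
    for agg_name, key in (("min", "min_price"), ("max", "max_price")):
        # A min/max aggregation over zero documents yields a null value
        value = (facets.get(agg_name) or {}).get("value")
        if value is not None:
            bounds[key] = int(value)
    return bounds


print(price_bounds({"min": {"value": 199.0}, "max": {"value": 4990.0}}))
print(price_bounds({"min": {"value": None}, "max": {"value": None}}))
```

Keeping the defaults at 0 means the client always receives both keys, even when the search matches nothing.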
And now we supplement views.py, and immediately set it up so that only products with display enabled are shown:
from elasticsearch_dsl import Q
from django_elasticsearch_dsl_drf.viewsets import BaseDocumentViewSet
from django_elasticsearch_dsl_drf.constants import (
    LOOKUP_FILTER_RANGE
)
from django_elasticsearch_dsl_drf.filter_backends import (
    FilteringFilterBackend,
    OrderingFilterBackend,
    DefaultOrderingFilterBackend,
    SearchFilterBackend,
)
from .documents import ProductDocument

# ProductDocumentSerializer and SearchPagination are the classes defined above;
# import them from wherever you placed them in your project.


class SearchProductsView(BaseDocumentViewSet):
    """Product search"""
    document = ProductDocument  # Specify the document
    serializer_class = ProductDocumentSerializer
    pagination_class = SearchPagination
    lookup_field = 'id'
    filter_backends = [
        FilteringFilterBackend,
        OrderingFilterBackend,
        DefaultOrderingFilterBackend,
        SearchFilterBackend,
    ]

    # Define the fields to be searched,
    # and set their boost (priority) and typo tolerance
    search_fields = {
        'name': {
            'fuzziness': 'AUTO',
            'boost': 2
        },
        'barcode': {
            'boost': 3
        },
        'brand': {
            'boost': 1
        },
        'category': {
            'boost': 1
        },
    }

    # Define fields for filtering
    filter_fields = {
        'price': {
            'field': 'price',
            'lookups': [
                LOOKUP_FILTER_RANGE,
            ],
        },
    }

    # Defining fields for sorting
    ordering_fields = {
        'price': 'price',
    }

    # Default sort
    ordering = ['_score']

    def get_queryset(self):
        """Leave only visible products"""
        queryset = self.search.query(Q('term', visible='true'))
        queryset.model = self.document.Django.model
        return queryset

    def paginate_queryset(self, queryset):
        """Calculate the maximum and minimum price of goods"""
        queryset.aggs.metric('max', 'max', field='price')
        queryset.aggs.metric('min', 'min', field='price')
        if self.paginator is None:
            return None
        return self.paginator.paginate_queryset(queryset, self.request, view=self)
Sorting by _score means that the results are returned in order of relevance to the query.
Add a link in the urls.py of our API:
router.register(r'search', SearchProductsView, basename='search_products')
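With the route registered, a client can combine search, price filtering, and ordering in one request. The sketch below builds such a URL with the standard library. The query parameter names (`search`, `ordering`, and `price__range` with a `lower__upper` value) follow the package's default conventions as I understand them, so check them against its documentation. The base URL and the helper `build_search_url` are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(base_url, text, price_from=None, price_to=None, ordering=None):
    """Compose a search URL with an optional price range and ordering."""
    params = {"search": text}
    if price_from is not None and price_to is not None:
        # Range lookup: lower and upper bounds joined by a double underscore
        params["price__range"] = f"{price_from}__{price_to}"
    if ordering:
        params["ordering"] = ordering  # e.g. "price" or "-price"
    return f"{base_url}?{urlencode(params)}"


url = build_search_url(
    "http://localhost:8000/api/search/",  # assumed location of the router
    "red shoes",
    price_from=10,
    price_to=100,
    ordering="-price",
)
print(url)
```

The actual path depends on where the router is included in your project's URL configuration.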
And that’s it, the search works. Thanks for reading!