How to connect Elasticsearch to Django REST Framework
This blog starts with a topic outside its core subject. But the task was interesting, and perhaps my approach will help someone.
A customer asked me to improve the search in an online store. The project is built on Django and exposes a REST API (using Django REST framework) to the client; search was implemented with PostgreSQL full-text search. Because the product database was large, the search logic was expected to become complex, and the API had to stay fast, the choice fell on Elasticsearch. As it turned out, adding it to the project can be quite simple.
Please note that this is not a guide or tutorial, just my way of connecting search.
Installing Elasticsearch
The project uses Docker, so first we add a new Elasticsearch container to Docker Compose. Of course, you can run it separately.
elasticsearch:
  image: elasticsearch:7.17.8
  environment:
    - discovery.type=single-node
    - cluster.name=es-docker
    - node.name=node1
    - bootstrap.memory_lock=true
    - "ES_JAVA_OPTS=-Xms4g -Xmx4g"
  deploy:
    resources:
      limits:
        memory: 8G # Limited the amount of RAM
  restart: always
  ports:
    - 9200:9200
    - 9300:9300
  volumes:
    - esdata:/usr/share/elasticsearch/data # Volume for data storage
  depends_on:
    - server # Starting the Container After Running Our API (Django)
  networks:
    - prnetwork # Specify the network for the container to work
Since the server has a limited amount of free RAM, I had to cap the container at 8 GB. And so that the data survives when the container is recreated, it is stored in a volume.
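Because the Elasticsearch container is started after the API container here, Django may come up before the search engine is ready. A minimal sketch of a readiness check, using only the standard library and the Elasticsearch _cluster/health endpoint, could look like this. The helper names, the retry counts, and the URL in the usage comment are my assumptions, not part of the project:

```python
import json
import time
import urllib.error
import urllib.request


def fetch_cluster_health(base_url, timeout=5):
    """Return the parsed JSON body of GET <base_url>/_cluster/health."""
    with urllib.request.urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def wait_for_elasticsearch(base_url, attempts=30, delay=2.0, fetch=fetch_cluster_health):
    """Poll until the cluster reports yellow or green status; return True on success."""
    for _ in range(attempts):
        try:
            if fetch(base_url).get("status") in ("yellow", "green"):
                return True
        except (urllib.error.URLError, OSError, ValueError):
            pass  # Not accepting connections yet, or the response was not JSON
        time.sleep(delay)
    return False


# Hypothetical usage from a startup script inside the Docker network:
# wait_for_elasticsearch("http://elasticsearch:9200")
```

A yellow status is accepted on purpose: on a single node, replica shards cannot be allocated, so an index with one replica never turns green.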
Configuring the server
To talk to the search engine from Django REST Framework, we will use django-elasticsearch-dsl-drf. This package allows detailed configuration and lets us tailor the API to our needs. We add the following packages to the project:
elasticsearch==7.17.8
elasticsearch-dsl==7.4.0
django-elasticsearch-dsl==7.2.2
django-elasticsearch-dsl-drf==0.22.5
In settings.py, we add the packages to INSTALLED_APPS and specify the host to connect to:
import os

INSTALLED_APPS = [
    # .........
    'rest_framework',  # REST framework
    'django_elasticsearch_dsl',  # Integration with Elasticsearch
    'django_elasticsearch_dsl_drf',  # API Package
    # .........
]

ELASTICSEARCH_DSL = {
    'default': {
        'hosts': os.environ.get("ELASTICSEARCH_HOST")
    },
}
In our case, the work was carried out within the same Docker network, so the following was added to the .env file:
ELASTICSEARCH_HOST=http://elasticsearch:9200
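Note that os.environ.get returns None when the variable is missing, which only shows up later as a confusing connection error. A small sketch of a more forgiving variant for settings.py follows. The helper name build_elasticsearch_dsl and the localhost fallback are assumptions for illustration:

```python
import os


def build_elasticsearch_dsl(env=os.environ):
    """Build the ELASTICSEARCH_DSL setting, falling back to a local node."""
    # The fallback URL is an assumption that suits local development only
    host = env.get("ELASTICSEARCH_HOST", "http://localhost:9200")
    return {"default": {"hosts": host}}


ELASTICSEARCH_DSL = build_elasticsearch_dsl()
```

Inside Docker the value from the .env file wins. Outside Docker, a locally running node is used without extra configuration.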
Setting up models
I assume that you already have a model, so I will skip creating it. As an example, here is my model, simplified for clarity:
from django.db import models


# Brand and Category are assumed to be defined elsewhere in the same models.py
class Product(models.Model):
    """Product Model"""
    name = models.CharField(max_length=255, db_index=True, verbose_name="Name")
    slug = models.SlugField(max_length=255, db_index=True, unique=True, verbose_name="Url address")
    barcode = models.CharField(max_length=300, db_index=True, blank=True, null=True, verbose_name="Barcode")
    brand = models.ForeignKey(Brand, on_delete=models.CASCADE, related_name='products', verbose_name="Manufacturer")
    category = models.ForeignKey(Category, on_delete=models.CASCADE, related_name='products', verbose_name="Category")
    price = models.IntegerField(default=0, blank=True, null=True, verbose_name="Price")
    visible = models.BooleanField(default=True, verbose_name="Visibility")

    class Meta:
        ordering = ['name']
        verbose_name = 'product'
        verbose_name_plural = 'Products'

    def __str__(self):
        return self.name

    @property
    def get_brand(self):
        return self.brand.name

    @property
    def get_category(self):
        return self.category.name
The get_brand and get_category properties are needed so that we can index brand and category names as plain text fields.
Now we need to create documents (the search-side analogue of models) for the product model. To do this, we create a documents.py file in the folder with our API, import the necessary libraries and describe our entities.
from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry
from elasticsearch_dsl import analyzer, Index

from shop.models import Product

product = Index('products')  # Index name in Elasticsearch
product.settings(
    number_of_shards=1,
    number_of_replicas=1
)  # See the Elasticsearch documentation for what these settings mean

html_strip = analyzer(
    'html_strip',
    tokenizer="standard",
    filter=["lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
)  # Text field analyzer


@registry.register_document
@product.document
class ProductDocument(Document):
    """Product Document"""
    id = fields.IntegerField(attr='id')
    name = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    barcode = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    brand = fields.TextField(
        attr='get_brand',  # Value of the get_brand property
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    category = fields.TextField(
        attr='get_category',  # Value of the get_category property
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    # The slug is an exact identifier, so a keyword field fits better than FileField
    slug = fields.KeywordField(attr='slug')
    price = fields.IntegerField(attr='price')
    visible = fields.BooleanField(attr='visible')

    class Django:
        model = Product  # Our Django model
I will not describe how to create a document in detail: the package documentation is very good, so it is better to read it directly. You may also need the Elasticsearch documentation to tune the analyzers and search algorithms properly.
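To get a feeling for what the html_strip analyzer defined above does to a product name, here is a rough pure-Python imitation of its first stages. It is only an approximation under my own assumptions: the regular expressions are simplified, the stop-word set is a small subset of the English list, and the snowball stemming step is left out entirely.

```python
import re

# A small subset of the English stop-word list, for illustration only
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "with", "in", "on", "to"}


def approximate_analyze(text):
    """Roughly mimic char_filter=html_strip, tokenizer=standard, filter=lowercase+stop."""
    no_html = re.sub(r"<[^>]+>", " ", text)  # html_strip: drop tags
    tokens = re.findall(r"\w+", no_html)  # standard tokenizer, simplified
    lowered = [token.lower() for token in tokens]  # lowercase filter
    return [token for token in lowered if token not in STOP_WORDS]  # stop filter


print(approximate_analyze("<b>Running</b> Shoes for the Gym"))
# → ['running', 'shoes', 'gym']
```

In Elasticsearch the snowball filter would then reduce the remaining words to their stems, for example "shoes" to "shoe", so that singular and plural forms match each other.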
Now that the documents are ready, we can index the data:
python manage.py search_index --create -f
python manage.py search_index --populate -f
If you need to delete the index together with all its data, use the command python manage.py search_index --delete
Getting data via REST API
All that remains is to create a serializer and describe the search logic. For simplicity, we will output only four fields:
from django_elasticsearch_dsl_drf.serializers import DocumentSerializer

from .documents import ProductDocument  # Adjust the path if the serializer lives elsewhere


class ProductDocumentSerializer(DocumentSerializer):
    """Product serializer for search"""

    class Meta:
        document = ProductDocument
        fields = (
            'id',
            'name',
            'slug',
            'price',
        )
For this project, pagination had to be changed so that the minimum and maximum prices are returned along with the results in a specific format. So we also create our own pagination class:
from django_elasticsearch_dsl_drf.pagination import PageNumberPagination


class SearchPagination(PageNumberPagination):
    """Setting up pagination for product search"""

    def get_paginated_response_context(self, data):
        min_price = 0
        max_price = 0
        # The aggregations computed in the view arrive here as facets
        facets = self.get_facets()
        if facets is not None:
            if facets['min']['value'] is not None:
                min_price = int(facets['min']['value'])
            if facets['max']['value'] is not None:
                max_price = int(facets['max']['value'])
        # Redefining the answer for our needs
        return [
            ('page', self.page.number),
            ('count', self.page.count),
            ('min_price', min_price),
            ('max_price', max_price),
            ('results', data)
        ]
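The price handling in the pagination class can be tried out in isolation. The sketch below pulls it into a plain function and feeds it dictionaries shaped like min/max aggregation results. The function name price_bounds and the sample values are mine and exist only for illustration:

```python
def price_bounds(facets):
    """Return (min_price, max_price) from min/max aggregation results, defaulting to 0."""

    def value_of(name):
        # Elasticsearch reports null for min/max when no document matched
        value = (facets or {}).get(name, {}).get("value")
        return int(value) if value is not None else 0

    return value_of("min"), value_of("max")


print(price_bounds({"min": {"value": 120.0}, "max": {"value": 9990.0}}))
# → (120, 9990)
print(price_bounds({"min": {"value": None}, "max": {"value": None}}))
# → (0, 0)
```

The second call shows why the None checks matter: for an empty result set the aggregations still exist, but their values are null.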
Now we extend views.py and immediately configure it so that only visible products are returned:
from elasticsearch_dsl import Q
from django_elasticsearch_dsl_drf.constants import LOOKUP_FILTER_RANGE
from django_elasticsearch_dsl_drf.filter_backends import (
    FilteringFilterBackend,
    OrderingFilterBackend,
    DefaultOrderingFilterBackend,
    SearchFilterBackend,
)
from django_elasticsearch_dsl_drf.viewsets import BaseDocumentViewSet

from .documents import ProductDocument
# Adjust these two paths to wherever you defined the serializer and pagination classes
from .serializers import ProductDocumentSerializer
from .pagination import SearchPagination


class SearchProductsView(BaseDocumentViewSet):
    """Product search"""
    document = ProductDocument  # Specify the document
    serializer_class = ProductDocumentSerializer
    pagination_class = SearchPagination
    lookup_field = 'id'
    filter_backends = [
        FilteringFilterBackend,
        OrderingFilterBackend,
        DefaultOrderingFilterBackend,
        SearchFilterBackend,
    ]
    # Define the fields to be searched,
    # with their relative weight (boost) and typo tolerance (fuzziness)
    search_fields = {
        'name': {'fuzziness': 'AUTO', 'boost': 2},
        'barcode': {'boost': 3},
        'brand': {'boost': 1},
        'category': {'boost': 1},
    }
    # Define fields for filtering
    filter_fields = {
        'price': {
            'field': 'price',
            'lookups': [
                LOOKUP_FILTER_RANGE,
            ],
        },
    }
    # Define fields for sorting
    ordering_fields = {
        'price': 'price',
    }
    # Default sort
    ordering = ['_score']

    def get_queryset(self):
        """Leave only visible products"""
        queryset = self.search.query(Q('term', visible=True))
        queryset.model = self.document.Django.model
        return queryset

    def paginate_queryset(self, queryset):
        """Calculate the maximum and minimum price of goods"""
        # aggs.metric modifies the search object in place
        queryset.aggs.metric('max', 'max', field='price')
        queryset.aggs.metric('min', 'min', field='price')
        if self.paginator is None:
            return None
        return self.paginator.paginate_queryset(queryset, self.request, view=self)
Sorting by _score means that results are returned in order of relevance to the query.
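To make the boost and fuzziness settings in search_fields more concrete: as far as I understand the package, SearchFilterBackend turns each searched field into a match query and combines them in a bool query with should clauses. The sketch below builds a comparable raw query body from plain dictionaries. It is an approximation for illustration only. build_search_body is a hypothetical helper, and the exact body the package sends may differ.

```python
SEARCH_FIELDS = {
    "name": {"fuzziness": "AUTO", "boost": 2},
    "barcode": {"boost": 3},
    "brand": {"boost": 1},
    "category": {"boost": 1},
}


def build_search_body(term, search_fields=SEARCH_FIELDS):
    """Build an approximate Elasticsearch query body for a search term."""
    # One match clause per field; boost weights the field, fuzziness tolerates typos
    should = [
        {"match": {field: {"query": term, **options}}}
        for field, options in search_fields.items()
    ]
    return {
        "query": {
            "bool": {
                # Only visible products, and at least one field has to match
                "must": [
                    {"term": {"visible": True}},
                    {"bool": {"should": should}},
                ],
            }
        }
    }


body = build_search_body("milk")
print(body["query"]["bool"]["must"][1]["bool"]["should"][0])
# → {'match': {'name': {'query': 'milk', 'fuzziness': 'AUTO', 'boost': 2}}}
```

With these weights an exact barcode hit scores higher than a match in the name, and a name match scores higher than a match in the brand or category.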
Register the view set with the router in the urls.py of our API:
router.register(r'search', SearchProductsView, basename='search_products')
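With the route registered, the endpoint can be queried with URL parameters. The sketch below assembles such a URL, assuming the package's default parameter names as I understand them: search for the query text, price__range with the two bounds joined by a double underscore, and ordering for the sort field. The base URL and the helper name build_search_url are hypothetical:

```python
from urllib.parse import urlencode


def build_search_url(base_url, term, price_from=None, price_to=None, ordering=None):
    """Assemble a search URL with an optional price range and sort order."""
    params = {"search": term}
    if price_from is not None and price_to is not None:
        # Range lookup: lower and upper bound joined by a double underscore
        params["price__range"] = f"{price_from}__{price_to}"
    if ordering:
        params["ordering"] = ordering  # Prefix with "-" for descending order
    return f"{base_url}?{urlencode(params)}"


print(build_search_url("http://localhost:8000/api/search/", "milk", 100, 500, "-price"))
# → http://localhost:8000/api/search/?search=milk&price__range=100__500&ordering=-price
```

Without the ordering parameter, the view falls back to its default sort by _score.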
And that’s it, the search works. Thanks for reading!