Create3D.xyz

How to connect Elasticsearch to Django REST Framework


This blog starts with a topic outside its usual focus. But the task is interesting, and perhaps my method will help someone.

Photo by Jametlene Reskp / Unsplash

A customer contacted me with a request to improve the search for an online store. The project is based on Django and has a REST API (built with Django REST framework) for the client; search was implemented with PostgreSQL full-text search. Because the product database was large, the planned search logic was complex, and the API had to stay fast, the choice fell on Elasticsearch. And as it turned out, adding it to the project can be quite simple.

Please note that this is not a guide or tutorial, just my way of connecting search.

Installing Elasticsearch

The project uses Docker, so first we add a new Elasticsearch container to Docker Compose. Of course, you can run it separately.

elasticsearch:  
    image: elasticsearch:7.17.8
    environment:
        - discovery.type=single-node
        - cluster.name=es-docker
        - node.name=node1
        - bootstrap.memory_lock=true
        - "ES_JAVA_OPTS=-Xms4g -Xmx4g"
    deploy:
        resources:
            limits:
                memory: 8G # Limited the amount of RAM
    restart: always
    ports:
        - 9200:9200
        - 9300:9300
    volumes:
        - esdata:/usr/share/elasticsearch/data # Volume for data storage
    depends_on:
        - server # Starting the Container After Running Our API (Django)
    networks: 
        - prnetwork # Specify the network for the container to work

Since the server has a limited amount of free RAM, it was necessary to limit the container to 8 GB. And in order not to lose our data when the container is recreated, we store it in a volume.
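
The 4 GB heap in ES_JAVA_OPTS is half of the 8 GB container limit, which follows the general Elasticsearch advice to give the JVM heap no more than about half of the available memory. A rough sketch of that arithmetic is below; the helper name heap_opts is made up for illustration and is not part of any tool:

```python
def heap_opts(container_limit_gb: int) -> str:
    """Return an ES_JAVA_OPTS value that uses half of the container memory limit.

    Hypothetical helper for illustration only; Elasticsearch does not ship it.
    """
    if container_limit_gb < 2:
        raise ValueError("container limit is too small to split in half")
    # Leave the other half to the operating system and its file cache
    heap_gb = container_limit_gb // 2
    return f"-Xms{heap_gb}g -Xmx{heap_gb}g"


print(heap_opts(8))  # -Xms4g -Xmx4g
```

If you change the memory limit in Docker Compose, it is worth adjusting the heap values together with it.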

Server Tuning

To set up interaction with the search engine from the Django REST Framework, we will use django-elasticsearch-dsl-drf. This package allows detailed configuration and lets us customize the API for our needs. So we add the following packages to the project:

elasticsearch==7.17.8
elasticsearch-dsl==7.4.0
django-elasticsearch-dsl==7.2.2
django-elasticsearch-dsl-drf==0.22.5

In the settings.py file, we add the packages to INSTALLED_APPS and specify the host to connect to:

import os  # needed for os.environ below, if settings.py does not import it already

INSTALLED_APPS = [
    # .........
    'rest_framework', # REST framework
    'django_elasticsearch_dsl', # Integration with Elasticsearch
    'django_elasticsearch_dsl_drf', # API Package
    # .........
]

ELASTICSEARCH_DSL = {
    'default': {
        'hosts': os.environ.get("ELASTICSEARCH_HOST")
    },
}

In our case, the work was carried out within the same Docker network, so the following was added to the .env file:

ELASTICSEARCH_HOST=http://elasticsearch:9200
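
Note that os.environ.get() returns None when the variable is missing, and the problem then shows up only on the first request to Elasticsearch. A small check like the one below can fail earlier with a clearer message. This is only a sketch: read_es_host is a hypothetical helper, not something the packages provide.

```python
import os
from urllib.parse import urlparse


def read_es_host(env=os.environ, name="ELASTICSEARCH_HOST"):
    """Return the Elasticsearch URL from the environment or raise a clear error."""
    value = env.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set")
    parsed = urlparse(value)
    # Expect a full URL such as http://elasticsearch:9200
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise RuntimeError(f"{name} must look like http://host:port, got {value!r}")
    return value


# Example with the value from the .env file
print(read_es_host({"ELASTICSEARCH_HOST": "http://elasticsearch:9200"}))
```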

Setting up models

I assume that you already have your model, so I will skip the process of creating it. As an example, here is my version of the model, simplified for clarity:

from django.db import models

# The Brand and Category models are omitted for brevity

class Product(models.Model):
    """Product Model"""
    name = models.CharField(max_length=255, db_index=True, verbose_name="Name")
    slug = models.SlugField(max_length=255, db_index=True, unique=True, verbose_name="Url address")
    barcode = models.CharField(max_length=300, db_index=True, blank=True, null=True, verbose_name="Barcode")
    brand = models.ForeignKey(Brand, on_delete=models.CASCADE, related_name='products', verbose_name="Manufacturer")
    category = models.ForeignKey(Category, on_delete=models.CASCADE, related_name='products', verbose_name="Category")
    price = models.IntegerField(default=0, blank=True, null=True, verbose_name="Price")
    visible = models.BooleanField(default=True, verbose_name="Visibility")

    class Meta:
        ordering = ['name']
        verbose_name = 'product'
        verbose_name_plural = 'Products'

    def __str__(self):
        return self.name
    
    @property
    def get_brand(self):
        return self.brand.name
    
    @property
    def get_category(self):
        return self.category.name

The get_brand and get_category properties are needed so that we can process the names of brands and categories as simple text fields.
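
To show the idea without Django, here is a small sketch where plain dataclasses stand in for the models. The sample values are made up. A property flattens the related object to text, and it can then be read by name like any other attribute:

```python
from dataclasses import dataclass


@dataclass
class Brand:
    name: str


@dataclass
class Product:
    name: str
    brand: Brand

    @property
    def get_brand(self) -> str:
        # Flatten the related object to plain text
        return self.brand.name


product = Product(name="Trail shoes", brand=Brand(name="Acme"))

# A property is read without parentheses; looking it up by name
# gives the same result as product.get_brand
print(getattr(product, "get_brand"))  # Acme
```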

Now you need to create documents (analogous to search models) for working with product models. To do this, in the folder with our API, we create the documents.py file, connect the necessary libraries and describe our entities.

from django_elasticsearch_dsl import Document, fields
from elasticsearch_dsl import analyzer, Index
from django_elasticsearch_dsl.registries import registry
from shop.models import Product

product = Index('products') # Index name in Elasticsearch

product.settings(
    number_of_shards=1,
    number_of_replicas=1
) # See the Elasticsearch documentation for what these settings mean
   
html_strip = analyzer(
    'html_strip',
    tokenizer="standard",
    filter=["lowercase", "stop", "snowball"],
    char_filter=["html_strip"]
) # Text field analyzer
        
@registry.register_document
@product.document
class ProductDocument(Document):
    """Product Document"""
    
    id = fields.IntegerField(attr='id')
    
    name = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    
    barcode = fields.TextField(
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    
    brand = fields.TextField(
        attr='get_brand', # Result of get_brand() method
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )
    
    category = fields.TextField(
        attr='get_category', # Result of get_category() method
        analyzer=html_strip,
        fields={
            'raw': fields.TextField(analyzer='keyword'),
        }
    )

    slug = fields.KeywordField(attr='slug') # The slug is an exact string, not a file
    price = fields.IntegerField(attr='price')
    visible = fields.BooleanField(attr='visible')
    
    class Django:
        model = Product # Our Django model

I will not describe the process of creating a document in detail, because the package documentation is very good and it is better to read it directly. You may also need the Elasticsearch documentation to set up the analyzers properly.
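
For orientation, the html_strip analyzer from documents.py corresponds roughly to the index settings below, written as a plain Python dict. This is a sketch of the general shape of a custom analyzer definition; the exact body that the package sends to Elasticsearch may differ.

```python
import json

# Custom analyzer with the same parts as html_strip in documents.py
html_strip_definition = {
    "type": "custom",
    "char_filter": ["html_strip"],                # remove HTML tags first
    "tokenizer": "standard",                      # split text into words
    "filter": ["lowercase", "stop", "snowball"],  # lowercase, drop stop words, stem
}

index_settings = {
    "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 1,
        "analysis": {"analyzer": {"html_strip": html_strip_definition}},
    }
}

print(json.dumps(index_settings, indent=2))
```

The order matters: character filters run before the tokenizer, and token filters run after it in the order they are listed.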

Now that the documents are ready, we can index the data:

python manage.py search_index --create -f
python manage.py search_index --populate -f

If you need to delete the index together with all its data, use the command python manage.py search_index --delete

Getting data via REST API

All that remains is to create a serializer and describe the search algorithm, and everything will be ready. For simplicity, we will return only four fields:

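from django_elasticsearch_dsl_drf.serializers import DocumentSerializer
from .documents import ProductDocument

class ProductDocumentSerializer(DocumentSerializer):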
    """Product serializer for search"""

    class Meta:
        document = ProductDocument

        fields = (
            'id',
            'name',
            'slug',
            'price',
        )

For this project, it was necessary to change the pagination so that the maximum and minimum prices were displayed along with the results in a certain format. Therefore, we additionally create our pagination class:

from django_elasticsearch_dsl_drf.pagination import PageNumberPagination

class SearchPagination(PageNumberPagination):
    """Setting up pagination for product search"""

    def get_paginated_response_context(self, data):
        min_price = 0
        max_price = 0

        # Aggregation results ("facets") attached to the current page
        facets = self.get_facets()
        if facets is not None:
            # Elasticsearch returns null for min/max when nothing matched
            if facets['min']['value'] is not None:
                min_price = int(facets['min']['value'])
            if facets['max']['value'] is not None:
                max_price = int(facets['max']['value'])

        # Redefine the response for our needs
        return [
            ('page', self.page.number),
            ('count', self.page.paginator.count),  # total number of hits
            ('min_price', min_price),
            ('max_price', max_price),
            ('results', data)
        ]
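
The price handling in this class can be tried separately from Django. Below is a minimal sketch of the same logic as a plain function; price_bounds is a hypothetical name and the sample values are made up. It assumes the facets arrive as a dict with the keys min and max, each holding a value that may be None:

```python
def price_bounds(facets):
    """Return (min_price, max_price) from a facets dict, defaulting to 0."""
    bounds = []
    for key in ("min", "max"):
        value = (facets or {}).get(key, {}).get("value")
        # The value is None when no documents matched the query
        bounds.append(int(value) if value is not None else 0)
    return tuple(bounds)


print(price_bounds({"min": {"value": 150.0}, "max": {"value": 8990.0}}))  # (150, 8990)
print(price_bounds({"min": {"value": None}, "max": {"value": None}}))     # (0, 0)
print(price_bounds(None))                                                 # (0, 0)
```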

And now we extend views.py, and immediately set it up so that only visible products are returned:

from elasticsearch_dsl import Q
from django_elasticsearch_dsl_drf.viewsets import BaseDocumentViewSet
from django_elasticsearch_dsl_drf.constants import LOOKUP_FILTER_RANGE
from django_elasticsearch_dsl_drf.filter_backends import (
    FilteringFilterBackend,
    OrderingFilterBackend,
    DefaultOrderingFilterBackend,
    SearchFilterBackend,
)
from .documents import ProductDocument
# Assumed module names; adjust to wherever you defined these two classes
from .serializers import ProductDocumentSerializer
from .pagination import SearchPagination

class SearchProductsView(BaseDocumentViewSet):
    """Product search"""
    document = ProductDocument # Specify the document
    serializer_class = ProductDocumentSerializer
    pagination_class = SearchPagination
    lookup_field = 'id'
    filter_backends = [
        FilteringFilterBackend,
        OrderingFilterBackend,
        DefaultOrderingFilterBackend,
        SearchFilterBackend,
    ]

    # Define the fields to search in
    # and set the priority (boost) and typo tolerance (fuzziness) for each
    search_fields = {
        'name': { 'fuzziness': 'AUTO', 'boost': 2 },
        'barcode': { 'boost': 3 },
        'brand': { 'boost': 1 },
        'category': { 'boost': 1 },
    }
    
    # Define fields for filtering
    filter_fields = {
        'price': {
            'field': 'price',
            'lookups': [
                LOOKUP_FILTER_RANGE,
            ],
        },
    }
    
    # Defining fields for sorting
    ordering_fields = {
        'price': 'price',
    }
    
    # Default sort
    ordering = ['_score']
    
    def get_queryset(self):
        """Leave only visible products"""
        queryset = self.search.query(Q('term', visible=True))
        queryset.model = self.document.Django.model
        return queryset

    def paginate_queryset(self, queryset):
        """Calculate the maximum and minimum price of goods"""
        # aggs.metric() modifies the search object in place
        queryset.aggs.metric('max', 'max', field='price')
        queryset.aggs.metric('min', 'min', field='price')
        if self.paginator is None:
            return None
        return self.paginator.paginate_queryset(queryset, self.request, view=self)

Sorting by _score means that the results are returned in order of relevance to the query, most relevant first.
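
To make it easier to see what the view asks Elasticsearch for, here is an approximate request body built as a plain Python dict. This is my reconstruction from the settings above, not the exact output of the package: the function name build_search_body is hypothetical, and the real query may group the clauses differently.

```python
def build_search_body(text, price_gte=None, price_lte=None):
    """Build an approximate Elasticsearch request body for the product search."""
    # One match clause per entry in search_fields, with the same boosts
    should = [
        {"match": {"name": {"query": text, "fuzziness": "AUTO", "boost": 2}}},
        {"match": {"barcode": {"query": text, "boost": 3}}},
        {"match": {"brand": {"query": text, "boost": 1}}},
        {"match": {"category": {"query": text, "boost": 1}}},
    ]

    # Optional price range, as allowed by filter_fields
    filters = []
    price_range = {}
    if price_gte is not None:
        price_range["gte"] = price_gte
    if price_lte is not None:
        price_range["lte"] = price_lte
    if price_range:
        filters.append({"range": {"price": price_range}})

    return {
        "query": {
            "bool": {
                "must": [{"term": {"visible": True}}],  # only visible products
                "should": should,
                "minimum_should_match": 1,
                "filter": filters,
            }
        },
        # The two metrics added in paginate_queryset
        "aggs": {
            "max": {"max": {"field": "price"}},
            "min": {"min": {"field": "price"}},
        },
        "sort": ["_score"],
    }


body = build_search_body("shoes", price_gte=100, price_lte=500)
print(body["query"]["bool"]["filter"])
```

A barcode match has the highest boost, so an exact barcode hit should outrank a product that only matches by name or brand.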

Register the route in the urls.py of our API:

router.register(r'search', SearchProductsView, basename='search_products')
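
A request to the new endpoint could then look like the sketch below. The base URL is hypothetical and depends on your URL configuration. I am assuming the package defaults here: the search and ordering query parameters, and a range filter written as two values joined by a double underscore.

```python
from urllib.parse import urlencode

# Hypothetical address; depends on where the router is mounted
BASE_URL = "http://localhost:8000/api/search/"

params = {
    "search": "red shoes",       # text handled by SearchFilterBackend
    "price__range": "100__500",  # price from 100 to 500
    "ordering": "-price",        # most expensive first
}

url = f"{BASE_URL}?{urlencode(params)}"
print(url)
```

Without the ordering parameter the default sort by _score applies.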

And that’s it, the search works. Thanks for reading!
