What is Redisearch? 2coffee.dev is using redisearch as the database!

Issue

Databases are an essential part of modern websites. Most of us have heard of two main types: SQL and NoSQL. Each has its own strengths and weaknesses, depending on the use case. Redis is a NoSQL database that is commonly used for data caching purposes.

For those who don't know, my blog uses Redisearch as its database. Redisearch is a module of Redis. Before adopting Redisearch, I used MySQL. So why did I switch? Was the migration difficult, and is Redisearch easy to use? Let's explore these questions in this article...

What is Redisearch?

Redisearch is a module of Redis that provides powerful full-text search, indexing, and querying capabilities. You know the LIKE operator in SQL, right? It is used to search for data that matches a specified pattern. However, LIKE has limitations: it is slow on large datasets, and it cannot easily handle more complex searches such as tolerating spelling errors or suggesting phrases. Redisearch overcomes these limitations.

Redisearch uses inverted indexes along with compression, allowing for fast indexing with low memory costs. An "inverted index" is a type of index that stores a mapping from content, such as words or numbers, to its positions in a table, a document, or a set of documents. Its purpose is to allow fast full-text searches. Compression is applied to the index to reduce storage costs.
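To make the idea of an inverted index more concrete, here is a minimal sketch in Python. It is only an illustration of the general technique, not how Redisearch implements it internally; the function names (tokenize, build_inverted_index, search) and the sample documents are made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return the ids of documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)


# Hypothetical sample documents, keyed by id.
docs = {
    1: "hello world",
    2: "hello redis",
    3: "redis full-text search",
}
index = build_inverted_index(docs)
print(index["hello"])                # ids of documents containing "hello"
print(search(index, "hello redis"))  # ids of documents containing both words
```

Because the lookup goes from word to documents, a search only touches the entries for the words in the query instead of scanning every document, which is what a LIKE pattern has to do.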

If you have experience using Redis for data caching, then you will find that Redisearch offers both speed and powerful querying capabilities.

Why did I choose Redisearch?

The nature of a blog is that reading data is more common than writing, so I always prioritize retrieving data as quickly and accurately as possible. Moreover, with a modest server configuration of 1 CPU and 1GB RAM, memory optimization and processing speed for the system are crucial.

As mentioned earlier, I used MySQL before. In terms of stability, MySQL is very stable and has good data querying capabilities. However, it consumes a relatively large amount of memory, and its full-text search capabilities were not strong enough for my requirement of search being the main feature of the blog.

I had heard about Redisearch for a long time but never had the opportunity to explore it. While searching for a solution to the problem above, I decided to find out whether I could apply it to my project. Redisearch fully met my expectations.

Initially, I only intended to use Redisearch as a secondary database for full-text search, but later I discovered that Redisearch can handle both data storage and retrieval, making it a complete replacement for MySQL.

Redisearch is still being actively improved and is gradually gaining more community support. As a result, I ran into many difficulties when first moving from MySQL to Redisearch. However, the documentation on the Redisearch homepage is quite comprehensive, so most of the issues I encountered were resolved.

How to use Redisearch?

Installing Redisearch

The first thing you need to do is install Redisearch. There are several ways to install Redisearch, such as using Docker, installation packages, or building from source.

To install with Docker, use the following command:

docker run -p 6379:6379 redislabs/redisearch:latest

Or you can see all other ways in the Quick Start Guide for RediSearch.

Creating an index

We need to create an index before we can search. The index defines which keys and which fields Redisearch indexes and makes searchable.

Use the FT.CREATE command to create an index.

For example, I create an index named article to store articles:

FT.CREATE article ON HASH PREFIX 1 article: SCHEMA url TAG SEPARATOR "," title TEXT content TEXT WEIGHT 5.0 created_at NUMERIC SORTABLE

The index article covers data stored in the HASH data type, with keys starting with article:. It includes the following data fields: url with the TAG type, title and content with the TEXT type, and created_at as NUMERIC to store the article creation date in Unix timestamp format. Note that options such as WEIGHT come after the field type.

The field types used here are TEXT, NUMERIC, and TAG; Redisearch also supports others, such as GEO. A TAG field is matched as an exact value rather than being analyzed as full text, which makes it a good fit for identifiers such as a url, somewhat like looking up a row by a key column in SQL.

The WEIGHT parameter determines the importance of a field. The higher the WEIGHT, the more a match in that field contributes to a document's score in the search results. SORTABLE should be declared on fields you want to sort by in queries, so that sorting is efficient.
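To illustrate the effect of field weights on ranking, here is a toy scoring sketch in Python. It is an assumption-laden simplification: the real Redisearch scorers are more involved than this, and the score function, the weights, and the sample documents below are made up purely to show the direction of the effect.

```python
def score(doc, term, weights):
    """Toy score: occurrences of the term in each field times that field's weight."""
    total = 0.0
    for field, weight in weights.items():
        words = doc.get(field, "").lower().split()
        total += words.count(term.lower()) * weight
    return total


# Hypothetical weights: a match in content counts five times as much as one in title.
weights = {"title": 1.0, "content": 5.0}

# Hypothetical documents, keyed like the hashes in the index.
docs = {
    "article:1": {"title": "redis tips", "content": "notes on caching"},
    "article:2": {"title": "caching", "content": "redis redis everywhere"},
}

# Rank the documents for the term "redis", highest score first.
ranked = sorted(docs, key=lambda key: score(docs[key], "redis", weights), reverse=True)
print(ranked)
```

Here article:2 ranks first because its matches are in the heavily weighted content field, even though article:1 has the term in its title.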

Adding data to the index

HSET article:1 url "hello-world" title "hello world" content "lorem ipsum" created_at 1630245601
HSET article:2 url "hello-world-2" title "hello world 2" content "lorem ipsum 2" created_at 1630245602
HSET article:3 url "hello-world-3" title "hello world 3" content "lorem ipsum 3" created_at 1630245603
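In an application you would normally build these commands from your own data. The sketch below, in Python, only assembles the argument list for an HSET command and converts a date into a Unix timestamp; it does not connect to Redis. The helper name article_hset_args is hypothetical, and you would pass the resulting arguments to whichever Redis client you use.

```python
from datetime import datetime, timezone


def article_hset_args(article_id, url, title, content, created):
    """Build the HSET arguments for one article hash.

    `created` is assumed to be a timezone-aware datetime; it is stored
    as a whole-second Unix timestamp in the created_at field.
    """
    timestamp = int(created.timestamp())
    return [
        "HSET", f"article:{article_id}",
        "url", url,
        "title", title,
        "content", content,
        "created_at", str(timestamp),
    ]


args = article_hset_args(
    1,
    "hello-world",
    "hello world",
    "lorem ipsum",
    datetime(2021, 8, 29, 14, 0, 1, tzinfo=timezone.utc),
)
print(" ".join(args))
```

The key starts with article: so that the hash falls under the prefix declared in FT.CREATE and is picked up by the index automatically.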

Querying

Redisearch lets you search on individual data fields. You can perform exact-match searches, search for specific words or phrases, and combine conditions with the logical operators OR, AND, and NOT.

Check all the search query syntax that Redisearch supports in the Search Query Syntax.

To query based on a specific data field, use the @field syntax followed by the data to search for. For example:

Search by exact match on url (the hyphen is escaped with a backslash, because punctuation in a tag value must be escaped in the query):

FT.SEARCH article "@url:{hello\\-world}"

Search for the phrase "hello world" in title:

FT.SEARCH article '@title:"hello world"'

Search for articles with created_at greater than 1630245602:

FT.SEARCH article "@created_at:[(1630245602 +inf]"

Retrieve all articles and sort them by created_at in descending order:

FT.SEARCH article * SORTBY created_at DESC

Data stored in Redisearch is analyzed and processed according to certain rules. For example, special characters are ignored during indexing unless we intervene. To understand these rules in more detail, see Controlling Text Tokenization and Escaping.
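As an example of intervening, special characters in a TAG value can be escaped with a backslash before the value is placed in a query. The Python sketch below does this under a deliberately broad assumption: it escapes every character that is not a letter, digit, or underscore, which may be more than strictly necessary. The helper names escape_tag and tag_query are hypothetical.

```python
import re


def escape_tag(value):
    """Prefix every non-word character (anything but letters, digits, _) with a backslash."""
    return re.sub(r"([^\w])", r"\\\1", value)


def tag_query(field, value):
    """Build an exact-match query on a TAG field, e.g. @url:{hello\\-world}."""
    return f"@{field}:{{{escape_tag(value)}}}"


print(tag_query("url", "hello-world"))
```

The resulting string is the query argument you would pass to FT.SEARCH from your client library; when typing the same query inside double quotes in redis-cli, the backslash itself has to be doubled.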

Speed

In terms of indexing and search speed, Redisearch holds its own against other search tools such as Elasticsearch or Solr. In fact, its search speed is impressive.

As a concrete case, the developers on the Redis blog compared the performance of Redisearch and Elasticsearch by indexing and searching 5.6 million documents taken from Wikipedia.

[Figure: Indexing results]

[Figure: Search results with two random keywords]

For more details, you can read Search Benchmarking: RediSearch vs. Elasticsearch.

Summary

Redisearch is a powerful database that supports indexing and full-text search with optimized memory costs. I came across Redisearch while looking for a full-text search solution that could stand in for Elasticsearch, which requires a relatively powerful server.

Redisearch supports several search syntaxes across multiple data fields within an index. In addition, you can assign weights to fields to improve search relevance, and search results are ranked by their search scores.

In terms of performance, Redisearch holds its own against other search tools such as Elasticsearch or Solr.


Comments (2)

Linh Trần, 2 years ago:
Could Redis be used to replace Elasticsearch?

    Anonymous, 4 months ago (reply):
    In what case, though? One stores data in RAM and the other on disk, so comparing them is apples to oranges.

Nhí Nhố Tí, 2 years ago:
The article is rather general.

    Nhí Nhố Tí, 2 years ago (reply):
    I need more detailed articles.