PostgreSQL Full-Text Search: Concepts, SQL Usage, and Integration with Java and Python

PostgreSQL provides a powerful, native Full-Text Search (FTS) engine designed to efficiently index and search large volumes of textual data. Unlike simple string matching using LIKE or ILIKE, PostgreSQL Full-Text Search performs linguistic analysis, tokenization, stemming, stop-word removal, and relevance ranking.

This article starts by introducing Full-Text Search directly at the SQL level, then explains its internal concepts, and finally demonstrates recommended, production-grade usage from Java and Python applications using Spring Data JPA and Peewee.

Getting Started with PostgreSQL Full-Text Search (Pure SQL)

PostgreSQL Full-Text Search is built into the database and can be used without any external libraries or frameworks.

A Basic Example

Consider a simple table storing articles:

CREATE TABLE article (
  id SERIAL PRIMARY KEY,
  title TEXT,
  body TEXT
);

A basic full-text search can be performed by converting text into a tsvector and comparing it with a tsquery:

SELECT *
FROM article
WHERE to_tsvector('portuguese', title || ' ' || body)
      @@ plainto_tsquery('portuguese', 'postgres database');

This query finds rows whose text semantically matches the words “postgres” and “database”, regardless of word order, case, or inflection.

Why Not LIKE?

Traditional SQL searches often rely on:

WHERE body ILIKE '%postgres%'

This approach:

Cannot use linguistic analysis
Does not scale well on large datasets
Fails to rank results by relevance

PostgreSQL Full-Text Search solves these limitations.

Using plainto_tsquery

plainto_tsquery is designed for user input. It automatically:

Splits text into terms
Applies the language dictionary
Combines terms using logical AND

SELECT *
FROM article
WHERE to_tsvector('portuguese', body)
      @@ plainto_tsquery('portuguese', 'powerful database');

Using to_tsquery (Advanced Syntax)

When the query syntax is controlled, to_tsquery allows explicit logical operators:

SELECT *
FROM article
WHERE to_tsvector('portuguese', body)
      @@ to_tsquery('portuguese', 'postgres & (database | data)');

Prefix matching is also supported:

WHERE to_tsvector('portuguese', body)
      @@ to_tsquery('portuguese', 'postgre:*');

Core Concepts of PostgreSQL Full-Text Search

PostgreSQL Full-Text Search relies on two specialized data types:

tsvector: A normalized and indexed representation of a document. Text is tokenized, lowercased, stripped of stop words, and stemmed according to a language dictionary.

tsquery: A structured search query containing terms and logical operators such as AND (&), OR (|), NOT (!), and prefix matching (:*).

Example Transformation

The text:

PostgreSQL is a powerful database

may be transformed into a tsvector similar to:

'databas':4 'power':3 'postgresql':1

A user query such as:

powerful database

becomes a tsquery equivalent to:

'power' & 'databas'

The @@ Operator

The @@ operator is the central operator of PostgreSQL Full-Text Search:

tsvector @@ tsquery

It evaluates whether a document matches a full-text query and returns a boolean result.

Conceptually, it can be read as:

"Does this indexed document match this search query?"

The @@ operator:

Executes the full-text match
Enables the use of GIN or GiST indexes
Is required for ranking and highlighting
Has no equivalent in standard SQL or JPQL

Without @@, PostgreSQL Full-Text Search cannot function.

Indexing with GIN

For performance, PostgreSQL Full-Text Search relies on inverted indexes, most commonly GIN (Generalized Inverted Index).

A recommended approach is to store a dedicated tsvector column and index it:

ALTER TABLE article
ADD COLUMN fts tsvector
GENERATED ALWAYS AS (
  setweight(to_tsvector('portuguese', coalesce(title,'')), 'A') ||
  setweight(to_tsvector('portuguese', coalesce(body,'')), 'B')
) STORED;

CREATE INDEX idx_article_fts
ON article USING GIN (fts);

This approach avoids triggers and guarantees consistency.

Ranking and Highlighting

PostgreSQL provides ranking functions such as ts_rank and ts_rank_cd to score documents by relevance.

Highlighting is handled by ts_headline, which extracts and emphasizes relevant fragments from the original text.

Using PostgreSQL Full-Text Search from Java (Spring Data JPA)

Spring Data JPA does not support PostgreSQL Full-Text Search operators directly in JPQL. The recommended approach is to use native SQL queries while keeping entity management and CRUD operations within JPA.

Entity Mapping

@Entity
@Table(name = "article")
public class Article {

  @Id
  @GeneratedValue(strategy = GenerationType.IDENTITY)
  private Long id;

  private String title;

  @Column(columnDefinition = "text")
  private String body;

  @Column(name = "fts", insertable = false, updatable = false)
  private String fts;
}

Repository with Full-Text Search

public interface ArticleRepository extends JpaRepository<Article, Long> {

  @Query(value = """
    SELECT a.*
    FROM article a
    WHERE a.fts @@ plainto_tsquery('portuguese', :query)
    ORDER BY ts_rank(a.fts, plainto_tsquery('portuguese', :query)) DESC
    """, nativeQuery = true)
  List<Article> search(@Param("query") String query);
}

Using PostgreSQL Full-Text Search from Python (Peewee)

Peewee is a lightweight ORM that allows direct access to PostgreSQL operators and functions, making it well-suited for Full-Text Search when used with explicit SQL expressions.

Model Definition

from peewee import *
from playhouse.postgres_ext import PostgresqlExtDatabase

db = PostgresqlExtDatabase("example_db")

class BaseModel(Model):
    class Meta:
        database = db

class Article(BaseModel):
    title = TextField()
    body = TextField()
    fts = TextField(null=True)

Full-Text Search Query

from peewee import fn, SQL

query_text = "postgres database"

query = (
    Article
    .select(
        Article,
        fn.ts_rank(
            Article.fts,
            fn.plainto_tsquery('portuguese', query_text)
        ).alias('rank')
    )
    .where(
        Article.fts.op('@@')(
            fn.plainto_tsquery('portuguese', query_text)
        )
    )
    .order_by(SQL('rank').desc())
)

for article in query:
    print(article.title, article.rank)

ORM vs Search Engines

PostgreSQL Full-Text Search is sufficient for most applications, including:

Blog and article search
Documentation systems
Administrative and internal tools
Moderate autocomplete requirements

Dedicated search engines such as Elasticsearch or OpenSearch are recommended only when requirements include advanced linguistic analysis, distributed indexing, or large-scale search analytics.

Best Practices

Always use a dedicated tsvector column
Always create a GIN index on the tsvector
Prefer plainto_tsquery or websearch_to_tsquery for user input
Use to_tsquery only when the query syntax is controlled
Keep Full-Text Search logic inside PostgreSQL
Use ORM abstractions for CRUD and native SQL for search

Conclusion

PostgreSQL Full-Text Search is a mature, efficient, and production-ready solution that integrates naturally with both Java and Python ecosystems.

By starting with SQL fundamentals and layering ORM integration on top, applications can achieve high-performance, semantically rich search without the operational complexity of external search engines.

Web-Style Search with websearch_to_tsquery

PostgreSQL provides websearch_to_tsquery, a function designed to accept search strings written in a format similar to modern web search engines (such as Google).

Unlike to_tsquery, which requires explicit operators, and plainto_tsquery, which always combines terms with AND, websearch_to_tsquery supports a more expressive and user-friendly syntax.

Supported Syntax

websearch_to_tsquery understands:

Space-separated terms (implicit AND)
OR using the keyword OR
Quoted phrases
Unary minus for negation

Examples

Basic search (implicit AND):

SELECT *
FROM article
WHERE fts @@ websearch_to_tsquery('portuguese', 'postgres database');

OR operator:

WHERE fts @@ websearch_to_tsquery('portuguese', 'postgres OR mysql');

Quoted phrase:

WHERE fts @@ websearch_to_tsquery('portuguese', '"full text search"');

Negation:

WHERE fts @@ websearch_to_tsquery('portuguese', 'postgres -mysql');

Why websearch_to_tsquery Matters

This function is particularly well-suited for direct user input because:

It avoids syntax errors common with to_tsquery
It feels natural to end users
It reduces the need for custom query parsing in the application layer

For most user-facing search boxes, websearch_to_tsquery is the recommended choice.

FTS vs pg_trgm

PostgreSQL offers two powerful but fundamentally different approaches to text search: Full-Text Search (FTS) and trigram similarity via the pg_trgm extension.

Understanding their differences is essential for choosing the correct tool.

pg_trgm Overview

The pg_trgm extension indexes text by breaking it into overlapping sequences of three characters (trigrams). It enables fast similarity searches and partial matching.

To enable it:

CREATE EXTENSION IF NOT EXISTS pg_trgm;

A typical trigram index:

CREATE INDEX idx_article_trgm
ON article
USING GIN (body gin_trgm_ops);

Example query:

SELECT *
FROM article
WHERE body ILIKE '%postgre%';

Key Differences

Feature	PostgreSQL FTS (@@)	pg_trgm
Linguistic analysis	Yes (stemming, stop words)	No
Word-based search	Yes	No
Partial word matching	Limited (:*)	Yes
Typo tolerance	Low	High
Ranking by relevance	Yes (ts_rank)	Limited (similarity)
Language awareness	Yes	No
Index size	Moderate	Larger
Best use case	Semantic text search	Fuzzy / partial matching

= When to Use Full-Text Search

Use PostgreSQL Full-Text Search when:

Searching long documents or articles
Language semantics matter
You need relevance ranking
Queries are word-based
You want deterministic, explainable results

= When to Use pg_trgm

Use pg_trgm when:

Implementing autocomplete
Supporting typo-tolerant search
Matching substrings or partial words
Searching identifiers, codes, or names

Combining FTS and pg_trgm

In many real-world systems, the best solution is to combine both approaches.

Example:

Use FTS (@@) for main content search
Use pg_trgm for autocomplete or fallback fuzzy matching
Merge or rank results at the application layer

This hybrid approach often delivers search quality comparable to dedicated search engines while keeping all logic inside PostgreSQL.

POSTGRESQL

Índice

PostgreSQL Full-Text Search: Concepts, SQL Usage, and Integration with Java and Python

Getting Started with PostgreSQL Full-Text Search (Pure SQL)

A Basic Example

Why Not LIKE?

Using plainto_tsquery

Using to_tsquery (Advanced Syntax)

Core Concepts of PostgreSQL Full-Text Search

Example Transformation

The @@ Operator

Indexing with GIN

Ranking and Highlighting

Using PostgreSQL Full-Text Search from Java (Spring Data JPA)

Entity Mapping

Repository with Full-Text Search

Using PostgreSQL Full-Text Search from Python (Peewee)

Model Definition

Full-Text Search Query

ORM vs Search Engines

Best Practices

Conclusion

Web-Style Search with websearch_to_tsquery

Supported Syntax

Examples

Why websearch_to_tsquery Matters

FTS vs pg_trgm

pg_trgm Overview

Key Differences

= When to Use Full-Text Search

= When to Use pg_trgm

Combining FTS and pg_trgm

Menu de navegação

POSTGRESQL

PostgreSQL Full-Text Search: Concepts, SQL Usage, and Integration with Java and Python

Getting Started with PostgreSQL Full-Text Search (Pure SQL)

A Basic Example

Why Not LIKE?

Using plainto_tsquery

Using to_tsquery (Advanced Syntax)

Core Concepts of PostgreSQL Full-Text Search

Example Transformation

The @@ Operator

Indexing with GIN

Ranking and Highlighting

Using PostgreSQL Full-Text Search from Java (Spring Data JPA)

Entity Mapping

Repository with Full-Text Search

Using PostgreSQL Full-Text Search from Python (Peewee)

Model Definition

Full-Text Search Query

ORM vs Search Engines

Best Practices

Conclusion

Web-Style Search with websearch_to_tsquery

Supported Syntax

Examples

Why websearch_to_tsquery Matters

FTS vs pg_trgm

pg_trgm Overview

Key Differences

= When to Use Full-Text Search

= When to Use pg_trgm

Combining FTS and pg_trgm

Menu de navegação

Pesquisa