I'm using PostgreSQL full text search for a project where traditional stop words ('a', 'the', 'if' etc.) should be indexed and searchable, which is not the default behaviour. For example, I might want my users to find results for the query 'to be or not to be'.
The documentation indicates that I could achieve this by creating an empty stopwords dictionary in $SHAREDIR/tsearch_data/english.stop
(for example), but this will complicate deployment; I want to be able to configure PostgreSQL's stop word handling with SQL. Is this possible? If so, can you provide a sample SQL statement?
As per your comment on the previous answer, you can easily switch between using no stop words and all stop words. You can acheive this with a custom search configuration:
(1) You can create a custom dictionary without using the stop words file, for example:
CREATE TEXT SEARCH DICTIONARY english_stem_nostop (
Template = snowball
, Language = english
);
Note, in the above I left out the StopWords
parameter.
(2) Then create a new configuration to use your new dictionary:
CREATE TEXT SEARCH CONFIGURATION public.english_nostop ( COPY = pg_catalog.english );
ALTER TEXT SEARCH CONFIGURATION public.english_nostop
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart, hword, hword_part, word WITH english_stem_nostop;
(3) Then, when searching specify the config you want use (alternatively you can change the default_text_search_config
option each time), eg:
SELECT
title
FROM
articles
WHERE
to_tsvector('english_nostop', COALESCE(title,'') || ' ' || COALESCE(body,''))
@@ to_tsquery('english_nostop', 'how & to');
You can specify just 'english'
in the above SQL to use the normal config.
Note, in this example that using the standard configuration will result in notices because there are only stop words.
Bear in mind the following, however: