Elasticsearch
Terms
Mapping
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html
type | description |
---|---|
text | |
keyword | |
date | "format": "yyyy-MM-dd HH:mm:ss\|\|yyyy-MM-dd\|\|epoch_millis" |
boolean | |
binary | |
long | numeric |
integer | numeric |
short | numeric |
byte | numeric |
double | numeric |
float | numeric |
half_float | numeric |
scaled_float | numeric |
integer_range | range |
long_range | range |
double_range | range |
date_range | range |
ip_range | range |
geo_point | |
geo_shape | arbitrary geo shapes such as rectangles and polygons |
ip | supports CIDR notation |
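A minimal sketch of how a few of these types might appear in the "properties" part of a mapping. The field names (title, tags, created_at, price, client_ip, location) are hypothetical; only the type names and the date format string come from the table above. How the properties object is wrapped in a full mapping request depends on the Elasticsearch version, so that part is left out.

```python
import json

# Hypothetical fields; each "type" value is one of the mapping types listed above.
properties = {
    "title": {"type": "text"},
    "tags": {"type": "keyword"},
    # Several date formats can be accepted by joining them with "||".
    "created_at": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis",
    },
    # scaled_float stores the value as a long scaled by scaling_factor.
    "price": {"type": "scaled_float", "scaling_factor": 100},
    "client_ip": {"type": "ip"},
    "location": {"type": "geo_point"},
}

print(json.dumps({"properties": properties}, indent=2))
```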
Analysis
Used to register analyzers, tokenizers, and token filters.
Sample:
Analyzer
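Tokenizer{1} + TokenFilter*: exactly one tokenizer followed by zero or more token filters, optionally preceded by character filters.
Predefined tokenizers, token filters, and character filters can be combined to configure custom analyzers.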
- Standard Analyzer
- Simple Analyzer
- Whitespace Analyzer
- Stop Analyzer
- Keyword Analyzer
- Pattern Analyzer
- Language Analyzers
- Snowball Analyzer
- Custom Analyzer
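As a rough illustration, a custom analyzer could be registered in the index settings like this. The analyzer name my_analyzer is hypothetical; html_strip, standard, lowercase, and asciifolding are built-in components chosen only as examples.

```python
import json

# Index settings that register one custom analyzer.
# "my_analyzer" is a made-up name; the components are built-ins.
settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "char_filter": ["html_strip"],          # zero or more character filters
                    "tokenizer": "standard",                # exactly one tokenizer
                    "filter": ["lowercase", "asciifolding"],  # zero or more token filters
                }
            }
        }
    }
}

analyzer = settings["settings"]["analysis"]["analyzer"]["my_analyzer"]
print(json.dumps(settings, indent=2))
```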
Tokenizer
Token Filter
- Standard
- ASCII Folding
- Length
- Lowercase
- Uppercase
- NGram
- Edge NGram
- Porter Stem
- Shingle
- Stop
- Word Delimiter
- Stemmer
- Stemmer Override
- Keyword Marker
- Keyword Repeat
- KStem
- Snowball
- Phonetic
- Synonym
- Compound Word
- Reverse
- Elision
- Truncate
- Unique
- Pattern Capture
- Pattern Replace
- Trim
- Limit Token Count
- Hunspell
- Common Grams
- Normalization
- CJK Width
- CJK Bigram
- Delimited Payload
- Keep Words
- Keep Types
- Classic
- Apostrophe
- Decimal Digit
Character Filter
Query
Query API
Basic Query String
- {endpoint}/_search?q=hello&size=5
Ref:
Full Query API
- {endpoint}/_search?source={Query-as-JSON}
- curl -XGET {endpoint}/_search -d 'Query-as-JSON'
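A small sketch of how the two URL forms above could be assembled. The endpoint value is a placeholder for {endpoint}, and the match_all body is just an example query; nothing here sends a request.

```python
import json
from urllib.parse import urlencode

# Placeholder for {endpoint}; replace with a real host and index.
endpoint = "http://localhost:9200/my_index"

# Basic query string search: the query is passed in the "q" parameter.
basic_url = endpoint + "/_search?" + urlencode({"q": "hello", "size": 5})

# Full query API: the JSON body is passed URL-encoded in the "source" parameter.
body = {"query": {"match_all": {}}}
source_url = endpoint + "/_search?" + urlencode({"source": json.dumps(body)})

print(basic_url)
print(source_url)
```

The same body can be sent with curl via -d instead of the source parameter.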
Query Language
Context
- Query Context: How well does this document match this query clause?
- Filter Context: Does this document match this query clause?
Query Categories
- Leaf query clauses
  - match
  - term
  - range
- Compound query clauses
  - not
  - bool
  - dis_max
  - constant_score
Query Types
Match All Query
Full Text Queries
Match Query
Supported parameters:
- analyzer
- boost
- operator
- minimum_should_match
- fuzziness
- prefix_length
- max_expansions
- rewrite
- zero_terms_query
- cutoff_frequency
boolean (default)
- fuzziness
- zero_terms_query
- cutoff_frequency
  - relative if in [0..1), absolute if 1.0 or greater
  - applied at the per-shard level
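A boolean-type match query could be built along these lines. The helper match_query and the field name message are illustrative assumptions; the parameter names come from the list above.

```python
import json

def match_query(field, text, **params):
    """Build a match query body; extra parameters are passed through unchanged."""
    return {"query": {"match": {field: {"query": text, **params}}}}

# All terms must match, and a query reduced to zero terms by the analyzer matches everything.
body = match_query(
    "message",
    "to be or not to be",
    operator="and",
    zero_terms_query="all",
)

print(json.dumps(body, indent=2))
```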
phrase
phrase_prefix
Same as match_phrase, except that it allows for prefix matches on the last term in the text.
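For comparison, a sketch of the phrase and phrase_prefix forms written as the match_phrase and match_phrase_prefix queries. The field name message and the sample texts are hypothetical.

```python
import json

# Phrase match: the analyzed terms must appear in order.
phrase = {"query": {"match_phrase": {"message": "this is a test"}}}

# Phrase prefix: the last term ("f") is treated as a prefix;
# max_expansions caps how many terms that prefix may expand to.
phrase_prefix = {
    "query": {
        "match_phrase_prefix": {
            "message": {"query": "quick brown f", "max_expansions": 10}
        }
    }
}

print(json.dumps(phrase))
print(json.dumps(phrase_prefix))
```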
Multi-match Query
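Multi-field queries.
Types:
- best_fields: (default) matches any field and uses the _score of the best field
- most_fields: matches any field and combines the _score from each field
- cross_fields: treats fields with the same analyzer as one combined field
- phrase: runs a match_phrase query on each field and combines the _score from each field
- phrase_prefix: like phrase, but runs a match_phrase_prefix query on each field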
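A minimal multi_match sketch, assuming two hypothetical fields subject and message. The tie_breaker value is only an example.

```python
import json

# Search two hypothetical fields and score by the best-matching one.
body = {
    "query": {
        "multi_match": {
            "query": "brown fox",
            "type": "best_fields",
            "fields": ["subject", "message"],
            # Adds a fraction of the other fields' scores to the best field's score.
            "tie_breaker": 0.3,
        }
    }
}

print(json.dumps(body, indent=2))
```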
Common_terms Query
Query String Query
Full-text style query across all fields (default_field defaults to _all).
Parameters:
- query
- default_field (_all)
- default_operator (OR)
- analyzer
- allow_leading_wildcard (true)
- lowercase_expanded_terms (wildcard, prefix, fuzzy, and range queries) (true)
- enable_position_increments (true)
- fuzzy_max_expansions (50)
- fuzziness (AUTO)
- fuzzy_prefix_length (0)
- phrase_slop (0)
- boost (1.0)
- analyze_wildcard (false)
- auto_generate_phrase_queries (false)
- max_determinized_states (10000)
- minimum_should_match
- lenient (false)
- locale (ROOT)
- time_zone (Joda time zone: http://www.joda.org/joda-time/apidocs/org/joda/time/DateTimeZone.html)
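A sketch of a query_string body that overrides a few of the defaults listed above. The field name content and the query text are assumptions for illustration.

```python
import json

body = {
    "query": {
        "query_string": {
            "query": "this AND that OR thus",
            "default_field": "content",    # instead of the _all default
            "default_operator": "AND",     # instead of the OR default
            "lenient": True,               # ignore format-based failures
        }
    }
}

print(json.dumps(body, indent=2))
```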
Simple Query String Query
Never throws an exception; invalid parts of the query are discarded.
Term Level Queries
Term Query
Finds documents which contain the exact term Kimchy in the inverted index of the user field.
Optional boost
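The description above corresponds to a body like the following, shown here without and with the optional boost. The boost value 2.0 is an arbitrary example.

```python
import json

# Exact term lookup on the "user" field.
term = {"query": {"term": {"user": "Kimchy"}}}

# Same lookup with a boost; the long form wraps the term in "value".
boosted_term = {"query": {"term": {"user": {"value": "Kimchy", "boost": 2.0}}}}

print(json.dumps(term))
print(json.dumps(boosted_term))
```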
Terms Query
Range Query
- TermRangeQuery (string)
- NumericRangeQuery (number/date)

    {
      "range": {
        "age": {
          "gte": 10,
          "lte": 20,
          "boost": 2.0
        }
      }
    }
Date Math:
Date format & time zone:
Parameters:
- gte: Greater-than or equal to
- gt: Greater-than
- lte: Less-than or equal to
- lt: Less-than
- boost: Sets the boost value of the query, defaults to 1.0
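The date math and format/time zone variants might look like the following sketch. The field names (date, born, timestamp) and the sample values are hypothetical.

```python
import json

# Date math: from the start of yesterday (now-1d, rounded down to the day)
# up to, but not including, the start of today.
date_math = {"range": {"date": {"gte": "now-1d/d", "lt": "now/d"}}}

# Explicit format: tells the query how to parse the given date strings.
with_format = {
    "range": {
        "born": {"gte": "01/01/2012", "lte": "2013", "format": "dd/MM/yyyy||yyyy"}
    }
}

# Time zone: the given dates are interpreted in this zone.
with_time_zone = {
    "range": {
        "timestamp": {
            "gte": "2015-01-01 00:00:00",
            "lte": "now",
            "time_zone": "+01:00",
        }
    }
}

for q in (date_math, with_format, with_time_zone):
    print(json.dumps({"query": q}))
```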
Exists Query
Returns documents that have at least one non-null value in the original field.
Missing Query
Prefix Query
Optional boost:
Wildcard Query
Optional boost:
Regexp Query
Optional boost:
Fuzzy Query
- Levenshtein edit distance for string
- +/- margin on numeric and date
Parameters:
- fuzziness: The maximum edit distance. Defaults to AUTO.
- prefix_length: The number of initial characters which will not be “fuzzified”. This helps to reduce the number of terms which must be examined. Defaults to 0.
- max_expansions: The maximum number of terms that the fuzzy query will expand to. Defaults to 50.
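To make the edit-distance idea concrete, here is a plain Levenshtein implementation next to a fuzzy query body. The function is an illustration only, not the matching code Elasticsearch uses internally, and the field name user and the parameter values are assumptions.

```python
import json

def levenshtein(a, b):
    """Number of single-character insertions, deletions, or substitutions turning a into b."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

# Fuzzy query using the parameters described above.
fuzzy = {
    "query": {
        "fuzzy": {
            "user": {
                "value": "kimchi",
                "fuzziness": 2,
                "prefix_length": 0,
                "max_expansions": 50,
            }
        }
    }
}

print(levenshtein("kimchy", "kimchi"))
print(json.dumps(fuzzy))
```

With fuzziness 2, a term such as kimchy (distance 1 from kimchi) would be within range.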
Type Query
Filters documents matching the provided document / mapping type.
IDs Query
Filters documents that only have the provided _uid.
Compound Queries
Constant Score query
A query that wraps another query and simply returns a constant score equal to the query boost for every document in the filter. Maps to Lucene ConstantScoreQuery.
Bool Query
Occurrence types:
- must: The clause (query) must appear in matching documents and will contribute to the score.
- filter: The clause (query) must appear in matching documents. The score of the query will be ignored.
- should: The clause (query) should appear in the matching document. In a boolean query with no must clauses, one or more should clauses must match a document. The minimum number of should clauses to match can be set using the minimum_should_match parameter.
- must_not: The clause (query) must not appear in the matching documents.
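A sketch combining all four occurrence types in one bool query. The field names (user, tag, age) and values are hypothetical.

```python
import json

body = {
    "query": {
        "bool": {
            # Must match, and contributes to the score.
            "must": {"term": {"user": "kimchy"}},
            # Must match, but does not affect the score.
            "filter": {"term": {"tag": "tech"}},
            # Must not match.
            "must_not": {"range": {"age": {"gte": 10, "lte": 20}}},
            # Optional clauses; at least one is required here.
            "should": [
                {"term": {"tag": "wow"}},
                {"term": {"tag": "elasticsearch"}},
            ],
            "minimum_should_match": 1,
        }
    }
}

print(json.dumps(body, indent=2))
```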
Dis Max Query
Generates the union of documents produced by its subqueries, and scores each document with the maximum score for that document as produced by any subquery, plus a tie breaking increment for any additional matching subqueries.
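The scoring rule can be sketched as the best subquery score plus tie_breaker times the sum of the remaining scores. The helper dis_max_score is an illustration of that arithmetic, and the field name age in the query body is hypothetical.

```python
import json

def dis_max_score(scores, tie_breaker=0.0):
    """Best subquery score plus tie_breaker times the sum of the other matching scores."""
    best = max(scores)
    return best + tie_breaker * (sum(scores) - best)

body = {
    "query": {
        "dis_max": {
            "tie_breaker": 0.5,
            "queries": [
                {"term": {"age": 34}},
                {"term": {"age": 35}},
            ],
        }
    }
}

# A document matching two subqueries with scores 1.0 and 0.5:
print(dis_max_score([1.0, 0.5], tie_breaker=0.5))
print(json.dumps(body))
```

With tie_breaker left at 0, only the best subquery score counts.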
Function Score Query
score_mode:
- multiply: scores are multiplied (default)
- sum: scores are summed
- avg: scores are averaged
- first: the first function that has a matching filter is applied
- max: maximum score is used
- min: minimum score is used
boost_mode:
- multiply: query score and function score are multiplied (default)
- replace: only function score is used, the query score is ignored
- sum: query score and function score are added
- avg: average of query score and function score
- max: max of query score and function score
- min: min of query score and function score
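How the two modes combine can be sketched in plain Python. This is a simplified model: it assumes every function's filter matched and ignores per-function weights in avg. The query body uses hypothetical fields tag and likes.

```python
import operator
from functools import reduce

# score_mode: how the individual function scores are combined.
SCORE_MODES = {
    "multiply": lambda xs: reduce(operator.mul, xs, 1.0),
    "sum": sum,
    "avg": lambda xs: sum(xs) / len(xs),
    "first": lambda xs: xs[0],  # assumes the first function's filter matched
    "max": max,
    "min": min,
}

# boost_mode: how the combined function score is merged with the query score.
BOOST_MODES = {
    "multiply": lambda q, f: q * f,
    "replace": lambda q, f: f,
    "sum": lambda q, f: q + f,
    "avg": lambda q, f: (q + f) / 2,
    "max": max,
    "min": min,
}

def final_score(query_score, function_scores, score_mode="multiply", boost_mode="multiply"):
    combined = SCORE_MODES[score_mode](function_scores)
    return BOOST_MODES[boost_mode](query_score, combined)

body = {
    "query": {
        "function_score": {
            "query": {"match_all": {}},
            "functions": [
                {"filter": {"term": {"tag": "featured"}}, "weight": 2},
                {"field_value_factor": {"field": "likes", "modifier": "log1p"}},
            ],
            "score_mode": "sum",
            "boost_mode": "multiply",
        }
    }
}

print(final_score(2.0, [3.0, 4.0], "sum", "multiply"))
```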
Boosting Query
Indices Query
Limit Query
A limit query limits the number of documents (per shard) to execute on.
Joining Queries
Nested Query
Sample mapping:
Sample nested query:
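A sketch of what such a pair could look like. The nested field obj1 and its sub-fields name and count are hypothetical, and only the "properties" part of the mapping is shown.

```python
import json

# Mapping fragment: obj1 is indexed as a nested object.
mapping_properties = {"properties": {"obj1": {"type": "nested"}}}

# Nested query: both conditions must hold within the same nested object.
nested_query = {
    "query": {
        "nested": {
            "path": "obj1",
            "score_mode": "avg",  # how scores of matching nested objects are combined
            "query": {
                "bool": {
                    "must": [
                        {"match": {"obj1.name": "blue"}},
                        {"range": {"obj1.count": {"gt": 5}}},
                    ]
                }
            },
        }
    }
}

print(json.dumps(mapping_properties))
print(json.dumps(nested_query, indent=2))
```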
Has Child Query
Has Parent Query
Geo Queries
GeoShape Query
Requires the geo_shape Mapping.
Given document:
With the envelope extension:
Geo Bounding Box Query
Given document:
Query:
Geo Distance Query
Given document:
Query:
Options:
Option | Description |
---|---|
distance | The radius of the circle centred on the specified location. Points which fall into this circle are considered to be matches. The distance can be specified in various units. See the “Distance Units” section. |
distance_type | How to compute the distance. Can either be sloppy_arc (default), arc (slightly more precise but significantly slower) or plane (faster, but inaccurate on long distances and close to the poles). |
optimize_bbox | Whether to use the optimization of first running a bounding box check before the distance check. Defaults to memory, which does in-memory checks. Can also be indexed, to use an indexed value check (make sure the geo_point type indexes lat lon in this case), or none, which disables the bounding box optimization. |
_name | Optional name field to identify the query |
coerce | Set to true to normalize longitude and latitude values to a standard -180:180 / -90:90 coordinate system. (default is false). |
ignore_malformed | Set to true to accept geo points with invalid latitude or longitude (default is false). |
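A sketch of a geo distance filter using some of these options. The field pin.location, the coordinates, and the 200km radius are example values.

```python
import json

body = {
    "query": {
        "bool": {
            "must": {"match_all": {}},
            "filter": {
                "geo_distance": {
                    "distance": "200km",     # radius around the point below
                    "distance_type": "arc",  # more precise, slower than the default
                    "pin.location": {"lat": 40, "lon": -70},
                }
            },
        }
    }
}

print(json.dumps(body, indent=2))
```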
Geo Distance Range Query
Geo Polygon Query
Options:
Option | Description |
---|---|
_name | Optional name field to identify the filter |
coerce | Set to true to normalize longitude and latitude values to a standard -180:180 / -90:90 coordinate system. (default is false). |
ignore_malformed | Set to true to accept geo points with invalid latitude or longitude (default is false). |
Geohash Cell Query
Geohashes need to be indexed:
Specialized Queries
More Like This Query
Template Query
Based on Mustache.
Stored template:
Or:
Script Query
Span Query
Span Term Query
Span Multi Term Query
Span First Query
Span Near Query
Span Or Query
Span Not Query
Span Containing Query
mapping
Mapping Options:
Option | Description | Default |
---|---|---|
tree | geohash / quadtree | geohash |
precision | Distance with unit: in, inch, yd, yard, mi, miles, km, kilometers, m, meters, cm, centimeters, mm, millimeters | meters |
tree_levels | | 50m |
strategy | The approach for how to represent shapes at indexing and search time | recursive |
distance_error_pct | Hint for how precise the prefix tree should be | 0.025 (2.5%) |
orientation | Optionally define how to interpret vertex order for polygons / multipolygons | ccw |
points_only | | false |