Axiom Docs

The phrases aggregation extracts and counts common phrases or word sequences from text fields across a dataset. It analyzes text content to identify frequently occurring phrases, helping you discover patterns, trends, and common topics in your data. You can use this aggregation to identify common user queries, discover trending topics, extract key phrases from logs, or analyze conversation patterns in AI applications.

For users of other query languages

If you come from other query languages, this section explains how to adjust your existing queries to achieve the same results in APL.

Splunk SPL users

In Splunk SPL, there’s no built-in phrases function, but you might use the rare or top commands on tokenized text.

| rex field=message max_match=0 "(?<words>\w+)"
| top words
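
In APL, a rough equivalent can be sketched with the phrases aggregation, using the syntax described under Usage. The dataset name logs and the message field are assumptions chosen to mirror the Splunk example. Omitting max_phrases relies on the documented default of 10, which lines up with the default result count of the Splunk top command.

```kusto
// Hypothetical dataset and field names, mirroring the Splunk example.
['logs']
| summarize phrases(message)
```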

ANSI SQL users

In ANSI SQL, you would need complex string manipulation and grouping to extract common phrases.

SELECT 
  phrase,
  COUNT(*) as frequency
FROM (
  SELECT UNNEST(SPLIT(message, ' ')) as phrase
  FROM logs
)
GROUP BY phrase
ORDER BY frequency DESC
LIMIT 10
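
A possible APL counterpart is sketched below. The logs dataset and message field are assumed names carried over from the SQL example, and common_phrases is an arbitrary output column name. Passing 10 as max_phrases plays the role of LIMIT 10. The SQL counts single space-separated words, so the two results aren't expected to match exactly.

```kusto
// 'logs' and 'message' are assumed names taken from the SQL example.
['logs']
| summarize common_phrases = phrases(message, 10)
```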

Usage

Syntax

summarize phrases(column, max_phrases)

Parameters

column (string, required): The column containing text data from which to extract phrases.
max_phrases (long, optional): The maximum number of top phrases to return. Default is 10.

Returns

Returns a dynamic array containing the most common phrases found in the specified column, ordered by frequency.

Use case examples

Log analysis

Extract common URL patterns to understand which endpoints are most frequently accessed.

Query

['sample-http-logs']
| where status == '404'
| summarize common_404_paths = phrases(uri, 20)

Output

common_404_paths
["/api/v1/users/profile", "/assets/old-logo.png", "/docs/deprecated", …]

This query identifies the most common 404 error paths, helping you fix broken links or redirect old URLs.

List of related functions

make_list: Creates an array of all values. Use this when you need all occurrences rather than common phrases.
make_set: Creates an array of unique values. Use this for distinct values without frequency analysis.
topk: Returns top K values by a specific aggregation. Use this for numerical top values rather than phrase extraction.
count: Counts occurrences. Combine with group by for manual phrase counting if you need more control.
dcount: Counts distinct values. Use this to understand the variety of phrases before extracting top ones.
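
The manual counting approach mentioned for count can be sketched as follows, reusing the sample-http-logs dataset and the uri and status fields from the use case example. The column name hits is an arbitrary choice. This counts whole field values and doesn't extract phrases, so treat it as a point of comparison, not a replacement for phrases.

```kusto
// Count each distinct uri value among 404 responses, then keep the 20 most frequent.
['sample-http-logs']
| where status == '404'
| summarize hits = count() by uri
| sort by hits desc
| take 20
```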
