Skip to content

engine.queries

Purpose

The engine.queries dataset logs detailed information about the execution of queries within your environment. It captures both semantic details (e.g., query structure, labels, joins) and execution-level statistics (e.g., performance metrics, errors, resource usage). This dataset is invaluable for query performance analysis, helping teams investigate slow queries, diagnose failures, and track usage patterns. By understanding the nuances of query execution, teams can identify inefficiencies, optimize query performance, and ensure the reliability of their data operations.

Schema description

Full JSON pathField data typeField data exampledescription
clientInfo.userEmailString"alex@acme.io"The email address of the user who issued the query.
clientInfo.originatingTeamIdNumber42Internal team ID that owns or initiated the query.
queryInfo.semanticLabelsObject{...}Aggregated booleans describing structural/query features.
queryInfo.semanticLabels.containsFreeTextSearchBooleantrueWhether the query includes full-text search elements.
queryInfo.semanticLabels.containsUnionsBooleanfalseWhether SQL-style UNION clauses are present.
queryInfo.semanticLabels.containsAggregationsBooleantrueWhether aggregate functions (e.g., SUM, COUNT, AVG) are used.
queryInfo.semanticLabels.nonDefaultLimitRequestedBooleantrueIndicates a non-default result size was requested.
queryInfo.semanticLabels.containsWildTextSearchBooleanfalseWhether wildcard search terms (e.g., *, %) are used.
queryInfo.semanticLabels.containsGroupingSetsBooleanfalseWhether grouping sets or similar constructs are used.
queryInfo.semanticLabels.containsJoinsBooleantrueWhether the query joins multiple datasets/tables.
queryInfo.semanticLabels.containsExtractBooleantrueWhether extract operations (e.g., parsing time/strings) are used.
queryInfo.semanticLabels.extractBeforeFilterBooleanfalseWhether the extract occurs before filtering in execution order.
queryInfo.querySyntaxString"SQL"The syntax format used (e.g., SQL, DPL, Lucene).
queryInfo.sourcesArray\[{"fqDataset":"system/logs.app","teamId":"sre", "timeFrame":{...}}]Source datasets and configurations used in the query.
queryInfo.sources.fqDatasetString"system/logs.app"Fully-qualified dataset name.
queryInfo.sources.teamIdString"sre"Team responsible for the dataset (optional).
queryInfo.sources.timeFrameObject{"start":"2025-09-01T00:00:00Z","end":"2025-09-01T01:00:00Z","durationMs":3600000}Selected time range for the source.
queryInfo.sources.timeFrame.startString (Date)"2025-09-01T00:00:00Z"Start timestamp of the source’s time range.
queryInfo.sources.timeFrame.endString (Date)"2025-09-01T01:00:00Z"End timestamp of the source’s time range.
queryInfo.sources.timeFrame.durationMsNumber3600000Total timeframe duration in milliseconds.
queryInfo.sources.scopeExpressionString"service='api' AND env='prod'"Expression that filters/scopes the source data.
queryInfo.queryOutcomeObject{...}Final execution outcome details.
queryInfo.queryOutcome.errorMessageString"Syntax error near 'FROM'"Optional error message if the query failed.
queryInfo.queryOutcome.outputRowCountNumber124Number of rows returned.
queryInfo.queryOutcome.statusEnum"Completed"Final status: Completed, Failed, Cancelled, or Incomplete.
queryInfo.queryOutcome.storageObject{...}Storage/engine resource stats and limits reached.
queryInfo.queryOutcome.storage.aggBucketsLimitReachedBooleanfalseWhether the aggregation bucket limit was hit.
queryInfo.queryOutcome.storage.bytesShuffledNumber987654321Bytes shuffled during execution.
queryInfo.queryOutcome.storage.scrollTimeoutReachedBooleanfalseTrue if the scroll operation timed out.
queryInfo.queryOutcome.storage.blocksLimitReachedBooleanfalseTrue if an internal blocks limit was hit.
queryInfo.queryOutcome.storage.filesReadLimitReachedBooleanfalseTrue if a hard limit on files read was reached.
queryInfo.queryOutcome.storage.scanLimitReachedBooleanfalseTrue if the data scan exceeded the configured threshold.
queryInfo.queryOutcome.storage.columnLimitReachedBooleanfalseWhether the column limit was hit.
queryInfo.queryOutcome.storage.shuffleSizeLimitReachedBooleanfalseWhether the shuffle size limit was exceeded.
queryInfo.queryOutcome.storage.bytesScannedNumber3456789012Total bytes scanned by the engine.
queryInfo.queryOutcome.failureTypeEnum"bad request"Failure reason (if applicable).
queryInfo.queryOutcome.failureClassEnum"clientError"Error class: clientError or serverError.
queryInfo.queryOutcome.e2eDurationMsNumber842End-to-end execution time (ms).
queryInfo.queryIdString"q-2025-09-04-abc123"Unique identifier for the query execution.
queryInfo.queryBlueprintsObject{...}Normalized representations of query components.
queryInfo.queryBlueprints.queryTextSearchFiltersString"text:\"payment failed\""Representation of text-based filters.
queryInfo.queryBlueprints.queryNoLiteralsString"SELECT * FROM logs WHERE status = ?"Query string with literals removed.
queryInfo.queryBlueprints.queryLabelFiltersString"service=api, env=prod"Normalized label-based filters.
queryInfo.defaultTimeFrameObject{"start":"2025-09-01T00:00:00Z","end":"2025-09-01T06:00:00Z","durationMs":21600000}Default time range if none is specified.
queryInfo.defaultTimeFrame.startString (Date)"2025-09-01T00:00:00Z"Default start time.
queryInfo.defaultTimeFrame.endString (Date)"2025-09-01T06:00:00Z"Default end time.
queryInfo.defaultTimeFrame.durationMsNumber21600000Default time range duration in milliseconds.
queryInfo.queryTextString"SELECT count(*) FROM logs WHERE service='api' AND env='prod'"Original raw query issued by the user.
queryInfo.queryEngineString"DPL"Backend engine that executed the query (e.g., DPL, SQL).

How the data in this dataset can be used

Query performance optimization

By analyzing fields like queryOutcome.status, queryOutcome.storage.bytesScanned, and queryOutcome.storage.aggBucketsLimitReached, users can identify slow or inefficient queries. For instance, if the bytesScanned is very high, the query might need optimization to reduce the data being processed.

Example query:

source system/engine.queries
| filter queryOutcome.status == 'completed'
| filter queryOutcome.storage.bytesScanned > 1000000
| groupby queryInfo.queryHash
    aggregate avg(queryOutcome.storage.bytesScanned) as avg_bytes_scanned
| sortby avg_bytes_scanned desc

Error diagnosis and root cause investigation

If queries frequently fail, analyzing queryOutcome.errorMessage and queryOutcome.failureType can provide insights into common issues, such as timeouts or resource limits.

Example query:

source system/engine.queries
| filter queryOutcome.status == 'failed'
| groupby queryOutcome.failureType
    aggregate count() as failures
| sortby failures desc

Usage pattern analysis

By looking at clientInfo.originatingTeamId, queryInfo.semanticLabels.containsJoins, and queryInfo.sources.fqDataset, users can track query usage trends across different teams and datasets. This helps understand which data sources or query structures are being used most frequently.

Example query:

source system/engine.queries
| filter queryInfo.semanticLabels.containsJoins == true
| groupby clientInfo.originatingTeamId aggregate count() as queries
| sortby queries desc

engine.queries Schema

{ clientInfo
userEmail

type: string
The email address of the user who issued the query.

originatingTeamId

type: number
The internal team ID that owns or initiated the query.

}
{ queryInfo
{ semanticLabels
containsFreeTextSearch

type: boolean
Whether the query includes full-text search elements.

containsUnions

type: boolean
Whether the query includes SQL-style UNION clauses.

containsAggregations

type: boolean
Whether the query uses aggregation functions like SUM, COUNT, AVG.

nonDefaultLimitRequested

type: boolean
Indicates if the query requests a result size different from the default.

containsWildTextSearch

type: boolean
Whether the query includes wildcard search terms (e.g., *, %).

containsGroupingSets

type: boolean
Whether grouping sets or similar constructs are used in the query.

containsJoins

type: boolean
Whether the query involves joining multiple datasets or tables.

containsExtract

type: boolean
Whether the query includes extract operations (e.g., parsing time or string fields).

extractBeforeFilter

type: boolean
Whether the extract operation occurs before filtering in execution order.

}
querySyntax

type: string
The syntax format used in the query (e.g., SQL, Lucene).

{ sources

type: array
A list of source datasets and their configurations used in the query.

{ items
fqDataset

type: string
The fully qualified name of the dataset.

teamId

type: string
The team responsible for the dataset (optional).

{ timeFrame
start

type: string
Start timestamp of the source's time range.

end

type: string
End timestamp of the source's time range.

durationMs

type: number
Total duration of the selected timeframe in milliseconds.

}
scopeExpression

type: string
Expression that filters or scopes the source data (e.g., labels or conditions).

}
}
{ queryOutcome
errorMessage

type: string
Optional error message if the query failed.

outputRowCount

type: number
Number of rows returned by the query.

status

Enum: Completed, Failed, Cancelled, Incomplete
The final status of the query execution.

{ storage
aggBucketsLimitReached

type: boolean
Indicates whether the aggregation bucket limit was hit.

bytesShuffled

type: number
Amount of data shuffled during execution, in bytes.

scrollTimeoutReached

type: boolean
True if the scroll operation timed out before completing.

blocksLimitReached

type: boolean
True if the query hit a limit on internal data blocks processed.

filesReadLimitReached

type: boolean
True if a hard limit on the number of files read was reached.

scanLimitReached

type: boolean
True if the data scan exceeded the configured threshold.

columnLimitReached

type: boolean
Indicates whether the column limit was hit during execution.

shuffleSizeLimitReached

type: boolean
Indicates if the shuffle size limit was exceeded.

bytesScanned

type: number
Total number of bytes scanned by the query engine.

}
failureType

Enum: bad request, rate limit reached, business timeout, not found, permission denied, internal, resource exhausted, internal death, query failed, query timed out
The reason the query failed, if applicable.

failureClass

Enum: clientError, serverError
Classification of the error as client-side or server-side.

e2eDurationMs

type: number
Total execution time from request to response in milliseconds.

}
queryId

type: string
A unique identifier for the query execution.

{ queryBlueprints
queryTextSearchFilters

type: string
A representation of the query's text-based filters.

queryNoLiterals

type: string
The query string with literals removed for comparison/normalization.

queryLabelFilters

type: string
A normalized form of the label-based filters in the query.

}
{ defaultTimeFrame
start

type: string
Default time range start if none is specified.

end

type: string
Default time range end if none is specified.

durationMs

type: number
Total default time range duration in milliseconds.

}
queryText

type: string
The original raw query as issued by the user.

queryEngine

type: string
The backend engine that executed the query (e.g., DPL, SQL).

}

Here’s the flattened one-table view for your new schema, merged into the same style: