Commands
Commands in DataPrime are the fundamental building blocks for operating on your data. They support a wide range of manipulations and transformations that help you refine, structure, and analyze your data to meet specific requirements. Commands can be composed, with each one transforming the data further until the desired result is achieved.
Key types of operations
- Data/Structure Manipulations: Modify or adjust the structure of your dataset to suit your analysis needs.
- Filtering and Searching: Refine your data by including or excluding specific entries based on defined criteria.
- Aggregations: Summarize your data by calculating metrics such as sums, averages, counts, etc.
- Data Selection/Projection: Choose specific fields or columns to focus on from your dataset.
- Sorting: Order your data in a specific manner, based on defined criteria like ascending or descending order.
- Joins/Unions: Combine data from different sources or datasets, merging or linking them based on common keys.
- Datatype Conversions: Convert data from one type to another, ensuring compatibility for further processing.
- Data Extraction (semi-structured to structured): Transform semi-structured data into a fully structured format, making it easier to work with and analyze.
Please refer to the Command Language Reference for a detailed list of available commands and their use cases.
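To make the idea of composing operations concrete, here is a minimal Python sketch rather than DataPrime itself. The helper names (pipeline, filter_by, sort_by, select) and the sample data are hypothetical and only model the behaviour described above: each step receives the output of the previous step.

```python
from typing import Callable

Doc = dict
Command = Callable[[list], list]


def pipeline(docs: list, *commands: Command) -> list:
    """Apply each command in order, feeding it the previous command's output."""
    for command in commands:
        docs = command(docs)
    return docs


def filter_by(predicate: Callable[[Doc], bool]) -> Command:
    """Filtering: keep only the documents for which the predicate is true."""
    return lambda docs: [d for d in docs if predicate(d)]


def sort_by(key: str) -> Command:
    """Sorting: order documents by one field, ascending."""
    return lambda docs: sorted(docs, key=lambda d: d[key])


def select(*keys: str) -> Command:
    """Selection/projection: keep only the named fields."""
    return lambda docs: [{k: d[k] for k in keys if k in d} for d in docs]


# Hypothetical sample data, for illustration only.
users = [
    {"name": "john", "age": 48, "country": "il"},
    {"name": "jane", "age": 20},
    {"name": "sophia", "age": 70, "country": "us", "city": "San Francisco"},
    {"name": "chris", "age": 30, "country": "uk", "city": "Manchester"},
]

result = pipeline(
    users,
    filter_by(lambda d: "country" in d),  # drop documents without a country
    sort_by("age"),                       # youngest first
    select("name", "age"),                # keep two fields
)
print(result)
```

Each helper returns a function from documents to documents, which is what lets the steps be chained in any order.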
Example command queries
Commands cover a wide range of functionality. The examples in this section show three common tasks: filtering documents, creating a new field, and redacting sensitive information.
Simple filtering using filter
Consider the following documents:
{ "name": "john", "age": 48 , "country": "il" }
{ "name": "jane", "age": 20 }
{ "name": "sophia", "age": 70 , "country": "us", "city": "San Francisco" }
{ "name": "chris", "age": 30 , "country": "uk", "city": "Manchester" }
These documents represent the names, ages, countries, and cities of some users. We can filter this data in a number of different ways. For example, we can filter by age, keeping only users who are 30 or older. This returns the documents for john, sophia, and chris: sophia and john are over the age of 30, and chris is exactly 30. jane, who is 20, is excluded.
Filtering can be done on any expression that returns a boolean value, meaning much more complex calculations can be used as the predicate for filtering.
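The age filter described above would presumably be written in DataPrime along the lines of `filter age >= 30`; that form is an assumption here and should be checked against the Command Language Reference. The Python sketch below only models the semantics: a filter keeps every document for which a boolean predicate is true, and the predicate can be as complex as needed. The helper name filter_docs is hypothetical.

```python
from typing import Callable

users = [
    {"name": "john", "age": 48, "country": "il"},
    {"name": "jane", "age": 20},
    {"name": "sophia", "age": 70, "country": "us", "city": "San Francisco"},
    {"name": "chris", "age": 30, "country": "uk", "city": "Manchester"},
]


def filter_docs(docs: list, predicate: Callable[[dict], bool]) -> list:
    """Keep only the documents for which the predicate returns True."""
    return [doc for doc in docs if predicate(doc)]


# Simple predicate: aged 30 or older (chris, at exactly 30, is kept).
aged_30_plus = filter_docs(users, lambda d: d["age"] >= 30)
names_30_plus = [d["name"] for d in aged_30_plus]

# More complex predicate: strictly over 30 AND has a city field.
over_30_with_city = filter_docs(users, lambda d: d["age"] > 30 and "city" in d)
names_with_city = [d["name"] for d in over_30_with_city]

print(names_30_plus)
print(names_with_city)
```

Because the predicate is just a boolean expression, combining conditions needs no new machinery, only a longer expression.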
Creating a new field using create
Consider the following documents:
{ "name": "john", "age": 48 , "country": "il" }
{ "name": "jane", "age": 20 }
{ "name": "sophia", "age": 70 , "country": "us", "city": "San Francisco" }
Assume we need to visualize the age of each individual in days. We can approximate this crudely by multiplying age by 365 and storing the result in a new field, age_days, using the create command.
This will result in the following documents:
{ "name": "john", "age": 48, "age_days": 17520, "country": "il" }
{ "name": "jane", "age": 20, "age_days": 7300 }
{ "name": "sophia", "age": 70, "age_days": 25550, "country": "us", "city": "San Francisco" }
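In DataPrime this calculation would presumably look something like `create age_days from age * 365`; that form is an assumption and should be verified against the Command Language Reference. The Python sketch below models only the effect: every document gains a new field computed from its existing fields. The helper name create_field is hypothetical.

```python
from typing import Any, Callable

users = [
    {"name": "john", "age": 48, "country": "il"},
    {"name": "jane", "age": 20},
    {"name": "sophia", "age": 70, "country": "us", "city": "San Francisco"},
]


def create_field(docs: list, name: str, expression: Callable[[dict], Any]) -> list:
    """Return copies of the documents with one extra field computed per document."""
    return [{**doc, name: expression(doc)} for doc in docs]


# Crude approximation: ignores leap years and the time since the last birthday.
with_days = create_field(users, "age_days", lambda d: d["age"] * 365)

for doc in with_days:
    print(doc)
```

The original documents are left untouched; each result is a new dictionary with age_days added.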
Redacting sensitive information using redact
Consider the following documents:
{ "name": "john", "msg": "John's email is [email protected]"}
{ "name": "jane", "msg": "Jane's email is [email protected]"}
{ "name": "sophia", "msg": "Sophia's email is [email protected]"}
To redact the email address from each msg field, we can use the redact command to replace every match of an email pattern with REDACTED.
This will result in the following documents:
{ "name": "john", "msg": "John's email is REDACTED"}
{ "name": "jane", "msg": "Jane's email is REDACTED"}
{ "name": "sophia", "msg": "Sophia's email is REDACTED"}
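The effect of redaction can be sketched in Python as a regular-expression substitution. Everything here is illustrative: the email addresses are hypothetical placeholders (the original addresses are obscured in this page), the pattern is a deliberately simple assumption rather than a full email validator, and the helper name redact_field is not part of DataPrime.

```python
import re

# Simplified email pattern (assumption): local part, "@", domain, dot, TLD.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical input documents.
logs = [
    {"name": "john", "msg": "John's email is john@example.com"},
    {"name": "jane", "msg": "Jane's email is jane@example.com"},
    {"name": "sophia", "msg": "Sophia's email is sophia@example.com"},
]


def redact_field(docs: list, field: str, pattern: re.Pattern,
                 replacement: str = "REDACTED") -> list:
    """Replace every match of pattern in the given string field of each document."""
    redacted = []
    for doc in docs:
        value = doc.get(field)
        if isinstance(value, str):
            doc = {**doc, field: pattern.sub(replacement, value)}
        redacted.append(doc)
    return redacted


clean = redact_field(logs, "msg", EMAIL_PATTERN)
for doc in clean:
    print(doc)
```

Documents that lack the field, or hold a non-string value in it, pass through unchanged, which keeps the sketch safe on semi-structured data.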