JSON Search Engine
JSON-SQL
JSON-SQL is a Typescript query processor and execution engine that performs analytics using a SQL-like language in a JSON format.
It supports many basic analytics commands, including:
- column selection
- filtering
- ordering
- aggregate functions
It also supports nested queries and multiple ordering.
Table of Contents:
See here for the Github page. Based on the Insight UBC class project.
How to Use
The query syntax is based on JSON. Simple commands can be nested together to form more complex queries.
Here is an example of a simple select where
statement:
{
"WHERE": {
"GT": {
"courses_avg": 95
}
},
"OPTIONS": {
"COLUMNS": [
"courses_avg",
"courses_title"
]
}
}
More complex queries can be formed by nesting together queries with AND
or OR
syntax:
{
"WHERE": {
"AND": [{
"IS": {
"courses_dept": "cpsc"
}},
{
"GT": {
"courses_avg": 90
}
}]
},
"OPTIONS": {
"COLUMNS": [
"courses_dept",
"courses_avg",
"courses_title"
]
}
}
Results can be order by columns using the ORDER
option:
{
"WHERE": {
"LT": {
"courses_avg": 10
}
},
"OPTIONS": {
"COLUMNS": [
"courses_avg",
"courses_dept",
"courses_instructor",
"courses_title",
"courses_pass",
"courses_fail",
"courses_audit",
"courses_year"
],
"ORDER": {
"dir": "DOWN",
"keys": [
"courses_instructor",
"courses_dept"
]
}
}
}
We can also perform aggregation by using TRANSFORMATIONS
:
{
"WHERE": {
"GT": {
"courses_avg": 95
}
},
"OPTIONS": {
"COLUMNS": [
"courses_avg",
"courses_dept",
"overallSum"
],
"ORDER": {
"dir": "UP",
"keys": [
"courses_avg",
"courses_dept"
]
}
},
"TRANSFORMATIONS": {
"GROUP": [
"courses_avg",
"courses_dept"
],
"APPLY": [
{
"overallSum": {
"SUM": "courses_avg"
}
}
]
}
}
Supported Features
The project currently supports the following filters and options:
WHERE
The WHERE
keyword is used to filter columns based on a given condition. It takes supports only one condition, but can be expanded by using AND
or OR
keywords.
Syntax:
{
"WHERE": {
"<filter-keyword>": {
"<column>": 0
}
}
}
GT
The GT
keyword is used to filter for records greater than a numeric value in the specified column.
Example:
{
"WHERE": {
"GT": {
"year": 2017
}
}
}
LT
The LT
keyword is used to filter for records less than a numeric value in the specified column.
Example:
{
"WHERE": {
"LT": {
"year": 2017
}
}
}
EQ
The EQ
keyword is used to filter for records equal to a numeric value in the specified column.
Example:
{
"WHERE": {
"EQ": {
"year": 2017
}
}
}
IS
The IS
keyword is used to filter for records that match a string value in the specified column.
Example:
{
"WHERE": {
"GT": {
"year": "2017"
}
}
}
AND
The AND
keyword is used to add more filter conditions to the query. It will select records that satisfy all given conditions.
Example:
{
"WHERE": {
"AND": [{
"EQ": {
"year": 2017
}
},
{
"GT": {
"age": 23
}
}]
}
}
OR
The OR
keyword is used to add more filter conditions to the query. It will select the records that satisfy any given conditions.
Example:
{
"WHERE": {
"OR": [{
"EQ": {
"year": 2017
}
},
{
"GT": {
"age": 23
}
}]
}
}
OPTION
The OPTION
keyword is used to attach options to the query. This includes column selection, multiple ordering, and aggregate functions.
Syntax:
{
"OPTIONS": {
"<option-keyword>": {
...
}
}
}
COLUMNS
The COLUMNS
keyword is used to select items from a query. Columns are listed by name in the query, and these will be the values returned by the query engine.
Example:
{
"OPTIONS": {
"COLUMNS": [
"year",
"age",
"title"
]
}
}
ORDER
The ORDER
column is used to order the records by ascending or descending order. It supports both single selection and multiple selection.
Single Selection
You can select one column to order by ascending order by directly providing a column name in the ORDER
section.
Example:
{
"OPTIONS": {
"ORDER": "year"
}
}
Multiple Selection
You can select multiple columns by using the dir
and keys
keywords.
Example: Example:
{
"OPTIONS": {
"ORDER": {
"dir": "UP",
"keys": [
"year",
"age"
]
}
}
}
TRANSFORMATIONS
Transformations can be applied in the query to better analyze the dataset. Insight supports aggregate functions via keywords. See below for each keyword.
MAX
The MAX
keyword is used to find the maximum value of a column in a collection of records.
MIN
The MIN
keyword is used to find the minimum value of a column in a collection of records.
SUM
The SUM
keyword is used to find the sum of the values of a column in a collection of records.
AVG
The AVG
keyword is used to find the average value of a column in a collection of records.
COUNT
The COUNT
keyword is used to find the number of records for a given column in the collection of records.
REST API Service Example
The example service is currently implemented as a local REST API service. A dataset is first preloaded into the service with a HTTP PUT
request. The data is then extracted and loaded into memory for searching.
When HTTP requests are sent to the endpoint, it parses the JSON request and queries the dataset for entries that match the request.
If the corresponding values are found, the service returns a 200
status code and the requested values in the body of the response.
If no valid values are found, it will return a 200
status code with an empty array in the body of the response.
If the request is formatted incorrectly, it will return a 4XX
status code representing request was invalid or query execution failed.
Clone the code base and run npm start
to start the service locally.