Implementing a Tag Database: Part 1
Background
Skyspark is a popular analytics platform for the energy management & building services industry. It’s built for the purpose of analysing and reporting on a building’s efficiency and asset health by tapping in to the historical data on site.
In my current role, I use it to monitor the HVAC performance as well as the energy consumption on site to provide performance benchmarks and aid the decision making process for the facilities managers.
My approach to learning new opensource tools has always been to implement some component of infrastructure I’m familiar with. Today, I’ll begin implementing Skyspark’s database for storing tagged records. It’s designed to store and serve Haystack compliant entities to model the relationships and layout of a building’s assets.
The end result will consist of a database that accepts haystack filters and returns the records matching the query. It will support:
- haystack filters
- SQL-like HAS, AND & OR queries
- diff, (add,remove and modify)
- tags (called markers)
- haystack datatypes
Design
Redis will be used as the opensource back-end. It has some useful features we will take advantage of such as the ability to store record ID’s in a set data structure and also the ability to use Lua scripts to simplify EVAL queries and minimise network IO for batched queries.
After some thought, a few of Redis’s limitations became apparent. It only supports strings as a primitive type for it’s values. No SQL like lookups with the ability to filter by value. Lua scripts and transactions block other operations on the database. Redis also includes a debugger for Lua.
We need the ability to add new records, modify existing records, and retrieve them by matching tags.
Setup
- Download Redis (I used docker) * docker pull redis
- Download ZeroBrane Studio
- Download Redis extension here * See https://youtu.be/7mlajCj4QPw for instructions
Implementation
Once Redis is up and running and ZeroBrane is set up with our swanky new Redis extension. We can implement our first database operation: recAdd.lua
--[[
recAdd adds haystack style records to the database
Must provide 1 unique key for the new record.
Should fail if record already exists.
TODO
* Also store values
* Create schema so that recs can be retrieved based on value
]]--
local id = KEYS[1]
local length = table.getn(ARGV)
local existing_rec = redis.call('SISMEMBER', 'REC:GLOBAL', id)
if existing_rec == 0 then
-- Add REC ID to global space
local global_res = redis.call('SADD', 'REC:GLOBAL', id)
for i=1, #ARGV do
local _, _, tag, typ, value = string.find(ARGV[i] , "(%a+):(%a+):(.*)")
-- Add REC ID to tag space
local tag_res = redis.call('SADD', 'REC:TAG:' .. tag, id)
end
else
error("Record cannot be inserted as ID: '".. id .. "' already exists")
end
return 1
The data model is based on sets by adding the new record’s ID to the set, we can test for membership and derive the haystack and & or queries needed to evaluate a standard filter. I’m simulating a namespace for the sets by prefixing strings that hold meaning to the application:
REC:TAG:{tagName}
: Contains a set with the IDs of all records that contain a value under the tag {tagName}.REC:GLOBAL
: Global set containing the IDs of all records in the database.
REC:GLOBAL is used to test whether or not the record already exists. If no record with that ID exists already in the database, the tag specs for the given record are processed and added under their own namespace REC:TAG:{tagName}
.
REC:TAG:{tagName}
is used to perform the set operations intersection
and union
that will be called when the and
& or
operators are processed by the client.
Running the code for the first time with the parameters: NEW_ID , num:int:1 created:date:20200101
tells the script that we have a new record with “NEW_ID”
for the ID and two tags: num
and created
. We’re employing a similar strategy to the namespacing mechanism so we can account for the tag values and datatypes later on. For now, we’re not storing the datatype and the value.
Redis prompt
127.0.0.1:6379> KEYS *
1) "REC:TAG:created"
2) "REC:TAG:num"
3) "REC:GLOBAL"
127.0.0.1:6379> SMEMBERS REC:TAG:created
1) "NEW_ID"
127.0.0.1:6379> SMEMBERS REC:TAG:num
1) "NEW_ID"
127.0.0.1:6379> SMEMBERS REC:GLOBAL
1) "NEW_ID"
127.0.0.1:6379>
Conclusion
So far, we’ve successfully implemented the recAdd Haystack/Skyspark operation with Lua aided Redis. Tune in soon for the next episode.
Citation
@online{andronikos2020,
author = {Andronikos, Christopher},
title = {Implementing a {Tag} {Database:} {Part} 1},
date = {2020-05-15},
url = {https://candronikos.com/posts/2020-05-10-implementing-tag-database-part-1/},
langid = {en}
}