See how ArangoDB compares to other database management systems and how to use ArangoDB to build and operate a multi-level nested document system.
As a software development company, we often work on complex applications that need to handle lots of data. Recently, on one of our projects, we faced a challenge: we had a lot of data on many levels, and we had to be able to operate directly on these documents.
Do you want to know how we resolved our problem?
Join us on our journey to find out.
But first, let me introduce our background.
At Brainhub we specialize in building apps with JavaScript and we do so using the following technologies:
We have a lot of data on many levels, which means, in a document model, many levels of nested documents. Moreover, we have to be able to operate directly on these nested documents (children, grandchildren, great-grandchildren etc.).
We have to create an API not only for our frontend but also for external integrations. The user should be able to send a JSON schema, which is later used for validation of provided data when creating or updating, and it’s also used to join documents from various collections.
An example of a simple JSON schema:
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "profiles",
  "description": "Profile Schema",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "description": "ID"
    },
    "address": {
      "type": "string",
      "description": "Address"
    },
    "email": {
      "type": "string",
      "description": "E-mail address"
    },
    "firstname": {
      "type": "string",
      "description": "First name"
    },
    "lastname": {
      "type": "string",
      "description": "Last name"
    },
    "transactions": {
      "type": "array",
      "description": "List of transactions connected to the profile",
      "items": {
        "title": "transactions",
        "type": "object",
        "properties": {
          "id": {
            "type": "string",
            "description": "ID"
          },
          "orderTotal": {
            "type": "string",
            "description": "Total value of the transaction"
          },
          "invoices": {
            "type": "array",
            "description": "List of invoices related to the transaction",
            "items": {
              "title": "invoices",
              "type": "object",
              "properties": {
                "id": {
                  "type": "string",
                  "description": "ID"
                },
                "discountPercent": {
                  "type": "string",
                  "description": "Discount percent"
                },
                "itemNo": {
                  "type": "string",
                  "description": "Item ID"
                }
              }
            }
          }
        }
      }
    }
  }
}
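To make the nesting in such a schema easier to see, here is a minimal sketch that walks a schema and lists the collections it refers to. It assumes that every array property whose items are objects maps to a separate collection named by `items.title`; the function name `listCollections` and the trimmed-down schema are ours, for illustration only.

```javascript
// Walks a JSON schema shaped like the profile schema and returns every
// collection it mentions, together with its parent and nesting depth.
// Assumption: an array property with object items is its own collection,
// named by `items.title`.
function listCollections(schema, parent = null, depth = 0) {
  const found = [{ name: schema.title, parent, depth }];
  for (const property of Object.values(schema.properties || {})) {
    const items = property.items;
    if (property.type === 'array' && items && items.type === 'object') {
      found.push(...listCollections(items, schema.title, depth + 1));
    }
  }
  return found;
}

// A trimmed-down version of the profile schema that keeps only the nesting.
const profileSchema = {
  title: 'profiles',
  type: 'object',
  properties: {
    id: { type: 'string' },
    transactions: {
      type: 'array',
      items: {
        title: 'transactions',
        type: 'object',
        properties: {
          id: { type: 'string' },
          invoices: {
            type: 'array',
            items: {
              title: 'invoices',
              type: 'object',
              properties: { id: { type: 'string' } },
            },
          },
        },
      },
    },
  },
};

const collections = listCollections(profileSchema);
// Print one collection per line, indented by nesting depth.
console.log(collections.map((c) => `${'  '.repeat(c.depth)}${c.name}`).join('\n'));
```

The resulting list (profiles, then transactions, then invoices) is the kind of structure a query builder can use to decide which collections to join.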
So there are the following solutions:
We would like a DBMS which satisfies:
Other useful features are:
We love open source solutions, so we eliminated DBMS such as Oracle, SQL Server and DB2, as well as MySQL because of licensing issues.
We compared many NoSQL DBMS, not only for this project but also to have an overview for other projects (the data is as of February 18th, 2018):
We took some DBMS from the top of the ranking above, eliminating:
Moreover, among the SQL DBMS, we decided to include only PostgreSQL in our research because it makes it possible to store JSON-like data, which we need.
Based on the table above and other research, the DBMS that seem to suit our needs best are:
ArangoDB seems to be something like MongoDB (the DBMS we have the most experience with) with some extra features. It does lack some MongoDB features, such as the Aggregation Framework, but in practice that one is not missing so much as replaced with something more user-friendly: AQL with joins.
Like MongoDB, ArangoDB provides clustering, though ArangoDB clustering has not proven to be stable in production. One of the key factors was a very active community: the project has a very low ratio of open issues to total issues. Moreover, anyone can join the ArangoDB Slack, where the support team is very helpful, and they also give adequate answers on Stack Overflow.
Another reason was that ArangoDB is a multi-model DBMS, which is useful as we were planning to extend our documents with graphs.
We have used the following ArangoDB features:
We are potentially planning to use in the future:
In the ArangoDB shell, we found a very useful feature which doesn’t exist in MongoDB. To learn AQL, no data in the collections was needed because it’s possible to type something like this:
db._query('for i in [1,2,3] return i * i')
Because one of the requirements was to build the data from many collections using the provided JSON schema, we were looking for an ArangoDB query builder.
We found something which was rather unpopular and lacked many features, so we created our own ArangoDB query builder.
We created an abstract interface, so when replacing ArangoDB with another DBMS, only the inner implementation would need to change.
An example of our query builder code:
const QueryBuilder = () => {
  const priv = {
    // private fields and methods
  };
  const pub = {
    getQueryTree() {
      return priv.queryTree;
    },
    fromSchema(schema) {
      priv.mainCollectionName = schema.title;
      priv.queryTree.loop = `FOR ${schema.title}Item in ${schema.title}`;
      priv.queryTree.sorting = `SORT ${schema.title}Item.id`;
      // some more code
      return pub;
    },
    withLimit(offset, count) {
      // some code
    },
    byId(id) {
      // some code
    },
    byIdentifiers(identifiers) {
      // some code
    },
    byParentId(collectionName, id) {
      // some code
    },
    toAQL() {
      return [
        priv.toString(),
        priv.bindings,
      ];
    },
  };
  return pub;
};
export default QueryBuilder;
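To illustrate the kind of AQL such a builder could produce from a schema, here is a minimal sketch. It is not our builder's actual internals: `schemaToAql` is a hypothetical helper, and it assumes that every child document stores its parent's id in a `parentId` attribute. Subquery results get a `List` suffix because AQL variable names should not clash with collection names used in the same query.

```javascript
// Hypothetical helper: turns a schema into one AQL query with a nested
// subquery per child collection.
// Assumption: each child document keeps its parent's id in `parentId`.
// Names are interpolated directly here; real code should validate them.
function schemaToAql(schema, parentVar = null, indent = '') {
  const itemVar = `${schema.title}Item`;
  const lines = [`${indent}FOR ${itemVar} IN ${schema.title}`];
  if (parentVar) {
    lines.push(`${indent}  FILTER ${itemVar}.parentId == ${parentVar}.id`);
  }
  lines.push(`${indent}  SORT ${itemVar}.id`);

  const merges = [];
  for (const [name, property] of Object.entries(schema.properties || {})) {
    const items = property.items;
    if (property.type === 'array' && items && items.type === 'object') {
      // One subquery per nested collection, bound to "<attribute>List".
      lines.push(`${indent}  LET ${name}List = (`);
      lines.push(schemaToAql(items, itemVar, `${indent}    `));
      lines.push(`${indent}  )`);
      merges.push(`${name}: ${name}List`);
    }
  }

  const result = merges.length
    ? `MERGE(${itemVar}, { ${merges.join(', ')} })`
    : itemVar;
  lines.push(`${indent}  RETURN ${result}`);
  return lines.join('\n');
}

// Trimmed-down profile schema: profiles -> transactions -> invoices.
const nestedSchema = {
  title: 'profiles',
  properties: {
    id: { type: 'string' },
    transactions: {
      type: 'array',
      items: {
        title: 'transactions',
        type: 'object',
        properties: {
          invoices: {
            type: 'array',
            items: { title: 'invoices', type: 'object', properties: {} },
          },
        },
      },
    },
  },
};

const aql = schemaToAql(nestedSchema);
console.log(aql);
```

Each nesting level becomes one subquery, so a single request can return profiles with their transactions and invoices already attached.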
We have created some JavaScript code which runs on the ArangoDB server, and we use this code for most transactions.
// this function is run on the ArangoDB server and, thus, it cannot use all ES6 features
const dbProcedure = (params) => {
  const db = require('internal').db;
  const updateProperObject = () => {
    const collection = db._collection(params.myCollectionName);
    return collection.updateByExample({ id: params.newObject.id }, params.newObject);
  };
  const removeObjects = (collectionName) => {
    params.childrenIdsToBeRemoved[collectionName].forEach((id) => {
      db._collection(collectionName).removeByExample({ id });
    });
  };
  /*
   * The remaining public and private methods
   */
  const actions = {
    create,
    update,
    remove,
    override,
  };
  return actions[params.action]();
};
export default dbProcedure;
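A practical consequence of running a function on the database server is that it travels there as source text, so it can only rely on its `params` argument and its own locals, not on variables from the surrounding module. The sketch below illustrates this under stated assumptions: `countIds` and `buildTransaction` are hypothetical names, and the object only mirrors the general shape of a JavaScript transaction description (`collections`, `action`, `params`) without sending anything to a server.

```javascript
// Hypothetical stand-in for a server-side procedure. It only touches
// `params` and its own locals, because closures are lost once the
// function is turned into source text.
const countIds = function (params) {
  return params.ids.length;
};

// Builds a transaction description: the collections to lock for writing,
// the action as source text, and the parameters passed to the action.
function buildTransaction(action, params, writeCollections) {
  return {
    collections: { write: writeCollections },
    action: String(action),
    params,
  };
}

const payload = buildTransaction(countIds, { ids: ['a', 'b'] }, ['profiles']);

// Local check only: rebuild the function from its source text, roughly as
// a server would, and call it with the transferred parameters.
const rebuilt = new Function(`return (${payload.action});`)();
console.log(rebuilt(payload.params));
```

If `countIds` referenced a variable defined outside its own body, the rebuilt function would fail with a reference error, which is why everything a procedure needs is passed in through `params`.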
It was pretty cool that we were able to use some popular libraries like Lodash even on the database server.
ArangoDB provides an HTTP framework named Foxx which simplifies creating microservices that connect to ArangoDB.
However, we decided not to use Foxx because we didn’t want our microservices to depend on a specific database (as they do when using MongoDB + Mongoose, or when the frontend connects directly to the CouchDB REST API). Instead, we created abstract models which internally use ArangoDB.
This approach proved to be a good choice because later we had to replace ArangoDB with Redis in some critical places of the system in order to remove bottlenecks that were hindering the overall performance.
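The idea behind such abstract models can be sketched as follows. This is a simplified illustration, not our production code: `createInMemoryAdapter` and `createProfileModel` are hypothetical names, and the in-memory adapter stands in for a real ArangoDB or Redis adapter that would expose the same two methods.

```javascript
// Hypothetical storage adapter. An ArangoDB or Redis adapter would
// implement the same two async methods against a real database.
function createInMemoryAdapter() {
  const rows = new Map();
  return {
    async save(collection, doc) {
      rows.set(`${collection}/${doc.id}`, doc);
      return doc;
    },
    async findById(collection, id) {
      return rows.get(`${collection}/${id}`) || null;
    },
  };
}

// The model only knows the adapter contract, never the database behind it,
// so swapping the DBMS means passing in a different adapter.
function createProfileModel(adapter) {
  return {
    async create(profile) {
      if (!profile.id) {
        throw new Error('profile.id is required');
      }
      return adapter.save('profiles', profile);
    },
    async getById(id) {
      return adapter.findById('profiles', id);
    },
  };
}

async function demo() {
  const profiles = createProfileModel(createInMemoryAdapter());
  await profiles.create({ id: 'p1', firstname: 'Ada' });
  return profiles.getById('p1');
}

demo().then((profile) => console.log(profile));
```

Because the services call only `create` and `getById`, none of them has to change when the adapter behind the model is replaced.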
Thanks to AQL, writing and debugging queries was much easier than when using the MongoDB Aggregation Framework. Also, the transactions were very helpful. Even running JavaScript code on the database server was easier than in MongoDB because it was possible to use some libraries like Lodash.
Unfortunately, ArangoDB was not fast enough to handle a very large number of reads and writes in a short period of time (although with other DBMS, e.g. MongoDB or MySQL, writing data to the hard disk would have been too slow as well). That’s why in some microservices we decided to replace ArangoDB with Redis, which works in memory.
However, even in this situation, using ArangoDB was a better choice than MongoDB + Mongoose because Mongoose models usually make the entire architecture dependent on the DBMS.
In contrast, we had created an abstract model on top of ArangoDB, so replacing the DBMS with another one was relatively easy.
Note that MongoDB can be used with such an abstract model as well, so the point is only that ArangoDB with our abstract models can be much better than MongoDB + Mongoose.