Saturday, November 11, 2017

JavaScript / Node.js best practice to save JSON objects AND most efficient way to retrieve certain values

I'm building a Node.js app that will store a huge amount of data, so I want to plan ahead and think of how I should structure the data.

Let's say I want to save account information for 500,000 students:

    ID:        unique string,   // SID0001
    username:  string,          // moe-kanan
    password:  string,          // 123123
    Name:      string,          // Moe kanan
    Age:       int,             // 1 to 100
    grade:     string,          // A, B, C or D

Now, what is the best, fastest and most efficient way to structure the data to get a specific student's account information? For example, if a student wants to log in, we have to check their credentials.

Therefore, if we save the information as an array of students, we will have to loop through the array. Will this slow the app down if a huge number of people try to log in at the same time?

I came up with two different ways to do it, but I don't know which one is faster and more efficient. Please explain that in your answers.


1. First method

Store them in an object whose keys are the unique IDs, in this case the student IDs. Example:

    var database = {}; // NOTICE this is an object

    database["SID0001"] = {
        "ID":       "SID0001",
        "username": "moe-kanan",
        "password": "123123",
        "name":     "Moe Kanan",
        "age":      99,
        "grade":    "A"
    };

With this method, I don't have to loop. I can get the credentials just by doing this:

    var username = database["SID0001"].username;  // moe-kanan
    var password = database["SID0001"].password;  // 123123
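A login check built on this first method might look like the sketch below. The `checkCredentials` helper is hypothetical (it is not part of the question), and the table is created with `Object.create(null)` so that an ID such as "constructor" cannot match a property inherited from `Object.prototype`.

```javascript
// Lookup table without a prototype, so only the keys we add exist.
const database = Object.create(null);

database["SID0001"] = {
  ID: "SID0001",
  username: "moe-kanan",
  password: "123123",
};

// Hypothetical helper: returns true only if the ID exists and both fields match.
// The plain-text comparison mirrors the sample data in the question;
// a real application would store and compare password hashes instead.
function checkCredentials(id, username, password) {
  const student = database[id];
  if (!student) return false; // unknown ID
  return student.username === username && student.password === password;
}

console.log(checkCredentials("SID0001", "moe-kanan", "123123"));
console.log(checkCredentials("SID9999", "moe-kanan", "123123"));
```

The lookup itself stays a single keyed access, so the cost does not grow with the number of students.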

2. Second method

    var database = []; // NOTICE this is an array

    database.push({
        "ID":       "SID0001",
        "username": "moe-kanan",
        "password": "123123",
        "name":     "Moe Kanan",
        "age":      99,
        "grade":    "A"
    });

    var getStudentInfo = (id) => {
        let obj = database.filter(student => student.ID === id)[0];   // NOTICE the [0]
        if (!obj) return null;   // no student with this ID
        return {"username": obj.username, "password": obj.password};
    };

    getStudentInfo("SID0001"); // {username: "moe-kanan", password: "123123"}

Please feel free to add better solutions :) I really appreciate it!

NOTE: Keep in mind, I don't want to use a database for now, but I will use MongoDB in the future.

2 Answers

Answers 1

It is pretty obvious that the first method, using an object, is much faster and more efficient than using an array.

Lookup time is O(1) on average with a hash map, as opposed to O(n) when scanning an array.
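The difference can be seen with a rough sketch like the one below. The generated records and the helper names `findInArray` and `findInMap` are made up for illustration, and a `Map` stands in for the plain object from the question. Timings depend on the machine, so none are quoted here.

```javascript
// Hypothetical sample data: 500,000 generated student records.
const students = [];
for (let i = 1; i <= 500000; i++) {
  const id = "SID" + String(i).padStart(7, "0");
  students.push({ ID: id, username: "user" + i, grade: "A" });
}

// Array approach: scans records one by one until the ID matches, O(n).
const findInArray = (id) => students.find((s) => s.ID === id);

// Keyed approach: build the index once, then each lookup is O(1) on average.
const byId = new Map(students.map((s) => [s.ID, s]));
const findInMap = (id) => byId.get(id);

// Worst case for the array: the record we want is the last one.
const lastId = "SID0500000";

console.time("array lookup");
findInArray(lastId);
console.timeEnd("array lookup");

console.time("map lookup");
findInMap(lastId);
console.timeEnd("map lookup");
```

Both functions return the same record; only the amount of work per lookup differs.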

As others have pointed out, the only real answer is that you should use a database.

Answers 2

Assuming you want to store your data in the filesystem, I assume those are flat JSON files inside a directory. We want retrieval to cost O(1), so that it is as efficient as possible.

Personally, I would go for a file-per-row solution, as it would be easy to maintain and implement.

Given that each row has a unique ID, we could store all the files inside a directory tree three levels deep, where the first directory maps to the first character of the ID, the second directory to the second character, and so on:

Given ID 0001, the path to the file would be:

/storage-directory/0/0/0/0001.json 
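The mapping from ID to path can be sketched as a small function. The name `shardPath` and its `depth` parameter are assumptions made for this example, not part of the answer.

```javascript
const path = require("path");

// Hypothetical helper: the first `depth` characters of the ID become
// nested directory names, and the file itself is named after the full ID.
function shardPath(storageDirectory, id, depth = 3) {
  const key = String(id);
  if (key.length < depth) {
    throw new Error("ID must have at least " + depth + " characters");
  }
  const dirs = key.slice(0, depth).split("");
  // path.posix.join keeps forward slashes regardless of the platform.
  return path.posix.join(storageDirectory, ...dirs, key + ".json");
}

console.log(shardPath("/storage-directory", "0001"));
// prints "/storage-directory/0/0/0/0001.json"
```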

That way we could retrieve the data in a single step given the ID. However, there are half a million files, and since each ID is, as far as I can tell, padded with zeros, the files would be concentrated in a small number of directories. That would degrade performance somewhat, because most filesystems do not like having too many files inside one directory.

We could use a deterministic hashing function (SHA1, for instance) to hash the ID, so that the leading characters are spread over a wider range of values.

SHA1(0000001) produces 82c27eaf3472b30a873d39f4342f5e54de9532b9

so the row could be stored as:

/storage-directory/8/2/c/0000001.json 

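A naive implementation of the getStudentInfo method could be:

    // assumes: const fs = require("fs");
    this.getStudentInfo = (id) => {
        // sha1Index is a helper (not shown) expected to return the SHA1 hex string of the ID
        let index = this.sha1Index(id);
        let key = index[0] + "/" + index[1] + "/" + index[2] + "/" + id + ".json";

        // fs has no parseJson(); read the file, then parse its contents
        return JSON.parse(fs.readFileSync(this.storageDirectory + "/" + key, "utf8"));
    };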

You would have to make sure to always normalize the ID before calculating the index, because SHA1 would produce a different hash for, let's say, 1 and 001, even though they refer to the same row (strip the leading zeros, for instance).
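Putting the pieces together, a minimal sketch of such a store could look like this. It relies on Node's built-in `crypto`, `fs`, `os` and `path` modules. The function names, the temporary directory, and the choice to name the file after the normalized ID are all assumptions made for the example.

```javascript
const crypto = require("crypto");
const fs = require("fs");
const os = require("os");
const path = require("path");

// Assumption: two IDs are the same row if they differ only by leading zeros.
function normalizeId(id) {
  const stripped = String(id).replace(/^0+/, "");
  return stripped === "" ? "0" : stripped;
}

// SHA1 of the normalized ID as a 40-character hex string.
function sha1Index(id) {
  return crypto.createHash("sha1").update(normalizeId(id)).digest("hex");
}

// The first three hex characters of the hash pick the directory.
function filePath(storageDirectory, id) {
  const index = sha1Index(id);
  return path.join(storageDirectory, index[0], index[1], index[2],
    normalizeId(id) + ".json");
}

function saveStudent(storageDirectory, student) {
  const file = filePath(storageDirectory, student.ID);
  fs.mkdirSync(path.dirname(file), { recursive: true });
  fs.writeFileSync(file, JSON.stringify(student));
}

// Returns the parsed row, or null when no file exists for this ID.
function getStudentInfo(storageDirectory, id) {
  const file = filePath(storageDirectory, id);
  if (!fs.existsSync(file)) return null;
  return JSON.parse(fs.readFileSync(file, "utf8"));
}

// Usage: write a row under "001", then read it back as "1".
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "students-"));
saveStudent(dir, { ID: "001", username: "moe-kanan" });
console.log(getStudentInfo(dir, "1"));
```

Because the ID is normalized before hashing, "1" and "001" resolve to the same file.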

Congratulations, you have just invented your first key-value store.
