Wednesday, October 5, 2016

Where to store SQL commands for execution

Leave a Comment

We face code quality issues because of inline mysql queries. Having self-written mysql queries really clutters the code and also increases code base etc.

Our code is cluttered with stuff like

/* beautify ignore:start */ /* jshint ignore:start */ var sql = "SELECT *" +" ,DATE_ADD(sc.created_at,INTERVAL 14 DAY) AS duedate" +" ,distance_mail(?,?,lat,lon) as distance,count(pks.skill_id) c1" +" ,count(ps.profile_id) c2" +" FROM TABLE sc" +" JOIN " +" PACKAGE_V psc on sc.id = psc.s_id " +" JOIN " +" PACKAGE_SKILL pks on pks.package_id = psc.package_id  " +" LEFT JOIN PROFILE_SKILL ps on ps.skill_id = pks.skill_id and         ps.profile_id = ?" +" WHERE sc.type in " +" ('a'," +" 'b'," +" 'c' ," +" 'd'," +" 'e'," +" 'f'," +" 'g'," +" 'h')" +" AND sc.status = 'open'" +" AND sc.crowd_type = ?" +" AND sc.created_at < DATE_SUB(NOW(),INTERVAL 10 MINUTE) " +" AND sc.created_at > DATE_SUB(NOW(),INTERVAL 14 DAY)" +" AND distance_mail(?, ?,lat,lon) < 500" +" GROUP BY sc.id" +" HAVING c1 = c2 " +" ORDER BY distance;"; /* jshint ignore:end */ /* beautify ignore:end */ 

I had to blur the code a little bit.

As you can see, having this repeatedly in your code is just unreadable. Also because atm we can not go to ES6, which would at least pretty the string a little bit thanks to multi-line strings.

The question now is, is there a way to store that SQL procedures in one place? As additional information, we use node (~0.12) and express to expose an API, accessing a MySQL db.

I already thought about, using a JSON, which will result in an even bigger mess. Plus it may not even be possible since the charset for JSON is a little bit strict and the JSON will probably not like having multi line strings too.

Then I came up with the idea to store the SQL in a file and load at startup of the node app. This is at the moment my best shot to get the SQL queries at ONE place and offering them to the rest of the node modules. Question here is, use ONE file? Use one file per query? Use one file per database table?

Any help is appreciated, I can not be the first on the planet solving this so maybe someone has a working, nice solution!

PS: I tried using libs like squel but that does not really help, since our queries are complex as you can see. It is mainly about getting OUR queries into a "query central".

10 Answers

Answers 1

I prefer putting every bigger query in one file. This way you can have syntax highlighting and it's easy to load on server start. To structure this, i usually have one folder for all queries and inside that one folder for each model.

# queries/mymodel/select.mymodel.sql SELECT * FROM mymodel;  // in mymodel.js const fs = require('fs'); const queries = {   select: fs.readFileSync(__dirname + '/queries/mymodel/select.mymodel.sql', 'utf8') }; 

Answers 2

Put your query into database procedure and call procedure in the code, when it is needed.

create procedure sp_query() select * from table1; 

Answers 3

I suggest you store your queries in .sql files away from your js code. This will separate the concerns and make both code & queries much more readable. You should have different directories with nested structure based on your business.

eg:

queries ├── global.sql ├── products │   └── select.sql └── users     └── select.sql 

Now, you just need to require all these files at application startup. You can either do it manually or use some logic. The code below will read all the files (sync) and produce an object with the same hierarchy as the folder above

var glob = require('glob') var _  = require('lodash') var fs = require('fs')  // directory containing all queries (in nested folders) var queriesDirectory = 'queries'  // get all sql files in dir and sub dirs var files = glob.sync(queriesDirectory + '/**/*.sql', {})  // create object to store all queries var queries = {}  _.each(files, function(file){     // 1. read file text     var queryText = fs.readFileSync(__dirname + '/' + file, 'utf8')      // 2. store into object     // create regex for directory name     var directoryNameReg = new RegExp("^" + queriesDirectory + "/")      // get the property path to set in the final object, eg: model.queryName     var queryPath = file         // remove directory name         .replace(directoryNameReg,'')         // remove extension         .replace(/\.sql/,'')         // replace '/' with '.'         .replace(/\//g, '.')      //  use lodash to set the nested properties     _.set(queries, queryPath, queryText) })  // final object with all queries according to nested folder structure console.log(queries) 

log output

{     global: '-- global query if needed\n',     products: {         select: 'select * from products\n'     },      users: {         select: 'select * from users\n'     } } 

so you can access all queries like this queries.users.select

Answers 4

Can you create a view which that query.

Then select from the view

I don't see any parameters in the query so I suppose view creation is possible.

Answers 5

I come from different platform, so I'm not sure if this is what you are looking for. like your application, we had many template queries and we don't like having it hard-coded in the application.

We created a table in MySQL, allowing to save Template_Name (unique), Template_SQL.

We then wrote a small function within our application that returns the SQL template. something like this:

SQL = fn_get_template_sql(Template_name); 

we then process the SQL something like this: pseudo:

if SQL is not empty     SQL = replace all parameters// use escape mysql strings from your parameter     execute the SQL 

or you could read the SQL, create connection and add parameters using your safest way.

This allows you to edit the template query where and whenever. You can create an audit table for the template table capturing all previous changes to revert back to previous template if needed. You can extend the table and capture who and when was the SQL last edited.

from performance point of view, this would work as on-the-fly plus you don't have to read any files or restart server when you are depending on starting-server process when adding new templates.

Answers 6

You could create a completely new npm module let's assume the custom-queries module and put all your complex queries in there.

Then you can categorize all your queries by resource and by action. For example, the dir structure can be:

/index.js -> it will bootstrap all the resources /queries /queries/sc (random name) /queries/psc (random name) /queries/complex (random name) 

The following query can live under the /queries/complex directory in its own file and the file will have a descriptive name (let's assume retrieveDistance)

// You can define some placeholders within this var because possibly you would like to be a bit configurable and reuseable in different parts of your code. /* jshint ignore:start */ var sql = "SELECT *" +" ,DATE_ADD(sc.created_at,INTERVAL 14 DAY) AS duedate" +" ,distance_mail(?,?,lat,lon) as distance,count(pks.skill_id) c1" +" ,count(ps.profile_id) c2" +" FROM TABLE sc" +" JOIN " +" PACKAGE_V psc on sc.id = psc.s_id " +" JOIN " +" PACKAGE_SKILL pks on pks.package_id = psc.package_id  " +" LEFT JOIN PROFILE_SKILL ps on ps.skill_id = pks.skill_id and ps.profile_id = ?" +" WHERE sc.type in " +" ('a'," +" 'b'," +" 'c' ," +" 'd'," +" 'e'," +" 'f'," +" 'g'," +" 'h')" +" AND sc.status = 'open'" +" AND sc.crowd_type = ?" +" AND sc.created_at < DATE_SUB(NOW(),INTERVAL 10 MINUTE) " +" AND sc.created_at > DATE_SUB(NOW(),INTERVAL 14 DAY)" +" AND distance_mail(?, ?,lat,lon) < 500" +" GROUP BY sc.id" +" HAVING c1 = c2 " +" ORDER BY distance;"; /* jshint ignore:end */  module.exports = sql; 

The top level index.js will export an object with all the complex queries. An example can be:

var sc = require('./queries/sc'); var psc = require('./queries/psc'); var complex = require('./queries/complex');  // Quite important because you want to ensure that no one will touch the queries outside of // the scope of this module. Be careful, because the Object.freeze is freezing only the top // level elements of the object and it is not recursively freezing the nested objects. var queries = Object.freeze({   sc: sc,   psc: psc,   complex: complex });  module.exports = queries; 

Finally, on your main code you can use the module like that:

var cq = require('custom-queries'); var retrieveDistanceQuery = cq.complex.retrieveDistance; // @todo: replace the placeholders if they exist 

Doing something like that you will move all the noise of the string concatenation to another place that you would expect and you will be able to find quite easily in one place all your complex queries.

Answers 7

There are a few things you want to do. First, you want to store multi-line without ES6. You can take advantage of toString of a function.

var getComment = function(fx) {          var str = fx.toString();          console.log(str.indexOf('/*'));          return str.substring(str.indexOf('/*') + 2, str.indexOf('*/'));        },        queryA = function() {          /*               select blah                from tableA               where whatever = condition          */        }        console.log(getComment(queryA));

You can now create a module and store lots of these functions. For example:

//Name it something like salesQry.js under the root directory of your node project.  var getComment = function(fx) {      var str = fx.toString();      console.log(str.indexOf('/*'));      return str.substring(str.indexOf('/*') + 2, str.indexOf('*/'));    },    query = {};    query.template = getComment(function() { /*Put query here*/ });  query.b = getComment(function() {    /*    SELECT *     ,DATE_ADD(sc.created_at,INTERVAL 14 DAY) AS duedate     ,distance_mail(?,?,lat,lon) as distance,count(pks.skill_id) c1     ,count(ps.profile_id) c2      FROM TABLE sc      JOIN  PACKAGE_V psc on sc.id = psc.s_id       JOIN  PACKAGE_SKILL pks on pks.package_id = psc.package_id        LEFT JOIN PROFILE_SKILL ps on ps.skill_id = pks.skill_id AND ps.profile_id = ?     WHERE sc.type in ('a','b','c','d','e','f','g','h')       AND sc.status = 'open'       AND sc.crowd_type = ?       AND sc.created_at < DATE_SUB(NOW(),INTERVAL 10 MINUTE)        AND sc.created_at > DATE_SUB(NOW(),INTERVAL 14 DAY)       AND distance_mail(?, ?,lat,lon) < 500     GROUP BY sc.id    HAVING c1 = c2     ORDER BY distance;    */  });    //Debug  console.log(query.template);  console.log(query.b);    //module.exports.query = query //Uncomment this.

You can require the necessary packages and build your logic right in this module or build a generic wrapper module for better OO design.

//Name it something like SQL.js. in the root directory of your node project. var mysql = require('mysql'),   connection = mysql.createConnection({     host: 'localhost',     user: 'me',     password: 'secret',     database: 'my_db'   });  module.exports.load = function(moduleName) {   var SQL = require(moduleName);   return {     query: function(statement, param, callback) {       connection.connect();       connection.query(SQL[statement], param, function(err, results) {         connection.end();         callback(err, result);       });     }   }); 

To use it, you do something like:

var Sql = require ('./SQL.js').load('./SalesQry.js');  Sql('b', param, function (err, results) {   ...   }); 

Answers 8

This is no doubt a million dollar question, and I think the right solution depends always on the case.

Here goes my thoughts. Hope could help:

One simple trick (which, in fact, I read that it is surprisingly more efficient than joining strings with "+") is to use arrays of strings for each row and join them.

It continues being a mess but, at least for me, a bit clearer (specially when using, as I do, "\n" as separator instead of spaces, to make resulting strings more readable when printed out for debugging).

Example:

var sql = [     "select foo.bar",     "from baz",     "join foo on (",     "  foo.bazId = baz.id",     ")", // I always leave the last comma to avoid errors on possible query grow. ].join("\n"); // or .join(" ") if you prefer. 

As a hint, I use that syntax in my own SQL "building" library. It may not work in too complex queries but, if you have cases in which provided parameters could vary, it is very helpful to avoid (also subotptimal) "coalesce" messes by fully removing unneeded query parts. It is also on GitHub, (and it isn't too complex code), so you can extend it if you feel it useful.

If you prefer separate files:

About having single or multiple files, having multiple files is less efficient from the point of view of reading efficiency (more file open/close overhead and harder OS level caching). But, if you load all of them single time at startup, it is not in fact a hardly noticeable difference.

So, the only drawback (for me) is that it is too hard to have a "global glance" of your query collection. Even, if you have very huge amount of queries, I think it is better to mix both approaches. That is: group related queries in the same file so you have single file per each module, submodel or whatever criteria you chosen.

Of course: Single file would result in relatively "huge" file, also difficult to handle "at first". But I (hardly) use vim's marker based folding (foldmethod=marker) which is very helpfull to handle that files.

Of course: if you don't (yet) use vim (truly??), you wouldn't have that option, but sure there is another alternative in your editor. If not, you always can use syntax folding and something like "function (my_tag) {" as markers.

For example:

---(Query 1)---------------------/*{{{*/ select foo from bar; ---------------------------------/*}}}*/  ---(Query 2)---------------------/*{{{*/ select foo.baz  from foo join bar using (foobar) ---------------------------------/*}}}*/ 

...when folded, I see it as:

+--  3 línies: ---(Query 1)------------------------------------------------  +--  5 línies: ---(Query 2)------------------------------------------------ 

Which, using properly selected labels, is much more handy to manage and, from the parsing point of view, is not difficult to parse the whole file splitting queries by that separation rows and using labels as keys to index the queries.

Dirty example:

#!/usr/bin/env node "use strict";  var Fs = require("fs");  var src = Fs.readFileSync("./test.sql");  var queries = {};   var label = false;  String(src).split("\n").map(function(row){     var m = row.match(/^-+\((.*?)\)-+[/*{]*$/);     if (m) return queries[label = m[1].replace(" ", "_").toLowerCase()] = "";     if(row.match(/^-+[/*}]*$/)) return label = false;     if (label) queries[label] += row+"\n"; });  console.log(queries); // { query_1: 'select foo from bar;\n', //   query_2: 'select foo.baz \nfrom foo\njoin bar using (foobar)\n' }  console.log(queries["query_1"]); // select foo from bar;  console.log(queries["query_2"]); // select foo.baz // from foo // join bar using (foobar) 

Finally (idea), if you do as much effort, wouldn't be a bad idea to add some boolean mark together with each query label telling if that query is intended to be used frequently or only occasionally. Then you can use that information to prepare those statements at application startup or only when they are going to be used more than single time.

Answers 9

Create store procedures for all queries, and replace the var sql = "SELECT..." for calling the procedures like var sql = "CALL usp_get_packages".

This is the best for performance and no dependency breaks on the application. Depending on the number of queries may be a huge task, but for every aspect (maintainability, performance, dependencies, etc) is the best solution.

Answers 10

Call procedure in the code after putting query into the db procedure. @paval also already answered you may also refer here.

create procedure sp_query()
select * from table1;

If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment