Thursday, May 31, 2018

Lookup against MYSQL TEXT type column

Leave a Comment

My table/model has TEXT type column, and when filtering for the records on the model itself, the AR where produces the correct SQL and returns correct results, here is what I mean :

MyNamespace::MyValue.where(value: 'Good Quality') 

Produces this SQL :

SELECT `my_namespace_my_values`.*  FROM `my_namespace_my_values`  WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"' 

Take another example where I m joining MyNamespace::MyValue and filtering on the same value column but from the other model (has relation on the model to my_values). See this (query #2) :

OtherModel.joins(:my_values).where(my_values: { value: 'Good Quality' }) 

This does not produce correct query, this filters on the value column as if it was a String column and not Text, therefore producing incorrect results like so (only pasting relevant where) :

WHERE my_namespace_my_values`.`value` = 'Good Quality' 

Now I can get past this by doing LIKE inside my AR where, which will produce the correct result but slightly different query. This is what I mean :

OtherModel.joins(:my_values).where('my_values.value LIKE ?, '%Good Quality%') 

Finally arriving to my questions. What is this and how it's being generated for where on the model (for text column type)?

WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"' 

Maybe most important question what is the difference in terms of performance using :

WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"' 

and this :

(my_namespace_my_values.value LIKE '%Good Quality%') 

and more importantly how do I get my query with joins (query #2) produce where like this :

WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"' 

4 Answers

Answers 1

(Partial answer -- approaching from the MySQL side.)

What will/won't match

Case 1: (I don't know where the extra backslashes and quotes come from.)

WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"'  \"Good Quality\"               -- matches Good Quality                   -- does not match The product has Good Quality.  -- does not match 

Case 2: (Find Good Quality anywhere in value.)

WHERE my_namespace_my_values.value LIKE '%Good Quality%'  \"Good Quality\"               -- matches Good Quality                   -- matches The product has Good Quality.  -- matches 

Case 3:

WHERE `my_namespace_my_values`.`value` = 'Good Quality'  \"Good Quality\"               -- does not match Good Quality                   -- matches The product has Good Quality.  -- does not match 

Performance:

  • If value is declared TEXT, all cases are slow.
  • If value is not indexed, all are slow.
  • If value is VARCHAR(255) (or smaller) and indexed, Cases 1 and 3 are faster. It can quickly find the one row, versus checking all rows.

Phrased differently:

  • LIKE with a leading wildcard (%) is slow.
  • Indexing the column is important for performance, but TEXT cannot be indexed.

Answers 2

What is this and how it's being generated for where on the model (for text column type)?

Thats generated behind Active Records (Arel) lexical engine. See my answer below on your second question as to why.

What is the difference in terms of performance using...

The "=" matches by whole string/chunk comparison While LIKE matches by character(s) ( by character(s)).

In my projects i got tables with millions of rows, from my experience its really faster to the use that comparator "=" or regexp than using a LIKE in a query.

How do I get my query with joins (query #2) produce where like this...

Can you try this,

OtherModel.joins(:my_values).where(OtherModel[:value].eq('\\\"Good Quality\\\"')) 

Answers 3

I think it might be helpful.

to search for \n, specify it as \n. To search for \, specify it as \\ this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.

link

LIKE and = are different operators.

= is a comparison operator that operates on numbers and strings. When comparing strings, the comparison operator compares whole strings.

LIKE is a string operator that compares character by character.

mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci; +-----------------------------------------+ | 'ä' LIKE 'ae' COLLATE latin1_german2_ci | +-----------------------------------------+ |                                       0 | +-----------------------------------------+ mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci; +--------------------------------------+ | 'ä' = 'ae' COLLATE latin1_german2_ci | +--------------------------------------+ |                                    1 | +--------------------------------------+ 

Answers 4

The '=' op is looking for an exact match while the LIKE op is working more like pattern matching with '%' being similar like '*' in regular expressions.

So if you have entries with

  1. Good Quality
  2. More Good Quality

only LIKE will get both results.

Regarding the escape string I am not sure where this is generated, but looks like some standardized escaping to get this valid for SQL.

If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment