My table/model has TEXT
type column, and when filtering for the records on the model itself, the AR where
produces the correct SQL and returns correct results, here is what I mean :
MyNamespace::MyValue.where(value: 'Good Quality')
Produces this SQL :
SELECT `my_namespace_my_values`.* FROM `my_namespace_my_values` WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"'
Take another example where I m joining MyNamespace::MyValue
and filtering on the same value
column but from the other model (has relation on the model to my_values
). See this (query #2) :
OtherModel.joins(:my_values).where(my_values: { value: 'Good Quality' })
This does not produce correct query, this filters on the value
column as if it was a String column and not Text, therefore producing incorrect results like so (only pasting relevant where) :
WHERE my_namespace_my_values`.`value` = 'Good Quality'
Now I can get past this by doing LIKE inside my AR where
, which will produce the correct result but slightly different query. This is what I mean :
OtherModel.joins(:my_values).where('my_values.value LIKE ?, '%Good Quality%')
Finally arriving to my questions. What is this and how it's being generated for where on the model (for text column type)?
WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"'
Maybe most important question what is the difference in terms of performance using :
WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"'
and this :
(my_namespace_my_values.value LIKE '%Good Quality%')
and more importantly how do I get my query with joins (query #2) produce where like this :
WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"'
4 Answers
Answers 1
(Partial answer -- approaching from the MySQL side.)
What will/won't match
Case 1: (I don't know where the extra backslashes and quotes come from.)
WHERE `my_namespace_my_values`.`value` = '\\\"Good Quality\\\"' \"Good Quality\" -- matches Good Quality -- does not match The product has Good Quality. -- does not match
Case 2: (Find Good Quality
anywhere in value
.)
WHERE my_namespace_my_values.value LIKE '%Good Quality%' \"Good Quality\" -- matches Good Quality -- matches The product has Good Quality. -- matches
Case 3:
WHERE `my_namespace_my_values`.`value` = 'Good Quality' \"Good Quality\" -- does not match Good Quality -- matches The product has Good Quality. -- does not match
Performance:
- If
value
is declaredTEXT
, all cases are slow. - If
value
is not indexed, all are slow. - If
value
isVARCHAR(255)
(or smaller) and indexed, Cases 1 and 3 are faster. It can quickly find the one row, versus checking all rows.
Phrased differently:
LIKE
with a leading wildcard (%
) is slow.- Indexing the column is important for performance, but
TEXT
cannot be indexed.
Answers 2
What is this and how it's being generated for where on the model (for text column type)?
Thats generated behind Active Records (Arel) lexical engine. See my answer below on your second question as to why.
What is the difference in terms of performance using...
The "=" matches by whole string/chunk comparison While LIKE matches by character(s) ( by character(s)).
In my projects i got tables with millions of rows, from my experience its really faster to the use that comparator "=" or regexp than using a LIKE in a query.
How do I get my query with joins (query #2) produce where like this...
Can you try this,
OtherModel.joins(:my_values).where(OtherModel[:value].eq('\\\"Good Quality\\\"'))
Answers 3
I think it might be helpful.
to search for \n, specify it as \n. To search for \, specify it as \\ this is because the backslashes are stripped once by the parser and again when the pattern match is made, leaving a single backslash to be matched against.
LIKE and = are different operators.
= is a comparison operator that operates on numbers and strings. When comparing strings, the comparison operator compares whole strings.
LIKE is a string operator that compares character by character.
mysql> SELECT 'ä' LIKE 'ae' COLLATE latin1_german2_ci; +-----------------------------------------+ | 'ä' LIKE 'ae' COLLATE latin1_german2_ci | +-----------------------------------------+ | 0 | +-----------------------------------------+ mysql> SELECT 'ä' = 'ae' COLLATE latin1_german2_ci; +--------------------------------------+ | 'ä' = 'ae' COLLATE latin1_german2_ci | +--------------------------------------+ | 1 | +--------------------------------------+
Answers 4
The '=' op is looking for an exact match while the LIKE op is working more like pattern matching with '%' being similar like '*' in regular expressions.
So if you have entries with
- Good Quality
- More Good Quality
only LIKE will get both results.
Regarding the escape string I am not sure where this is generated, but looks like some standardized escaping to get this valid for SQL.