I've written a script which is supposed to compare the content of column A between two sheets in a workbook to find out if there are partial matches. To be clearer: If any of the content of any cell in coulmn A in sheet 1 matches any of the content of any cell in coulmn A in sheet 2 then that will be a match and the script will print that in immediate window.
This is my attempt so far:
Sub GetPartialMatch() Dim paramlist As Range Set paramlist = Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) For Each cel In Sheets(2).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) If InStr(1, cel(1, 1), paramlist, 1) > 0 Then 'I used "paramlist" here as a placeholder as I can't use it Debug.Print cel(1, 1) End If Next cel End Sub
The thing is I can't make use of this paramlist defined within my script. I just used it there as a placeholder.
5 Answers
Answers 1
You want a double loop.
Sub GetPartialMatch() Dim paramlist As Range Dim cel as Range, cel2 as Range ; declare all variables! Set paramlist = Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) For Each cel In Sheets(2).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) For Each cel2 in paramlist 'Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) If InStr(1, cel(1, 1), cel2, 1) > 0 Then Debug.Print cel(1, 1) End If Next cel2 Next cel End Sub
Always use Option Explicit
. Always.
This may be easier using a helper column and a formula, where the row in the helper column indicates TRUE
if a MATCH
is found. No VBA then. And it will be inherently faster.
Answers 2
a very fast approach is given by the use of arrays and Application.Match()
function:
Sub GetPartialMatch() Dim paramlist1 As Variant, paramlist2 As Variant Dim cel As Range Dim i As Long paramlist1 = Sheets(1).Range("A2", Sheets(1).Cells(Rows.Count, 1).End(xlUp)).Value ' collect all sheets(1) column A values in an array paramlist2 = Sheets(2).Range("A2", Sheets(2).Cells(Rows.Count, 1).End(xlUp)).Value ' collect all sheets(2) column A values in an array For i = 1 To UBound(paramlist1) ' loop through paramlist1 array row index If Not IsError(Application.Match(paramlist1(i, 1), paramlist2, 1)) Then Debug.Print paramlist1(i, 1) ' if partial match between current paramlist1 value and any paramlist2 value, then print it Next End Sub
if you want an exact match just use 0 as the last parameter in Match()
function, i.e.:
If Not IsError(Application.Match(paramlist1(i, 1), paramlist2, 0)) Then Debug.Print paramlist1(i, 1) ' if exact match between current paramlist1 value and any paramlist2 value, then print it
BTW, if you need an exact match you could also use Autofilter()
method of Range
object with xlFilterValues
as its Operator
parameter:
Sub GetPartialMatch2() Dim paramlist As Variant Dim cel As Range paramlist = Application.Transpose(Sheets(1).Range("A2", Sheets(1).Cells(Rows.Count, 1).End(xlUp)).Value) ' collect all sheets(1) column A values in an array With Sheets(2).Range("A1", Sheets(2).Cells(Rows.Count, 1).End(xlUp)) ' reference sheets(2) column A cells from row 1 (header) down to last not empty one .AutoFilter field:=1, Criteria1:=paramlist, Operator:=xlFilterValues ' filter referenced range with 'paramlist' If Application.WorksheetFunction.Subtotal(103, .Cells) > 1 Then ' if any filtered cell other then header For Each cel In .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible) ' loop through all sheets(2) filtered cells but the header Debug.Print cel.Value2 Next End If .Parent.AutoFilterMode = False 'remove filter End With End Sub
Answers 3
Have you tried adding in:
Application.Screenupdating = false Application.Calculation = xlCalculationManual ...Code... Application.Screenupdating = true Application.Calculation = xlCalculationAutomatic
These turn off the screen updating and automatic calculation of formulas within your instance of excel which can help speed up code a lot, you just have to remember to turn them back on at the end or you might give yourself a bit of a headache. It should be noted, though, that if you turn off screenupdating you won't be able to see the results roll in. You'll have to scroll backwards at the end
Another thing to consider would be store the data in an array before hand and doing the operations to the array and simply pasting it back in to the sheet. Accessing the sheet excessively slows down code drastically. Working with the accepted answer provided by @AJD, I made a few changes that will hopefully speed it up.
Sub macro() Dim paramlist() As Variant Dim DataTable() As Variant Dim cell1 As Variant Dim cell2 As Variant paramlist() = Sheets(1).Range("A2:A" & Worksheets(1).Cells(Rows.Count, 1).End(xlUp).Row).Value DataTable() = Sheets(2).Range("A2:A" & Worksheets(2).Cells(Rows.Count, 1).End(xlUp).Row).Value For Each cell1 In paramlist For Each cell2 In DataTable If InStr(1, cell2, cell1, 1) > 0 Then Debug.Print cell1 exit for End If Next cell2 Next cell1 End Sub
I would have suggested this under the accepted answer as a suggestion, but unfortunately, I don't have enough rep to comment yet.
Edit: switching the order of the for loops allows you to insert a more efficient exit for
and can allow you to skip large portions of data within the search array
Answers 4
Not sure if this is any faster (it uses pretty much the same algorithm, a loop inside of a loop), but I would argue it's a bit clearer:
Sub SearchForPartialMatches() Dim needle1 As Range, needle2 As Range Set needle1 = Excel.Worksheets(1).Range("$B$2") Do While needle1.Value <> "" Set needle2 = Excel.Worksheets(2).Range("$B$2") Do While needle2.Value <> "" If InStr(1, needle1.Value, needle2.Value) > 0 Then Debug.Print needle1.Value, needle2.Value End If Set needle2 = needle2.Offset(rowoffset:=1) Loop Set needle1 = needle1.Offset(rowoffset:=1) Loop End Sub
The main difference is it's not looping over the entire column, but instead starts at the top, and uses the offset
method until there are no more rows (with data).
Of course, you'll need to change the starting cell for needle1
and needle2
.
I ran this with the EFF large word list copied into both sheets, and it ran in about 4 minutes (which was less time than with @AJD, but that might've been a fluke). YMMV.
Answers 5
Just one more option. Not much different from any suggestions above ... The concept is to speed up processing by minimizing VBA - Excel interactions by loading the values to arrays and processing arrays like this:
Dim cel as String, cel2 as String Dim arr1() as String, arr2 As String arr1 = Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) arr2 = Sheets(2).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) For Each cel In arr1 For Each cel2 in arr2 If InStr(1, cel, cel2, 1) > 0 Then Debug.Print cel End If Next cel2 Next cel
I'd like to know if it helps at all :)
0 comments:
Post a Comment