Thursday, September 27, 2018

Unable to create a loop to compare the content of two sheets

Leave a Comment

I've written a script which is supposed to compare the content of column A between two sheets in a workbook to find out if there are partial matches. To be clearer: If any of the content of any cell in coulmn A in sheet 1 matches any of the content of any cell in coulmn A in sheet 2 then that will be a match and the script will print that in immediate window.

This is my attempt so far:

Sub GetPartialMatch()     Dim paramlist As Range      Set paramlist = Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row)      For Each cel In Sheets(2).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row)         If InStr(1, cel(1, 1), paramlist, 1) > 0 Then  'I used "paramlist" here as a placeholder as I can't use it            Debug.Print cel(1, 1)         End If     Next cel End Sub 

The thing is I can't make use of this paramlist defined within my script. I just used it there as a placeholder.

5 Answers

Answers 1

You want a double loop.

Sub GetPartialMatch()     Dim paramlist As Range     Dim cel as Range, cel2 as Range ; declare all variables!      Set paramlist = Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row)      For Each cel In Sheets(2).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row)         For Each cel2 in paramlist 'Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row)             If InStr(1, cel(1, 1), cel2, 1) > 0 Then                   Debug.Print cel(1, 1)             End If         Next cel2     Next cel End Sub 

Always use Option Explicit. Always.

This may be easier using a helper column and a formula, where the row in the helper column indicates TRUE if a MATCH is found. No VBA then. And it will be inherently faster.

Answers 2

a very fast approach is given by the use of arrays and Application.Match() function:

Sub GetPartialMatch()     Dim paramlist1 As Variant, paramlist2 As Variant     Dim cel As Range     Dim i As Long      paramlist1 = Sheets(1).Range("A2", Sheets(1).Cells(Rows.Count, 1).End(xlUp)).Value ' collect all sheets(1) column A values in an array     paramlist2 = Sheets(2).Range("A2", Sheets(2).Cells(Rows.Count, 1).End(xlUp)).Value ' collect all sheets(2) column A values in an array      For i = 1 To UBound(paramlist1) ' loop through paramlist1 array row index         If Not IsError(Application.Match(paramlist1(i, 1), paramlist2, 1)) Then Debug.Print paramlist1(i, 1) ' if partial match between current paramlist1 value and any paramlist2 value, then print it     Next End Sub 

if you want an exact match just use 0 as the last parameter in Match() function, i.e.:

If Not IsError(Application.Match(paramlist1(i, 1), paramlist2, 0)) Then Debug.Print paramlist1(i, 1) ' if exact match between current paramlist1 value and any paramlist2 value, then print it 

BTW, if you need an exact match you could also use Autofilter() method of Range object with xlFilterValues as its Operator parameter:

Sub GetPartialMatch2()     Dim paramlist As Variant     Dim cel As Range      paramlist = Application.Transpose(Sheets(1).Range("A2", Sheets(1).Cells(Rows.Count, 1).End(xlUp)).Value) ' collect all sheets(1) column A values in an array      With Sheets(2).Range("A1", Sheets(2).Cells(Rows.Count, 1).End(xlUp)) ' reference sheets(2) column A cells from row 1 (header) down to last not empty one         .AutoFilter field:=1, Criteria1:=paramlist, Operator:=xlFilterValues ' filter referenced range with 'paramlist'         If Application.WorksheetFunction.Subtotal(103, .Cells) > 1 Then ' if any filtered cell other then header             For Each cel In .Offset(1).Resize(.Rows.Count - 1).SpecialCells(xlCellTypeVisible) ' loop through all sheets(2) filtered cells but the header                 Debug.Print cel.Value2             Next         End If         .Parent.AutoFilterMode = False 'remove filter     End With End Sub 

Answers 3

Have you tried adding in:

Application.Screenupdating = false Application.Calculation = xlCalculationManual  ...Code...  Application.Screenupdating = true Application.Calculation = xlCalculationAutomatic 

These turn off the screen updating and automatic calculation of formulas within your instance of excel which can help speed up code a lot, you just have to remember to turn them back on at the end or you might give yourself a bit of a headache. It should be noted, though, that if you turn off screenupdating you won't be able to see the results roll in. You'll have to scroll backwards at the end

Another thing to consider would be store the data in an array before hand and doing the operations to the array and simply pasting it back in to the sheet. Accessing the sheet excessively slows down code drastically. Working with the accepted answer provided by @AJD, I made a few changes that will hopefully speed it up.

Sub macro()  Dim paramlist() As Variant Dim DataTable() As Variant Dim cell1 As Variant Dim cell2 As Variant  paramlist() = Sheets(1).Range("A2:A" & Worksheets(1).Cells(Rows.Count, 1).End(xlUp).Row).Value DataTable() = Sheets(2).Range("A2:A" & Worksheets(2).Cells(Rows.Count, 1).End(xlUp).Row).Value   For Each cell1 In paramlist     For Each cell2 In DataTable         If InStr(1, cell2, cell1, 1) > 0 Then             Debug.Print cell1             exit for         End If     Next cell2 Next cell1  End Sub 

I would have suggested this under the accepted answer as a suggestion, but unfortunately, I don't have enough rep to comment yet.

Edit: switching the order of the for loops allows you to insert a more efficient exit for and can allow you to skip large portions of data within the search array

Answers 4

Not sure if this is any faster (it uses pretty much the same algorithm, a loop inside of a loop), but I would argue it's a bit clearer:

Sub SearchForPartialMatches()     Dim needle1 As Range, needle2 As Range      Set needle1 = Excel.Worksheets(1).Range("$B$2")      Do While needle1.Value <> ""         Set needle2 = Excel.Worksheets(2).Range("$B$2")          Do While needle2.Value <> ""             If InStr(1, needle1.Value, needle2.Value) > 0 Then                 Debug.Print needle1.Value, needle2.Value             End If             Set needle2 = needle2.Offset(rowoffset:=1)         Loop         Set needle1 = needle1.Offset(rowoffset:=1)     Loop  End Sub 

The main difference is it's not looping over the entire column, but instead starts at the top, and uses the offset method until there are no more rows (with data).

Of course, you'll need to change the starting cell for needle1 and needle2.

I ran this with the EFF large word list copied into both sheets, and it ran in about 4 minutes (which was less time than with @AJD, but that might've been a fluke). YMMV.

Answers 5

Just one more option. Not much different from any suggestions above ... The concept is to speed up processing by minimizing VBA - Excel interactions by loading the values to arrays and processing arrays like this:

Dim cel as String, cel2 as String Dim arr1() as String, arr2 As String  arr1 = Sheets(1).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row) arr2 = Sheets(2).Range("A2:A" & Cells(Rows.Count, 1).End(xlUp).Row)  For Each cel In arr1     For Each cel2 in arr2         If InStr(1, cel, cel2, 1) > 0 Then               Debug.Print cel         End If     Next cel2 Next cel 

I'd like to know if it helps at all :)

If You Enjoyed This, Take 5 Seconds To Share It

0 comments:

Post a Comment