KimC
KimC

Reputation: 1

MS Access Extract Multiple Matching Text Strings from Long Text Field compared to Table List

Issue: Query is not able to pull all of the restricted words found in a Long Text Field. It is getting the restricted words from a Table Column of ~100 values.

Sample Data

Table: RecipeTable with Long Text Field: RecipeText

Example Contents of RecipeText Field: Add the rutabaga, leeks, carrots and cabbage to the Instant Pot®. Seal and cook on high pressure for 4 minutes. Quick release the steam. Thinly slice the brisket across the grain and transfer to a serving platter. Arrange the vegetables around the meat, sprinkle with the parsley and serve with the sour cream, horseradish and mustard on the side.

Desired Result: Want to Compare RecipeText Field against every value in this Short Text Field RestrictedItem in Table: RestrictedTable.

RestrictedTable.RestrictedItem contains 100 values. Let's say it contains 6 for this exercise: milk, bake, spoon, carrots, mustard and steam.

Query would find these matched words in no particular order for a single record: carrots mustard steam

I've tried this: How to find words in a memo field with microsoft access

Result: Finds only 1 of many matches within the Long Text field.

Desired Result: Find ALL matched words extracted within the Long Text string. Duplicates & wildcards are fine. Case sensitive is bad.

Example Tried:

SELECT a.Adjectives, b.Content
FROM A, B
WHERE b.Content Like "*" & a.[adjectives] & "*"

LIKE and after is where I believe the issue is. I've tried using %, parentheses, spaces, etc to no avail.

Mine became this:

SELECT RecipeTable.RecipeText, RestrictedTable.RestrictedItem
FROM RecipeTable, RestrictedTable
WHERE RecipeTable.RecipeText LIKE  "*" & RestrictedTable.RestrictedItem & "*";

Notes:

  1. I can find lots of advice to find single words, but not comparing whole table columns to one field.
  2. And, lots of advice to find the first substring or nth position, but I want all of the substrings that match. Not the position & I'm afraid that applying trimming, etc, will slow things down on searching 100 words & trimming for each one.
  3. I am fine making this a calculated field on my form that holds the RecipeText field.
  4. Also fine with making a button that would launch a query to compare the RecipeText field with the RestrictedTable.RestrictedItem List & fill in an empty field RestrictedFound on the same form.

Upvotes: 0

Views: 2851

Answers (1)

Charles Byrne
Charles Byrne

Reputation: 834

The code below are two approaches to find all restricted words that are in a memo field. While this could all be done programmatically without staging/work tables I would recommend using a temporary or permanent table to extract the words from the memo field via the split function in VBA (after accounting for punctuation and other data scrubbing).

After splitting the words from the memo field into an array they could then be inserted into a separate table with a foreign key reference to RecipeTable. This could be a temporary table or permanent if needed and could be part of the workflow process. A field like PendingReview could be added to RecipeTable for processing new records then marked as false afterwards so they won't be processed again.

After the words were added to the other table it could be joined to RecipeTable by foreign key and you should have all matches of restricted words.

Once you have the information you could store the stats and discard the work record from your temporary table or delete the work records until the process is run again.

You could do it all in VBA with a dictionary lookup of the restricted words, i.e., query restricted words table, add to a dictionary then loop through matching each word in the memo field with lower case or case insensitive comparison, but it may take a while.

First Code Snippet Below

(If you want compile time checks then you must Reference the Microsoft Scripting Runtime my path is C:\Windows\SysWOW64\scrrun.dll)

Dim dic as Dictionary
Dim memoField as string
Dim words() as String
Dim matchCnt as Integer
'Other variables I didnt declare

'Code to populate dictionary
'Do Until rstRestricted.EOF
'  dic.add LCase$(rst("restrictedWord")), 0
'  rstRestricted.MoveNext
'Loop
'rstRestricted.Close
'Set rstRestricted = Nothing

Set rst = New adodb.Recordset
rst.Open "SELECT [MemoField] FROM RecipeTable;"
lngRowCnt = CLng(rst.RecordCount) - 1
For x = 0 to lngRowCnt
   memoField = LCase$(Nz(rst("MemoField")))
   'Replace punctuation like commas, periods
   'memoField = Replace(memoField, ",","")

   'Now split after data scrubbed
   words = Split(memoField, " ")
   intWordCnt = UBound(words)
   For z = 0 to intWordCnt
      If LenB(words(z)) <> 0 Then
          If dic.Exists(words(z) = True Then
              matchCnt = dic(words(z))
              dic(words(z)) = matchCnt + 1
          End If
      End If
   Next z
Next x

Dim WordKeys() as Variant
Dim y as Integer
Dim restrictedWord as string
Dim wordCnt as Integer

WordKeys = dic.Keys
For y = 0 to UBound(WordKeys) '-1
     restrictedWord = CStr(WordKeys(y))
     wordCnt = CInt(WordKeys(restrictedWord))
     'code to save or display stats
Next y

rst.Close
Set rst = Nothing
Set conn = Nothing

I would just do the split of all words into a working table with the word field indexed then do an aggregate with counts of restricted words.

Second Code Snippet

'Option Explicit 
Dim sql as String
Dim memoDelimitedData() as String
'Other variables declared

'Code to open Recordset table for recipe and also code to open 
'Work table with adOpenDynamic (SELECT * from WorkTable)

'loop through records to be processed
'Split Field (May need variant instead of array.  My Access VBA is rusty)
 words = Split(memoField, " ")
 intWordCnt = UBound(words)
 For x = 0 to intWordCnt
    With rstWorkTable
      .AddNew
      !Word = words(x)
      !ForeignKeyIdToRecipeTable = intForeignKeyId
      .Update
    End With
 Next x 

Then when you have the work table records added you can join to the RecipeTable and the RestrictedTable.

So build a WorkTable of delimited Words from the memo field. Have the foreign key reference to the recipe table then join the RestrictedTable to the WorkTable by the RestrictedItem.

If needed this could be a query for a make table or a staging table permanent table. etc.

So something like this would then give you matches, of any words in your restricted table:

SELECT RecipeTable.RecipeText, RestrictedTable.RestrictedItem
FROM RecipeTable
  INNER JOIN WorkTable ON
   RecipeTable.Id = WorkTable.RecipeTableId
  INNER JOIN RestrictedTable ON
   WorkTable.ForeignKeyIdToRecipeTable = RestrictedTable.RestrictedItem

MS Access Split Function

At that point you could do counts, sums, and other data.

I'm sorry I thought I had example code, but I couldn't find it. I had to do something like this in college many moons ago using VBA and Access (Word Count/Ranking assignment), but I can't find it. Nowadays I'd do this kind of stuff with SQL Server with numbers tables, XML/JSON functionality or the Full Text Searching capability.

Hopefully this may help point you in the right direction if you need to limit your work inside MS Access.

If you're not comfortable with working with ADODB or DAO recordsets you could build a CSV delimited file with the foreign key and the word then import that file into a work table.

Upvotes: 0

Related Questions