Comparing Strings Not in the Same Order

Here is another situation that needed to be solved. I want to compare the name of a company on one page to the company name that appears on its profile, to make sure they are the same. The problem is that the names aren't listed the same way in both places.

On the first page, the name can be listed as:

Good, Johnny B

On the next page, it can be listed as:

Johnny B. Good

However, that's not always the case. In some cases the full company name can be listed in both places such as:

Amalgamated Beverage Coop

Reversing the order or moving the last word to the front isn't always correct. So the question is, how can these two strings be compared even though they aren't in the same order? Built-in methods like equals, contains, matches, and compareTo won't work correctly, because they all depend on the characters appearing in the same sequence.

If you go searching, there are several possible solutions. Some look relatively reliable, while others look overly complex. But based on several common theories, it looks like a reasonable solution is to break the words apart then do the comparison. That turns out to be pretty straightforward and can be done in a few steps.

I don't know if this solution borders on being clever, or stands right in the middle of total hackery. I tested it against multiple examples, and it did work, so here goes.

If you look above, there is a "," in the first example and a "." in the second, and neither one appears in the other string. The first order of business is to use replaceAll so we only have text to work with. We need to remove everything that isn't a letter of the alphabet or the space between words.

.replaceAll("[^a-zA-Z ]","")
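To see what that pattern does on its own, here is a minimal sketch using the two sample names from above. The variable names are only for illustration.

```groovy
// Sample names taken from the examples above
String listed = "Good, Johnny B"
String profile = "Johnny B. Good"

// Drop every character that is not a letter or a space
String cleanedListed = listed.replaceAll("[^a-zA-Z ]", "")
String cleanedProfile = profile.replaceAll("[^a-zA-Z ]", "")

println(cleanedListed)   // Good Johnny B
println(cleanedProfile)  // Johnny B Good
```

Note that the same pattern also removes digits and accented letters, so a name containing numbers would lose them.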

Next, the words need to be parsed into blocks so they can be compared. This can be done using the .split() method. In this case, split on the space that exists between each word. For this example, we will get 3 results.

.split(" ")
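As a quick check of the split step, this sketch splits one cleaned name and prints the pieces. The second half uses a made-up name, "Smith & Jones", to show one thing to watch for: when a removed character sat between two spaces, splitting on a single space leaves an empty word behind. Splitting on "\\s+" instead is one possible way around that, not something the original solution does.

```groovy
String cleaned = "Good Johnny B"
String[] words = cleaned.split(" ")
println(words.length)    // 3
println(words.toList())  // [Good, Johnny, B]

// Hypothetical name: removing "&" leaves two spaces in a row,
// so splitting on a single space produces an empty word
String[] gap = "Smith & Jones".replaceAll("[^a-zA-Z ]", "").split(" ")
println(gap.toList())    // [Smith, , Jones]

// Possible variation: split on any run of whitespace instead
String[] fixed = "Smith & Jones".replaceAll("[^a-zA-Z ]", "").trim().split("\\s+")
println(fixed.toList())  // [Smith, Jones]
```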

Finally, the words need to be sorted so they can be compared correctly:

.sort()

The first string becomes "Good", "Johnny", "B" and the second becomes "Johnny", "B", "Good". After sorting, both turn into "B", "Good", "Johnny", so a comparison shows them as the same. When the company name is already in the same order in both places, it is still broken apart and sorted, and the same words exist for each. That is also considered a match.
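Putting the three steps together outside of Katalon, a small sketch might look like this. The sortedWords helper is a hypothetical name made up for this example, not a Katalon keyword, and it works on plain strings instead of page text.

```groovy
// Hypothetical helper (not a Katalon keyword): strip, split, and sort a name
List<String> sortedWords(String name) {
    return name.replaceAll("[^a-zA-Z ]", "").split(" ").sort().toList()
}

println(sortedWords("Good, Johnny B"))    // [B, Good, Johnny]
println(sortedWords("Johnny B. Good"))    // [B, Good, Johnny]

// Both versions of the name reduce to the same sorted list
println(sortedWords("Good, Johnny B") == sortedWords("Johnny B. Good"))  // true

// A name listed the same way in both places is unaffected
println(sortedWords("Amalgamated Beverage Coop"))  // [Amalgamated, Beverage, Coop]
```

The sort and the comparison are both case-sensitive, so "Good" and "good" would not count as the same word here.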

Keep in mind this won't be accurate 100% of the time. It is possible for two different companies to use the same words in a different order. However, that is a corner case, and this solution still works for about 90% of situations.
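To make that corner case concrete, here is a short sketch with two made-up company names that share the same words. Both names are hypothetical and only there to show the false match.

```groovy
// Closure that applies the same three steps described above
def sortedWords = { String name ->
    name.replaceAll("[^a-zA-Z ]", "").split(" ").sort().toList()
}

// Hypothetical names for two different companies that share the same words
def first = sortedWords("Johnson Smith Supply")
def second = sortedWords("Smith Johnson Supply")

println(first == second)   // true, even though the companies differ
```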

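import static com.kms.katalon.core.testobject.ObjectRepository.findTestObject
import com.kms.katalon.core.util.KeywordUtil
import com.kms.katalon.core.webui.keyword.WebUiBuiltInKeywords as WebUI

// Strip punctuation, split into words, and sort so word order no longer matters
List company1 = WebUI.getText(findTestObject('Company Name 1')).replaceAll("[^a-zA-Z ]","").split(" ").sort()
List company2 = WebUI.getText(findTestObject('Company Name 2')).replaceAll("[^a-zA-Z ]","").split(" ").sort()

// Fail the test when the sorted word lists differ
// ("log" is assumed to be a KeywordLogger created earlier in the script)
if (company1 != company2) {
    log.logError("ERROR: The Company Name for this profile does not match")
    KeywordUtil.markFailed("ERROR: The Company Name for this profile does not match")
}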
Maybe I should've written that in a different font.
