Compare two vcards

Question

I have two vcards :

vcard1 = """BEGIN:VCARD
VERSION:3.0
N;CHARSET=UTF-8:Name;;;;
TEL:0005555000
END:VCARD"""

vcard2 = """BEGIN:VCARD
VERSION:3.0
N;CHARSET=UTF-8:Name;;;;
TEL:0005555000
EMAIL;CHARSET=UTF-8:my_email@email.com
END:VCARD"""

As you can see, the only difference is that the second vcard has an additional attribute, EMAIL. Can these two vcards be considered equal using code?

import vobject
print(vobject.readOne(vcard1).serialize()==vobject.readOne(vcard2).serialize())

CypherX · Accepted Answer

Solution

The first rule of any comparison is to define its basis. You can even compare apples and oranges, provided you pick a quantity that both share, such as "how many apples vs. oranges" or "weight of 5 apples vs. 5 oranges". The point is that the basis of comparison must be defined unambiguously.

Note: I will use the data from the Dummy Data section below.

Extending this concept to your use case, you can compare the vcards field by field and then aggregate the result across all compared fields. I show three ways to compare them:

Example A1: compare only the fields common to vcard1 and vcard2.
Example A2: compare all fields found in either vcard1 or vcard2.
Example A3: compare only the user-specified fields of vcard1 and vcard2.

Obviously, comparing the serialized versions of vcard1 and vcard2 returns False, because the contents of the two vcards differ.

vc1.serialize()==vc2.serialize() # False

Example

In each case (A1, A2, A3), the custom function compare_vcards() returns two things:

match: a dict giving the match result for each field.
summary: a dict giving the aggregated absolute match (True only if all fields matched) and the relative match (on a [0, 1] scale, useful for partial matches).

But you will have to define your own business logic to decide what counts as a match and what does not. What I have shown here should help you get started.
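As one possible rule (an assumption for illustration, not something the question prescribes), you could treat the first card as matching when every field it contains appears with the same value in the second card, so that extra fields such as EMAIL are ignored. A minimal sketch on plain dicts, where `is_subset_match` and the two dicts are hypothetical stand-ins for the parsed vcards:

```python
def is_subset_match(card_a: dict, card_b: dict) -> bool:
    """Return True if every field of card_a is present in card_b with an equal value."""
    return all(
        field in card_b and str(value).strip() == str(card_b[field]).strip()
        for field, value in card_a.items()
    )

# Hypothetical field/value dicts mirroring the two vcards in the question
card1 = {"version": "3.0", "n": "Name", "tel": "0005555000"}
card2 = {**card1, "email": "my_email@email.com"}

print(is_subset_match(card1, card2))  # True: card2 only adds a field
print(is_subset_match(card2, card1))  # False: card1 has no email
```

Note that this rule is not symmetric, so the order of the arguments matters.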

## Example - A1
#  Compare ONLY COMMON fields b/w vc1 and vc2
match, summary = compare_vcards(vc1, vc2, mode='common')
print(f'match:   	{match}')
print(f'summary: 	{summary}')

## Output
# match:    {'n': True, 'tel': True, 'version': True}
# summary:  {'abs_match': True, 'rel_match': 1.0}

## Example - A2
#  Compare ALL fields b/w vc1 and vc2
match, summary = compare_vcards(vc1, vc2, mode='all')
print(f'match:   	{match}')
print(f'summary: 	{summary}')

## Output
# match:    {'tel': True, 'email': False, 'n': True, 'version': True}
# summary:  {'abs_match': False, 'rel_match': 0.75}

## Example - A3
#  Compare ONLY USER-SPECIFIED fields b/w vc1 and vc2
match, summary = compare_vcards(vc1, vc2, fields=['email', 'n', 'tel'])
print(f'match:   	{match}')
print(f'summary: 	{summary}')

## Output
# match:    {'email': False, 'n': True, 'tel': True}
# summary:  {'abs_match': False, 'rel_match': 0.6666666666666666}
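The relative scores in these outputs follow from simple counting: `True` counts as 1 and `False` as 0, so 3 matches out of 4 fields give 0.75 and 2 out of 3 give roughly 0.667. A small sketch of that arithmetic, where `summarize` is a hypothetical helper and returning `False` / `0.0` for an empty comparison is my own choice:

```python
def summarize(match: dict) -> dict:
    # True counts as 1 and False as 0, so sum() is the number of matching fields
    n_matched = sum(match.values())
    return {
        # An empty comparison is reported as "no match" (an assumption, adjust as needed)
        "abs_match": bool(match) and all(match.values()),
        "rel_match": n_matched / len(match) if match else 0.0,
    }

a2 = {"tel": True, "email": False, "n": True, "version": True}
a3 = {"email": False, "n": True, "tel": True}

print(summarize(a2))  # {'abs_match': False, 'rel_match': 0.75}
print(summarize(a3))  # rel_match is 2/3
```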

Code

def get_fields(vc1, vc2, mode='common'):
    # Field names (e.g. 'n', 'tel') present in each vcard
    keys1, keys2 = set(vc1.sortChildKeys()), set(vc2.sortChildKeys())
    if mode == 'common':
        # Only fields present in both vcards
        return keys1 & keys2
    # mode = 'all': fields present in either vcard
    return keys1 | keys2

def compare_vcards(vc1, vc2, fields=None, mode='common'):
    if fields is None:
        fields = get_fields(vc1, vc2, mode=mode)
    # getChildValue() returns None for a missing field, so a field present
    # in only one vcard compares as False
    match = {
        field: str(vc1.getChildValue(field)).strip() == str(vc2.getChildValue(field)).strip()
        for field in fields
    }
    summary = {
        'abs_match': all(match.values()),
        # Guard against division by zero when there are no fields to compare
        'rel_match': sum(match.values()) / len(match) if match else 0.0
    }
    return match, summary
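One caveat: `getChildValue()` looks at a single value per field (the first one by default, as far as I know), so a vcard with several TEL lines is compared on its first TEL only. If that matters for your data, a possible variant is to compare all values per field. The sketch below assumes that each parsed vcard exposes a `contents` dict mapping field names to lists of objects with a `.value` attribute, which I believe is how vobject stores them. The helper names are hypothetical, and the stand-in cards only mimic that one attribute so the sketch runs without vobject installed:

```python
from types import SimpleNamespace

def field_values(vc, field):
    """Collect every value stored under `field` as a sorted list of stripped strings."""
    return sorted(str(line.value).strip() for line in vc.contents.get(field, []))

def compare_all_values(vc1, vc2, fields):
    """Compare each field using all of its values, ignoring their order."""
    return {f: field_values(vc1, f) == field_values(vc2, f) for f in fields}

# Stand-ins that only mimic the `contents` attribute (not real vobject objects)
def fake_card(**fields):
    return SimpleNamespace(
        contents={k: [SimpleNamespace(value=v) for v in vals] for k, vals in fields.items()}
    )

card_a = fake_card(tel=["111", "222"], n=["Name"])
card_b = fake_card(tel=["222", "111"], n=["Name"])
card_c = fake_card(tel=["111"], n=["Name"])

print(compare_all_values(card_a, card_b, ["n", "tel"]))  # {'n': True, 'tel': True}
print(compare_all_values(card_a, card_c, ["n", "tel"]))  # {'n': True, 'tel': False}
```

Sorting the values makes the comparison independent of the order in which repeated fields appear; drop the `sorted()` call if order is significant for you.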

Dummy Data

vcard1 = """
BEGIN:VCARD
VERSION:3.0
N;CHARSET=UTF-8:Name;;;;
TEL:0005555000
END:VCARD
"""

vcard2 = """
BEGIN:VCARD
VERSION:3.0
N;CHARSET=UTF-8:Name;;;;
TEL:0005555000
EMAIL;CHARSET=UTF-8:my_email@email.com
END:VCARD
"""

# pip install vobject
import vobject

vc1 = vobject.readOne(vcard1)
vc2 = vobject.readOne(vcard2)
