Amber.G

Reputation: 1513

parse and extract info from string

I am using Windows and Python 2.7. I list all the available ports (COM1, COM3, COM4), and the current output is:

[('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'), ('COM3', 'Intel(R) Active Management Technology - SOL (COM3)', 'PCI\\VEN_8086&DEV_9C3D&SUBSYS_05DE1028&REV_04\\3&11583659&0&B3'), ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0')]

I would like to connect to COM4, so I am hoping to find a way to pick that port out of the list and use it afterwards. That is, how can I extract COM4 from that long string? I plan to target it by PID.

Thanks.

Upvotes: 0

Views: 251

Answers (2)

jeremija

Reputation: 2538

Let's say your string is stored in the value variable:

value = "[('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'), ('COM3', 'Intel(R) Active Management Technology - SOL (COM3)', 'PCI\\VEN_8086&DEV_9C3D&SUBSYS_05DE1028&REV_04\\3&11583659&0&B3'), ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0')]"
data = [tuple(item[1:][:-1].split("', '")) for item in value[2:][:-2].split('), (')]
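
Because the text looks like the printed form of a Python list of tuples, a less fragile alternative to manual splitting might be `ast.literal_eval` from the standard library. This is only a sketch: it assumes the string is exactly what `print` showed (hence the raw string literal), the sample is shortened to two entries, and the name `parsed` is just for illustration.

```python
import ast

# Raw string, so the text holds exactly what print() showed (backslashes doubled).
value = r"[('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'), ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0')]"

# literal_eval only accepts Python literals, so it cannot run arbitrary code.
parsed = ast.literal_eval(value)

print(parsed[0][0])  # COM4
print(parsed[1][2])  # ACPI\PNP0501\0
```

Unlike the split-based approach, this does not break if a description happens to contain `', '` or `), (`.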

Given:

data = [('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'), ('COM3', 'Intel(R) Active Management Technology - SOL (COM3)', 'PCI\\VEN_8086&DEV_9C3D&SUBSYS_05DE1028&REV_04\\3&11583659&0&B3'), ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0')]

Then a regex can be used to extract the PID value:

import re
pid = re.search('VID:PID=.*:(.*) ', data[0][2]).group(1)

If you want both VID and PID:

import re
match = re.search('VID:PID=(.*):(.*) ', data[0][2])
# Fall back to None for both values when the string has no VID:PID pair
vid, pid = match.groups() if match else (None, None)
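
The patterns above use a greedy `.*` and only match when a space follows the product ID. A somewhat stricter variant could look like the sketch below. It assumes both IDs consist of hexadecimal digits, as in the sample data; `VID_PID_RE` and `parse_vid_pid` are hypothetical names introduced here.

```python
import re

# Hypothetical helper; assumes both IDs are hex digits, as in the sample data.
VID_PID_RE = re.compile(r'VID:PID=([0-9A-Fa-f]+):([0-9A-Fa-f]+)')

def parse_vid_pid(dev_string):
    """Return (vid, pid) as strings, or None if there is no VID:PID pair."""
    match = VID_PID_RE.search(dev_string)
    return match.groups() if match else None

print(parse_vid_pid('USB VID:PID=2108:780C SNR=XXXX'))  # ('2108', '780C')
print(parse_vid_pid('USB VID:PID=2108:780C'))           # also matches, no trailing space needed
print(parse_vid_pid('ACPI\\PNP0501\\0'))                # None
```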

Update

To map a port to PID, or vice versa:

import re

data = [('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'), ('COM3', 'Intel(R) Active Management Technology - SOL (COM3)', 'PCI\\VEN_8086&DEV_9C3D&SUBSYS_05DE1028&REV_04\\3&11583659&0&B3'), ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0')]

def extract_pid(dev_string):
    # Capture the field that follows the colon after 'PID=', or return None.
    match = re.search('PID=.*:(.*) ', dev_string)
    return match.group(1) if match else None

port_to_pid_dict = dict((item[0], extract_pid(item[2])) for item in data)
# port_to_pid_dict = {'COM1': None, 'COM3': None, 'COM4': '780C'}
pid_to_port_dict = dict((extract_pid(item[2]), item[0]) for item in data if 'PID=' in item[2])
# pid_to_port_dict = {'780C': 'COM4'}

And then you can use pid_to_port_dict['780C'], which gives 'COM4'. (In VID:PID=2108:780C, 2108 is the vendor ID and 780C is the product ID.)

Of course, additional logic would be necessary if you had multiple instances of the same product connected to the computer at the same time:

import collections

pid_to_port_dict = collections.defaultdict(list)
for item in data:
    pid = extract_pid(item[2])
    if pid:
        pid_to_port_dict[pid].append(item[0])

Now pid_to_port_dict['780C'] returns a list of the ports that devices with that product ID are connected to: ['COM4'].

It would probably be a better idea to check for both product ID and vendor ID as multiple products from various vendors can have the same product ID.
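
One way to do that is to key the dictionary on the (vendor ID, product ID) pair instead of the product ID alone. The following is a minimal sketch under the same assumptions as before (hexadecimal IDs, sample data shortened to two entries); `ids_to_ports` is a name made up for this example.

```python
import collections
import re

data = [
    ('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'),
    ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0'),
]

# Key on the (vendor ID, product ID) pair so that equal product IDs
# from different vendors do not collide.
ids_to_ports = collections.defaultdict(list)
for port, description, dev_string in data:
    match = re.search(r'VID:PID=([0-9A-Fa-f]+):([0-9A-Fa-f]+)', dev_string)
    if match:
        ids_to_ports[match.groups()].append(port)

print(ids_to_ports[('2108', '780C')])  # ['COM4']
```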

Update 2

Here is how I would do it.

from __future__ import print_function  # makes print() a function on Python 2.7

import collections
import re

class Device:
    def __init__(self, vendorId, productId, port):
        self.vendorId = vendorId
        self.productId = productId
        self.port = port

    def __repr__(self):
        return self.__str__()

    def __str__(self):
        return "Device {}:{} at {}".format(self.vendorId, self.productId, self.port)

def create_defaultdict_with_list():
    return collections.defaultdict(list)

class DeviceParser:
    def __init__(self, devices):
        self.devices = devices

    @staticmethod
    def _parse(device_string, port):
        match = re.search('VID:PID=(.*):(.*) ', device_string)
        if match and len(match.groups()) == 2:
            return Device(vendorId=match.group(1),
                          productId=match.group(2),
                          port=port)
        return None

    def parse_as_list(self):
        devices = []
        for port, description, dev_str in self.devices:
            dev = self._parse(dev_str, port)
            if dev:
                devices.append(dev)
        return devices

    def parse_as_vendor_map(self):
        vendors = collections.defaultdict(create_defaultdict_with_list)
        for port, description, dev_str in self.devices:
            dev = self._parse(dev_str, port)
            if not dev:
                continue
            vendors[dev.vendorId][dev.productId].append(dev)
        return vendors

def main():
    data = [
        ('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'),
        ('COM3', 'Intel(R) Active Management Technology - SOL (COM3)', 'PCI\\VEN_8086&DEV_9C3D&SUBSYS_05DE1028&REV_04\\3&11583659&0&B3'),
        ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0')
    ]
    parser = DeviceParser(data)
    devices = parser.parse_as_list()
    print('recognized device list:', devices)
    vendors = parser.parse_as_vendor_map()
    print('device by vendorId and productId:', vendors['2108']['780C'])
    print('port of device:', vendors['2108']['780C'][0].port)

if __name__ == '__main__':
    main()

Upvotes: 1

Open AI - Opting Out

Reputation: 24164

Bit by bit:

>>> ports = [('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX'),
...          ('COM3', 'Intel(R) Active Management Technology - SOL (COM3)', 'PCI\\VEN_8086&DEV_9C3D&SUBSYS_05DE1028&REV_04\\3&11583659&0&B3'),
...          ('COM1', 'Communications Port (COM1)', 'ACPI\\PNP0501\\0')]

>>> ports[0]
('COM4', 'Neato Robotics USB Port (COM4)', 'USB VID:PID=2108:780C SNR=XXXX')

>>> ports[0][2]
'USB VID:PID=2108:780C SNR=XXXX'

>>> ports[0][2].split(':')
['USB VID', 'PID=2108', '780C SNR=XXXX']

>>> ports[0][2].split(':')[1]
'PID=2108'

>>> ports[0][2].split(':')[1].split('=')
['PID', '2108']

>>> ports[0][2].split(':')[1].split('=')[1]
'2108'

>>> int(ports[0][2].split(':')[1].split('=')[1])
2108
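
One caveat about the steps above: in this string, 2108 is the vendor ID (the first value after `VID:PID=`) and 780C is the product ID, and both look hexadecimal, so `int()` without a base would raise `ValueError` on '780C'. A sketch of the same split-based approach that extracts both values, assuming the layout of the sample string (the variable names are illustrative):

```python
hwid = 'USB VID:PID=2108:780C SNR=XXXX'

# 'VID:PID=2108:780C' -> '2108:780C' -> ['2108', '780C']
ids = hwid.split(' ')[1].split('=')[1].split(':')

# Parse both fields as base-16 numbers.
vid = int(ids[0], 16)
pid = int(ids[1], 16)

print('{:04X}:{:04X}'.format(vid, pid))  # 2108:780C
```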

Upvotes: 2
