ReverseEngineer
ReverseEngineer

Reputation: 559

Convert a text file into table using Python

I have a text file that looks like this

client interface: GigabitEthernet10/0/0.100 is up
active state: up
line state: up
vc state: down
...
colour: -

client interface: EthTrunk10.0.1 is up
active state: up
line state: up
vc state: down
...
colour: -

The list can go very long with around 5000-10000 lines of text

Any better idea to convert it to a table form like below?

Client Interface           Active State    Line State    VC State    ...     Color
GigabitEthernet10/0/0.100     up              up           down                 -
EthTrunk10.0.1                up              up           down                 -

Upvotes: 0

Views: 1849

Answers (4)

DarrylG
DarrylG

Reputation: 17156

Using Pandas

import re

# Pattern for recognizing category and value
pattern = re.compile(r'^(\w+ \w+): (\S+)')

details = {}
with open('data.txt') as file:
    for line in file:
        line = line.rstrip()
        res = pattern.match(line)
        label = res.group(1).title()   # category 
        value = res.group(2)           # value
        details.setdefault(label, []).append(value)
        
df = pd.DataFrame(details)

Data Frame

    Client Interface                  Active State  Line State  Vc State
0   GigabitEthernet10/0/0.100         up            up          down
1   EthTrunk10.0.1                    up            up          down

Upvotes: 0

Chinni Srikar
Chinni Srikar

Reputation: 30

I guess this is what you are looking for. Since, I don't know the exact amount of things each interface contains, I found headers first and then looped inside.

import pandas as pd
with open("sample.txt","r") as f:
    lines = f.readlines()
    lines = [l.strip() for l in lines]
n = len(lines)
headers = set()
for i in range(n):
    if(len(lines[i])>=3):
        key, value = lines[i].split(":")
        headers.add(key)
hlen = len(headers)
completeList = list()
for i in range(0,n,hlen):
    dictrow = dict()
    for j in range(hlen):
        if(j+i<n and len(lines[j+i])>=3):
            key,value = lines[j+i].split(":")
            dictrow[key] = value
    completeList.append(dictrow)
    dictrow = dict()
df = pd.DataFrame(completeList)

Upvotes: 0

vhessel
vhessel

Reputation: 101

Assuming You know the number of columns per interface:

import pandas as pd

df = pd.read_csv('text_filename.txt', sep=':', header=None)
N_COLS = 5

table_data = df.iloc[:,1].values.reshape(-1, N_COLS)
table_header = df.iloc[:n_cols,0].values

df = pd.DataFrame(table_data, columns=table_header)

Output:

               client interface active state line state vc state colour
0   GigabitEthernet10/0/0.100 is up           up         up     down      -
1              EthTrunk10.0.1 is up           up         up     down      -

Upvotes: 0

wasif
wasif

Reputation: 15478

To import in pandas use like this:

import pandas as pd
df = pd.read_csv("filename.txt", sep=":", header=None)

Upvotes: 1

Related Questions