Olga

Reputation: 11

How to group split rows by column count?

After splitting the data by a delimiter, I get a different number of columns per row. For example, the first line of the file can have 5 columns, the third 4 columns, and so on.

How can I group the rows by their number of columns after splitting, and create a corresponding dataframe for each group?

The input rows are:

18 olga australia
12 vasily
15 gamaka
30 gobush germeny strauth 4000

As a result, I need to get three dataframes.

Upvotes: 0

Views: 94

Answers (1)

furas

Reputation: 142869

After you split a line, you can use len(row) as a key in a dictionary of lists, so that rows with the same number of columns end up in the same list

data[ len(row) ].append(row)

Once all rows are grouped this way, you can convert every list in the dictionary to a DataFrame
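Note that the one-liner above only works if the key already exists in the dictionary. A minimal sketch of one way to avoid that check, assuming collections.defaultdict from the standard library (the `rows` list here is only illustrative sample data, already split into fields):

```python
from collections import defaultdict

# Illustrative rows, already split into fields.
rows = [
    ['18', 'olga', 'australia'],
    ['12', 'vasily'],
    ['15', 'gamaka'],
]

# A missing key is created as an empty list on first access,
# so no "if key not in data" check is needed.
data = defaultdict(list)

for row in rows:
    data[len(row)].append(row)

print(dict(data))
```

A plain dict with an explicit "not in" check, as in the code below, does the same job; defaultdict only saves the initialization step.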


Minimal working code.

I use io only to simulate a file in memory, so everyone can simply copy and run it, but with a real file you should use open()

text = '''18 olga australia
12 vasily
15 gamaka
30 gobush germeny strauth 4000'''

import io

data = {}

#with open(filename) as fh:
with io.StringIO(text) as fh:
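    for line in fh:
        # split() without arguments splits on any run of whitespace
        # and drops the trailing newline
        row = line.split()
        if not row:
            continue  # skip blank lines
        # create the list for this column count on first use
        if len(row) not in data:
            data[ len(row) ] = []
        data[ len(row) ].append(row)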

#print(data)

for key, val in data.items():
    print('\n--- len:', key, '---\n')
    print(val)
    # create the DataFrame from `val` here

Result

--- len: 3 ---

[['18', 'olga', 'australia']]

--- len: 2 ---

[['12', 'vasily'], ['15', 'gamaka']]

--- len: 5 ---

[['30', 'gobush', 'germeny', 'strauth', '4000']]
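For the final step, converting each group to a DataFrame, a possible sketch is shown here. It assumes pandas is installed; the names `groups` and `frames` are only illustrative, and the columns get default integer labels because the input has no header.

```python
import io
from collections import defaultdict

import pandas as pd

text = '''18 olga australia
12 vasily
15 gamaka
30 gobush germeny strauth 4000'''

# Group rows by their number of columns.
groups = defaultdict(list)

with io.StringIO(text) as fh:
    for line in fh:
        row = line.split()
        if row:  # skip blank lines
            groups[len(row)].append(row)

# One DataFrame per column count, keyed by that count.
# All values stay strings; convert dtypes later if needed.
frames = {n: pd.DataFrame(rows) for n, rows in groups.items()}

for n, df in frames.items():
    print('\n--- len:', n, '---\n')
    print(df)
```

With the sample input this gives three dataframes, available as frames[3], frames[2] and frames[5].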

Upvotes: 2
