Reputation: 11
After splitting each line by a delimiter, I get a different number of columns per line. For example, the first line of the file can have 5 columns, the third 4 columns, and so on.
How can I group the rows by column count after splitting and create the corresponding DataFrames?
The input rows are:
18 olga australia
12 vasily
15 gamaka
30 gobush germeny strauth 4000
As a result, I need to get three DataFrames.
Upvotes: 0
Views: 94
Reputation: 142869
When you split a line, you can use len(row) as the key to put the row in a dictionary of lists:
data[ len(row) ].append(row)
After sorting all rows into groups this way, you can convert every element to a DataFrame.
Minimal working code:
I use io only to simulate a file in memory, so everyone can simply copy and run it, but you should use open() for a real file.
text = '''18 olga australia
12 vasily
15 gamaka
30 gobush germeny strauth 4000'''
import io
data = {}
#with open(filename) as fh:
with io.StringIO(text) as fh:
    for line in fh:
        line = line.strip()
        row = line.split(' ')
        if len(row) not in data:
            data[ len(row) ] = []
        data[ len(row) ].append(row)
#print(data)
for key, val in data.items():
    print('\n--- len:', key, '---\n')
    print(val)
    # here create DataFrame
Result
--- len: 3 ---
[['18', 'olga', 'australia']]
--- len: 2 ---
[['12', 'vasily'], ['15', 'gamaka']]
--- len: 5 ---
[['30', 'gobush', 'germeny', 'strauth', '4000']]
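The step left as "here create DataFrame" could be filled in roughly like this. It is only a sketch under a few assumptions: pandas is installed, the names groups and frames are my own, and I swapped the plain dict for collections.defaultdict and split(' ') for split() so that blank lines and repeated spaces do not produce empty fields.

```python
import io
from collections import defaultdict

import pandas as pd

text = '''18 olga australia
12 vasily
15 gamaka
30 gobush germeny strauth 4000'''

# Group rows by their number of columns.
groups = defaultdict(list)

# with open(filename) as fh:
with io.StringIO(text) as fh:
    for line in fh:
        row = line.split()  # split on any whitespace, drops the newline
        if row:  # skip blank lines
            groups[len(row)].append(row)

# One DataFrame per column count; columns get default names 0, 1, 2, ...
frames = {n: pd.DataFrame(rows) for n, rows in groups.items()}

for n, df in frames.items():
    print('\n--- len:', n, '---\n')
    print(df)
```

All values stay strings after splitting, so if you need numbers you would have to convert the relevant columns yourself, for example with pd.to_numeric().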
Upvotes: 2