Reputation: 175
I am currently reading data from a CSV in which one column holds a string of comma-separated integers. I want to split that string and put each integer in its own column. The strings are all the same length. So far I have this code:
import pandas as pd
column_names = ['W1', 'W2', 'W3', 'W4', 'W5', 'W6', 'W7', 'W8', 'W9', 'W10', 'W11', 'W12', 'W13', 'W14', 'W15', 'W16',
'W17', 'W18', 'W19', 'W20']
db = pd.read_csv(databasefile, skip_blank_lines=True,
names=['A', 'B', 'C', 'D'], header=0)
db[column_names] = db['B'].str.split(',', expand=True)
This code works to some extent: the values from column B are split and written to the new columns in the dataframe. I can confirm this by printing a column, for example print(db["W2"]), which prints the values.
My problem, however, is that the data is written to the dataframe but not to the actual CSV. The columns in column_names (W1 through W20) are not in the CSV. To fix this, I tried:
db = pd.concat([db, pd.DataFrame(columns=column_names)])
I also tried using
db[column_names] = db['B'].str.split(',', expand=True).to_csv(databasefile, index=False)
This does write to the file, but the problem is that it overwrites all of the information in the CSV.
Anyhow, thank you for reading! I would appreciate any help with this problem.
UPDATE:
The desired behavior is to take this CSV, where column B is a string, split that string, and put each number into its own column. This is done by the following code:
db[column_names] = db['B'].str.split(',', expand=True)
This works, and I am able to read the data in each column (W1 to W20). However, the CSV currently only has 4 columns. I am trying to append the information in the dataframe to the CSV, but to_csv only overwrites the data that is already there. I tried the append mode of to_csv, but that never appended the data in the dataframe to the CSV. Hopefully that clarifies the problem: how do I append data from the dataframe to the CSV, that is, add more columns to the CSV and fill those columns with data?
Upvotes: 0
Views: 1909
Reputation: 30609
I'm afraid I still don't fully understand what the desired output is, but maybe this helps you get started. It appends the original data, together with the new columns, to the existing data in the CSV.
import pandas as pd
column_names = ['W1', 'W2', 'W3', 'W4', 'W5', 'W6', 'W7', 'W8', 'W9', 'W10', 'W11', 'W12', 'W13', 'W14', 'W15', 'W16',
'W17', 'W18', 'W19', 'W20']
db = pd.read_csv(databasefile, skip_blank_lines=True, names=['A', 'B', 'C', 'D'], header=None)
db[column_names] = db['B'].str.split(',', expand=True)
with open(databasefile, 'a') as f:
    db.to_csv(f, header=False, index=False)
If you want to replace the string column 'B' with the expanded values you can use:
db[['A']+column_names+['C','D']].to_csv(f, header=False, index=False)
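If the goal is instead to have the new columns appear next to the existing ones, note that opening a CSV in append mode only adds rows at the end of the file, so the usual approach is to rewrite the whole file from the expanded dataframe. Below is a minimal sketch of that idea, not a definitive solution. The sample data, the helper name expand_column and the in-memory buffers are assumptions made for illustration; with a real file you would pass the path to read_csv and to_csv.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the CSV; the real file layout is an assumption.
sample = io.StringIO(
    'A,B,C,D\n'
    '1,"3,7,9",x,y\n'
    '2,"4,8,2",p,q\n'
)

def expand_column(df, source="B", prefix="W"):
    """Return a new frame with `source` split on commas into prefix1..prefixN columns."""
    parts = df[source].str.split(",", expand=True)
    parts.columns = [f"{prefix}{i + 1}" for i in range(parts.shape[1])]
    return pd.concat([df, parts], axis=1)

db = pd.read_csv(sample, skip_blank_lines=True)
db = expand_column(db)

# Write the complete frame (old and new columns, with a header row).
# With a real path this replaces the file contents, which is how the
# new columns end up in the CSV.
out = io.StringIO()
db.to_csv(out, index=False)
print(out.getvalue())
```

Because the full dataframe is written, the original columns A to D are kept, so nothing is lost even though the file is overwritten. The split values are strings; convert them with astype(int) if you need integers.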
Upvotes: 1