Reputation: 1012
I'm trying to dump a .csv file into a .yml file and have succeeded. The only problem is that the syntax in the .yml file is not how I want it.
My .csv file:
NAME,KEYWORDS
Adam,Football Hockey
Where I read the .csv file and dump it into a .yml file:
import pandas
import yaml

# Read the whole CSV file with the pandas library
df = pandas.read_csv('keywords.csv')

# Dump the DataFrame into getData.yml as YAML
with open('getData.yml', 'w') as outfile:
    yaml.dump(
        df.to_dict(orient='records'),
        outfile,
        sort_keys=False,
        width=72,
        indent=4
    )
How the .yml output looks:
- NAME: Adam
  KEYWORDS: Football Hockey
How I want it to look:
- NAME: Adam
  KEYWORDS: Football, Hockey
I want a comma between Football and Hockey. But if I put that comma in the .csv file, the row is parsed incorrectly, because the comma is already the field separator. How can I do this?
Upvotes: 0
Views: 3088
Reputation: 1003
The accepted answer is perfectly good. It seems the task is converting a csv file into yaml. If that is the case, the pandas library is not really necessary, as the built-in csv module can read csv files.
import csv
import yaml

# Read the CSV rows and join the space-separated keywords with commas
with open('keywords.csv', newline='') as f:
    reader = csv.reader(f)
    next(reader)  # skip header
    name_keywords = [{'NAME': n, 'KEYWORDS': ', '.join(k.split())}
                     for n, k in reader]

# Dump the list of records into getData.yml as YAML
with open('getData.yml', 'w') as outfile:
    yaml.dump(
        name_keywords,
        outfile,
        sort_keys=False,
        width=72,
        indent=4
    )
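If the file later gains more columns, csv.DictReader keeps the header names instead of skipping them. The following is only a minimal sketch, assuming the same two columns as in the question; the in-memory sample and the helper name to_records are illustrative and not part of the original code.

```python
import csv
import io


def to_records(lines):
    """Return one dict per CSV row, with KEYWORDS joined by ', '."""
    records = []
    for row in csv.DictReader(lines):
        # split() with no argument collapses any run of whitespace
        row['KEYWORDS'] = ', '.join(row['KEYWORDS'].split())
        records.append(dict(row))
    return records


# In-memory stand-in for keywords.csv (assumed contents from the question)
sample = io.StringIO("NAME,KEYWORDS\nAdam,Football Hockey\n")
records = to_records(sample)
print(records)  # [{'NAME': 'Adam', 'KEYWORDS': 'Football, Hockey'}]
```

The resulting list can be passed to yaml.dump in place of name_keywords.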
Upvotes: 0
Reputation: 490
You have 2 options for that:
In a CSV file, a comma inside a quoted field is not treated as a delimiter during parsing. This way, your CSV file would look as follows:
NAME,KEYWORDS
Adam,"Football, Hockey"
Alternatively, you can process the KEYWORDS column after reading it. This would add the following to your code:
df = pandas.read_csv('keywords.csv')
df["KEYWORDS"] = df["KEYWORDS"].apply(lambda x: ", ".join(x.split()))
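To check the first option without touching any files, here is a small sketch that parses the quoted row with the built-in csv module. The in-memory string stands in for keywords.csv and is an assumption for illustration.

```python
import csv
import io

# The quoted field contains a comma, so it must stay a single column
quoted = io.StringIO('NAME,KEYWORDS\nAdam,"Football, Hockey"\n')
rows = list(csv.reader(quoted))
print(rows)  # [['NAME', 'KEYWORDS'], ['Adam', 'Football, Hockey']]
```

pandas.read_csv follows the same quoting convention by default, so the quoted comma should survive there as well.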
Upvotes: 1
Reputation: 888
I reproduced your dataframe with:
import io
import pandas as pd

df = pd.read_csv(io.StringIO(
"""
NAME,KEYWORDS
Adam,Football Hockey
"""
), sep=",")
I assume that there can be multiple keywords, each separated by a space. To insert commas you can use the apply() method that pandas provides.
df.KEYWORDS = df.KEYWORDS.apply(lambda k: k.replace(" ", ", "))
Then run the rest of your code to produce the desired outcome.
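One caveat, shown as a small sketch: replace(" ", ", ") assumes exactly one space between keywords. If the data might contain repeated spaces (an assumption, the sample row does not), splitting and re-joining is more forgiving.

```python
keywords = "Football  Hockey"  # note the two spaces

# replace() turns every single space into ", ", including the extra one
with_replace = keywords.replace(" ", ", ")

# split() with no argument treats any run of whitespace as one separator
with_split = ", ".join(keywords.split())

print(repr(with_replace))  # 'Football, , Hockey'
print(repr(with_split))    # 'Football, Hockey'
```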
Upvotes: 0