Reputation: 165
Using Python, I'd like to create a loop to write text in a CSV file when a row contains text.
The original CSV format is:
user_id, text
0,
1,
2,
3, sample text
4, sample text
I'm seeking to add another column "text_number" that will insert the string "text_x", with x representing the number of texts in the column. I'd like to iterate this and increase the string's value by +1 for each new text. The final product would look like:
user_id, text, text_number
0,
1,
2,
3, sample text, text_0
4, sample text, text_1
With my working code I can insert the header "text_number", but I'm having difficulty in putting together the loop for text_x.
import csv
output = list()
with open("test.csv") as file:
    csv_reader = csv.reader(file)
    for i, row in enumerate(csv_reader):
        if i == 0:
            output = [row+["text_number"]]
            continue
        # here's where I'm stuck
with open("output2.csv", "w", newline="") as file:
    csv_writer = csv.writer(file, delimiter=",")
    for row in output:
        csv_writer.writerow(row)
Any thoughts?
Upvotes: 0
Views: 966
Reputation: 3987
You can try this:
import csv
output = list()
x=0
with open("test.csv") as file:
    csv_reader = csv.reader(file)
    for i, row in enumerate(csv_reader):
        row[1]=row[1].strip()
        if i == 0:
            row.append("text_number")
        else:
            if row[1]=="":
                row.append(" ")
            else:
                row.append(f"text_{x}")
                x+=1
        output.append(row)
with open("output2.csv", "w", newline="") as file:
    csv_writer = csv.writer(file, delimiter=",")
    for row in output:
        csv_writer.writerow(row)
I have kept most of your code as it was. On every iteration I append a new element to row, then append that row to output to build the new list of rows. The counter x only increases when a row actually contains text.
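To check the logic without touching any files, the same row-by-row idea can be run against in-memory text. This is only a sketch: the helper name add_text_numbers and the SAMPLE string are made up for illustration, and it appends an empty string (rather than " ") to rows without text so that every row keeps three columns.

```python
import csv
import io

# Hypothetical sample data mirroring the file in the question.
SAMPLE = "user_id, text\n0,\n1,\n2,\n3, sample text\n4, sample text\n"


def add_text_numbers(rows):
    """Return new rows with a text_number column appended (illustrative helper)."""
    rows = iter(rows)
    header = next(rows)
    output = [header + ["text_number"]]
    counter = 0
    for row in rows:
        # Treat a missing or whitespace-only second cell as "no text".
        has_text = len(row) > 1 and row[1].strip() != ""
        if has_text:
            output.append(row + [f"text_{counter}"])
            counter += 1
        else:
            output.append(row + [""])
    return output


result = add_text_numbers(csv.reader(io.StringIO(SAMPLE)))
for row in result:
    print(row)
```

Because the helper accepts any iterable of rows, it should also work with a csv.reader wrapped around a real file opened with newline="".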
If you are comfortable with pandas, you can try this too:
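import pandas as pd
# keep_default_na=False keeps empty cells as "" instead of NaN, so .strip() works
df=pd.read_csv("test.csv", keep_default_na=False)
r=[]
x=0
for i in range(df.shape[0]):
    # the column name keeps its leading space because the header is "user_id, text"
    if df[" text"][i].strip()=="":
        r.append(" ")
    else:
        r.append(f"text_{x}")
        x+=1
df["text_number"]=r
print(df)
"""
   user_id          text text_number
0        0
1        1
2        2
3        3   sample text      text_0
4        4   sample text      text_1
"""
# to_csv is a DataFrame method; index=False leaves out the row index column
df.to_csv("output2.csv", index=False)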
Here we build a list of values for the text_number column and then assign it to the DataFrame.
Upvotes: 1
Reputation: 9047
See the comments in the code for an explanation.
# assuming the file
# user_id,text
# 0,
# 1,
# 2,
# 3,sample text
# 4,sample text
# 5,
# 6,sample text
# import the library
import pandas as pd
df = pd.read_csv('test.csv').fillna('')
# creating column text_number initializing with ''
df['text_number'] = ''
# getting the index where text is valid
index = df.loc[df['text'].str.strip().astype(bool)].index
# finally creating the column text_number with increment as 0, 1, 2 ...
df.loc[index, 'text_number'] = [f'text_{i}' for i in range(len(index))]
print(df)
# save it to disk
df.to_csv('output2.csv', index=False)  # index=False leaves out the row index column
#    user_id         text text_number
# 0        0
# 1        1
# 2        2
# 3        3  sample text      text_0
# 4        4  sample text      text_1
# 5        5
# 6        6  sample text      text_2
Upvotes: 1
Reputation: 11321
You could try the following modification of your first part:
import csv
output = list()
with open("test.csv") as file:
    csv_reader = csv.reader(file)
    # the first row is the header
    output.append(next(csv_reader) + ['text_number'])
    text_no = 0
    for row in csv_reader:
        # only rows with non-blank text get a numbered label
        if row[1].strip():
            row.append(f'text_{text_no}')
            text_no += 1
        output.append(row)
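If the intermediate output list is not needed, the read and write steps could also be combined so that each row is written as soon as it is read. The following is a rough sketch under that assumption, not a drop-in replacement: the function name number_rows and the in-memory sample are hypothetical, and the len(row) > 1 guard is an extra precaution for short rows.

```python
import csv
import io


def number_rows(src, dst):
    """Copy CSV rows from src to dst, numbering the rows that contain text."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    # The first row is the header; extend it with the new column name.
    writer.writerow(next(reader) + ["text_number"])
    text_no = 0
    for row in reader:
        # Only rows with a non-blank second cell get a label.
        if len(row) > 1 and row[1].strip():
            row.append(f"text_{text_no}")
            text_no += 1
        writer.writerow(row)


# In-memory demo; with real files, pass objects opened with newline="".
source = io.StringIO("user_id,text\n0,\n3,sample text\n4,sample text\n")
target = io.StringIO()
number_rows(source, target)
print(target.getvalue())
```

Rows without text are written unchanged, so they simply end after the empty text cell, as in the desired output from the question.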
Upvotes: 1