Reputation: 21
Hi, I'm stuck on this exercise from my Python course. There is a list of data stored as a string. I want to clean it and save it in separate lists. As you can see in the output, it didn't work out well: some of the individual data points are stored together, and \n is stored as plain text. I want to append the different categories to the lists meant for them, but I can't just iterate through them because the same category isn't always at the same index in the list.
I've heard something about the pandas library, but this is a beginner exercise, so there shouldn't be anything imported.
daily_sales= """Edith Mcbride ;,;$1.21 ;,; white ;,;
09/15/17 ,Herbert Tran ;,; $7.29;,;
white&blue;,; 09/15/17 ,Paul Clarke ;,;$12.52
;,; white&blue ;,; 09/15/17 ,Lucille Caldwell
;,; $5.13 ;,; white ;,; 09/15/17,
Eduardo George ;,;$20.39;,; white&yellow
;,;09/15/17 , """
sales_list = daily_sales.replace(" ", "")
sales_list = sales_list.replace(";,;", " | ")
daily_transactions = sales_list.split(",")
daily_transactions_split = sales_list.split("|")
transactions_clean = []
for compunds in daily_transactions_split:
    transactions_clean.append(compunds.strip())
Output of transactions_clean:
['EdithMcbride', '$1.21', 'white', '09/15/17,HerbertTran', '$7.29', 'white&blue', '09/15/17,PaulClarke', '$12.52', 'white&blue', '09/15/17,LucilleCaldwell', '$5.13', 'white', '09/15/17,\nEduardoGeorge', '$20.39', 'white&yellow', '09/15/17,']
Upvotes: 0
Views: 157
Reputation: 82899
Your code is almost working. The only problem is that you split by "," and then split the original string by "|", instead of splitting each of the substrings resulting from the previous split operation.
Also, replacing the spaces does not really make sense, as there are spaces in the names, and you are still left with the \n. Better to strip the individual entries.
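Written out as explicit loops, those steps might look roughly like the sketch below. The loop variable names (transaction, field, fields) are only illustrative, and the data is the string from the question as it appears above.

```python
daily_sales = """Edith Mcbride ;,;$1.21 ;,; white ;,;
09/15/17 ,Herbert Tran ;,; $7.29;,;
white&blue;,; 09/15/17 ,Paul Clarke ;,;$12.52
;,; white&blue ;,; 09/15/17 ,Lucille Caldwell
;,; $5.13 ;,; white ;,; 09/15/17,
Eduardo George ;,;$20.39;,; white&yellow
;,;09/15/17 , """

# Replace the field separator first, so the remaining commas
# only separate whole transactions.
sales_list = daily_sales.replace(";,;", "|")

transactions_clean = []
for transaction in sales_list.split(","):
    # Split each transaction (not the original string) into its fields.
    fields = []
    for field in transaction.split("|"):
        # strip() removes surrounding spaces and newlines,
        # but keeps the space inside a name.
        fields.append(field.strip())
    transactions_clean.append(fields)

print(transactions_clean)
```

The key point is that the inner split runs on each piece produced by the outer split, which is what keeps one customer's date from ending up glued to the next customer's name.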
You can combine those steps in a single list comprehension if you like:
sales_list = daily_sales.replace(";,;", " | ")
transactions_clean = [[x.strip() for x in t.split("|")]
                      for t in sales_list.split(",")]
Result for transactions_clean:
[['Edith Mcbride', '$1.21', 'white', '09/15/17'],
['Herbert Tran', '$7.29', 'white&blue', '09/15/17'],
['Paul Clarke', '$12.52', 'white&blue', '09/15/17'],
['Lucille Caldwell', '$5.13', 'white', '09/15/17'],
['Eduardo George', '$20.39', 'white&yellow', '09/15/17'],
['']]
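The last entry, [''], comes from the trailing comma at the end of the string. If the goal is one list per category, as the question describes, a possible next step is to skip that incomplete entry and unpack the rest. This is only a sketch: the list names customers, sales, thread_sold and dates are assumptions, not names given by the exercise.

```python
# Starting point: the nested list shown in the result above.
transactions_clean = [
    ['Edith Mcbride', '$1.21', 'white', '09/15/17'],
    ['Herbert Tran', '$7.29', 'white&blue', '09/15/17'],
    ['Paul Clarke', '$12.52', 'white&blue', '09/15/17'],
    ['Lucille Caldwell', '$5.13', 'white', '09/15/17'],
    ['Eduardo George', '$20.39', 'white&yellow', '09/15/17'],
    [''],
]

# Hypothetical target lists, one per category.
customers = []
sales = []
thread_sold = []
dates = []

for transaction in transactions_clean:
    # Skip anything that is not a complete 4-field record,
    # such as the [''] left over from the trailing comma.
    if len(transaction) != 4:
        continue
    name, price, colors, date = transaction
    customers.append(name)
    sales.append(price)
    thread_sold.append(colors)
    dates.append(date)

print(customers)
print(sales)
print(thread_sold)
print(dates)
```

Because every complete transaction now has its fields in the same order, the category no longer depends on its position in one long flat list.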
Upvotes: 2