Eldo Sunny
Eldo Sunny

Reputation: 21

Split and use values from Excel columns

I'm really new to coding. I have 2 columns in Excel - one for ingredients and other the ratio.

Like this:

ingredients [methanol/ipa,ethanol/methanol,ethylacetate]     
spec[90/10,70/30,100]     
qty[5,6,10]

So this data is entered continuously. I want to get the total amount of ingredients, by eg from first column methanol will be 5x90 and ipa will be 10x5.

I tried to split them based on / and use a for loop to iterate

import pandas as pd

solv={'EA':0,'M':0,'AL':0,'IPA':0}
data_xls1=pd.read_excel(r'C:\Users\IT123\Desktop\Solvent stock.xlsx',Sheet_name='PLANT',index_col=None)  
sz=range(len(data_xls1.index)) 
a=data_xls1.Solvent.str.split('/',0).tolist() 
b=data_xls1.Spec.str.split('/',0).tolist()
print(a)
for i in sz: 
    print(b[i][0:1])
    print(b[i][1:2])

I want to split the ingredients and spec column multiply with qty and store in a solve dictionary

Error right now is float object is not subscript able

Upvotes: 2

Views: 72

Answers (2)

Eldo Sunny
Eldo Sunny

Reputation: 21

I should have posted this first bur this is what my input excel sheet looks like

Upvotes: 0

Daniel Večeřa
Daniel Večeřa

Reputation: 56

You have already found the key part, namely using the str.split function.

I would suggest that you bring the data to a a long format like this:

|    |   Transaction | ingredients   |   spec |   qty |
|---:|--------------:|:--------------|-------:|------:|
|  0 |             0 | methanol      |     90 |   4.5 |
|  1 |             0 | ipa           |     10 |   0.5 |
|  2 |             1 | ethanol       |     70 |   4.2 |
|  3 |             1 | methanol      |     30 |   1.8 |
|  4 |             2 | ethylacetate  |    100 |   10  |

The following code produces that result:

import pandas as pd

d = {"ingredients":["methanol/ipa","ethanol/methanol","ethylacetate"],
"spec":["90/10","70/30","100"],
"qty":[5,6,10]
    }

df = pd.DataFrame(d)
df.index = df.index.rename("Transaction") # Add sensible name to the index
#Each line represents a transcation with one or more ingridients

#Following lines split the lines by the delimter. Stack Functinos moves them to long format.
ingredients = df.ingredients.str.split("/", expand = True).stack()
spec = df.spec.str.split("/", expand = True).stack()

Each of them will look like this:

|  TrID, |spec |
|:-------|----:|
| (0, 0) |  90 |
| (0, 1) |  10 |
| (1, 0) |  70 |
| (1, 1) |  30 |
| (2, 0) | 100 |

Now we just need to put everything together:

df_new = pd.concat([ingredients, spec], axis = "columns")
df_new.columns = ["ingredients", "spec"]

#Switch from string to float
df_new.spec =  df_new.spec.astype("float")

#Multiply by the quantity, 
#Pandas automatically uses Transaction (Index of both frames) to filter accordingly
df_new["qty"] = df_new.spec  * df.qty / 100

#As long as you are not comfortable to work with multiindex, just run this line:
df_new = df_new.reset_index(level = 0, drop = False).reset_index(drop = True)

The good thing about this format is that you can have a multiple-way splits for your ingredients, str.split will work without a problem, and summing up is straightforward.

Upvotes: 2

Related Questions