Reputation: 1
I've tried to copy/paste a price list from a PDF into an excel sheet, but everything pastes into a single column (A), rather than converting it to the appropriate number of columns.
Using text to columns with a space delimiter doesn't work because many of the products have multiple spaces in the product name.
I've got about 300 lines with varying lengths of item description. There should be 4 columns: Item number, description, qty and price. But all 4 columns may have varying numbers of delimiters and characters.
Example:
MGB123 Brand 20 Mil Business Card Magnet 500 .225
NNR1221 Brand 15 Mil Custom Shaped Marker and Clip 250 4.123
PQXRF123 Brand 10 Mil Pen 1000 .221
etc...
I've used:
=MID(A1,FIND("~~~~~",SUBSTITUTE(A1," ","~~~~~",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))+1,LEN(A1))
to get the price separated, but how would I alter the formula to get the next to last column?
Upvotes: 0
Views: 1000
Reputation: 60224
Here are formulas that should work on earlier versions of Excel also:
Item Number: =LEFT(A2,FIND(" ",A2)-1)
Description: =MID(A2,FIND(" ",A2)+1,FIND(CHAR(1), SUBSTITUTE(A2," ",CHAR(1),LEN(A2)-LEN(SUBSTITUTE(A2," ",""))-1))-FIND(" ",A2))
Qty: =--TRIM(MID(SUBSTITUTE(A2," ",REPT(" ",99)),99*2,99))
Price: =--TRIM(RIGHT(SUBSTITUTE(A2," ",REPT(" ",99)),99))
The double unary at the beginning of Qty and Price are to convert the string to a number.
Upvotes: 1
Reputation: 9857
You can use these 4 formulas to extract the item number, name, quantity and price.
=FILTERXML("<t><s>"& SUBSTITUTE(A1," ","</s><s>")&"</s></t>", "//s[1]")
=TEXTJOIN(" ",TRUE, FILTERXML("<t><s>"& SUBSTITUTE(A1," ","</s><s>")&"</s></t>", "//s[position()>1 and position()<last()-1]"))
=FILTERXML("<t><s>"& SUBSTITUTE(A1," ","</s><s>")&"</s></t>", "//s[last()-1]")
=FILTERXML("<t><s>"& SUBSTITUTE(A1," ","</s><s>")&"</s></t>", "//s[last()]")
If TEXTJOIN isn't available in your version of Excel you could try using CONCAT.
=CONCAT(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>", "//s[position()>1 and position()<last()-1]")&" ")
Which, again depending on your version of Excel, might need to be entered as an array formula using CTRL+SHIFT+ENTER.
If CONCAT is unavailable try this,
=MID(A1,FIND(" ",A1)+1,FIND(FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>", "//s[last()-1]"),A1)-FIND(" ",A1)+2)
though I'm sure there's probably a better way.:)
Upvotes: 2