DGMS89
DGMS89

Reputation: 1677

Trimming strings in excel for different words

Scenario: I have some rows with string of data in excel. The data is always on the same order ("columns") but the size of the data in each "column" varies. In the original strings, there can be one or multiple blank spaces between each piece of "column" data, and so far I used the trim function to reduce that to 1 blank space.

Objective: I am trying to somehow separate the data from the string in different columns, but inside each column data, there might also be spaces, for example I am trying to output this original:

James Smith                                        code1 code2     10.5 09/23/1900AT PRESENT    UUUB   SJ   SPECIAL                      250AAA No  No  NoCORRECTED  part1

to this with trim:

James Smith code1 code2 10.5 09/23/1900AT PRESENT UUUB SJ SPECIAL 250AAA No No NoCORRECTED part1

as this:

James Smith    code1    code2    10.5    09/23/1900    AT PRESENT    UUUB    SJ    SPECIAL    250AAA    No    No    No   CORRECTED    part1

where each field is in its proper column.

Obs1: One of the problematic fields for me is the one that has the result "AT PRESENT", because there is a space in between, and there is no space between the "AT" and the last digit of the previous column.

Obs2: I also face similar problems in the first row (headers), which also can have more than 1 work per field.

Obs3: Here are two other string examples that appear in the dataset:

code1       03/15/1950TEAM-ALPHA              h/s/s            CERTIFIED3-3/1                          third point                        03/19/1944               --             --SR   SR   Prototype

code1         200000.00especial reduced                Redone                                                                                                  third part                                                          --          No

What I already tried: I have been trying the LEFT, RIGHT and MID functions, but since I cannot foresee how many letters will be in most of the fields, I found no proper way to do it. I also tried doing simple character substitution, but that does not solve the problem of the fields that are mistakenly merged. The first thing I tried was using "text to columns": here the result is also problematic, if I have spaces inside a field it gets divided, and if there is no space between fields, there will also be an error. I am tried to to this as dynamically as possible, to account for different data variants.

Question: Any suggestions or ideas on how to tackle this situation?

Upvotes: 0

Views: 73

Answers (1)

Grosi
Grosi

Reputation: 152

Have you tried Text to Columns on Data tab? Set your original data type delimited and select the "Space" delimiter. Make sure you tick "Treat consecutive delimiters as one"

Upvotes: 1

Related Questions