Andy

Reputation: 1

Extract data from one column of an Excel spreadsheet by searching with part of the data in another column?

Basically, I have an Excel spreadsheet with two columns. Column A has the full gene name, which looks something like gi|748593723|ref|WP_005837193.1| gene name, and Column B has only the accession number, which is the WP_005837193.1 part. Column B is much shorter because it contains the accession numbers of only the genes I am interested in, while Column A is the full list of genes. I need to convert the accession numbers in B to the full format in A. I thought I would be able to have Excel search for B1 in column A and return the cell in column A where it finds the value, but I am struggling. Does anyone know how to go about something like this? Thanks!

Upvotes: 0

Views: 162

Answers (2)

Michael Chad

Reputation: 425

The simplest thing is probably:

  1. Copy all of column "A" into column "C"
  2. Highlight column "C" and use Text to Columns
    • choose Delimited, check Other, then type | into the box
  3. Now copy all of the split-out accession numbers (from column "F"?) and insert them in column "A", which will shift everything over.
  4. Delete all columns from "D" onward

Now enter =VLOOKUP(C2, A:B, 2, FALSE) in cell "D2", assuming your data starts in row 2, and fill down to the bottom of your accession list. Column "D" should then hold the full gene name for each accession number.
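
If it helps to see the logic outside Excel, here is a rough Python sketch of the same split-then-lookup idea. It is only an illustration: the two lists are made-up stand-ins for the spreadsheet columns (the second gene name and the second accession number are invented), and it assumes the accession number is always the fourth |-separated field, as in the example from the question.

```python
# Hypothetical sample data standing in for the two spreadsheet columns.
# Only the first full name comes from the question; the rest is invented.
full_names = [
    "gi|748593723|ref|WP_005837193.1| gene name",
    "gi|100000001|ref|WP_000000001.1| another gene",
]
wanted = ["WP_005837193.1", "WP_999999999.1"]

# Split each full name on "|" (the Text to Columns step) and key it by the
# fourth field, which is assumed to hold the accession number.
lookup = {name.split("|")[3]: name for name in full_names}

# Exact-match lookup, similar to VLOOKUP(..., FALSE).
# None marks an accession that was not found (Excel would show #N/A).
results = [lookup.get(accession) for accession in wanted]

for accession, full_name in zip(wanted, results):
    print(accession, "->", full_name)
```

An accession number that is missing from the full list comes back as None here, which corresponds to the #N/A that VLOOKUP would show in that row.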

Upvotes: 0

The easiest way to solve your problem is with a regular expression (I use a regex add-in for Excel):

  • Insert a column before column A (now original columns A and B are changed to B and C)

  • formula in column A (starting from A2 supposing you have headers): =rxfind(B2,"WP[^|]*")

  • formula in column D: =vlookup(C2,A:B,2,false)
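
To show what the pattern WP[^|]* picks out, here is a small Python sketch using the standard re module. This is not the Excel add-in itself, just the same pattern applied elsewhere; the function name extract_accession is made up for the example.

```python
import re

# Same pattern as in the formula above: "WP" followed by every character
# up to (but not including) the next "|".
ACCESSION = re.compile(r"WP[^|]*")


def extract_accession(full_name):
    """Return the first WP... accession found in full_name, or None."""
    match = ACCESSION.search(full_name)
    return match.group(0) if match else None


print(extract_accession("gi|748593723|ref|WP_005837193.1| gene name"))
```

Note that the pattern would also match a stray "WP" inside the gene description if it appeared before the accession field, so it relies on the names following the format shown in the question.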

Upvotes: 1
