How to change order of string based on dates

Question

I received data with a string variable that looks something like:

var_name
25-DEC-99: A11, B14, C89; 28-FEB-94: A27, B94, C30
01-APR-11: A25, B82, C65
04-JUL-09: A21, B55, C26; 12-MAR-03: A11, B72, C68; 08-JUN-11: A62, B47, C82
12-JUN-00: A77, B19, C73; 03-JUL-12: A99, B04, C54
27-OCT-15: A22, B95, C08

And so on. My goal is to split these strings into separate variables named v1_date, v1_A, v1_B, v1_C, v2_date, v2_A, v2_B, v2_C, v3_date, v3_A, v3_B, v3_C.

I can use split var_name, p(";"), rename the results to v1, v2, and v3, and then split each of those again. The problem is that I want v1, v2, and v3 to be in chronological order by date, and the data are not currently arranged that way. How can I make the date of v1 come before the date of v2, and the date of v2 come before the date of v3? For example, in the first observation I want 28-FEB-94: A27, B94, C30 to be associated with v1 and 25-DEC-99: A11, B14, C89 to be associated with v2.

Roberto Ferrer · Accepted Answer

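The following gets you close, I believe. It uses both split and reshape: split the string into its dated groups, reshape long so each group is its own row, number the groups by date within each original observation, then reshape wide on that new ordering.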

clear
set more off

input ///
str100 myvar
"25-DEC-99: A11, B14, C89; 28-FEB-94: A27, B94, C30"
"01-APR-11: A25, B82, C65"
"04-JUL-09: A21, B55, C26; 12-MAR-03: A11, B72, C68; 08-JUN-11: A62, B47, C82"
"12-JUN-00: A77, B19, C73; 03-JUL-12: A99, B04, C54"
"27-OCT-15: A22, B95, C08"
end

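* Split each observation into one variable per dated group
split myvar, p(;)
drop myvar

* One row per dated group, keyed by the original observation
gen obs = _n
reshape long myvar, i(obs)
drop if missing(myvar)

* Separate the date from the cell list
split myvar, p(:)
drop myvar

* Convert to a Stata date; 2020 is the latest year a two-digit year may map to
gen myvar11 = date(myvar1, "DMY", 2020)
format %td myvar11

drop myvar1
rename (myvar11 myvar2) (mydate mycells)
order mydate, before(mycells)

* Number the groups chronologically within each original observation
bysort obs (mydate) : gen neworder = _n
drop _j

* Back to one row per observation, now in date order
reshape wide mydate mycells, i(obs) j(neworder)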

list

You can loop over the mycells variables if you need to further split them.
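As a rough sketch of that loop, the following continues from the wide data produced above (mydate1, mycells1, mydate2, mycells2, and so on). It assumes every group holds exactly three comma-separated entries in the order A, B, C, as in the example data. The stub name part is a throwaway chosen for illustration, and the v1_date, v1_A, ... names are taken from the question.

```stata
* Assumption: each mycells variable holds exactly three entries (A, B, C)
foreach c of varlist mycells* {
    * Recover the numeric suffix, e.g. "2" from "mycells2"
    local i = subinstr("`c'", "mycells", "", 1)

    * Split on commas into temporary variables part1-part3
    split `c', parse(",") generate(part)
    rename (part1 part2 part3) (v`i'_A v`i'_B v`i'_C)

    * Remove the blanks left over from the ", " and ": " separators
    foreach v of varlist v`i'_A v`i'_B v`i'_C {
        replace `v' = trim(`v')
    }

    rename mydate`i' v`i'_date
    drop `c'
}

list
```

If some groups could carry more or fewer than three entries, the fixed rename would fail, so the number of variables created by split would need to be checked first.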
