bricevk

Reputation: 207

Removing characters before a certain value in variable names in Stata

EDIT: This issue is resolved. On import, Stata turned the column headers from Excel into variable "labels" and automatically generated the variable "names" that I needed, so the question is no longer necessary.

I have a dataset in Stata that has a handful of variable names, some of which begin with a number and a period. Like so:

name of car    62. color of car    145. year of sale    state of sale
Accord         Red                 1995                 GA
Corvette       Pink                2010                 FL
...

How can I remove the numbers from the variable names that contain them so that I wind up with:

name of car    color of car        year of sale         state of sale
Accord         Red                 1995                 GA
Corvette       Pink                2010                 FL
...

I have some familiarity with the substr() function, but the number of characters I need to remove is not consistent from one name to the next. What I need is to remove everything up to and including the period that follows the number.

Upvotes: 0

Views: 778

Answers (1)

Nick Cox

Reputation: 37183

All those "names" are illegal as variable names, because Stata variable names just can't include spaces or periods or start with a number.

So either your Stata is corrupted beyond belief or you're misunderstanding what you have.

My best guess is that you have read in metadata so that text that could and should be variable labels is in fact making up the first observation (row) in your dataset. If so, the best advice is to go back and repeat the import so that the metadata is not read in as data. The import commands have options to control that.
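As a rough sketch of what such a re-import might look like, assuming the data live in an Excel workbook (the file name cars.xlsx is hypothetical): the firstrow option of import excel treats the first row as headers rather than data. Header text that is not a legal variable name should then end up in the variable labels, and the loop below strips a leading "62. "-style prefix from each label. It assumes Stata 14 or later for ustrregexrf().

```stata
* Hypothetical file name; substitute the actual workbook.
* firstrow: treat the first row as column headers, not as an observation.
import excel using "cars.xlsx", firstrow clear

* For a delimited text file, the analogous option would be:
* import delimited using "cars.csv", varnames(1) clear

* Check which names and labels Stata assigned.
describe

* Remove a leading number, period, and any following spaces from each label.
foreach v of varlist _all {
    local lbl : variable label `v'
    local new = ustrregexrf(`"`lbl'"', "^[0-9]+\.\s*", "")
    label variable `v' `"`new'"'
}
```

The regular expression is anchored at the start of the label, so labels without a numeric prefix are left unchanged.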

In any case, it is immensely better to show data examples using dataex: see the tag wiki for Stata.
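For illustration only, here is roughly how dataex is used, shown on the auto dataset shipped with Stata rather than on the data in the question. On older Stata versions dataex may first need to be installed from SSC.

```stata
* Only needed if dataex is not already installed:
* ssc install dataex

* Load a dataset shipped with Stata, used here purely as a stand-in.
sysuse auto, clear

* Write out the first five observations of three variables as code
* that others can copy and run to recreate the example.
dataex make price mpg in 1/5
```

The output is an input block that can be pasted directly into a question, preserving variable names, storage types, and labels.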

Upvotes: 1
