gnixon14

Reputation: 31

Parsing file names with javascript

I have file names like the following:

SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT
SEM_VSE_SECURITY_555001881_181002_1559_37072093.DAT
SEM_VSE_MEDICALCONDEMERGENCIES_555001881_181002_1559_37072093.DAT
SEM_REASONS_555001881_181002_1414_37072093.DAT
SEM_PSE_NPI_SECURITY_555001881_181002_1412_37072093.DAT

and I need to strip the numbers from the end. This will happen daily and the numbers will change. I HAVE to do it in JavaScript. The problem is that I know almost nothing about JavaScript. I've looked at both split and slice, and I'm not sure either will work. These files come from a government entity, which means the file names will probably not be consistent.

expected output:

SEM_VSE_SKINSHARPS

SEM_VSE_SECURITY

SEM_VSE_MEDICALCONDEMERGENCIES

SEM_REASONS

SEM_PSE_NPI_SECURITY

Any help is greatly appreciated.

Upvotes: 1

Views: 78

Answers (3)

Tom O.

Reputation: 5941

Below is a solution that assumes your file names are stored in an array. It builds a new array of cleaned file names by calling Array.prototype.map on the original array. The map callback first grabs the extension so it can be appended again at the end. Next, it splits the fileName string into an array on the _ character. Finally, the filter callback returns true for each part that contains no digit, which keeps that part in the new array; for a part that contains a digit it returns false, so that part is dropped. Note that the result keeps the extension, for example SEM_VSE_SKINSHARPS.DAT.

var fileNames = ['SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT', 'SEM_VSE_SECURITY_555001881_181002_1559_37072093.DAT', 'SEM_VSE_MEDICALCONDEMERGENCIES_555001881_181002_1559_37072093.DAT', 'SEM_REASONS_555001881_181002_1414_37072093.DAT', 'SEM_PSE_NPI_SECURITY_555001881_181002_1412_37072093.DAT'];

var formattedFileNames = fileNames.map(fileName => {
  var ext = fileName.substring(fileName.indexOf('.'), fileName.length);
  var parts = fileName.split('_');
  return parts.filter(part => !part.match(/[0-9]/g)).join('_') + ext;
});

console.log(formattedFileNames);
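
The expected output in the question has no extension, so here is a minimal sketch of the same split/filter/join idea with the extension dropped instead of re-appended. The helper name stripNumericParts is hypothetical (it is not part of the answer above), and the sketch assumes that any part containing a digit should be removed, which would also drop a mixed part such as NPI2.

```javascript
// Hypothetical helper: same approach as above, but without the extension.
function stripNumericParts(fileName) {
  // Drop everything from the last dot onward, if there is a dot.
  var dot = fileName.lastIndexOf('.');
  var base = dot === -1 ? fileName : fileName.slice(0, dot);
  // Keep only the underscore-separated parts that contain no digit.
  return base.split('_').filter(function (part) {
    return !/[0-9]/.test(part);
  }).join('_');
}

var sample = [
  'SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT',
  'SEM_VSE_SECURITY_555001881_181002_1559_37072093.DAT',
  'SEM_VSE_MEDICALCONDEMERGENCIES_555001881_181002_1559_37072093.DAT',
  'SEM_REASONS_555001881_181002_1414_37072093.DAT',
  'SEM_PSE_NPI_SECURITY_555001881_181002_1412_37072093.DAT'
];

var stripped = sample.map(stripNumericParts);
console.log(stripped);
// [ 'SEM_VSE_SKINSHARPS', 'SEM_VSE_SECURITY', 'SEM_VSE_MEDICALCONDEMERGENCIES',
//   'SEM_REASONS', 'SEM_PSE_NPI_SECURITY' ]
```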

Upvotes: 0

jrook

Reputation: 3519

If all the files end in a three-letter extension (such as .DAT) and follow the given pattern, this might also work:

var filename = "SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT"
filename.slice(0,-4).split("_").filter(x => !+x).join("_")

results in:

"SEM_VSE_SKINSHARPS"

This is how it works:

  1. drop the last 4 chars (.DAT)
  2. split by _
  3. filter out the numbers
  4. join what is remaining with another _

You can also create a function out of this solution (or the other ones) and use it to process all the files provided they are in an array:

var fileTrimmer = filename => filename.slice(0,-4).split("_").filter(x => !+x).join("_")
var result = array_of_filenames.map(fileTrimmer)
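
One caveat with the !+x test: +"0" and +"000" both evaluate to 0, so !+x is true and those parts are kept rather than filtered out. Since the question says the names may not be consistent, here is a hedged variant that treats a part as a number only when it consists entirely of digits, and that removes whatever extension is present instead of a fixed 4 characters. The names isAllDigits and strictTrimmer are hypothetical and not part of the answer above.

```javascript
// Hypothetical helpers: a stricter version of the trimmer above.
var isAllDigits = part => /^\d+$/.test(part);

var strictTrimmer = filename =>
  filename
    .replace(/\.[^.]*$/, '')            // drop the last ".ext", whatever its length
    .split('_')
    .filter(part => !isAllDigits(part)) // remove parts made up only of digits
    .join('_');

console.log(strictTrimmer('SEM_REASONS_555001881_181002_1414_37072093.DAT'));
// "SEM_REASONS"

// A made-up name with zero-valued parts, to show the difference:
console.log(strictTrimmer('SEM_X_0_000_1.DAT'));
// "SEM_X"
console.log('SEM_X_0_000_1.DAT'.slice(0, -4).split('_').filter(x => !+x).join('_'));
// "SEM_X_0_000"
```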

Upvotes: 0

elixenide

Reputation: 44841

This is a good use case for regular expressions. For example,

var oldFileName = 'SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT',
    newFileName;
newFileName = oldFileName.replace(/[_0-9]+(?=\.DAT$)/, ''); // SEM_VSE_SKINSHARPS.DAT

This says to replace the longest run of characters from the set _ and 0-9 that is immediately followed by .DAT at the end of the string. The (?=...) part is a lookahead, so the .DAT itself is checked but not replaced.

If you want to strip the .DAT as well, use /[_0-9]+\.DAT$/ as the regular expression instead of the one above.
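
As a rough sketch, this is how the extension-stripping pattern could be applied to all of the file names from the question at once. The dot is escaped here so that it matches only a literal period; the variable names are illustrative, and the sketch assumes every name ends in an upper-case .DAT.

```javascript
var names = [
  'SEM_VSE_SKINSHARPS_555001881_181002_1559_37072093.DAT',
  'SEM_VSE_SECURITY_555001881_181002_1559_37072093.DAT',
  'SEM_VSE_MEDICALCONDEMERGENCIES_555001881_181002_1559_37072093.DAT',
  'SEM_REASONS_555001881_181002_1414_37072093.DAT',
  'SEM_PSE_NPI_SECURITY_555001881_181002_1412_37072093.DAT'
];

// A trailing run of underscores and digits, then a literal ".DAT" at the end.
var trailingNumbers = /[_0-9]+\.DAT$/;

var cleaned = names.map(function (name) {
  return name.replace(trailingNumbers, '');
});

console.log(cleaned);
// [ 'SEM_VSE_SKINSHARPS', 'SEM_VSE_SECURITY', 'SEM_VSE_MEDICALCONDEMERGENCIES',
//   'SEM_REASONS', 'SEM_PSE_NPI_SECURITY' ]
```

A name that does not end in .DAT is returned unchanged, because replace leaves the string alone when the pattern does not match.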

Upvotes: 1
