GhitaB

Reputation: 3437

First letter of each word in a string and punctuation

Input: Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.

Output: A i e, t o s a, v v n.

I need a JavaScript function that does this.

I'm trying something like this:

function short_verse(verse) {
  let result = [];

  verse.split(' ').map(word => word.charAt(0) != '' ? result.push(word.charAt(0)) : '');

  return result.join(" ");
}

let input = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.",
  output = short_verse(input);

console.log(output);

The story: they say you can memorize texts this way. :) So I'm building an application that will include this feature, too.

It should work for non-ascii chars, too. Example:

Input: Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh.

Output: A ț e, t o s a, v v î.

Note: In my case Romanian diacritics would be enough: ăâîșțĂÂÎȘȚ.

Upvotes: 9

Views: 2301

Answers (7)

Grewu

Reputation: 83

I posted a similar question - JavaScript Regex replace words with their first letter except when within parentheses

The best answer for what I was working on was:

/(\w|\([^)]+)\w*/g,'$1'

"Aliquam ipsum ex, (tempus ornare semper ac), varius vitae nibh!".replace(/(\w|\([^)]+)\w*/g,'$1')

"A i e, (tempus ornare semper ac), v v n!"

This may not be what you need for your mnemonic device but it can still be helpful to see options.

I use this for learning lines in screenplays and theatrical scripts. That's why I wanted to keep text in parentheses untouched: those are stage directions. I usually need to do a fair bit of work to clean up the script first, but the time saved learning my lines more than makes up for it.
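
To see why the parenthesized text survives, it can help to wrap the pattern in a small function. This is only a sketch, and the name keepStageDirections is made up for illustration. The first alternative, \w, captures a single word character, while the trailing \w* consumes the rest of the word and is dropped by the replacement. The second alternative, \([^)]+, captures an opening parenthesis and everything up to (but not including) the closing one, so that whole run is put back unchanged.

```javascript
// Hypothetical helper name; the regex is the one quoted in the answer.
function keepStageDirections(line) {
  // Group 1 is either one word character, or "(" plus everything before the next ")".
  // The trailing \w* swallows the rest of a word and is discarded by the "$1" replacement.
  return line.replace(/(\w|\([^)]+)\w*/g, "$1");
}

console.log(
  keepStageDirections("Aliquam ipsum ex, (tempus ornare semper ac), varius vitae nibh!")
);
// "A i e, (tempus ornare semper ac), v v n!"
```

Since \w only covers ASCII word characters, text with diacritics would need a different character class.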

Upvotes: 3

s.kuznetsov

Reputation: 15213

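I used the regular expression /^(.)|[^\s,.!?:@]/g and applied it to each word with map(). It keeps the first character of the word and any of the listed punctuation characters, and removes everything else, so it also works with non-ASCII characters.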

let input = "Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh.";
let output = input.split(/\s+/).map((w) => w.replace(/^(.)|[^\s,.!?:@]/g, "$1")).join(" ");

console.log(output);
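
The way the alternation works may not be obvious, so here is the same idea as a commented function. It is a sketch only, and the name firstCharPerWord is illustrative. At the start of a word, ^(.) matches and "$1" puts the captured character back. Everywhere else only the second alternative can match, group 1 is empty, and the character is deleted. Characters listed inside the negated class are never matched, so they stay.

```javascript
// Illustrative wrapper around the regex from this answer.
function firstCharPerWord(text) {
  return text
    .split(/\s+/)
    // ^(.)           : first character of the word, kept through "$1"
    // [^\s,.!?:@]    : any later character that is not listed punctuation, removed
    .map((word) => word.replace(/^(.)|[^\s,.!?:@]/g, "$1"))
    .join(" ");
}

console.log(firstCharPerWord("Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh."));
// "A ț e, t o s a, v v î."

// Caveat: the first character is kept even when it is punctuation,
// so a word with a leading quote loses its letters.
console.log(firstCharPerWord("„Să"));
// "„"
```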

Upvotes: 2

The fourth bird

Reputation: 163362

If you are using only word characters, you can keep the first character and remove the rest of the word characters.

\B matches at a position that is not a word boundary, and \w+ matches one or more word characters:

const s = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.";
console.log(s.replace(/\B\w+/g, ""));
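
A caveat worth checking before relying on this: in JavaScript, \w and \B are defined in terms of ASCII word characters ([A-Za-z0-9_]). The sketch below (dropRest is an illustrative name) shows that the pattern behaves as intended for the ASCII sentence but leaves fragments behind once a word contains a diacritic, because the diacritic is treated as a non-word character and so creates word boundaries inside the word.

```javascript
// Illustrative helper: same pattern as above, wrapped for reuse.
const dropRest = (text) => text.replace(/\B\w+/g, "");

console.log(dropRest("Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh."));
// "A i e, t o s a, v v n."

// "ă" is not matched by \w, so the word is processed as "temp", "ă", "s".
console.log(dropRest("tempăs"));
// "tăs"

// "ț" is not matched by \w either, so "i" counts as the first word character.
console.log(dropRest("țipsum"));
// "ți"
```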

For the updated question, you can capture any leading characters other than letters or whitespace, followed by a single letter, in group 1. Then match the optional remaining letters, which should be removed, and use capture group 1 in the replacement.

([^\p{L}\s]*\p{L})\p{L}*


[
  "Dumnezeu a zis: „Să fie o întindere între ape, și ea să despartă apele de ape.”",
  "Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh.",
  "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh."
].forEach(s =>
  console.log(s.replace(/([^\p{L}\s]*\p{L})\p{L}*/gu, "$1"))
)
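
One assumption behind this pattern is that every accented letter is a single precomposed code point. If the text might contain decomposed characters (for example s followed by U+0326 COMBINING COMMA BELOW), the combining mark is not in \p{L}, so the word gets split at the mark. A possible guard, sketched here with the hypothetical name shortVerseUnicode, is to normalize to NFC first.

```javascript
// Hypothetical wrapper: normalizes to NFC, then applies the regex from this answer.
function shortVerseUnicode(verse) {
  return verse
    .normalize("NFC") // compose base letter + combining mark into one code point
    .replace(/([^\p{L}\s]*\p{L})\p{L}*/gu, "$1");
}

// Decomposed input: "s" + U+0326 and "a" + U+0306.
console.log(shortVerseUnicode("s\u0326i ea sa\u0306"));
// "ș e s"  (without normalize(), the first word would come back unchanged)

console.log(
  shortVerseUnicode("Dumnezeu a zis: „Să fie o întindere între ape, și ea să despartă apele de ape.”")
);
// "D a z: „S f o î î a, ș e s d a d a.”"
```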

Upvotes: 7

Ovidijus Parsiunas

Reputation: 2732

The following function should work for letters, digits and symbols. The magic is in the regex: [a-zA-Z0-9À-ÿăâîșțĂÂÎȘȚ]+ matches each word made of alphanumeric and Romanian alphabet characters (as requested in the question); \s matches each whitespace character, since we want to preserve the spacing; and finally [^\w\s] matches each non-alphanumeric, non-space character, a.k.a. symbols:

function short_verse(verse) {
  // Split into words, single whitespace characters and single symbols
  const tokens = verse.match(/[a-zA-Z0-9À-ÿăâîșțĂÂÎȘȚ]+|\s|[^\w\s]/g) || [];
  // Whitespace and symbol tokens are one character long, so only words get shortened
  const firstChars = tokens.map((token) => token.charAt(0));
  return firstChars.join('');
}

let input1 = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.";
console.log(short_verse(input1));
let input2 = "Să fie o întindere între ape, și ea să despartă apele de ape."
console.log(short_verse(input2));
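
If hard-coding the alphabet is a concern, the same tokenizing idea can be sketched with Unicode property escapes instead of an explicit character list. This is a variation rather than the code above: the name shortVerseTokens is hypothetical, and it assumes an engine that supports the u flag with \p{L}.

```javascript
// Hypothetical variant of the tokenizing approach using \p{L} instead of a fixed alphabet.
function shortVerseTokens(verse) {
  // Tokens: a run of letters, one whitespace character, or one other character.
  const tokens = verse.match(/\p{L}+|\s|[^\p{L}\s]/gu) ?? [];
  // Spreading iterates by code point, so the first element is the first full character.
  return tokens.map((token) => [...token][0]).join("");
}

console.log(shortVerseTokens("Să fie o întindere între ape, și ea să despartă apele de ape."));
// "S f o î î a, ș e s d a d a."
```

In this variant digits fall into the "other character" branch, so numbers are kept whole rather than shortened.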

Upvotes: 6

AFLAH ALI

Reputation: 451

Try this

function short_verse(verse){
   // Keep the first character of each word, plus its last character when that is not a word character
   return verse.split(' ').map(current => (
      `${current[0]}${current.slice(-1).match(/\W/)?current.slice(-1):''}`
   )).join(' ')
}

You can replace \W with your preferred punctuation characters if needed.

E.g.: .match(/[.!?\-]/)

Upvotes: 1

Tim Biegeleisen

Reputation: 521409

We can use a regex replacement approach here:

var input = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh.";
var output = input.replace(/(\w)\w*/g, "$1");
console.log(output);

Upvotes: 7

vitalragaz

Reputation: 328

This should do the trick. You will probably need to adjust the regex to include special characters, depending on your use case.

const input = "Aliquam ipsum ex, tempus ornare semper ac, varius vitae nibh."

const parsed = input.split(" ").map(w => w[0] + (/^[A-Za-z0-9]*$/.test(w) ? "" : w[w.length - 1])).join(" ");

console.log(parsed);
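
One way the adjustment mentioned above could look for the Romanian input is to test for Unicode letters and digits instead of [A-Za-z0-9]. This is a sketch under that assumption, with the made-up name shortVerseEnds. It also splits on any run of whitespace and skips empty pieces.

```javascript
// Hypothetical adaptation: treats any Unicode letter or digit as part of a word.
function shortVerseEnds(input) {
  return input
    .split(/\s+/)
    .filter((w) => w.length > 0) // ignore empty pieces from leading or trailing spaces
    // If the word is only letters and digits, keep its first character;
    // otherwise also keep its last character (assumed to be punctuation).
    .map((w) => w[0] + (/^[\p{L}\p{N}]*$/u.test(w) ? "" : w[w.length - 1]))
    .join(" ");
}

console.log(shortVerseEnds("Aliqușam țipsum ex, tempăs ornâre semper ac, varius vitae îbh."));
// "A ț e, t o s a, v v î."
```

Like the original, this only looks at the first and last character of each word, so punctuation at the start or in the middle of a word is not handled.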

Upvotes: 3
