Nyxshield

Reputation: 171

Scripting: Replacing lines in a file in a very particular manner. How to do it with PowerShell?


I am new to regular expressions and PowerShell scripting.

I have a lot of JavaScript files in my directory. Most of them contain the statement "window.open(...)" (possibly a hundred times).

What I'd like to do is to replace all lines containing one of the following:

  1. window.open('...', ...)
  2. window.open('...')
  3. window.open("...", ...)
  4. window.open("...")


With the following, respectively:

  1. window.open(encodeURI('...'), ...)
  2. window.open(encodeURI('...'))
  3. window.open(encodeURI("..."), ...)
  4. window.open(encodeURI("..."))

In short, I would like to wrap the first argument of every window.open(...) call in encodeURI(...), if the call has one (nothing needs to change for window.open() with no arguments at all).

I know how to find all the JavaScript files in a directory and process each one in turn. However, I am having trouble performing the replacement itself. I thought regular expressions might work; any ideas are appreciated.

Thank you,
Regards.

Upvotes: 2

Views: 60

Answers (3)

Nyxshield

Reputation: 171

Thanks to Mathias for pointing out that I should use a parser in this case.

I will leave my answer as a reference for anyone trying to find the arguments of function calls in JavaScript. I used the Esprima and Escodegen libraries for Node.js to parse my JS files and regenerate the code. This way I can easily find the first argument of every call to window.open(). In my case, I am replacing the first argument, which is the URL, with an encoded version of it using the function encodeURI():

// This script finds every call to "window.open" in a JavaScript file and wraps
// the first argument of each call in encodeURI(...).

// Returns true when the node is a call of the form window.open(...)
function isWindowOpenCall(node) {
    if (!node) {
        return false;
    }
    return (node.type === 'CallExpression') &&
        (node.callee.type === 'MemberExpression') &&
        (node.callee.object.type === 'Identifier') &&
        (node.callee.object.name === 'window') &&
        (node.callee.property.type === 'Identifier') &&
        (node.callee.property.name === 'open');
}

// For JS Code Parsing
var esprima = require("esprima");

// For JS Code Regeneration
var escodegen = require("escodegen");

// For Reading File System
var fs = require("fs");

// For Reading the Lines
var readline = require('readline');

// Read File Name from Command Line
const fileNames = process.argv.splice(2);

// Read the file
var file = fileNames[0];
var sourceCode = fs.readFileSync(file, "utf-8");

// Parse the File
// The delegate (third argument) is called for every node as it is created
var parsedFile = esprima.parseScript(sourceCode, {}, function (node) {
    // Skip window.open() calls that have no arguments at all
    if (isWindowOpenCall(node) && node.arguments.length > 0) {
        // Build the encodeURI(...) call as an AST node instead of re-parsing a
        // string. esprima.parse() returns a whole Program node, which escodegen
        // prints with a trailing ";" and so produces ");," or ";);" inside the
        // argument list.
        node.arguments[0] = {
            type: 'CallExpression',
            callee: { type: 'Identifier', name: 'encodeURI' },
            arguments: [node.arguments[0]]
        };
    }
});

// Regenerate the source code from the modified tree
var result = escodegen.generate(parsedFile);
console.log(result);
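
Wrapping the first argument amounts to replacing one node in the tree with a CallExpression whose callee is encodeURI. Here is a minimal sketch of that shape, using a hand-written ESTree-style node so that it runs without Esprima. The helper name wrapInEncodeURI and the sample URL are made up for illustration and are not part of either library:

```javascript
// Hand-written ESTree-style node for: window.open('page.html?q=a b', '_blank')
// (illustrative data only; Esprima would normally produce this for you)
const call = {
    type: 'CallExpression',
    callee: {
        type: 'MemberExpression',
        computed: false,
        object: { type: 'Identifier', name: 'window' },
        property: { type: 'Identifier', name: 'open' }
    },
    arguments: [
        { type: 'Literal', value: 'page.html?q=a b', raw: "'page.html?q=a b'" },
        { type: 'Literal', value: '_blank', raw: "'_blank'" }
    ]
};

// Hypothetical helper: wrap any expression node in an encodeURI(...) call node
function wrapInEncodeURI(argNode) {
    return {
        type: 'CallExpression',
        callee: { type: 'Identifier', name: 'encodeURI' },
        arguments: [argNode]
    };
}

// Only touch calls that actually have a first argument
if (call.arguments.length > 0) {
    call.arguments[0] = wrapInEncodeURI(call.arguments[0]);
}

console.log(JSON.stringify(call.arguments[0], null, 2));
```

The original first argument ends up nested inside the new node, and the second argument is left where it was.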

Upvotes: 0

Mathias R. Jessen

Reputation: 174485

You could look for and capture any ' or " preceded by window.open(, then capture everything up to the next matching quote, using a back-reference to the first capture group:

<#
$lines = @'
window.open('...', ...)
window.open('...')
window.open("...", ...)
window.open("...")
'@ -split '\r?\n'
#> 

$lines -replace '(?<=window\.open\()([''"])(.*?\1)','encodeURI($1$2)'

('' is not a typo, it's the PowerShell escape sequence for a literal ' inside a single-quoted string)

Regex pattern explanation:

(?<=               # open positive look-behind
  window\.open\(   # literal string `window.open(`
)                  # close positive look-behind
(                  # open capture group 1
  ['"]             # one of either ' or "
)                  # close capture group 1
(                  # open capture group 2
  .*?              # non-greedy match of 0 or more of any character
  \1               # back-reference to capture group one (ie. either ' or ")
)                  # close capture group 2
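
Since the files being edited are JavaScript, it may help to see the same pattern written as a JavaScript regex. This is only a sketch: wrapFirstStringArg is a hypothetical helper name, and it assumes a runtime with look-behind support (recent Node.js versions have it). It only handles first arguments that are plain quoted strings with no escaped quotes inside.

```javascript
// Sketch only: the same pattern as the PowerShell -replace above, in JavaScript syntax.
// wrapFirstStringArg is a hypothetical helper name, not part of the answer.
function wrapFirstStringArg(line) {
    // (?<=window\.open\()  look-behind for the literal "window.open("
    // (['"])               group 1: the opening quote
    // (.*?\1)              group 2: everything up to the matching closing quote
    return line.replace(/(?<=window\.open\()(['"])(.*?\1)/g, 'encodeURI($1$2)');
}

// The four forms from the question, plus a call with no arguments
const samples = [
    "window.open('...', ...)",
    "window.open('...')",
    'window.open("...", ...)',
    'window.open("...")',
    "window.open()"
];

for (const line of samples) {
    console.log(wrapFirstStringArg(line));
}
```

The four forms from the question come out with encodeURI(...) around the first argument only, and window.open() is left unchanged because there is no quote for group 1 to match.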

Upvotes: 1

MDR

Reputation: 2670

Does it have to be PowerShell? If it were 10-30 files (a good handful, but not a massive amount), I might open them all in Notepad++ or Visual Studio Code and do a regex find-and-replace across all open files, searching for:

(window\.open\()(.*?),?(.*?)\)

Replace with:

$1encodeURI($3))

Demo: https://regex101.com/r/TVxRyA/1
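
It is worth trying the pattern on a couple of sample lines before clicking replace-all. Run as a JavaScript regex, group 2 matches lazily and stays empty, so group 3 picks up the whole argument list and a two-argument call ends up with both arguments inside encodeURI(...). The sketch below shows that, next to a variant that stops at the closing quote of the first string. The variant, the helper names and the sample URLs are my own assumptions and not part of the answer, and I have only checked them with JavaScript's regex engine; the editors' engines may differ.

```javascript
// Sketch only: the helper names and sample URLs are made up for illustration.

// The pattern and replacement exactly as given above.
function applyOriginal(line) {
    return line.replace(/(window\.open\()(.*?),?(.*?)\)/g, '$1encodeURI($3))');
}

// Assumed variant (not from the answer): stop at the closing quote of the
// first string argument so later arguments stay outside encodeURI(...).
function applyQuoteBounded(line) {
    return line.replace(/(window\.open\()((['"]).*?\3)/g, '$1encodeURI($2)');
}

const samples = [
    "window.open('a.html')",
    'window.open("a.html", "_blank")'
];

for (const line of samples) {
    console.log('original pattern:', applyOriginal(line));
    console.log('quote-bounded:   ', applyQuoteBounded(line));
}
```

For the single-argument call both versions give the same result. For the two-argument call the original pattern yields window.open(encodeURI("a.html", "_blank")), while the quote-bounded variant yields window.open(encodeURI("a.html"), "_blank").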

Example using the Visual Studio Code editor: [screenshots of the starting files, the Replace All step, and the files afterwards]

Upvotes: 0
