Reputation: 3
I want to generate a random data file in Windows using either CMD or PowerShell, with the generated data consisting of many lines of random text. I've managed to do this in PowerShell with the following command, but it took around 1 minute to generate 1MB of data, which is far too slow for generating GBs:
1..100000 | % { [System.Web.Security.Membership]::GeneratePassword(70, 3) >> C:/dummy.txt }
The file output should look like this:
YyS@ZRU98udC3q#R@5o7AR$*Bh44v22J!ekKSpIAgLQyp^pbBx
s8Wm589aYYH39@Arb2^ZRMPjx2UaEwHYkMmhgFaU$QyAU@@@WU
yB^!qo6e4x*eFvx%ZY7738&&FkhHXU24OCJCxfyQ7a%peo!$ap
...........
...........
$GVhMrkZfJbIkgAgri0w9lFVt6a^vXh6ev&jwPHGfoE!pVW85r
Does anyone have any suggestions? Preferably I can do this without the need for external tools, as the script to create this data will be run automatically during startup of the machine.
Upvotes: 0
Views: 2747
Reputation: 174515
Your current approach is slow because of the overhead of opening, writing to and closing the same file 100K times. Move the file redirection operator outside the pipeline expression so that you only incur that overhead once:
1..100000 | % { [System.Web.Security.Membership]::GeneratePassword(70, 3) } >> .\dummy.txt
To illustrate the difference, here are measurements for 1000 passwords, comparing your original with the version that moves the output redirection outside:
PS ~>
>> Measure-Command {
>> 1..1000 | % { [System.Web.Security.Membership]::GeneratePassword(70, 3) >> .\dummy.txt }
>> } |Select TotalMilliseconds
TotalMilliseconds
-----------------
8881.9736
PS ~>
>> Measure-Command {
>> 1..1000 | % { [System.Web.Security.Membership]::GeneratePassword(70, 3) } >> .\dummy.txt
>> } |Select TotalMilliseconds
TotalMilliseconds
-----------------
72.7485
That is already more than 100 times faster at only 1000 lines.
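If you want the "open the file once" idea to be explicit, here is a minimal sketch that holds a single System.IO.StreamWriter open for the whole loop. The New-RandomLine function, the alphabet and the output path are illustrative assumptions rather than part of the original command; the sketch uses System.Random instead of GeneratePassword so that it does not depend on System.Web being loaded.

```powershell
# Alphabet and generator are assumptions for illustration only.
$chars  = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789'.ToCharArray()
$random = [System.Random]::new()

# Hypothetical helper: build one line of random characters.
function New-RandomLine([int]$Length) {
    $buffer = [char[]]::new($Length)
    for ($i = 0; $i -lt $Length; $i++) {
        $buffer[$i] = $chars[$random.Next($chars.Length)]
    }
    -join $buffer
}

# .NET does not follow PowerShell's current location, so pass a full path.
$path   = Join-Path $PWD.Path 'dummy.txt'
$writer = [System.IO.StreamWriter]::new($path, $false)   # $false = overwrite, not append
try {
    foreach ($n in 1..1000) {
        $writer.WriteLine((New-RandomLine -Length 70))
    }
}
finally {
    $writer.Dispose()   # flushes and closes the file exactly once
}
```

The file is opened and closed once regardless of how many lines you write, which is the same effect as moving the redirection outside the pipeline.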
Upvotes: 2
Reputation: 887
This should get you underway. The main point is that it uses C# to generate the random text, and you can control which characters are used. It uses LINQ, so it could be made to run much faster still, but you should already see a massive performance boost compared to random generation in PowerShell. It will generate reasonably sized text files; if you want GB-sized data, you'll want to look into other methods.
$code = @"
using System;
using System.Linq;
namespace HelloWorld
{
    public class Program$id
    {
        const string chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
        private static Random random = new Random();
        public static string RandomString(int length) // infinitely faster than Get-Random in Powershell
        {
            return new string(Enumerable.Repeat(chars, length)
                .Select(s => s[random.Next(s.Length)]).ToArray());
        }
    }
}
"@
Add-Type -TypeDefinition $code -Language CSharp
$lines = 100000
$length = 70
$outfile = "e:\temp\dummy.txt" # tweak as needed
$sb = [System.Text.StringBuilder]::new() # spin up a stringbuilder to hold characters
# create a string buffer with $lines of $length characters
1..$lines | % {
    [void]$sb.AppendLine((Invoke-Expression "[HelloWorld.Program$id]::RandomString($length)"))
}
$sb.ToString() | Out-File $outfile # write out results
cat $outfile -Tail 10 # show last 10 lines in output file
This takes 2-3 seconds on my system (including the cat).
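For larger files, one possible direction is to do the writing inside the C# helper as well, so that lines are streamed to disk instead of being collected in a StringBuilder first. This is only a rough sketch under that assumption: the RandomFileWriter class, its WriteLines method and the output path are hypothetical names I made up for illustration, and I have not measured it at GB scale.

```powershell
$source = @'
using System;
using System.IO;
using System.Text;

// Hypothetical helper; class and method names are illustrative only.
public static class RandomFileWriter
{
    const string Chars = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    public static void WriteLines(string path, long lines, int length)
    {
        Random random = new Random();
        char[] buffer = new char[length];   // reused for every line

        // One writer for the whole run; each line goes to the stream as it is produced.
        using (StreamWriter writer = new StreamWriter(path, false, Encoding.ASCII))
        {
            for (long n = 0; n < lines; n++)
            {
                for (int i = 0; i < length; i++)
                {
                    buffer[i] = Chars[random.Next(Chars.Length)];
                }
                writer.WriteLine(buffer);
            }
        }
    }
}
'@
Add-Type -TypeDefinition $source -Language CSharp

$outfile = Join-Path $PWD.Path 'dummy.txt'   # tweak as needed; must be a full path
[RandomFileWriter]::WriteLines($outfile, 100000, 70)
```

Memory use stays flat because only one line buffer exists at a time, so the line count can be raised without the StringBuilder growing to the size of the whole file.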
Upvotes: 1