Jeff

Reputation:

Generating Random Files in Windows

Does anyone have a way to generate files of random data in Windows? I would like to generate 50,000 small (2K) files as an example.

Upvotes: 46

Views: 96819

Answers (17)

Jonathan DeMarks

Reputation: 2441

This one is fun because it generates nested folders as well:

$rng = [System.Security.Cryptography.RNGCryptoServiceProvider]::new()
$total = 100
0..$total | % {
    # Split the number into its digits to build a nested path, e.g. 42 -> 4/2.txt
    $p = $(Join-Path $pwd ("$_".ToCharArray() -join '/')) + ".txt"
    $d = [System.IO.Path]::GetDirectoryName($p)
    New-Item -ItemType Directory -Path $d -ErrorAction SilentlyContinue | Out-Null
    $size = 1023 * (Get-Random -Minimum 1 -Maximum 512)
    $contents = [Byte[]]::new($size)
    $rng.GetBytes($contents)
    # Write the raw bytes; by default Set-Content would write each byte as a line of text
    [System.IO.File]::WriteAllBytes($p, $contents)
    if (($_%10) -eq 0) { Write-Host -NoNewline "$(($_/$total)*100)% " }
}; echo ""
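
The nesting comes from splitting each number into its digits and joining them with a path separator. A minimal sketch of just that mapping follows; the helper name Get-NestedName is made up for illustration and is not part of the script above:

```powershell
# Hypothetical helper: returns the relative path the loop above derives from a number.
function Get-NestedName {
    param([int]$Number)
    # "123" -> '1','2','3' -> "1/2/3", then the extension is appended.
    ("$Number".ToCharArray() -join '/') + '.txt'
}

$names = 7, 42, 100 | ForEach-Object { Get-NestedName $_ }
$names   # 7.txt, 4/2.txt, 1/0/0.txt
```

Single-digit numbers stay in the starting folder, while longer numbers land one folder deeper per extra digit.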

Upvotes: 0

robotik

Reputation: 2007

For those who found this while looking for a way to overwrite the unused space on a drive with random data, the following command in cmd will do it:

cipher /w:X:\

Substitute X with your drive letter.

It first writes 0x00s, then 0xFFs, then random data to the unallocated space (leaving existing files alone). It takes some time.

Upvotes: 1

Daniel J.

Reputation: 366

Well, a bit late, but this is my contribution to a problem that will always be relevant.

The way it works is explained by the code itself:

There are two loops. The outer one, running on the $j counter, creates folders in the root of a given $currdrive drive letter. The inner loop, running on the $i counter, generates files of $size bytes (default 1 GB) in each folder.

$currdrive="G"
$size=1048576*1024          # 1 GB per file
$rng = new-object Random    # create the generator once and reuse it for every file
for ($j=0;$j -lt 5;$j++){
    new-item "$($currdrive):\random$($j)" -itemType directory
    for ($i=0;$i -lt 100;$i++){
        echo "Creating file $i..."
        $out = new-object byte[] $size
        $rng.NextBytes($out)
        [IO.File]::WriteAllBytes("$($currdrive):\random$($j)\random-file$($i).bin", $out)
    }
}

Upvotes: 0

Yoshi Walsh

Reputation: 2077

None of the answers here were cutting it for me, so here's a script which takes advantage of the Cryptography library to generate lots of random files.

This will generate files very quickly until your system's entropy is exhausted (on my PC this was about 4,000 files). After this it (and any other applications on your system that need cryptographic random numbers) will run very slowly. (In Linux terms, consider this script to use /dev/random instead of /dev/urandom)

$directory = (Get-Location).Path;
# Create the generator once and reuse it for every file
$rng = [System.Security.Cryptography.RNGCryptoServiceProvider]::new();
0..10000 | ForEach-Object {
    $size = 1023 * (Get-Random -Minimum 10 -Maximum 1536);
    $contents = [Byte[]]::new($size);

    $rng.GetBytes($contents);

    $filename = "$directory\random$($_.ToString().PadLeft(5, '0')).txt"
    Write-Host $filename
    [System.IO.File]::WriteAllBytes($filename, $contents)
}

If you'd rather not deplete your system's entropy, replace the two $rng lines with these:

    $rng = [System.Random]::new();
    $rng.NextBytes($contents);

This will run much slower, but for large quantities of files (or larger files) it should be more reliable.
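
RNGCryptoServiceProvider is marked obsolete in recent .NET versions, so here is a hedged sketch of the same idea using RandomNumberGenerator.Create() instead. It assumes a scratch folder under the temp directory (the folder name is made up), writes only ten 2 KB files, and disposes of the generator when done:

```powershell
# Assumption: a scratch folder under the temp directory; the name is illustrative only.
$directory = Join-Path ([System.IO.Path]::GetTempPath()) 'random-files-demo'
New-Item -ItemType Directory -Path $directory -Force | Out-Null

# Create the generator once, reuse it for every file, and dispose of it afterwards.
$rng = [System.Security.Cryptography.RandomNumberGenerator]::Create()
try {
    0..9 | ForEach-Object {
        $contents = [byte[]]::new(2048)
        $rng.GetBytes($contents)
        $filename = Join-Path $directory ('random{0:D5}.bin' -f $_)
        [System.IO.File]::WriteAllBytes($filename, $contents)
    }
}
finally {
    $rng.Dispose()
}
```

Raising the range to 0..49999 would cover the 50,000 files from the question.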

Upvotes: 3

psomatic

Reputation: 21

So, I decided to add an accurate answer this time.

The language is PowerShell. Assumptions: filenames will be sequential, not random; file contents are to be cryptographically secure and unique; files go in C:\temp\.

#create a fixed size byte array for later use.  make it the required file size.
$bytearray = New-Object byte[] 2048

#create and start a stopwatch object to measure how long it all takes.
$stopwatch = [Diagnostics.Stopwatch]::StartNew()

#create a CSRNG object
$RNGObject = New-Object Security.Cryptography.RNGCryptoServiceProvider

# set up a loop to run 50000 times
0..49999 | Foreach-Object {

    # create a file stream handle with a name format 'filennnnn'
    $stream = New-Object System.IO.FileStream("c:\temp\file$("{0:D5}" -f $_)"), Create

    # and a stream writer handle
    $writer = New-Object System.IO.BinaryWriter($stream)

    # Fill our array from the CSRNG
    $RNGObject.GetNonZeroBytes($bytearray)

    # Append to the current file
    $writer.write($bytearray)

    # Close the stream
    $stream.close()

}

# how long did it all take?
$stopwatch.stop()
$stopwatch

And the output:

IsRunning Elapsed          ElapsedMilliseconds ElapsedTicks
--------- -------          ------------------- ------------
False 00:07:53.7685350              473768   1434270755

Mmm, it feels like it took a long time, but

$stopwatch.ElapsedMilliseconds/50000
9.47536

So that's about 10 ms per file, and that's on an old SATA disk.

Upvotes: 2

psomatic

Reputation: 21

edit

I re-read the question. The following will not answer it as asked (50,000 2K files), but it will create arbitrarily sized files of cryptographically secure random binary data.

Please comment if you would like to see an example that answers the question exactly.

/edit

The following can generate a 1GB file of cryptographically secure random data using objects available in powershell:

#set the size, 1024^3 = 1GB
$size=1024*1024*1024

#as we will build the file 1k at a time, divide required size by 1k
$size/=1024

#now create the byte array of a fixed size
$bytearray=new-object byte[] 1024

#and create a CSRNG object
$RNGObject=new-object Security.Cryptography.RNGCryptoServiceProvider

#Create a file for streaming. PS will overwrite if it exists.
#it's probably bad form to hard-code the filename; fixing that is an exercise for you
$stream = New-Object System.IO.FileStream("d:\file1.bin"), Create

#open the stream and grab the handle.
$writer = New-Object System.IO.BinaryWriter($stream)

#create a timer object so we can measure the runtime.  start it.
$stopwatch=[diagnostics.stopwatch]::startnew()

#now, iterate through the required file size 1k at a time
0..($size-1) | Foreach-Object{
    #filling our byte array with random non zero bytes
    $RNGObject.GetNonZeroBytes($bytearray)
    #and then append them to the file stream.
    $writer.write($bytearray)
}

#captain obvious
$stopwatch.stop()
$stream.close()

#and display the stopwatch data
$stopwatch

IsRunning Elapsed          ElapsedMilliseconds ElapsedTicks
--------- -------          ------------------- ------------
False 00:00:23.2019782               23201     70240880

To allow zero bytes in the random data, just replace

$RNGObject.GetNonZeroBytes($bytearray)

with

$RNGObject.GetBytes($bytearray)
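
The difference between the two calls can be checked by counting zero bytes. This is a small sketch, assuming RandomNumberGenerator.Create() as the generator rather than the RNGCryptoServiceProvider object above; the variable names are illustrative:

```powershell
$rng = [System.Security.Cryptography.RandomNumberGenerator]::Create()
$nonZero = [byte[]]::new(65536)
$any     = [byte[]]::new(65536)
$rng.GetNonZeroBytes($nonZero)   # never produces the value 0
$rng.GetBytes($any)              # all 256 byte values are possible
$rng.Dispose()

# Comparing an array with -eq returns the matching elements, so Count is the number of zeros.
$zerosInNonZero = @($nonZero -eq 0).Count
$zerosInAny     = @($any -eq 0).Count
"Zero bytes: GetNonZeroBytes=$zerosInNonZero GetBytes=$zerosInAny"
```

With 65,536 bytes, GetBytes should yield roughly 256 zeros, while GetNonZeroBytes yields none, so the latter is not uniform over all byte values.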

A quick intro to DuckDuckGo: if you go to duckduckgo.com and search with

!msdn Security.Cryptography.RNGCryptoServiceProvider

you will be given extremely focused results direct from the Microsoft Developer Network, allowing you to see the Crypto classes, methods and properties available.

Upvotes: 0

gwiazdorrr

Reputation: 6329

One-liner in Powershell:

$out = new-object byte[] 1048576; (new-object Random).NextBytes($out); [IO.File]::WriteAllBytes('d:\file.bin', $out)

This is lightning fast compared to @user188737's solution.
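
The one-liner can be stretched to many small files. The function below is only a sketch: New-RandomFiles, its parameters and its defaults are invented for illustration. It assumes an absolute target path, because [IO.File] resolves relative paths against the process working directory rather than the PowerShell location. A fixed seed makes the output reproducible, and System.Random is not cryptographically secure.

```powershell
# Hypothetical function; the name, parameters and defaults are illustrative only.
function New-RandomFiles {
    param(
        [string]$Directory,     # should be an absolute path
        [int]$Count = 50000,
        [int]$Size = 2048,
        [int]$Seed = 12345
    )
    New-Item -ItemType Directory -Path $Directory -Force | Out-Null
    # One seeded generator for the whole run: fast and reproducible, not secure.
    $random = [System.Random]::new($Seed)
    $buffer = [byte[]]::new($Size)
    for ($i = 0; $i -lt $Count; $i++) {
        $random.NextBytes($buffer)
        $path = Join-Path $Directory ('file{0:D5}.bin' -f $i)
        [System.IO.File]::WriteAllBytes($path, $buffer)
    }
}

# Small demonstration run: 20 files of 2 KB in a scratch folder under the temp directory.
$target = Join-Path ([System.IO.Path]::GetTempPath()) 'random-files-seeded'
New-RandomFiles -Directory $target -Count 20 -Size 2048
```

Calling it with the default -Count would produce the 50,000 files asked for in the question.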

Upvotes: 27

Crellin

Reputation: 1

You could use VBA in Excel if you have limited permissions on the machine you are on. This creates the required number of txt files, each filled with random numbers. It is probably not the quickest way of going about this, though.

Sub rndcreate()

Application.ScreenUpdating = False
Application.DisplayAlerts = False

Dim sbook As Workbook
Dim i As Long, p As Long
'Give every variable its own type; Long is needed because 99999 exceeds the Integer range
Dim upperbound As Long, lowerbound As Long, totalentries As Long, totalfiles As Long
Dim x As Long
Dim folder As String, file As String

'Set output location

folder = "C:\test\"

'Number of files created and entries in files as below

totalfiles = 1
totalentries = 150
upperbound = 99999
lowerbound = 1

'Seed the random number generator from the system timer

Randomize

For p = 1 To totalfiles

'Add new workbook to populate with data

Set sbook = Workbooks.Add

'Set file name

file = "randomdatafile" & p

For i = 1 To totalentries

    'Randomly created integers between your two bounds

    x = Int((upperbound - lowerbound + 1) * Rnd + lowerbound)

    sbook.Worksheets(1).Range("A" & i) = x

Next

sbook.SaveAs Filename:=folder & file & ".txt", FileFormat:=xlTextWindows
sbook.Close

Next

'Restore the application settings changed above

Application.DisplayAlerts = True
Application.ScreenUpdating = True

End Sub

Upvotes: 0

Marc Kellerman

Reputation: 466

Instead of using Get-Random to generate the text, as suggested by user188737 and mguassa, I improved the speed by using GUIDs.

Function New-RandomFile {
    Param(
        $Path = '.', 
        $FileSize = 1kb, 
        $FileName = [guid]::NewGuid().Guid + '.txt'
        ) 
    (1..($FileSize/128)).foreach({-join ([guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid -Replace "-").SubString(1, 126) }) | set-content "$Path\$FileName"
}

This took 491 milliseconds to generate a 1 MB file. Running:

New-RandomFile -FileSize 1mb

UPDATE:

I've updated my function to use a ScriptBlock, so you can replace the 'NewGuid()' method with anything you want.

In this scenario, I make 1kb chunks, since I know I'm never creating smaller files. This improved the speed of my function drastically!

Set-Content forces a newline at the end of each line, which is why the first version had to drop 2 characters from every line it wrote. I've replaced it with [io.file]::WriteAllText() instead.

Function New-RandomFile_1kChunks {
    Param(
        $Path = (Resolve-Path '.').Path, 
        $FileSize = 1kb, 
        $FileName = [guid]::NewGuid().Guid + '.txt'
        ) 

    $Chunk = { [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid -Replace "-" }

    $Chunks = [math]::Ceiling($FileSize/1kb)

    [io.file]::WriteAllText("$Path\$FileName","$(-Join (1..($Chunks)).foreach({ $Chunk.Invoke() }))")

    Write-Warning "New-RandomFile: $Path\$FileName"

}
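
The trailing newline that Set-Content adds can be checked directly. This is a small sketch, assuming a scratch folder under the temp directory; the folder and file names are made up for illustration:

```powershell
$dir = Join-Path ([System.IO.Path]::GetTempPath()) 'newline-demo'
New-Item -ItemType Directory -Path $dir -Force | Out-Null
$text = 'a' * 126   # 126 ASCII characters

$viaSetContent = Join-Path $dir 'set-content.txt'
$viaWriteAll   = Join-Path $dir 'write-all-text.txt'

Set-Content -Path $viaSetContent -Value $text         # appends a line terminator
[System.IO.File]::WriteAllText($viaWriteAll, $text)   # writes the string as-is

$sizeSetContent = (Get-Item $viaSetContent).Length    # 126 plus the terminator (2 bytes on Windows)
$sizeWriteAll   = (Get-Item $viaWriteAll).Length      # 126
"Set-Content: $sizeSetContent bytes, WriteAllText: $sizeWriteAll bytes"
```

That extra terminator is the reason the first version trimmed each 128-character line to 126 characters.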

If you don't care whether every chunk is random, you can simply Invoke() the generation of the 1 KB chunk once. This improves the speed drastically, but the file is then one random chunk repeated rather than random throughout.

Function New-RandomFile_Fast {
    Param(
        $Path = (Resolve-Path '.').Path, 
        $FileSize = 1kb, 
        $FileName = [guid]::NewGuid().Guid + '.txt'
        ) 

    $Chunk = { [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid +
               [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid + [guid]::NewGuid().Guid -Replace "-" }
    $Chunks = [math]::Ceiling($FileSize/1kb)
    $ChunkString = $Chunk.Invoke()

    [io.file]::WriteAllText("$Path\$FileName","$(-Join (1..($Chunks)).foreach({ $ChunkString }))")

    Write-Warning "New-RandomFile: $Path\$FileName"

}

Measure-Command results for generating a 10 MB file with each version:

Executing New-RandomFile: 35.7688241 seconds.

Executing New-RandomFile_1kChunks: 25.1463777 seconds.

Executing New-RandomFile_Fast: 1.1626236 seconds.

Upvotes: 5

Rubenisme

Reputation: 816

Yes, fsutil is great, but doesn't generate random data, just ASCII nulls.

I don't remember where I found this, but searching on Google these days I can still find it at: http://www.private-files.com/other/random.c.txt

I don't know how old this program is, but it is at least as old as your question, probably somewhat older.

Anyway, here's a program in C which creates files with a chi-squared test result of 0 (every byte value appears equally often, because each 256-byte block contains all 256 values exactly once):

// ------------------------------------------------------------     
// Name: random.c  (program to create random files)
//     
// This "no-frills" program creates files with the following 
// characteristics: 
//
//    (1) Byte sequences are random (no predictability);
//    (2) Files produced are binary files; 
//    (3) File sizes are multiples of 256; 
//    (4) Files will have a chi-squared test result of 0
//        (see analyze.exe by Wenger for explanation)
//  
//              Programmer:  Scott Wenger
//                           Box 802
//                           Stevens Point, WI 54481
//                           [email protected]
//
//       Note:  part of this code is from Knuth Volume II
// 
//  Enhancements and modifications of this program are left 
//  to the imagination and creativity of the programmer.
//  Check your compiler for required header files.  You may 
//  need to include the iostream header.
//
//  Random files are of potential use to cryptographers
//  for the purpose of encryption.  
//  
//  To analyze files produced by this program, see 
//  the analyze.exe program by Scott Wenger (found at
//  http://www.coredcs.com/sware.html)
// ------------------------------------------------------------


// This program works in the following way:
// The time is used to seed the random number generator.
// Using Knuth's algorithm, random numbers are generated
// in the range of 0 to 255 (corresponding to 256 ASCII chars.)
// When random numbers are generated they are marked as used and 
// are not re-used until all 256 ASCII values appear.  Characters 
// are written to disk and the process continues until the
// desired file size is reached.  Output is a random binary file
// called random.bin (placed in the current working directory)
// The controlled filesize along with the placeholder feature 
// of this code forces a very high degree of randomness in 
// the output file. 

#include <time.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void init_mm();
void clear_array(); 
int  number_range(int minval, int maxval);
int  number_mm();
int  mainprogram();  // declared before use so that main() compiles under C99 and later

static int rgiState[2 + 55]; 
int place_holder[256];          // to keep track of numbers already generated

int main()
{
  mainprogram();
  return 0;
}

int mainprogram()
{
  int ch; 
  int c_used = 0;  // counter of chars in placeholder
  int done = 0; 
  int random;

  char buffer[2];

  long x;
  long byte_size = 0L;
  FILE *fp;

  clear_array();
  init_mm();  // seed random number generator

  // create a random file of length specified by user
  printf("\nrandom.exe originally by Scott Wenger");
  printf("\nThis program creates a random binary file.\n");
  printf("\nPlease specify length of random file to create (in megabytes): ");  

  scanf("%ld", &byte_size);

  while (byte_size > 1000 || byte_size <= 0 )
  {
    printf("\nWill not create files larger than a gigabyte! ");
    printf("\nPlease specify length of random file to create (in megabytes): ");
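    // flushall() is non-standard; discard the rest of the input line instead
    while ((ch = getchar()) != '\n' && ch != EOF)
      ;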
    scanf("%ld", &byte_size);
  }

  byte_size = byte_size * 1024 * 1024;

  if ( (fp = fopen("random.bin", "wb"))  == NULL) {
    fprintf(stderr, "\nOutput file (random.bin) could not be created.");      
    fflush(stdout);
    exit(1);
  }

  for (x = 0L; x < byte_size; x++) {

    if (c_used == 256) {
      clear_array();
      c_used = 0;
    }

    random = number_range(0, 255);    // use all ASCII values

    if ( *(place_holder + random) ) {  // already used, find another
      done = 0;
      while (!done) {
        random = number_range(0, 255);
        if ( *(place_holder + random) == 0) {
          *(place_holder + random) = 1;
          done = 1;
        }
      }         
    }
    else *(place_holder + random) = 1;  // use it and mark as used 

    c_used++;   // found next character so increment counter

    sprintf(buffer, "%c", random);  // convert ASCII value to char 
    ch = buffer[0];
    fputc(ch, fp); // write to file
  }

  fclose(fp);

  printf("\nDone. File \"random.bin\" was created (size: %ld bytes)", byte_size);
  printf("\nOutput file is in the current working directory\n");
  return(0);
}

// ---------------------------------------------------------------------------------

void clear_array()
{
  register int x;
  for (x = 0; x < 256; x++) 
    *(place_holder + x) = 0;
}

// ---------------------------------------------------------------------------------

int number_mm()
{
    int *piState;
    int iState1;
    int iState2;
    int iRand;

    piState     = &rgiState[2];
    iState1     = piState[-2];
    iState2     = piState[-1];
    iRand       = ( piState[iState1] + piState[iState2] )
                & ( ( 1 << 30 ) - 1 );
    piState[iState1]    = iRand;

    if ( ++iState1 == 55 )  iState1 = 0;
    if ( ++iState2 == 55 )  iState2 = 0;

    piState[-2]     = iState1;
    piState[-1]     = iState2;

    return(iRand >> 6);
}

// ---------------------------------------------------------------------------------

//  Generate a random number.

int number_range( int minval, int maxval )
{
  int power, number;

  if ( ( maxval = maxval - minval + 1 ) <= 1 ) return (minval);

  for ( power = 2; power < maxval; power <<= 1 )
    ;
  while ( ( number = number_mm( ) & ( power - 1 ) ) >= maxval )
    ;
  return(minval + number);
}

// ---------------------------------------------------------------------------------

// Mitchell-Moore algorithm from Knuth Volume II. 

void init_mm( )
{
  int *piState;
  int iState;

  piState = &rgiState[2];
  piState[-2]   = 55 - 55;
  piState[-1]   = 55 - 24;
  piState[0]    = ( (int) time( NULL ) ) & ( ( 1 << 30 ) - 1 );
  piState[1]    = 1;

  for ( iState = 2; iState < 55; iState++ ) 
  {
    piState[iState] = ( piState[iState-1] + piState[iState-2] )
                      &  ( ( 1 << 30 ) - 1 );
  }
}

// -------------------- End -------------------------------------------------------
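
To see what a chi-squared result of 0 means, the statistic can be computed over byte frequencies. The helper below is a hypothetical sketch, not a port of analyze.exe: it compares the observed count of each byte value with the count a perfectly uniform file of the same length would have.

```powershell
# Hypothetical helper: chi-squared statistic of byte frequencies against a uniform distribution.
# Assumes a non-empty array.
function Get-ByteChiSquared {
    param([byte[]]$Bytes)
    $counts = [long[]]::new(256)
    foreach ($b in $Bytes) { $counts[$b]++ }
    $expected = $Bytes.Length / 256.0
    $chi = 0.0
    foreach ($c in $counts) {
        $diff = $c - $expected
        $chi += ($diff * $diff) / $expected
    }
    $chi
}

# Every value 0..255 exactly once, like one 256-byte block written by random.c.
$block = [byte[]](0..255)
$chiBlock = Get-ByteChiSquared $block     # 0

# Bytes from System.Random for comparison; the result varies from run to run.
$buffer = [byte[]]::new(65536)
[System.Random]::new().NextBytes($buffer)
$chiRandom = Get-ByteChiSquared $buffer
"Permutation block: $chiBlock, System.Random: $chiRandom"
```

A result of exactly 0 reflects the forced use of every value once per block. Independent random bytes would normally score well above 0.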

Upvotes: 1

Gerrit

Reputation: 881

You can use PowerShell to generate cheap random data for your files:

[Byte[]] $out = @()
# -Maximum is exclusive, so 256 is required for the byte value 255 to appear
0..2047 | % {$out += Get-Random -Minimum 0 -Maximum 256}
[System.IO.File]::WriteAllBytes("myrandomfiletest", $out)

This uses an algorithm with a seed taken from the system clock, so don't use this for ANY serious cryptographic applications.

In addition, be wary of the performance degradation of this approach as the output file grows: Get-Random is called once per byte, and += copies the whole array on every iteration.
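
One way to avoid the repeated array copies is to allocate the buffer once and fill it by index. A minimal sketch, assuming the output goes to a file in the temp directory (the path is illustrative):

```powershell
$size = 2048
$out = [byte[]]::new($size)   # allocate once instead of growing the array with +=
for ($i = 0; $i -lt $size; $i++) {
    # -Maximum is exclusive, so 256 allows every byte value from 0 to 255.
    $out[$i] = Get-Random -Minimum 0 -Maximum 256
}
$target = Join-Path ([System.IO.Path]::GetTempPath()) 'myrandomfiletest.bin'
[System.IO.File]::WriteAllBytes($target, $out)
```

This still calls Get-Random once per byte, so it remains slow for large files; only the cost of reallocating the array goes away.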

Upvotes: 20

David Waters

Reputation: 12028

I have been using Random Data File Creator and like it. It creates binary files (i.e. not text files) filled with pseudo-random bits, and it can quickly create very large files. To create multiple small files you would need to script it, which would be very easy given that it is a command-line tool.

Upvotes: 16

EBGreen

Reputation: 37730

Since you don't specify a language, I'll simply pick one at random. Here is a PowerShell script to do it:

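$rootDir = 'C:\Temp\TestRandomFiles\'
$baseFile = $rootDir + "base.txt"
$desiredFileSize = 2*1KB
$fileCount = 50000
# Seed the base file, then keep appending its current length until it reaches the desired size
"start" | Out-File -Filepath $baseFile
While ($(Get-ChildItem -path $baseFile).Length -lt $desiredFileSize)
{
    $(Get-ChildItem -path $baseFile).Length | Out-File $baseFile -APPEND
}
# The base file counts as the first file, so make $fileCount - 1 copies next to it
for($i=1;$i -lt $fileCount;$i++)
{
    Copy-Item $baseFile ($rootDir + "File$i.txt")
}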

You'll have to change the variables to the parameters that you want of course.

Upvotes: 9

Bogdan

Reputation: 3075

You can run fsutil in a batch loop to create files of any size.

fsutil file createnew filename.extension 2000

Upvotes: 36

beach

Reputation: 8640

How about something like this: Random File Generator 1.1

Or File generator

Upvotes: 0

levand

Reputation: 8500

You'll have to create files in the normal way, and then populate them with randomized data, probably from a rand() function of some sort.

It really depends on your programming language. Windows itself certainly won't provide this capability.

There are a number of programming languages that could do this easily, however, including basic windows batch/CMD scripts. What language are you interested in using?

Upvotes: 2

jgallant

Reputation: 11273

Well, technically you could write something to do this for you.
I don't know of anything specific, but the easiest way would be to create a text file of a specific size (2K, for example), then write a batch file to copy it 50,000 times.

Upvotes: 0
