XtC

Reputation: 29

Windows batch file to group 1000 files into alphanumeric folders

This is my first post/request on here. I've done numerous searches to try and find a solution, but I think my requirement is a tad ambitious.

I'm using Windows 7 and would prefer to do this in a DOS batch file rather than PowerShell.

I have folders containing tens of thousands of old Zip archives. With so many files in one folder, listing them can be slow. I want to move the Zip archives into alphabetised folders, with each folder limited to 1000 files.

So the first 1000 a*.zip files would be moved into a folder named A1. The second thousand a*.zip files into a folder named A2 and so on.

The files would need to be moved in order, so that if the last file moved into A1 is an_example_file_97.zip, then the first file moved into the A2 folder would be an_example_file_98.zip.

I would need to do this for the whole alphabet and also numerically named Zip archives. Then I would end up with a folder/file structure like this...

<DIR> 01
    1000 zip archives whose filename begins with a number
<DIR> 02
    Next 1000 zip archives whose filename begins with a number
<DIR> 03
    Next 1000 zip archives whose filename begins with a number

<DIR> A1
    1000 zip archives whose filename begins with A
<DIR> A2
    Next 1000 zip archives whose filename begins with A
<DIR> A3
    Next 1000 zip archives whose filename begins with A
<DIR> B1
    1000 zip archives whose filename begins with B
<DIR> B2
    Next 1000 zip archives whose filename begins with B
<DIR> B3
    Next 1000 zip archives whose filename begins with B

<DIR> Z1
    1000 zip archives whose filename begins with Z
<DIR> Z2
    Next 1000 zip archives whose filename begins with Z
<DIR> Z3
    Next 1000 zip archives whose filename begins with Z

My apologies if this solution already exists on this site, but it's tricky to know exactly what to search for.

Thank you.

Upvotes: 2

Views: 2062

Answers (4)

user6811411


Contrary to Joey, I don't think it's too complicated or painful for batch.

Counting in an array requires two levels of delayed expansion inside a for loop, but that's it.

@Echo off&SetLocal EnableExtensions EnableDelayedExpansion
PushD "X:\start\here"
for /F "Delims=" %%A in ('Dir /B /ON *') Do (
  Set "Name=%%~nA"
  Set "N=!Name:~0,1!"
  Echo !N!|Findstr "[0-9]" 2>&1>NUL &&Set N=0
  Set /A "Array[!N!]+=1"
  Call Set "Count=%%Array[!N!]%%"
  REM Count-1 keeps the 1000th file of a letter in folder 1 instead of folder 2
  Set /A "F=(!Count!-1) / 1000 +1, I = !Count! %% 1000"
  Set "Dest=!N!!F!"
  If not Exist "!Dest!" REM MD "!Dest!"
  Echo Move "%%A" "!Dest!\"
)
Set Array[
Pause
Popd

While testing, the MD command is REMed out and the Move command is only echoed.

Take a look at the Array[ counting variables output at the end to see how many directories would be created.

Shortened sample output:

Move "WSH_Shell_4.html" "W1\"
Move "xyz" "x1\"
Move "yarn.bat" "y1\"
Move "ZehnerZaehlen.cmd" "Z1\"
Move "zeiten.txt" "z1\"
Array[0]=7
Array[A]=10
Array[B]=3
Array[c]=30
Array[D]=19
Array[E]=15
Array[F]=18
Array[G]=29
Array[H]=11
Array[I]=9
Array[J]=2
Array[K]=1
Array[L]=9
Array[m]=20
Array[N]=5
Array[O]=4
Array[P]=11
Array[r]=16
Array[S]=25
Array[T]=17
Array[U]=19
Array[v]=3
Array[W]=19
Array[x]=1
Array[y]=1
Array[Z]=2
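
The per-letter counting can be sketched outside batch as well. The following is a minimal Python illustration, not a replacement for the batch file; the function name `assign_folders` and the `per_folder` parameter are made up for this example. It keeps one counter per first character (all digits share the bucket `0`) and derives the folder number from the counter. Note that `count // per_folder + 1` would send the 1000th file to folder 2, so the sketch subtracts one from the counter first.

```python
from collections import defaultdict


def assign_folders(names, per_folder=1000):
    """Return (plan, counts): plan pairs each name with a folder such as 'A1' or '01'."""
    counts = defaultdict(int)  # one counter per first character, like Array[...]
    plan = []
    # Case-insensitive sort, roughly what a name-ordered directory listing gives.
    for name in sorted(names, key=str.lower):
        first = name[0].upper()
        if first in "0123456789":
            first = "0"  # all numerically named files share one bucket
        counts[first] += 1
        # Files 1..per_folder -> folder 1, the next per_folder -> folder 2, ...
        index = (counts[first] - 1) // per_folder + 1
        plan.append((name, f"{first}{index}"))
    return plan, dict(counts)


if __name__ == "__main__":
    demo = ["a3.zip", "7.zip", "a1.zip", "a2.zip"]
    plan, counts = assign_folders(demo, per_folder=2)
    for name, folder in plan:
        print(f'Move "{name}" "{folder}\\"')
    print(counts)
```

With `per_folder=2`, the three `a` files are split so that two go to `A1` and the third to `A2`, while `7.zip` goes to `01`.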

Upvotes: 0

Aacini

Reputation: 67216

This is the same as LotPings' answer, but with a couple of small modifications that make it simpler and faster:

@Echo off
SetLocal EnableExtensions EnableDelayedExpansion

PushD "X:\start\here"
for %%A in (*.*) Do (
   Set "Name=%%~nA"
   Set "N=!Name:~0,1!"
   If "!N!" lss "a" Set "N=0"
   Set /A "Array[!N!]+=1, F=(Array[!N!]-1) / 1000 +1"
   Set "Dest=!N!!F!"
   If not Exist "!Dest!" REM MD "!Dest!"
   Echo Move "%%A" "!Dest!\"
)

If the listing produced by the plain for %%A command is not sorted, then replace the for %%A in (*.*) Do ( line with these two lines:

for %%L in (0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z) Do (
   for /F "Delims=" %%A in ('Dir /B /ON %%L*') Do (

... and add a closing parenthesis at the end. This is better than a simple for /F "Delims=" %%A in ('Dir /B /ON *') Do ( because, if the number of files is huge, the execution of such a for could take too much time...

Upvotes: 3

Magoo

Reputation: 80033

@ECHO Off
SETLOCAL ENABLEDELAYEDEXPANSION
SET "sourcedir=U:\sourcedir\t w o"
SET "destdir=U:\destdir"
SET "oldc1=*"
SET "oldsd=*"
SET "dircnt=4"

FOR /f "delims=" %%a IN (
 'dir /b /a-d /on "%sourcedir%\*.zip" '
 ) DO (
 SET "c1=%%a"
 SET "c1=!c1:~0,1!"
 IF /i "!c1!" neq "!oldc1!" (
  REM we changed initial character
  SET "oldc1=!c1!"
  SET "destsd=!c1!"
  FOR %%b IN (0 1 2 3 4 5 6 7 8 9) DO IF "%%b"=="!c1!" SET "destsd=0"
  IF /i "!oldsd!" neq "!destsd!" (
   SET /a fcount=dircnt-1
   SET /a olddestsd2=9
   SET "oldsd=!destsd!"
  )
 )
 SET /a fcount +=1
 SET /a destsd2=fcount / dircnt
 IF !destsd2! neq !olddestsd2! (
  SET /a olddestsd2=destsd2
  ECHO MD "%destdir%\!destsd!!destsd2!" 2>nul
 )
 ECHO MOVE "%sourcedir%\%%a" "%destdir%\!destsd!!destsd2!\"
)

GOTO :EOF

You would need to change the settings of sourcedir and destdir to suit your circumstances.

The required MD commands are merely ECHOed for testing purposes. After you've verified that the commands are correct, change ECHO MD to MD to actually create the directories.

The required MOVE commands are merely ECHOed for testing purposes. After you've verified that the commands are correct, change ECHO MOVE to MOVE to actually move the files. Append >nul to suppress report messages (e.g. 1 file moved).

Cut-and-paste the post to a .BAT file. Do not attempt to reformat for æsthetic reasons - batch can be quite sensitive to layout.

For testing, I set the value of dircnt to 4. Your application requires 1000.

I'd suggest you use some dummy directories for testing.

This routine uses delayed expansion where %var% represents the value of a variable when a code block (parenthesised statement group) is parsed and !var! the value as it may change within the loop.

The first issue is to produce a directory list in memory. This list is sorted in alphabetical order (/on) and is in the basic format (/b) [names only] with no directory names (/a-d). Each line of the list is then read by the for command and assigned to %%a.

c1 is then employed to contain the first character of the filename which is compared with oldc1 - the previous value so that the destination directoryname is only recalculated when the first character of the filename changes.

If the first character of the filename is numeric, then the destination subdirectory will be 0?, otherwise it's the first character of the filename now in c1. We initialise the filecount within this directory to the maximum - 1 and the subdirectory's second character to 9.

This has established the new destination if the destination has changed.

We then increment the count of files moved and calculate the second character of the destination by simply dividing the count-of-files by the files-per-directory. Since batch does integer maths, the result is int(count/files per directory); because the count was initialised to files-per-directory - 1, the result starts at 1.

We then detect whether the destination directory has changed, and create the new directory if required.

Then move the file.
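
The counter arithmetic in the walkthrough above can be checked with a small Python sketch. This is only an illustration of the integer division, under the assumption that all files share the same first character; the function name `folder_numbers` is invented for the example.

```python
def folder_numbers(n_files, dircnt):
    """Folder number given to each of n_files files that share a first character."""
    fcount = dircnt - 1  # starting value set when the first character changes
    numbers = []
    for _ in range(n_files):
        fcount += 1
        # Integer division, matching SET /A for positive values.
        numbers.append(fcount // dircnt)
    return numbers


if __name__ == "__main__":
    # With the test setting dircnt=4, nine files fill folders 1 and 2 and start folder 3.
    print(folder_numbers(9, 4))  # [1, 1, 1, 1, 2, 2, 2, 2, 3]
```

Starting the count at `dircnt - 1` is what makes the first quotient 1 rather than 0, and it also means each folder receives exactly `dircnt` files before the number advances.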

Upvotes: 0

Joey

Reputation: 354566

In PowerShell I'd probably approach it as follows (not really caring for efficiency for now):

  1. Get list of files:

    Get-ChildItem |
    
  2. Sort them by name (with numbers in the name sorted numerically instead of alphabetically)

    Sort-Object {
      [regex]::Replace($_.Name, '\d+', { '{0:0000000000}' -f ([int]$args[0].Value) })
    } |
    

    This will take care of sorting some-name-100 after some-name-99.

  3. Group by first character:

    Group-Object {
      $first = $_.Name[0];
      # Emit only a 0 for all numbers
      if ($first -match '[0-9]') { '0' } else { $first }
    } |
    
  4. Take those groups and divide them into batches of at most 1000 elements

    ForEach-Object {
      $name = $_.Name
      $group = $_.Group
      0..([Math]::Ceiling($_.Count / 1000) - 1) |
        ForEach-Object {
          $items = $group[($_ * 1000)..($_ * 1000 + 999)]
          [pscustomobject] @{
            FirstLetter = $name
            Index = $_ + 1
            Items = $items
          }
        }
    } |
    
  5. Create folders and move files to their respective folders

    ForEach-Object {
      $dir = New-Item -ItemType Directory -Path ($_.FirstLetter + $_.Index)
      $_.Items | Move-Item -Destination $dir
    }
    

This is untested, but should more or less work. For testing, you should perhaps run it on dummy data and include a -WhatIf on the Move-Item.
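
The batching in step 4 can be sketched in Python as follows. This is a minimal illustration rather than a port of the pipeline; the name `split_into_batches` and the `size` parameter are hypothetical. Using ceiling division for the number of batches avoids an empty trailing batch when the group size is an exact multiple of the batch size.

```python
import math


def split_into_batches(items, size=1000):
    """Return a list of (index, batch) pairs, with index starting at 1."""
    n_batches = math.ceil(len(items) / size)
    return [(i + 1, items[i * size:(i + 1) * size]) for i in range(n_batches)]


if __name__ == "__main__":
    for index, batch in split_into_batches(["a", "b", "c", "d", "e"], size=2):
        print(index, batch)
```

Each pair corresponds to one destination folder: the index becomes the numeric suffix and the batch holds the files to move there.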

All this is going to be much more painful in cmd than PowerShell, especially things like grouping, or sorting numerically.
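
To show what sorting numerically means here, the zero-padding idea from step 2 can be sketched in Python. This assumes digit runs are shorter than the padding width; the name `natural_key` and the `width` parameter are made up for the example.

```python
import re


def natural_key(name, width=10):
    """Sort key that left-pads every run of digits with zeros to a fixed width."""
    return re.sub(r"[0-9]+", lambda m: m.group().zfill(width), name.lower())


if __name__ == "__main__":
    names = ["some-name-100.zip", "some-name-99.zip", "some-name-9.zip"]
    print(sorted(names))                   # plain string order puts 100 first
    print(sorted(names, key=natural_key))  # 9, 99, 100
```

Because every number is compared at the same width, `some-name-99` sorts before `some-name-100`, which a plain alphabetical sort does not guarantee.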

Upvotes: 0
