joslinm

Reputation: 8105

Split a string into multiple sections by index in Perl

I have the output from a diskpart command:

Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
----------  ---  -----------  -----  ----------  -------  ---------  --------
Volume 0     D                       DVD-ROM         0 B  No Media
Volume 1     C   OSDisk       NTFS   Partition    232 GB  Healthy    Boot
Volume 2         BDEDrive     NTFS   Partition    300 MB  Healthy    System

I want to capture each column into its own variable, so my first inclination was to do something like ($volume, $ltr, ..., $info) = $line =~ ((\w+\s\d+)\s+([A-Z])?...

The problem I ran into is that nothing distinguishes Label, Fs, and Type from one another. If I use (\w+)\s+ for each of those columns and a row has no Label but does have an Fs, the filesystem gets captured into $label.

I'm not sure I can make this work with a regex, but I'm open to suggestions! Instead I was going to go in a new direction and split the string based on the indices where each run of dashes in the separator line begins and ends. If I pulled in all these indices, what's the best way to split the string into its respective substrings in Perl?

I looked at substr, and attempted to pass it multiple indices like ($a,$b,$c) = substr('abcd', 1,2,3); but this merely resulted in $a being split between 2,3

Is there any elegant solution to this besides just splitting everything up one line at a time?

Upvotes: 1

Views: 595

Answers (2)

Swen Vermeul

Reputation: 104

Instead of using a (not very maintainable) regex, it is much easier to use unpack:

my @l = unpack('A12 A5 A13 A7 A12 A9 A11 A9', $_);

You still have to throw away the second line (the row of dashes), but you don't have to care what your data looks like.
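As a rough sketch of how this could be fleshed out, the template does not have to be typed by hand: it can be derived from the dashed separator line, which is close to what the question asks about splitting on the dash indices. The helper names template_from_rule and parse_row are made up for illustration, and the sketch assumes the layout is exactly as pasted in the question (no leading indentation, columns separated by spaces). Two things differ from the one-liner above: the last column uses A* so it takes the rest of the line, and leading spaces are trimmed afterwards, because A only strips trailing spaces and right-aligned fields such as Size would otherwise keep their padding.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an unpack template (e.g. 'A12 A5 ... A*') from the dashed separator line.
# Each column is one run of dashes plus the spaces that follow it.
# (Hypothetical helper, not part of any module.)
sub template_from_rule {
    my ($rule) = @_;
    my @widths = map { length($_) } $rule =~ /(-+\s*)/g;
    pop @widths;    # the last column simply takes the rest of the line
    return join(' ', (map { "A$_" } @widths), 'A*');
}

# Split one data row into fields and trim them.
# 'A' already strips trailing spaces; leading spaces are removed here.
sub parse_row {
    my ($template, $row) = @_;
    my @fields = unpack($template, $row);
    s/^\s+// for @fields;
    return @fields;
}

my $rule     = '----------  ---  -----------  -----  ----------  -------  ---------  --------';
my $template = template_from_rule($rule);

my @rows = (
    'Volume 0     D                       DVD-ROM         0 B  No Media',
    'Volume 1     C   OSDisk       NTFS   Partition    232 GB  Healthy    Boot',
    'Volume 2         BDEDrive     NTFS   Partition    300 MB  Healthy    System',
);

for my $row (@rows) {
    my ($volume, $ltr, $label, $fs, $type, $size, $status, $info)
        = parse_row($template, $row);
    # Empty columns come back as empty strings, so nothing shifts left.
    print join('|', $volume, $ltr, $label, $fs, $type, $size, $status, $info), "\n";
}
```

When reading real diskpart output line by line, the header and the dashed line would need to be skipped (or the dashed line captured to build the template) before calling parse_row on the remaining rows.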

Upvotes: 5

Toto

Reputation: 91385

How about:

#!/usr/bin/perl
use strict;
use warnings;
use Data::Dump qw(dump);


while(<DATA>) {
    chomp;
    my @l = /^(\w*\s\d*)\s+(\w|\s)\s+(\w+|\s+)\s+(\w+|\s+)\s+([\w-]+|\s+)\s+(\d+\s\w{1,2})\s+?([\w\s]+)\s+?([\w\s]+)$/;
    dump(@l) if @l;
}


__DATA__
Volume ###  Ltr  Label        Fs     Type        Size     Status     Info
----------  ---  -----------  -----  ----------  -------  ---------  --------
Volume 0     D                       DVD-ROM         0 B  No Media          
Volume 1     C   OSDisk       NTFS   Partition    232 GB  Healthy    Boot
Volume 2         BDEDrive     NTFS   Partition    300 MB  Healthy    System

output:

(
  "Volume 0",
  "D",
  " ",
  " ",
  "DVD-ROM",
  "0 B",
  " No Media        ",
  " ",
)

(
  "Volume 1",
  "C",
  "OSDisk",
  "NTFS",
  "Partition",
  "232 GB",
  " Healthy   ",
  "Boot",
)

(
  "Volume 2",
  " ",
  "BDEDrive",
  "NTFS",
  "Partition",
  "300 MB",
  " Healthy   ",
  "System",
)

Upvotes: 2
