Reputation: 13
In our company there are many users and many applications with restricted access, plus a database that records those accesses. I don't have access to that database, but I do have a CSV file, generated automatically once a day, with all accesses of all my users. I want them to be able to check their access situation, so I am writing a simple PowerShell script for this purpose.
CSV:
user;database1_dat;database2_dat;database3_dat
john;0;0;1
peter;1;0;1
I can do:
Import-Csv -Delimiter ';' foo.csv | where { $_.user -eq $user }
But this will show me the original ugly headers (with the "_dat" suffix). Can I delete the last four characters from every header that ends with "_dat", when I can't predict how many headers there will be tomorrow?
I am aware of calculated properties like:
Select-Object @{ expression={$_.database1_dat}; label='database1' }
but I have to know all column names for that, as far as I know.
Am I condemned to over-engineer it with a separate function that builds the whole calculated-property expression dynamically from scratch, or is there a simple way I am missing?
Thanks :-)
Upvotes: 1
Views: 2000
Reputation: 437953
Assuming that file foo.csv fits into memory as a whole, the following solution performs well:
$headerRow, $dataRows = (Get-Content -Raw foo.csv) -split '\r?\n', 2
# You can pipe the result to `where {$_.user -eq $user}`
ConvertFrom-Csv ($headerRow -replace '_dat(?=;|$)'), $dataRows -Delimiter ';'
Get-Content -Raw reads the entire file into memory as a single string, which is much faster than reading it line by line (the default).
-split '\r?\n', 2 splits the resulting multi-line string into two: the header line and all remaining lines.
\r?\n matches a newline, both CRLF (\r\n) and LF-only (\n).
The 2 limits the number of tokens to return to 2, meaning that splitting stops once the 1st token (the header row) has been found, and the remainder of the input string (comprising all data rows) is returned as-is as the last token.
No $null placeholder is needed as the first target variable in the multi-assignment, because this separator regex does not match at the very start of the string, so there is no empty leading token to discard.
$headerRow -replace '_dat(?=;|$)' uses a regex to remove any _dat column-name suffixes (followed by a ; or the end of the string); if the substring _dat only ever occurs as a name suffix (not also inside names), you can simplify to -replace '_dat'.
ConvertFrom-Csv directly accepts arrays of strings, so the cleaned-up header row and the string with all data rows can be passed as-is.
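To make the difference between the two regex variants concrete, here is a minimal sketch that runs the same steps on an in-memory string instead of foo.csv. The sample data, including the made-up my_data_dat column, is an assumption chosen only to show a name that contains _dat in the middle as well as at the end.

```powershell
# Hypothetical sample standing in for foo.csv; 'my_data_dat' is a made-up
# column whose name contains '_dat' both inside the name and as a suffix.
$csvText = "user;my_data_dat;database2_dat`njohn;0;1`npeter;1;0"

# Split into the header line and the remaining block of data lines.
$headerRow, $dataRows = $csvText -split '\r?\n', 2

# The lookahead strips '_dat' only when ';' or the end of the string follows.
$cleanHeader = $headerRow -replace '_dat(?=;|$)'   # user;my_data;database2
# Without the lookahead, every occurrence of '_dat' is removed.
$naiveHeader = $headerRow -replace '_dat'          # user;mya;database2

# Parse the cleaned-up header together with the untouched data rows.
$rows = ConvertFrom-Csv ($cleanHeader, $dataRows) -Delimiter ';'
$rows | Where-Object { $_.user -eq 'peter' }
```

If your real column names never contain _dat except as a suffix, both variants give the same header and the shorter one is fine.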
Alternative solution: algorithmic renaming of an object's properties:
Note: This solution is slow, but may be an option if you only extract a few objects from the CSV file.
As you note in the question, use of Select-Object with calculated properties is not an option in your case, because you neither know the column names nor their number in advance.
However, you can use a ForEach-Object command in which you use .psobject.Properties, an intrinsic member, for reflection on the input objects:
Import-Csv -Delimiter ';' foo.csv | where { $_.user -eq $user } | ForEach-Object {
    # Initialize an aux. ordered hashtable to store the renamed
    # property name-value pairs.
    $renamedProperties = [ordered] @{}
    # Process all properties of the input object and
    # add them with cleaned-up names to the hashtable.
    foreach ($prop in $_.psobject.Properties) {
        # Property names contain no ';', so anchor the suffix with '$'.
        $renamedProperties[($prop.Name -replace '_dat$')] = $prop.Value
    }
    # Convert the aux. hashtable to a custom object and output it.
    [pscustomobject] $renamedProperties
}
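If you need the renaming in more than one place, the same technique could be wrapped in a small pipeline function. This is only a sketch: the function name Rename-DatProperty is hypothetical, and the sample rows are the ones from the question, parsed from strings rather than from a file.

```powershell
# Hypothetical helper: outputs a copy of each input object with any
# '_dat' suffix stripped from its property names.
function Rename-DatProperty {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [psobject] $InputObject
    )
    process {
        $renamed = [ordered] @{}
        foreach ($prop in $InputObject.psobject.Properties) {
            $renamed[($prop.Name -replace '_dat$')] = $prop.Value
        }
        [pscustomobject] $renamed
    }
}

# Sample rows from the question, parsed from strings instead of a file.
$sample = 'user;database1_dat;database2_dat;database3_dat', 'john;0;0;1', 'peter;1;0;1' |
    ConvertFrom-Csv -Delimiter ';'

$result = $sample | Where-Object { $_.user -eq 'peter' } | Rename-DatProperty
$result.psobject.Properties.Name   # user, database1, database2, database3
```

Filtering before renaming keeps the slow per-object work limited to the few rows you actually display.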
Upvotes: 2
Reputation: 60045
You can do something like this:
$textInfo = (Get-Culture).TextInfo
$headers = (Get-Content .\test.csv | Select-Object -First 1).Split(';') |
    ForEach-Object {
        $textInfo.ToTitleCase($_) -replace '_dat'
    }
$user = 'peter'
Get-Content .\test.csv | Select-Object -Skip 1 |
    ConvertFrom-Csv -Delimiter ';' -Header $headers |
    Where-Object User -EQ $user
User  Database1 Database2 Database3
----  --------- --------- ---------
peter 1         0         1
Not super efficient but does the trick.
Upvotes: 1