How to save each line of a text file as an array through PowerShell

JOberloh · Sep 4, 2018 · Viewed 52.6k times

If I have a text file, C:\USER\Documents\Collections\collection.txt that has the following information:

collectionA.json
collectionB.json
collectionC.json
collectionD.json

I am wondering how, through PowerShell, I can store each line of the text file as an element of an array, such as:

array arrayFromFile = new Array;
foreach(line x in collection.txt)
{
    arrayFromFile.Add(x);
}

..with the end goal of doing the following:

foreach(string x in arrayFromFile)
{
    newman run x;
}

My apologies for the seemingly easy question - I have never dealt with PowerShell before.

Answer

mklement0 · Sep 4, 2018

To complement JohnLBevan's helpful answer:

Get-Content, as a cmdlet, outputs objects one by one to the pipeline, as they become available. (Note that a pipeline is involved even when invoking a cmdlet in the absence of the pipe symbol, |, for chaining multiple commands).
In this case, the output objects are the individual lines of the input text file.

If you collect a pipeline's output objects, for instance by assigning them to a variable such as $arrayFromFile or by using the pipeline in the context of a larger expression with (...):

  • PowerShell captures multiple output objects in an automatically created array, of type [object[]],
  • but if there's only one output object, that object is captured as-is (without an array wrapper)
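
The two capture behaviors listed above can be checked with a minimal sketch. The temporary files and the variable names ($oneLineFile, $multiLineFile, $single, $multiple) are assumptions made for illustration only; they are not part of the original question.

```powershell
# Hypothetical temporary files, created only for this demonstration.
$oneLineFile   = Join-Path ([IO.Path]::GetTempPath()) 'one-line-demo.txt'
$multiLineFile = Join-Path ([IO.Path]::GetTempPath()) 'multi-line-demo.txt'
Set-Content -Path $oneLineFile   -Value 'collectionA.json'
Set-Content -Path $multiLineFile -Value 'collectionA.json', 'collectionB.json'

# A single output object is captured as-is, here as a [string].
$single = Get-Content -Path $oneLineFile
$singleIsArray = $single -is [array]

# Multiple output objects are collected in an [object[]] array.
$multiple = Get-Content -Path $multiLineFile
$multipleTypeName = $multiple.GetType().Name

# Clean up the demo files.
Remove-Item -Path $oneLineFile, $multiLineFile
```

Here $singleIsArray should be $false, whereas $multipleTypeName should be 'Object[]'.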

However, it often isn't necessary to ensure that you always receive an array, because PowerShell treats scalars (single values that aren't collections) the same as arrays (collections) in many contexts. Examples are foreach statements and values output to the pipeline for enumeration, to be processed via the ForEach-Object cmdlet, for instance. Therefore, the following commands work just fine, irrespective of how many lines the input file contains:

# OK - read all lines, then process them one by one in the loop.
# (No strict need to collect the Get-Content output in a variable first.)
foreach ($line in Get-Content C:\USER\Documents\Collections\collection.txt) {
  newman run $line
}

# Alternative, using the pipeline:
# Read line by line, and pass each through the pipeline, as it is being
# read, to the ForEach-Object cmdlet.
# Note the use of automatic variable $_ to refer to the line at hand.
Get-Content C:\USER\Documents\Collections\collection.txt |
  ForEach-Object { newman run $_ }

In order to ensure that a command's output is always an array, PowerShell offers @(...), the array-subexpression operator, which wraps even single-object output in an array.

Therefore, the PowerShell-idiomatic solution is:

$arrayFromFile = @(Get-Content C:\USER\Documents\Collections\collection.txt)
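
As a rough illustration of why @(...) can matter, the sketch below reads a one-line file both with and without it. The temporary file and the helper variable names ($demoFile, $firstWithoutWrapper) are assumptions for demonstration purposes.

```powershell
# Hypothetical one-line input file, created only for this demonstration.
$demoFile = Join-Path ([IO.Path]::GetTempPath()) 'array-subexpr-demo.txt'
Set-Content -Path $demoFile -Value 'collectionA.json'

# Without @(...), the single line is a plain [string], so indexing
# returns its first character rather than the first line.
$firstWithoutWrapper = (Get-Content -Path $demoFile)[0]

# @(...) wraps even a single output object in an array,
# so indexing returns the whole line.
$arrayFromFile = @(Get-Content -Path $demoFile)
$isArray = $arrayFromFile -is [array]
$count   = $arrayFromFile.Count
$first   = $arrayFromFile[0]

# Clean up the demo file.
Remove-Item -Path $demoFile
```

With the wrapper, $first holds the full line 'collectionA.json'; without it, $firstWithoutWrapper is just the character 'c'.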

TheMadTechnician points out that you can also use [array] to cast / type-constrain pipeline output as an alternative to @(...), which also creates [object[]] arrays:

# Equivalent of the command above that additionally locks in the variable's data type.
[array] $arrayFromFile = Get-Content C:\USER\Documents\Collections\collection.txt

By using [array] $arrayFromFile = ... rather than $arrayFromFile = [array] (...), variable $arrayFromFile becomes type-constrained, which means that its data type is locked in (whereas by default PowerShell allows you to change the type of a variable anytime).
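
The difference between the two forms can be seen in a small sketch; the variable names $constrained and $unconstrained are chosen here purely for illustration.

```powershell
# Type-constrained variable: the [array] constraint is re-applied
# on every later assignment to the variable.
[array] $constrained = 'collectionA.json'
$constrained = 42
$constrainedType = $constrained.GetType().Name

# Cast on the right-hand side only: the variable itself stays
# unconstrained, so a later assignment can change its type.
$unconstrained = [array] 'collectionA.json'
$unconstrained = 42
$unconstrainedType = $unconstrained.GetType().Name
```

After the second assignment, $constrainedType should still be 'Object[]' (42 gets wrapped in an array), while $unconstrainedType becomes 'Int32'.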

[array] is a command-independent alternative to the type-specific cast used in John's answer, [string[]]; you may use the latter for enforcing use of a uniform type across the array's elements, but that is often not necessary in PowerShell [1].

Regular PowerShell arrays are of type [object[]], which allows mixing elements of different types, but any given element still has a specific type. For instance, even though the type of $arrayFromFile after the command above is [object[]], the type of $arrayFromFile[0], i.e., the 1st element, is [string], assuming that the file contained at least 1 line; verify the type with $arrayFromFile[0].GetType().Name.
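
To make the distinction between the array's type and its elements' types concrete, here is a minimal sketch with a deliberately mixed array; the sample values and variable names are assumptions for illustration.

```powershell
# An [object[]] array may hold elements of different types.
$mixed = 'collectionA.json', 42, 3.14

# The array itself is of type Object[] ...
$arrayType = $mixed.GetType().Name

# ... but each element retains its own specific type.
$elementTypes = $mixed | ForEach-Object { $_.GetType().Name }
```

$arrayType should be 'Object[]', and $elementTypes should contain 'String', 'Int32', and 'Double', in that order.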


Faster alternative: direct use of the .NET framework

Cmdlets and the pipeline offer high-level, potentially memory-throttling features that are expressive and convenient, but they can be slow.

When performance matters, consider using .NET framework types directly, such as [System.IO.File] in this case:

$arrayFromFile = [IO.File]::ReadAllLines('C:\USER\Documents\Collections\collection.txt')

Note how the System. prefix can be omitted from the type name.

  • As in John's answer, this will return a [string[]] array.

  • Caveats:

    • Be careful with relative paths, as .NET typically has a different current directory than PowerShell; to work around this, always pass absolute paths, in the simplest case with, e.g., "$PWD/collection.txt" and most robustly with
      "$((Get-Location -PSProvider FileSystem).ProviderPath)/collection.txt"

    • .NET's default encoding is UTF-8, whereas Windows PowerShell defaults to "ANSI" encoding, i.e., the system locale's legacy code page; PowerShell Core, by contrast, also defaults to UTF-8. Use Get-Content's -Encoding parameter or the .ReadAllLines() overload that accepts an encoding instance to specify the input file's character encoding explicitly.
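
Both caveats can be addressed together, as the following sketch suggests: it builds an absolute path from PowerShell's current file-system location and names the encoding explicitly in both the .NET call and the cmdlet call. The temporary directory, file name, and variable names are assumptions for illustration.

```powershell
# Hypothetical demo file in the temp directory (written as UTF-8).
$demoDir  = [IO.Path]::GetTempPath()
$demoFile = Join-Path $demoDir 'readalllines-demo.txt'
[IO.File]::WriteAllLines($demoFile, [string[]] ('collectionA.json', 'collectionB.json'))

Push-Location -Path $demoDir
try {
    # Build an absolute path from PowerShell's current file-system location,
    # because .NET may have a different notion of the current directory.
    $currentDir   = (Get-Location -PSProvider FileSystem).ProviderPath
    $absolutePath = Join-Path $currentDir 'readalllines-demo.txt'

    # .NET: use the overload that accepts an encoding instance.
    $viaDotNet = [IO.File]::ReadAllLines($absolutePath, [Text.Encoding]::UTF8)

    # Cmdlet: specify the encoding via -Encoding.
    $viaCmdlet = @(Get-Content -Path $absolutePath -Encoding UTF8)
}
finally {
    Pop-Location
    Remove-Item -Path $demoFile
}
```

Both variables should end up with the same two lines; $viaDotNet is a [string[]] array, whereas $viaCmdlet is an [object[]] array of strings.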


[1] Generally, PowerShell's implicit, runtime type conversions cannot provide the same type safety you'd get in C#, for instance. E.g., [string[]] $a = 'one', 'two'; $a[0] = 42 does not cause an error: PowerShell simply quietly converts [int] 42 to a string.
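
The quiet conversion mentioned in the footnote can be observed directly; the helper variables $elementType and $elementValue are added here only to inspect the result.

```powershell
# A type-constrained string array.
[string[]] $a = 'one', 'two'

# Assigning an [int] does not fail; the value is converted to a string.
$a[0] = 42

$elementType  = $a[0].GetType().Name
$elementValue = $a[0]
```

$elementType should be 'String', and $elementValue the string '42' rather than the number 42.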