Convert String to Byte Array in PowerShell

Using System.Text.Encoding Class

The System.Text.Encoding class converts the specified string to a byte array in PowerShell. This class can use a particular, default, or custom encoding format. Let’s see how we can use them.

Use System.Text.Encoding Class with a Particular Encoding

Use the System.Text.Encoding class with specific encoding to convert a string to byte array in PowerShell.

First, we initialized the $string variable with the "Hello World!" string. Then, we created an ASCII encoding object and stored it in the $asciiEncodingObj variable. Note that the :: operator accesses the static ASCII property of the System.Text.Encoding class.

Next, we invoked the GetBytes() method of the ASCII object ($asciiEncodingObj) created in the previous step. The GetBytes() method took $string as a parameter, converted it to a byte array, and returned it; we stored the result in the $byteArray variable. Finally, we used the Write-Host cmdlet, which displays customized output on the console, to print the bytes.
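The steps described above can be sketched as follows. This is a minimal reconstruction from the description, using the variable names mentioned in the text; the byte values in the comment are the standard ASCII codes for the characters in the string.

```powershell
# Initialize the string to convert
$string = "Hello World!"

# The :: operator accesses the static ASCII property of System.Text.Encoding
$asciiEncodingObj = [System.Text.Encoding]::ASCII

# Convert the string to a byte array
$byteArray = $asciiEncodingObj.GetBytes($string)

# Print the bytes; Write-Host separates array elements with spaces
Write-Host $byteArray
# 72 101 108 108 111 32 87 111 114 108 100 33
```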

At this point, you might ask why we used the ASCII encoding format. It is because $string contained only English letters, a space, and punctuation, all of which the ASCII encoding format supports. Remember, ASCII is a 7-bit encoding that represents each character using a single byte. Use it only where you need to support English letters, digits, and common special characters, because it cannot represent non-English characters.

Let’s take another example where the $string contains some non-English characters unsupported by the ASCII encoding; see the following example.
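A minimal sketch of this case might look like the following. As an assumption made for portability, the two non-English characters are built from their code points with [char] casts, so the result does not depend on how the script file itself is saved; typing "Hello 世界!" directly should behave the same in a correctly encoded script.

```powershell
# Build "Hello 世界!" from code points (世 = U+4E16, 界 = U+754C)
$string = "Hello " + [char]0x4E16 + [char]0x754C + "!"

# Convert with ASCII, which cannot represent the two non-English characters
$asciiEncodingObj = [System.Text.Encoding]::ASCII
$byteArray = $asciiEncodingObj.GetBytes($string)

# No error is raised, but each unsupported character becomes 63, the code for ?
Write-Host $byteArray
# 72 101 108 108 111 32 63 63 33
```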

In the above example, we initialized $string with a combination of English and non-English characters, namely Hello and 世界! (world! in English), and used the ASCII encoding format to convert $string to a byte array. Did you notice that PowerShell raised no error or exception, yet the returned output is not correct? Why?

The reason is that each non-ASCII character was replaced with a replacement character, which is ? by default, so the byte array holds the value of the replacement character (63 for ?) rather than a value for the original non-English character. Let’s use the following code to find the non-ASCII characters and see their actual numeric (Unicode) values.
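One way to inspect the characters is to cast each one to an integer and keep those above 127. This is only a sketch; the $nonAscii variable name is an assumption, and the string is again built from code points so the script does not depend on file encoding. Depending on the console font, the characters themselves may not render, but the numeric values are unaffected.

```powershell
# Build "Hello 世界!" from code points (世 = U+4E16, 界 = U+754C)
$string = "Hello " + [char]0x4E16 + [char]0x754C + "!"

# Keep only the characters whose numeric value is outside the ASCII range
$nonAscii = $string.ToCharArray() | Where-Object { [int]$_ -gt 127 }

# Print each character next to its numeric value
foreach ($ch in $nonAscii) {
    Write-Host ("{0} = {1}" -f $ch, [int]$ch)
}
# The numeric values printed are 19990 and 30028
```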

We found two non-ASCII characters: 世 and 界.

Standard ASCII characters have values from 0 to 127, so any character with a value above 127 is not an ASCII character.

What should we do if we combine English and non-English characters? Do we have another solution for it? Yes, we have. We must use a different encoding format, such as UTF-8 or UTF-16. Let’s convert the "Hello 世界!" string to a byte array using the UTF8 encoding format. See the following example.
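A minimal sketch of the UTF-8 conversion is shown here. The $utf8EncodingObj name is an assumption chosen to mirror the earlier ASCII example, and the non-English characters are built from their code points so the script does not depend on file encoding.

```powershell
# Build "Hello 世界!" from code points (世 = U+4E16, 界 = U+754C)
$string = "Hello " + [char]0x4E16 + [char]0x754C + "!"

# UTF8 can represent every Unicode character, using 1 to 4 bytes each
$utf8EncodingObj = [System.Text.Encoding]::UTF8
$byteArray = $utf8EncodingObj.GetBytes($string)

Write-Host $byteArray
# 72 101 108 108 111 32 228 184 150 231 149 140 33
```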

In the above output, the first six bytes in the $byteArray correspond to the ASCII characters of Hello followed by a space, while the next six bytes are the UTF8 encoding of the 世界 characters. If you decode this byte array, you will notice that the byte sequence 228 184 150 corresponds to the 世 character, the byte sequence 231 149 140 corresponds to the 界 character, and the last value, 33, corresponds to the ! character.

Let’s decode the $byteArray together in the following example:
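Decoding can be sketched with the GetString() method of the same encoding. The byte values below are the UTF8 bytes of "Hello 世界!" listed earlier; the $decoded variable name is an assumption.

```powershell
# UTF8 bytes of "Hello 世界!"
[byte[]]$byteArray = 72, 101, 108, 108, 111, 32, 228, 184, 150, 231, 149, 140, 33

# GetString() reverses GetBytes() when the same encoding is used
$decoded = [System.Text.Encoding]::UTF8.GetString($byteArray)

# Prints the original string (the console font must support the characters)
Write-Host $decoded
```

Decoding with a different encoding than the one used for encoding generally does not give back the original text, so both sides need to agree on the format.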

Use System.Text.Encoding Class with a Default Encoding

Use the System.Text.Encoding class with Default encoding to convert a string to a byte array in PowerShell.

This time, we used the Default encoding format to convert $string to a byte array in PowerShell. This encoding type may vary based on the runtime, regional settings, and operating system. In Windows PowerShell, it is typically the system's ANSI code page, such as Windows-1252; in PowerShell 7, which runs on newer .NET versions, it is UTF8.

The Default encoding is not a safe choice when working with non-ASCII characters, because an ANSI code page cannot represent characters outside its own range; in that case, explicitly using UTF8 or UTF16 encodes the characters properly.
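A short sketch of the Default encoding approach follows. The $defaultEncodingObj name is an assumption, and no expected output is shown for the encoding name because it depends on the machine and PowerShell version.

```powershell
$string = "Hello World!"

# Default returns whatever encoding the current runtime and system define
$defaultEncodingObj = [System.Text.Encoding]::Default

# Show which encoding is actually in effect on this machine
Write-Host ($defaultEncodingObj.EncodingName)

# Convert the string and print the bytes
$byteArray = $defaultEncodingObj.GetBytes($string)
Write-Host $byteArray
```

Because this string contains only ASCII characters, the bytes should be the same under UTF8 and the common ANSI code pages; differences appear only once non-ASCII characters are involved.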

Use System.Text.Encoding Class with Custom Encoding

Use the System.Text.Encoding class with custom encoding to convert a string to a byte array in PowerShell.

This code example is similar to the previous ones but lets us choose a custom encoding through the GetEncoding() method. We passed the ISO-8859-1 encoding name as an argument to the GetEncoding() method; you can specify any supported encoding based on your project needs.
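The following sketch shows what such a call might look like. The sample string "Café" and the $customEncodingObj name are assumptions for illustration; the é is built from its code point (U+00E9) so the script does not depend on file encoding.

```powershell
# Build "Café"; é is U+00E9, which ISO-8859-1 stores as the single byte 233
$string = "Caf" + [char]0xE9

# Look up an encoding by name
$customEncodingObj = [System.Text.Encoding]::GetEncoding("ISO-8859-1")
$byteArray = $customEncodingObj.GetBytes($string)

Write-Host $byteArray
# 67 97 102 233
```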

Use $byteArray.GetType() to get the data type of the byte array. Don’t forget to replace the $byteArray variable with your byte array variable name.
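For instance, a quick check might look like this; the sample string is an assumption.

```powershell
$byteArray = [System.Text.Encoding]::UTF8.GetBytes("Hello World!")

# FullName gives the .NET type name of the array
Write-Host ($byteArray.GetType().FullName)
# System.Byte[]
```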

Until now, we have used various encoding formats to convert a string, but why are they important? We convert a string to a byte array for multiple reasons, including data transmission, encryption, hashing, and manipulating binary data. Here is another important point: what should you do if you must send cryptographic data derived from a byte array in a text format? In that case, we can use the following code to hash the byte array and encode the hash as Base64 (String -> ByteArray -> Hash -> Base64).

Once we had the $byteArray, we used the New-Object cmdlet to create a new SHA1CryptoServiceProvider object and stored it in the $sha variable. Then, we used the $sha variable to call the ComputeHash() method, which took $byteArray as a parameter and returned its hash.

We saved this hash in the $result variable and passed it to the ToBase64String() method of the Convert class to convert it to a Base64-encoded string. Note that we overwrote the value of $result with the value returned by the ToBase64String() method. Finally, we used the Write-Host cmdlet to print the result on the PowerShell console.
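The pipeline described above can be sketched as follows. The sample string and the choice of UTF8 for the first step are assumptions, since the text does not say which encoding produced $byteArray; the remaining names come from the description.

```powershell
# String -> ByteArray (UTF8 is an assumption here)
$string = "Hello World!"
$byteArray = [System.Text.Encoding]::UTF8.GetBytes($string)

# ByteArray -> Hash
$sha = New-Object System.Security.Cryptography.SHA1CryptoServiceProvider
$result = $sha.ComputeHash($byteArray)

# Hash -> Base64 (overwrites $result with the text form)
$result = [System.Convert]::ToBase64String($result)

Write-Host $result

# Release the resources held by the hash object
$sha.Dispose()
```

A SHA-1 hash is always 20 bytes long, so the Base64 string has 28 characters regardless of the input. Keep in mind that changing the encoding in the first step changes the bytes, and therefore the hash, for any string with non-ASCII characters.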
