Using System.Text.Encoding Class
The System.Text.Encoding class converts the specified string to a byte array in PowerShell. This class can use a particular, default, or custom encoding format. Let’s see how we can use them.
Use System.Text.Encoding Class with a Particular Encoding
Use the System.Text.Encoding class with specific encoding to convert a string to byte array in PowerShell.

    $string = "Hello World!"
    $asciiEncodingObj = [System.Text.Encoding]::ASCII
    $byteArray = $asciiEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33

First, we initialized the $string variable with the "Hello World!" string. Then, we retrieved an ASCII encoding object and stored it in the $asciiEncodingObj variable. Note that the :: operator accesses the static ASCII property of the System.Text.Encoding class.
Next, we invoked the GetBytes() method of the ASCII encoding object ($asciiEncodingObj). The GetBytes() method took $string as a parameter, converted it to a byte array, and returned it; we stored the result in the $byteArray variable. Finally, we used the Write-Host cmdlet, which displays customized output on the console, to print both values.
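
The space-separated numbers in the output come from how PowerShell expands an array inside a double-quoted string. The following is a minimal sketch of that behaviour, assuming the $OFS preference variable has not been changed in your session; the $interpolated, $joined, and $hex variable names are only illustrative.

```powershell
$string = "Hello World!"
$byteArray = [System.Text.Encoding]::ASCII.GetBytes($string)

# Inside a double-quoted string, PowerShell joins array elements with a space
# (this is the default when $OFS is not set).
$interpolated = "$byteArray"
$joined = $byteArray -join ' '

Write-Host "Interpolated: $interpolated"
Write-Host "Joined:       $joined"
Write-Host "Byte count:   $($byteArray.Length)"

# A hexadecimal view of the same bytes, two digits per byte.
$hex = ($byteArray | ForEach-Object { $_.ToString('X2') }) -join ' '
Write-Host "Hex:          $hex"
```

Using -join explicitly is the safer choice in scripts, because it does not depend on session settings.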
Now, at this point, you may ask why we used the ASCII encoding format. It is because $string contained only English letters, a space, and a punctuation mark, all of which ASCII supports. Remember, ASCII is a 7-bit encoding that represents each character using a single byte. Use it only where you need English letters, digits, and basic punctuation, because it doesn’t support non-English characters.
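
The two properties mentioned above (one byte per character, and no value above 127) can be checked directly. This is a small sketch rather than a general validation routine; the $sameLength and $maxByte names are assumptions made for the example.

```powershell
$string = "Hello World!"
$ascii = [System.Text.Encoding]::ASCII
$byteArray = $ascii.GetBytes($string)

# One byte per character: the byte count matches the string length.
$sameLength = ($byteArray.Length -eq $string.Length)

# 7-bit encoding: no byte is larger than 127.
$maxByte = ($byteArray | Measure-Object -Maximum).Maximum

Write-Host "Characters: $($string.Length), bytes: $($byteArray.Length)"
Write-Host "Same length: $sameLength"
Write-Host "Largest byte value: $maxByte"
```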
Let’s take another example where the $string contains some non-English characters unsupported by the ASCII encoding; see the following example.

    $string = "Hello 世界!"
    $asciiEncodingObj = [System.Text.Encoding]::ASCII
    $byteArray = $asciiEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello 世界!
    Byte Array is: 72 101 108 108 111 32 63 63 33

In the above example, we initialized $string with a combination of English and non-English characters, Hello and 世界! (world! in English), and used the ASCII encoding format to convert $string to a byte array. Did you notice that PowerShell raised no error or exception, yet the returned output is incorrect? Why?
The ASCII encoding replaced each non-ASCII character with a ? replacement character, so the byte array holds 63, the ASCII value of ?, in place of each one; it does not hold any value derived from the original character. Let’s use the following code to see the actual character codes (Unicode values) of the non-English characters.

    $string = "Hello 世界!"

    for ($i = 0; $i -lt $string.Length; $i++) {
        $character = $string[$i]
        # Numeric value of the character (its UTF-16 code unit)
        $unicodeValue = [int][char]$character
        if ($unicodeValue -gt 127) {
            Write-Host "Non-ASCII character found: $character (Unicode Value $unicodeValue)"
        }
    }

Output:

    Non-ASCII character found: 世 (Unicode Value 19990)
    Non-ASCII character found: 界 (Unicode Value 30028)

We found two non-ASCII characters: 世 and 界.
Note: Standard ASCII characters have values from 0 to 127.
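
If silent replacement with ? is not acceptable, .NET lets you request an encoding that throws instead. The sketch below uses the GetEncoding() overload that accepts fallback objects; the $strictAscii and $lossy names are hypothetical, and the string is built from character codes only so that the example does not depend on how the script file itself is saved.

```powershell
$string = "Hello $([char]0x4E16)$([char]0x754C)!"   # "Hello 世界!"

# Ask for an ASCII encoding that throws instead of substituting "?".
$strictAscii = [System.Text.Encoding]::GetEncoding('us-ascii',
    [System.Text.EncoderFallback]::ExceptionFallback,
    [System.Text.DecoderFallback]::ExceptionFallback)

$lossy = $false
try {
    $null = $strictAscii.GetBytes($string)
}
catch {
    # PowerShell wraps the .NET EncoderFallbackException raised by GetBytes().
    $lossy = $true
    Write-Host "ASCII cannot represent this string: $($_.Exception.Message)"
}

Write-Host "Lossy under ASCII: $lossy"
```

A check like this can run before sending data, so that a lossy conversion is reported rather than discovered later.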
What should we do if a string combines English and non-English characters? We must use a different encoding format, such as UTF-8 or UTF-16. Let’s convert the "Hello 世界!" string to a byte array using the UTF-8 encoding format. See the following example.

    $string = "Hello 世界!"
    $utf8EncodingObj = [System.Text.Encoding]::UTF8
    $byteArray = $utf8EncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello 世界!
    Byte Array is: 72 101 108 108 111 32 228 184 150 231 149 140 33

In the above output, the first six bytes in $byteArray correspond to the ASCII characters of Hello and the space that follows, while the next six bytes are the UTF-8 encoding of the 世界 characters: the byte sequence 228 184 150 corresponds to 世, and the byte sequence 231 149 140 corresponds to 界. The last byte, 33, corresponds to the ! character.
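
The byte-to-character mapping described above can be verified by decoding one slice of the array at a time. This sketch relies on the GetString(bytes, index, count) overload; the $first, $second, $third, and $last variable names are illustrative only.

```powershell
$string = "Hello $([char]0x4E16)$([char]0x754C)!"   # "Hello 世界!"
$utf8 = [System.Text.Encoding]::UTF8
$byteArray = $utf8.GetBytes($string)

# GetString(bytes, index, count) decodes only the requested slice.
$first  = $utf8.GetString($byteArray, 0, 6)    # six single-byte characters
$second = $utf8.GetString($byteArray, 6, 3)    # bytes 228 184 150
$third  = $utf8.GetString($byteArray, 9, 3)    # bytes 231 149 140
$last   = $utf8.GetString($byteArray, 12, 1)   # byte 33

Write-Host "Bytes 0-5:  '$first'"
Write-Host "Bytes 6-8:  '$second'"
Write-Host "Bytes 9-11: '$third'"
Write-Host "Byte 12:    '$last'"
```

Slices must start and end on character boundaries; cutting a three-byte sequence in the middle yields replacement characters instead of the original text.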
Let’s decode the $byteArray together in the following example:

    $string = "Hello 世界!"
    $utf8EncodingObj = [System.Text.Encoding]::UTF8
    $byteArray = $utf8EncodingObj.GetBytes($string)
    $decodedByteArray = $utf8EncodingObj.GetString($byteArray)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"
    Write-Host "Decoded Byte Array is: $decodedByteArray"

Output:

    Original String is: Hello 世界!
    Byte Array is: 72 101 108 108 111 32 228 184 150 231 149 140 33
    Decoded Byte Array is: Hello 世界!

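
UTF-16 was mentioned earlier as another option for mixed text. As a rough comparison, the sketch below encodes the same string with both encodings; in .NET the UTF-16 (little-endian) encoding is exposed through the Unicode property, and the variable names are assumptions for the example.

```powershell
$string = "Hello $([char]0x4E16)$([char]0x754C)!"   # "Hello 世界!"

$utf8Bytes  = [System.Text.Encoding]::UTF8.GetBytes($string)
$utf16Bytes = [System.Text.Encoding]::Unicode.GetBytes($string)   # UTF-16, little-endian

Write-Host "UTF-8  uses $($utf8Bytes.Length) bytes"
Write-Host "UTF-16 uses $($utf16Bytes.Length) bytes"

# Both decode back to the same text when the matching encoding is used.
$fromUtf8  = [System.Text.Encoding]::UTF8.GetString($utf8Bytes)
$fromUtf16 = [System.Text.Encoding]::Unicode.GetString($utf16Bytes)
$roundTrip = ($fromUtf8 -ceq $string) -and ($fromUtf16 -ceq $string)
Write-Host "Round trip matches: $roundTrip"
```

For this string UTF-8 is the more compact of the two, because the ASCII characters need one byte each there and two bytes each in UTF-16.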
Use System.Text.Encoding Class with a Default Encoding
Use the System.Text.Encoding class with Default encoding to convert a string to a byte array in PowerShell.

    $string = "Hello World!"
    $defaultEncodingObj = [System.Text.Encoding]::Default
    $byteArray = $defaultEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33

This time, we used the Default encoding to convert $string to a byte array in PowerShell. What Default returns depends on the runtime: in Windows PowerShell (built on .NET Framework), it is the system’s active ANSI code page, such as Windows-1252, which varies with the regional settings; in PowerShell 6 and later (built on .NET Core), it is always UTF-8.
Note: The Default encoding may not be sufficient when working with non-ASCII characters, because an ANSI code page covers only a limited character set; in that case, use UTF-8 or UTF-16 explicitly to encode the characters properly.
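
Because the result differs between environments, it can help to inspect what Default resolves to on the machine at hand. The following sketch only reads standard properties of the encoding object and prints them; no particular output is implied, since it varies by system.

```powershell
$defaultEncodingObj = [System.Text.Encoding]::Default

# These properties describe the encoding that Default resolved to on this machine.
Write-Host "Encoding name: $($defaultEncodingObj.EncodingName)"
Write-Host "Web name:      $($defaultEncodingObj.WebName)"
Write-Host "Code page:     $($defaultEncodingObj.CodePage)"

# The PowerShell version helps explain which behaviour you are seeing.
Write-Host "PowerShell:    $($PSVersionTable.PSVersion)"
```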
Use System.Text.Encoding Class with Custom Encoding
Use the System.Text.Encoding class with custom encoding to convert a string to a byte array in PowerShell.

    $string = "Hello World!"
    # Look up an encoding by name instead of using a predefined property
    $customEncodingObj = [System.Text.Encoding]::GetEncoding("ISO-8859-1")
    $byteArray = $customEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33

This code example is similar to the previous ones but lets us choose a custom encoding through the GetEncoding() method. We passed the ISO-8859-1 encoding name as an argument to GetEncoding(); you can specify any supported encoding based on your project needs.
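
For "Hello World!" the ISO-8859-1 bytes are identical to the ASCII ones, so the effect of the custom encoding is easier to see with an accented character. The sketch below is an illustration under that assumption: the sample word and the $latin1, $utf8, and $mismatch names are not from the example above.

```powershell
$string = "caf$([char]0xE9)"   # "café"

$latin1 = [System.Text.Encoding]::GetEncoding("ISO-8859-1")
$utf8   = [System.Text.Encoding]::UTF8

$latin1Bytes = $latin1.GetBytes($string)   # é becomes the single byte 233
$utf8Bytes   = $utf8.GetBytes($string)     # é becomes the two bytes 195 169

Write-Host "ISO-8859-1: $($latin1Bytes -join ' ')"
Write-Host "UTF-8:      $($utf8Bytes -join ' ')"

# Decoding with the wrong encoding does not fail; it silently yields different text.
$mismatch = $latin1.GetString($utf8Bytes)
Write-Host "UTF-8 bytes read as ISO-8859-1: $mismatch"
```

The sender and the receiver therefore have to agree on the encoding, since nothing in the byte array itself records which one was used.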
Note: Use $byteArray.GetType() to get the data type of the byte array. Don’t forget to replace the $byteArray variable with your byte array variable name.
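
As a quick illustration of that type check, the following sketch prints the type name of the array and of one of its elements; it assumes nothing beyond the standard GetType() method and the -is operator.

```powershell
$byteArray = [System.Text.Encoding]::UTF8.GetBytes("Hello World!")

# GetType() reports the .NET type of the whole array.
$typeName = $byteArray.GetType().FullName
# The -is operator gives a simple true/false answer.
$isByteArray = $byteArray -is [byte[]]
# Each element is a single unsigned 8-bit value.
$elementType = $byteArray[0].GetType().FullName

Write-Host "Type name:    $typeName"
Write-Host "Is byte[]:    $isByteArray"
Write-Host "Element type: $elementType"
```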
Until now, we used various encoding formats to convert a string, but why are they important? We convert a string to a byte array for multiple reasons, including data transmission, encryption, hashing, and manipulating binary data. Here comes another important point: what should you do if you must send cryptographic data derived from a byte array in a text format? In that case, we can use the following code to hash the byte array and encode the hash as Base64 (String -> ByteArray -> Hash -> Base64).

    $string = "Hello World!"
    $asciiEncodingObj = [System.Text.Encoding]::ASCII
    $byteArray = $asciiEncodingObj.GetBytes($string)

    $sha = New-Object System.Security.Cryptography.SHA1CryptoServiceProvider
    # Hash the bytes, then overwrite $result with the Base64 text of that hash
    $result = $sha.ComputeHash($byteArray)
    $result = [System.Convert]::ToBase64String($result)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"
    Write-Host "Byte Array in Text Form: $result"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33
    Byte Array in Text Form: Lve95gjOVATpfV8EL5X4nxwjKHE=

After obtaining the $byteArray, we used the New-Object cmdlet to create a new SHA1CryptoServiceProvider object and stored its reference in the $sha variable. Then, we called the ComputeHash() method on $sha, which took $byteArray as a parameter and returned its SHA-1 hash as another byte array.
We saved this hash in the $result variable and passed it to the ToBase64String() method of the Convert class to get a Base64-encoded string. Note that we overwrote $result with the value returned by the ToBase64String() method. Finally, we used the Write-Host cmdlet to print customized output on the PowerShell console.
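
Hashes are also commonly exchanged as hexadecimal text rather than Base64. The sketch below is one possible variation, not the method used above: it obtains the hasher through SHA1.Create() instead of New-Object, keeps the hash and its text forms in separate variables ($hash, $base64, and $hex are illustrative names), and disposes the hasher when done.

```powershell
$string = "Hello World!"
$byteArray = [System.Text.Encoding]::ASCII.GetBytes($string)

# SHA1.Create() returns a SHA-1 implementation for the current platform.
$sha = [System.Security.Cryptography.SHA1]::Create()
try {
    $hash = $sha.ComputeHash($byteArray)
}
finally {
    $sha.Dispose()   # release the resources held by the hasher
}

# Two text representations of the same 20 hash bytes.
$base64 = [System.Convert]::ToBase64String($hash)
$hex = [System.BitConverter]::ToString($hash).Replace('-', '').ToLower()

Write-Host "Hash length (bytes): $($hash.Length)"
Write-Host "Base64: $base64"
Write-Host "Hex:    $hex"
```

Keeping the raw hash in its own variable avoids reusing one name for both a byte array and a string, which makes a mistyped variable name easier to spot.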