Using System.Text.Encoding Class
The System.Text.Encoding class converts a string to a byte array in PowerShell. You can use it with a specific, default, or custom encoding format. Let’s see how to use each of them.
Use System.Text.Encoding Class with a Particular Encoding
Use the System.Text.Encoding class with a specific encoding to convert a string to a byte array in PowerShell.
    $string = "Hello World!"

    $asciiEncodingObj = [System.Text.Encoding]::ASCII
    $byteArray = $asciiEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33

First, we initialized the $string variable with the "Hello World!" string. Then, we retrieved an ASCII encoding object and stored it in the $asciiEncodingObj variable. Note that the :: operator accesses the static ASCII property of the System.Text.Encoding class.
Next, we invoked the GetBytes() method of the ASCII object ($asciiEncodingObj) from the previous step. The GetBytes() method took $string as a parameter, converted it to a byte array, and returned it; we stored the result in the $byteArray variable. Finally, we used the Write-Host cmdlet, which displays customized output on the console, to print both values.
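The space-separated list in the output above comes from PowerShell's string interpolation, which joins array elements with a space by default. If you need a different text form of the same bytes, something like the following minimal sketch may help; the variable names $commaSeparated and $hexString are only illustrative.

```powershell
$string = "Hello World!"
$byteArray = [System.Text.Encoding]::ASCII.GetBytes($string)

# Join the decimal byte values with a separator of our choosing.
$commaSeparated = $byteArray -join ','

# Format each byte as two uppercase hexadecimal digits.
$hexString = ($byteArray | ForEach-Object { $_.ToString('X2') }) -join ' '

Write-Host "Comma-separated: $commaSeparated"
Write-Host "Hexadecimal:     $hexString"
```

For "Hello World!", the hexadecimal form should read 48 65 6C 6C 6F 20 57 6F 72 6C 64 21.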
At this point, you may ask why we used the ASCII encoding format. It is because $string contained only English letters and special characters that the ASCII encoding format supports. Remember, ASCII is a 7-bit encoding that represents each character using a single byte. Use it only where you need English letters, digits, and basic punctuation, because it doesn’t support non-English characters.
Let’s take another example where $string contains some non-English characters that the ASCII encoding doesn’t support; see the following example.
    $string = "Hello 世界!"

    $asciiEncodingObj = [System.Text.Encoding]::ASCII
    $byteArray = $asciiEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello 世界!
    Byte Array is: 72 101 108 108 111 32 63 63 33

In the above example, we initialized $string with a combination of English and non-English characters, Hello and 世界! (world! in English), and used the ASCII encoding format to convert $string to a byte array. Did you notice that PowerShell raised no error or exception, yet the returned output is not correct? Why?
PowerShell replaced each non-ASCII character with a ? character, so the value 63 in the byte array is the ASCII value of ?, not the value of the original non-English character. Non-English characters have no ASCII value at all; what they do have is a Unicode code point. Let’s use the following code to see the code points of the non-English characters.
    $string = "Hello 世界!"

    for ($i = 0; $i -lt $string.Length; $i++) {
        $character = $string[$i]
        $codePoint = [int][char]$character

        # Anything above 127 falls outside the ASCII range.
        if ($codePoint -gt 127) {
            Write-Host "Non-ASCII character found: $character (Code Point $codePoint)"
        }
    }

Output:

    Non-ASCII character found: 世 (Code Point 19990)
    Non-ASCII character found: 界 (Code Point 30028)

We found two non-ASCII characters: 世 and 界.
Standard ASCII characters have values from 0 to 127.
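If silently receiving ? characters is not acceptable, one option is to request an ASCII encoding that throws an exception instead. The following is a minimal sketch, assuming you prefer a hard failure; it relies on the GetEncoding() overload that accepts fallback objects, and the names $strictAscii and $encodingFailed are only illustrative.

```powershell
$string = "Hello 世界!"

# Ask for an ASCII encoding that throws instead of substituting '?'.
$encoderFallback = [System.Text.EncoderFallback]::ExceptionFallback
$decoderFallback = [System.Text.DecoderFallback]::ExceptionFallback
$strictAscii = [System.Text.Encoding]::GetEncoding("us-ascii", $encoderFallback, $decoderFallback)

$encodingFailed = $false
try {
    $byteArray = $strictAscii.GetBytes($string)
    Write-Host "Byte Array is: $byteArray"
}
catch {
    # PowerShell wraps the underlying .NET exception in its own error record.
    $encodingFailed = $true
    Write-Host "Could not encode the string as ASCII: $($_.Exception.Message)"
}
```

With this string, the catch block should run, because 世 and 界 cannot be represented in ASCII.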
What should we do if a string combines English and non-English characters? We must use a different encoding format, such as UTF-8 or UTF-16. Let’s convert the "Hello 世界!" string to a byte array using the UTF8 encoding format. See the following example.
    $string = "Hello 世界!"

    $utf8EncodingObj = [System.Text.Encoding]::UTF8
    $byteArray = $utf8EncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello 世界!
    Byte Array is: 72 101 108 108 111 32 228 184 150 231 149 140 33

In the above output, the first six bytes in $byteArray correspond to the ASCII characters of Hello plus the trailing space, while the next six bytes are the UTF8 encoding of the 世界 characters. If you decode this byte array, you will notice that the byte sequence 228 184 150 corresponds to the 世 character, the byte sequence 231 149 140 corresponds to the 界 character, and the last byte, 33, corresponds to the ! character.
Let’s decode the $byteArray back into a string in the following example:
    $string = "Hello 世界!"

    $utf8EncodingObj = [System.Text.Encoding]::UTF8
    $byteArray = $utf8EncodingObj.GetBytes($string)
    $decodedByteArray = $utf8EncodingObj.GetString($byteArray)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"
    Write-Host "Decoded Byte Array is: $decodedByteArray"

Output:

    Original String is: Hello 世界!
    Byte Array is: 72 101 108 108 111 32 228 184 150 231 149 140 33
    Decoded Byte Array is: Hello 世界!

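UTF-16 was mentioned above as the other option for mixed English and non-English text. As a minimal sketch, the same round trip can be done with the Unicode property, which is the name .NET uses for little-endian UTF-16; the variable names here are only illustrative.

```powershell
$string = "Hello 世界!"

# [System.Text.Encoding]::Unicode is .NET's name for UTF-16 little-endian.
$utf16EncodingObj = [System.Text.Encoding]::Unicode
$byteArray = $utf16EncodingObj.GetBytes($string)
$decodedString = $utf16EncodingObj.GetString($byteArray)

Write-Host "Byte Array is: $byteArray"
Write-Host "Byte Count is: $($byteArray.Length)"
Write-Host "Decoded String is: $decodedString"
```

Every character in this string takes two bytes in UTF-16, so the nine characters produce 18 bytes; for example, H becomes 72 0 and 世 becomes 22 78.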
Use System.Text.Encoding Class with a Default Encoding
Use the System.Text.Encoding class with Default encoding to convert a string to a byte array in PowerShell.
    $string = "Hello World!"

    $defaultEncodingObj = [System.Text.Encoding]::Default
    $byteArray = $defaultEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33

This time, we used the Default encoding to convert $string to a byte array in PowerShell. What Default resolves to depends on the PowerShell edition and the system’s regional settings: in Windows PowerShell, which runs on .NET Framework, it is the system’s active ANSI code page, such as Windows-1252; in PowerShell 7, which runs on modern .NET, it is UTF8.
When Default resolves to an ANSI code page, it cannot encode characters outside that code page; in that case, explicitly using UTF8 or UTF16 will encode the characters properly.
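Because the Default encoding differs between systems, it can be worth checking what it resolves to on your machine before relying on it. A minimal sketch, using the EncodingName, WebName, and CodePage properties of the encoding object:

```powershell
$defaultEncodingObj = [System.Text.Encoding]::Default

# These values depend on the PowerShell edition and regional settings.
Write-Host "Encoding Name: $($defaultEncodingObj.EncodingName)"
Write-Host "Web Name:      $($defaultEncodingObj.WebName)"
Write-Host "Code Page:     $($defaultEncodingObj.CodePage)"
```

The printed values will vary from one system to another, which is exactly why an explicit encoding is the safer choice for data that leaves your machine.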
Use System.Text.Encoding Class with Custom Encoding
Use the System.Text.Encoding class with custom encoding to convert a string to a byte array in PowerShell.
    $string = "Hello World!"

    # Look up an encoding by name instead of using a predefined property.
    $customEncodingObj = [System.Text.Encoding]::GetEncoding("ISO-8859-1")
    $byteArray = $customEncodingObj.GetBytes($string)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33

This code example is similar to the previous ones but selects a custom encoding using the GetEncoding() method. We passed the ISO-8859-1 encoding name as an argument to the GetEncoding() method; you can specify any supported encoding based on your project needs.
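The choice of encoding only becomes visible once the string contains a character outside the ASCII range. As a minimal sketch, the following compares ISO-8859-1 with UTF8 for the word café; the example string and variable names are our own, and the é is built from its code point (0xE9) so the script does not depend on how the file itself is saved.

```powershell
# "café": the last character is U+00E9, built from its code point.
$string = "caf" + [char]0xE9

$latin1EncodingObj = [System.Text.Encoding]::GetEncoding("ISO-8859-1")
$utf8EncodingObj = [System.Text.Encoding]::UTF8

$latin1Bytes = $latin1EncodingObj.GetBytes($string)
$utf8Bytes = $utf8EncodingObj.GetBytes($string)

Write-Host "ISO-8859-1 Byte Array is: $latin1Bytes"
Write-Host "UTF-8 Byte Array is: $utf8Bytes"
```

ISO-8859-1 stores é as the single byte 233, while UTF8 needs the two bytes 195 169, so the same string yields four bytes in one encoding and five in the other.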
Use $byteArray.GetType() to get the data type of the byte array. Don’t forget to replace the $byteArray variable with your byte array variable name.
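As a quick illustration of that tip, the following minimal sketch inspects the value returned by GetBytes(); the -is check and the Length property are optional extras we added for convenience.

```powershell
$string = "Hello World!"
$byteArray = [System.Text.Encoding]::UTF8.GetBytes($string)

# GetType() describes the object; -is tests it against a specific type.
$typeName = $byteArray.GetType().FullName
$isByteArray = $byteArray -is [byte[]]

Write-Host "Type Name: $typeName"
Write-Host "Is Byte Array: $isByteArray"
Write-Host "Length: $($byteArray.Length)"
```

The type name should be System.Byte[], and the length is 12 because each character of this string takes one byte in UTF8.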
Until now, we have used various encoding formats to convert a string, but why is this conversion important? We convert a string to a byte array for multiple reasons, including data transmission, encryption, hashing, and manipulating binary data. Here comes another important point: what should you do if you must send the hash of a byte array in a text format? In that case, we can use the following code to hash the byte array and Base64-encode the result (String->ByteArray->Hash->Base64).
    $string = "Hello World!"

    $asciiEncodingObj = [System.Text.Encoding]::ASCII
    $byteArray = $asciiEncodingObj.GetBytes($string)

    # Hash the bytes with SHA-1, then Base64-encode the hash.
    $sha = New-Object System.Security.Cryptography.SHA1CryptoServiceProvider
    $result = $sha.ComputeHash($byteArray)
    $result = [System.Convert]::ToBase64String($result)

    Write-Host "Original String is: $string"
    Write-Host "Byte Array is: $byteArray"
    Write-Host "Base64-Encoded SHA-1 Hash: $result"

Output:

    Original String is: Hello World!
    Byte Array is: 72 101 108 108 111 32 87 111 114 108 100 33
    Base64-Encoded SHA-1 Hash: Lve95gjOVATpfV8EL5X4nxwjKHE=

After obtaining $byteArray, we used the New-Object cmdlet to create a new SHA1CryptoServiceProvider object and stored its reference in the $sha variable. Then, we called the ComputeHash() method on $sha, which took $byteArray as a parameter and returned its SHA-1 hash as a byte array.
We saved this hash in the $result variable and passed it to the ToBase64String() method of the Convert class to get a Base64-encoded string. Note that we overwrote $result with the value returned by the ToBase64String() method. Finally, we used the Write-Host cmdlet to print customized output on the PowerShell console.
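Base64 is not the only text form for a hash, and the receiver usually needs the raw bytes back. The following minimal sketch, which uses SHA1.Create() rather than the class shown above and variable names of our own choosing, prints the same hash as hexadecimal and decodes the Base64 text back into bytes with FromBase64String().

```powershell
$string = "Hello World!"
$byteArray = [System.Text.Encoding]::ASCII.GetBytes($string)

# SHA1.Create() returns the platform's SHA-1 implementation.
$sha = [System.Security.Cryptography.SHA1]::Create()
$hash = $sha.ComputeHash($byteArray)
$sha.Dispose()

# Two common text forms of the same 20 hash bytes.
$base64 = [System.Convert]::ToBase64String($hash)
$hex = [System.BitConverter]::ToString($hash).Replace('-', '').ToLower()

# Decode the Base64 text back into bytes and compare with the original hash.
$roundTripped = [System.Convert]::FromBase64String($base64)
$roundTripOk = ($roundTripped -join ',') -eq ($hash -join ',')

Write-Host "Base64: $base64"
Write-Host "Hex:    $hex"
Write-Host "Round trip matches: $roundTripOk"
```

The Base64 value should match the one printed earlier, Lve95gjOVATpfV8EL5X4nxwjKHE=, and the hexadecimal form should be 2ef7bde608ce5404e97d5f042f95f89f1c232871.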