Character set

We can convert a Unicode character to a sequence of bytes and vice versa using an encoding scheme.
The java.nio.charset package provides classes to encode a CharBuffer into a ByteBuffer and to decode a ByteBuffer back into a CharBuffer.
An object of the Charset class represents the encoding scheme. The CharsetEncoder class performs the encoding. The CharsetDecoder class performs the decoding.
We can get an object of the Charset class using its forName() method by passing the name of the character set as its argument.
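As a minimal sketch of the lookup, the following code calls forName() and falls back to UTF-8 when the name is illegal or not supported by the JVM. The class name CharsetLookupDemo and the helper lookupOrUtf8() are hypothetical names used only for illustration, and falling back to UTF-8 is an assumption; an application may prefer to report the error instead.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

class CharsetLookupDemo {
    // Hypothetical helper: returns the named charset, or UTF-8 if the
    // name is syntactically illegal or the JVM does not support it.
    static Charset lookupOrUtf8(String name) {
        try {
            return Charset.forName(name);
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // Assumed fallback; StandardCharsets.UTF_8 is always available.
            return StandardCharsets.UTF_8;
        }
    }

    public static void main(String[] args) {
        System.out.println(lookupOrUtf8("US-ASCII"));        // prints US-ASCII
        System.out.println(lookupOrUtf8("no-such-charset")); // prints UTF-8
    }
}
```

For the charsets that every Java platform must support, the constants in java.nio.charset.StandardCharsets (Java 7 and later) avoid the lookup by name altogether.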
For simple encoding and decoding tasks, we can use the encode() and decode() methods of the Charset class.
The following code encodes the characters of the string Hello, stored in a character buffer, into bytes and then decodes those bytes back into characters using the UTF-8 encoding scheme.
Charset cs = Charset.forName("UTF-8");
CharBuffer cb = CharBuffer.wrap("Hello");
ByteBuffer encodedData = cs.encode(cb);
CharBuffer decodedData = cs.decode(encodedData);
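The snippet above can be turned into a small runnable program. This is only a sketch: the class name RoundTripDemo and the helpers roundTrip() and encodedLength() are hypothetical names chosen for illustration. It also shows that a non-ASCII character occupies more than one byte in UTF-8.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;

class RoundTripDemo {
    // Encodes the text to bytes and decodes the bytes back to a string.
    static String roundTrip(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        ByteBuffer encodedData = cs.encode(CharBuffer.wrap(text));
        CharBuffer decodedData = cs.decode(encodedData);
        return decodedData.toString();
    }

    // Returns the number of bytes the text occupies in the given charset.
    static int encodedLength(String text, String charsetName) {
        // The returned buffer is ready for reading, so remaining()
        // equals the number of encoded bytes.
        return Charset.forName(charsetName).encode(text).remaining();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("Hello", "UTF-8"));          // prints Hello
        System.out.println(encodedLength("Hello", "UTF-8"));      // prints 5
        System.out.println(encodedLength("h\u00e9llo", "UTF-8")); // prints 6
    }
}
```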
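The CharsetEncoder and CharsetDecoder classes can accept the input one chunk at a time, which is useful when the data arrives in pieces.
The encode() and decode() methods of the Charset class, by contrast, process the whole input buffer in a single call and return the encoded or decoded buffer to us.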
The following code shows how to get encoder and decoder objects from a Charset object.
Charset cs = Charset.forName("UTF-8");
CharsetEncoder encoder = cs.newEncoder();
CharsetDecoder decoder = cs.newDecoder();
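To illustrate chunk-wise processing, here is a minimal sketch that encodes a string with a CharsetEncoder and then feeds the bytes to a CharsetDecoder a few bytes at a time. The class name EncoderDecoderDemo and the helpers encodeAll() and decodeInChunks() are hypothetical names, and the sketch assumes UTF-8 and a positive chunk size. Bytes of a character that is split across two chunks are carried over to the next call.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

class EncoderDecoderDemo {
    static final Charset CS = Charset.forName("UTF-8");

    // Encodes the whole string using the encoder's convenience method.
    static byte[] encodeAll(String text) throws CharacterCodingException {
        CharsetEncoder encoder = CS.newEncoder();
        ByteBuffer bb = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[bb.remaining()];
        bb.get(bytes);
        return bytes;
    }

    // Decodes the bytes chunkSize bytes at a time (chunkSize must be positive).
    static String decodeInChunks(byte[] data, int chunkSize) throws CharacterCodingException {
        if (data.length == 0) {
            return "";
        }
        CharsetDecoder decoder = CS.newDecoder();
        // Extra room for the bytes of an incomplete character left over from the previous chunk.
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 4);
        // UTF-8 never produces more chars than it consumes bytes.
        CharBuffer out = CharBuffer.allocate(data.length);
        for (int off = 0; off < data.length; off += chunkSize) {
            int len = Math.min(chunkSize, data.length - off);
            in.put(data, off, len);
            in.flip();
            boolean last = off + len >= data.length;
            CoderResult result = decoder.decode(in, out, last);
            if (result.isError()) {
                result.throwException(); // malformed or unmappable input
            }
            in.compact(); // keep undecoded bytes for the next chunk
        }
        decoder.flush(out);
        out.flip();
        return out.toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] data = encodeAll("h\u00e9llo"); // the accented letter takes two bytes
        System.out.println(data.length);                              // prints 6
        System.out.println(decodeInChunks(data, 2).equals("h\u00e9llo")); // prints true
    }
}
```

A decoder obtained from newDecoder() reports malformed input by default, so decodeInChunks() throws a CharacterCodingException for a truncated sequence, whereas Charset.decode() silently substitutes a replacement character.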
The following code demonstrates how to list all character sets supported by a JVM.
import java.nio.charset.Charset;
import java.util.Map;
import java.util.Set;

public class Main {
  public static void main(String[] args) {
    // Maps each canonical charset name to its Charset object
    Map<String, Charset> map = Charset.availableCharsets();
    Set<String> keys = map.keySet();
    System.out.println("Available Character Set Count: " + keys.size());
    for (String charsetName : keys) {
      System.out.println(charsetName);
    }
  }
}
