Converting Between Strings (Unicode) and Other Character Set Encodings

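Many network protocols and file formats store their characters in a byte-oriented character set such as ISO-8859-1 (ISO-Latin-1). Java, however, represents characters internally as Unicode: a String is a sequence of UTF-16 code units. Bytes in another character set must therefore be decoded before they can be used as a string, and a string must be encoded before it is written out.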

This example demonstrates how to convert ISO-8859-1 encoded bytes in a ByteBuffer to a string in a CharBuffer and vice versa.

    // Create the encoder and decoder for ISO-8859-1
    Charset charset = Charset.forName("ISO-8859-1");
    CharsetDecoder decoder = charset.newDecoder();
    CharsetEncoder encoder = charset.newEncoder();

    try {
        // Convert a string to ISO-8859-1 bytes in a ByteBuffer.
        // The new ByteBuffer is ready to be read.
        ByteBuffer bbuf = encoder.encode(CharBuffer.wrap("a string"));

        // Convert ISO-8859-1 bytes in a ByteBuffer to a CharBuffer and then to a string.
        // The new CharBuffer is ready to be read.
        CharBuffer cbuf = decoder.decode(bbuf);
        String s = cbuf.toString();
    } catch (CharacterCodingException e) {
        // Thrown if the input is malformed or contains a character
        // that the charset cannot represent
    }

In the example above, the encoding and decoding methods each allocated a new buffer to hold the result: encode() returned a new ByteBuffer and decode() returned a new CharBuffer. Moreover, a ByteBuffer allocated this way is non-direct (see Determining If a ByteBuffer Is Direct). The encoder and decoder also provide methods that write into a supplied buffer rather than create one. Here's an example that uses these methods:

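    // Create a direct ByteBuffer.
    // This buffer will be used to send and receive data from channels.
    ByteBuffer bbuf = ByteBuffer.allocateDirect(1024);

    // Create a non-direct CharBuffer, fill it with the characters to encode,
    // and flip it so that it is ready to be read
    CharBuffer cbuf = CharBuffer.allocate(1024);
    cbuf.put("a string");
    cbuf.flip();

    // Convert characters in cbuf to bytes in bbuf.
    // (false means more input may follow; pass true for the final chunk)
    encoder.encode(cbuf, bbuf, false);

    // Flip bbuf before reading from it
    bbuf.flip();

    // Clear cbuf so that it has room for the decoded characters
    cbuf.clear();

    // Convert bytes in bbuf to characters in cbuf
    decoder.decode(bbuf, cbuf, false);

    // Flip cbuf before reading from it
    cbuf.flip();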
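
The snippet above ignores the CoderResult that the three-argument encode() and decode() methods return. The following is a minimal, self-contained sketch of the same round trip that checks those results. The class name SuppliedBufferRoundTrip and the helper roundTrip() are made up for this illustration, and the sketch assumes the whole input fits in the supplied buffers, so it passes true for endOfInput and then calls flush().

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;

class SuppliedBufferRoundTrip {

    // Hypothetical helper: encodes text into the supplied ByteBuffer,
    // then decodes those bytes into the supplied CharBuffer.
    static String roundTrip(String text, ByteBuffer bbuf, CharBuffer cbuf)
            throws CharacterCodingException {
        Charset charset = Charset.forName("ISO-8859-1");
        CharsetEncoder encoder = charset.newEncoder();
        CharsetDecoder decoder = charset.newDecoder();

        // Encode: true means this is the only (and therefore last) chunk of input
        bbuf.clear();
        CoderResult result = encoder.encode(CharBuffer.wrap(text), bbuf, true);
        if (result.isUnderflow()) {
            result = encoder.flush(bbuf);
        }
        if (!result.isUnderflow()) {
            // Overflow, malformed input, or an unmappable character
            result.throwException();
        }
        bbuf.flip(); // make the encoded bytes readable

        // Decode the bytes back into characters
        cbuf.clear();
        result = decoder.decode(bbuf, cbuf, true);
        if (result.isUnderflow()) {
            result = decoder.flush(cbuf);
        }
        if (!result.isUnderflow()) {
            result.throwException();
        }
        cbuf.flip(); // make the decoded characters readable
        return cbuf.toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        ByteBuffer bbuf = ByteBuffer.allocateDirect(1024);
        CharBuffer cbuf = CharBuffer.allocate(1024);
        System.out.println(roundTrip("caf\u00e9", bbuf, cbuf));
    }
}
```

If the data arrives in several chunks, the usual pattern is to pass false for every chunk except the last and to drain the output buffer whenever the result reports an overflow.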

Comments

31 Jan 2010 - 9:07am by Anonymous (not verified)

"Java's native character encoding is Unicode."

I'm new to Java, yet I think this statement is wrong. I think Java's default encoding is the encoding used by the system. For instance, I'm on a Mac and what I get when I print

System.getProperty("file.encoding")

is

MacRoman

2 Feb 2010 - 4:24am by Simon (not verified)

Depends on what you mean by "native encoding".
Strings are encoded in UTF-16BE in memory. Always.

20 Feb 2010 - 4:33pm by RC (not verified)

The default charset is determined by the OS.
The statement "Java's native character encoding is UTF-16." is correct.
Typically it's a one-to-one mapping between the charset and the encoding.
See the java.nio.charset.Charset javadoc for a comprehensive description.

28 Jul 2010 - 4:19am by Anonymous (not verified)

thanks

5 Sep 2010 - 6:40pm by Anonymous (not verified)

This is an extremely helpful post. Thanks. The only question I have is what happens to the characters that were converted. I am using this method to remove special characters that my database didn't like, characters that are more commonly used in Europe.
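
On the question above: with a freshly created encoder, encode(CharBuffer) throws a CharacterCodingException when it meets a character that ISO-8859-1 cannot represent, and the article's first example silently swallows that exception. Characters that ISO-8859-1 does cover, such as é, are kept as single bytes. The sketch below shows one way to substitute the encoder's replacement byte instead of failing. The class name UnmappableDemo and its helper methods are invented for this illustration, and replacing with '?' assumes the encoder's default replacement has not been changed.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class UnmappableDemo {

    // Hypothetical helper: encodes to ISO-8859-1 and substitutes the
    // encoder's replacement byte for anything the charset cannot represent.
    static byte[] encodeLossy(String text) throws CharacterCodingException {
        CharsetEncoder encoder = Charset.forName("ISO-8859-1").newEncoder()
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .onMalformedInput(CodingErrorAction.REPLACE);
        ByteBuffer bbuf = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[bbuf.remaining()];
        bbuf.get(bytes);
        return bytes;
    }

    // Hypothetical helper: reports whether every character of text
    // can be represented in ISO-8859-1.
    static boolean canEncode(String text) {
        return Charset.forName("ISO-8859-1").newEncoder().canEncode(text);
    }

    public static void main(String[] args) throws CharacterCodingException {
        String text = "caf\u00e9 \u20ac5"; // e-acute is in ISO-8859-1, the euro sign is not

        System.out.println("fully encodable: " + canEncode(text));

        byte[] bytes = encodeLossy(text);
        // Decode again to see what survived the conversion
        System.out.println(new String(bytes, StandardCharsets.ISO_8859_1));

        try {
            // The default action is REPORT, so this call throws
            Charset.forName("ISO-8859-1").newEncoder().encode(CharBuffer.wrap(text));
        } catch (CharacterCodingException e) {
            System.out.println("strict encoder failed: " + e);
        }
    }
}
```

Note that replacement is lossy: once the euro sign has been turned into '?', the original character cannot be recovered from the bytes.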
