Java Beginner's Pitfall Log - new String Garbled Text (Part 2) - Alibaba Cloud Developer Community

2022-05-30

The likely cause is an encoding problem, so let's dig into the source code and take a look:

 /**
 * Constructs a new {@code String} by decoding the specified array of bytes
 * using the platform's default charset. The length of the new {@code
 * String} is a function of the charset, and hence may not be equal to the
 * length of the byte array.
 *
 * <p> The behavior of this constructor when the given bytes are not valid
 * in the default charset is unspecified. The {@link
 * java.nio.charset.CharsetDecoder} class should be used when more control
 * over the decoding process is required.
 *
 * @param bytes
 * The bytes to be decoded into characters
 *
 * @since JDK1.1
 */
 public String(byte bytes[]) {
  this(bytes, 0, bytes.length);
 }

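In plain terms: this constructor builds a new String by decoding the given byte array with the platform's default charset. The length of the new String is a function of the charset, so it may not equal the length of the byte array. When the given bytes are not all valid in the default charset, the constructor's behavior is unspecified.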
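To make the "length is a function of the charset" point concrete, here is a minimal sketch. The class name `DecodeLengthDemo` and the helper `decodedLength` are made up for illustration and are not part of the original article. It decodes the same two bytes under two different charsets and compares the resulting lengths.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original article.
public class DecodeLengthDemo {
    // Decodes the bytes with the given charset and returns the string length.
    static int decodedLength(byte[] bytes, Charset charset) {
        return new String(bytes, charset).length();
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the two-byte UTF-8 encoding of U+00E9.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        // The default charset varies by platform and JVM settings.
        System.out.println("default charset: " + Charset.defaultCharset());

        // UTF-8 combines both bytes into one char: length 1.
        System.out.println("UTF-8: " + decodedLength(bytes, StandardCharsets.UTF_8));

        // ISO-8859-1 maps each byte to its own char: length 2.
        System.out.println("ISO-8859-1: " + decodedLength(bytes, StandardCharsets.ISO_8859_1));
    }
}
```

Which of these results `new String(bytes)` matches depends entirely on the machine it runs on, and that is why the original program behaved unpredictably.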

The culprit is the String(byte[]) constructor, which silently depends on the platform's default charset.
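The Javadoc quoted above points to `java.nio.charset.CharsetDecoder` for cases that need more control over decoding. The following is a rough sketch of what that could look like, assuming UTF-8 input. The class name `StrictDecodeDemo` and the helper `decodeStrict` are hypothetical names chosen for this example. The decoder is configured to report invalid bytes as an exception rather than substituting characters.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original article.
public class StrictDecodeDemo {
    // Decodes bytes as UTF-8 and throws if any byte sequence is invalid.
    static String decodeStrict(byte[] bytes) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    public static void main(String[] args) {
        // 0xFF can never appear in well-formed UTF-8.
        byte[] invalid = {(byte) 0xFF, (byte) 0xFE};
        try {
            System.out.println(decodeStrict(invalid));
        } catch (CharacterCodingException e) {
            System.out.println("invalid input: " + e);
        }
    }
}
```

Failing loudly like this is one possible design choice. Whether it suits a given program depends on whether bad input should be rejected or tolerated.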

Fixing the Problem

Xiaobai admitted the mistake, and Xiao T happily filed a bug. Now Xiaobai has to fix it.

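import java.io.UnsupportedEncodingException;
public class Main {
  public static void main(String[] args) throws UnsupportedEncodingException {
    // Fill the array with every possible byte value, 0x00 through 0xFF.
    byte[] bytes = new byte[256];
    for (int i = 0; i < 256; i++) {
      bytes[i] = (byte) i;
    }
    // Name the charset explicitly instead of relying on the platform default.
    String str = new String(bytes, "ISO-8859-1");
    for (int i = 0, n = str.length(); i < n; i++) {
      System.out.print((int) str.charAt(i) + " ");
    }
  }
}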

With the charset specified, Xiao T and Xiaobai can happily play together again.
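As a possible variation on the fix, on Java 7 and later the charset can be passed as a `java.nio.charset.StandardCharsets` constant instead of a string name. This avoids the checked `UnsupportedEncodingException` and rules out a misspelled charset name. The sketch below uses the made-up names `Latin1Demo` and `decodeLatin1`, which are not from the original article.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original article.
public class Latin1Demo {
    // Decodes bytes as ISO-8859-1, where each byte maps to exactly one char.
    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Fill the array with every possible byte value.
        byte[] bytes = new byte[256];
        for (int i = 0; i < 256; i++) {
            bytes[i] = (byte) i;
        }

        String str = decodeLatin1(bytes);
        for (int i = 0, n = str.length(); i < n; i++) {
            System.out.print((int) str.charAt(i) + " ");
        }
        System.out.println();
    }
}
```

Because ISO-8859-1 maps byte value k to the character with code point k, the program prints the integers 0 through 255 regardless of the platform's default charset.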

Summary

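Whenever you convert a byte sequence into a String, you are using some charset, whether or not you specify it explicitly. If you want your program to behave predictably, name the charset explicitly every time you use one.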
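The same advice applies in the other direction, when a String is turned back into bytes. Here is a small sketch of a round trip that names the charset on both sides. The class `RoundTripDemo` and its helpers `encode` and `decode` are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not from the original article.
public class RoundTripDemo {
    // Encodes a string to bytes with an explicitly named charset.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes bytes with the same explicitly named charset.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Two CJK characters, written as escapes so the source file encoding does not matter.
        String original = "\u4e2d\u6587";
        byte[] bytes = encode(original);

        // Same charset on both sides: the text survives the round trip.
        System.out.println(original.equals(decode(bytes)));

        // A different charset on the decoding side: six unrelated chars, i.e. garbled text.
        String garbled = new String(bytes, StandardCharsets.ISO_8859_1);
        System.out.println(garbled.length());
    }
}
```

The mismatch case at the end produces exactly the kind of garbled text this series is about: the bytes are intact, but they are read with the wrong charset.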
