Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/weixin_40254498/article/details/82464207
String
- In Java, strings are objects.
- Java provides the String class for creating and manipulating strings.
Definition
The class is declared final, so it cannot be subclassed. It also implements:
- java.io.Serializable
- Comparable
- CharSequence
public final class String
implements java.io.Serializable, Comparable<String>, CharSequence { }
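As a small illustration of what these interfaces give callers, the sketch below passes a String wherever a CharSequence is accepted and sorts strings through their Comparable ordering. The class and method names (StringTypeDemo, countDigits, sortedCopy) are made up for this example.

```java
import java.util.Arrays;

final class StringTypeDemo {
    // Accepts any CharSequence, so String and StringBuilder both work.
    static int countDigits(CharSequence text) {
        int digits = 0;
        for (int i = 0; i < text.length(); i++) {
            if (Character.isDigit(text.charAt(i))) {
                digits++;
            }
        }
        return digits;
    }

    // Comparable<String> gives strings a natural (lexicographic) order.
    static String[] sortedCopy(String... words) {
        String[] copy = words.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(countDigits("a1b22"));                    // 3
        System.out.println(countDigits(new StringBuilder("2024")));  // 4
        System.out.println(Arrays.toString(sortedCopy("pear", "apple", "fig"))); // [apple, fig, pear]
    }
}
```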
Fields
/** The value is used for character storage.
* String is implemented on top of a char[], which stores its characters.
*/
private final char value[];
/** Cache the hash code for the string
* The cached hash value.
*/
private int hash; // Default to 0
/** use serialVersionUID from JDK 1.0.2 for interoperability
* Java serialization checks a class's serialVersionUID at run time to verify version compatibility.
*/
private static final long serialVersionUID = -6849794470754667710L;
/**
* Class String is special cased within the Serialization Stream Protocol.
* A String instance is written into an ObjectOutputStream according to
* <a href="{@docRoot}/../platform/serialization/spec/output.html">
* Object Serialization Specification, Section 6.2, "Stream Elements"</a>
*/
private static final ObjectStreamField[] serialPersistentFields =
new ObjectStreamField[0];
Constructors
String has more than a dozen constructors; the most commonly used are shown below:
/**
* Creates a String object from another String.
* Initializes a newly created {@code String} object so that it represents
* the same sequence of characters as the argument; in other words, the
* newly created string is a copy of the argument string. Unless an
* explicit copy of {@code original} is needed, use of this constructor is
* unnecessary since Strings are immutable.
*
* @param original
* A {@code String}
*/
public String(String original) {
this.value = original.value;
this.hash = original.hash;
}
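A minimal sketch of what this constructor means for callers: the copy is a distinct object with the same contents, which is why `==` and `equals` give different answers. StringCopyDemo and explicitCopy are illustrative names, not part of the JDK.

```java
final class StringCopyDemo {
    // Forces a new String object that has the same characters as the argument.
    static String explicitCopy(String original) {
        return new String(original);
    }

    public static void main(String[] args) {
        String literal = "hello";
        String copy = explicitCopy(literal);

        System.out.println(literal == copy);          // false: two different objects
        System.out.println(literal.equals(copy));     // true: same character sequence
        System.out.println(literal == copy.intern()); // true: intern() returns the pooled literal
    }
}
```

Because strings are immutable, the explicit copy is rarely needed, as the Javadoc above already notes.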
/**
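* Creates a String object from a byte array.
* The bytes are decoded with the platform's default charset, but other overloads let you specify the charset explicitly.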
* Constructs a new {@code String} by decoding the specified array of bytes
* using the platform's default charset. The length of the new {@code
* String} is a function of the charset, and hence may not be equal to the
* length of the byte array.
*
* <p> The behavior of this constructor when the given bytes are not valid
* in the default charset is unspecified. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes The bytes to be decoded into characters
* @since JDK1.1
*/
public String(byte bytes[]) {
this(bytes, 0, bytes.length);
}
/**
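* In Java, a String instance holds a char[] whose elements are stored as Unicode (UTF-16 code units).
* String and char are the in-memory form; byte is the serialized form used for network transfer and storage.
* Many transfer and storage operations therefore have to convert between byte[] and String,
* so String provides a family of overloaded constructors that turn a byte array into a String.
* Any conversion between byte[] and String has to take the character encoding into account.
* For example:
* String(byte bytes[], int offset, int length, Charset charset)
* String(byte bytes[], String charsetName)
* String(byte bytes[], int offset, int length, String charsetName)
* and so on
* String(byte[] bytes, Charset charset) decodes the given byte array with charset
* into a Unicode char[] and constructs a new String from it.
*
* The constructor below lets you specify the encoding of the byte array.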
* Constructs a new {@code String} by decoding the specified array of
* bytes using the specified {@linkplain java.nio.charset.Charset charset}.
* The length of the new {@code String} is a function of the charset, and
* hence may not be equal to the length of the byte array.
*
* <p> This method always replaces malformed-input and unmappable-character
* sequences with this charset's default replacement string. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
*
* @param bytes
* The bytes to be decoded into characters
*
* @param charset
* The {@linkplain java.nio.charset.Charset charset} to be used to
* decode the {@code bytes}
*
* @since 1.6
*/
public String(byte bytes[], Charset charset) {
this(bytes, 0, bytes.length, charset);
}
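To make the encoding point concrete, here is a small sketch that decodes the same three bytes with two different charsets. The class and method names are invented for the example; the byte values are the UTF-8 encoding of U+4E2D.

```java
import java.nio.charset.StandardCharsets;

final class DecodeDemo {
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    static String decodeLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The UTF-8 encoding of the single character U+4E2D.
        byte[] utf8 = {(byte) 0xE4, (byte) 0xB8, (byte) 0xAD};

        String right = decodeUtf8(utf8);
        String wrong = decodeLatin1(utf8);

        System.out.println(right.length()); // 1: decoded with the charset that produced the bytes
        System.out.println(wrong.length()); // 3: each byte became its own (garbled) character
    }
}
```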
/**
* Creates a String object from a char array.
* Allocates a new {@code String} so that it represents the sequence of
* characters currently contained in the character array argument. The
* contents of the character array are copied; subsequent modification of
* the character array does not affect the newly created string.
*
* @param value
* The initial value of the string
*/
public String(char value[]) {
this.value = Arrays.copyOf(value, value.length);
}
/**
* Creates a String object from a StringBuffer.
* Allocates a new string that contains the sequence of characters
* currently contained in the string buffer argument. The contents of the
* string buffer are copied; subsequent modification of the string buffer
* does not affect the newly created string.
*
* @param buffer
* A {@code StringBuffer}
*/
public String(StringBuffer buffer) {
synchronized(buffer) {
this.value = Arrays.copyOf(buffer.getValue(), buffer.length());
}
}
/**
* Creates a String object from a StringBuilder.
* Allocates a new string that contains the sequence of characters
* currently contained in the string builder argument. The contents of the
* string builder are copied; subsequent modification of the string builder
* does not affect the newly created string.
*
* <p> This constructor is provided to ease migration to {@code
* StringBuilder}. Obtaining a string from a string builder via the {@code
* toString} method is likely to run faster and is generally preferred.
*
* @param builder
* A {@code StringBuilder}
*
* @since 1.5
*/
public String(StringBuilder builder) {
this.value = Arrays.copyOf(builder.getValue(), builder.length());
}
/*
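* This constructor is package-private (not protected); since String cannot be subclassed, it is only used internally.
* The second parameter is essentially unused and is always expected to be true.
* As the code shows, the array is referenced directly rather than copied, which improves performance and saves memory.
* It is not public so that callers cannot keep a reference to the array and modify the string.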
* Package private constructor which shares value array for speed.
* this constructor is always expected to be called with share==true.
* a separate constructor is needed because we already have a public
* String(char[]) constructor that makes a copy of the given char[].
*/
String(char[] value, boolean share) {
// assert share : "unshared not supported";
this.value = value;
}
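In contrast to the package-private constructor above, the public constructors copy their input. A minimal sketch (DefensiveCopyDemo and its methods are illustrative names) shows that later changes to the source array or builder do not reach the string:

```java
final class DefensiveCopyDemo {
    // Builds a String from a char[] and then mutates the array.
    static String fromCharsThenMutate() {
        char[] chars = {'j', 'a', 'v', 'a'};
        String s = new String(chars);
        chars[0] = 'l'; // changes only the caller's array
        return s;
    }

    // Builds a String from a StringBuilder and then keeps appending.
    static String fromBuilderThenAppend() {
        StringBuilder builder = new StringBuilder("abc");
        String s = builder.toString(); // generally preferred over new String(builder)
        builder.append("def");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(fromCharsThenMutate());   // java
        System.out.println(fromBuilderThenAppend()); // abc
    }
}
```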
Common methods
getBytes
/**
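* Converts the string into a byte array.
* This is used heavily in communication code, for example network transfer, ISO 8583 messages, and socket communication.
* To avoid garbled text, both sides must agree on the byte encoding they use.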
* Encodes this {@code String} into a sequence of bytes using the named
* charset, storing the result into a new byte array.
*
* <p> The behavior of this method when this string cannot be encoded in
* the given charset is unspecified. The {@link
* java.nio.charset.CharsetEncoder} class should be used when more control
* over the encoding process is required.
*
* @param charsetName
* The name of a supported {@linkplain java.nio.charset.Charset
* charset}
*
* @return The resultant byte array
*
* @throws UnsupportedEncodingException
* If the named charset is not supported
*
* @since JDK1.1
*/
public byte[] getBytes(String charsetName)
throws UnsupportedEncodingException {
if (charsetName == null) throw new NullPointerException();
return StringCoding.encode(charsetName, value, 0, value.length);
}
/**
* Same as above, but takes a Charset object instead of a charset name.
* Encodes this {@code String} into a sequence of bytes using the given
* {@linkplain java.nio.charset.Charset charset}, storing the result into a
* new byte array.
*
* <p> This method always replaces malformed-input and unmappable-character
* sequences with this charset's default replacement byte array. The
* {@link java.nio.charset.CharsetEncoder} class should be used when more
* control over the encoding process is required.
*
* @param charset
* The {@linkplain java.nio.charset.Charset} to be used to encode
* the {@code String}
*
* @return The resultant byte array
*
* @since 1.6
*/
public byte[] getBytes(Charset charset) {
if (charset == null) throw new NullPointerException();
return StringCoding.encode(charset, value, 0, value.length);
}
/**
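* Uses the platform's default charset.
* Be careful: this is a common source of errors at deployment time,
* because the default encoding can differ between Windows and Linux environments. Specifying the charset explicitly is recommended.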
* Encodes this {@code String} into a sequence of bytes using the
* platform's default charset, storing the result into a new byte array.
*
* <p> The behavior of this method when this string cannot be encoded in
* the default charset is unspecified. The {@link
* java.nio.charset.CharsetEncoder} class should be used when more control
* over the encoding process is required.
*
* @return The resultant byte array
*
* @since JDK1.1
*/
public byte[] getBytes() {
return StringCoding.encode(value, 0, value.length);
}
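The following sketch shows why naming the charset matters: the same two-character string yields different byte counts per charset, and a charset that cannot represent the characters loses them. EncodeDemo and its helper names are assumptions made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class EncodeDemo {
    static int encodedLength(String s, Charset charset) {
        return s.getBytes(charset).length;
    }

    // Encodes and decodes with the same charset.
    static String roundTrip(String s, Charset charset) {
        return new String(s.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two CJK characters

        System.out.println(encodedLength(text, StandardCharsets.UTF_8));    // 6
        System.out.println(encodedLength(text, StandardCharsets.UTF_16BE)); // 4
        // ISO-8859-1 cannot represent these characters; each becomes '?'.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));   // ??
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
    }
}
```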
hashCode
/**
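* The hash algorithm.
* hashCode guarantees that equal strings always have the same hash value,
* but equal hash values do not imply equal contents,
* so equals is still needed to confirm that two strings are equal.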
* s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
*/
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;
for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}
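As a sketch of the formula in the comment above, the hypothetical helper polynomialHash recomputes the same polynomial by hand; the example also shows two different strings whose hashes collide.

```java
final class HashDemo {
    // Recomputes s[0]*31^(n-1) + ... + s[n-1] using Horner's rule.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("hello") == "hello".hashCode()); // true

        // Different contents, same hash: 65*31 + 97 == 66*31 + 66 == 2112.
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
        System.out.println("Aa".equals("BB")); // false
    }
}
```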
equals
/**
*
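* To work as a HashMap key,
* a class must override both equals and hashCode
* so that equal contents are treated as the same key.
* Because String overrides both, strings can safely be used as keys.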
*/
public boolean equals(Object anObject) {
/** First check whether it is the same object. */
if (this == anObject) {
return true;
}
/** Then check whether it is a String. */
if (anObject instanceof String) {
String anotherString = (String)anObject;
int n = value.length;
/** Compare the lengths. */
if (n == anotherString.value.length) {
char v1[] = value;
char v2[] = anotherString.value;
int i = 0;
/** Compare the characters one by one. */
while (n-- != 0) {
if (v1[i] != v2[i])
return false;
i++;
}
return true;
}
}
return false;
}
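A short sketch of the HashMap point made above: a freshly created String with the same contents finds the entry stored under a different object. StringKeyDemo and lookupWithFreshKey are names invented for this illustration.

```java
import java.util.HashMap;
import java.util.Map;

final class StringKeyDemo {
    // Stores a value under one String object and looks it up with another.
    static Integer lookupWithFreshKey() {
        Map<String, Integer> stock = new HashMap<>();
        stock.put("apple", 3);

        String freshKey = new String("apple"); // distinct object, equal contents
        return stock.get(freshKey);
    }

    public static void main(String[] args) {
        System.out.println(lookupWithFreshKey()); // 3
    }
}
```

The lookup succeeds because both keys produce the same hashCode and equals returns true.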
substring
The implementation of this method differs between JDK 1.6 and earlier and JDK 1.7 and later (the change shipped in update 7u6).
JDK1.6 substring
/**
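* A new String object is still created, but it shares the old string's char[] and refers to only part of it.
* When the old string is very large, its array cannot be reclaimed while the small substring is still referenced, which can leak memory.
* A common workaround is to append an empty string, which produces a fresh copy:
* str = str.substring(x, y) + ""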
*/
String(int offset, int count, char value[]) {
this.value = value;
this.offset = offset;
this.count = count;
}
public String substring(int beginIndex, int endIndex) {
/** Bounds checks omitted. */
return new String(offset + beginIndex, endIndex - beginIndex, value);
}
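- Memory leak: in computer science, a memory leak occurs when a program, through oversight or error, fails to release memory it no longer uses. The memory does not physically disappear; rather, the application allocates a block and, because of a design error, loses control of it before releasing it, so the memory is wasted.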
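The retention problem can be modeled without an old JDK. The sketch below is not JDK code: SharedSlice is a hypothetical stand-in for the JDK 6 layout (shared array plus offset and count), and CopiedSlice stands in for the copying approach.

```java
import java.util.Arrays;

final class SubstringSharingSketch {
    // Hypothetical model of the JDK 6 design: a view into the parent's array.
    static final class SharedSlice {
        final char[] value;
        final int offset;
        final int count;

        SharedSlice(char[] value, int offset, int count) {
            this.value = value; // no copy: the whole parent array stays reachable
            this.offset = offset;
            this.count = count;
        }

        int retainedChars() {
            return value.length;
        }
    }

    // Hypothetical model of the copying design: keeps only the requested range.
    static final class CopiedSlice {
        final char[] value;

        CopiedSlice(char[] source, int offset, int count) {
            this.value = Arrays.copyOfRange(source, offset, offset + count);
        }

        int retainedChars() {
            return value.length;
        }
    }

    public static void main(String[] args) {
        char[] big = new char[1_000_000];
        Arrays.fill(big, 'x');

        System.out.println(new SharedSlice(big, 0, 5).retainedChars()); // 1000000
        System.out.println(new CopiedSlice(big, 0, 5).retainedChars()); // 5
    }
}
```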
JDK1.8 substring
Since JDK 1.7, substring creates a new string with its own copy of the characters. This uses more memory but fixes the memory leak.
public String substring(int beginIndex, int endIndex) {
if (beginIndex < 0) {
throw new StringIndexOutOfBoundsException(beginIndex);
}
if (endIndex > value.length) {
throw new StringIndexOutOfBoundsException(endIndex);
}
int subLen = endIndex - beginIndex;
if (subLen < 0) {
throw new StringIndexOutOfBoundsException(subLen);
}
return ((beginIndex == 0) && (endIndex == value.length)) ? this
: new String(value, beginIndex, subLen);
}
public String(char value[], int offset, int count) {
if (offset < 0) {
throw new StringIndexOutOfBoundsException(offset);
}
if (count <= 0) {
if (count < 0) {
throw new StringIndexOutOfBoundsException(count);
}
if (offset <= value.length) {
this.value = "".value;
return;
}
}
// Note: offset or count might be near -1>>>1.
if (offset > value.length - count) {
throw new StringIndexOutOfBoundsException(offset + count);
}
this.value = Arrays.copyOfRange(value, offset, offset+count);
}
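The checks in the source above can be observed from the outside. This sketch uses a made-up helper, throwsOutOfBounds, and catches IndexOutOfBoundsException (the documented supertype) rather than relying on the exact subclass.

```java
final class SubstringDemo {
    // Returns true when substring rejects the given range.
    static boolean throwsOutOfBounds(String s, int begin, int end) {
        try {
            s.substring(begin, end);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "hello world";

        System.out.println(s.substring(0, 5));           // hello
        System.out.println(s.substring(6, 6).isEmpty()); // true: empty range is allowed
        System.out.println(throwsOutOfBounds(s, -1, 2)); // true: beginIndex < 0
        System.out.println(throwsOutOfBounds(s, 0, 99)); // true: endIndex > length
        System.out.println(throwsOutOfBounds(s, 3, 2));  // true: negative length
    }
}
```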
valueOf
/** Calls the object's own toString method, or returns "null" for a null reference. */
public static String valueOf(Object obj) {
return (obj == null) ? "null" : obj.toString();
}
public static String valueOf(char data[]) {
return new String(data);
}
public static String valueOf(char data[], int offset, int count) {
return new String(data, offset, count);
}
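A brief sketch of how these overloads behave, including the null cases. ValueOfDemo and its methods are illustrative names; note that the char[] overload does not have the null check that the Object overload has.

```java
final class ValueOfDemo {
    // Routes through valueOf(Object), which is null-safe.
    static String describe(Object o) {
        return String.valueOf(o);
    }

    // valueOf(char[]) passes the array straight to new String(char[]).
    static boolean charArrayOverloadThrowsOnNull() {
        try {
            String.valueOf((char[]) null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(null)); // null (the four-character string)
        System.out.println(describe(42));   // 42
        System.out.println(String.valueOf(new char[]{'o', 'k'}));        // ok
        System.out.println(String.valueOf("hello".toCharArray(), 1, 2)); // el
        System.out.println(charArrayOverloadThrowsOnNull());             // true
    }
}
```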
Overloading of the + operator for String
String str = "abc";
String str1= str + "def";
/** After decompilation */
String str = "abc";
String str1= (new StringBuilder(String.valueOf(str))).append("def").toString();
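The decompiled form above reflects older compilers; as far as I know, javac targeting JDK 9 and later emits an invokedynamic call instead (JEP 280). In either case, each + inside a loop produces a new String, so a single explicit StringBuilder is the usual choice there. A minimal sketch with made-up method names:

```java
final class ConcatDemo {
    // Each iteration creates a new String holding everything so far.
    static String joinWithPlus(String[] parts) {
        String result = "";
        for (String part : parts) {
            result = result + part;
        }
        return result;
    }

    // One builder is reused; only toString() creates the final String.
    static String joinWithBuilder(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String[] parts = {"abc", "def", "ghi"};
        System.out.println(joinWithPlus(parts));    // abcdefghi
        System.out.println(joinWithBuilder(parts)); // abcdefghi
    }
}
```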
split
Splits the string around matches of the regular expression regex into at most limit parts.
public String[] split(String regex, int limit) {
/* fastpath if the regex is a
(1)one-char String and this character is not one of the
RegEx's meta characters ".$|()[{^?*+\\", or
(2)two-char String and the first char is the backslash and
the second is not the ascii digit or ascii letter.
*/
char ch = 0;
if (((regex.value.length == 1 &&
".$|()[{^?*+\\".indexOf(ch = regex.charAt(0)) == -1) ||
(regex.length() == 2 &&
regex.charAt(0) == '\\' &&
(((ch = regex.charAt(1))-'0')|('9'-ch)) < 0 &&
((ch-'a')|('z'-ch)) < 0 &&
((ch-'A')|('Z'-ch)) < 0)) &&
(ch < Character.MIN_HIGH_SURROGATE ||
ch > Character.MAX_LOW_SURROGATE))
{
int off = 0;
int next = 0;
boolean limited = limit > 0;
ArrayList<String> list = new ArrayList<>();
while ((next = indexOf(ch, off)) != -1) {
if (!limited || list.size() < limit - 1) {
list.add(substring(off, next));
off = next + 1;
} else { // last one
//assert (list.size() == limit - 1);
list.add(substring(off, value.length));
off = value.length;
break;
}
}
// If no match was found, return this
if (off == 0)
return new String[]{this};
// Add remaining segment
if (!limited || list.size() < limit)
list.add(substring(off, value.length));
// Construct result
int resultSize = list.size();
if (limit == 0) {
while (resultSize > 0 && list.get(resultSize - 1).length() == 0) {
resultSize--;
}
}
String[] result = new String[resultSize];
return list.subList(0, resultSize).toArray(result);
}
return Pattern.compile(regex).split(this, limit);
}
Splits the string around matches of the regular expression regex.
/** Simply calls split(String regex, int limit) with limit set to 0. */
public String[] split(String regex) {
return split(regex, 0);
}
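The effect of limit, and of regex metacharacters in the separator, is easiest to see on small inputs. The sketch below only calls the two split overloads shown above; SplitDemo is an illustrative class name.

```java
import java.util.Arrays;

final class SplitDemo {
    static String show(String[] parts) {
        return Arrays.toString(parts);
    }

    public static void main(String[] args) {
        // limit == 0 (the one-argument form): trailing empty strings are dropped.
        System.out.println(show("a,b,,".split(",")));     // [a, b]
        // Negative limit: keep every segment, including trailing empty ones.
        System.out.println(show("a,b,,".split(",", -1))); // [a, b, , ]
        // Positive limit: at most that many parts; the last keeps the rest.
        System.out.println(show("a,b,c".split(",", 2)));  // [a, b,c]

        // "." is a regex metacharacter that matches any character.
        System.out.println("a.b".split(".").length);      // 0
        System.out.println(show("a.b".split("\\.")));     // [a, b]
    }
}
```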
equalsIgnoreCase
public boolean equalsIgnoreCase(String anotherString) {
return (this == anotherString) ? true
: (anotherString != null)
&& (anotherString.value.length == value.length)
&& regionMatches(true, 0, anotherString, 0, value.length);
}
A ternary operator combined with && replaces a chain of if statements.
replaceFirst、replaceAll、replace
String replaceFirst(String regex, String replacement)
String replaceAll(String regex, String replacement)
String replace(CharSequence target, CharSequence replacement)
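- replace takes char or CharSequence arguments, so it supports replacing both single characters and strings; it replaces every literal occurrence.
- replaceAll and replaceFirst take a regex, so they replace based on a regular expression.
- replaceFirst() replaces only the first match.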
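The difference between literal and regex-based replacement shows up as soon as the target contains a metacharacter. A small sketch using only the methods listed above (ReplaceDemo is an illustrative name):

```java
final class ReplaceDemo {
    public static void main(String[] args) {
        String s = "a.b.c";

        // replace: literal match, every occurrence.
        System.out.println(s.replace(".", "-"));         // a-b-c
        System.out.println(s.replace('.', '-'));         // a-b-c

        // replaceAll: regex match; an unescaped "." matches every character.
        System.out.println(s.replaceAll(".", "-"));      // -----
        System.out.println(s.replaceAll("\\.", "-"));    // a-b-c

        // replaceFirst: regex match, first occurrence only.
        System.out.println(s.replaceFirst("\\.", "-"));  // a-b.c
    }
}
```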
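Other methods
The String class has many more methods, for example:
- public int length(){}
Returns the length of the string.
- public boolean isEmpty() { }
Returns whether the string is empty.
- public char charAt(int index) {}
Returns the (index+1)-th character of the string, that is, the char at zero-based position index.
- public char[] toCharArray() {}
Converts the string to a char array.
- public String trim(){}
Removes leading and trailing whitespace.
- public String toUpperCase(){}
Converts the string to upper case.
- public String toLowerCase(){}
Converts the string to lower case.
- public String concat(String str) {}
Concatenates str to the end of the string.
- public boolean matches(String regex){}
Returns whether the string matches the given regular expression regex.
- public boolean contains(CharSequence s)
Returns whether the string contains the character sequence s.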
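A quick tour of these methods on one sample string, as a sketch; the class name and the sample text are arbitrary.

```java
final class OtherMethodsDemo {
    public static void main(String[] args) {
        String padded = "  Hello, World  ";
        String s = padded.trim();

        System.out.println(padded.length());        // 16
        System.out.println(s);                      // Hello, World
        System.out.println(s.length());             // 12
        System.out.println("".isEmpty());           // true
        System.out.println(s.charAt(4));            // o
        System.out.println(s.toCharArray().length); // 12
        System.out.println(s.toUpperCase());        // HELLO, WORLD
        System.out.println(s.toLowerCase());        // hello, world
        System.out.println("Hello".concat(", World").equals(s)); // true
        System.out.println(s.matches("[A-Za-z, ]+")); // true: the whole string must match
        System.out.println(s.contains("World"));    // true
    }
}
```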