2020-04-08

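The String Class

Partly based on: https://www.cnblogs.com/ysocean/p/8571426.html#_label0

Definition

String is an immutable class: once a String object has been created, its contents cannot be changed.

Let's start with the source code: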

public final class String implements java.io.Serializable, Comparable<String>, CharSequence {
    private final char value[];
    private int hash;
}

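As we all know, String is an immutable class. First, the class itself is declared final, so no class can extend it. Its internal value array is also private and final, and none of the class's methods ever write to that array. As a result, once a String object has been created, the character sequence it holds never changes until the object is destroyed. This deserves special attention: the methods that appear to change a string actually create a new string internally.

As the code above shows, a String is backed by a character array, char[]. The JDK version used here is 8. Later versions switched to a byte[] array; this change was made in JDK 9 (compact strings). I did not download JDK 9 myself, but I checked JDK 11, and the field there is indeed a byte[].

A diagram borrowed from the web illustrated this (image not included here).

Having covered the basic characteristics of String, let's look at its methods: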

Constructors

String has many constructors. Let's look at a few:

No-argument constructor:

public String() {
    this.value = "".value;
}

Constructor taking a String:

public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}

A String can also be created from a char array, a byte array, a StringBuffer, or a StringBuilder.

Example:

String str0 = new String();
String str1 = new String("abc");
String str2 = new String(new char[]{'a', 'b', 'c'});
String str3 = new String(new byte[]{1, 2, 3});
String str4 = new String(new StringBuilder());
String str5 = new String(new StringBuffer());

Besides the examples above, there are some special-purpose overloads, such as creating a string from a sub-range of an array.
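As a minimal sketch of the sub-range overloads mentioned above, the class below uses String(char[], int, int) and String(byte[], int, int, Charset). The class name SubrangeDemo and its helper methods are made up for illustration; UTF-8 is an assumed charset, not something the original post specifies.

```java
import java.nio.charset.StandardCharsets;

class SubrangeDemo {
    // Builds a String from `count` chars of `source`, starting at `offset`.
    static String fromChars(char[] source, int offset, int count) {
        return new String(source, offset, count);
    }

    // Decodes `length` bytes of `source`, starting at `offset`, as UTF-8 (assumed charset).
    static String fromBytes(byte[] source, int offset, int length) {
        return new String(source, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        char[] chars = {'a', 'b', 'c', 'd', 'e'};
        System.out.println(fromChars(chars, 1, 3)); // bcd

        byte[] bytes = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(fromBytes(bytes, 0, 4)); // hell
    }
}
```

Both overloads copy the requested range, so later changes to the source array do not affect the new String.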

Commonly used instance methods

Length and emptiness checks

public int length() {
    return value.length;
}
public boolean isEmpty() {
    return value.length == 0;
}

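Getting the character at a given position

Note: as the source shows, this simply returns the character at position index of the value array. Array indices start at 0, so index starts at 0 as well.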

public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}

Getting the string's byte array

public byte[] getBytes(String charsetName) throws UnsupportedEncodingException {}
public byte[] getBytes(Charset charset) {}
public byte[] getBytes() {}
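The overloads above differ only in how the charset is chosen. The sketch below, with the hypothetical class name GetBytesDemo, passes the charset explicitly so the result does not depend on the platform default. The sample text is written with Unicode escapes and is only an illustration.

```java
import java.nio.charset.StandardCharsets;

class GetBytesDemo {
    // Number of bytes needed to encode s as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Encodes s as UTF-8 and decodes it again with the same charset.
    static String roundTrip(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String ascii = "abc";
        String chinese = "\u4f60\u597d"; // two CJK characters

        System.out.println(utf8Length(ascii));   // 3
        System.out.println(utf8Length(chinese)); // 6: each of these characters takes 3 bytes in UTF-8
        System.out.println(roundTrip(chinese).equals(chinese)); // true
    }
}
```

The no-argument getBytes() uses the platform default charset, so the same string can yield different bytes on different machines.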

Comparing strings for equality

As the names suggest, equalsIgnoreCase ignores case while equals is case-sensitive, so comparing "A" with "a" using equals returns false.

public boolean equals(Object anObject) {}
public boolean equalsIgnoreCase(String anotherString) {}

Comparing string order (the return value is an int)

public int compareTo(String anotherString) {}
public int compareToIgnoreCase(String str) {}
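A short sketch of how these four methods behave. The class CompareDemo and its order helper are hypothetical; the helper reduces the int returned by compareTo to its sign, since only the sign is meaningful for ordering.

```java
class CompareDemo {
    // Reduces compareTo's result to -1, 0, or 1 so that only the ordering matters.
    static int order(String a, String b) {
        return Integer.signum(a.compareTo(b));
    }

    public static void main(String[] args) {
        System.out.println("Java".equals("java"));              // false
        System.out.println("Java".equalsIgnoreCase("java"));    // true
        System.out.println(order("apple", "banana"));           // -1
        System.out.println(order("b", "a"));                    // 1
        System.out.println(order("app", "apple"));              // -1: a prefix sorts before the longer string
        System.out.println("Java".compareToIgnoreCase("java")); // 0
    }
}
```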

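Starts with xx, ends with xx

Practical uses:

startsWith: for example, deciding that a class of paths should bypass an interceptor, so that static files whose path starts with /static are not intercepted.
endsWith: for example, collecting all PDF files by checking whether each name ends with .pdf.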

public boolean startsWith(String prefix, int toffset) {}
public boolean startsWith(String prefix) {}
public boolean endsWith(String suffix) {}
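The two use cases described above could look roughly like this. The class PrefixSuffixDemo, the method names, and the sample paths are all invented for illustration; a real interceptor would be wired into whatever web framework is in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrefixSuffixDemo {
    // Hypothetical rule: requests under /static bypass the interceptor.
    static boolean skipInterceptor(String path) {
        return path.startsWith("/static");
    }

    // Keeps only the names that end with ".pdf".
    static List<String> pdfFiles(List<String> names) {
        List<String> result = new ArrayList<>();
        for (String name : names) {
            if (name.endsWith(".pdf")) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(skipInterceptor("/static/css/site.css")); // true
        System.out.println(skipInterceptor("/user/profile"));        // false
        System.out.println(pdfFiles(Arrays.asList("a.pdf", "b.txt", "c.pdf"))); // [a.pdf, c.pdf]

        // The two-argument overload starts matching at the given offset.
        System.out.println("/api/static".startsWith("/static", 4)); // true
    }
}
```

Note that the prefix "/static" also matches a path such as "/staticfoo"; checking for "/static/" would be stricter.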

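Getting the index of a given element

These methods return the position of the first or last occurrence of a given character or string, or -1 if it does not occur.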

public int indexOf(int ch) {}
public int indexOf(int ch, int fromIndex) {}
public int lastIndexOf(int ch) {}
public int lastIndexOf(int ch, int fromIndex) {}
public int indexOf(String str) {}
public int indexOf(String str, int fromIndex) {}
public int lastIndexOf(String str) {}
public int lastIndexOf(String str, int fromIndex) {}

Note that these methods are declared only with int and String parameters, yet we can also pass a char. This works because of type conversion:

String str = "abcda";
System.out.println(str.indexOf('a'));			//0
System.out.println(str.lastIndexOf('a'));		//4
System.out.println(str.indexOf(98));			//1
System.out.println(str.lastIndexOf(99));		//2
System.out.println(str.indexOf("bc"));			//1
System.out.println(str.lastIndexOf("bc"));		//1

Type conversion: a type on the left converts automatically to any type to its right; going the other way requires an explicit cast.

byte -> short -> int -> long -> float -> double
char -> int -> long -> float -> double

Substrings: substring

public String substring(int beginIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    int subLen = value.length - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return (beginIndex == 0) ? this : new String(value, beginIndex, subLen);
}

public String substring(int beginIndex, int endIndex) {
    if (beginIndex < 0) {
        throw new StringIndexOutOfBoundsException(beginIndex);
    }
    if (endIndex > value.length) {
        throw new StringIndexOutOfBoundsException(endIndex);
    }
    int subLen = endIndex - beginIndex;
    if (subLen < 0) {
        throw new StringIndexOutOfBoundsException(subLen);
    }
    return ((beginIndex == 0) && (endIndex == value.length)) ? this
        : new String(value, beginIndex, subLen);
}

As the code shows, substring either creates a new String from the requested range of the character array or returns the current string itself.
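A small sketch of the behavior just described, including the bounds checks from the source. The class SubstringDemo and the helper throwsOutOfBounds are hypothetical names used only for this illustration.

```java
class SubstringDemo {
    // Returns true if substring(begin, end) rejects the range with an exception.
    static boolean throwsOutOfBounds(String s, int begin, int end) {
        try {
            s.substring(begin, end);
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String s = "hello,world";

        System.out.println(s.substring(6));    // world
        System.out.println(s.substring(0, 5)); // hello (endIndex is exclusive)
        System.out.println(s);                 // hello,world (the original is unchanged)

        System.out.println(throwsOutOfBounds(s, 3, 20)); // true: endIndex exceeds the length
        System.out.println(throwsOutOfBounds(s, 5, 2));  // true: endIndex is before beginIndex
    }
}
```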

String concatenation

public String concat(String str) {
    int otherLen = str.length();
    if (otherLen == 0) {
        return this;
    }
    int len = value.length;
    char buf[] = Arrays.copyOf(value, len + otherLen);
    str.getChars(buf, len);
    return new String(buf, true);
}

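As the method body shows, it calls length() on the argument right away, so the argument must not be null; otherwise a NullPointerException is thrown. Concatenation returns a new String object (or this when the argument is empty) and never modifies the original data.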
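The two points above, that the original is left untouched and that a null argument is rejected, can be checked with a short sketch. The class ConcatDemo and the helper concatNullThrows are illustrative names, not part of the JDK.

```java
class ConcatDemo {
    // Returns true if s.concat(null) fails with a NullPointerException.
    static boolean concatNullThrows(String s) {
        try {
            s.concat(null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String a = "foo";
        String b = a.concat("bar");

        System.out.println(a); // foo (unchanged)
        System.out.println(b); // foobar (a new String)
        System.out.println(concatNullThrows(a)); // true
    }
}
```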

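String replacement

Leaving the implementation aside for now and looking only at the return values, these methods likewise either return the current object or build a new one.

Explanation:

replace(char, char): replaces single characters; every occurrence of the character is replaced.
replaceFirst: as the name suggests, replaces the first matching substring.
replaceAll: replaces every matching substring. Unlike replace, both replaceFirst and replaceAll treat their first argument as a regular expression.
replace(CharSequence, CharSequence): takes CharSequence arguments and replaces every literal occurrence. String, StringBuilder, and StringBuffer all implement CharSequence.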

public String replace(char oldChar, char newChar) {}
public String replaceFirst(String regex, String replacement) {}
public String replaceAll(String regex, String replacement) {}
public String replace(CharSequence target, CharSequence replacement) {}

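Let's look at one of these methods:

Note: the source shows that the return value is either a new object or the string itself, and that every occurrence of the character is replaced.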

public String replace(char oldChar, char newChar) {
    if (oldChar != newChar) {
        int len = value.length;
        int i = -1;
        char[] val = value; /* avoid getfield opcode */

        while (++i < len) {
            if (val[i] == oldChar) {
                break;
            }
        }
        if (i < len) {
            char buf[] = new char[len];
            for (int j = 0; j < i; j++) {
                buf[j] = val[j];
            }
            while (i < len) {
                char c = val[i];
                buf[i] = (c == oldChar) ? newChar : c; // scans to the end, so every occurrence is replaced
                i++;
            }
            return new String(buf, true);
        }
    }
    return this;
}
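The difference between the literal and the regex-based variants is easiest to see side by side. This is a small sketch with a made-up class name, ReplaceDemo, using an input that contains the regex metacharacter ".".

```java
class ReplaceDemo {
    public static void main(String[] args) {
        String s = "a.b.c";

        // Literal replacement: "." is treated as a plain character.
        System.out.println(s.replace('.', '-'));         // a-b-c
        System.out.println(s.replace(".", "-"));         // a-b-c

        // Regex replacement: an unescaped "." matches any character.
        System.out.println(s.replaceAll(".", "-"));      // -----
        System.out.println(s.replaceAll("\\.", "-"));    // a-b-c
        System.out.println(s.replaceFirst("\\.", "-"));  // a-b.c

        // The original string is never modified.
        System.out.println(s);                           // a.b.c
    }
}
```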

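Checking whether a string matches a format

public boolean matches(String regex) {}

As you can see, it takes a regular expression and checks whether the current string matches that format.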
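One detail worth illustrating is that matches tests the whole string, not a part of it. The sketch below uses a hypothetical helper, isDigits, and an assumed pattern of one or more digits.

```java
import java.util.regex.Pattern;

class MatchesDemo {
    // True only if the entire string consists of one or more digits.
    static boolean isDigits(String s) {
        return s.matches("\\d+");
    }

    public static void main(String[] args) {
        System.out.println(isDigits("12345"));  // true
        System.out.println(isDigits("123a"));   // false
        System.out.println(isDigits("abc123")); // false: matches requires a full match

        // For a partial match, Matcher.find() can be used instead.
        System.out.println(Pattern.compile("\\d+").matcher("abc123").find()); // true
    }
}
```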

Checking whether a string contains a given substring

public boolean contains(CharSequence s) {}

Splitting strings

public String[] split(String regex, int limit) {}
public String[] split(String regex) {}
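The limit parameter and the regex argument both have effects that are easy to miss, so here is a short sketch. The class SplitDemo and the helper pieces are made-up names, and the input strings are arbitrary.

```java
import java.util.Arrays;

class SplitDemo {
    // Number of pieces produced by split with the given limit.
    static int pieces(String input, String regex, int limit) {
        return input.split(regex, limit).length;
    }

    public static void main(String[] args) {
        String csv = "a,b,,c,,";

        // Without a limit (limit 0), trailing empty strings are dropped.
        System.out.println(Arrays.toString(csv.split(",")));    // [a, b, , c]
        // A negative limit keeps them.
        System.out.println(pieces(csv, ",", -1));               // 6
        // A positive limit caps the number of pieces.
        System.out.println(Arrays.toString(csv.split(",", 2))); // [a, b,,c,,]

        // The separator is a regex: an unescaped "." matches every character.
        System.out.println("a.b".split(".").length);            // 0
        System.out.println(Arrays.toString("a.b".split("\\."))); // [a, b]
    }
}
```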

Converting case

public String toLowerCase(Locale locale) {}
public String toLowerCase() {}
public String toUpperCase(Locale locale) {}
public String toUpperCase() {}

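Removing whitespace

Again, the original characters are not modified. trim calls substring to obtain the sub-range, which creates a new String object (or returns this when there is nothing to trim).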

public String trim() {
    int len = value.length;
    int st = 0;
    char[] val = value;

    while ((st < len) && (val[st] <= ' ')) {
        st++;
    }
    while ((st < len) && (val[len - 1] <= ' ')) {
        len--;
    }
    return ((st > 0) || (len < value.length)) ? substring(st, len) : this;
}
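The loop conditions in the source compare each character with ' ', so trim removes every leading and trailing character whose code is at most U+0020 and nothing else. The sketch below, with the hypothetical class name TrimDemo, shows one consequence: a full-width space (U+3000) is left in place.

```java
class TrimDemo {
    public static void main(String[] args) {
        String padded = "\t hi \n";
        System.out.println(padded.trim());          // hi
        System.out.println(padded.length());        // 6 (the original is unchanged)

        // U+3000 (ideographic space) is greater than ' ', so trim does not remove it.
        String fullWidth = "\u3000hi\u3000";
        System.out.println(fullWidth.trim().length()); // 4
    }
}
```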

Joining with a delimiter

public static String join(CharSequence delimiter, CharSequence... elements) {}
public static String join(CharSequence delimiter, Iterable<? extends CharSequence> elements) {}

Example:

List<String> stringList = new ArrayList<>();
stringList.add("AA");
stringList.add("BB");
stringList.add("CC");

String join = String.join("|", stringList);
System.out.println(join);// AA|BB|CC

System.out.println(String.join("@","A","B","C"));// A@B@C

Getting the string's char array

public char[] toCharArray() {
    // Cannot use Arrays.copyOf because of class initialization order issues
    char result[] = new char[value.length];
    System.arraycopy(value, 0, result, 0, value.length);
    return result;
}
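Because toCharArray copies the internal array, changing the returned array cannot affect the string. A minimal sketch, using the made-up class name ToCharArrayDemo:

```java
class ToCharArrayDemo {
    public static void main(String[] args) {
        String s = "hello";
        char[] copy = s.toCharArray();

        copy[0] = 'H'; // modifies only the copy

        System.out.println(s);                // hello
        System.out.println(new String(copy)); // Hello
    }
}
```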

Converting other types to strings

public static String valueOf(Object obj) {}
public static String valueOf(char data[]) {}
public static String valueOf(char data[], int offset, int count) {}
public static String copyValueOf(char data[], int offset, int count) {}
public static String copyValueOf(char data[]) {}
public static String valueOf(boolean b) {}
public static String valueOf(char c) {}
public static String valueOf(int i) {}
public static String valueOf(long l) {}
public static String valueOf(float f) {}
public static String valueOf(double d) {}

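These methods convert their argument to a string; we won't go into detail here.

Note: we have now looked at most of String's methods, and none of them modify the contents of the original string. Because String is immutable and we cannot modify its contents either, operations such as replacing, changing case, and trimming whitespace are all implemented by creating new String objects.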

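Why String is immutable

Above we saw how String is kept immutable: ① the class is declared final, so it cannot be extended and its behavior cannot be changed through inheritance; ② the value field is declared final, so once initialized it cannot be reassigned.

However, value is an array. Although the reference held in value cannot be changed, the elements of the array can still be modified, as follows:

final int val[] = {1,23,3};
val[1] = 20;//[1, 20, 3]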

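In other words, the two final keywords alone do not make String truly immutable. As the methods analyzed above show, every method that appears to modify a string builds and returns a new String instance; the JDK designers made sure that none of the provided methods modify the data in the value array. Only the combination of these three conditions makes the class truly immutable:

final on the class: String cannot be extended, so its methods cannot be overridden to change their behavior.
final on value: once the value field is initialized, its reference is fixed and cannot be changed.
Methods never modify the elements of the value array: using the methods String provides never changes the data inside the String.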
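The three conditions can be applied to any class, not only String. The following is a minimal sketch of a hypothetical ImmutableScores class that follows them. It also copies the array on the way in and out, which is an extra precaution assumed here so that callers never hold a reference to the internal array.

```java
import java.util.Arrays;

// Condition 1: final class, so no subclass can override its methods.
final class ImmutableScores {
    // Condition 2: private final field, so the reference can never be reassigned.
    private final int[] scores;

    ImmutableScores(int[] scores) {
        // Defensive copy: later changes to the caller's array do not leak in.
        this.scores = Arrays.copyOf(scores, scores.length);
    }

    int get(int index) {
        return scores[index];
    }

    // Returns a copy, so callers cannot modify the internal array.
    int[] toArray() {
        return scores.clone();
    }

    // Condition 3: "modifying" methods build a new instance instead of writing to scores.
    ImmutableScores with(int index, int value) {
        int[] copy = scores.clone();
        copy[index] = value;
        return new ImmutableScores(copy);
    }

    public static void main(String[] args) {
        ImmutableScores a = new ImmutableScores(new int[]{1, 23, 3});
        ImmutableScores b = a.with(1, 20);
        System.out.println(Arrays.toString(a.toArray())); // [1, 23, 3]
        System.out.println(Arrays.toString(b.toArray())); // [1, 20, 3]
    }
}
```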

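Why String was designed to be immutable

Partly quoted from: https://www.cnblogs.com/ysocean/p/8571426.html

Reasons:

Security
The need to cache the hash code
Implementing the string constant pool (efficiency)

Security

If strings were mutable, security problems could arise. For example, database user names and passwords are passed in as strings to obtain a connection, and in socket programming the host and port are passed in as strings. Because strings are immutable, their values cannot change after they have been checked; otherwise an attacker could exploit the gap by changing the value of the object a string refers to, creating a security hole.

The need to cache the hash code

First, let's look at how String computes its hash code: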

private int hash; // Default to 0
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;

        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}

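As you can see, the hash code is computed only on the first call; later calls return the value cached in the object. The hash code is computed from value, so if String were mutable, the hash would have to be recomputed whenever the contents changed. Containers such as Map and Set require their keys to be unique and consistent, so String's immutability makes it better suited than other objects for use as a container key.

Example: suppose we store values in a HashSet using the mutable StringBuilder class instead of the immutable String: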

public static void main(String[] args) {
    StringBuilder sb1 = new StringBuilder("AA");
    StringBuilder sb2 = new StringBuilder("AA");
    HashSet<StringBuilder> set = new HashSet<>();
    set.add(sb1);
    set.add(sb2);
    System.out.println(set);//[AA, AA]
}

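As you can see, StringBuilder does not override hashCode (or equals) and uses the implementations from Object, so sb1 and sb2 are treated as distinct elements. From our point of view, the set now holds the same value twice.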
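A related problem appears when a key does compute its hash from its contents but is mutable. The sketch below uses an ArrayList as the key, which is an example chosen here and not taken from the original post: after the list changes, its hash code changes, and the set can no longer find it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class MutableKeyDemo {
    // Adds a list to a set, mutates the list, then looks it up again.
    static boolean foundAfterMutation() {
        Set<List<String>> set = new HashSet<>();
        List<String> key = new ArrayList<>();
        key.add("AA");
        set.add(key);

        key.add("BB"); // the list's hashCode changes with its contents
        return set.contains(key);
    }

    // A String key cannot change, so an equal string is always found.
    static boolean stringKeyFound() {
        Set<String> set = new HashSet<>();
        set.add("AA");
        return set.contains(new String("AA"));
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false
        System.out.println(stringKeyFound());     // true
    }
}
```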

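Implementing the string constant pool

This is the most important reason: a string constant pool is only feasible because String is immutable, since a pooled object is shared by every reference to it.

Constant pool: the Java runtime maintains a String pool, also called the "string cache pool". It holds strings produced at run time, and no two strings in the pool have the same content.

① A string created from a literal, or by concatenating only literals (constants), is first looked up in the string pool. If no equal object exists, one is created in the pool; otherwise the pooled reference is used, which avoids creating duplicate objects.

② With the new keyword, a new object is always created on the heap, and the variable refers to that new object. If a string with the same content already exists in the pool (for example, because the constructor argument is a literal), the heap object shares the pooled string's character data but remains a separate object. Conversely, new does not add the object it creates to the pool.

③ An expression that involves a variable is evaluated at run time. Any literals in it are still looked up in the pool, but the result is a new object on the heap, and the variable refers to that heap object.

Let's look at a quiz first: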

String str1 = "hello";
String str2 = "hello";
String str3 = new String("hello");
System.out.println(str1==str2); 
System.out.println(str1==str3);
System.out.println(str2==str3);
System.out.println(str1.equals(str2));
System.out.println(str1.equals(str3));
System.out.println(str2.equals(str3));

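str1 is created from a literal, so the string constant pool is checked first for a "hello" object. None exists yet, so one is created in the pool and its reference is returned. When str2 is created, "hello" is already in the pool, so the same reference is returned directly. str3 is created with new, so a new object is created on the heap; its contents match the pooled "hello", whose character array it shares, but it is a different object. Therefore the three == comparisons print true, false, false, and the three equals comparisons all print true.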
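The quiz can be turned into a small runnable check. The sketch below uses a made-up class name, PoolDemo, and additionally calls intern(), which the original quiz does not use, to show how a heap string can be mapped back to the pooled object.

```java
class PoolDemo {
    // Two identical literals refer to the same pooled object.
    static boolean literalsShareObject() {
        String a = "hello";
        String b = "hello";
        return a == b;
    }

    // new always creates a separate object, even when the contents are equal.
    static boolean newCreatesDistinctObject() {
        String a = "hello";
        String c = new String("hello");
        return a != c && a.equals(c);
    }

    // intern() returns the pooled object with the same contents.
    static boolean internReturnsPooled() {
        String a = "hello";
        String c = new String("hello");
        return c.intern() == a;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareObject());      // true
        System.out.println(newCreatesDistinctObject()); // true
        System.out.println(internReturnsPooled());      // true
    }
}
```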

Let's look at another quiz:

public static void main(String[] args) {
    String str1 = "hello";
    String str2 = "helloworld";
    String str3 = str1+"world";
    String str4 = "hello"+"world";
    System.out.println(str2==str3); 
    System.out.println(str2==str4);
    System.out.println(str3==str4);
}

What does the code above print?

Let's look at the decompiled result first:

public static void main(String[] args) {
    String str1 = "hello";
    String str2 = "helloworld";
    String str3 = str1 + "world";
    String str4 = "helloworld";
    System.out.println((str2 == str3));
    System.out.println((str2 == str4));
    System.out.println((str3 == str4));
}

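str1 is created from a string literal, and so is str2. As the decompiled code shows, for str4 the compiler can tell that both operands are constants, so it folds them into a single literal. str3 involves the variable str1, whose value the compiler cannot determine.

For str1, the steps in memory are:

First check whether a "hello" object exists in the string constant pool.

If it exists, return the reference to the pooled "hello" object.
If it does not exist, create the "hello" object in the string constant pool and return its reference.

The same applies to str2, so at this point the string constant pool contains both "hello" and "helloworld". str3 is different: because it is not created from a literal, the expression str1 + "world" is evaluated at run time and produces a new String object on the heap. That object is not the pooled "helloworld" object, even though the two have equal contents, so str3 refers to the heap object.

Executing str4 has the same effect as str2: the pool is looked up directly, and str4 receives the reference to the pooled "helloworld". (The original figure is not included here.)

With that, let's return to the quiz:

public static void main(String[] args) {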
    String str1 = "hello";
    String str2 = "helloworld";
    String str3 = str1+"world";
    String str4 = "hello"+"world";
    System.out.println(str2==str3);//false
    System.out.println(str2==str4);//true
    System.out.println(str3==str4);//false
}
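A variation on this quiz makes the role of the compiler clearer. In the sketch below (hypothetical class ConstantFoldingDemo), declaring the variable final turns the concatenation into a compile-time constant, so it is folded just like "hello" + "world". The use of intern() is an addition for illustration.

```java
class ConstantFoldingDemo {
    // A final variable initialized with a literal is a constant, so the
    // concatenation is folded at compile time and refers to the pooled string.
    static boolean finalVariableIsFolded() {
        final String a = "hello";
        String s = a + "world";
        return s == "helloworld";
    }

    // With a non-final variable the concatenation happens at run time
    // and yields a new heap object.
    static boolean plainVariableIsFolded() {
        String a = "hello";
        String s = a + "world";
        return s == "helloworld";
    }

    // intern() maps the run-time result back to the pooled object.
    static boolean internedMatchesLiteral() {
        String a = "hello";
        return (a + "world").intern() == "helloworld";
    }

    public static void main(String[] args) {
        System.out.println(finalVariableIsFolded());  // true
        System.out.println(plainVariableIsFolded());  // false
        System.out.println(internedMatchesLiteral()); // true
    }
}
```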

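At this point you may wonder: how can an object created on the heap also point to the constant pool?

Reference: https://segmentfault.com/a/1190000009888357

A String object stores its characters in an internal character array. With that in mind, consider the following example: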

String m = "hello,world";
String n = "hello,world";
String u = new String(m);
String v = new String("hello,world");
System.out.println(m == n);
System.out.println(m == u);
System.out.println(m == v);
System.out.println(u == v);

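String m = "hello,world": a char array of length 11 is allocated, a String backed by that array is created in the constant pool, and m refers to that String.
String n = "hello,world": n refers to the String already in the constant pool, so m and n refer to the same object.
String u = new String(m): a new String is created on the heap, but its internal character array is the one used by m.
String v = new String("hello,world"): another new String is created, and its internal character array is again the pooled array used by m, so u and v share the same array.

Figure (image not included): roughly the situation described above. The dashed lines in the figure only indicate that the two objects have no special relationship.

In summary, the results are: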

System.out.println(m == n);//true
System.out.println(m == u);//false
System.out.println(m == v);//false
System.out.println(u == v);//false

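Conclusions:

m and n are the same object.
m, u, and v are all different objects.
m, n, u, and v all use the same character array, and comparing any of them with equals returns true.

We said above that m, n, u, and v all use the same character array, but the value field cannot be accessed in the normal way, so let's use reflection to look at the actual addresses and values.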

public static void main(String[] args) throws Exception {
    String m = "hello,world";
    String n = "hello,world";
    String u = new String(m);
    String v = new String("hello,world");

    Field value = String.class.getDeclaredField("value");
    value.setAccessible(true);
    printMsg(m, value);
    printMsg(n, value);
    printMsg(u, value);
    printMsg(v, value);
}

// Written for JDK 8, where value is a char[]; on JDK 9 and later the field is a byte[].
private static void printMsg(String m,Field value) throws IllegalAccessException {
    Object o = value.get(m);
    System.out.println("Char array address: "+o);
    System.out.println("String address: "+System.identityHashCode(m));
    System.out.println("Char array contents: "+Arrays.toString((char[])o));
    System.out.println("-------------------------------------");
}

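Result (screenshot not included):

The output shows that the char[] value arrays of all four objects, m, n, u, and v, have the same address. m and n also have the same address.

A mutable String

Can the character array inside a String really not be modified? Is a String object truly, absolutely immutable? No! Let's see how a String object can be modified if we really need to.

Define a string String str = "hello"; and change its value to Hello. This means a genuine in-place modification: the object that str refers to must stay the same.

Modification example: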

public static void main(String[] args) throws Exception {
    String str = "hello";
    // Get the value field
    Field value = String.class.getDeclaredField("value");
    // Make it accessible
    value.setAccessible(true);
    // Read the field's value
    Object o = value.get(str);
    // Cast to a char array
    char[] val = (char[]) o;
    // Modify the char array
    val[0] = 'H';
    System.out.println(str);//Hello
}

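As shown above, a string's contents can be modified, but only by using reflection (this example targets JDK 8, where value is a char[]). Such operations are not done in normal code, so it is still fair to call String an immutable class.

Title: The String Class
Author: ouYang
Link: https://ooyhao.github.io/2020/04/08/%E5%9F%BA%E7%A1%80/String/
Published: 2020-04-08
License: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit the source when reposting!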
