I decided to read through the JDK source. I don't have time to read every line carefully, so I am skimming it and noting the code that catches my eye. Last time I read the source of Long, so next up is String.
String is the string class. Having worked through Byte, Short, Integer, and Long, you could say String is likewise a wrapper class, this time for char[]. Let's start with the fields.
String.java
private final char value[];
private int hash; // Default to 0
public static final Comparator<String> CASE_INSENSITIVE_ORDER = new CaseInsensitiveComparator();
value is the char[] that holds the contents of the string. Among the constructors, hash is assigned only in the following copy constructor. In every other constructor it stays at the default 0 until hashCode is first called.
String.java
public String(String original) {
    this.value = original.value;
    this.hash = original.hash;
}
The hash is computed lazily: when hashCode is called, it is calculated only if the cached hash is 0 and the string length is greater than 0. If the computed hash happens to be 0, it is recalculated on every call, which seems wasteful, but such strings are rare.
String.java
public int hashCode() {
    int h = hash;
    if (h == 0 && value.length > 0) {
        char val[] = value;
        for (int i = 0; i < value.length; i++) {
            h = 31 * h + val[i];
        }
        hash = h;
    }
    return h;
}
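To make the loop concrete, here is a minimal sketch that recomputes the same recurrence by hand and compares it with String.hashCode(). The class name HashSketch and the method manualHash are made up for this illustration and are not part of the JDK.

```java
class HashSketch {
    // Hypothetical helper: the same recurrence as the loop in hashCode(),
    // h = 31 * h + c for each char, starting from 0.
    static int manualHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(manualHash("abc")); // 96354
        System.out.println("abc".hashCode());  // 96354
        System.out.println("".hashCode());     // 0, so an empty string is never "cached"
    }
}
```

For "abc" the steps are 97, then 31 * 97 + 98 = 3105, then 31 * 3105 + 99 = 96354.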
There is also a constant called CASE_INSENSITIVE_ORDER. I assumed it was a recent addition, but it has been there since JDK 1.2.
The length method returns the length. C's strlen scans for the terminating null character every time, but Java just returns the size of the array. That means value is allocated at exactly the required size. Since String is immutable, there is no need to reserve spare capacity.
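As a quick illustration of what the constant is for, the sketch below sorts a list with it. The class CaseInsensitiveSketch and the helper sortIgnoringCase are names invented for this example; only String.CASE_INSENSITIVE_ORDER and List.sort come from the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CaseInsensitiveSketch {
    // Hypothetical helper: returns a sorted copy, ignoring upper/lower case.
    static List<String> sortIgnoringCase(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(String.CASE_INSENSITIVE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("banana", "apple", "Cherry");
        // Natural ordering would put "Cherry" first, because 'C' < 'a'.
        System.out.println(sortIgnoringCase(words)); // [apple, banana, Cherry]
    }
}
```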
String.java
public int length() {
    return value.length;
}
isEmpty was added in JDK 1.6.
String.java
public boolean isEmpty() {
    return value.length == 0;
}
It does the same thing as length() == 0, but it is a very Java-like method: you get a boolean directly instead of comparing against a number.
charAt returns a single character, and toCharArray returns all the characters at once as an array.
String.java
public char charAt(int index) {
    if ((index < 0) || (index >= value.length)) {
        throw new StringIndexOutOfBoundsException(index);
    }
    return value[index];
}

public char[] toCharArray() {
    // Cannot use Arrays.copyOf because of class initialization order issues
    char result[] = new char[value.length];
    System.arraycopy(value, 0, result, 0, value.length);
    return result;
}
charAt just indexes into the array, whereas toCharArray allocates another array of the same size and copies into it. Unless there is a special reason, it is better to loop up to length() with a for statement and access each character with charAt.
By the way, I have seen code elsewhere that uses str.split("") to break a string into single-character strings, but please don't do that. It creates a separate String instance for every character in the string.
There is an intern method. To be honest, I would rather people not use it. The implementation is a native method. Presumably it unifies the references internally so that equal strings share one instance.
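The difference can be sketched as follows. CharAccessSketch and count are hypothetical names used only for this example.

```java
class CharAccessSketch {
    // Hypothetical helper: counts occurrences of target using charAt,
    // so no copy of the backing array is made.
    static int count(String s, char target) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) == target) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        String s = "banana";
        System.out.println(count(s, 'a')); // 3

        // toCharArray returns a fresh copy: changing it leaves s untouched.
        char[] copy = s.toCharArray();
        copy[0] = 'B';
        System.out.println(s); // banana
    }
}
```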
String.java
public native String intern();
Here is how intern is used ...
Main.java
public class Main {
    public static void main(String[] args) {
        String s01 = "abc";
        String s02 = "abcdef".substring(0, 3);
        System.out.println(s01 == s02);
        System.out.println(s01.equals(s02));
        String s03 = s02.intern();
        System.out.println(s01 == s03);
        System.out.println(s01.equals(s03));
    }
}
When you run it ...
false
true
true
true
The string produced by substring is a different reference from the literal, but after intern the reference is the same.
Here is the equals method.
String.java
public boolean equals(Object anObject) {
    if (this == anObject) {
        return true;
    }
    if (anObject instanceof String) {
        String anotherString = (String)anObject;
        int n = value.length;
        if (n == anotherString.value.length) {
            char v1[] = value;
            char v2[] = anotherString.value;
            int i = 0;
            while (n-- != 0) {
                if (v1[i] != v2[i])
                    return false;
                i++;
            }
            return true;
        }
    }
    return false;
}
Note the this == anObject check at the beginning. Rather than interning everything and always comparing with the == operator, just use equals everywhere: if the strings are interned and happen to be the same instance, equals returns quickly anyway.
equals returns early when the lengths differ, or as soon as a mismatch is found while comparing from the start. When only the last character differs, or when the strings are equal, the whole array is scanned. Comparing references is therefore the fastest, but it also seems it might help to compare hashCode after the length check and return false on a mismatch. If the hash has not been calculated yet, though, the characters end up being looped over up to twice.
indexOf has been around for a long time, but have you ever wondered why its argument is int ch?
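The hashCode idea can be sketched with the public API alone. This is only an illustration of the thought above, not how the JDK implements equals; EqualsSketch and equalsWithHashCheck are hypothetical names.

```java
class EqualsSketch {
    // Hypothetical helper, not JDK code: reference check, length check,
    // hash check, and only then the character-by-character comparison.
    static boolean equalsWithHashCheck(String a, String b) {
        if (a == b) {
            return true;
        }
        if (a == null || b == null) {
            return false;
        }
        if (a.length() != b.length()) {
            return false;
        }
        // Different hashes guarantee different contents. Equal hashes prove
        // nothing, so equals still has to run afterwards.
        if (a.hashCode() != b.hashCode()) {
            return false;
        }
        return a.equals(b);
    }

    public static void main(String[] args) {
        System.out.println(equalsWithHashCheck("abc", new String("abc"))); // true
        System.out.println(equalsWithHashCheck("abc", "abd"));             // false
        // "Aa" and "BB" share the hash 2112, so this falls through to equals.
        System.out.println(equalsWithHashCheck("Aa", "BB"));               // false
    }
}
```

Whether this pays off depends on the hashes already being cached. When they are not, calling hashCode walks the characters once per string before equals walks them again.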
String.java
public int indexOf(int ch, int fromIndex) {
    final int max = value.length;
    if (fromIndex < 0) {
        fromIndex = 0;
    } else if (fromIndex >= max) {
        // Note: fromIndex might be near -1>>>1.
        return -1;
    }
    if (ch < Character.MIN_SUPPLEMENTARY_CODE_POINT) {
        // handle most cases here (ch is a BMP code point or a
        // negative value (invalid code point))
        final char[] value = this.value;
        for (int i = fromIndex; i < max; i++) {
            if (value[i] == ch) {
                return i;
            }
        }
        return -1;
    } else {
        return indexOfSupplementary(ch, fromIndex);
    }
}

private int indexOfSupplementary(int ch, int fromIndex) {
    if (Character.isValidCodePoint(ch)) {
        final char[] value = this.value;
        final char hi = Character.highSurrogate(ch);
        final char lo = Character.lowSurrogate(ch);
        final int max = value.length - 1;
        for (int i = fromIndex; i < max; i++) {
            if (value[i] == hi && value[i + 1] == lo) {
                return i;
            }
        }
    }
    return -1;
}
Oh, so it supports surrogate pairs. indexOfSupplementary searches for two consecutive chars: the first (high surrogate) is in the range U+D800 to U+DBFF, and the second (low surrogate) is in the range U+DC00 to U+DFFF.
lastIndexOf simply searches from the back, and it has a counterpart for surrogate pairs called lastIndexOfSupplementary. Surrogate pair support seems to have arrived in JDK 1.5; there was no MIN_SUPPLEMENTARY_CODE_POINT in the JDK 1.4 source. Even so, the argument of indexOf was already int ch back in JDK 1.4. That kind of foresight is impressive.
I doubt anyone in the world actually writes "abc".toString(), but ...
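A small sketch of what this looks like from the caller's side, using U+1F600 as an arbitrary supplementary code point. SurrogateSketch and wrap are names invented for this example.

```java
class SurrogateSketch {
    // Hypothetical helper: builds "ab" + <code point> + "c".
    static String wrap(int codePoint) {
        return "ab" + new String(Character.toChars(codePoint)) + "c";
    }

    public static void main(String[] args) {
        int cp = 0x1F600; // outside the BMP, so it needs two chars
        String s = wrap(cp);

        System.out.println(s.length());     // 5: 'a', 'b', high, low, 'c'
        System.out.println(s.indexOf(cp));  // 2, found via the surrogate pair
        System.out.println(s.indexOf('c')); // 4
        System.out.println(Integer.toHexString(Character.highSurrogate(cp))); // d83d
        System.out.println(Integer.toHexString(Character.lowSurrogate(cp)));  // de00
    }
}
```

Because the parameter is an int, the same indexOf call accepts both a plain char such as 'c' and a full code point such as 0x1F600.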
String.java
public String toString() {
    return this;
}
It returns itself.
It is hard to tell whether the String class is a runtime library class or part of the language specification. For example, the no-argument constructor relies on the string literal "", which is itself already a String ...
String.java
public String() {
    this.value = "".value;
}