String manipulation is a common task in Java programming, and splitting a string into smaller parts is one of its fundamental operations. Java offers several ways to do this, each with its own advantages. In this article, we will look at how to split a string in Java using three different methods.
By understanding these methods, you can choose the most suitable approach for your specific requirements and improve your string manipulation skills.
How to Split a String in Java?
There are three common ways to split a string in Java:
split()
StringTokenizer Class
Pattern and Matcher Class
Method 1: split() method
The split() method of the String class splits a string into substrings around each match of a specified delimiter and returns the resulting substrings as an array.
The method signature of split() is as follows:
public String[] split(String regex)
The split() method takes a regular expression pattern as its argument and uses that pattern as the delimiter to split the input string. It then returns an array of strings, where each element of the array represents a substring obtained by splitting the input string.
Here's an example program that demonstrates how to split a string by words using the split() method:
public class StringSplitExample {
    public static void main(String[] args) {
        String text = "Hello, World! Java Programming";
        // Split the string by words using whitespace as the delimiter
        String[] words = text.split("\\s+");
        // Print each word
        for (String word : words) {
            System.out.println(word);
        }
    }
}
Output:
Hello,
World!
Java
Programming
In the above program, the input string "Hello, World! Java Programming" is split into words using the regular expression \s+ as the delimiter. This regular expression matches one or more whitespace characters, so the string is split at each run of whitespace and the words are returned as an array.
Note that the pattern is written as "\\s+" in the source code. The backslash is an escape character in Java string literals, so it must be doubled for the regular expression engine to receive a single backslash followed by s.
Pros:
Easy to use and widely supported.
Can split strings based on various delimiters.
Cons:
Returns only the text between delimiters, so extracting the matches themselves or working with capture groups requires Pattern and Matcher.
Regular expression-based delimiters may require escaping special characters.
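The escaping issue mentioned above can be illustrated with a small sketch. The class name EscapedDelimiterExample and its helper methods are made up for this illustration; split() and Pattern.quote() are standard library calls. It shows what happens when a dot is used as an unescaped delimiter, and two ways to treat the delimiter literally.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class EscapedDelimiterExample {

    // Splits on a literal dot by escaping it in the regular expression.
    static String[] splitOnDot(String input) {
        return input.split("\\.");
    }

    // Splits on an arbitrary literal delimiter (hypothetical helper).
    // Pattern.quote() escapes any regex metacharacters in the delimiter.
    static String[] splitOnLiteral(String input, String delimiter) {
        return input.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String version = "1.8.0";

        // An unescaped "." matches every character, so every piece is empty.
        // split() drops trailing empty strings, leaving an empty array.
        System.out.println(version.split(".").length);                      // 0

        System.out.println(Arrays.toString(splitOnDot(version)));           // [1, 8, 0]
        System.out.println(Arrays.toString(splitOnLiteral("a|b|c", "|")));  // [a, b, c]
    }
}
```

Pattern.quote() is usually the safer choice when the delimiter comes from user input or configuration, because the caller does not need to know which characters are special.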
Method 2: StringTokenizer class
The StringTokenizer class in Java is a legacy class that is used to break a string into tokens (substrings) based on a specified delimiter. It provides methods to iterate over these tokens and extract them one by one. The default delimiters used by StringTokenizer are whitespace characters.
Here's an example program that uses StringTokenizer to split a string into words:
import java.util.StringTokenizer;

public class StringTokenizerExample {
    public static void main(String[] args) {
        String text = "Hello, World! Java Programming";
        StringTokenizer tokenizer = new StringTokenizer(text);
        System.out.println("Words in the string:");
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken();
            System.out.println(word);
        }
    }
}
In this program, the string "Hello, World! Java Programming" is split into words using StringTokenizer. Each word is printed on a separate line.
Output:
Words in the string:
Hello,
World!
Java
Programming
Note that by default, StringTokenizer considers whitespace as the delimiter. However, you can also specify custom delimiters by passing them as the second argument to the StringTokenizer constructor. That argument is treated as a set of delimiter characters, not as a single delimiter string. For example, to split a comma-separated string, you can modify the program as follows:
import java.util.StringTokenizer;

public class StringTokenizerExample {
    public static void main(String[] args) {
        String text = "Hello, World, Java, Programming";
        // Treat both ',' and ' ' as delimiter characters; with "," alone,
        // every token after the first would start with a space
        StringTokenizer tokenizer = new StringTokenizer(text, ", ");
        System.out.println("Words in the string:");
        while (tokenizer.hasMoreTokens()) {
            String word = tokenizer.nextToken();
            System.out.println(word);
        }
    }
}
Output:
Words in the string:
Hello
World
Java
Programming
In this modified program, the string is split into words wherever a comma or a space occurs. Each word is printed on a separate line.
Pros:
Allows custom delimiter characters and can optionally return the delimiters themselves as tokens.
Efficient for simple splitting tasks.
Cons:
Legacy class with limited functionality; its documentation recommends split() or the java.util.regex package for new code.
Does not support regular expression-based delimiters.
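One practical difference between the first two methods is how they treat empty fields. The following sketch compares them on the same input; the class name EmptyTokenComparison and the tokenize() helper are invented for this illustration. StringTokenizer never returns empty tokens, while split() keeps an empty string between two adjacent delimiters.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class EmptyTokenComparison {

    // Collects every token StringTokenizer produces for the given
    // set of delimiter characters (hypothetical helper).
    static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(input, delimiters);
        while (tokenizer.hasMoreTokens()) {
            tokens.add(tokenizer.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "red,,blue";

        // StringTokenizer skips the empty field between the two commas.
        System.out.println(tokenize(line, ","));               // [red, blue]

        // split() keeps the empty field in the middle.
        System.out.println(Arrays.toString(line.split(",")));  // [red, , blue]
    }
}
```

If the position of each field matters, as in comma-separated data, split() is likely the better fit because the empty field is preserved.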
Method 3: Regular Expression (Pattern and Matcher classes)
The Pattern and Matcher classes are part of the java.util.regex package and are used for working with regular expressions. Here's a brief explanation of each class:
Pattern class:
The Pattern class represents a compiled regular expression pattern.
It provides methods to compile regular expressions into patterns and match them against input strings.
The compile() method is used to compile a regular expression into a Pattern object.
Other methods, like matcher(), are used to obtain a Matcher object for matching against an input string.
Matcher class:
The Matcher class is used to match a Pattern against an input string.
It provides methods to perform various matching operations, such as finding matches, extracting matched groups, and replacing matched text.
The matches() method is commonly used to check if the entire input sequence matches the pattern.
Other methods, like find(), group(), and replaceAll(), allow more advanced matching and manipulation of the input string.
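The methods listed above can be seen side by side in a short sketch. The class name MatcherMethodsExample, the helper methods, and the digit pattern are chosen only for illustration; compile(), matcher(), matches(), find(), group(), and replaceAll() are the standard java.util.regex calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherMethodsExample {

    // Compiled once and reused; matches a run of one or more digits.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // matches(): true only if the entire input consists of digits.
    static boolean isAllDigits(String input) {
        return DIGITS.matcher(input).matches();
    }

    // find() + group(): returns the first run of digits, or null if none exists.
    static String firstNumber(String input) {
        Matcher matcher = DIGITS.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    // replaceAll(): replaces every run of digits with "#".
    static String maskNumbers(String input) {
        return DIGITS.matcher(input).replaceAll("#");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("2024"));            // true
        System.out.println(isAllDigits("Java 17"));         // false
        System.out.println(firstNumber("Java 17 and 21"));  // 17
        System.out.println(maskNumbers("Java 17 and 21"));  // Java # and #
    }
}
```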
Here's an example program that splits a string by words using the Pattern and Matcher classes:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordSplitExample {
    public static void main(String[] args) {
        String text = "Hello, World! This is a sample string.";
        // Define the pattern to match words (non-whitespace sequences)
        Pattern pattern = Pattern.compile("\\S+");
        // Create a Matcher object and associate it with the input string
        Matcher matcher = pattern.matcher(text);
        // Find and print all matches (words)
        while (matcher.find()) {
            String word = matcher.group();
            System.out.println(word);
        }
    }
}
In this example, the regular expression \S+ (written as "\\S+" in the source code) is used to match one or more non-whitespace characters. The program iterates over all matches found by the Matcher and prints each word.
Output:
Hello,
World!
This
is
a
sample
string.
Note: The regular expression \S+ defines a word as a consecutive sequence of non-whitespace characters, so punctuation stays attached to the words. You can adjust the pattern based on your specific word definition and splitting requirements.
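As one possible adjustment, the sketch below swaps in the pattern \w+ so that punctuation is left out of the extracted words. The class name WordPatternExample and the extractWords() helper are hypothetical, and \w+ is just one way to define a word: by default it matches only ASCII letters, digits, and underscores.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordPatternExample {

    // \w+ matches runs of ASCII letters, digits, and underscores,
    // so commas, exclamation marks, and periods are not part of a match.
    private static final Pattern WORD = Pattern.compile("\\w+");

    // Returns every match of WORD in the text, in order (hypothetical helper).
    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String text = "Hello, World! This is a sample string.";
        System.out.println(extractWords(text));
        // [Hello, World, This, is, a, sample, string]
    }
}
```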
Pros:
Provides powerful pattern-matching capabilities.
Can handle complex splitting scenarios.
Cons:
Regular expressions can be challenging to construct and understand.
Compiling and evaluating a regular expression adds overhead, which can become noticeable with large strings or complex patterns.
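One common way to reduce the compilation part of that cost, assuming the same delimiter is applied to many inputs, is to compile the Pattern once and reuse it through Pattern.split(). The sketch below illustrates the idea; the class name PrecompiledSplitExample and the splitWords() helper are made up for this example, and no specific speedup is claimed.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PrecompiledSplitExample {

    // Compiled once when the class is loaded, then shared by every call.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits a line on runs of whitespace using the precompiled pattern.
    static String[] splitWords(String line) {
        return WHITESPACE.split(line);
    }

    public static void main(String[] args) {
        String[] lines = { "alpha beta", "gamma\tdelta  epsilon" };
        for (String line : lines) {
            System.out.println(Arrays.toString(splitWords(line)));
        }
        // [alpha, beta]
        // [gamma, delta, epsilon]
    }
}
```

Pattern.split() follows the same rules as String.split() with the same regular expression, so the results are interchangeable.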
Conclusion
In this article, we have discussed three different methods to split strings in Java. Each method has its own advantages and disadvantages, so choose the one that best fits your requirements.
Happy Coding!