This Java example shows how to split a string by comma. It also shows how to handle CSV records that contain a comma inside double quotes or parentheses.
How to split String by comma in Java?
You can split a string by comma in Java using the split method of the String class. The split method returns the parts of the string produced by the split operation as an array.
Example
    package com.javacodeexamples.stringexamples;

    public class StringSplitByCommaExample {

        public static void main(String[] args) {

            String strCSVRecord = "firstname, lastname, standard, totalmarks";

            //split string by comma
            String[] strParts = strCSVRecord.split(",");

            //print parts
            for (String part : strParts)
                System.out.println(part);
        }
    }

Output

    firstname
     lastname
     standard
     totalmarks
As you may have noticed, some of the array elements have a leading space. You can use the "\\s*,\\s*" regular expression to split the string by comma and drop the whitespace around each comma, where

    \\s - a whitespace character
    *   - zero or more times
    ,   - followed by a comma
    \\s - followed by a whitespace character
    *   - zero or more times
So basically we are looking for a comma that has zero or more leading and zero or more trailing whitespace characters. Once a match is found, the split method splits the string at the match. Since the whitespace is now part of the delimiter, it is not returned in the result (the split method does not return delimiters).
    String strCSVRecord = "firstname, lastname, standard, totalmarks";

    String[] strParts = strCSVRecord.split("\\s*,\\s*");

    for (String part : strParts)
        System.out.println(part);

Output

    firstname
    lastname
    standard
    totalmarks
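One detail the pattern above does not cover is whitespace before the first value or after the last one, because that whitespace is not next to a comma. The following is a minimal sketch, assuming it is acceptable to trim the whole record before splitting; the class name TrimmedCommaSplit and the helper splitTrimmed are hypothetical names used only for illustration.

```java
import java.util.Arrays;

public class TrimmedCommaSplit {

    // Hypothetical helper: trims the whole record first, then splits on
    // commas surrounded by optional whitespace.
    public static String[] splitTrimmed(String record) {
        return record.trim().split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String record = "  firstname , lastname,standard ,  totalmarks  ";
        // prints [firstname, lastname, standard, totalmarks]
        System.out.println(Arrays.toString(splitTrimmed(record)));
    }
}
```

Without the trim call, the first element would keep its two leading spaces and the last element its two trailing spaces.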
How to return empty fields?
Let’s change our CSV values a bit, as shown below.

    String strCSVRecord = "firstname, lastname, standard, totalmarks,,";

    String[] strParts = strCSVRecord.split("\\s*,\\s*");

    System.out.println("Total fields: " + strParts.length);
Output
    Total fields: 4
The string has a total of 6 values (the 5th and 6th are empty), but the split method returned only 4 parts. That is because, by default, the split method discards trailing empty strings. To have these values returned, we need to specify the limit parameter of the split method.
    public String[] split(String regex, int limit)
The limit parameter controls how many times the regex pattern is applied to the string. The default limit is zero, which applies the pattern as many times as possible but discards trailing empty strings. If the limit is positive, the pattern is applied at most limit - 1 times, so the array has at most limit elements and the last element holds the rest of the input. If the limit is negative, the pattern is applied as many times as possible and trailing empty strings are retained. Let’s apply a negative limit to the above code.
    String strCSVRecord = "firstname, lastname, standard, totalmarks,,";

    String[] strParts = strCSVRecord.split("\\s*,\\s*", -1);

    System.out.println("Total fields: " + strParts.length);
Output
    Total fields: 6
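To compare the three kinds of limit side by side, here is a small sketch that splits the same record with a zero, a positive, and a negative limit. The class name SplitLimitDemo, the helper splitWithLimit, and the sample record are made up for illustration; only String.split(String, int) comes from the JDK.

```java
import java.util.Arrays;

public class SplitLimitDemo {

    // Thin wrapper around String.split(regex, limit) so the three
    // cases below read the same way.
    public static String[] splitWithLimit(String record, int limit) {
        return record.split(",", limit);
    }

    public static void main(String[] args) {
        String record = "a,b,c,,";

        // zero: split everywhere, drop trailing empty strings
        System.out.println(Arrays.toString(splitWithLimit(record, 0)));  // [a, b, c]

        // positive: at most 2 elements, the last one holds the remainder
        System.out.println(Arrays.toString(splitWithLimit(record, 2)));  // [a, b,c,,]

        // negative: split everywhere, keep trailing empty strings
        System.out.println(Arrays.toString(splitWithLimit(record, -1))); // [a, b, c, , ]
    }
}
```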
How to ignore comma in parentheses?
Consider the input value given below for the code we just wrote.

    String strCSVRecord = "John,Mathew,12th,(85,66,53)";
And our expected output is:

    John
    Mathew
    12th
    (85,66,53)
Let’s run the pattern we have used so far on this input.

    String strCSVRecord = "John,Mathew,12th,(85,66,53)";

    String[] strParts = strCSVRecord.split("\\s*,\\s*");

    for (String part : strParts)
        System.out.println(part);
Output
    John
    Mathew
    12th
    (85
    66
    53)
Our pattern also matched the commas between the parentheses. We want to match only the commas that are not between parentheses. We need to rewrite our pattern as ",(?![^()]*\\))" where,

    ,      - a comma
    (?!)   - negative lookahead expression
    [^()]  - any character which is not "(" or ")"
    *      - zero or more times
    \\)    - followed by a closing parenthesis
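Basically, we are looking for a comma that is not followed by a closing parenthesis with no other parenthesis in between, which skips the commas inside the parentheses. Note that this pattern assumes the parentheses are balanced and not nested.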
    String strCSVRecord = "John,Mathew,12th,(85,66,53)";

    String[] strParts = strCSVRecord.split(",(?![^()]*\\))", -1);

    System.out.println("Total fields: " + strParts.length);

    for (String part : strParts)
        System.out.println(part);
Output
    Total fields: 4
    John
    Mathew
    12th
    (85,66,53)
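The lookahead pattern above looks only as far as the next parenthesis, so it would still split inside nested parentheses such as (85,(66,53),70), at the comma right after 85. If nested values can occur, one possible alternative is to scan the record and track the nesting depth instead of using a regular expression. The sketch below is only an illustration under that assumption; the class name ParenAwareSplit and the helper splitOutsideParens are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ParenAwareSplit {

    // Hypothetical helper: splits on commas that are outside any
    // parentheses by tracking the current nesting depth.
    // Empty fields, including trailing ones, are kept.
    public static List<String> splitOutsideParens(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);

            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }

            if (c == ',' && depth == 0) {
                // a top-level comma ends the current field
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }

        // add the last field
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        // prints [John, Mathew, 12th, (85,(66,53),70)]
        System.out.println(splitOutsideParens("John,Mathew,12th,(85,(66,53),70)"));
    }
}
```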
How to ignore comma in double quotes?
Sometimes CSV record values are enclosed in double quotes. The values themselves may contain commas, which we need to ignore while splitting the values by comma. We are going to use the "\"(,\")?" pattern where,

    \"      - a double quote character
    (,\")?  - optionally followed by a comma and a double quote (zero or one time)
Here is the example program.
    String strCSVRecord = "\"John\",\"Mathew\",\"12th\",\"Science, Maths\"";

    String[] strParts = strCSVRecord.split("\"(,\")?");

    System.out.println("Total fields: " + strParts.length);

    for (String part : strParts)
        System.out.println(part);
Output
    Total fields: 5

    John
    Mathew
    12th
    Science, Maths
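Note: The first element is an empty string because the record starts with a double quote, which the pattern matches as a delimiter. The split method keeps the empty string before such a leading match and removes only trailing empty strings by default.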
The above regular expression works when all values are enclosed in double quotes, but it fails when some values are enclosed in double quotes and others are not.
Consider the example string given below.
    String strCSVRecord = "\"John\",carter,\"Mathew\",\"12th\",\"Science, Maths\"";

    String[] strParts = strCSVRecord.split("\"(,\")?");

    System.out.println("Total fields: " + strParts.length);

    for (String part : strParts)
        System.out.println(part);
Output
    Total fields: 6

    John
    ,carter,
    Mathew
    12th
    Science, Maths
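Here is a more accurate version of the regular expression which handles this scenario: ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)". Basically, we are looking for a comma that is followed by zero or an even number of double quotes up to the end of the string, which means the comma is outside any quoted value. Note that this pattern keeps the enclosing double quotes in the returned values.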
    String strCSVRecord = "\"John\",carter,\"Mathew\",\"12th\",\"Science, Maths\"";

    String[] strParts = strCSVRecord.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)", -1);

    for (String part : strParts)
        System.out.println(part);
Output
    "John"
    carter
    "Mathew"
    "12th"
    "Science, Maths"
This example is a part of the Java String tutorial with examples.
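The lookahead pattern for quoted values leaves the enclosing double quotes in each value. If the quotes should be removed as well, one possible approach is a small character scanner rather than a regular expression. The sketch below is a hedged illustration, not a full CSV parser: the class name QuotedCsvSplit and the helper parseRecord are hypothetical, and it assumes that a doubled quote ("") inside a quoted value stands for one literal quote and that a record never spans multiple lines.

```java
import java.util.ArrayList;
import java.util.List;

public class QuotedCsvSplit {

    // Hypothetical helper: splits one CSV record on commas that are
    // outside double quotes and removes the enclosing quotes.
    // Assumption: "" inside a quoted value is an escaped quote.
    public static List<String> parseRecord(String record) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < record.length(); i++) {
            char c = record.charAt(i);

            if (c == '"') {
                if (inQuotes && i + 1 < record.length() && record.charAt(i + 1) == '"') {
                    // escaped quote inside a quoted value
                    current.append('"');
                    i++;
                } else {
                    // opening or closing quote, not part of the value
                    inQuotes = !inQuotes;
                }
            } else if (c == ',' && !inQuotes) {
                // a comma outside quotes ends the current field
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }

        // add the last field
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String strCSVRecord = "\"John\",carter,\"Mathew\",\"12th\",\"Science, Maths\"";

        for (String field : parseRecord(strCSVRecord))
            System.out.println(field);
    }
}
```

For the record used above, this prints John, carter, Mathew, 12th and Science, Maths on separate lines, without the double quotes.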