In the time I've spent wrestling with this, I could have coded an (inelegant) solution from scratch, but it's bugging me to distraction and I won't sleep until I figure it out.
I'm converting some data that was pulled from a mainframe into a text file, turning it into an XML document. The rules are:
- Each line consists of a number of elements, separated by commas.
- Each element can either be a double-quote surrounded string, or nothing.
- The last element has no following comma.
So a line of data could look like this:
"Value1","Value2",,,"Value3",,"Value4","Value5"
I thought this would be easily handled with a regular expression:
(".*") ,
As usual, it was anything but easy. It all started to unravel once it occurred to me that a line can actually end with a comma, if the final element is, in fact, nothing. I've gone back and forth with all sorts of grouping combinations, but the bottom line is I can't make it work, and I think I'm too close to it to see the (elegant) solution. Can somebody help? Thanks.
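For what it's worth, one likely reason the pattern above misbehaves is that `.*` is greedy, so a single match can run from the first quote on the line to the last one. Below is a minimal Python sketch of one way around that, not a definitive solution. It assumes a value never contains a double quote itself (the rules above don't say how an embedded quote would be escaped), and the names `QUOTED` and `split_line` are made up for illustration. It matches one quoted element at a time with `[^"]*` and treats anything else as an empty element, so a line ending in a comma yields a final empty element.

```python
import re

# One quoted element. Assumption: a value never contains a double quote.
QUOTED = re.compile(r'"([^"]*)"')

def split_line(line):
    """Split one line into elements; an empty element becomes None."""
    fields = []
    pos = 0
    while True:
        m = QUOTED.match(line, pos)
        if m:
            # Quoted element: keep the text between the quotes.
            fields.append(m.group(1))
            pos = m.end()
        else:
            # Nothing between the separators: an empty element.
            fields.append(None)
        if pos == len(line):
            return fields
        if line[pos] != ",":
            raise ValueError(f"expected a comma at position {pos}")
        pos += 1  # step over the separating comma

print(split_line('"Value1","Value2",,,"Value3",,"Value4","Value5"'))
print(split_line('"Value1",'))
```

The line should have its newline stripped before it is passed in. A comma inside a quoted value is kept as part of that value, because the split only happens on commas that sit between elements.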

Regex match from hell
Abhishek_SE
You may want to take a look at regexlib.com
Kamii47
http://www.codeproject.com/cs/database/CsvReader.asp
That's one.
If you want to write your own, go ahead, but I don't suggest doing that.
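To illustrate the "use an existing reader" route, here is a rough sketch in Python using its standard `csv` module. This is a swap-in for illustration, not the C# CsvReader from the link. The function name `line_to_xml` and the tag names `row` and `field` are invented for the example; the real XML layout would depend on what the target document needs.

```python
import csv
import xml.etree.ElementTree as ET

def line_to_xml(line, row_tag="row", field_tag="field"):
    """Parse one comma-separated line and wrap its elements in XML.

    The tag names are placeholders, not anything the original data defines.
    """
    # csv.reader takes an iterable of lines; an empty element comes back as ''.
    values = next(csv.reader([line]))
    row = ET.Element(row_tag)
    for value in values:
        ET.SubElement(row, field_tag).text = value
    return row

row = line_to_xml('"Value1","Value2",,,"Value3",,"Value4","Value5"')
print(ET.tostring(row, encoding="unicode"))
```

A line that ends in a comma simply produces one more empty element at the end, so the trailing-comma case needs no special handling here.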
cunyalen
NelG1
Thanks for the link. Looked at the website. None of the patterns is an exact match, but the following was the closest:
(("(\\\\|\\"|[^"])* ", )|([^"]* ,))|([^"]+)
It leaves the trailing commas, but otherwise does what I need. Thanks.
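If the leftover commas get in the way, they can be cleaned up after matching. The following is only a sketch in Python and does not run the pattern above. It assumes each match comes back as a token such as `"Value1",` or a bare `,` with no extra spaces, and `clean_token` is a made-up helper name.

```python
def clean_token(token):
    """Drop one trailing comma, then the surrounding double quotes."""
    if token.endswith(","):
        token = token[:-1]
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        token = token[1:-1]
    return token

# Hypothetical tokens, shaped the way the matches are assumed to look.
tokens = ['"Value1",', ',', '"Value5"']
print([clean_token(t) for t in tokens])
```

Only a single trailing comma is removed, so a comma that belongs to the quoted value stays in place.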