Simply say that the bit between the delimiters is not allowed to contain
any delimiters. You just have to watch out for escaped delimiters.
Using EMACS regexp notation, "[^"]*" will match any text between two successive
quotes. This is good enough for HTML, since an escaped quote is &quot;
rather than \".
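
As a rough illustration, here is the same pattern in Python (my choice of language, not anything from the original question; Python's re module happens to accept this pattern unchanged). The name find_quoted is made up for the example.

```python
import re

# Same idea as the EMACS pattern: an opening quote, any run of
# characters that are not quotes, then the closing quote.
SIMPLE_QUOTED = re.compile(r'"[^"]*"')

def find_quoted(text):
    """Return every double-quoted span in text, quotes included."""
    return SIMPLE_QUOTED.findall(text)

if __name__ == "__main__":
    tag = '<a href="x.html" title="say &quot;hi&quot;">'
    for span in find_quoted(tag):
        print(span)
```

Because HTML writes an embedded quote as &quot;, the title attribute above still comes out as one span.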
For other languages that use backslashes for escaped characters,
you would have to make sure that
neither quote is preceded by an odd number of backslashes.
In EMACS, \(^\|[^\\]\)\(\\\\\)*" will match any unescaped quote, together with
the even run of backslashes (and the one character) in front of it.
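
A sketch of the same test in Python, for anyone who wants to try it. This is an adaptation rather than a literal translation: it swaps the leading \(^\|[^\\]\) group for a negative lookbehind, so the character before the backslashes is checked but not consumed, which lets two adjacent quotes both be found. The function name is hypothetical.

```python
import re

# Not preceded by a backslash, then zero or more backslash PAIRS,
# then the quote itself (captured so its position can be reported).
# An odd run of backslashes can never satisfy this, so \" is skipped.
UNESCAPED_QUOTE = re.compile(r'(?<!\\)(?:\\\\)*(")')

def unescaped_quote_positions(text):
    """Return the index of every quote that is not backslash-escaped."""
    return [m.start(1) for m in UNESCAPED_QUOTE.finditer(text)]

if __name__ == "__main__":
    sample = r'say "a \"quoted\" word" here'
    print(unescaped_quote_positions(sample))
```

Pairing up the positions it returns gives the start and end of each string literal, assuming the quotes in the input are balanced.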
PS: Does anyone know the location of an HTML parser? C++ preferred, but C
accepted. There's probably a Java one around, no?