regex - C++ regexp for parsing a nested struct
I have a string that always follows this fixed format:
    {
      first nested string;
      {
        second nested string;
      }
    }
The nesting may be of arbitrary depth. Each sub-element is formatted as an opening brace, the content indented by 2 spaces relative to the previous level of nesting, and a closing brace. I want a regular expression that lets me obtain the nested data. For the example above, the result should be:
    first nested string;
    {
      second nested string;
    }
I wrote the following code, which can only parse strings that sit on one line, because the symbol '.' matches any character except a newline.
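    #include <regex>
    #include <string>
    using namespace std;

    int main() {
        // Non-greedy capture between the first '{' and the nearest '}'.
        regex regex("\\s*\\{\\s*(.*?)\\s*\\}\\s*");
        string testinput = "{\n"
                           "  first nested string;\n"
                           "  {\n"
                           "    second nested string;\n"
                           "  }\n"
                           "}\n";
        smatch match;
        if (regex_search(testinput, match, regex)) {
            auto result = match[1].str();
        }
    }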
What regular expression would let me retrieve the nested data? Thanks in advance.
The regex implementation in the C++ standard library does not support recursion, which is needed to match nested structures.
Like Wintermute said in the comments, nested structures such as this are not a regular language, and you need other tools.
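To illustrate what "other tools" can look like here, a brace-depth counter is enough to pull out the content of the outermost block. This is only a minimal sketch under some assumptions: the function name `outer_block_content` is made up for this example, it needs C++17 for `std::optional`, and it assumes braces never appear inside the strings themselves.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical helper (not from the question): returns the text between the
// first '{' and its matching '}', trimmed of surrounding whitespace, or
// std::nullopt if there is no '{' or the braces are unbalanced.
std::optional<std::string> outer_block_content(const std::string& input) {
    const std::size_t open = input.find('{');
    if (open == std::string::npos) return std::nullopt;

    int depth = 0;
    for (std::size_t i = open; i < input.size(); ++i) {
        if (input[i] == '{') {
            ++depth;
        } else if (input[i] == '}') {
            if (--depth == 0) {
                // Found the brace that closes the outermost block.
                const std::string inner = input.substr(open + 1, i - open - 1);
                const char* ws = " \t\r\n";
                const std::size_t first = inner.find_first_not_of(ws);
                if (first == std::string::npos) return std::string{};
                const std::size_t last = inner.find_last_not_of(ws);
                return inner.substr(first, last - first + 1);
            }
        }
    }
    return std::nullopt;  // ran out of input before the block closed
}
```

Called on the test input from the question, this should return everything between the outer braces, with the inner block left intact, which is the result the question asks for.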
FYI, you can use Boost.Regex or PCRE to match the following pattern:
    \{(?:[^{}]++|(?R))*\}
This is a pretty simple recursive pattern. Explanations:
- [^{}]++ matches anything except { or } possessively.
- (?R) recurses the entire pattern.
- The * quantifier is applied on top of the inner possessive ++ quantifier, which prevents catastrophic backtracking.
The thing is, this is only matching nested constructs. It won't do the parsing. Regexes aren't the right tool for such a job; a parser would be more appropriate.
If you still want to go the regex way, you'll have to expand the pattern to match your constructs more precisely. If you're using PCRE, you may want to use the callout mechanism to extract information from the pattern while the engine is performing the match. That said, just write a parser.
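For what "write a parser" could mean in practice, here is a rough recursive-descent sketch that turns the input into a tree. The `Node` and `Parser` names are invented for this example, and the grammar is an assumption read off the sample input: a block is `{ ... }`, and everything inside is either a nested block or a piece of text terminated by `;`.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical tree node: a leaf statement or a nested block.
struct Node {
    std::string text;            // statement text without the ';' (leaves only)
    std::vector<Node> children;  // nested elements (blocks only)
    bool is_block = false;
};

class Parser {
public:
    explicit Parser(std::string input) : src_(std::move(input)) {}

    // Parses exactly one top-level "{ ... }" block; throws on malformed input.
    Node parse() {
        skip_whitespace();
        Node root = parse_block();
        skip_whitespace();
        if (pos_ != src_.size()) throw std::runtime_error("trailing input after block");
        return root;
    }

private:
    Node parse_block() {
        expect('{');
        Node block;
        block.is_block = true;
        for (;;) {
            skip_whitespace();
            if (pos_ >= src_.size()) throw std::runtime_error("missing '}'");
            if (src_[pos_] == '}') { ++pos_; return block; }
            if (src_[pos_] == '{') {
                block.children.push_back(parse_block());  // recurse into nested block
            } else {
                block.children.push_back(parse_statement());
            }
        }
    }

    // Reads text up to the terminating ';' (assumed statement terminator).
    Node parse_statement() {
        const std::size_t end = src_.find_first_of(";{}", pos_);
        if (end == std::string::npos || src_[end] != ';')
            throw std::runtime_error("statement not terminated by ';'");
        Node leaf;
        leaf.text = src_.substr(pos_, end - pos_);
        pos_ = end + 1;
        return leaf;
    }

    void expect(char c) {
        if (pos_ >= src_.size() || src_[pos_] != c)
            throw std::runtime_error(std::string("expected '") + c + "'");
        ++pos_;
    }

    void skip_whitespace() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    std::string src_;
    std::size_t pos_ = 0;
};
```

With the test input from the question, `Parser(testinput).parse()` should give a root block with two children: a leaf for the first string and a nested block holding the second. Unbalanced braces throw `std::runtime_error` instead of silently matching something shorter, which is the main thing you gain over the regex.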