regex - Regular Expression Split on character not in Qualified String (python)
I could use some assistance from someone familiar with regular expressions. I'm trying to parse more-or-less home-brewed configuration files into a dictionary/JSON. I have Python code using string methods or re.split() that works fine on everything I've tested it against; however, I know there are corner cases that could break it, and I would like to create generic regular expressions that handle the logic better. I would also like the same regex to be portable to the other languages (Perl, awk, C, etc.) we use at work, so that things stay consistent.
I am looking to use either re.match() or re.split() in Python.
The patterns I'm looking for should do the following:
1) Split a string on the first ? if that ? is not inside a substring qualified by single and/or double quotes.
strin: ''' foo = 'some',"stuff?",'that "could be?" nested?', ? still capture this? , "this?" '''
listout: ['''foo = 'some',"stuff?",'that "could be?" nested?', ''' , ''' still capture this? , "this?"''']
2) Split a string on the first # if that # is not inside a substring qualified by single or double quotes, and the # does not come after the first unqualified ? (as per 1).
strin: ''' foo = 'some',"stuff?#, maybe 'nested#' " # #but comment capture ,'that "could be?#" nested#', ? still capture this?! , "this?! " '''
listout: ['''foo = 'some',"stuff?#, maybe 'nested#' " ''', ''' #but comment capture ,'that "could be?#" nested#', ? still capture this?! , "this?! "''']
You could use re.split:
>>> import re
>>> s = '''foo = 'some',"stuff?",'that "could be?" nested?', ? still capture this? , "this?"'''
>>> [i for i in re.split(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"?])*)\?', s) if i]
['foo = \'some\',"stuff?",\'that "could be?" nested?\', ', ' still capture this? , "this?"']
or re.findall:
>>> re.findall(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"?])*)\?(.*)', s)
[('foo = \'some\',"stuff?",\'that "could be?" nested?\', ', ' still capture this? , "this?"')]
>>> [j for i in re.findall(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"?])*)\?(.*)', s) for j in i]
['foo = \'some\',"stuff?",\'that "could be?" nested?\', ', ' still capture this? , "this?"']
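Both calls rely on the same idea: consume whole quoted spans first, so that a ? inside quotes is never seen as a delimiter. A commented sketch of that pattern using re.VERBOSE may make it easier to read. The name FIRST_UNQUOTED_QMARK is only illustrative, and the sketch assumes quotes are balanced and that there are no backslash-escaped quotes.

```python
import re

# Illustrative name; the pattern is the one used above, spelled out with comments.
FIRST_UNQUOTED_QMARK = re.compile(r"""
    ^(                 # group 1: everything before the delimiter
      (?:
          "[^"]*"      # a whole double-quoted span (may contain ' and ?)
        | '[^']*'      # a whole single-quoted span (may contain " and ?)
        | [^'"?]       # any one character that is not a quote or ?
      )*
    )
    \?                 # the first ? that sits outside every quoted span
""", re.VERBOSE)

s = '''foo = 'some',"stuff?",'that "could be?" nested?', ? still capture this? , "this?"'''

# split() yields the (empty) text before the match, then group 1, then the rest.
parts = FIRST_UNQUOTED_QMARK.split(s, maxsplit=1)
before, after = parts[1], parts[2]
```

Because the pattern is anchored with ^, it can only match once, so later ? characters in the remainder are left alone.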
You could do the same as above for the second question.
>>> s = '''foo = 'some',"stuff?#, maybe 'nested#' " # #but comment capture ,'that "could be?#" nested#', ? still capture this?! , "this?! "'''
>>> [j for i in re.findall(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"#])*)#(.*)', s) for j in i]
['foo = \'some\',"stuff?#, maybe \'nested#\' " ', ' #but comment capture ,\'that "could be?#" nested#\', ? still capture this?! , "this?! "']
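One caveat: as I read requirement 2, the # should not count if an unquoted ? comes before it, and a character class of [^'"#] lets an unquoted ? slip into the prefix. A minimal sketch that also excludes ? from the prefix, and uses re.match() as the question asked, might look like this. The names QUOTED, COMMENT_RE, QMARK_RE and split_first are hypothetical, and the sketch again assumes balanced quotes without backslash escapes.

```python
import re

# Hypothetical helper names; only the re module calls are real API.
QUOTED = r'"[^"]*"|\'[^\']*\''

# The prefix before '#' may not contain an unquoted '?' (requirement 2).
COMMENT_RE = re.compile(r'((?:' + QUOTED + r'|[^\'"#?])*)#(.*)', re.DOTALL)
# The prefix before '?' may not contain an unquoted '?' (requirement 1).
QMARK_RE = re.compile(r'((?:' + QUOTED + r'|[^\'"?])*)\?(.*)', re.DOTALL)

def split_first(pattern, text):
    """Return [before, after] for the first unquoted delimiter, else [text]."""
    m = pattern.match(text)  # match() anchors at the start, so no ^ is needed
    return [m.group(1), m.group(2)] if m else [text]

s = '''foo = 'some',"stuff?#, maybe 'nested#' " # #but comment capture ,'that "could be?#" nested#', ? still capture this?! , "this?! "'''
comment_parts = split_first(COMMENT_RE, s)

# Here the unquoted ? precedes the #, so nothing is split off.
unsplit = split_first(COMMENT_RE, 'a = 1 ? b # c')
```

Returning the whole string in a one-element list when nothing matches is a design choice of this sketch; returning None would work just as well if the caller prefers to test for it.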