regex - Regular Expression Split on character not in Qualified String (python)


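I could use some assistance with something that is probably easy for anyone familiar with regular expressions. I'm trying to parse more-or-less shop-brewed configuration files into a dictionary/JSON. I have Python code using string procedures or re.split() that works fine on everything I've tested it against; however, I know there are corner cases that could break it, and I would like to create generic regular expressions that better handle the logic, and to keep the same regex portable to the other languages (Perl, awk, C, etc.) we use at work, so that things stay consistent.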

I am looking to use either re.match() or re.split() in Python.

The patterns I'm looking for should do the following:

1) Split str on the first ? if that ? is not inside a substring qualified by single and/or double quotes.

strin: ''' foo = 'some',"stuff?",'that "could be?" nested?', ? still capture this? , "this?" '''
listout: ['''foo = 'some',"stuff?",'that "could be?" nested?', ''' , ''' still capture this? , "this?"''']

2) Split str on the first # if that # is not inside a substring qualified by single or double quotes, and the # does not come after the first unqualified ? (as per 1).

strin: ''' foo = 'some',"stuff?#, maybe 'nested#' " # #but comment capture ,'that "could be?#" nested#', ? still capture this?! , "this?! " '''
listout: ['''foo = 'some',"stuff?#, maybe 'nested#' " ''', ''' #but comment capture ,'that "could be?#" nested#', ? still capture this?! , "this?! "''']

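You could use re.split. The pattern is anchored at the start of the string, consumes quoted spans whole, and stops at the first ? outside quotes: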

>>> import re
>>> s = '''foo = 'some',"stuff?",'that "could be?" nested?', ? still capture this? , "this?"'''
>>> [i for i in re.split(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"?])*)\?', s) if i]
['foo = \'some\',"stuff?",\'that "could be?" nested?\', ', ' still capture this? , "this?"']

or re.findall:

>>> re.findall(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"?])*)\?(.*)', s)
[('foo = \'some\',"stuff?",\'that "could be?" nested?\', ', ' still capture this? , "this?"')]
>>> [j for i in re.findall(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"?])*)\?(.*)', s) for j in i]
['foo = \'some\',"stuff?",\'that "could be?" nested?\', ', ' still capture this? , "this?"']
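Since the re.split and re.findall calls share one pattern, it may be convenient to wrap it in a helper that takes the delimiter as a parameter. The following is only a sketch: the name split_first_unquoted is hypothetical, it assumes a single-character delimiter, and, like the pattern above, it does not handle backslash-escaped quotes. An unterminated quote before the delimiter means no split is made.

```python
import re


def split_first_unquoted(text, delim):
    """Split text at the first delim found outside '...' or "..." spans.

    Returns [before, after], or [text] when no unquoted delim exists.
    (Hypothetical helper; not part of the re module.)
    """
    if len(delim) != 1 or delim in "'\"":
        raise ValueError("delim must be a single non-quote character")
    d = re.escape(delim)
    # Prefix: any run of double-quoted spans, single-quoted spans, or
    # characters that are neither a quote nor the delimiter.
    pattern = re.compile(
        r'''((?:"[^"]*"|'[^']*'|[^'"%s])*)%s(.*)''' % (d, d),
        re.DOTALL,  # let the remainder span multiple lines
    )
    m = pattern.match(text)  # match() anchors at the start, like ^
    return [m.group(1), m.group(2)] if m else [text]


if __name__ == "__main__":
    line = '''foo = 'some',"stuff?", ? still capture this?'''
    print(split_first_unquoted(line, "?"))
```

The three alternatives in the prefix each start with a different character, so the regex engine never has to choose between them, which keeps backtracking cheap on long lines.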


You could do the same as above for the second question, replacing ? with # in the pattern.

>>> s = '''foo = 'some',"stuff?#, maybe 'nested#' " # #but comment capture ,'that "could be?#" nested#', ? still capture this?! , "this?! "'''
>>> [j for i in re.findall(r'^((?:"[^"]*"|\'[^\']*\'|[^\'"#])*)#(.*)', s) for j in i]
['foo = \'some\',"stuff?#, maybe \'nested#\' " ', ' #but comment capture ,\'that "could be?#" nested#\', ? still capture this?! , "this?! "']
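Note that the # pattern above still allows an unquoted ? to appear before the #, whereas requirement 2 asks that the # not come after the first unqualified ?. One possible way to enforce that, sketched here under the assumption that no split at all is wanted in that case, is to exclude ? from the unquoted character class as well. The names STRICT_COMMENT and split_comment are hypothetical.

```python
import re

# Same quoted-span alternatives as the answer's pattern, but an unquoted '?'
# is also excluded from the prefix, so a '#' that follows an unquoted '?'
# can never be reached by the match.
STRICT_COMMENT = re.compile(
    r'''^((?:"[^"]*"|'[^']*'|[^'"#?])*)#(.*)''',
    re.DOTALL,
)


def split_comment(text):
    """Return [before, after] for the first unquoted '#' that precedes any
    unquoted '?'; return [text] unchanged otherwise. (Hypothetical helper.)"""
    m = STRICT_COMMENT.match(text)
    return [m.group(1), m.group(2)] if m else [text]


if __name__ == "__main__":
    print(split_comment("a = 1 # note ? x"))  # '#' comes first: split
    print(split_comment("a = 1 ? x # y"))     # '?' comes first: no split
```

For the sample string in the question the result is the same as with the looser pattern, because every ? before the first unquoted # sits inside quotes.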
