regex - Python regular expression: split string into numbers and text/symbols


I want to split a string into sections of numbers and sections of text/symbols. My current code doesn't handle negative numbers or decimals, and it behaves weirdly, adding an empty list element at the end of the output.

import re
mystring = 'ad%5(6ag 0.33--9.5'
newlist = re.split('([0-9]+)', mystring)
print(newlist)

Current output:

['ad%', '5', '(', '6', 'ag ', '0', '.', '33', '--', '9', '.', '5', ''] 

Desired output:

['ad%', '5', '(', '6', 'ag ', '0.33', '-', '-9.5'] 

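Your issue is related to the fact that your regex captures one or more digits and adds them to the resulting list. The digits are used as the delimiter, so the parts before and after each match are also returned. If there are digits at the end of the string, the split produces an empty string after the last match, and that empty string is added to the resulting list.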
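To see the empty element in isolation, here is a minimal sketch using the question's original pattern on short inputs. The sample strings and variable names are made up for illustration.

```python
import re

# Digits at the very end: the text after the last match is empty.
parts_trailing = re.split(r'([0-9]+)', 'ab12')
print(parts_trailing)   # ['ab', '12', '']

# Digits at the very start: the text before the first match is empty.
parts_leading = re.split(r'([0-9]+)', '12ab')
print(parts_leading)    # ['', '12', 'ab']

# Text on both sides of the digits: no empty element appears.
parts_middle = re.split(r'([0-9]+)', 'ab12cd')
print(parts_middle)     # ['ab', '12', 'cd']
```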

You may split with a regex that matches float or integer numbers with an optional minus sign, and then remove the empty values:

result = re.split(r'(-?\d*\.?\d+)', s)
result = list(filter(None, result))  # drop empty strings; list() is needed in Python 3
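Applied to the string from the question, the approach can be sketched as follows. The names `number_pattern`, `pieces` and `cleaned` are chosen here for illustration, and a list comprehension stands in for `filter` to drop the empty strings.

```python
import re

mystring = 'ad%5(6ag 0.33--9.5'

# Capturing group keeps the matched numbers in the split result.
number_pattern = r'(-?\d*\.?\d+)'
pieces = re.split(number_pattern, mystring)

# Remove the empty strings produced when a number sits at either end.
cleaned = [piece for piece in pieces if piece]
print(cleaned)
# ['ad%', '5', '(', '6', 'ag ', '0.33', '-', '-9.5']
```

Of the two consecutive minus signs, the first stays as text because no digit follows it directly, while the second is taken as the sign of `-9.5`.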

To match negative/positive numbers with exponents, use

r'([+-]?\d*\.?\d+(?:[eE][-+]?\d+)?)'
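A short sketch of the exponent-aware pattern in use, assuming the character class is meant to accept both a lowercase and an uppercase `e`. The sample string and variable names are invented for illustration.

```python
import re

# Optional sign, optional integer part, optional dot, digits, optional exponent.
exp_pattern = r'([+-]?\d*\.?\d+(?:[eE][-+]?\d+)?)'

sample = 'x=1e5, y=-2.5E-3 and +7'
exp_pieces = [piece for piece in re.split(exp_pattern, sample) if piece]
print(exp_pieces)
# ['x=', '1e5', ', y=', '-2.5E-3', ' and ', '+7']
```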

The -?\d*\.?\d+ regex matches:

  • -? - optional minus
  • \d* - 0+ digits
  • \.? - optional literal dot
  • \d+ - 1 or more digits.
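The breakdown above can be checked with `re.fullmatch`, which succeeds only when the whole string fits the pattern. This is a small sketch; the candidate strings are chosen here to exercise each component.

```python
import re

number_re = re.compile(r'-?\d*\.?\d+')

candidates = ['5', '-9.5', '0.33', '.5', '5.', '-', '']

# Keep only the candidates that the pattern matches in full.
accepted = [c for c in candidates if number_re.fullmatch(c)]
print(accepted)   # ['5', '-9.5', '0.33', '.5']
```

`'5.'` is rejected because the pattern must end in at least one digit, and a lone `'-'` is rejected for the same reason.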
