I have the following text samples:
text1: "the salary $34-$36"
text2: "the salary $34.50-$36.20"
text3: "the salary $45000-$34000"
text4: "the salary $45-$34k"
Whenever I find a pattern like $34-$36 or $34.50-$36.20, I need to add the word "hour" to the text, and whenever I find a pattern like $45000-$34000 or $45-$34k, I need to add the word "salary" to the text.
Can anyone help me solve this in R using regular expressions?
Thank you.
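For the first case, a regular expression with a negative lookahead might work:
# txt is the character vector holding the text samples
# add 'hour' after 1-2-digit $-values (with an optional decimal fraction)
# unless a further digit or 'k' follows; rejecting any digit (not just '000')
# keeps the engine from backtracking to a partial match like "$4" in "$45000"
gsub("(\\$\\d{1,2}(?:\\.\\d+)?(?![\\dk]))", "\\1 hour", txt, perl = TRUE)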
For the second case:
# add 'salary' after 1-2-digit $-values that are followed by 000 or k
# (e.g. $45000 or $34k)
gsub("(\\$\\d{1,2}(000|k))", "\\1 salary", txt, perl = TRUE)
I've only tested this on a few snippets. Your test cases may be more complex than mine.
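If the goal is one label per string rather than a tag after every amount, the two cases could be combined roughly as follows. This is only a sketch under several assumptions that the question does not confirm: the helper name `label_pay` is made up, the label is appended to the end of the string, and an amount counts as a salary when it has four or more digits or a trailing k.

```r
# Sample data copied from the question.
txt <- c(
  "the salary $34-$36",
  "the salary $34.50-$36.20",
  "the salary $45000-$34000",
  "the salary $45-$34k"
)

# Hypothetical helper (not from the original answer):
# returns "salary" if any $-amount has 4+ digits or a trailing "k",
# "hour" if the string has some other $-amount, and NA if it has none.
label_pay <- function(x) {
  has_amount <- grepl("\\$\\d", x, perl = TRUE)
  is_salary  <- grepl("\\$\\d+(?:\\.\\d+)?k|\\$\\d{4,}", x, perl = TRUE)
  ifelse(!has_amount, NA_character_, ifelse(is_salary, "salary", "hour"))
}

labels <- label_pay(txt)

# Assumption: the label goes at the end of the string;
# strings without any $-amount are left unchanged.
result <- ifelse(is.na(labels), txt, paste(txt, labels))
print(result)
```

With this approach text4 ("$45-$34k") is labelled once as a salary, because the k on either amount decides the label for the whole string. Whether that matches the intended rule depends on the real data.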