I have this script:
use 5.014;
use warnings;
use utf8;
binmode STDOUT, ':utf8';

my $str = "XYZ ΦΨΩ zyz φψω";

my @greek = ($str =~ /\p{Greek}/g);
say "greek: @greek";

my @upper = ($str =~ /\p{Upper}/g);
say "upper: @upper";

#my @upper_greek = ($str =~ /\p{Upper+Greek}/); # wrong
#say "upper+greek: @upper_greek";
Is it possible to combine multiple Unicode properties? E.g., how do I select characters that are both Upper and Greek?
Wanted output:
greek: Φ Ψ Ω φ ψ ω
upper: X Y Z Φ Ψ Ω
upper+greek: Φ Ψ Ω #<-- how to do this?
We can't use
/(?:\p{Greek}|\p{Upper})/ # Greek OR Upper
or
/[\p{Greek}\p{Upper}]/ # Greek OR Upper
since both match a character that is Greek or Upper, not one that is both.
One way of achieving an AND in a regex is to use lookarounds.
/\p{Greek}(?<=\p{Upper})/ # Greek AND Upper
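A minimal sketch of the lookbehind approach, applied to the string from the question. The sub name `upper_greek_lookbehind` is made up for illustration.

```perl
use 5.014;
use warnings;
use utf8;
binmode STDOUT, ':utf8';

# Hypothetical helper: returns every character that is both Greek and Upper.
sub upper_greek_lookbehind {
    my ($str) = @_;
    # Match one Greek character, then look back over the character just
    # matched and require that it is also uppercase.
    return $str =~ /\p{Greek}(?<=\p{Upper})/g;
}

my @found = upper_greek_lookbehind("XYZ ΦΨΩ zyz φψω");
say "upper+greek: @found";    # prints "upper+greek: Φ Ψ Ω"
```

The lookbehind consumes nothing, so each match is still exactly one character.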
Another way of getting an AND is to negate an OR. De Morgan's laws tell us
NOT( Greek AND Upper ) ⇔ NOT(Greek) OR NOT(Upper)
so
Greek AND Upper ⇔ NOT( NOT(Greek) OR NOT(Upper) )
This gives us
/[^\P{Greek}\P{Upper}]/ # Greek AND Upper
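The negated-class form depends on the capital `\P{...}`, which matches the complement of `\p{...}`. Below is a small sketch that checks a few single characters; the sub name `upper_greek_demorgan` is hypothetical.

```perl
use 5.014;
use warnings;
use utf8;
binmode STDOUT, ':utf8';

# Hypothetical helper: intersection of two properties via a negated class.
sub upper_greek_demorgan {
    my ($str) = @_;
    # [^\P{Greek}\P{Upper}] rejects anything that is non-Greek or
    # non-Upper, so only characters with both properties remain.
    return $str =~ /[^\P{Greek}\P{Upper}]/g;
}

for my $ch (qw( X Φ φ z )) {
    my $hit = upper_greek_demorgan($ch) ? "match" : "no match";
    say "$ch: $hit";
}
# prints:
# X: no match
# Φ: match
# φ: no match
# z: no match
```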
Since 5.18, there's an experimental feature you can use:
no warnings qw( experimental::regex_sets );
/(?[ \p{Greek} & \p{Upper} ])/ # Greek AND Upper
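As a rough illustration, here is the set syntax run against the string from the question, assuming Perl 5.18 or later. The second pattern, using `-` for set difference, is an addition for comparison and not part of the original question.

```perl
use 5.018;
use warnings;
no warnings qw( experimental::regex_sets );
use utf8;
binmode STDOUT, ':utf8';

my $str = "XYZ ΦΨΩ zyz φψω";

# Intersection: characters that are both Greek and Upper.
my @upper_greek = $str =~ /(?[ \p{Greek} & \p{Upper} ])/g;
say "upper+greek: @upper_greek";    # prints "upper+greek: Φ Ψ Ω"

# Difference: Greek characters that are not Upper.
my @other_greek = $str =~ /(?[ \p{Greek} - \p{Upper} ])/g;
say "greek-upper: @other_greek";    # prints "greek-upper: φ ψ ω"
```

Whitespace inside `(?[ ... ])` is ignored, so the operands can be spaced out for readability.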