I am extracting a pattern from a column of a DataFrame. Some rows have the word 'oscar' and some have the word 'oscars'. How do I extract this in a pandas DataFrame? Below is the line of code I use for the extraction. It gives an error.
df['oscar_awards_won'] = df['awards'].str.extract('won (\d+) (oscar[s]?)', expand=True).fillna(0)
I am sorry for not posting sample data. The sample data is the column awards. I am trying to extract the number of Oscars won.
awards
won 3 oscars. 234 wins & 312 nominations.
won 7 oscars. 215 wins & 169 nominations.
won 11 oscars. 174 wins & 113 nominations.
won 4 oscars. 122 wins & 213 nominations.
won 3 oscars. 92 wins & 150 nominations.
won 1 oscar. 91 wins & 95 nominations.
Is this what is needed?
import pandas as pd

# Small example frame; column 'c' mimics the awards column from the question
df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': ['is oscar', 'asd', 'oscars', 'not oscars q']})
df['c'] = ['won 3 oscars. 234 wins & 312 nominations.',
           'won 7 oscars. 215 wins & 169 nominations.',
           'won 11 oscar. 174 wins & 113 nominations.',
           'won 4 oscars. 122 wins & 213 nominations.']
This line:
df['c'].str.extract(r'won (\d+) oscar[s]?', expand=True).fillna(0)
gives:
    0
0   3
1   7
2  11
3   4
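A likely cause of the error in the question is that its pattern has two capture groups, `(\d+)` and `(oscar[s]?)`, so `str.extract` returns a DataFrame with two columns, which cannot be assigned to the single column `oscar_awards_won`. The following is a minimal sketch of one way around this, assuming only the count is needed: keep a single capture group, pass `expand=False` to get a Series, and convert the result to integers. The third row of the sample frame is a hypothetical example of a film with no Oscars, added only to show the fill value.

```python
import pandas as pd

df = pd.DataFrame({
    'awards': [
        'won 3 oscars. 234 wins & 312 nominations.',
        'won 1 oscar. 91 wins & 95 nominations.',
        '5 wins & 2 nominations.',  # hypothetical row with no Oscars
    ]
})

# One capture group with expand=False returns a Series of digit strings
# (NaN where the pattern does not match). 'oscars?' matches both forms.
oscars = df['awards'].str.extract(r'won (\d+) oscars?', expand=False)

# Convert to numbers, use 0 for rows without a match, and store as integers.
df['oscar_awards_won'] = pd.to_numeric(oscars).fillna(0).astype(int)

print(df['oscar_awards_won'].tolist())  # [3, 1, 0]
```

Converting with `pd.to_numeric` before `fillna(0)` avoids mixing the string digits with an integer fill value in the same column.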