I have a dataframe in which I want to remove all parentheses along with the text inside them.
I checked out: How can I remove text within parentheses with a regex?
where the answer for removing the text was
re.sub(r'\([^)]*\)', '', filename)
I tried this as well as
re.sub(r'\(.*?\)', '', filename)
However, I got an error: expected a string or buffer
When I tried using the column df['Column Name']
I got no item named 'Column Name'
I checked the dataframe using df.head()
and it showed up as a clean table with the column names I expected. However, when I use the regular
expression to remove the (stuff), it doesn't recognize the column name that I have.
I normally use
df['name'].str.replace(" ()","")
However, I want to remove the parentheses and what is inside them. How can I do this using either regex or pandas?
Thanks!
Here is the solution I used. Thanks for the help!
All['Manufacturer Standard Name'] = All['Manufacturer Standard Name'].str.replace(r"\(.*\)","")
df['name'].str.replace(r"\(.*\)","")
You can't run re functions directly on pandas objects; they expect a single string, which is why re.sub raised "expected a string or buffer". You have to apply them to each element inside the object. So Series.str.replace(r"\(.*\)", "") is roughly shorthand for Series.apply(lambda x: re.sub(r"\(.*\)", "", x)).
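To make the comparison concrete, here is a minimal sketch of both forms side by side. The sample data is made up for illustration, and the pattern is the `[^)]*` variant from the question with an optional leading `\s*` added (my assumption) to swallow the space before each parenthesis. Note that in pandas 2.0 and later `str.replace` treats the pattern as a literal string by default, so `regex=True` is passed explicitly.

```python
import re

import pandas as pd

# Hypothetical sample data; the column name "name" follows the question.
df = pd.DataFrame({"name": ["Acme (USA)", "Globex", "Initech (TX) Ltd (old)"]})

# Optional whitespace, then "(", any run of characters that are not ")", then ")".
pattern = r"\s*\([^)]*\)"

# Vectorised form: pandas applies the regex to every string in the column.
cleaned = df["name"].str.replace(pattern, "", regex=True)

# Element-wise form: call re.sub on each value ourselves.
looped = df["name"].apply(lambda s: re.sub(pattern, "", s))

# For comparison, the greedy pattern from the answer removes everything
# from the first "(" to the last ")" in each string.
greedy = df["name"].str.replace(r"\(.*\)", "", regex=True)

print(cleaned.tolist())  # ['Acme', 'Globex', 'Initech Ltd']
print(greedy.tolist())   # ['Acme ', 'Globex', 'Initech ']
```

The two forms give the same result on clean string data. They differ on missing values: `str.replace` passes NaN through, while the `apply` version would raise a TypeError because `re.sub` needs a string. If a cell can contain more than one parenthesised group, the `[^)]*` or `.*?` patterns are probably what you want rather than the greedy `.*`.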