Select by partial string from a pandas DataFrame

euforia · Jul 5, 2012 · Viewed 732.9k times

I have a DataFrame with 4 columns of which 2 contain string values. I was wondering if there was a way to select rows based on a partial string match against a particular column?

In other words, a function or lambda function that would do something like

re.search(pattern, cell_in_question) 

returning a boolean. I am familiar with the syntax of df[df['A'] == "hello world"], but I can't seem to find a way to do the same with a partial string match, say 'hello'.

Would someone be able to point me in the right direction?

Answer

Garrett · Jul 17, 2012

Based on GitHub issue #620, it looks like you'll soon be able to do the following:

df[df['A'].str.contains("hello")]
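For a concrete picture, here is a minimal sketch of that one-liner on a made-up frame. The column names and values are assumptions for illustration only, not the asker's data. Note that the match is case-sensitive by default, so "HELLO" is not selected.

```python
import pandas as pd

# Hypothetical data; column names and values are made up for illustration.
df = pd.DataFrame({
    "A": ["hello world", "goodbye", "say hello", "HELLO"],
    "B": [1, 2, 3, 4],
})

# Boolean mask: True where column A contains the substring "hello".
mask = df["A"].str.contains("hello")

# Index the frame with the mask to keep only the matching rows.
matches = df[mask]
print(matches)
```

The mask is an ordinary boolean Series, so it can be combined with other conditions using & and | just like the df['A'] == "hello world" comparison.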

Update: vectorized string methods (i.e., Series.str) are available in pandas 0.8.1 and up.
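
A few variations that tend to come up in practice, sketched under the assumption of a reasonably recent pandas release (the case, na, and regex keywords of str.contains may not exist in very old versions such as 0.8.1). The data is again hypothetical. The last part shows the re.search-style approach from the question, applied element by element, which is typically slower than the vectorized call but works with any Python predicate.

```python
import re
import pandas as pd

# Hypothetical data, including a missing value in the string column.
df = pd.DataFrame({
    "A": ["hello world", None, "Hello there", "cost (usd)", "bye"],
})

# case=False ignores case; na=False treats missing values as non-matches,
# so the resulting mask is purely boolean and safe to index with.
any_case = df[df["A"].str.contains("hello", case=False, na=False)]

# By default the pattern is treated as a regular expression.
# regex=False matches the text literally, so "(" and ")" are plain characters.
literal = df[df["A"].str.contains("(usd)", regex=False, na=False)]

# The re.search idea from the question, applied per cell.
# The isinstance check skips missing (non-string) values.
pattern = re.compile(r"^hello")
by_search = df[df["A"].map(lambda s: isinstance(s, str) and bool(pattern.search(s)))]

print(any_case)
print(literal)
print(by_search)
```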