How to search for a string value in a specific column of a pandas DataFrame and, if it is present, output the matching row?

Devarshi Sengupta · Jun 18, 2017

I wish to search a database that I have in a .pkl file.

I have loaded the .pkl file and stored it in a variable named load_data.

Now, I need to accept a string input using raw_input and search for that string in one specific column, 'SMILES', of my dataset.

If the string matches, I need to display the whole row, i.e. all column values corresponding to that row.

Is that possible, and if so, how should I go about it?

Answer

jezrael · Jun 18, 2017

Use boolean indexing that returns all matching rows:

import pandas as pd

# sample data; 'SMILES' is the column to search
df = pd.DataFrame({'SMILES': ['a', 'dd b', 'f'],
                   'a': [1, 3, 4],
                   'c': [1, 2, 0]})
print(df)
  SMILES  a  c
0      a  1  1
1   dd b  3  2
2      f  4  0

If you need an exact match on the whole string:

# use raw_input in Python 2, input in Python 3
a = input('Enter String for SMILES columns: ')
# Enter String for SMILES columns: f
print(df[df['SMILES'] == a])
  SMILES  a  c
2      f  4  0
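
When the entered string is not present, the boolean mask selects nothing and the result is an empty DataFrame. The following is a minimal sketch of checking for that case. The helper name `find_exact` is hypothetical (it is not part of pandas), and the sample frame simply mirrors the one above.

```python
import pandas as pd

def find_exact(df, column, value):
    """Return all rows of df whose `column` equals `value` exactly."""
    return df[df[column] == value]

df = pd.DataFrame({'SMILES': ['a', 'dd b', 'f'],
                   'a': [1, 3, 4],
                   'c': [1, 2, 0]})

hit = find_exact(df, 'SMILES', 'f')      # one matching row
miss = find_exact(df, 'SMILES', 'zzz')   # no matching row

for label, result in [('f', hit), ('zzz', miss)]:
    if result.empty:
        # .empty is True when the selection contains no rows
        print('No row matches', repr(label))
    else:
        print(result)
```

Testing `.empty` avoids printing a bare header when nothing matches.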

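Or, if you need to match a substring, use str.contains. It treats the pattern as a regular expression by default, so pass regex=False to match the input literally (SMILES strings often contain characters such as parentheses and brackets, which are special in regular expressions):

a = input('Enter String for SMILES columns: ')
# Enter String for SMILES columns: b
print(df[df['SMILES'].str.contains(a, regex=False)])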
  SMILES  a  c
1   dd b  3  2
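
For real SMILES data, a literal substring search might look like the sketch below. It assumes the loaded data is a DataFrame held in `load_data`, as in the question. The helper name `find_substring`, the `name` column, and the sample values are made up for illustration. `regex=False` and `na=False` are existing parameters of `Series.str.contains`.

```python
import pandas as pd

def find_substring(df, column, text):
    """Return rows whose `column` contains `text` as a literal substring."""
    # regex=False: treat `text` literally, so '(' or '[' are not regex syntax.
    # na=False: rows with a missing value in `column` count as non-matches.
    mask = df[column].str.contains(text, regex=False, na=False)
    return df[mask]

# Stand-in for the DataFrame loaded from the .pkl file (illustrative values).
load_data = pd.DataFrame({
    'SMILES': ['CC(=O)O', 'c1ccccc1', None, 'CCO'],
    'name': ['acetic acid', 'benzene', 'unknown', 'ethanol'],
})

result = find_substring(load_data, 'SMILES', '(=O)')
print(result)
```

The match is case-sensitive by default, which matters for SMILES, where lowercase atom symbols denote aromatic atoms.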