I have a DataFrame in pandas:
import pandas as pd
df = pd.DataFrame(['AA', 'BB', 'CC'], columns = ['value'])
I want to iterate over the rows in df. For each row I want that row's value and the next row's value.
Something like this (it does not work):
for i, row in df.iterrows():
    print(row['value'])
    i1, row1 = next(df.iterrows())
    print(row1['value'])
As a result I want:
'AA'
'BB'
'BB'
'CC'
'CC'
*Wrong index error here
At this point I have a messy way to solve this:
for i in range(0, df.shape[0]):
    print(df.iloc[i]['value'])
    print(df.iloc[i+1]['value'])
Is there a more efficient way to solve this?
Firstly, your "messy way" is OK. There's nothing wrong with indexing into the DataFrame by position, and it will not be too slow; iterrows() itself isn't terribly fast.
A version of your first idea that would work would be:
row_iterator = df.iterrows()
_, last = next(row_iterator)  # take first item from row_iterator
for i, row in row_iterator:
    print(last['value'])
    print(row['value'])
    last = row
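As a quick check of this pattern, here is a minimal sketch that collects the pairs in a list instead of printing them. The function name consecutive_values is made up for illustration, and the default column name 'value' is taken from the question.

```python
import pandas as pd


def consecutive_values(df, column="value"):
    """Return (current, next) pairs of `column`, walking the rows once."""
    row_iterator = df.iterrows()
    try:
        _, last = next(row_iterator)  # first row becomes the initial "previous" row
    except StopIteration:
        return []  # empty DataFrame: there are no pairs
    pairs = []
    for _, row in row_iterator:
        pairs.append((last[column], row[column]))
        last = row  # remember this row for the next iteration
    return pairs


df = pd.DataFrame(["AA", "BB", "CC"], columns=["value"])
print(consecutive_values(df))  # [('AA', 'BB'), ('BB', 'CC')]
```

Because the last row has no successor, it only ever appears as the second element of a pair, so no index error is raised at the end.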
The second method can do something similar to save one lookup into the DataFrame per iteration:
last = df.iloc[0]
for i in range(1, df.shape[0]):
    row = df.iloc[i]  # look up each row only once
    print(last['value'])
    print(row['value'])
    last = row
When speed is critical you can always try both and time the code.
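A rough sketch of how such a timing comparison might look, using the standard library's timeit. The function names with_iterrows and with_iloc, the 2000-row test frame, and the repeat count are all assumptions chosen for illustration; real numbers will depend on your data and pandas version.

```python
import timeit

import pandas as pd

# Larger hypothetical frame so the timings are measurable.
df = pd.DataFrame({"value": [str(n) for n in range(2000)]})


def with_iterrows(df):
    """Pair each row with the next one by remembering the previous row."""
    row_iterator = df.iterrows()
    _, last = next(row_iterator)
    pairs = []
    for _, row in row_iterator:
        pairs.append((last["value"], row["value"]))
        last = row
    return pairs


def with_iloc(df):
    """Pair each row with the next one using positional indexing."""
    last = df.iloc[0]
    pairs = []
    for i in range(1, df.shape[0]):
        row = df.iloc[i]
        pairs.append((last["value"], row["value"]))
        last = row
    return pairs


# Both approaches should agree before their speed is compared.
assert with_iterrows(df) == with_iloc(df)

for func in (with_iterrows, with_iloc):
    seconds = timeit.timeit(lambda: func(df), number=5)
    print(f"{func.__name__}: {seconds:.3f} s for 5 runs")
```

Collecting the pairs in a list rather than printing them keeps console I/O out of the measurement.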