Hello, I am trying to convert a list of X and Y coordinates to lines. I want to map this data by grouping on
the IDs and also on time. My code executes successfully as long as I group by
one column, but I run into errors with two columns. I referenced this question.
Here's some sample data:
ID X Y Hour
1 -87.78976 41.97658 16
1 -87.66991 41.92355 16
1 -87.59887 41.708447 17
2 -87.73956 41.876827 16
2 -87.68161 41.79886 16
2 -87.5999 41.7083 16
3 -87.59918 41.708485 17
3 -87.59857 41.708393 17
3 -87.64391 41.675133 17
Here's my code:
import pandas as pd
from geopandas import GeoDataFrame
from shapely.geometry import Point, LineString

df = pd.read_csv("snow_gps.csv", sep=';')
# zip the coordinates into Point objects and convert to a GeoDataFrame
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = GeoDataFrame(df, geometry=geometry)
# aggregate these points with groupby
geo_df = geo_df.groupby(['track_seg_point_id', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
geo_df = GeoDataFrame(geo_df, geometry='geometry')
Here is the error: ValueError: LineStrings must have at least 2 coordinate tuples
This is the final result I am trying to get:
ID Hour geometry
1 16 LINESTRING (-87.78976 41.97658, -87.66991 41.9...
1 17 LINESTRING (-87.78964000000001 41.976634999999...
1 18 LINESTRING (-87.78958 41.97663499999999, -87.6...
2 16 LINESTRING (-87.78958 41.976612, -87.669785 41...
2 17 LINESTRING (-87.78958 41.976624, -87.66978 41....
3 16 LINESTRING (-87.78958 41.97666, -87.6695199999...
3 17 LINESTRING (-87.78954 41.976665, -87.66927 41....
Any suggestions or ideas on how to group by multiple columns would be great.
Your code is fine; the problem is your data.
If you group by ID and Hour, there is only one point with an ID of 1 and an Hour of 17. A LineString must consist of two or more Points (it must have at least 2 coordinate tuples), so that single-point group raises the error. I added another point to your sample data:
ID X Y Hour
1 -87.78976 41.97658 16
1 -87.66991 41.92355 16
1 -87.59887 41.708447 17
1 -87.48234 41.677342 17
2 -87.73956 41.876827 16
2 -87.68161 41.79886 16
2 -87.5999 41.7083 16
3 -87.59918 41.708485 17
3 -87.59857 41.708393 17
3 -87.64391 41.675133 17
As you can see, the code below is almost identical to yours:
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point, LineString
# a regex separator needs a raw string and the python parsing engine
df = pd.read_csv("snow_gps.csv", sep=r'\s*,\s*', engine='python')
# zip the coordinates into Point objects and convert to a GeoDataFrame
geometry = [Point(xy) for xy in zip(df.X, df.Y)]
geo_df = gpd.GeoDataFrame(df, geometry=geometry)
# build one LineString per (ID, Hour) group; every group needs at least 2 points
geo_df2 = geo_df.groupby(['ID', 'Hour'])['geometry'].apply(lambda x: LineString(x.tolist()))
geo_df2 = gpd.GeoDataFrame(geo_df2, geometry='geometry')
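If adding points to the data is not an option, one possible alternative is to find the groups that are too small and drop them before building the lines. The following is only a sketch under some assumptions: it rebuilds the question's original sample data inline instead of reading snow_gps.csv, and the names sizes, too_small, valid, records and lines are made up for illustration.

```python
import pandas as pd
from shapely.geometry import LineString

# The question's original sample data (ID 1 has a single point at Hour 17).
df = pd.DataFrame({
    "ID":   [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "X":    [-87.78976, -87.66991, -87.59887, -87.73956, -87.68161,
             -87.5999, -87.59918, -87.59857, -87.64391],
    "Y":    [41.97658, 41.92355, 41.708447, 41.876827, 41.79886,
             41.7083, 41.708485, 41.708393, 41.675133],
    "Hour": [16, 16, 17, 16, 16, 16, 17, 17, 17],
})

# Count points per (ID, Hour) group to see which groups cannot form a line.
sizes = df.groupby(["ID", "Hour"]).size()
too_small = sizes[sizes < 2]

# Keep only rows whose (ID, Hour) group has at least 2 points.
valid = df.groupby(["ID", "Hour"]).filter(lambda g: len(g) >= 2)

# Build one LineString per remaining group from its (X, Y) pairs.
records = [
    {"ID": id_, "Hour": hour, "geometry": LineString(list(zip(g["X"], g["Y"])))}
    for (id_, hour), g in valid.groupby(["ID", "Hour"])
]
lines = pd.DataFrame(records)
```

Here too_small lists the offending groups, and lines can then be passed to gpd.GeoDataFrame(lines, geometry='geometry') as in the code above. Note that this silently discards the single-point groups, so it only makes sense if losing those points is acceptable.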