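    import geopandas as gpd

    # Note: gpd.datasets was deprecated in GeoPandas 0.14 and removed in 1.0.
    # On newer versions, download the Natural Earth file and pass its path to read_file().
    world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))

    # What is this?
    print(type(world))  # <class 'geopandas.geodataframe.GeoDataFrame'>
    print(world.head())
    print(world.geometry.name)  # 'geometry'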
    from shapely.geometry import Point, LineString, Polygon

    # Create a point
    nyc = Point(-74.006, 40.7128)

    # Create a line
    route = LineString([(-74.006, 40.7128), (-73.935, 40.7306)])

    # Create a polygon (bounding box around NYC)
    bbox = Polygon([(-74.05, 40.68), (-73.95, 40.68), (-73.95, 40.75), (-74.05, 40.75)])

    # Check if point is inside polygon
    print(bbox.contains(nyc))  # True

Step 4: The Magic of Spatial Joins

This is where GeoPandas shines. Let's find all countries that contain a specific point.
    print(result['name'])  # Should output "Brazil"
Next week, I'll cover spatial autocorrelation (aka: "Is that cluster real or random?"). Until then, map something interesting.

What geospatial project are you working on? Let me know in the comments below.

Python GeoSpatial Analysis Essentials
    conda install geopandas folium shapely matplotlib

    # or pip (may require system GDAL)
    pip install geopandas folium shapely matplotlib

Let's load a Natural Earth dataset (GeoPandas releases before 1.0 ship with sample data).
Pro tip: Never calculate distance or area using lat/lon (EPSG:4326). Always project to a local or equal-area CRS first.

Static maps are fine. Interactive maps impress stakeholders.
Given 10,000 crime incident points and a map of police precincts, which precinct has the most points? That's a spatial join.

Step 5: Coordinate Reference Systems (CRS) – The Silent Killer

If your layers don't align, you likely have a CRS mismatch.

    import geopandas as gpd
    world = gpd
    # Our point of interest (somewhere in Brazil)
    point_of_interest = Point(-55.0, -10.0)

    # We'll put the point into a tiny GeoDataFrame
    point_gdf = gpd.GeoDataFrame(geometry=[point_of_interest], crs=world.crs)

    # "within" joins where the point is inside the polygon
    result = gpd.sjoin(point_gdf, world, how='left', predicate='within')
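The same kind of join answers counting questions like the precinct example. Here is a minimal sketch with made-up data: the two square "precincts" and the four "incident" points are hypothetical stand-ins, not a real dataset.

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical precincts: two adjacent unit squares.
precincts = gpd.GeoDataFrame(
    {"precinct": ["A", "B"]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1)],
    crs="EPSG:4326",
)

# Hypothetical incidents: three fall inside A, one inside B.
incidents = gpd.GeoDataFrame(
    geometry=[Point(0.2, 0.2), Point(0.5, 0.8), Point(0.9, 0.1), Point(1.5, 0.5)],
    crs=precincts.crs,
)

# Attach the containing precinct to every incident, then count rows per precinct.
joined = gpd.sjoin(incidents, precincts, how="inner", predicate="within")
counts = joined["precinct"].value_counts()

print(counts)
print(counts.idxmax())  # the precinct with the most incidents
```

With real data the pattern stays the same. Only the two input GeoDataFrames change, and both should share a CRS before the join.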
But if you open a raw shapefile or a GeoJSON file for the first time, you’ll quickly realize:
    # Check CRS
    print(world.crs)  # EPSG:4326 (Lat/Lon)

    # Project to an equal-area CRS before measuring area. Mercator-based CRSs
    # such as EPSG:3857 (Web Mercator) and EPSG:3395 (World Mercator) inflate
    # areas away from the equator.
    world_meters = world.to_crs('EPSG:6933')  # global cylindrical equal-area, units in meters

    # Calculate area in square kilometers
    world['area_km2'] = world_meters.geometry.area / 10**6
    print(world[['name', 'area_km2']].head())
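To see why the choice of CRS matters for area, here is a small sketch that measures two 1° × 1° cells in Web Mercator and in an equal-area CRS. EPSG:6933 (a global cylindrical equal-area projection) is my choice for illustration; a local or regional equal-area CRS may suit a real project better.

```python
import geopandas as gpd
from shapely.geometry import box

# Two 1-degree-by-1-degree cells: one at the equator, one starting at 60 degrees north.
cells = gpd.GeoDataFrame(
    {"cell": ["equator", "60N"]},
    geometry=[box(0, 0, 1, 1), box(0, 60, 1, 61)],
    crs="EPSG:4326",
)

# Area of each cell in Web Mercator and in an equal-area CRS, in square kilometers.
cells["mercator_km2"] = cells.to_crs("EPSG:3857").area / 1e6
cells["equal_area_km2"] = cells.to_crs("EPSG:6933").area / 1e6

# How much Web Mercator overstates the area of each cell.
cells["inflation"] = cells["mercator_km2"] / cells["equal_area_km2"]
print(cells[["cell", "mercator_km2", "equal_area_km2", "inflation"]])
```

The two figures roughly agree at the equator, while at 60°N Web Mercator overstates the area by about a factor of four.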
A GeoDataFrame is just a pandas DataFrame with a special column (usually geometry) that stores shapely objects. You rarely create geometries by hand, but you must understand them.
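Building one by hand once makes that structure visible. Here is a minimal sketch; the city names and coordinates are illustrative values chosen for this example.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point

# Illustrative data: two cities as (longitude, latitude) points.
cities = gpd.GeoDataFrame(
    {"city": ["New York", "Brasilia"]},
    geometry=[Point(-74.006, 40.7128), Point(-47.8825, -15.7942)],
    crs="EPSG:4326",
)

# A GeoDataFrame is a subclass of pandas.DataFrame...
print(isinstance(cities, pd.DataFrame))

# ...whose active geometry column holds shapely objects.
print(type(cities.geometry.iloc[0]))

# Ordinary pandas operations still work on the non-spatial columns.
print(cities[cities["city"] == "New York"])
```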
Geospatial data is everywhere. From tracking delivery trucks to analyzing climate change, location is the secret ingredient that makes data science actionable.