Code in Lesson 2, "Geopandas an introduction" and corresponding notebook doesn't work #11

Open
readyready15728 opened this issue Dec 11, 2024 · 0 comments

readyready15728 commented Dec 11, 2024

https://autogis-site.readthedocs.io/en/latest/lessons/lesson-2/geopandas-an-introduction.html

Here it says:

Again, we start by grouping the input data by terrain classes, and then compute the sum of each classes’ area. This can be condensed into one line of code:

area_information = data.groupby("CLASS").area.sum()
area_information

And here's the corresponding expected output:

CLASS
32111    1.833747e+03
32112    2.148168e+03
32200    1.057368e+05
32417    1.026678e+02
32421    6.792797e+05
32500    1.097467e+05
32611    1.314807e+07
32612    1.073431e+05
32800    1.407231e+06
32900    6.158391e+05
33000    6.594647e+05
33100    3.769076e+06
34100    1.236289e+07
34300    1.627079e+03
34700    2.785751e+03
35300    1.382940e+06
35411    3.928004e+05
35412    4.708321e+06
35421    6.786374e+04
36200    9.986966e+06
36313    4.346029e+04
Name: area, dtype: float64

At least with my current version of pandas (2.2.3) and geopandas (1.0.1), I get the following error:

AttributeError: 'DataFrameGroupBy' object has no attribute 'area'

However, I was able to recreate the desired output step by step. I started like so:

data.groupby('CLASS').apply(lambda group: group.area)

But I got a warning:

DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.

Giving an extra argument made the warning go away without changing the output (see this Stack Overflow thread for further details):

data.groupby('CLASS').apply(lambda group: group.area, include_groups=False)

However, because the area attribute in geopandas returns one value per row rather than a single aggregate, the output has an undesired MultiIndex of (CLASS, original row index):

CLASS
32111  3116      1833.746786
32112  3115      2148.168209
32200  103     103982.028273
       104       1754.793619
32417  3112       102.667779
                   ...
36313  4299      2651.800270
       4300       376.503380
       4301       413.942555
       4302      3487.927677
       4303      1278.963199
Length: 4304, dtype: float64

After further searching, I was finally able to match the desired output like so:

data.groupby('CLASS').apply(lambda group: group.area, include_groups=False).groupby(level=0).sum()

Here's the output:

CLASS
32111    1.833747e+03
32112    2.148168e+03
32200    1.057368e+05
32417    1.026678e+02
32421    6.792797e+05
             ...
35411    3.928004e+05
35412    4.708321e+06
35421    6.786374e+04
36200    9.986966e+06
36313    4.346029e+04
Length: 21, dtype: float64
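
As far as I can tell, the second groupby works because level=0 refers to the outer CLASS level of the MultiIndex, so summing over it collapses the per-row values into one value per class. Here is a minimal pandas-only sketch of that step; the three values and row labels are made up to mimic the shape of the output above and are not taken from the lesson's data:

```python
import pandas as pd

# Hypothetical per-row areas keyed by (CLASS, original row label).
index = pd.MultiIndex.from_tuples(
    [(32111, 3116), (32200, 103), (32200, 104)],
    names=["CLASS", None],
)
per_row_area = pd.Series([1833.75, 103982.0, 1754.75], index=index)

# Group on the outer index level (CLASS) and sum the rows within each class.
per_class_area = per_row_area.groupby(level=0).sum()
print(per_class_area)
```

Passing level="CLASS" instead of level=0 should be equivalent here, since the outer level is named.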

I don't know whether chaining two .groupby() operations is the ideal way to do this, but it worked.
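
For what it's worth, a single groupby also seems to be enough if the areas are computed before grouping. Below is a minimal sketch on toy data; the three rectangles and their class codes are hypothetical stand-ins for the lesson's terrain polygons, and I have not checked this against the lesson's dataset:

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the lesson's data: three rectangles in two classes.
data = gpd.GeoDataFrame(
    {"CLASS": [32111, 32200, 32200]},
    geometry=[box(0, 0, 1, 1), box(0, 0, 2, 1), box(0, 0, 3, 1)],
)

# Compute per-row areas once, then group that plain Series by the class column.
area_information = data.area.groupby(data["CLASS"]).sum()

# Alternative: store the areas as a column first, so the result is named "area".
area_by_column = data.assign(area=data.area).groupby("CLASS")["area"].sum()

print(area_information)
print(area_by_column)
```

Both variants avoid .apply() entirely, so the include_groups warning and the MultiIndex never come up.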
