add garden size estimate evaluation notebook #115

crispy-wonton · 2025-01-22T17:23:44Z

Fixes #114

Description

add garden size estimate evaluation notebook

Instructions for Reviewer

To generate the notebook, ensure jupytext is installed and then run the following line in a terminal:
jupytext --to notebook asf_heat_pump_suitability/analysis/garden_size_estimates/20250106_evaluate_garden_size_estimates.py

Please pay special attention to ...
Please could you provide some feedback on this analysis. Namely:

Do you think the selection criteria are appropriate and applied in the correct order?
Have I accounted for the appropriate corrections to make this a fair analysis (e.g. applying weights, checking that weights add up to at least 0.9, etc.)?
Are there any suggestions you have to make the analysis more robust or accurate?
Have the weights been applied correctly?
Is there anything we should add to the analysis?
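
For reviewers who want to sanity-check the weighting logic by hand, here is a minimal dependency-free sketch of a weighted average with a weight-coverage check. It is not the notebook's code: the function name `weighted_mean`, the sample numbers, and the reading that weights are shares which sum to about 1 before filtering are all assumptions made for illustration.

```python
def weighted_mean(values, weights, min_weight_sum=0.9):
    """Weighted mean of the usable rows, or None if they carry too little weight.

    Assumption: weights are shares that sum to roughly 1 when no rows are
    missing, so a remaining sum below min_weight_sum means that too much of
    the group was dropped for the average to be trusted.
    """
    # Keep only rows where both the value and the weight are present.
    pairs = [
        (v, w) for v, w in zip(values, weights) if v is not None and w is not None
    ]
    total_weight = sum(w for _, w in pairs)
    if total_weight < min_weight_sum:
        return None
    # Renormalise by the weight that is actually left.
    return sum(v * w for v, w in pairs) / total_weight


# Hypothetical garden sizes (m2) and weights for one area.
complete = weighted_mean([100.0, 50.0, 200.0], [0.5, 0.3, 0.2])
incomplete = weighted_mean([100.0, None, 200.0], [0.5, 0.3, 0.2])
print(complete, incomplete)
```

In the second call the remaining weights sum to 0.7, which is below 0.9, so the sketch returns no average rather than one based on partial coverage.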


lizgzil

Hey @crispy-wonton, this looked good to me, and I thought the filtering steps made sense (and were in the right order). I commented on a few bits for clarification though.

lizgzil · 2025-01-29T09:56:11Z

...t_pump_suitability/analysis/garden_size_estimates/20250106_evaluate_garden_size_estimates.py

+total_avg_gardens_df = avg_gardens_df.filter(
+    pl.col("msoa_avg_outdoor_space_property_type") == "unknown"
+)


How come you didn't compare all the property types?

lizgzil · 2025-01-29T10:24:02Z

...t_pump_suitability/analysis/garden_size_estimates/20250106_evaluate_garden_size_estimates.py

+# Get counts of UPRN per garden
+uprn_count = gardens_df.group_by("NATIONALCADASTRALREFERENCE").agg(
+    pl.col("UPRN").count().alias("UPRN_count")
+)
+
+# Assign 1 garden size per cadastral
+cadastral_garden_size = gardens_df.group_by("NATIONALCADASTRALREFERENCE").agg(
+    pl.col("garden_area_m2").first().alias("cadastral_garden_size_m2")
+)
+
+# Join to garden size df
+gardens_df = gardens_df.join(
+    uprn_count, how="left", on="NATIONALCADASTRALREFERENCE"
+).join(cadastral_garden_size, how="left", on="NATIONALCADASTRALREFERENCE")
+
+# Divide shared gardens equally among UPRNs sharing gardens
+gardens_df = gardens_df.with_columns(
+    (pl.col("cadastral_garden_size_m2") / pl.col("UPRN_count")).alias(
+        "divided_garden_area_m2"
+    )
+)


I wasn't expecting these steps. I thought "garden_area_m2" was the estimate for each UPRN, so I'm surprised that "divided_garden_area_m2" needed to be calculated. Where have I misunderstood?
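
To make the question concrete, here is a small plain-Python sketch of what the quoted steps appear to compute. It assumes, without confirmation from this thread, that "garden_area_m2" describes the whole cadastral parcel and is repeated on every UPRN row sharing that parcel. The rows are made up.

```python
from collections import Counter

# Hypothetical rows: (UPRN, NATIONALCADASTRALREFERENCE, garden_area_m2).
# Assumption: garden_area_m2 is the parcel's total garden, repeated per UPRN.
rows = [
    (1001, "A", 120.0),
    (1002, "A", 120.0),
    (1003, "A", 120.0),
    (2001, "B", 80.0),
]

# Count UPRNs per cadastral parcel (mirrors the group_by + count step).
uprn_count = Counter(ref for _, ref, _ in rows)

# Take the first garden size seen for each parcel (mirrors .first()).
parcel_area = {}
for _, ref, area in rows:
    parcel_area.setdefault(ref, area)

# Split each parcel's garden equally among the UPRNs that share it.
divided = {uprn: parcel_area[ref] / uprn_count[ref] for uprn, ref, _ in rows}
print(divided)
```

Under that assumption the three UPRNs on parcel "A" each get a third of the 120 m2 garden, while the single UPRN on parcel "B" keeps its full 80 m2. If "garden_area_m2" were already a per-UPRN estimate, the division would understate shared gardens.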

lizgzil · 2025-01-29T10:29:25Z

...t_pump_suitability/analysis/garden_size_estimates/20250106_evaluate_garden_size_estimates.py

+    pl.col("divided_garden_area_m2").is_not_null(),
+    pl.col("divided_garden_area_m2") > 0,
+    pl.col("divided_garden_area_m2") < pl.col("divided_garden_area_m2").quantile(0.99),
+    pl.col("weight").is_not_null(),


I can't remember why a weight would be null. Was it when the value came from a dummy property?

lizgzil · 2025-01-29T11:19:39Z

...t_pump_suitability/analysis/garden_size_estimates/20250106_evaluate_garden_size_estimates.py

+)
+
+# Filter results to MSOAs with averages calculated from 15 or more properties
+results = results.filter(pl.col("n_properties") >= 15)


I think here and in the weighted results you could increase this threshold, perhaps to the 0.1 quantile (which I think is 51)? Or was there a reason to use 15?
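
As a rough illustration of the difference between the two options, the sketch below compares a fixed cut-off of 15 with a first-decile cut-off on made-up property counts. The numbers are hypothetical and do not come from the notebook. The standard library's `statistics.quantiles` defaults to the "exclusive" method, which may interpolate differently from the quantile used in the notebook.

```python
import statistics

# Hypothetical number of properties behind each MSOA average.
n_properties = [
    10, 30, 35, 40, 44, 48, 51, 55, 60, 64,
    70, 77, 83, 90, 98, 110, 125, 150, 200, 310,
]

fixed_threshold = 15

# First decile cut point, i.e. the 0.1 quantile of the counts.
decile_threshold = statistics.quantiles(n_properties, n=10)[0]

# MSOAs retained under each rule.
kept_fixed = [n for n in n_properties if n >= fixed_threshold]
kept_decile = [n for n in n_properties if n >= decile_threshold]

print(decile_threshold, len(kept_fixed), len(kept_decile))
```

A data-driven threshold like this moves with the distribution of counts, whereas a fixed value such as 15 keeps the same meaning across runs. Which of these matters more is a judgment call for the analysis.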

crispy-wonton added 2 commits January 22, 2025 17:18

add garden size estimate evaluation notebook

adf3ae8

update evaluation data file name in `20250106_evaluate_garden_size_estimates.py`

a9365ae
