{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "f3740ed6-bc42-4d6a-bbec-f8bdf2882f7f",
   "metadata": {},
   "source": [
    "# Groundwater Level Dossier (GLD)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9874c92a-2232-4f02-b073-d49c1495fde6",
   "metadata": {},
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "import pandas as pd\n",
    "import brodata"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7cc10da1-f58f-4a23-814e-679697bd5f91",
   "metadata": {},
   "source": [
    "Download a GroundwaterLevelDossier with `GroundwaterLevelDossier.from_bro_id(bro_id)`. The method returns an object containing the data from the BRO XML file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9559950d-2033-4b98-be95-16c738ceefe3",
   "metadata": {},
   "outputs": [],
   "source": [
    "gld = brodata.gld.GroundwaterLevelDossier.from_bro_id(\"GLD000000012893\")\n",
    "gld"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e35e79e6-5bce-47c1-b8c5-57e88ee4f5b3",
   "metadata": {},
   "source": [
    "Observation data is available in the `observation` attribute as a pandas DataFrame."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "acaa4e2b-3dbd-4c34-a8ae-444e6f1d2450",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = gld.observation\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e3a4ca2b-7f75-4847-bd85-650ae2ece8d5",
   "metadata": {},
   "source": [
    "Create a time-series plot that uses different colors for each metadata combination. The helper function `plot_series` below implements this."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ab0f3cd5-c48d-4cb6-90d1-0f34fd899c30",
   "metadata": {},
   "outputs": [],
   "source": [
    "def plot_series(df):\n",
    "    f, ax = plt.subplots(figsize=(10, 8))\n",
    "    columns = [\"qualifier\", \"status\", \"observation_type\"]\n",
    "    for qualifier, status, observation_type in df[columns].drop_duplicates().values:\n",
    "        mask = df[\"qualifier\"] == qualifier\n",
    "        if pd.isna(status):\n",
    "            mask = mask & pd.isna(df[\"status\"])\n",
    "        else:\n",
    "            mask = mask & (df[\"status\"] == status)\n",
    "        mask = mask & (df[\"observation_type\"] == observation_type)\n",
    "        if status is None:\n",
    "            label = f\"{observation_type} {qualifier}\"\n",
    "        else:\n",
    "            label = f\"{observation_type} {status} {qualifier}\"\n",
    "        if mask.sum() > 100:\n",
    "            linestyle = \"-\"\n",
    "            marker = None\n",
    "        else:\n",
    "            linestyle = \"none\"\n",
    "            marker = \"o\"\n",
    "        df.loc[mask, \"value\"].plot(label=label, linestyle=linestyle, marker=marker)\n",
    "    ax.legend()\n",
    "    return f, ax\n",
    "\n",
    "\n",
    "plot_series(df);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a9bff74c-b0d8-4ab0-be31-b8fc87f6aa82",
   "metadata": {},
   "source": [
    "We can look at all other contents of this GroundwaterLevelDossier by using the `to_dict()` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce501fcd-6f59-4c15-8de0-d6524c571fdf",
   "metadata": {},
   "outputs": [],
   "source": [
    "gld_data = gld.to_dict()\n",
    "gld_data.pop(\"observation\")\n",
    "gld_data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e4103692-9120-4d46-bea1-714b78153bc4",
   "metadata": {},
   "source": [
    "## Multiple objects"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9402db7-0481-488f-a72b-579ddf06dbf1",
   "metadata": {},
   "source": [
    "### All measurements of one tube\n",
    "Download all GroundwaterLevelDossiers for a particular tube (piezometer) using `brodata.gmw.get_tube_observations`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4d14f497-a669-468d-b736-6b340d5238ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = brodata.gmw.get_tube_observations(\"GMW000000017757\", 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38589d6b-b7b8-474a-aee8-b9e83566008a",
   "metadata": {},
   "source": [
    "Plot the resulting pandas DataFrame again using the `plot_series` function defined above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "fcdd0f68-e092-48bc-a721-69df1f05cf02",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_series(df);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0d2593a4-8a8e-4165-8aca-288b76ca9465",
   "metadata": {},
   "source": [
    "### All measurements of one well\n",
    "Download all measurements of a Groundwater Monitoring Well using `brodata.gmw.get_observations(groundwaterMonitoringWell_id)`. The result is a GeoDataFrame where each row contains an `observation` DataFrame for a monitoring tube and a list of Groundwater Level Dossier BRO IDs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce2d6e39-b162-4aba-b9f6-46f3993510fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "gdf = brodata.gmw.get_observations(\"GMW000000017757\")\n",
    "gdf = gdf.set_index([\"groundwaterMonitoringWell\", \"tubeNumber\"]).sort_index()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e5873b27-4a7e-434d-8455-5f02079a4030",
   "metadata": {},
   "outputs": [],
   "source": [
    "gdf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3186f8ee-5040-4dbb-ab3b-b58112e2f916",
   "metadata": {},
   "outputs": [],
   "source": [
    "f, ax = plt.subplots(figsize=(10, 8))\n",
    "for index in gdf.index.unique():\n",
    "    observations = [x for x in gdf.loc[[index], \"observation\"] if not x.empty]\n",
    "    df = pd.concat(observations).sort_index()\n",
    "    for qualifier in df[\"qualifier\"].unique():\n",
    "        if pd.isna(qualifier):\n",
    "            continue\n",
    "        mask = df[\"qualifier\"] == qualifier\n",
    "        label = f\"{index[0]}_{index[1]} {qualifier}\"\n",
    "        if mask.sum() > 100:\n",
    "            linestyle = \"-\"\n",
    "            marker = None\n",
    "        else:\n",
    "            linestyle = \"none\"\n",
    "            marker = \"o\"\n",
    "        df.loc[mask, \"value\"].plot(label=label, linestyle=linestyle, marker=marker)\n",
    "plt.legend();"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5b055b7b",
   "metadata": {},
   "source": [
    "### All measurements within an extent (option 1: `brodata.gm.get_data_in_extent`)\n",
    "Download groundwater level data within a bounding box using `brodata.gm.get_data_in_extent`.\n",
    "\n",
    "This method returns a GeoDataFrame containing metadata for monitoring tubes. Measurements are in the `observation` column (a DataFrame per tube), and the BRO IDs of the groundwater level dossiers are provided as a list in the `groundwaterLevelDossier` column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "dcac099c",
   "metadata": {},
   "outputs": [],
   "source": [
    "extent = [118200, 118400, 439700, 440000]\n",
    "gdf_gm = brodata.gm.get_data_in_extent(extent)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ea455979",
   "metadata": {},
   "outputs": [],
   "source": [
    "gdf_gm.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bba92f6e-4a80-48e4-beac-84bdef926a9f",
   "metadata": {},
   "outputs": [],
   "source": [
    "gdf_gm['observation'].iloc[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22aad608-e0c3-4372-ac6d-7b99de733b68",
   "metadata": {},
   "source": [
    "Plot the GeoDataFrame on a map using its `plot` method."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "b64ac268",
   "metadata": {},
   "outputs": [],
   "source": [
    "f, ax = plt.subplots()\n",
    "ax.axis(\"scaled\")\n",
    "ax.axis(extent)\n",
    "gdf_gm.plot(ax=ax);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e2659a1d-d1e2-473d-b74d-4d0f0cb0ba4b",
   "metadata": {},
   "source": [
    "### All measurements within an extent (option 2: `brodata.gmw.get_data_in_extent`)\n",
    "The alternative method `brodata.gmw.get_data_in_extent` first downloads well characteristics, then each GroundwaterMonitoringWell to retrieve tube metadata, and uses the BRO relations API to obtain the relevant Groundwater Level Dossiers (GLD).\n",
    "\n",
    "Both methods place measurements in the `observation` column and list the corresponding GLD BRO IDs in `groundwaterLevelDossier`. The two methods differ in how tube metadata is retrieved.\n",
    "\n",
    "Because `brodata.gmw.get_data_in_extent` requires multiple BRO requests (characteristics, GroundwaterMonitoringWell, relations, GroundwaterLevelDossier), the BRO developed a PDOK web service that replaces the first three requests and is faster and less error-prone. For most use cases, `brodata.gm.get_data_in_extent` (option 1) is the preferred method because it is simpler and returns the key information (tube horizontal and vertical positions)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4a098e2b-c951-4c7b-9b30-07998d4d3390",
   "metadata": {},
   "outputs": [],
   "source": [
    "extent = [118200, 118400, 439700, 440000]\n",
    "gdf_gmw = brodata.gmw.get_data_in_extent(\n",
    "    extent=extent, kind=\"gld\", combine=True, as_csv=True\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7d8b0f50",
   "metadata": {},
   "outputs": [],
   "source": [
    "gdf_gmw.T"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e6df02d5-7696-431e-bb9b-43c517aea4b5",
   "metadata": {},
   "outputs": [],
   "source": [
    "f, ax = plt.subplots()\n",
    "ax.axis(\"scaled\")\n",
    "ax.axis(extent)\n",
    "gdf_gmw.plot(ax=ax);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "de74e31a-187e-48e3-b01b-2d38062a6e82",
   "metadata": {},
   "source": [
    "## Observations summary\n",
    "Use `brodata.gld.get_observations_summary` to download a summary of the observations within a Groundwater Level Dossier."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c99b9f3a-92d8-4ad2-9e85-6c5bd98b6898",
   "metadata": {},
   "outputs": [],
   "source": [
    "brodata.gld.get_observations_summary(\"GLD000000012893\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0d18a1a6-c217-4da5-9e1a-5f11d4a5857a",
   "metadata": {},
   "source": [
    "## Objects as csv\n",
    "The XML representations of GroundwaterLevelDossier objects can become large and may be relatively slow to parse. To improve performance, the data is also available in CSV format. You can retrieve the data as CSV using the method `brodata.gld.get_objects_as_csv()`. This method returns a pandas.DataFrame equivalent to the `GroundwaterLevelDossier.observation` attribute.\n",
    "\n",
    "You can set the parameter `as_csv=True` when calling `brodata.gm.get_data_in_extent()` or `brodata.gmw.get_data_in_extent()` (described in the previous sections). When enabled, these methods download and process the data using the CSV endpoint instead of the XML files."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c64c2250-0a62-4168-81d6-47a8a5a04dc2",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = brodata.gld.get_objects_as_csv(\"GLD000000012893\")\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f550be85-0c2a-4e84-acea-b9358cb13927",
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_series(df);"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca213ba8-a621-4ac9-8bf8-995ee1c8ce1b",
   "metadata": {},
   "source": [
    "## Series as csv\n",
    "An alternative is the method `brodata.gld.get_series_as_csv`. This method retrieves a table with measurements for different observation types (regulier_voorlopig, regulier_beoordeeld, controle en onbekend) as columns, and is intended for applications such as the graphical visualization of groundwater levels.\n",
    "\n",
    "`brodata.gld.get_series_as_csv()` returns a different data structure for the observations in the pandas.DataFrame compared to `brodata.gld.get_objects_as_csv()` or `GroundwaterLevelDossier.observation`. For this reason, its use is generally not recommended."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "162beb96-b588-43c1-92b3-54db1cff1691",
   "metadata": {},
   "outputs": [],
   "source": [
    "df = brodata.gld.get_series_as_csv(\"GLD000000012893\")\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "88aa71f1",
   "metadata": {},
   "outputs": [],
   "source": [
    "df[[\"Voorlopige Waarde [m]\", \"Beoordeelde Waarde [m]\", \"Controle Waarde [m]\", \"Onbekend Waarde [m]\"]].plot(figsize=(10, 8));"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "name": "python"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}