Fixup wording in notebook 3
Since `read_parquet` is discussed in notebook 2, we can drop the
language introducing it here. I also drop the note on accessing the
tables via `con.tables.<name>` since it feels out of place here IMO.
jcrist committed Jul 8, 2024
1 parent 4f54af4 commit de2d947
Showing 2 changed files with 11 additions and 75 deletions.
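
For context on the two access patterns the commit message mentions, here is a minimal sketch, not taken from the notebook itself. It assumes `ibis` with the DuckDB backend is installed; the file name `demo_titles.parquet`, the table name `demo_titles`, the three sample rows, and the helper `compare_access_patterns` are all hypothetical stand-ins for the IMDB data the notebook loads.

```python
import tempfile
from pathlib import Path


def compare_access_patterns():
    """Return row counts obtained via read_parquet and via con.tables."""
    # Imported here so the snippet can be loaded without ibis installed.
    import ibis

    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo_titles.parquet"
        # Hypothetical three-row stand-in for the parquet files in the notebook.
        ibis.memtable(
            {"tconst": ["tt1", "tt2", "tt3"], "rating": [7.1, 8.4, 6.0]}
        ).to_parquet(str(path))

        con = ibis.duckdb.connect()  # in-memory DuckDB connection
        # read_parquet returns a table expression and registers it under table_name.
        titles = con.read_parquet(str(path), table_name="demo_titles")
        # The same registered table, looked up by name (the pattern this commit drops).
        same_titles = con.tables.demo_titles

        return int(titles.count().execute()), int(same_titles.count().execute())
```

Both handles refer to the same registered table, which is why the removed `con.tables.<name>` cells were marked as redundant.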
55 changes: 6 additions & 49 deletions 03 - Switching Backends.ipynb
@@ -41,11 +41,11 @@
"\n",
"### Parquet loading\n",
"\n",
-"In the previous examples we used a pre-existing DuckDB database, and some\n",
-"in-memory tables. Another common pattern is that you have a few parquet files\n",
-"you want to work with. We can load those in to an in-memory DuckDB connection.\n",
-"(Note that \"in-memory\" here just means ephemeral, DuckDB is still very happy to\n",
-"operate on as much data as your hard drive can hold)."
+"Here we use the `read_parquet` method discussed in the previous notebook to\n",
+"load some data into DuckDB. Note that DuckDB treats this as a \"view\", meaning\n",
+"that the data isn't copied or loaded into memory - it still only exists on disk\n",
+"in the `parquet` files, meaning you can happily operate on parquet datasets\n",
+"that are much larger than the RAM on your laptop."
]
},
{
@@ -91,47 +91,6 @@
"execution_count": null,
"outputs": []
},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"The `read_parquet` method returns an Ibis table that points to the\n",
-"to-be-ingested `parquet` file. \n",
-"\n",
-"`read_parquet` also registers the table with DuckDB (or another backend), so\n",
-"you can also load the tables like we did for the `penguins` table in the\n",
-"previous notebook."
-]
-},
-{
-"cell_type": "code",
-"metadata": {},
-"source": [
-"basics = con.tables.imdb_title_basics # this cell is redundant, just here for demonstration"
-],
-"execution_count": null,
-"outputs": []
-},
-{
-"cell_type": "code",
-"metadata": {},
-"source": [
-"ratings = con.tables.imdb_title_ratings # this cell is redundant, just here for demonstration"
-],
-"execution_count": null,
-"outputs": []
-},
-{
-"cell_type": "code",
-"metadata": {
-"scrolled": true
-},
-"source": [
-"basics"
-],
-"execution_count": null,
-"outputs": []
-},
{
"cell_type": "markdown",
"metadata": {},
@@ -148,9 +107,7 @@
},
{
"cell_type": "code",
-"metadata": {
-"scrolled": true
-},
+"metadata": {},
"source": [
"%load solutions/nb03_ex01.py"
],
31 changes: 5 additions & 26 deletions quarto/03 - Switching Backends.qmd
@@ -30,11 +30,11 @@ we'll run the same query on the full dataset.

### Parquet loading

-In the previous examples we used a pre-existing DuckDB database, and some
-in-memory tables. Another common pattern is that you have a few parquet files
-you want to work with. We can load those in to an in-memory DuckDB connection.
-(Note that "in-memory" here just means ephemeral, DuckDB is still very happy to
-operate on as much data as your hard drive can hold).
+Here we use the `read_parquet` method discussed in the previous notebook to
+load some data into DuckDB. Note that DuckDB treats this as a "view", meaning
+that the data isn't copied or loaded into memory - it still only exists on disk
+in the `parquet` files, meaning you can happily operate on parquet datasets
+that are much larger than the RAM on your laptop.

```{python}
import ibis
Expand All @@ -59,26 +59,6 @@ ratings = con.read_parquet(
)
```

The `read_parquet` method returns an Ibis table that points to the
to-be-ingested `parquet` file.

`read_parquet` also registers the table with DuckDB (or another backend), so
you can also load the tables like we did for the `penguins` table in the
previous notebook.

```{python}
basics = con.tables.imdb_title_basics # this cell is redundant, just here for demonstration
```

```{python}
ratings = con.tables.imdb_title_ratings # this cell is redundant, just here for demonstration
```

```{python}
#| scrolled: true
basics
```

## Exercises

### Exercise 1
@@ -89,7 +69,6 @@ Join `basics` with `ratings` on the `tconst` column.
#### Solution

```{python}
-#| scrolled: true
%load solutions/nb03_ex01.py
```
