Commit b4fb67a

Merge pull request #15 from rdhyee/issue-13-parquet-duckdb
Issue 13 parquet duckdb
2 parents d1f9e38 + e2e2f9f commit b4fb67a

3 files changed, +1016 -0 lines changed

_quarto.yml

Lines changed: 2 additions & 0 deletions
@@ -44,6 +44,8 @@ website:
   href: tutorials/index.qmd
 - text: "iSamples Parquet Tutorial"
   href: tutorials/parquet.qmd
+- text: "Zenodo iSamples OpenContext Tutorial"
+  href: tutorials/zenodo_isamples_analysis.qmd
 - text: "Cesium View"
   href: tutorials/parquet_cesium.qmd
 - text: "Cesium View split sources"

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+---
+title: "Parquet"
+---
+
+Let's query Eric's parquet file using duckdb+parquet
+
+```{ojs}
+//| code-fold: true
+//
+
+parquet_path = 'https://storage.googleapis.com/opencontext-parquet/oc_isamples_pqg.parquet';
+
+// Create a DuckDB instance
+db = {
+  const instance = await DuckDBClient.of();
+  await instance.query(`create view nodes as select * from read_parquet('${parquet_path}')`)
+  return instance;
+}
+
+row_count = {
+  const result = await db.queryRow(`select count(*) as n from nodes;`);
+  return result.n;
+}
+
+results = {
+  const data = await db.query(`SELECT COUNT(*) as count, otype FROM nodes GROUP BY otype ORDER BY count DESC`);
+  document.getElementById("loading_1").hidden = true;
+  return Inputs.table(data);
+}
+
+rows1k = {
+  const data = await db.query(`SELECT row_id, pid, otype, label FROM nodes limit 1000`);
+  document.getElementById("loading_2").hidden = true;
+  return Inputs.table(data);
+}
+
+md`There are ${row_count} rows in the source <code>${parquet_path}</code>.`
+```
+
+
+<div>
+<div id="loading_1">Loading type counts...</div>
+${results}
+</div>
+
+The first 1000 rows:
+
+<div>
+<div id="loading_2">Loading...</div>
+${rows1k}
+</div>
