Skip to content

Conversation

@ruslandoga
Copy link
Collaborator

@ruslandoga ruslandoga commented Feb 8, 2025

Closes #218


Somewhat depends on the answer to ClickHouse/ClickHouse#75768.

Rust client went the easy way: ClickHouse/clickhouse-rs#171 -- guess we can do that too (#246).

@kurayama
Copy link

@ruslandoga any development here?
I've been using this branch for awhile and works well, just need a few hacks for querying json with ecto-ch, but that's another story 😄

@ruslandoga
Copy link
Collaborator Author

ruslandoga commented Mar 21, 2025

👋 @kurayama

Thank you for reminding me about this one!

I couldn't come up with a good solution right away and kind of put it off for later until Plausible or someone else asks for JSON support :)

I guess that time has come! I'll try to finish it this weekend. Initially with input_format_binary_read_json_as_string and output_format_binary_write_json_as_string and maybe later with the "true" binary JSON. My plan is to make the RowBinary encoder/decoder be aware of the settings, and if input_format_binary_read_json_as_string: 1 is present, we encode with Jason.encode_to_iodata! and similarly for output_format_binary_write_json_as_string: 1 and Jason.decode!. Otherwise we raise with a helpful error message about the current :json type limitations. This would make it possible for us to support binary JSON (instead of raising) later without any breaking changes (if no "json as string" settings are provided).

need a few hacks for querying json with ecto-ch

Feel free to share the problems you have in this thread, I would be happy to change ecto-ch to make working with JSON easier :)

@kurayama
Copy link

kurayama commented Mar 21, 2025

@ruslandoga thanks for the reply

My issue might be just lack of familiarity with the library.
I want to dynamically receive a string from a query parser on user input and use it as a json subpath.

Right now I have something like:

  defp filter(query, path, value) do
    query
    |> where([e], fragment("?.?", e.my_json_field, literal(^path)) == ^value)
  end

I don't know if there's a better approach without using a literal, this works but seems a bit hackish 😅
My paths are e.g foo.bar.biz to filter on a nested property of the json field.

@ruslandoga
Copy link
Collaborator Author

ruslandoga commented Mar 21, 2025

Have you tried json_extract_path? ecto_ch implements this Ecto operation with JSON_QUERY

I wonder if that would work:

defp filter(query, path, value) do
  query
  |> where([e], json_extract_path(e.my_json_field, ^path) == ^value) # or ^[path] or [^path]
end

I'll try adding some tests around this use-case into ecto_ch to see how well it works tomorrow.

@kurayama
Copy link

kurayama commented Mar 22, 2025

I'm getting:

iex(6)> ClickhouseRepo.all(Demo.Entry |> where([e], json_extract_path(e.my_json_field, ^["foo", "bar"]) == "5"))
** (UndefinedFunctionError) function Ch.type/0 is undefined or private. Did you mean:

      * type/1

    (ch 0.3.1) Ch.type()
    (elixir 1.18.3) lib/enum.ex:1840: Enum."-map_reduce/3-lists^mapfoldl/2-0-"/3
    (elixir 1.18.3) lib/enum.ex:2546: Enum."-reduce/3-lists^foldl/2-0-"/3
    (elixir 1.18.3) lib/enum.ex:2546: Enum."-reduce/3-lists^foldl/2-0-"/3
    (ecto 3.12.5) lib/ecto/repo/queryable.ex:214: Ecto.Repo.Queryable.execute/4
    (ecto 3.12.5) lib/ecto/repo/queryable.ex:19: Ecto.Repo.Queryable.all/3

In any case, JSON_QUERY doesn't seem compatible with the new json type.

27b50acad3a7 :) SELECT * FROM entries WHERE JSON_QUERY(my_json_field,'$.foo.bar') = 'biz' LIMIT 1

SELECT *
FROM entries
WHERE JSON_QUERY(my_json_field, '$.foo.bar') = 'biz'
LIMIT 1

Query id: 36de9fb0-8f0d-4530-8de5-27d82e90b4ec


Elapsed: 0.003 sec.

Received exception from server (version 25.3.1):
Code: 43. DB::Exception: Received from localhost:9000. DB::Exception: JSONPath functions require first argument to be JSON of string, illegal type: JSON. (ILLEGAL_TYPE_OF_ARGUMENT)

For the query to work it needs to be:

SELECT * FROM entries WHERE my_json_field.foo.bar = 'biz'

@ruslandoga
Copy link
Collaborator Author

ruslandoga commented Mar 23, 2025

My plan is to make the RowBinary encoder/decoder be aware of the settings, and if input_format_binary_read_json_as_string: 1 is present, we encode with Jason.encode_to_iodata! and similarly for output_format_binary_write_json_as_string: 1 and Jason.decode!.

This approach didn't work out quite as I had expected since we might not have access to those setting options if they were passed into Ch.start_link instead of each individual Ch.query ... so one would need to pass them to each Ch.query / Repo operation:

Ch.query!(
  conn,
  "SELECT json FROM test",
  _no_params = [],
  settings: [
    output_format_binary_write_json_as_string: 1
  ]
)

Repo.all(Schema, settings: [output_format_binary_write_json_as_string: 1])

@syepes
Copy link

syepes commented Apr 10, 2025

It would be great to to have support for the native JSON type (https://clickhouse.com/docs/sql-reference/data-types/newjson), this would simply some of the usages

@arrowcircle
Copy link

Hey! Any updates on this PR?

@ruslandoga
Copy link
Collaborator Author

ruslandoga commented Jul 15, 2025

👋

Sorry for the delay! I think this PR wouldn't work as is, and since I don't want to push the decision of binary vs text JSON to the users, I think the "right" way is to implement binary JSON encoding/decoding. I plan on attempting it today after taking care of some other easier issues :)

WIP: #261

@ruslandoga
Copy link
Collaborator Author

ruslandoga commented Jul 16, 2025

👋

I created a branch that simplifies this PR by hard-coding the assumption of "json as text" while #261 is being worked on: plausible/ecto_ch#233 -- it uses #262 "underneath"

@ruslandoga ruslandoga closed this Jul 16, 2025
@ruslandoga
Copy link
Collaborator Author

ruslandoga commented Jul 17, 2025

👋

https://hex.pm/packages/ch/0.5.0 and https://hex.pm/packages/ecto_ch/0.8.0 now have basic JSON (as text) support. The input_format_binary_read_json_as_string and output_format_binary_write_json_as_string options are added automatically in ClickHouse 24.10+ versions. Dynamic type support is planned :)

Demos:

Mix.install [{:ch, "~> 0.5.0"}]

{:ok, pid} = Ch.start_link()

[[%{}]] = Ch.query!(pid, ~s|select '{"a":null}'::JSON|).rows
[[%{"a" => "b"}]] = Ch.query!(pid, ~s|select '{"a":"b"}'::JSON|).rows
# note 42 becomes a string since arrays need to have the same type
[[%{"a" => ["b", "42"]}]] = Ch.query!(pid, ~s|select '{"a":["b", 42]}'::JSON|).rows

Ch.query!(pid, "create table ch_json(data JSON) engine Memory")
Ch.query!(pid, "insert into ch_json(data) format RowBinary", [[%{}], [nil], [%{"a" => 42}]], types: ["JSON"]).num_rows
[[%{}], [%{}], [%{"a" => "42"}]] = Ch.query!(pid, "select * from ch_json").rows
Mix.install [{:ecto_ch, "~> 0.8.0"}]

defmodule Repo do
  use Ecto.Repo, otp_app: :demo, adapter: Ecto.Adapters.ClickHouse
end

Repo.start_link()

Repo.query!("select version()").rows
#=> [["25.6.3.116"]]

# note 42 becomes a string here as well for some reason (that's how ClickHouse decides to encode it when output_format_binary_write_json_as_string)
[[%{"a" => "42"}]] = Repo.query!(~s|select '{"a":42}'::JSON|).rows

defmodule Event do
  use Ecto.Schema
  
  @primary_key false
  schema "ecto_ch_json" do
    field :data, Ch, type: "JSON"
  end
end

Repo.query!("create table ecto_ch_json(data JSON) engine Memory")

Repo.insert_all(Event, [%{data: %{"a" => 42}}, %{data: nil}, %{data: %{}}, %{data: %{a: [1, 2, 3]}}])

import Ecto.Query

Repo.all(from e in Event, select: fragment("?.a::text", e.data))
#=> ["42", "", "", "[1,2,3]"]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for the new JSON data type

5 participants