-
Notifications
You must be signed in to change notification settings - Fork 106
[BREAKING] Package rewrite for a 1.0 release #374
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #374 +/- ##
===========================================
- Coverage 99.35% 88.78% -10.57%
===========================================
Files 7 7
Lines 466 1311 +845
===========================================
+ Hits 463 1164 +701
- Misses 3 147 +144 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
55b8d07 to
84c670b
Compare
| - `omit_empty::Bool` whether empty Julia collection values should be skipped when serializing | ||
| - `allownan::Bool` similar to the parsing keyword argument to allow/disallow writing of invalid JSON values `NaN`, `-Inf`, and `Inf` | ||
| - `ninf::String` the string to write if `allownan=true` and serializing `-Inf` | ||
| - `inf::String` the string to write if `allownan=true` and serializing `Inf` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
possibly inf_string? Also do we need ninf or can it just print as a - before inf?
| - Explicit keyword arguments to control a number of serialization features, including: | ||
| - `omit_null::Bool` whether `nothing`/`missing` Julia values should be skipped when serializing | ||
| - `omit_empty::Bool` whether empty Julia collection values should be skipped when serializing | ||
| - `allownan::Bool` similar to the parsing keyword argument to allow/disallow writing of invalid JSON values `NaN`, `-Inf`, and `Inf` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, should this be allownonfinite?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JSON3.jl had allow_inf, but JSON.jl already had allownan, so I opted to keep that for backwards compat.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do get the backwards compat, but as a potential consumer of the API I will say that allownan makes me think that the keywords is only for NaN, not Inf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could it be worth checking how many packages depend on the current JSON.jl to estimate how breaking any "breaking changes" might be in practice?
|
Tiny detail but these methods don’t close the file handle: Line 172 in c9d0b1b
|
|
Interesting situation w/ the Documentation job failure here: Documenter depends on JSON, but doesn't have compat w/ 1.0, so it can't install. Would we need to have Documenter.jl pre-emptively support 1.0? Other solutions? |
|
JSON ends up being a dependency in most environments. Is there any significant precompile / load times changes with this PR? |
|
As [noted](JuliaIO/JSON.jl#374 (comment)), the JSON 1.0 release is currently blocked on using Documenter due to the circular dependency of needing Documenter to build docs. The usage of JSON.jl in Documenter.jl is very vanilla, so this PR proposed "preemptive" support for JSON.jl 1.0 since the usage of JSON in Documenter is known to not rely on any breaking changes proposed.
* Preemptively support JSON 1.0 release As [noted](JuliaIO/JSON.jl#374 (comment)), the JSON 1.0 release is currently blocked on using Documenter due to the circular dependency of needing Documenter to build docs. The usage of JSON.jl in Documenter.jl is very vanilla, so this PR proposed "preemptive" support for JSON.jl 1.0 since the usage of JSON in Documenter is known to not rely on any breaking changes proposed.
Just pushed a fix; thanks for mentioning. |
|
Having to uncompress all the test files for all package updates feels like it could be slightly slow (at least on Windows). In TOML (which has a similar test suite) I get them during testing from an archive (https://github.com/JuliaLang/TOML.jl/blob/44aab3c023323587680ab8d8c7ef478bf78c4c0c/test/utils/utils.jl#L8) but I guess an archive could also be checked in. Alternatively, all the content of the files could be put into one file with a sentinel "file separator" and get constructed on the fly during testing. |
48aad24 to
3716320
Compare
Great callout; I refactored so that we have tar files of both |
What about just getting each file from the tarball directly (no on-filesystem untaring) using Tar.jl? |
Ah, can Tar.jl do that? I was going off the README docs and I didnt' see anything about that, but that would indeed by nice. |
Ok, I looked into |
|
You can create a |
|
I agree that API breakages are fine and expected. Since this is a pre-1.0 package, people depending on this package should expect an unstable API. My main issue with this PR is that it currently relies on a large number of undocumented features from Julia, which makes reviewing the code almost impossible for someone like me who isn't familiar with how Julia is implemented, or how it will change in the future. I also don't think mmap should be used in this package. Many commonly used systems do not have mmap available, or working well, and at least on my machine, julia> @be _ Mmap.mmap("foo.json") x->GC.gc() seconds=100.0
Benchmark: 710 samples with 1 evaluation
min 206.200 μs (22 allocs: 1.273 KiB)
median 268.750 μs (22 allocs: 1.273 KiB)
mean 298.476 μs (22 allocs: 1.273 KiB)
max 691.200 μs (22 allocs: 1.273 KiB)
julia> @be _ read("foo.json") x->GC.gc() seconds=100.0
Benchmark: 798 samples with 1 evaluation
min 158.300 μs (16 allocs: 800 bytes)
median 203.700 μs (16 allocs: 800 bytes)
mean 205.402 μs (16 allocs: 800 bytes)
max 317.600 μs (16 allocs: 800 bytes)
julia> filesize("foo.json")
5
julia> versioninfo()
Julia Version 1.11.6
Commit 9615af0f26 (2025-07-09 12:58 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: Windows (x86_64-w64-mingw32)
CPU: 16 × 12th Gen Intel(R) Core(TM) i5-1240P
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, alderlake)
Threads: 1 default, 0 interactive, 1 GC (on 16 virtual cores) |
I've added a new |
I've appreciated your thorough review of this PR so far. I've tried to address all of your concerns of "julia internals" usage by putting in proper guards where appropriate and raising issues in Julia proper to try and get better public APIs for things. I think I can also get behind your thoughts on Mmap. It's been a pattern of mine for years now to reach for it in file-byte-parsing scenarios, but I think overall |
|
Usage of and dependency on Mmap has now been removed. |
|
|
||
| # hand-rolled scoped enum | ||
| module JSONTypes | ||
| primitive type T 8 end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using primitive type instead of just struct here seems ugly and unnecessary.
|
I think the For example, function json(a, indent=nothing)
if isnothing(indent)
writefile(String, a)
else
String(push!(writefile(Vector{UInt8}, a; pretty=convert(Int, indent)), UInt8('\n')))
endThe julia> @test JSON.json("hello world", 2) == "\"hello world\"\n"
Error During Test at REPL[12]:1
Test threw exception
Expression: JSON.json("hello world", 2) == "\"hello world\"\n"
MethodError: json(::String, ::Int64) is ambiguous.
Candidates:
json(fname::String, obj; kw...)
@ JSON ~/github/JSON.jl/src/write.jl:428
json(a, indent::Integer)
@ JSON ~/github/JSON.jl/src/JSON.jl:107
Possible fix, define
json(::String, ::Integer)
Stacktrace:
[1] macro expansion
@ ~/.julia/juliaup/julia-1.11.6+0.x64.linux.gnu/share/julia/stdlib/v1.11/Test/src/Test.jl:677 [inlined]
[2] top-level scope
@ REPL[12]:1
ERROR: There was an error during testing
|
|
Several packages are currently using the Also, it would be good to include in the migration guide an example of migrating code like https://github.com/JuliaComputing/TableView.jl/blob/a3a3172fed89260b779d79f22c901ed74d96b90d/src/TableView.jl#L293-L319 where I'm guessing this would involve defining a custom |
|
Quick update: I've been diving deep into a few performance characteristics of the overall code proposed here vs. pre-1.0 JSON and JSON3. I've found a few places particularly where we can improve by avoiding more allocations and it's been a good dive overall in terms of my own understanding of the core compiler's allocation elision/escape analysis limitations. I'm hoping to get those performance improvements in this week, which should help tuple parsing (as reported here) perf be more in line. I appreciate those looking into ecosystem ripple effects and I'll catch up on those comments soon. |
|
Hi, I'm coming from trying to Now, my question: do you intend to make (the rewrite of) JSON.jl trimmable? That is, have everything type-stable etc. to allow trimming and thus small-binary generation work? I think parsing and serialising JSON is one of the fundamental things to do in today's mostly web-based and micro-servic'ing software landscape. So, having this trimmable would definitely bring julia forward as more-generally-usable language. |
|
Update: I believe the performance issue w/ tuples reported in #376 is now fixed in this PR, so I'm tentatively marking this to be merged soon + doing the 1.0 release. I'm going to investigate #377 (thank you @nhz2 for the research!) and may delay the merge or provide more compat/migration guides as appropriate. |
…ionality that can be directly vendored in
|
Ok, pushed a couple of tweaks:
|
Adapt to major update of JSON.jl: JuliaIO/JSON.jl#374
This is a proposal to overhaul the package internals, providing mostly the same high-level interfaces (
JSON.parse,JSON.json, etc) while also adding a number of enhancements (lazy parsing support, typed materialization similar to JSON3).My goal is to deprecate JSON3.jl in favor of the simpler, faster, and overall better functionality as implemented in this PR.
I've reviewed all existing open issues/PRs to both this package and JSON3.jl, marking all that will be fixed or resolved with the functionality as proposed (over 70 issues resolved in total!).
I've tried to provide extensive new documentation (in the new docs/ directory) on all proposed functionality, including a migrate.md file that provides a detailed migration guide for both JSON.jl pre-1.0 users and JSON3.jl users.
My hope and aim is that the net upgrade for the vast majority of JSON pre-1.0 users will simply be adding
JSON = "1"in their compat files. I've tried to support as much as possible in what I could tell was part of the JSON.jl public API. A few things were deliberately left out as my impression is they were either part of an ancient API that can be better served by more modern Julia mechanisms, or just seem not very useful. If there are things we discover feel too breaking, I'm definitely open to try and find ways to add backwards compatibility.Another part of my commitment in this proposal, if/when merged and released, is to take time across the ecosystem to help upgrade packages. I've done this for other packages (CSV.jl, DataFrames.jl, HTTP.jl, etc.) with big releases, and I feel it helps with a new release "momentum" to get a bunch of key packages upgraded on the new release.
This proposal does add another dependency on the newish StructUtils.jl package and I'll try to provide a little context/justification. I've actually been working on the core refactor of this code and StructUtils.jl for around 2 years. The ideas and redesign came from architectural pains/complexities in JSON3.jl and StructTypes.jl, and a desire to find an internal framework that was at least as powerful/flexible as what ST + J3 provided, but in a vastly simpler way (the JSON3.jl code kept growing in complexity and became hard to work with/maintain/improve). I believe the design of StructUtils.jl to be much less invasive and natural when working with Julia structs, while the power of the core functional parsing methods provide a clean integration for handling 1) default materialization from JSON, 2) typed materialization, and 3) simple serialization from structs and a variety of other types. What's more, is the StructUtils.jl machinery generalizes easily beyond just JSON, and I already have other unreleased packages that use it for database model interactions, struct diffing, and a number of other use-cases.
As for review, I recognize that huge PRs like this are very hard to break down and see the exact net changes. In this case, the entire package is pretty much rewritten, so I'd recommend checking out the PR locally, testing out code, and reviewing the entire package/files in that way (i.e. trying to look at any kind of "diff" on github isn't probably that useful).
I'd also like to propose a public "community review" time where I would host a google meet call that would be open to anyone interested and I would take time to go over package internals/architecture and help answer questions/concerns in person. If folks are interested, please ping me in the public Julia slack or comment here and I'll arrange a time that works for those interested.