How to convert an N-Triples file into Jelly in Python with fixed memory allocation? #97

KMax · 2025-09-26T08:28:10Z

Hi there,

I use Python and rdflib to work with RDF data. One of my challenges is to process an RDF data file (usually in N-Triples or N-Quads) in a streaming fashion, so that memory consumption stays fixed. I was hoping to do this with the Jelly format, but the blocker is that to convert from N-Triples to Jelly, the whole file has to be loaded into memory :(

I looked through the docs, but I could not find a solution. Can this be addressed with Jelly? If not now, then perhaps in the future?

Thanks!
Maksim

Answered by Ostrzyciel

Ostrzyciel · 2025-09-26T08:48:03Z

Ostrzyciel
Sep 26, 2025
Maintainer

Hi Maksim! Thank you for your interest in the project.

pyjelly can write a Jelly file in a fully streaming fashion; there is an example of that here, in the section "Serializing a stream of statements". However, I don't think that RDFLib's N-Triples parser can actually return a stream of triples, so that would be a blocker. One option is to modify the RDFLib N-Triples parser so that it returns an iterator of triples. Unfortunately, this is a limitation of RDFLib, so we can't do much about it on our side.
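To illustrate what "an iterator of triples" means here, below is a minimal sketch using only the standard library. It is not RDFLib's or pyjelly's parser: `iter_ntriples` is a hypothetical helper, the regular expression covers only the common shapes of an N-Triples line, it has not been checked against the W3C conformance tests, and it yields raw term strings rather than RDFLib terms. The point is only that reading line by line through a generator keeps memory use bounded by the longest line instead of the file size.

```python
import io
import re
from typing import IO, Iterator, Tuple

# Simplified pattern (an assumption, not a conformant grammar):
# subject = IRI or blank node, predicate = IRI,
# object = IRI, blank node, or literal with optional datatype/language tag.
_TRIPLE = re.compile(
    r'^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+'
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
    r'\s*\.\s*(?:#.*)?$'
)


def iter_ntriples(lines: IO[str]) -> Iterator[Tuple[str, str, str]]:
    """Lazily yield (subject, predicate, object) strings, one line at a time."""
    for line_no, line in enumerate(lines, start=1):
        stripped = line.strip()
        # Skip blank lines and comment-only lines.
        if not stripped or stripped.startswith("#"):
            continue
        match = _TRIPLE.match(stripped)
        if match is None:
            raise ValueError(f"line {line_no}: cannot parse {stripped!r}")
        yield match.group(1), match.group(2), match.group(3)


# Small in-memory demo; with a real file, pass the open file object instead.
sample = io.StringIO(
    '<http://example.org/s> <http://example.org/p> "hello"@en .\n'
    "# a comment\n"
    "_:b0 <http://example.org/p> <http://example.org/o> .\n"
)
for s, p, o in iter_ntriples(sample):
    print(s, p, o)
```

Each yielded tuple would still need to be turned into the term objects that the chosen pyjelly integration expects before being passed to its streaming serializer.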

Alternatively, you could try working with pyjelly's RDFLib-less integration; there is an example of that here, in the section "Serializing a stream of statements". We have an unofficial N-Triples/N-Quads parser that works with it, but it is only intended for tests. The code is here, but it is not something we support officially, and we have not run it against the W3C conformance tests.

Finally, you can also use jelly-cli from the command line to convert huge files in a streaming manner, much faster than is possible in Python. This tool uses a limited amount of memory, and we have used it to convert datasets like Wikidata and OpenStreetMap without any problems.

If at all possible, I strongly suggest using jelly-cli: it is very easy to use and also very fast.
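If the rest of the pipeline has to stay in Python, one way to combine the two is to call jelly-cli as a subprocess. The sketch below assumes that `jelly-cli` is on the `PATH` and that the `rdf to-jelly` subcommand writes the Jelly output to standard output; please check `jelly-cli --help` for the exact invocation in your version. `build_command` and `convert_to_jelly` are hypothetical helper names.

```python
import subprocess
from pathlib import Path
from typing import List


def build_command(source: Path, executable: str = "jelly-cli") -> List[str]:
    """Build the argument list (assumed form: jelly-cli rdf to-jelly <input>)."""
    return [executable, "rdf", "to-jelly", str(source)]


def convert_to_jelly(source: Path, target: Path, executable: str = "jelly-cli") -> None:
    """Stream the converter's stdout straight into the target file.

    The data never passes through Python objects, so the Python process
    itself does not hold the dataset in memory.
    """
    with target.open("wb") as out:
        subprocess.run(build_command(source, executable), stdout=out, check=True)
```

For example, `convert_to_jelly(Path("data.nt"), Path("data.jelly"))` would raise `subprocess.CalledProcessError` if the tool exits with a non-zero status.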

I'm not sure if any of these options are useful to you – please let me know if you have further questions or feature suggestions! :)

1 reply

KMax Sep 26, 2025
Author

@Ostrzyciel thanks for the confirmation! I've commented on a relevant discussion in rdflib's repo, see RDFLib/rdflib#1560.
