Avrotize is a "Rosetta Stone" for data structure definitions, allowing you to convert between numerous data and database schema formats and to generate code for different programming languages.
It is, for instance, a well-documented and predictable converter and code generator for data structures originally defined in JSON Schema (of arbitrary complexity).
The tool leans on the Apache Avro-derived Avrotize Schema as its schema model.
- Programming languages: Python, C#, Java, TypeScript, JavaScript, Rust, Go, C++
- SQL Databases: MySQL, MariaDB, PostgreSQL, SQL Server, Oracle, SQLite, BigQuery, Snowflake, Redshift, DB2
- Other databases: KQL/Kusto, MongoDB, Cassandra, Redis, Elasticsearch, DynamoDB, CosmosDB
- Data schema formats: Avro, JSON Schema, XML Schema (XSD), Protocol Buffers 2 and 3, ASN.1, Apache Parquet
You can install Avrotize from PyPI, having installed Python 3.10 or later:
pip install avrotize
Avrotize provides several commands for converting schema formats via Avrotize Schema.
Converting to Avrotize Schema:
- avrotize p2a - Convert Protobuf (2 or 3) schema to Avrotize Schema.
- avrotize j2a - Convert JSON schema to Avrotize Schema.
- avrotize x2a - Convert XML schema to Avrotize Schema.
- avrotize asn2a - Convert ASN.1 to Avrotize Schema.
- avrotize k2a - Convert Kusto table definitions to Avrotize Schema.
- avrotize pq2a - Convert Parquet schema to Avrotize Schema.
- avrotize csv2a - Convert CSV file to Avrotize Schema.
- avrotize kstruct2a - Convert Kafka Connect Schema to Avrotize Schema.
Converting from Avrotize Schema:
- avrotize a2p - Convert Avrotize Schema to Protobuf 3 schema.
- avrotize a2j - Convert Avrotize Schema to JSON schema.
- avrotize a2x - Convert Avrotize Schema to XML schema.
- avrotize a2k - Convert Avrotize Schema to Kusto table definition.
- avrotize s2k - Convert JSON Structure Schema to Kusto table definition.
- avrotize a2sql - Convert Avrotize Schema to SQL table definition.
- avrotize struct2sql - Convert JSON Structure Schema to SQL table definition.
- avrotize a2pq - Convert Avrotize Schema to Parquet or Iceberg schema.
- avrotize a2ib - Convert Avrotize Schema to Iceberg schema.
- avrotize s2ib - Convert JSON Structure to Iceberg schema.
- avrotize a2mongo - Convert Avrotize Schema to MongoDB schema.
- avrotize a2cassandra - Convert Avrotize Schema to Cassandra schema.
- avrotize struct2cassandra - Convert JSON Structure Schema to Cassandra schema.
- avrotize a2es - Convert Avrotize Schema to Elasticsearch schema.
- avrotize a2dynamodb - Convert Avrotize Schema to DynamoDB schema.
- avrotize a2cosmos - Convert Avrotize Schema to CosmosDB schema.
- avrotize a2couchdb - Convert Avrotize Schema to CouchDB schema.
- avrotize a2firebase - Convert Avrotize Schema to Firebase schema.
- avrotize a2hbase - Convert Avrotize Schema to HBase schema.
- avrotize a2neo4j - Convert Avrotize Schema to Neo4j schema.
- avrotize a2dp - Convert Avrotize Schema to Datapackage schema.
- avrotize a2md - Convert Avrotize Schema to Markdown documentation.
- avrotize struct2md - Convert JSON Structure schema to Markdown documentation.
Direct conversions (JSON Structure):
- avrotize s2p - Convert JSON Structure to Protocol Buffers (.proto files).
Generate code from Avrotize Schema:
- avrotize a2cs - Generate C# code from Avrotize Schema.
- avrotize a2java - Generate Java code from Avrotize Schema.
- avrotize a2py - Generate Python code from Avrotize Schema.
- avrotize a2ts - Generate TypeScript code from Avrotize Schema.
- avrotize a2js - Generate JavaScript code from Avrotize Schema.
- avrotize a2cpp - Generate C++ code from Avrotize Schema.
- avrotize a2go - Generate Go code from Avrotize Schema.
- avrotize a2rust - Generate Rust code from Avrotize Schema.
Generate code from JSON Structure:
- avrotize s2cpp - Generate C++ code from JSON Structure schema.
- avrotize s2cs - Generate C# code from JSON Structure schema.
- avrotize s2py - Generate Python code from JSON Structure schema.
- avrotize s2rust - Generate Rust code from JSON Structure schema.
- avrotize s2ts - Generate TypeScript code from JSON Structure schema.
- avrotize s2go - Generate Go code from JSON Structure schema.
Direct JSON Structure conversions:
- avrotize s2csv - Convert JSON Structure schema to CSV schema.
- avrotize s2x - Convert JSON Structure to XML Schema (XSD).
Other commands:
- avrotize pcf - Create the Parsing Canonical Form (PCF) of an Avrotize Schema.
JSON Structure conversions:
- avrotize s2dp - Convert JSON Structure schema to Datapackage schema.
Direct conversions (not via Avrotize Schema):
- avrotize struct2gql - Convert JSON Structure schema to GraphQL schema.
You can use Avrotize to convert between Avro/Avrotize Schema and other schema formats like JSON Schema, XML Schema (XSD), Protocol Buffers (Protobuf), ASN.1, and database schema formats like Kusto Data Table Definition (KQL) and SQL Table Definition. That means you can also convert from JSON Schema to Protobuf going via Avrotize Schema.
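A JSON Schema to Protobuf conversion is therefore two CLI calls chained through an intermediate Avrotize Schema file. The following is a minimal sketch of that chain, using only the j2a and a2p commands and their --out option as documented in this README; the helper names and file names are placeholders invented for illustration, not part of the tool.

```python
import subprocess


def json_schema_to_proto_commands(json_schema: str, avsc: str, proto_dir: str) -> list[list[str]]:
    """Build the two CLI invocations: JSON Schema -> Avrotize Schema -> Protobuf."""
    return [
        ["avrotize", "j2a", json_schema, "--out", avsc],
        ["avrotize", "a2p", avsc, "--out", proto_dir],
    ]


def run_pipeline(commands: list[list[str]]) -> None:
    """Run each step in order and stop at the first failing command."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Placeholder file names, chosen only for this example.
commands = json_schema_to_proto_commands("order.schema.json", "order.avsc", "proto_out")
for argv in commands:
    print(" ".join(argv))
```

Calling run_pipeline(commands) executes both steps and requires avrotize to be installed and on the PATH.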
You can also generate C#, Java, Python, TypeScript, JavaScript, C++, Go, and Rust code from Avrotize Schema documents. Unlike the native Avro tools, Avrotize can emit data classes without Avro library dependencies and, optionally, with annotations for JSON serialization libraries like Jackson or System.Text.Json.
The tool does not convert data (instances of schemas), only the data structure definitions.
Mind that the primary objective of the tool is the conversion of schemas that describe data structures used in applications, databases, and message systems. While the project's internal tests do cover a lot of ground, it is nevertheless not a primary goal of the tool to convert every complex document schema like those used for devops pipeline or system configuration files.
Data structure definitions are an essential part of data exchange, serialization, and storage. They define the shape and type of data, and they are foundational for tooling and libraries for working with the data. Nearly all data schema languages are coupled to a specific data exchange or storage format, locking the definitions to that format.
Avrotize is designed as a tool to "unlock" data definitions from JSON Schema or XML Schema and make them usable in other contexts. The intent is also to lay a foundation for transcoding data from one format to another, by translating the schema definitions as accurately as possible into the schema model of the target format's schema. The transcoding of the data itself requires separate tools that are beyond the scope of this project.
The use of the term "data structure definition" and not "data object definition" is quite intentional. The focus of the tool is on data structures that can be used for messaging and eventing payloads, for data serialization, and for database tables, with the goal that those structures can be mapped cleanly from and to common programming language types.
Therefore, Avrotize intentionally ignores common techniques to model object-oriented inheritance. For instance, when converting from JSON Schema, all content from allOf expressions is merged into a single record type rather than trying to model the inheritance tree in Avro.
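To make the allOf flattening concrete, here is a minimal sketch of the idea. It is not Avrotize's implementation: the function name is hypothetical, it only merges the properties and required keywords, and it assumes all subschemas are inline objects (no $ref resolution).

```python
def merge_all_of(schema: dict) -> dict:
    """Flatten a JSON Schema object whose 'allOf' lists inline object subschemas."""
    properties: dict = {}
    required: list = []
    # Treat the schema itself as one more part so sibling keywords are kept.
    parts = list(schema.get("allOf", [])) + [schema]
    for part in parts:
        properties.update(part.get("properties", {}))
        for name in part.get("required", []):
            if name not in required:
                required.append(name)
    return {"type": "object", "properties": properties, "required": required}


base = {"properties": {"id": {"type": "string"}}, "required": ["id"]}
extra = {"properties": {"price": {"type": "number"}}}
merged = merge_all_of({"allOf": [base, extra]})
print(sorted(merged["properties"]))  # ['id', 'price']
```

The result is one flat set of properties that can become the fields of a single record, with no base/derived relationship left to model.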
Avrotize Schema is a schema model that is a full superset of the popular Apache Avro Schema model. Avrotize Schema is the "pivot point" for this tool. All schemas are converted from and to Avrotize Schema.
Since Avrotize Schema is a superset of Avro Schema and uses its extensibility features, every Avrotize Schema is also a valid Avro Schema and vice versa.
Why did we pick Avro Schema as the foundational schema model?
Avro Schema ...
- provides a simple, clean, and concise way to define data structures. It is quite easy to understand and use.
- is self-contained by design without having or requiring external references. Avro Schema can express complex data structure hierarchies spanning multiple namespace boundaries all in a single file, which neither JSON Schema nor XML Schema nor Protobuf can do.
- can be resolved by code generators and other tools "top-down" since it enforces dependencies to be ordered such that no forward-referencing occurs.
- emerged out of the Apache Hadoop ecosystem and is widely used for serialization and storage of data and for data exchange between systems.
- supports native and logical types that cover the needs of many business and technical use cases.
- can describe the popular JSON data encoding very well and in a way that always maps cleanly to a wide range of programming languages and systems. In contrast, it's quite easy to inadvertently define a JSON Schema that is very difficult to map to a programming language structure.
- is itself expressed as JSON. That makes it easy to parse and generate, which is not the case for Protobuf or ASN.1, which require bespoke parsers.
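The "top-down" property can be illustrated with a small checker that walks a schema once and fails on the first name used before its definition. This is a simplified sketch under stated assumptions: the function name is hypothetical, references must be fully qualified, and namespace inheritance for nested types is ignored.

```python
PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def check_top_down(schema, defined=None):
    """Return the set of named types, raising if a name is used before it is defined."""
    defined = set() if defined is None else defined
    if isinstance(schema, str):
        if schema not in PRIMITIVES and schema not in defined:
            raise ValueError(f"forward reference to {schema!r}")
    elif isinstance(schema, list):  # a union, including a top-level union of types
        for branch in schema:
            check_top_down(branch, defined)
    elif isinstance(schema, dict):
        kind = schema["type"]
        if kind in ("record", "enum", "fixed"):
            namespace = schema.get("namespace")
            name = schema["name"]
            # Register the name before visiting fields so self-references are allowed.
            defined.add(f"{namespace}.{name}" if namespace else name)
        if kind == "record":
            for field in schema["fields"]:
                check_top_down(field["type"], defined)
        elif kind == "array":
            check_top_down(schema["items"], defined)
        elif kind == "map":
            check_top_down(schema["values"], defined)
        elif kind not in ("enum", "fixed"):
            check_top_down(kind, defined)  # primitive, named reference, or nested type
    return defined


schema = [
    {"type": "record", "name": "Address", "namespace": "com.example.common",
     "fields": [{"name": "city", "type": "string"}]},
    {"type": "record", "name": "Customer", "namespace": "com.example.sales",
     "fields": [{"name": "address", "type": "com.example.common.Address"}]},
]
names = check_top_down(schema)
print(sorted(names))
```

The example also shows two namespaces living in one self-contained document; swapping the two records would raise a ValueError because Customer would then refer to Address before it is defined.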
Note that while Avro Schema is great for defining data structures, and data classes generated from Avro Schema using this tool or other tools can be used with the most popular JSON serialization libraries, the Apache Avro project's own JSON encoding has fairly grave interoperability issues with common usage of JSON. Avrotize defines an alternate JSON encoding in avrojson.md.
Avro Schema does not support all the bells and whistles of XML Schema or JSON Schema, but that is a feature, not a bug, as it ensures the portability of the schemas across different systems and infrastructures. Specifically, Avro Schema does not support many of the data validation features found in JSON Schema or XML Schema. There are no pattern, format, minimum, maximum, or required keywords in Avro Schema, and Avro does not support conditional validation.
In a system where data originates as XML or JSON described by a validating XML Schema or JSON Schema, the assumption we make here is that data will be validated using its native schema language first, and then the Avro Schema will be used for transformation or transfer or storage.
When converting Avrotize Schema to Kusto Data Table Definition (KQL), SQL Table Definition, or Parquet Schema, the tool can add special columns for CloudEvents attributes. CNCF CloudEvents is a specification for describing event data in a common way.
The rationale for adding such columns to database tables is that messages and events commonly separate event metadata from the payload data, while that information is merged when events are projected into a database. The metadata often carries important context information about the event that is not contained in the payload itself. Therefore, the tool can add those columns to the database tables for easy alignment of the message context with the payload when building event stores.
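As a rough sketch of what adding such columns means, the function below appends prefixed CloudEvents attribute columns to a list of payload columns. The attribute names and the underscore prefix follow the option descriptions later in this document; the function name, the column types, and the placement after the payload columns are assumptions made for illustration.

```python
CLOUDEVENTS_ATTRIBUTES = ["id", "source", "subject", "type", "time"]


def with_cloudevents_columns(columns: list[tuple[str, str]], prefix: str = "___") -> list[tuple[str, str]]:
    """Return the payload columns followed by prefixed CloudEvents metadata columns."""
    # Assumed types for this sketch: 'time' is a timestamp, everything else a string.
    metadata = [
        (f"{prefix}{name}", "datetime" if name == "time" else "string")
        for name in CLOUDEVENTS_ATTRIBUTES
    ]
    return list(columns) + metadata


payload = [("orderId", "string"), ("total", "real")]
table = with_cloudevents_columns(payload)
print([name for name, _ in table])
```

The prefix keeps the metadata columns from colliding with payload fields that happen to be called id or type.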
avrotize p2a <path_to_proto_file> [--out <path_to_avro_schema_file>]
Parameters:
- <path_to_proto_file>: The path to the Protobuf schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the Avrotize Schema file to write the conversion result to. If omitted, the output is directed to stdout.
Conversion notes:
- Proto 2 and Proto 3 syntax are supported.
- Proto package names are mapped to Avro namespaces. The tool does resolve imports and consolidates all imported types into a single Avrotize Schema file.
- The tool embeds all 'well-known' Protobuf 3.0 types in Avro format and injects them as needed when the respective types are imported. Only the Timestamp type is mapped to the Avro logical type 'timestamp-millis'. The rest of the well-known Protobuf types are kept as Avro record types with the same field names and types.
- Protobuf allows integral and string types as keys in a map, whereas Avro map keys are always strings. When converting from Proto to Avro, the type information for the map keys is ignored.
- The field numbers in message types are not mapped to the positions of the fields in Avro records. The fields in Avro are ordered as they appear in the Proto schema. Consequently, the Avrotize Schema also ignores the extensions and reserved keywords in the Proto schema.
- The optional keyword results in an Avro field being nullable (union with the null type), while the required keyword results in a non-nullable field. The repeated keyword results in an Avro field being an array of the field type.
- The oneof keyword in Proto is mapped to an Avro union type.
- All options in the Proto schema are ignored.
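The label and map rules above can be sketched as a few small functions. This is an illustration of the stated rules, not the converter's code: the function names are hypothetical, and treating a field without a label like a required one is an assumption.

```python
def avro_field_type(base_type: str, label: str = ""):
    """Map a Proto field label to the shape of the Avro field type."""
    if label == "repeated":
        return {"type": "array", "items": base_type}
    if label == "optional":
        return ["null", base_type]  # nullable: union with the null type
    return base_type  # 'required'; unlabeled fields are treated the same (assumption)


def avro_map(key_type: str, value_type: str) -> dict:
    """Avro maps have no key type, so the Proto key type is dropped."""
    return {"type": "map", "values": value_type}


def avro_oneof(member_types: list[str]) -> list[str]:
    """A oneof becomes a union of its member types."""
    return list(member_types)


print(avro_field_type("string", "optional"))  # ['null', 'string']
print(avro_field_type("int", "repeated"))     # {'type': 'array', 'items': 'int'}
print(avro_map("int32", "string"))            # {'type': 'map', 'values': 'string'}
```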
avrotize a2p <path_to_avro_schema_file> [--out <path_to_proto_directory>] [--naming <naming_mode>] [--allow-optional]
Parameters:
- <path_to_avro_schema_file>: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the Protobuf schema directory to write the conversion result to. If omitted, the output is directed to stdout.
- --naming: (optional) Type naming convention. Choices are snake, camel, pascal.
- --allow-optional: (optional) Enable support for 'optional' fields.
Conversion notes:
- Avro namespaces are resolved into distinct proto package definitions. The tool will create a new .proto file with the package definition and an import statement for each namespace found in the Avrotize Schema.
- Avro type unions [] are converted to oneof expressions in Proto. Avro allows for maps and arrays in the type union, whereas Proto only supports scalar types and message type references. The tool will therefore emit message types containing a single array or map field for any such case and add it to the containing type, and will also recursively resolve further unions in the array and map values.
- The sequence of fields in a message follows the sequence of fields in the Avro record. When type unions need to be resolved into oneof expressions, the alternative fields need to be assigned field numbers, which will shift the field numbers for any subsequent fields.
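The field-number shift is easiest to see with a small numbering sketch. It assumes that every union branch is a primitive type name and gets its own number, and it invents the naming pattern for the alternative fields; the actual names and null handling of the tool may differ.

```python
def assign_field_numbers(fields: list[tuple[str, object]]) -> list[tuple[str, int]]:
    """Number fields in record order; each union branch consumes its own number."""
    numbered = []
    next_number = 1
    for name, avro_type in fields:
        if isinstance(avro_type, list):  # union -> one oneof alternative per branch
            for branch in avro_type:
                numbered.append((f"{name}_{branch}", next_number))  # hypothetical naming
                next_number += 1
        else:
            numbered.append((name, next_number))
            next_number += 1
    return numbered


fields = [("id", "string"), ("value", ["string", "int"]), ("note", "string")]
print(assign_field_numbers(fields))
# [('id', 1), ('value_string', 2), ('value_int', 3), ('note', 4)]
```

The note field is the third field of the record but ends up with field number 4, because the two alternatives of value take numbers 2 and 3.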
avrotize j2a <path_to_json_schema_file> [--out <path_to_avro_schema_file>] [--namespace <avro_schema_namespace>] [--split-top-level-records]
Parameters:
- <path_to_json_schema_file>: The path to the JSON schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the Avrotize Schema file to write the conversion result to. If omitted, the output is directed to stdout.
- --namespace: (optional) The namespace to use in the Avrotize Schema if the JSON schema does not define a namespace.
- --split-top-level-records: (optional) Split top-level records into separate files.
Conversion notes:
avrotize a2j <path_to_avro_schema_file> [--out <path_to_json_schema_file>] [--naming <naming_mode>]
Parameters:
- <path_to_avro_schema_file>: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the JSON schema file to write the conversion result to. If omitted, the output is directed to stdout.
- --naming: (optional) Type naming convention. Choices are snake, camel, pascal, default.
Conversion notes:
avrotize x2a <path_to_xsd_file> [--out <path_to_avro_schema_file>] [--namespace <avro_schema_namespace>]
Parameters:
- <path_to_xsd_file>: The path to the XML schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the Avrotize Schema file to write the conversion result to. If omitted, the output is directed to stdout.
- --namespace: (optional) The namespace to use in the Avrotize Schema if the XML schema does not define a namespace.
Conversion notes:
- All XML Schema constructs are mapped to Avro record types with fields, whereby both elements and attributes become fields in the record. XML is therefore flattened into fields and this aspect of the structure is not preserved.
- Avro does not support xsd:any as Avro does not support arbitrary typing and must always use a named type. The tool will map xsd:any to a field any typed as a union that allows scalar values or two levels of array and/or map nesting.
- simpleType declarations that define enums are mapped to enum types in Avro. All other facets are ignored and simple types are mapped to the corresponding Avro type.
- complexType declarations that have simple content where a base type is augmented with attributes are mapped to a record type in Avro. Any other facets defined on the complex type are ignored.
- If the schema defines a single root element, the tool will emit a single Avro record type. If the schema defines multiple root elements, the tool will emit a union of record types, each corresponding to a root element.
- All fields in the resulting Avrotize Schema are annotated with an xmlkind extension attribute that indicates whether the field was an element or an attribute in the XML schema.
avrotize a2x <path_to_avro_schema_file> [--out <path_to_xsd_schema_file>] [--namespace <target_namespace>]
Parameters:
- <path_to_avro_schema_file>: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the XML schema file to write the conversion result to. If omitted, the output is directed to stdout.
- --namespace: (optional) Target namespace for the XSD schema.
Conversion notes:
- Avro record types are mapped to XML Schema complex types with elements.
- Avro enum types are mapped to XML Schema simple types with restrictions.
- Avro logical types are mapped to XML Schema simple types with restrictions where required.
- Avro unions are mapped to standalone XSD simple type definitions with a union restriction if all union types are primitives.
- Avro unions with complex types are resolved into distinct types for each option that are then joined with a choice.
avrotize s2x <path_to_structure_file> [--out <path_to_xsd_schema_file>] [--namespace <target_namespace>]
Parameters:
- <path_to_structure_file>: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the XML schema file to write the conversion result to. If omitted, the output is directed to stdout.
- --namespace: (optional) Target namespace for the XSD schema.
Conversion notes:
- JSON Structure object types are mapped to XML Schema complex types with elements.
- JSON Structure primitive types (string, int8-128, uint8-128, float/double, boolean, etc.) are mapped to appropriate XSD simple types.
- Extended primitive types are mapped as follows:
  - binary/bytes → xs:base64Binary
  - date → xs:date
  - time → xs:time
  - datetime/timestamp → xs:dateTime
  - duration → xs:duration
  - uuid → xs:string
  - uri → xs:anyURI
  - decimal → xs:decimal
- Collection types:
  - array and set → complex types with sequences of items
  - map → complex type with entry elements containing key and value
  - tuple → complex type with fixed sequence of typed items
- Union types (choice or type arrays like ["string", "null"]):
  - Tagged unions (with discriminator) → xs:choice elements
  - Inline unions → abstract base types with concrete extensions
  - Nullable types → elements with minOccurs="0"
- Type references ($ref) are resolved to named XSD types
- Type extensions ($extends) are mapped to XSD complex type extensions with xs:complexContent
- Abstract types are marked with abstract="true" in XSD
- Validation constraints (minLength, maxLength, pattern, minimum, maximum) are converted to XSD restrictions/facets
- Required properties become elements with minOccurs="1", optional properties have minOccurs="0"
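As an illustration of the extended-type and minOccurs rules, the sketch below builds an xs:complexType with the standard library's ElementTree. The input shape (a plain dict from property name to type name) is a simplification of JSON Structure chosen for this example, the function name is hypothetical, and only the extended primitive types listed above are handled.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

EXTENDED_TYPES = {
    "binary": "xs:base64Binary", "bytes": "xs:base64Binary",
    "date": "xs:date", "time": "xs:time",
    "datetime": "xs:dateTime", "timestamp": "xs:dateTime",
    "duration": "xs:duration", "uuid": "xs:string",
    "uri": "xs:anyURI", "decimal": "xs:decimal",
}


def complex_type(name: str, properties: dict[str, str], required: list[str]) -> ET.Element:
    """Build an xs:complexType whose sequence holds one element per property."""
    ctype = ET.Element(f"{{{XS}}}complexType", {"name": name})
    sequence = ET.SubElement(ctype, f"{{{XS}}}sequence")
    for prop, type_name in properties.items():
        ET.SubElement(sequence, f"{{{XS}}}element", {
            "name": prop,
            "type": EXTENDED_TYPES[type_name],  # KeyError for types not listed above
            "minOccurs": "1" if prop in required else "0",
        })
    return ctype


order = complex_type("Order", {"id": "uuid", "placed": "datetime", "link": "uri"}, ["id", "placed"])
print(ET.tostring(order, encoding="unicode"))
```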
avrotize asn2a <path_to_asn1_schema_file>[,<path_to_asn1_schema_file>,...] [--out <path_to_avro_schema_file>]
Parameters:
- <path_to_asn1_schema_file>: The path to the ASN.1 schema file to be converted. The tool supports multiple files in a comma-separated list. If omitted, the file is read from stdin.
- --out: The path to the Avrotize Schema file to write the conversion result to. If omitted, the output is directed to stdout.
Conversion notes:
- All ASN.1 types are mapped to Avro record types, enums, and unions. Avro does not support the same level of type nesting as ASN.1, so the tool maps each type to the best fit.
- The tool will map the following ASN.1 types to Avro types:
  - SEQUENCE and SET are mapped to Avro record types.
  - CHOICE is mapped to an Avro record type with all fields being optional. While the CHOICE type technically corresponds to an Avro union, the ASN.1 type has different named fields for each option, which is not a feature of Avro unions.
  - OBJECT IDENTIFIER is mapped to an Avro string type.
  - ENUMERATED is mapped to an Avro enum type.
  - SEQUENCE OF and SET OF are mapped to Avro array type.
  - BIT STRING is mapped to Avro bytes type.
  - OCTET STRING is mapped to Avro bytes type.
  - INTEGER is mapped to Avro long type.
  - REAL is mapped to Avro double type.
  - BOOLEAN is mapped to Avro boolean type.
  - NULL is mapped to Avro null type.
  - UTF8String, PrintableString, IA5String, BMPString, NumericString, TeletexString, VideotexString, GraphicString, VisibleString, GeneralString, UniversalString, CharacterString, T61String are all mapped to Avro string type.
  - All other ASN.1 types are mapped to Avro string type.
- The ability to parse ASN.1 schema files is limited and the tool may not be able to parse all ASN.1 files. The tool is based on the Python asn1tools package and is limited to that package's capabilities.
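The scalar part of this mapping and the CHOICE rule can be sketched as follows. The function names are hypothetical, the lookup covers scalar types only (SEQUENCE, SET, ENUMERATED, SEQUENCE OF, and SET OF need structural handling that is not shown), and representing "optional" as a union with null and a null default is an assumption.

```python
ASN1_TO_AVRO = {
    "OBJECT IDENTIFIER": "string",
    "BIT STRING": "bytes",
    "OCTET STRING": "bytes",
    "INTEGER": "long",
    "REAL": "double",
    "BOOLEAN": "boolean",
    "NULL": "null",
}


def avro_type_for(asn1_scalar: str) -> str:
    """Look up a scalar ASN.1 type; string types and unlisted types fall back to 'string'."""
    return ASN1_TO_AVRO.get(asn1_scalar, "string")


def choice_as_record(name: str, options: dict[str, str]) -> dict:
    """Model a CHOICE as a record in which every named option is an optional field."""
    return {
        "type": "record",
        "name": name,
        "fields": [
            {"name": option, "type": ["null", avro_type_for(asn1)], "default": None}
            for option, asn1 in options.items()
        ],
    }


contact = choice_as_record("Contact", {"phone": "INTEGER", "email": "IA5String"})
print(contact["fields"])
```

Keeping the options as named fields preserves the option names, which a plain Avro union of long and string would lose.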
avrotize k2a --kusto-uri <kusto_cluster_uri> --kusto-database <kusto_database> [--out <path_to_avro_schema_file>] [--emit-cloudevents-xregistry]
Parameters:
- --kusto-uri: The URI of the Kusto cluster to connect to.
- --kusto-database: The name of the Kusto database to read the table definitions from.
- --out: The path to the Avrotize Schema file to write the conversion result to. If omitted, the output is directed to stdout.
- --emit-cloudevents-xregistry: (optional) See discussion below.
Conversion notes:
- The tool directly connects to the Kusto cluster and reads the table definitions from the specified database. The tool will convert all tables in the database to Avro record types, returned in a top-level type union.
- Connecting to the Kusto cluster leans on the same authentication mechanisms as the Azure CLI. The tool will use the same authentication context as the Azure CLI if it is installed and authenticated.
- The tool will map the Kusto column types to Avro types as follows:
  - bool is mapped to Avro boolean type.
  - datetime is mapped to Avro long type with logical type timestamp-millis.
  - decimal is mapped to a logical Avro type with the logicalType set to decimal and the precision and scale set to the values of the decimal type in Kusto.
  - guid is mapped to Avro string type.
  - int is mapped to Avro int type.
  - long is mapped to Avro long type.
  - real is mapped to Avro double type.
  - string is mapped to Avro string type.
  - timespan is mapped to a logical Avro type with the logicalType set to duration.
- For dynamic columns, the tool will sample the data in the table to determine the structure of the dynamic column. The tool will map the dynamic column to an Avro record type with fields that correspond to the fields found in the dynamic column. If the dynamic column contains nested dynamic columns, the tool will recursively map those to Avro record types. If records with conflicting structures are found in the dynamic column, the tool will emit a union of record types for the dynamic column.
- If the --emit-cloudevents-xregistry option is set, the tool will emit an xRegistry registry manifest file with a CloudEvent message definition for each table in the Kusto database and a separate Avro Schema for each table in the embedded schema registry. If one or more tables are found to contain CloudEvent data (as indicated by the presence of the CloudEvents attribute columns), the tool will inspect the content of the type (or __type) columns to determine which CloudEvent types have been stored in the table and will emit a CloudEvent definition and schema for each unique type.
avrotize a2k <path_to_avro_schema_file> [--out <path_to_kusto_kql_file>] [--record-type <record_type>] [--emit-cloudevents-columns] [--emit-cloudevents-dispatch]
Parameters:
- <path_to_avro_schema_file>: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the Kusto KQL file to write the conversion result to. If omitted, the output is directed to stdout.
- --record-type: (optional) The name of the Avro record type to convert to a Kusto table.
- --emit-cloudevents-columns: (optional) If set, the tool will add CloudEvents attribute columns to the table: ___id, ___source, ___subject, ___type, and ___time.
- --emit-cloudevents-dispatch: (optional) If set, the tool will add a table named _cloudevents_dispatch to the script or database, which serves as an ingestion and dispatch table for CloudEvents. The table has columns for the core CloudEvents attributes and a data column that holds the CloudEvents data. For each table in the Avrotize Schema, the tool will create an update policy that maps events whose type attribute matches the Avro type name to the respective table.
Conversion notes:
- Only the Avro record type can be mapped to a Kusto table. If the Avrotize Schema contains other types (like enum or array), the tool will ignore them.
- Only the first record type in the Avrotize Schema is converted to a Kusto table. If the Avrotize Schema contains other record types, they will be ignored. The --record-type option can be used to specify which record type to convert.
- The fields of the record are mapped to columns in the Kusto table. Fields that are records or arrays or maps are mapped to columns of type dynamic in the Kusto table.
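A rough sketch of the record-to-table mapping follows. It is not the tool's generator: the scalar mapping is assumed to be the reverse of the k2a type table, unknown types fall back to dynamic as a simplification, column names are not escaped, and the function names are hypothetical. The emitted text uses the KQL .create table command.

```python
AVRO_TO_KUSTO = {
    "boolean": "bool", "int": "int", "long": "long",
    "double": "real", "string": "string",
}


def kusto_column_type(avro_type) -> str:
    """Pick a Kusto column type for an Avro field type."""
    if isinstance(avro_type, dict):
        if avro_type.get("logicalType") == "timestamp-millis":
            return "datetime"
        return "dynamic"  # records, arrays, and maps
    if isinstance(avro_type, str):
        return AVRO_TO_KUSTO.get(avro_type, "dynamic")  # fallback is an assumption
    return "dynamic"  # unions are not resolved in this sketch


def create_table_command(record: dict) -> str:
    """Render a .create table command for a single Avro record."""
    columns = ", ".join(f"{f['name']}:{kusto_column_type(f['type'])}" for f in record["fields"])
    return f".create table {record['name']} ({columns})"


record = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "total", "type": "double"},
        {"name": "lines", "type": {"type": "array", "items": "string"}},
    ],
}
print(create_table_command(record))
# .create table Order (id:string, total:real, lines:dynamic)
```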
avrotize s2k <path_to_structure_schema_file> [--out <path_to_kusto_kql_file>] [--record-type <record_type>] [--emit-cloudevents-columns] [--emit-cloudevents-dispatch]
Parameters:
- <path_to_structure_schema_file>: The path to the JSON Structure Schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the Kusto KQL file to write the conversion result to. If omitted, the output is directed to stdout.
- --record-type: (optional) The name of the record type to convert to a Kusto table.
- --emit-cloudevents-columns: (optional) If set, the tool will add CloudEvents attribute columns to the table: ___id, ___source, ___subject, ___type, and ___time.
- --emit-cloudevents-dispatch: (optional) If set, the tool will add a table named _cloudevents_dispatch to the script or database, which serves as an ingestion and dispatch table for CloudEvents. The table has columns for the core CloudEvents attributes and a data column that holds the CloudEvents data. For each table in the JSON Structure Schema, the tool will create an update policy that maps events whose type attribute matches the type name to the respective table.
Conversion notes:
- Only JSON Structure object types can be mapped to a Kusto table. Other types (like enum, array, choice) are not directly convertible to tables.
- The tool converts the first object type found in the schema, or uses the type specified with --record-type.
- Object properties are mapped to columns in the Kusto table. Complex types (objects, arrays, maps, sets, tuples, choices) are mapped to columns of type dynamic.
- JSON Structure primitive types are mapped to appropriate Kusto scalar types:
  - string, uri, jsonpointer → string
  - boolean → bool
  - integer, int8, uint8, int16, uint16, int32 → int
  - uint32, int64, uint64 → long
  - int128, uint128, decimal → decimal
  - number, float, double, float8, binary32, binary64 → real
  - date, datetime, timestamp → datetime
  - time, duration → timespan
  - uuid → guid
  - binary → dynamic
avrotize a2sql [input] --out <path_to_sql_script> --dialect <dialect>
Parameters:
- input: The path to the Avrotize schema file to be converted (or read from stdin if omitted).
- --out: The path to the SQL script file to write the conversion result to.
- --dialect: The SQL dialect (database type) to target. Supported dialects include: mysql, mariadb, postgres, sqlserver, oracle, sqlite, bigquery, snowflake, redshift, db2
- --emit-cloudevents-columns: (Optional) Add CloudEvents columns to the SQL table.
For detailed conversion rules and type mappings for each SQL dialect, refer to the SQL Conversion Notes document.
avrotize struct2sql [input] --out <path_to_sql_script> --dialect <dialect> [--emit-cloudevents-columns]
Parameters:
- input: The path to the JSON Structure schema file to be converted (or read from stdin if omitted).
- --out: The path to the SQL script file to write the conversion result to.
- --dialect: The SQL dialect (database type) to target. Supported dialects include: mysql, mariadb, postgres, sqlserver, oracle, sqlite, bigquery, snowflake, redshift, db2
- --emit-cloudevents-columns: (Optional) Add CloudEvents columns to the SQL table.
Conversion notes:
- The tool converts JSON Structure schemas to SQL DDL statements for various database dialects.
- JSON Structure primitive types (string, int8-128, uint8-128, float, double, decimal, binary, date, datetime, uuid, etc.) are mapped to appropriate SQL types for each dialect.
- Compound types (array, set, map, object, choice, tuple) are typically mapped to JSON/JSONB columns or equivalent in the target database.
- Required properties from the JSON Structure schema become non-nullable columns and are used for primary keys.
- The namespace and name properties from the JSON Structure schema are used to construct table names.
- Type annotations like maxLength, precision, and scale are preserved in column comments.
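The rules above (required properties become non-nullable primary key columns, compound types become JSON columns) can be sketched for a PostgreSQL-style target. This is an illustration only: the function name, the simplified input shape, and the chosen SQL types are assumptions, and namespaces and column comments are left out.

```python
SQL_TYPES = {
    "string": "text", "int32": "integer", "int64": "bigint",
    "double": "double precision", "boolean": "boolean", "uuid": "uuid",
}
COMPOUND = {"array", "set", "map", "object", "choice", "tuple"}


def create_table(schema: dict) -> str:
    """Render a CREATE TABLE statement for one object type."""
    required = schema.get("required", [])
    lines = []
    for name, prop in schema["properties"].items():
        sql_type = "jsonb" if prop["type"] in COMPOUND else SQL_TYPES[prop["type"]]
        not_null = " NOT NULL" if name in required else ""
        lines.append(f"  {name} {sql_type}{not_null}")
    if required:
        lines.append(f"  PRIMARY KEY ({', '.join(required)})")
    return f"CREATE TABLE {schema['name']} (\n" + ",\n".join(lines) + "\n);"


shipment = {
    "type": "object",
    "name": "Shipment",
    "properties": {
        "id": {"type": "uuid"},
        "weight": {"type": "double"},
        "items": {"type": "array"},
    },
    "required": ["id"],
}
ddl = create_table(shipment)
print(ddl)
```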
For detailed conversion rules and type mappings for each SQL dialect when converting from JSON Structure, refer to the SQL Conversion Notes document.
avrotize a2mongo <path_to_avro_schema_file> [--out <path_to_mongodb_schema>] [--emit-cloudevents-columns]
Parameters:
- <path_to_avro_schema_file>: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- --out: The path to the MongoDB schema file to write the conversion result to.
- --emit-cloudevents-columns: (optional) If set, the tool will add CloudEvents attribute columns to the MongoDB schema.
Conversion notes:
- The fields of the Avro record type are mapped to fields in the MongoDB schema. Fields that are records or arrays or maps are mapped to fields of type object.
- The emitted MongoDB schema file is a JSON file that can be used with MongoDB's mongoimport tool to create a collection with the specified schema.
avrotize a2cassandra [input] --out <output_directory> [--emit-cloudevents-columns]
Parameters:
- input: Path to the Avrotize schema file (or read from stdin if omitted).
- --out: Output path for the Cassandra schema (required).
- --emit-cloudevents-columns: Add CloudEvents columns to the Cassandra schema (optional, default: false).
Refer to the detailed conversion notes for Cassandra in the NoSQL Conversion Notes.
avrotize struct2cassandra [input] --out <output_file> [--emit-cloudevents-columns]
Parameters:
- input: Path to the JSON Structure schema file (or read from stdin if omitted).
- --out: Output path for the Cassandra CQL schema file (required).
- --emit-cloudevents-columns: Add CloudEvents columns to the Cassandra schema (optional, default: false).
Conversion notes:
- The tool converts JSON Structure schemas to Cassandra CQL DDL statements.
- JSON Structure primitive types are mapped to appropriate Cassandra types (int32 → int, string → text, uuid → uuid, etc.).
- Required properties are used to construct the PRIMARY KEY for the table.
- Complex types (array, map, object) are stored as text columns in Cassandra.
Refer to the detailed conversion notes for Cassandra in the NoSQL Conversion Notes.
avrotize a2dynamodb [input] --out <output_directory> [--emit-cloudevents-columns]
Parameters:
- input: Path to the Avrotize schema file (or read from stdin if omitted).
- --out: Output path for the DynamoDB schema (required).
- --emit-cloudevents-columns: Add CloudEvents columns to the DynamoDB schema (optional, default: false).
Refer to the detailed conversion notes for DynamoDB in the NoSQL Conversion Notes.
avrotize a2es [input] --out <output_directory> [--emit-cloudevents-columns]
Parameters:
- input: Path to the Avrotize schema file (or read from stdin if omitted).
- --out: Output path for the Elasticsearch schema (required).
- --emit-cloudevents-columns: Add CloudEvents columns to the Elasticsearch schema (optional, default: false).
Refer to the detailed conversion notes for Elasticsearch in the NoSQL Conversion Notes.
avrotize a2couchdb [input] --out <output_directory> [--emit-cloudevents-columns]
Parameters:
- input: Path to the Avrotize schema file (or read from stdin if omitted).
- --out: Output path for the CouchDB schema (required).
- --emit-cloudevents-columns: Add CloudEvents columns to the CouchDB schema (optional, default: false).
Refer to the detailed conversion notes for CouchDB in the NoSQL Conversion Notes.
avrotize a2neo4j [input] --out <output_directory> [--emit-cloudevents-columns]

Parameters:
- `input`: Path to the Avrotize schema file (or read from stdin if omitted).
- `--out`: Output path for the Neo4j schema (required).
- `--emit-cloudevents-columns`: Add CloudEvents columns to the Neo4j schema (optional, default: false).
Refer to the detailed conversion notes for Neo4j in the NoSQL Conversion Notes.
avrotize a2firebase [input] --out <output_directory> [--emit-cloudevents-columns]

Parameters:
- `input`: Path to the Avrotize schema file (or read from stdin if omitted).
- `--out`: Output path for the Firebase schema (required).
- `--emit-cloudevents-columns`: Add CloudEvents columns to the Firebase schema (optional, default: false).
Refer to the detailed conversion notes for Firebase in the NoSQL Conversion Notes.
avrotize a2cosmos [input] --out <output_directory> [--emit-cloudevents-columns]

Parameters:
- `input`: Path to the Avrotize schema file (or read from stdin if omitted).
- `--out`: Output path for the CosmosDB schema (required).
- `--emit-cloudevents-columns`: Add CloudEvents columns to the CosmosDB schema (optional, default: false).
Refer to the detailed conversion notes for CosmosDB in the NoSQL Conversion Notes.
avrotize a2hbase [input] --out <output_directory> [--emit-cloudevents-columns]

Parameters:
- `input`: Path to the Avrotize schema file (or read from stdin if omitted).
- `--out`: Output path for the HBase schema (required).
- `--emit-cloudevents-columns`: Add CloudEvents columns to the HBase schema (optional, default: false).
Refer to the detailed conversion notes for HBase in the NoSQL Conversion Notes.
avrotize a2pq <path_to_avro_schema_file> [--out <path_to_parquet_schema_file>] [--record-type <record-type-from-avro>] [--emit-cloudevents-columns]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Parquet schema file to write the conversion result to. If omitted, the output is directed to stdout.
- `--record-type`: (optional) The name of the Avro record type to convert to a Parquet schema.
- `--emit-cloudevents-columns`: (optional) If set, the tool will add CloudEvents attribute columns to the Parquet schema: `__id`, `__source`, `__subject`, `__type`, and `__time`.

Conversion notes:
- The emitted Parquet file contains only the schema, no data rows.
- The tool only supports writing Parquet files for Avrotize Schema that describe a single `record` type. If the Avrotize Schema contains a top-level union, the `--record-type` option must be used to specify which record type to emit.
- The fields of the record are mapped to columns in the Parquet file. Array and record fields are mapped to Parquet nested types. Avro type unions are mapped to structures, not to Parquet unions, since those are not supported by the PyArrow library used here.
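The top-level union rule above can be pictured with a small Python sketch. The `select_record` helper and the sample schema are hypothetical and not part of Avrotize; matching on the short record name is an assumption about how a `--record-type` value would be compared.

```python
def select_record(schema, record_type=None):
    """Pick the single record to emit from an Avro-style schema.

    `schema` is either one record (a dict) or a top-level union (a list).
    """
    if isinstance(schema, dict):
        return schema  # a single record: nothing to choose
    if record_type is None:
        raise ValueError("top-level union: a record type must be named")
    for candidate in schema:
        if isinstance(candidate, dict) and candidate.get("name") == record_type:
            return candidate
    raise ValueError(f"record type {record_type!r} not found in union")


union_schema = [
    {"type": "record", "name": "Order", "fields": [{"name": "id", "type": "string"}]},
    {"type": "record", "name": "Refund", "fields": [{"name": "orderId", "type": "string"}]},
]
chosen = select_record(union_schema, "Refund")
print(chosen["name"])
```

Raising an error when no record type is named mirrors the documented requirement rather than silently picking the first branch of the union.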
avrotize a2ib <path_to_avro_schema_file> [--out <path_to_iceberg_schema_file>] [--record-type <record-type-from-avro>] [--emit-cloudevents-columns]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Iceberg schema file to write the conversion result to. If omitted, the output is directed to stdout.
- `--record-type`: (optional) The name of the Avro record type to convert to an Iceberg schema.
- `--emit-cloudevents-columns`: (optional) If set, the tool will add CloudEvents attribute columns to the Iceberg schema: `__id`, `__source`, `__subject`, `__type`, and `__time`.

Conversion notes:
- The emitted Iceberg file contains only the schema, no data rows.
- The tool only supports writing Iceberg files for Avrotize Schema that describe a single `record` type. If the Avrotize Schema contains a top-level union, the `--record-type` option must be used to specify which record type to emit.
- The fields of the record are mapped to columns in the Iceberg file. Array and record fields are mapped to Iceberg nested types. Avro type unions are mapped to structures, not to Iceberg unions, since those are not supported by the PyArrow library used here.
avrotize s2ib <path_to_structure_schema_file> [--out <path_to_iceberg_schema_file>] [--record-type <record-type-from-structure>] [--emit-cloudevents-columns]

Parameters:
- `<path_to_structure_schema_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Iceberg schema file to write the conversion result to. If omitted, the output is directed to stdout.
- `--record-type`: (optional) The name of the record type in definitions to convert to an Iceberg schema.
- `--emit-cloudevents-columns`: (optional) If set, the tool will add CloudEvents attribute columns to the Iceberg schema: `___id`, `___source`, `___subject`, `___type`, and `___time`.

Conversion notes:
- The emitted Iceberg file contains only the schema, no data rows.
- The tool supports JSON Structure schemas with `type: "object"` at the top level. If the schema contains a `$ref` or the record type is in definitions, the `--record-type` option can be used to specify which type to emit.
- JSON Structure types are mapped to Iceberg types as follows:
  - Primitive types: `string` → StringType, `boolean` → BooleanType, numeric types (int8-128, uint8-128, float, double) → appropriate IntegerType/LongType/FloatType/DoubleType
  - Extended types: `binary`/`bytes` → BinaryType, `date` → DateType, `time` → TimeType, `datetime`/`timestamp` → TimestampType, `duration` → LongType (microseconds), `decimal` → DecimalType (with precision/scale), `uuid`/`uri`/`jsonpointer` → StringType
  - Compound types: `object` → StructType, `array`/`set` → ListType, `map` → MapType, `tuple` → StructType with indexed fields
  - Choice types: Mapped to StructType with alternative fields (Iceberg doesn't have native union support)
- Type annotations such as `precision`, `scale`, and validation constraints are preserved where applicable.
- The `$extends` feature is supported - base type properties are included in the conversion.
- Required and optional properties are handled via Iceberg's `required` field flag.
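The type table above can be read as a simple lookup. The following Python sketch is only an illustration under stated assumptions: `ICEBERG_TYPES`, `describe_fields`, and the sample `event` schema are hypothetical names, the Iceberg types are represented as plain strings rather than library objects, and the numeric types are left out because the notes only say they map to an "appropriate" type.

```python
# Lookup of the mappings stated in the notes; numeric types are intentionally omitted.
ICEBERG_TYPES = {
    "string": "StringType", "boolean": "BooleanType",
    "binary": "BinaryType", "bytes": "BinaryType",
    "date": "DateType", "time": "TimeType",
    "datetime": "TimestampType", "timestamp": "TimestampType",
    "duration": "LongType",  # stored as microseconds
    "decimal": "DecimalType",
    "uuid": "StringType", "uri": "StringType", "jsonpointer": "StringType",
    "object": "StructType", "array": "ListType", "set": "ListType",
    "map": "MapType", "tuple": "StructType", "choice": "StructType",
}


def describe_fields(schema):
    """Return (name, iceberg_type, required) for each property of an object schema."""
    required = set(schema.get("required", []))
    return [
        (name, ICEBERG_TYPES[prop["type"]], name in required)
        for name, prop in schema["properties"].items()
    ]


event = {
    "type": "object",
    "properties": {
        "id": {"type": "uuid"},
        "occurred": {"type": "datetime"},
        "labels": {"type": "set", "items": {"type": "string"}},
    },
    "required": ["id", "occurred"],
}
for row in describe_fields(event):
    print(row)
```

The third element of each tuple corresponds to the required flag mentioned in the last note: properties absent from `required` are treated as optional.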
avrotize pq2a <path_to_parquet_file> [--out <path_to_avro_schema_file>] [--namespace <avro_schema_namespace>]

Parameters:
- `<path_to_parquet_file>`: The path to the Parquet file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Avrotize Schema file to write the conversion result to. If omitted, the output is directed to stdout.
- `--namespace`: (optional) The namespace to use in the Avrotize Schema if the Parquet file does not define a namespace.
Conversion notes:
- The tool reads the schema from the Parquet file and converts it to Avrotize Schema. The data in the Parquet file is not read or converted.
- The fields of the Parquet schema are mapped to fields in the Avrotize Schema. Nested fields are mapped to nested records in the Avrotize Schema.
avrotize csv2a <path_to_csv_file> [--out <path_to_avro_schema_file>] [--namespace <avro_schema_namespace>]

Parameters:
- `<path_to_csv_file>`: The path to the CSV file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Avrotize Schema file to write the conversion result to. If omitted, the output is directed to stdout.
- `--namespace`: (optional) The namespace to use in the Avrotize Schema if the CSV file does not define a namespace.
Conversion notes:
- The tool reads the CSV file and converts it to Avrotize Schema. The first row of the CSV file is assumed to be the header row, containing the field names.
- The fields of the CSV file are mapped to fields in the Avrotize Schema. The tool infers the types of the fields from the data in the CSV file.
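Header-driven field naming and type inference can be sketched in Python as follows. The inference order used here (long, then double, then boolean, then string) and the `infer_type` and `csv_to_schema` helpers are assumptions made for illustration; the rules Avrotize actually applies may differ.

```python
import csv
import io


def infer_type(values):
    """Return an Avro primitive name that fits every non-empty value (assumed rules)."""
    present = [v for v in values if v != ""]
    if not present:
        return "string"
    for avro_type, parse in (("long", int), ("double", float)):
        try:
            for value in present:
                parse(value)
            return avro_type
        except ValueError:
            continue  # at least one value does not parse; try the next type
    if all(v.lower() in ("true", "false") for v in present):
        return "boolean"
    return "string"


def csv_to_schema(text, name="Row", namespace="example"):
    """Build an Avro-style record schema; the first CSV row supplies the field names."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    fields = [
        {"name": column, "type": infer_type([row[i] for row in data])}
        for i, column in enumerate(header)
    ]
    return {"type": "record", "name": name, "namespace": namespace, "fields": fields}


sample = "id,price,active,label\n1,9.5,true,a\n2,10,false,b\n"
schema = csv_to_schema(sample)
print(schema["fields"])
```

A column only receives a numeric type when every non-empty value parses, which is why `price` widens to double even though one of its values is a whole number.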
avrotize kstruct2a [input] --out <path_to_avro_schema_file>

Parameters:
- `input`: The path to the Kafka Struct file to be converted (or read from stdin if omitted).
- `--out`: The path to the Avrotize Schema file to write the conversion result to.
- `--kstruct`: Deprecated: The path to the Kafka Struct file (for backward compatibility).
Conversion notes:
- The tool converts the Kafka Struct definition to an Avrotize Schema, mapping Kafka data types to their Avro equivalents.
- Kafka Structs are typically used to define data structures for Kafka Connect and other Kafka-based applications. This command facilitates interoperability by enabling the conversion of these definitions into Avro, which can be further used with various serialization and schema registry tools.
avrotize a2cs <path_to_avro_schema_file> [--out <path_to_csharp_dir>] [--namespace <csharp_namespace>] [--avro-annotation] [--system_text_json_annotation] [--newtonsoft-json-annotation] [--pascal-properties]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the C# classes to. Required.
- `--namespace`: (optional) The namespace to use in the C# classes.
- `--avro-annotation`: (optional) Use Avro annotations.
- `--system_text_json_annotation`: (optional) Use System.Text.Json annotations.
- `--newtonsoft-json-annotation`: (optional) Use Newtonsoft.Json annotations.
- `--pascal-properties`: (optional) Use PascalCase properties.

Conversion notes:
- The tool generates C# classes from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a C# class.
- The fields of the record are mapped to properties in the C# class. Nested records are mapped to nested classes in the C# class.
- The tool supports adding annotations to the properties in the C# class. The `--avro-annotation` option adds Avro annotations, the `--system_text_json_annotation` option adds System.Text.Json annotations, and the `--newtonsoft-json-annotation` option adds Newtonsoft.Json annotations.
- The `--pascal-properties` option changes the naming convention of the properties to PascalCase.
avrotize a2java <path_to_avro_schema_file> [--out <path_to_java_dir>] [--package <java_package>] [--avro-annotation] [--jackson-annotation] [--pascal-properties]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the Java classes to. Required.
- `--package`: (optional) The package to use in the Java classes.
- `--avro-annotation`: (optional) Use Avro annotations.
- `--jackson-annotation`: (optional) Use Jackson annotations.
- `--pascal-properties`: (optional) Use PascalCase properties.

Conversion notes:
- The tool generates Java classes from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a Java class.
- The fields of the record are mapped to properties in the Java class. Nested records are mapped to nested classes in the Java class.
- The tool supports adding annotations to the properties in the Java class. The `--avro-annotation` option adds Avro annotations, and the `--jackson-annotation` option adds Jackson annotations.
- The `--pascal-properties` option changes the naming convention of the properties to PascalCase.
avrotize a2py <path_to_avro_schema_file> [--out <path_to_python_dir>] [--package <python_package>] [--dataclasses-json-annotation] [--avro-annotation]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the Python classes to. Required.
- `--package`: (optional) The package to use in the Python classes.
- `--dataclasses-json-annotation`: (optional) Use dataclasses-json annotations.
- `--avro-annotation`: (optional) Use Avro annotations.

Conversion notes:
- The tool generates Python classes from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a Python class.
- The fields of the record are mapped to properties in the Python class. Nested records are mapped to nested classes in the Python class.
- The tool supports adding annotations to the properties in the Python class. The `--dataclasses-json-annotation` option adds dataclasses-json annotations, and the `--avro-annotation` option adds Avro annotations.
avrotize a2ts <path_to_avro_schema_file> [--out <path_to_typescript_dir>] [--package <typescript_package>] [--avro-annotation] [--typedjson-annotation]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the TypeScript classes to. Required.
- `--package`: (optional) The package to use in the TypeScript classes.
- `--avro-annotation`: (optional) Use Avro annotations.
- `--typedjson-annotation`: (optional) Use TypedJSON annotations.

Conversion notes:
- The tool generates TypeScript classes from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a TypeScript class.
- The fields of the record are mapped to properties in the TypeScript class. Nested records are mapped to nested classes in the TypeScript class.
- The tool supports adding annotations to the properties in the TypeScript class. The `--avro-annotation` option adds Avro annotations, and the `--typedjson-annotation` option adds TypedJSON annotations.
avrotize s2ts <path_to_structure_schema_file> [--out <path_to_typescript_dir>] [--package <typescript_package>] [--typedjson-annotation] [--avro-annotation]

Parameters:
- `<path_to_structure_schema_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the TypeScript classes to. Required.
- `--package`: (optional) The TypeScript package name for the generated project.
- `--typedjson-annotation`: (optional) Use TypedJSON annotations for JSON serialization support.
- `--avro-annotation`: (optional) Add Avro binary serialization support with embedded Structure schema.

Conversion notes:
- The tool generates TypeScript classes from JSON Structure schema. Each object type in the JSON Structure schema is converted to a TypeScript class.
- Supports all JSON Structure Core types including:
  - Primitive types: string, number, boolean, null
  - Extended types: binary, int8-128, uint8-128, float8/float/double, decimal, date, datetime, time, duration, uuid, uri, jsonpointer
  - Compound types: object, array, set, map, tuple, any, choice (unions)
- JSON Structure features are supported:
  - $ref references: Type references are resolved and generated as separate classes
  - $extends inheritance: Base class properties are included in derived classes
  - $offers/$uses add-ins: Add-in properties are merged into classes that use them
  - Abstract types: Marked with `abstract` keyword in TypeScript
  - Required/optional properties: Required properties are non-nullable, optional properties are nullable
  - Choice types: Converted to TypeScript union types
- The generated project includes:
  - TypeScript source files in `src/` directory
  - `package.json` with dependencies
  - `tsconfig.json` for TypeScript compilation
  - `.gitignore` file
  - `index.ts` for exporting all generated types
- The TypeScript code can be compiled using `npm run build` (requires `npm install` first)
- For more details on JSON Structure handling, see jsonstructure.md
avrotize a2js <path_to_avro_schema_file> [--out <path_to_javascript_dir>] [--package <javascript_package>] [--avro-annotation]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the JavaScript classes to. Required.
- `--package`: (optional) The package to use in the JavaScript classes.
- `--avro-annotation`: (optional) Use Avro annotations.

Conversion notes:
- The tool generates JavaScript classes from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a JavaScript class.
- The fields of the record are mapped to properties in the JavaScript class. Nested records are mapped to nested classes in the JavaScript class.
- The tool supports adding annotations to the properties in the JavaScript class. The `--avro-annotation` option adds Avro annotations.
avrotize a2cpp <path_to_avro_schema_file> [--out <path_to_cpp_dir>] [--namespace <cpp_namespace>] [--avro-annotation] [--json-annotation]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the C++ classes to. Required.
- `--namespace`: (optional) The namespace to use in the C++ classes.
- `--avro-annotation`: (optional) Use Avro annotations.
- `--json-annotation`: (optional) Use JSON annotations.

Conversion notes:
- The tool generates C++ classes from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a C++ class.
- The fields of the record are mapped to properties in the C++ class. Nested records are mapped to nested classes in the C++ class.
- The tool supports adding annotations to the properties in the C++ class. The `--avro-annotation` option adds Avro annotations, and the `--json-annotation` option adds JSON annotations.
avrotize a2go <path_to_avro_schema_file> [--out <path_to_go_dir>] [--package <go_package>] [--avro-annotation] [--json-annotation] [--package-site <go_package_site>] [--package-username <go_package_username>]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the Go structs to. Required.
- `--package`: (optional) The package to use in the Go code.
- `--package-site`: (optional) The package site to use in the Go code.
- `--package-username`: (optional) The package username to use in the Go code.
- `--avro-annotation`: (optional) Use Avro annotations.
- `--json-annotation`: (optional) Use JSON annotations.

Conversion notes:
- The tool generates Go structs from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a Go struct.
- The fields of the record are mapped to fields in the Go struct. Nested records are mapped to nested structs.
- The tool supports adding annotations to the fields of the Go struct. The `--avro-annotation` option adds Avro annotations, and the `--json-annotation` option adds JSON annotations.
avrotize a2rust <path_to_avro_schema_file> [--out <path_to_rust_dir>] [--package <rust_package>] [--avro-annotation] [--serde-annotation]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the Rust structs to. Required.
- `--package`: (optional) The package to use in the Rust code.
- `--avro-annotation`: (optional) Use Avro annotations.
- `--serde-annotation`: (optional) Use Serde annotations.

Conversion notes:
- The tool generates Rust structs from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a Rust struct.
- The fields of the record are mapped to fields in the Rust struct. Nested records are mapped to nested structs.
- The tool supports adding annotations to the fields of the Rust struct. The `--avro-annotation` option adds Avro annotations, and the `--serde-annotation` option adds Serde annotations.
avrotize s2cpp <path_to_structure_file> --out <path_to_cpp_dir> [--namespace <cpp_namespace>] [--json-annotation]

Parameters:
- `<path_to_structure_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the C++ classes to. Required.
- `--namespace`: (optional) The namespace to use in the C++ classes.
- `--json-annotation`: (optional) Include JSON serialization support (default: true).
Conversion notes:
- The tool generates C++ classes from JSON Structure schemas. Each object type in the JSON Structure schema is converted to a C++ class.
- The fields of the object are mapped to properties in the C++ class. Nested objects are mapped to nested classes.
- The tool supports all JSON Structure Core types including primitives (string, number, boolean), extended types (int8-128, uint8-128, float, double, decimal, binary, date, datetime, time, duration, uuid, uri), and compound types (object, array, set, map, tuple, choice).
- JSON Structure-specific features are supported including $ref type references, namespaces, definitions, and container type aliases.
- The generated code includes CMake build files and vcpkg dependency management for easy integration.
avrotize s2rust <path_to_structure_schema_file> [--out <path_to_rust_dir>] [--package <rust_package>] [--json-annotation]

Parameters:
- `<path_to_structure_schema_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the Rust classes to. Required.
- `--package`: (optional) The package name to use in the Rust classes.
- `--json-annotation`: (optional) Use Serde JSON annotations for serialization support.

Conversion notes:
- The tool generates Rust structs and enums from JSON Structure schemas. Each object type in the JSON Structure schema is converted to a Rust struct.
- The fields of objects are mapped to struct fields with appropriate Rust types. Nested objects are mapped to nested structs.
- All JSON Structure Core types are supported:
  - Primitive types: string, number, boolean, null
  - Extended types: binary, int8-128, uint8-128, float8/float/double, decimal, date, datetime, time, duration, uuid, uri, jsonpointer
  - Compound types: object, array, set, map, tuple, any, choice (discriminated unions)
- JSON Structure-specific features are supported:
  - Namespaces and definitions
  - Type references ($ref)
  - Required and optional properties
  - Abstract types
  - Extensions ($extends)
- The `--json-annotation` option adds Serde derive macros for JSON serialization and deserialization.
- Generated code includes embedded unit tests that verify struct creation and serialization (when annotations are enabled).
avrotize s2go <path_to_structure_file> --out <path_to_go_dir> [--package <go_package>] [--json-annotation] [--avro-annotation] [--package-site <package_site>] [--package-username <username>]

Parameters:
- `<path_to_structure_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the directory to write the Go structs to. Required.
- `--package`: (optional) The package name to use in the Go code.
- `--json-annotation`: (optional) Add JSON struct tags for encoding/json.
- `--avro-annotation`: (optional) Add Avro struct tags.
- `--package-site`: (optional) The package site for the Go module (e.g., github.com).
- `--package-username`: (optional) The username/organization for the Go module.

Conversion notes:
- The tool generates Go structs from JSON Structure schemas. Each object type is converted to a Go struct.
- JSON Structure primitive types are mapped to Go types. Extended types like `date`, `time`, `datetime` are mapped to time.Time.
- Integer types (int8, int16, int32, int64, uint8, etc.) are mapped to corresponding Go integer types.
- Choice types are generated as interface{} types for flexibility.
- The tool generates a complete Go module with go.mod file, struct definitions, helper functions, and unit tests.
- Generated code includes methods for JSON serialization/deserialization when annotations are enabled.
avrotize a2dp <path_to_avro_schema_file> [--out <path_to_datapackage_file>] [--record-type <record-type-from-avro>]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Datapackage schema file to write the conversion result to. If omitted, the output is directed to stdout.
- `--record-type`: (optional) The name of the Avro record type to convert to a Datapackage schema.
Conversion notes:
- The tool generates a Datapackage schema from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a Datapackage resource.
- The fields of the record are mapped to fields in the Datapackage resource. Nested records are mapped to nested resources in the Datapackage.
avrotize s2dp <path_to_structure_schema_file> [--out <path_to_datapackage_file>] [--record-type <record-type-from-structure>]

Parameters:
- `<path_to_structure_schema_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Datapackage schema file to write the conversion result to. If omitted, the output is directed to stdout.
- `--record-type`: (optional) The name of the JSON Structure record type to convert to a Datapackage schema.

Conversion notes:
- The tool generates a Datapackage schema from the JSON Structure schema. Each object type in the JSON Structure schema is converted to a Datapackage resource.
- The properties of the object are mapped to fields in the Datapackage resource schema.
- All JSON Structure Core types are supported, including:
  - JSON primitive types (string, number, boolean, null)
  - Extended primitive types (int8-128, uint8-128, float/double, decimal, binary, date, datetime, time, duration, uuid, uri, jsonpointer)
  - Compound types (object, array, set, map, tuple, choice/union)
- JSON Structure-specific features are preserved:
  - Namespaces are used to organize resources
  - Type references ($ref) are resolved
  - Type annotations (maxLength, minLength, pattern, minimum, maximum, enum) are converted to Data Package field constraints
  - Union types (nullable fields) are properly handled
avrotize a2md <path_to_avro_schema_file> [--out <path_to_markdown_file>]

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Markdown file to write the conversion result to. If omitted, the output is directed to stdout.
Conversion notes:
- The tool generates Markdown documentation from the Avrotize Schema. Each record type in the Avrotize Schema is converted to a Markdown section.
- The fields of the record are documented in a table in the Markdown section. Nested records are documented in nested sections in the Markdown file.
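As a rough picture of the record-to-section mapping described above, the following Python sketch renders one section per record with a field table and recurses into nested records. The heading levels, the two-column table layout, and the `record_to_markdown` helper are assumptions for illustration; the Markdown that Avrotize emits may be laid out differently.

```python
def record_to_markdown(record, level=2):
    """Return Markdown lines: a heading, a field table, then nested record sections."""
    lines = [f"{'#' * level} {record['name']}", "", "| Field | Type |", "| --- | --- |"]
    nested = []
    for field in record["fields"]:
        field_type = field["type"]
        if isinstance(field_type, dict) and field_type.get("type") == "record":
            nested.append(field_type)  # documented in its own subsection below
            type_label = field_type["name"]
        else:
            type_label = str(field_type)
        lines.append(f"| {field['name']} | {type_label} |")
    for child in nested:
        lines.append("")
        lines.extend(record_to_markdown(child, level + 1))
    return lines


record = {
    "type": "record",
    "name": "Order",
    "fields": [
        {"name": "id", "type": "string"},
        {
            "name": "address",
            "type": {
                "type": "record",
                "name": "Address",
                "fields": [{"name": "city", "type": "string"}],
            },
        },
    ],
}
print("\n".join(record_to_markdown(record)))
```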
avrotize struct2md <path_to_structure_schema_file> [--out <path_to_markdown_file>]

Parameters:
- `<path_to_structure_schema_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Markdown file to write the conversion result to. If omitted, the output is directed to stdout.

Conversion notes:
- The tool generates Markdown documentation from JSON Structure Core schemas following the patterns established by the Avrotize Schema to Markdown converter.
- Supports all JSON Structure Core types including:
  - JSON Primitive Types: string, number, boolean, null
  - Extended Primitive Types: binary, int8-128, uint8-128, float8/float/double, decimal, date, datetime, time, duration, uuid, uri, jsonpointer
  - Compound Types: object, array, set, map, tuple, any, choice (both tagged and inline unions)
- Supports JSON Structure Core features:
  - Namespaces and definitions are documented in separate sections
  - Type references ($ref) are converted to Markdown links
  - Extensions ($extends) and abstract types are clearly marked
  - Required/optional properties are indicated
- Extended features (when present in schemas):
  - Validation constraints (minLength, maxLength, minimum, maximum, pattern, etc.) are documented alongside properties
  - Type-specific annotations (precision, scale for decimals, minItems/maxItems for arrays, etc.)
- Each object type in the schema is converted to a Markdown section with its properties documented in a structured list format.
- Choice types (unions) are documented with their selector (if present) and available choices.
- The definitions section documents all reusable type definitions.
avrotize s2csv <path_to_structure_schema_file> [--out <path_to_csv_schema_file>]

Parameters:
- `<path_to_structure_schema_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the CSV schema file to write the conversion result to. If omitted, the output is directed to stdout.

Conversion notes:
- The tool converts JSON Structure schemas to CSV Schema format.
- All JSON Structure Core types are supported including primitives (string, number, boolean, null, integer), extended types (int8-128, uint8-128, float/double, decimal, date, datetime, time, duration, uuid, uri, binary), and compound types (object, array, set, map, tuple, choice).
- Compound types (arrays, objects, maps) are represented as strings in CSV schema, as CSV format doesn't have native support for complex nested structures.
- Required/optional properties are preserved with the `nullable` flag.
- Validation constraints (maxLength, minLength, pattern, minimum, maximum, precision, scale) are preserved in the CSV schema.
- Enum and const keywords are supported and preserved in the output.
- JSON Structure-specific features like `$ref`, `$extends`, definitions, and namespaces are resolved during conversion.
avrotize s2p <path_to_json_structure_file> --out <path_to_proto_directory> [--naming-mode <naming_mode>] [--allow-optional]

Parameters:
- `<path_to_json_structure_file>`: The path to the JSON Structure schema file to be converted. If omitted, the file is read from stdin.
- `--out`: The path to the Protocol Buffers schema directory to write the conversion result to. This parameter is required as proto files need to be written to a directory.
- `--naming-mode`: (optional) Type naming convention. Choices are `snake`, `camel`, `pascal`. Default is `pascal`.
- `--allow-optional`: (optional) Enable support for 'optional' keyword for nullable fields (proto3).

Conversion notes:
- The tool converts JSON Structure schemas directly to Protocol Buffers `.proto` files without going through Avrotize Schema.
- JSON Structure primitive types (string, number, boolean, null) and extended types (int8-128, uint8-128, float32/64, decimal, date, datetime, time, duration, uuid, uri) are mapped to appropriate Protocol Buffers types.
- Compound types (object, array, set, map, tuple, choice) are converted to Protocol Buffers messages, repeated fields, map fields, and oneof constructs.
- JSON Structure namespaces are resolved into distinct proto package definitions.
- Type references (`$ref`) are resolved and converted to appropriate message types.
- Choice types (unions) are converted to Protocol Buffers `oneof` constructs.
- Abstract types and extensions (`$extends`) are handled by generating appropriate message hierarchies.
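The three `--naming-mode` choices correspond to common identifier styles. The Python sketch below shows one plausible way to derive each style from a type name; the `split_words` and `rename` helpers and the word-splitting rules are assumptions, not the tool's implementation.

```python
import re


def split_words(name):
    """Break a name on underscores, hyphens, spaces, and lower-to-upper transitions."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name)
    return [word.lower() for word in re.split(r"[_\-\s]+", spaced) if word]


def rename(name, mode="pascal"):
    """Apply one of the naming modes: snake, camel, or pascal (the documented default)."""
    words = split_words(name)
    if mode == "snake":
        return "_".join(words)
    if mode == "camel":
        return words[0] + "".join(word.capitalize() for word in words[1:])
    if mode == "pascal":
        return "".join(word.capitalize() for word in words)
    raise ValueError(f"unknown naming mode: {mode}")


for mode in ("snake", "camel", "pascal"):
    print(mode, rename("order_line_item", mode))
```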
avrotize pcf <path_to_avro_schema_file>

Parameters:
- `<path_to_avro_schema_file>`: The path to the Avrotize Schema file to be converted. If omitted, the file is read from stdin.
Conversion notes:
- The tool generates the Parsing Canonical Form (PCF) of the Avrotize Schema. The PCF is a normalized form of the schema that is used for schema comparison and compatibility checking.
- The PCF is a JSON object that is written to stdout.
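For orientation, the Apache Avro specification defines Parsing Canonical Form through a handful of rewrite rules: primitives in simple form, full names, stripping of non-parsing attributes such as `doc` and `default`, a fixed attribute order, and no whitespace. The Python sketch below applies a simplified version of those rules. It is an illustration, not the code behind `avrotize pcf`; the `canonical` and `pcf` helpers are hypothetical, and it assumes plain Avro constructs only.

```python
import json

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


def canonical(schema, namespace=""):
    """Return a canonical-form structure for an Avro schema (simplified sketch)."""
    if isinstance(schema, list):  # union: canonicalize each branch
        return [canonical(branch, namespace) for branch in schema]
    if isinstance(schema, str):
        if schema in PRIMITIVES or "." in schema or not namespace:
            return schema
        return f"{namespace}.{schema}"  # short reference to a named type
    kind = schema["type"]
    if not isinstance(kind, str):  # e.g. {"type": {"type": "array", ...}}
        return canonical(kind, namespace)
    if kind in PRIMITIVES:
        return kind  # {"type": "int"} becomes "int"
    out = {}
    if kind in ("record", "enum", "fixed"):
        name = schema["name"]
        if "." in name:
            namespace = name.rsplit(".", 1)[0]
        else:
            namespace = schema.get("namespace", namespace)
            if namespace:
                name = f"{namespace}.{name}"
        out["name"] = name  # full name; the namespace attribute is dropped
    out["type"] = kind
    if kind == "record":
        out["fields"] = [
            {"name": f["name"], "type": canonical(f["type"], namespace)}
            for f in schema["fields"]
        ]
    elif kind == "enum":
        out["symbols"] = list(schema["symbols"])
    elif kind == "array":
        out["items"] = canonical(schema["items"], namespace)
    elif kind == "map":
        out["values"] = canonical(schema["values"], namespace)
    elif kind == "fixed":
        out["size"] = int(schema["size"])
    return out


def pcf(schema):
    # Dict insertion order provides the attribute order; compact separators drop whitespace.
    return json.dumps(canonical(schema), separators=(",", ":"), ensure_ascii=False)


order = {
    "type": "record",
    "name": "Order",
    "namespace": "example",
    "doc": "An order",
    "fields": [
        {"name": "id", "type": {"type": "string"}, "doc": "key"},
        {"name": "qty", "type": ["null", "int"], "default": None},
    ],
}
print(pcf(order))
```

Because documentation and defaults are stripped, two schemas that differ only in those attributes produce the same canonical text, which is what makes the form useful for comparison.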
avrotize struct2gql [input] --out <path_to_graphql_schema_file>

Parameters:
- `[input]`: The path to the JSON Structure schema file. If omitted, the file is read from stdin.
- `--out <path_to_graphql_schema_file>`: The path to the output GraphQL schema file.
Conversion notes:
- Converts JSON Structure Core schema to GraphQL schema language (SDL)
- Supports all JSON Structure Core primitive types (string, number, boolean, null)
- Supports extended primitives (binary, int8-128, uint8-128, float/double, decimal, date, datetime, time, duration, uuid, uri, jsonpointer)
- Supports compound types (object, array, set, map, tuple, any, choice)
- Resolves type references ($ref) and maintains proper dependency ordering
- Maps JSON Structure namespaces to GraphQL types with simple names
- Generates custom scalars for specialized types (Date, DateTime, UUID, URI, Decimal, Binary, JSON)
- Required properties are marked with `!` in GraphQL
- Arrays and sets are represented as GraphQL lists `[Type]`
- Maps are represented using the JSON scalar type
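Three of the rules above (required properties, list types, and the JSON scalar for maps) can be sketched in Python. The `SCALARS` table, the `gql_type` and `to_sdl` helpers, and the sample schema are hypothetical; in particular, which JSON Structure type maps to which scalar name is an assumption based on the scalar names listed in the notes.

```python
# Assumed scalar names; String and Boolean are GraphQL built-ins, the rest custom scalars.
SCALARS = {"string": "String", "boolean": "Boolean", "uuid": "UUID", "date": "Date"}


def gql_type(prop):
    """Map one JSON Structure property to a GraphQL type expression."""
    kind = prop["type"]
    if kind in ("array", "set"):
        return f"[{gql_type(prop['items'])}]"  # arrays and sets become lists
    if kind == "map":
        return "JSON"  # maps use the JSON scalar
    return SCALARS[kind]


def to_sdl(name, schema):
    """Render a GraphQL object type; required properties get a trailing '!'."""
    required = set(schema.get("required", []))
    lines = [f"type {name} {{"]
    for field, prop in schema["properties"].items():
        suffix = "!" if field in required else ""
        lines.append(f"  {field}: {gql_type(prop)}{suffix}")
    lines.append("}")
    return "\n".join(lines)


person = {
    "type": "object",
    "properties": {
        "id": {"type": "uuid"},
        "tags": {"type": "array", "items": {"type": "string"}},
        "attrs": {"type": "map", "values": {"type": "string"}},
    },
    "required": ["id"],
}
print(to_sdl("Person", person))
```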
Example:
# Convert a JSON Structure schema to GraphQL
avrotize struct2gql myschema.struct.json --out myschema.graphql
# Read from stdin and write to stdout
cat myschema.struct.json | avrotize struct2gql > myschema.graphql

This document provides an overview of the usage and functionality of Avrotize. For more detailed information, please refer to the Avrotize Schema documentation and the individual command help messages.