Contributing to httplint

Contributions - in the form of code, bugs, or ideas - are very welcome!

Setting up a Development Environment

It should be possible to use any modern Unix-like environment, provided that a recent release of Python is installed.

Thanks to Makefile.venv, a Python virtual environment is set up and run each time you use make. As long as you use make, Python dependencies will be installed automatically.

Helpful make targets include:

  • make shell - start a shell in the Python virtual environment
  • make python - start an interactive Python interpreter in the virtual environment
  • make lint - run pylint with httplint-specific configuration
  • make typecheck - run mypy to check Python types
  • make tidy - format Python source
  • make test - run the tests

You can run the tests in an individual field/parsers/foo_bar.py file by running make test_field_foo_bar.

Coding Conventions

  • All user-visible strings need to be internationalised; see TRANSLATION.md.
  • Every new field and every new Note should have a test covering it.
  • All Python functions and methods need to have type annotations. See pyproject.toml for specific pylint and mypy settings.

Before you Submit

The best way to submit a change is through a pull request. A few things to keep in mind when you're doing so:

  • Run make tidy.
  • Check your code with make lint and address any issues found.
  • Check your code with make typecheck and address any issues found.

If you're not sure how to dig in, feel free to ask for help, or sketch out an idea in an issue first.

Common Tasks

Adding a New Field Handler

The httplint/field/parsers directory contains field handlers for httplint. They parse and check field values, set any field-specific notes that are appropriate, and join the values together in a data structure that represents the field.

Note that not all checks are in these files; ones that require coordination between several fields' values, for example, belong in a separate type of check (as cache testing is done, in cache.py). This is because fields can come in any order, so you can't be sure that another field's value will be available when your parser runs.

It's pretty easy to add support for new fields. To start, fork the source and add a new file to the parsers directory. Its name should correspond to the field's name, in all lowercase, with special characters (most commonly, -) replaced by underscores.

For example, if your field's name is Foo-Example, the appropriate filename is foo_example.py.

If your field name doesn't work with this convention, please raise an issue.
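The naming convention above can be sketched as a small function. This is only an illustration: field_filename is a hypothetical helper, not part of httplint, and it assumes that every character other than a lowercase letter or digit counts as "special".

```python
import re


def field_filename(field_name: str) -> str:
    """Map a field name to its parser filename (hypothetical helper, not in httplint)."""
    # Lowercase the name, then replace every character that is not a-z or 0-9
    # with an underscore. Treating all such characters as "special" is an assumption.
    return re.sub(r"[^a-z0-9]", "_", field_name.lower()) + ".py"


if __name__ == "__main__":
    print(field_filename("Foo-Example"))
```

If a name maps to something surprising under this rule, that is a good sign the convention doesn't fit and an issue is warranted.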

The easiest way to get started is to copy httplint/field/parsers/field.tpl to your new file. For example, a list field like Vary starts with:

from httplint.field import HttpListField
from httplint.note import categories
from httplint.syntax import rfc9110
from httplint.types import ResponseLinterProtocol

class vary(HttpListField[ResponseLinterProtocol]):
    canonical_name = "Vary"
    description = """..."""
    reference = f"{rfc9110.SPEC_URL}#field.vary"
    syntax = rfc9110.Vary
    category = categories.CONNEG

Protocol Specialization

All field classes and their tests should be specialized with the appropriate message protocol from httplint.types.

When you specialize a field handler, it automatically sets the valid_in_requests and valid_in_responses properties for you:

  • RequestLinterProtocol: valid_in_requests is True, valid_in_responses is False.
  • ResponseLinterProtocol: valid_in_requests is False, valid_in_responses is True.
  • AnyMessageLinterProtocol: Both are True.

These are exposed as instance properties to ensure they are only used during a linting pass. Do not access them on the class itself.
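As a rough mental model, the mapping above behaves like the lookup table below. This is a standalone sketch, not httplint's implementation: the protocols are represented by their names as plain strings, and VALIDITY and valid_in_context are hypothetical.

```python
from typing import Dict, Tuple

# Protocol name -> (valid_in_requests, valid_in_responses), as listed above.
# Strings stand in for the real protocol types in httplint.types.
VALIDITY: Dict[str, Tuple[bool, bool]] = {
    "RequestLinterProtocol": (True, False),
    "ResponseLinterProtocol": (False, True),
    "AnyMessageLinterProtocol": (True, True),
}


def valid_in_context(protocol_name: str, is_request: bool) -> bool:
    """Return whether a field specialized with protocol_name fits the message type."""
    valid_in_requests, valid_in_responses = VALIDITY[protocol_name]
    return valid_in_requests if is_request else valid_in_responses
```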

Example import:

from httplint.types import AddNoteMethodType, RequestLinterProtocol

The HttpListField Class

Most HTTP headers are comma-separated lists of values. To implement one, inherit from httplint.field.HttpListField.

When using HttpListField:

  • parse is called for each individual value in the list (separated by commas).
  • value (the property accessed in evaluate) is a list of the results of calling parse.
  • syntax (if provided) is checked against each individual value in the list, automatically adding notes if the syntax is invalid.
  • valid_in_requests and valid_in_responses are used to automatically add notes if the field is used in an invalid context.
  • deprecated is used to automatically add notes if the field is deprecated.
  • category (if provided) is used to group automatically added notes.

The SingletonField Class

If the header you are adding strictly allows only one value (e.g., Age, Date), it should inherit from httplint.field.singleton_field.SingletonField (which itself inherits from HttpField).

SingletonField has no additional instance variables, but it alters how parse and evaluate work.

When using SingletonField:

  • parse is called once for the entire field value.
  • value (the property accessed in evaluate) is a single value, representing the first call to parse.
  • syntax (if provided) is checked against the entire value.
  • If the field is repeated in a message, it will be flagged with a SINGLE_HEADER_REPEAT note and only the first value will be used.
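The repeat handling described above can be pictured with a standalone sketch. It does not use httplint: process_singleton is hypothetical, the note is represented by its name as a string, and parsing only the first field line is an assumption about how the base class behaves.

```python
from typing import Callable, List, Optional, Tuple


def process_singleton(
    field_lines: List[str], parse: Callable[[str], int]
) -> Tuple[Optional[int], List[str]]:
    """Keep the first field line's parsed value and flag any repeats."""
    notes: List[str] = []
    if len(field_lines) > 1:
        # The real class adds a SINGLE_HEADER_REPEAT note; a string stands in here.
        notes.append("SINGLE_HEADER_REPEAT")
    # Assumption: only the first line is parsed; later lines are ignored.
    value = parse(field_lines[0]) if field_lines else None
    return value, notes
```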

For example, a SingletonField like Age starts with:

from httplint.field.singleton_field import SingletonField
from httplint.syntax import rfc9111
from httplint.types import ResponseLinterProtocol

class age(SingletonField[ResponseLinterProtocol]):
    canonical_name = "Age"
    description = """\
The `Age` response header conveys the sender's estimate of the amount of time since the response
(or its validation) was generated at the origin server."""
    reference = f"{rfc9111.SPEC_URL}#field.age"
    syntax = False  # rfc9111.Age
    deprecated = False

The StructuredField Class

If the header you are adding is a Structured Field, it should inherit from httplint.field.structured_field.StructuredField (which itself inherits from HttpField).

StructuredField has one additional instance variable:

  • sf_type - string, the type of Structured Field (item, list, or dictionary). Required.

When using StructuredField, you do not need to provide a syntax regex or implement a parse method; the base class handles parsing using the http_sf library.

For example, a StructuredField like Cache-Status starts with:

from httplint.field.structured_field import StructuredField
from httplint.types import ResponseLinterProtocol

class cache_status(StructuredField[ResponseLinterProtocol]):
    canonical_name = "Cache-Status"
    reference = "https://www.rfc-editor.org/rfc/rfc9211.html"
    description = """..."""
    sf_type = "list"

The parse method

The parse method is called with two arguments, field_value and add_note, for each field line in the message. In other words, in this message:

Foo: a
Foo: b
Bar: 1, 2, 3
Baz: "def, ghi"

parse will be called twice for Foo (once with the field_value a and once with b), once for Bar (with the field_value 1, 2, 3), and once for Baz (with the field_value "def, ghi").

parse should return the parsed value corresponding to the field_value. If there is an error and the value shouldn't be remembered, raise ValueError.
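A parse body for a field that carries a non-negative integer might look roughly like the sketch below. It is written as a standalone function so it runs without httplint; in a real handler it would be a method taking self. The note name EXAMPLE_NOT_INT is hypothetical and is passed as a string, whereas real code passes a Note class to add_note.

```python
from typing import Callable


def parse_delta_seconds(field_value: str, add_note: Callable[..., None]) -> int:
    """Return field_value as an int, or add a note and raise ValueError."""
    # isdigit() alone accepts some non-ASCII digits, so require ASCII as well.
    if not (field_value.isascii() and field_value.isdigit()):
        add_note("EXAMPLE_NOT_INT")  # hypothetical note, named for illustration
        raise ValueError("not a non-negative integer")
    return int(field_value)
```

Raising ValueError after adding the note means the bad value is reported but not remembered.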

When using HttpListField, parse will be called for each item in the list, after separating on commas (excepting those inside quoted strings). In the example above, if Bar were an HttpListField, parse would be called three times for Bar (with the field_values 1, 2, and 3), but still only once for Baz.

Note that syntax is checked against each item before parse is called in an HttpListField.
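To make the comma handling concrete, here is a minimal sketch of splitting a field value on commas that fall outside quoted strings. It is not httplint's own splitter: split_list_field is a hypothetical name, and trimming whitespace and dropping empty items are assumptions made for the example.

```python
from typing import List


def split_list_field(field_value: str) -> List[str]:
    """Split on commas that are not inside a quoted string."""
    items: List[str] = []
    current: List[str] = []
    in_quotes = False
    escaped = False
    for char in field_value:
        if escaped:
            # The character after a backslash is taken literally.
            current.append(char)
            escaped = False
        elif char == "\\" and in_quotes:
            current.append(char)
            escaped = True
        elif char == '"':
            current.append(char)
            in_quotes = not in_quotes
        elif char == "," and not in_quotes:
            items.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    items.append("".join(current).strip())
    # Assumption: empty list members are ignored.
    return [item for item in items if item]
```

With the message above, the Bar value splits into three items, while the quoted Baz value stays whole.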

The evaluate method

evaluate is called with one argument, add_note. It allows setting Notes about the complete field.

evaluate is called once all of the field lines are processed, to enable the entire set of the field's values to be considered. To access the parsed value(s), use the value instance variable.

When using HttpListField, value is a list of the results of calling parse. When using SingletonField, it is a single value, representing the first call to parse.
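As an illustration of a check that needs the whole list, the sketch below flags members that appear more than once. It is a standalone function rather than a method, so value is passed in instead of being read from self.value. The note name EXAMPLE_DUPLICATE_VALUES and the case-insensitive comparison are assumptions for the example.

```python
from typing import Callable, List, Set


def evaluate_list(value: List[str], add_note: Callable[..., None]) -> None:
    """Add a single note listing any list members that are repeated."""
    seen: Set[str] = set()
    duplicates: List[str] = []
    for item in value:
        key = item.lower()  # assumption: members compare case-insensitively
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    if duplicates:
        # One note with the details as a variable, rather than one note per duplicate.
        add_note("EXAMPLE_DUPLICATE_VALUES", duplicates=", ".join(duplicates))
```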

The post_check method

post_check is called with one argument, add_note.

It is called after the entire message has been processed, including the body and other post-parsing steps (like checking for cacheability). This is the appropriate place for checks that rely on the state of the message derived from other fields or the body. Use self.message to access message state.

Note that if self.message.no_content is set, the body will not be processed, so checks relying on it should be skipped.

The pre_check method

pre_check is called with one argument, add_note. It is called before parsing or evaluating the field. If it returns False, processing for that field is aborted. This is useful for checking if a field is valid in the current context (e.g., request vs. response) before doing expensive parsing.
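The ordering of pre_check, parse, and evaluate can be pictured with a standalone sketch. ExampleHandler and process are hypothetical stand-ins, not httplint classes, and the driver loop is an assumption about how the methods fit together. Notes are represented by strings.

```python
from typing import Callable, List


class ExampleHandler:
    """Hypothetical stand-in for a response-only field handler."""

    def __init__(self, in_request: bool) -> None:
        self.in_request = in_request
        self.value: List[int] = []

    def pre_check(self, add_note: Callable[..., None]) -> bool:
        # Abort early if the field appears in the wrong kind of message.
        if self.in_request:
            add_note("EXAMPLE_RESPONSE_ONLY")
            return False
        return True

    def parse(self, field_value: str, add_note: Callable[..., None]) -> int:
        return int(field_value)  # raises ValueError on bad input

    def evaluate(self, add_note: Callable[..., None]) -> None:
        if not self.value:
            add_note("EXAMPLE_NO_USABLE_VALUE")


def process(
    handler: ExampleHandler, field_lines: List[str], add_note: Callable[..., None]
) -> bool:
    """Run pre_check, then parse each field line, then evaluate once."""
    if not handler.pre_check(add_note):
        return False  # processing for this field is aborted
    for line in field_lines:
        try:
            handler.value.append(handler.parse(line, add_note))
        except ValueError:
            continue  # the value shouldn't be remembered
    handler.evaluate(add_note)
    return True
```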

The message instance variable

Some checks may need to access other parts of the message; for example, the HTTP status code. You can access the httplint.message instance that the field is part of using the message instance variable.

Creating Notes

Field checkers report their results using Notes.

The _summary field of a Note is plain text and should be reasonably short (e.g., about one line of text). In REDbot, it's what's displayed in the "Notes" section of the results.

The _text field of a Note is markdown. That means it should NOT be indented. In REDbot, it's what's displayed when you hover over the summary.

Common notes used by multiple fields should be in httplint/field/notes.py. When possible, bias towards emitting a single note for a condition with details in _text, rather than creating multiple notes.

When reporting a syntax error, prefer BAD_SYNTAX_DETAILED over BAD_SYNTAX if you can provide the specific value that failed and the reason why. BAD_SYNTAX_DETAILED takes value and problem arguments to populate the note details.

Notes can also have child notes (sub-notes) attached to them. To do so, call add_child on the parent note instance (returned by add_note). add_child takes the Note class and any variables as arguments; the subject and other variables are inherited from the parent note.

parent_note = add_note(MY_NOTE)
parent_note.add_child(MY_SUB_NOTE)

Each field definition should also include tests, as subclasses of httplint.field.tests.FieldTest, which should be specialized with the appropriate message protocol.

It expects the following class properties:

  • name - the field's field-name
  • inputs - a list of the field's field-values, one item per field line. E.g., [b"foo", b"foo, bar"]
  • expected_out - the data structure that parse should return, given the inputs
  • expected_notes - a list of httplint.note.Note classes that are expected to be set with add_note when parsing the inputs

Example:

from httplint.field.tests import FieldTest
from httplint.types import ResponseLinterProtocol

class YourFieldTest(FieldTest[ResponseLinterProtocol]):
    ...

You can create any number of tests this way; they will be run automatically when tests/test_fields.py is run.

Intellectual Property

By contributing code, bugs or enhancements to this project (whether that be through pull requests, the issues list, e-mail or other means), you are licensing your contribution under the project's terms.