Releases: ail-project/lacus

v1.20.0

17 Nov 16:49
v1.20.0
6614675

This is an interesting one; more details below.

The tl;dr is that the response has a new key, frames, which contains the rendered content of the iframes loaded by the page, as they are at the end of the capture. This key has the following format:

from __future__ import annotations

from typing import TypedDict


class FramesResponse(TypedDict, total=False):
    name: str
    url: str
    content: str | None
    children: list[FramesResponse] | None

Where name and url can be empty strings, content can be missing or empty (this may be a bug in Playwright, TBD), and children may or may not be present.
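A minimal sketch of how a consumer might walk that structure, assuming the frames value is a single root FramesResponse for the main frame. The function name walk_frames and the example data are hypothetical; Optional[...] is used in place of `| None` only so the snippet runs on older Python versions.

```python
from collections.abc import Iterator
from typing import Optional, TypedDict


class FramesResponse(TypedDict, total=False):
    name: str
    url: str
    content: Optional[str]
    children: Optional[list["FramesResponse"]]


def walk_frames(frame: FramesResponse, depth: int = 0) -> Iterator[tuple[int, FramesResponse]]:
    """Yield (depth, frame) for a frame and all its descendants, parents first."""
    yield depth, frame
    # "children" may be absent or None, so fall back to an empty list.
    for child in frame.get("children") or []:
        yield from walk_frames(child, depth + 1)


# Hypothetical example data, not taken from a real capture.
example: FramesResponse = {
    "name": "",
    "url": "https://example.com/",
    "content": "<html>main</html>",
    "children": [
        {
            "name": "ad",
            "url": "about:blank",
            "children": [
                {"name": "", "url": "https://ads.example.net/frame", "content": "<html>ad</html>"},
            ],
        },
        {"name": "widget", "url": "about:srcdoc", "content": None},
    ],
}

for depth, frame in walk_frames(example):
    # Every key is optional, so always read with .get().
    has_content = bool(frame.get("content"))
    print("  " * depth + f"{frame.get('url', '')!r} content={has_content}")
```

Reading every key with .get() keeps the traversal safe when content or children are missing.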

When it exists, the url can also be about:blank, about:srcdoc, a base64-encoded stream, or a blob (more details in the MDN documentation). In those cases, the frame itself doesn't load 3rd party content, but its children may contain other iframes that do.


Now, the glorious details.

Until Lacus v1.19.0, the rendered content (html key in the response) covered the main frame only, and none of the other frames inside that page. We missed the rendered content of every iframe loaded by the main frame.

There are two ways to load iFrame content:

  • the content is local (as described above), in which case the content before rendering is in the parent frame
  • the content is loaded from a 3rd party (the url is really a URL), in which case the content before rendering is in the HAR file
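As a rough illustration of that distinction, the sketch below classifies a frame url as local or remote from its scheme. It assumes the "base64 encoded stream" case is a data: URL and treats an empty url as local; the function name and the scheme list are hypothetical, not something Lacus exposes.

```python
from urllib.parse import urlsplit

# Schemes assumed to carry content that is local to the parent document.
LOCAL_SCHEMES = {"about", "data", "blob"}


def is_local_frame_url(url: str) -> bool:
    """Return True if the frame url does not point to a remote resource."""
    if not url:
        # Assumption: a frame without a url cannot have fetched remote content itself.
        return True
    return urlsplit(url).scheme.lower() in LOCAL_SCHEMES


for candidate in ("about:blank", "about:srcdoc", "data:text/html;base64,PGI+", "https://example.com/"):
    kind = "local" if is_local_frame_url(candidate) else "remote"
    print(f"{candidate} -> {kind}")
```

A frame classified as local can still have remote children, so the check has to be applied to each frame of the tree separately.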

Now, if all goes well, we get the rendered content of all the iframes at the same time as that of the main page, and we can use it to build a better tree in Lookyloo.

Note that the only other way to properly get the rendered contents for the iframes (AFAIK) is to open the dev tools in your browser and inspect the document, as they will aggregate the content of the iframes in that view. Simply looking at the source of a page in the browser only gets you the content of the response, nothing rendered.

Full Changelog: v1.19.0...v1.20.0

v1.19.0 - Hack.lu 2025

17 Oct 18:46
v1.19.0
3a2a99e

This is pretty much just a maintenance release with a lot of dependency updates as we're not supporting Python 3.9 anymore.

It still comes with 2 bugfixes:

  • more reliable DNS resolution over SOCKS5 proxies
  • more flexible cookies import into a capture to please the browser add-on

Full Changelog: v1.18.0...v1.19.0

v1.18.0 - Trusted Timestamps & I2P

19 Sep 11:48
v1.18.0
7b7dec7

Important: This is the last release supporting Python 3.9.

New features

Trusted Timestamp

Support for Trusted Timestamps (RFC 3161). This makes it possible to validate the results of the captures with a trusted Time Stamping Authority (TSA). The default TSA is provided by the German National Research and Education Network (DFN), thank you very much for offering this service. You can change it to any other RFC 3161-compatible TSA: Trail of Bits has a list of known-working (and not working) servers, feel free to pick another one from there.

When enabled, the SHA512s for the HAR, the rendered HTML, the page screenshot, the URL in the address bar and any downloaded content are sent to the timestamping service. The result of the capture contains an archive with the verification files.
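To make the hashing step concrete, here is a small sketch that computes SHA-512 digests for a set of capture elements with the standard hashlib module. The element names and placeholder bytes are hypothetical, and the sketch stops before the RFC 3161 request itself, which Lacus handles.

```python
import hashlib


def sha512_digests(elements: dict[str, bytes]) -> dict[str, str]:
    """Return the hex-encoded SHA-512 digest of each element."""
    return {name: hashlib.sha512(data).hexdigest() for name, data in elements.items()}


# Hypothetical element names and contents, for illustration only.
capture_elements = {
    "har": b'{"log": {}}',
    "html": b"<html></html>",
    "png": b"placeholder screenshot bytes",
    "last_redirected_url": "https://example.com/".encode(),
}

digests = sha512_digests(capture_elements)
for name, digest in digests.items():
    print(name, digest)
```

The same digests can be recomputed later from the downloaded archive and compared with the values covered by the timestamp.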

From Lookyloo (v1.33+), you can then download (Actions -> Get forensic acquisition) an archive that contains all these elements along with the verification files and validate it using openssl (or the tool of your choice) - there is also a pre-populated validation script in the archive.

Modal on Lookyloo to view and download a forensic acquisition

Invisible Internet Project (I2P)

As with Tor (.onion) domains, it is now possible to capture .i2p domains. For that, you'll need to install an I2P client. The default Lacus config points to the default port for I2P.
If you enable it, give it some time to initialize: a capture triggered immediately after startup might fail.

i2p domain captured on Lookyloo

Other changes

  • Use Playwright v1.55.0
  • Use orjson instead of the default json library (more efficient for big json files, especially the HAR dumps)
  • Maintenance and bugfixes.

Full Changelog: v1.17.0...v1.18.0

v1.17.0

25 Aug 11:08
v1.17.0
420288b

Mostly a maintenance and bugfix release from Lacus's point of view, but it includes many changes in the underlying libraries:

  • New Playwright version (updated browsers)
  • New version of playwright-stealth (browser not as easily flagged as a bot)
  • Support for an init script that is run before any other script once the page is loaded

What's Changed

  • Fix ports and container name in compose file by @litobro in #52
  • Optimized size of dockerfile by @litobro in #53

Full Changelog: v1.16.0...v1.17.0

v1.16.0

12 Jun 14:15
v1.16.0
425edd5

This is mostly a maintenance release, with bugfixes in the underlying dependencies.

The main news is a working Dockerfile by @litobro, thank you!

Full Changelog: v1.15.0...v1.16.0

v1.15.0 - GeekWeek X

22 May 10:24
v1.15.0
2f5476f

New Feature

This release adds support for WireGuard VPN config files using wireproxy.

In order to use it, you need to install wireproxy, and set the path to the wireproxy executable file in $LACUS_HOME/config/generic.json, key wireproxy_path.

Lacus accepts any valid wireguard configuration file in $LACUS_HOME/config/<Name>.conf. The file is automatically configured for wireproxy, and an instance is launched. Wireproxy exposes a health endpoint, which Lacus checks regularly. If a proxy stops responding for too long, it is automatically disabled.
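The "disabled after being unresponsive for too long" behaviour can be sketched as a small state tracker. This is not Lacus's implementation: the class name, the 60 second threshold, and the choice to keep a proxy disabled once it has been disabled are all assumptions, and the result of the health check is passed in rather than fetched from wireproxy.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ProxyHealth:
    """Track health check results and disable a proxy after a long outage."""

    max_downtime: float = 60.0  # hypothetical threshold, in seconds
    last_ok: float = field(default_factory=time.monotonic)
    enabled: bool = True

    def record(self, healthy: bool, now: Optional[float] = None) -> bool:
        """Record one health check result and return whether the proxy is enabled."""
        now = time.monotonic() if now is None else now
        if healthy:
            self.last_ok = now
        elif now - self.last_ok > self.max_downtime:
            # Unresponsive for longer than the threshold: disable the proxy.
            # In this sketch, a later healthy check does not re-enable it.
            self.enabled = False
        return self.enabled
```

Passing now explicitly makes the logic easy to exercise without waiting; in a real loop the default monotonic clock would be used.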

Once Lacus is running, you can add or remove WireGuard proxy configuration files in $LACUS_HOME/config: each change is detected, and the corresponding proxy is configured and launched (or stopped). Note that the file name must be <Name>.conf.
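One simple way to detect such changes is to compare the <Name>.conf files currently in the directory with the set of names already known. The sketch below only illustrates that idea; the function name is hypothetical and Lacus may detect changes differently.

```python
from pathlib import Path


def diff_configs(config_dir: Path, known: set[str]) -> tuple[set[str], set[str]]:
    """Return (added, removed) proxy names, based on the <Name>.conf files in config_dir."""
    # Only *.conf files count, so proxies.json and generic.json are ignored.
    current = {path.stem for path in config_dir.glob("*.conf")}
    return current - known, known - current
```

Calling diff_configs periodically with the names of the running proxies yields the configurations to launch and the ones to stop.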

Important Note: Any manual change in the wireproxy config files ($LACUS_HOME/config/<Name>.conf) once they have been configured is not allowed and will be reverted. The only config file you can edit is $LACUS_HOME/config/proxies.json, where you can add a description and anything you want in the meta key.

The example below shows a config for Tor (with the default socks5 interface) and a proxy in the Netherlands (with wireproxy). Changing the proxy_url for a wireproxy config is not recommended, but it will be reflected in the wireproxy config file.

{
    "Tor": {
        "description": "Trigger the capture via the tor network.",
        "meta": {
            "provider": "Tor Project"
        },
        "proxy_url": "socks5://127.0.0.1:9050"
    },
    "Netherlands": {
        "description": "Proxy for Netherlands",
        "dns_resolver": "10.2.0.1",
        "meta": {
            "provider": "My wireguard config provider"
        },
        "proxy_url": "socks5://127.0.0.1:25310"
    }
}

API

The proxies are exposed in the Lacus API (https://<Lacushost>/proxies):

{
  "Netherlands": {
    "description": "Proxy for Netherlands",
    "meta": {
      "provider": "wireguard"
    }
  },
  "Tor": {
    "description": "Trigger the capture via the tor network.",
    "meta": {
      "provider": "Tor Project"
    }
  }
}

As you can see, the proxy_url is not in the response: when a proxy name (Netherlands, Tor) is passed in the proxy key while triggering a capture, Lacus automatically replaces it with the relevant URL.
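The two behaviours described here (exposing only description and meta, and replacing a proxy name with its URL) can be sketched as follows. The function names are hypothetical and the code only mirrors the described behaviour, it is not taken from Lacus.

```python
import json
from typing import Any

# Same structure as config/proxies.json in the example above.
PROXIES_JSON = """
{
    "Tor": {
        "description": "Trigger the capture via the tor network.",
        "meta": {"provider": "Tor Project"},
        "proxy_url": "socks5://127.0.0.1:9050"
    },
    "Netherlands": {
        "description": "Proxy for Netherlands",
        "dns_resolver": "10.2.0.1",
        "meta": {"provider": "My wireguard config provider"},
        "proxy_url": "socks5://127.0.0.1:25310"
    }
}
"""

PUBLIC_KEYS = ("description", "meta")


def public_view(proxies: dict[str, dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Keep only the keys that are safe to expose (no proxy_url, no dns_resolver)."""
    return {
        name: {key: cfg[key] for key in PUBLIC_KEYS if key in cfg}
        for name, cfg in proxies.items()
    }


def resolve_proxy(proxy: str, proxies: dict[str, dict[str, Any]]) -> str:
    """Replace a known proxy name with its URL; leave any other value untouched."""
    entry = proxies.get(proxy)
    return entry["proxy_url"] if entry else proxy


proxies = json.loads(PROXIES_JSON)
print(json.dumps(public_view(proxies), indent=2))
print(resolve_proxy("Netherlands", proxies))
```

Keeping the URL server-side means a client only needs to know the proxy name, not where the proxy actually listens.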

Full Changelog: v1.14.0...v1.15.0

v1.14.0

21 Apr 18:25
v1.14.0
25c5f43

New Feature

  • Pre-configure proxies for the lookyloo instance (config example for: config/proxies.json)
{
    "Tor": {
        "proxy_url": "socks5://127.0.0.1:9050",
        "description": "Trigger the capture via the tor network.",
        "meta": {
            "provider": "Tor Project"
        }
    },
    "US": {
        "proxy_url": "socks5://127.0.0.1:32321",
        "dns_resolver": "1.1.1.1",
        "description": "Trigger the capture via a Socks5 proxy in the US.",
        "meta": {
            "provider": "My own stash of Proxies"
        }
    }
}
  • Expose the proxies in the API (description and meta keys only)

Full Changelog: v1.13.1...v1.14.0

v1.13.1 - Hackathon release

09 Apr 13:51
v1.13.1
c42ab83

  • Add support for storage state in the query, and in the response

What's Changed

  • feat: make screenshot capturing optional by @jgarl in #45

New Contributors

  • @jgarl made their first contribution in #45

Full Changelog: v1.13.0...v1.13.1

v1.13.0

18 Feb 15:15
v1.13.0
d96b175

Most importantly, lacus has a logo now

Lacus Logo

New features

  • Support for captures with a headed browser (setting allow_headed). Only possible if lacus is running in a graphical environment. (see below for details)
  • Expose the results expiration timeout (setting expire_results to a lower value reduces the memory use if you have many captures)

Full Changelog: v1.12.0...v1.13.0

Notes for headed captures

The classical use of Lacus is to have it running on a server with no graphical interface (no X/Wayland server). The capture with Playwright uses a headless browser, runs some interactions on the page (see PlaywrightCapture for details), and finishes after a certain amount of time and/or no traffic. This method is good enough most of the time, but all the interactions on the page are predefined and cannot be modified by the user triggering the capture.

In order to use the headed option, you need the following:

  • The configuration setting "allow_headed" set to true in config/generic.json
  • Lacus installed on a machine with a graphical interface
  • headless set to False in the capture settings
  • Optionally, general_timeout_in_sec set to the amount of time you want to interact with the page (it is set to 90 by default)

The headed capture mode opens a full browser configured with the settings passed to the capture, but it won't run the predefined interactions. Instead, it lets the user interact with the page for a set amount of time (general_timeout_in_sec), stops the capture, and stores the result as usual. It is mostly helpful to manually bypass captchas and other techniques used by websites to detect bots.
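A minimal sketch of capture settings for a headed capture, using the headless and general_timeout_in_sec keys named above. The helper function, its parameters, and the url key are assumptions made for illustration; how the settings are submitted to Lacus is left out.

```python
from typing import Any


def build_capture_settings(
    url: str,
    *,
    allow_headed: bool,
    headed: bool = False,
    interaction_time: int = 90,
) -> dict[str, Any]:
    """Build capture settings, refusing a headed capture the instance does not allow."""
    if headed and not allow_headed:
        raise ValueError("headed captures require allow_headed in config/generic.json")
    settings: dict[str, Any] = {"url": url, "headless": not headed}
    if headed:
        # Time the user gets to interact with the page before the capture stops.
        settings["general_timeout_in_sec"] = interaction_time
    return settings


print(build_capture_settings("https://example.com/", allow_headed=True, headed=True, interaction_time=120))
```

Checking allow_headed on the client side is only a convenience here; the instance configuration remains the authority.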

v1.12.0

06 Nov 10:47
v1.12.0

This release requires some system upgrades:

  • Valkey 8.0+
  • Python 3.9+

Lacus changes

Full Changelog: v1.11.0...v1.12.0

  • Optionally disable JavaScript during a capture
  • Optionally configure the number of retries in case a capture fails (configurable globally, and for each capture)

LacusCore changes

Full Changelog: ail-project/LacusCore@v1.11.0...v1.12.0

PlaywrightCapture changes

Full Changelog: Lookyloo/PlaywrightCapture@v1.26.0...v1.27.0