
bunch of upgrades #1

Open

kjyv wants to merge 7 commits into sweepai:main from kjyv:fixes

Conversation


@kjyv kjyv commented Apr 27, 2026

  • IntelliJ > 261 support, handle a few exceptions
  • add a model chooser
  • allow using MLX models for next edit suggestions
  • load models using llama.cpp instead of the Python wrapper, for lower latency
  • allow chat to use local models instead of the Sweep API

Stefan Bethge and others added 7 commits April 14, 2026 16:34
Port the Python sweep-autocomplete prompt construction and completion
parsing logic to Kotlin, eliminating the Python server dependency.
The plugin now constructs NES prompts in-process and calls llama-server's
/v1/completions endpoint directly.

Key components:
- NesUtils, NesRetrieval, NesCompletionParser, NesPromptBuilder: Pure
  Kotlin port of the Python NES logic with 24 unit tests verified
  against Python-generated fixtures for parity
- LlamaServerClient: HTTP client with SSE streaming, early abort on
  oversized completions, and request cancellation via thread interrupt
- NextEditAutocompleteEngine: Top-level orchestrator with two-pass
  autocomplete (cursor-based + retrieval-based)
- NesModelConfig: Model selector supporting 0.5B, 1.5B, and 7B variants
- LocalAutocompleteServerManager: Launches llama-server with ngram-mod
  speculative decoding (--spec-type ngram-mod), auto-downloads model
  via hf CLI or curl fallback, auto-restarts on model change

Performance: ~125ms median latency (2.7x faster than Python path) with
llama-server + ngram speculative decoding on Apple Silicon.

Includes benchmark script (bin/benchmark_autocomplete.py) for comparing
Python vs native engine performance across multiple scenarios.
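The SSE streaming with early abort described above can be sketched roughly as follows. This is a minimal illustrative sketch, not the plugin's actual Kotlin LlamaServerClient; the field names match llama-server's OpenAI-style /v1/completions chunks, but the max-length cutoff value is an assumption:

```python
import json

def collect_sse_completion(lines, max_chars=512):
    """Accumulate text from llama-server SSE 'data:' lines,
    aborting early once the completion exceeds max_chars."""
    out = []
    total = 0
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # OpenAI-style end-of-stream sentinel
        chunk = json.loads(payload)
        # llama-server's /v1/completions streams OpenAI-style chunks
        text = chunk["choices"][0].get("text", "")
        total += len(text)
        if total > max_chars:
            return None  # early abort on oversized completion
        out.append(text)
    return "".join(out)
```

The early return is what keeps median latency low: an oversized completion is abandoned instead of being streamed to the end.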

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Was only used as a developer reference (257K files, 5.9GB of git data).
The same code is available on GitHub. Removing it reduces .git from
5.9GB to 12MB and git status time from 3s to 11ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major changes:
- OpenAIChatService: Fetches models from /v1/models, streams chat
  completions from /v1/chat/completions (works with LM Studio, Ollama, etc.)
- OpenAIAgentService: Multi-turn agent loop with 7 tools (read_file,
  search_files, glob, list_files, str_replace, create_file, bash)
  using OpenAI function calling protocol
- Stream.kt: Routes to OpenAI path when URL is not a Sweep backend,
  builds system prompt with current file context, cursor position,
  and project info
- ModelPickerMenu: Fetches models from /v1/models for non-Sweep backends,
  clears stale Sweep model cache
- WelcomeScreen: New local-first welcome page with setup instructions
- MarkdownBlock: Added // filepath: comment pattern detection for
  code blocks, enabling Apply button on model responses
- Post-processing: Extracts <think> blocks, generates CodeReplacement
  annotations for Apply button
- Chat works without authentication when native engine is enabled
- Stop button works for both chat and agent streaming
- Version bumped to 1.30.0
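The MarkdownBlock filepath detection and the <think> post-processing above can be sketched like this. These regexes are illustrative assumptions; the plugin's exact patterns may differ:

```python
import re

# Hypothetical patterns; the plugin's exact regexes may differ.
FILEPATH_RE = re.compile(r"^\s*//\s*filepath:\s*(\S+)", re.MULTILINE)
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def extract_filepath(code_block: str):
    """Return the path declared in a '// filepath:' comment, if any,
    so an Apply button can target the right file."""
    m = FILEPATH_RE.search(code_block)
    return m.group(1) if m else None

def strip_think_blocks(response: str):
    """Split a model response into (visible_text, think_blocks),
    hiding chain-of-thought from the rendered chat."""
    thoughts = THINK_RE.findall(response)
    visible = THINK_RE.sub("", response).strip()
    return visible, thoughts
```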

Also fixes:
- WriteIntentReadAction crash on IntelliJ 2025.1+
- HttpURLConnection used instead of HttpClient (fixes localhost timeout)
- Renamed "Sweep API URL" to "OpenAI Compatible API URL"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stream.kt
- Agent mode is now a single completion turn that hands tool calls to
  SweepAgent.ingestToolCalls + awaitToolCalls. The existing CONTINUE_AGENT
  path drives multi-turn naturally — no parallel loop, no concurrent
  Stream.start() race, no manual loop detection.
- New buildOpenAiAgentMessages() converts session messages to OpenAI's
  tool-calling shape (assistant.tool_calls + tool messages by id),
  preserving raw JSON arguments via ToolCall.rawText so numbers/booleans
  aren't string-coerced when echoed back.
- System prompt now appends project rules from SweepConfig
  (SWEEP.md / AGENTS.md / CLAUDE.md, hierarchical + scoped to context).
- stop() sets cancelledByUser before the streamingJob null-check so the
  OpenAI path (which has no streamingJob) can be cancelled.
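The message conversion that buildOpenAiAgentMessages performs can be sketched as follows. This is an illustrative Python analogue of the Kotlin code, with simplified turn dicts as an assumed input shape; the key point from the commit is that tool-call arguments stay as the raw JSON string rather than being re-serialized:

```python
def build_agent_messages(turns):
    """Convert simplified session turns into OpenAI tool-calling messages.
    Tool calls keep their raw JSON argument string so numbers and booleans
    are echoed back untouched (not string-coerced)."""
    messages = []
    for turn in turns:
        if turn["role"] == "assistant" and turn.get("tool_calls"):
            messages.append({
                "role": "assistant",
                "content": turn.get("content") or None,
                "tool_calls": [
                    {
                        "id": tc["id"],
                        "type": "function",
                        "function": {
                            "name": tc["name"],
                            # raw JSON string, not re-serialized
                            "arguments": tc["raw_args"],
                        },
                    }
                    for tc in turn["tool_calls"]
                ],
            })
            # each tool result is paired back to its call by id
            for tc in turn["tool_calls"]:
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc["id"],
                    "content": tc["result"],
                })
        else:
            messages.append({"role": turn["role"], "content": turn["content"]})
    return messages
```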

OpenAIAgentService
- Drop dead runAgentLoop / executeTool / DESTRUCTIVE_TOOLS — execution
  goes through the Sweep tool pipeline now.
- Drop streamWithToolCallsPublic wrapper and clearActiveConnection,
  drop unused project parameter.

LocalAutocompleteServerManager
- Replace java.net.http health check with HttpURLConnection (HttpClient
  has localhost timeout issues on macOS).
- Add terminalStartInProgress guard to avoid duplicate server starts
  when multiple project windows open simultaneously.
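The start-in-progress guard can be sketched as a small test-and-set, shown here as an illustrative Python analogue of the plugin's terminalStartInProgress flag (class and method names are hypothetical):

```python
import threading

class ServerStartGuard:
    """Let exactly one caller launch llama-server, even when several
    project windows request a start at the same time."""

    def __init__(self):
        self._lock = threading.Lock()
        self._starting = False

    def try_begin_start(self) -> bool:
        # Atomic test-and-set: True for the first caller, False for the rest.
        with self._lock:
            if self._starting:
                return False
            self._starting = True
            return True

    def finish_start(self):
        # Clear the flag once the server is up (or the start failed).
        with self._lock:
            self._starting = False
```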

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@KODerFunk

@kjyv Can you explain how to build and use this in the latest versions of JetBrains IDEs?
Is it possible to use one of their models locally, or some other model?
How do you use it?

adi-itgg pushed a commit to adi-itgg/jetbrains-sweep-ai that referenced this pull request May 8, 2026
