bunch of upgrades #1
Open
kjyv wants to merge 7 commits into
Conversation
kjyv commented on Apr 27, 2026
- IntelliJ > 261 support, handle a few exceptions
- add model chooser
- allow using MLX models for next edit suggestion
- load models using llama.cpp instead of the Python wrapper, lowering latency
- allow chat to use local models instead of the Sweep API
Port the Python sweep-autocomplete prompt construction and completion parsing logic to Kotlin, eliminating the Python server dependency. The plugin now constructs NES prompts in-process and calls llama-server's /v1/completions endpoint directly.

Key components:
- NesUtils, NesRetrieval, NesCompletionParser, NesPromptBuilder: pure Kotlin port of the Python NES logic, with 24 unit tests verified against Python-generated fixtures for parity
- LlamaServerClient: HTTP client with SSE streaming, early abort on oversized completions, and request cancellation via thread interrupt
- NextEditAutocompleteEngine: top-level orchestrator with two-pass autocomplete (cursor-based + retrieval-based)
- NesModelConfig: model selector supporting 0.5B, 1.5B, and 7B variants
- LocalAutocompleteServerManager: launches llama-server with ngram-mod speculative decoding (--spec-type ngram-mod), auto-downloads the model via the hf CLI or a curl fallback, auto-restarts on model change

Performance: ~125ms median latency (2.7x faster than the Python path) with llama-server + ngram speculative decoding on Apple Silicon. Includes a benchmark script (bin/benchmark_autocomplete.py) for comparing Python vs. native engine performance across multiple scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
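For context, a minimal sketch of what a client in LlamaServerClient's role might look like: streaming a completion from llama-server's /v1/completions endpoint over SSE. The port, request fields, and helper names are illustrative assumptions, not the plugin's actual code.

```kotlin
import java.io.BufferedReader
import java.io.InputStreamReader
import java.net.HttpURLConnection
import java.net.URL

// Hypothetical sketch: stream tokens from llama-server's OpenAI-compatible
// /v1/completions endpoint. With "stream": true the server replies with
// Server-Sent Events ("data: {...}" lines, terminated by "data: [DONE]").
fun streamCompletion(prompt: String, baseUrl: String = "http://127.0.0.1:8080", onChunk: (String) -> Unit) {
    val conn = URL("$baseUrl/v1/completions").openConnection() as HttpURLConnection
    conn.requestMethod = "POST"
    conn.doOutput = true
    conn.setRequestProperty("Content-Type", "application/json")
    val body = """{"prompt": ${jsonQuote(prompt)}, "max_tokens": 128, "stream": true}"""
    conn.outputStream.use { it.write(body.toByteArray()) }

    BufferedReader(InputStreamReader(conn.inputStream)).use { reader ->
        for (line in reader.lineSequence()) {
            if (!line.startsWith("data: ")) continue
            val payload = line.removePrefix("data: ").trim()
            if (payload == "[DONE]") break   // end-of-stream sentinel
            onChunk(payload)                 // a real client would JSON-parse choices[0].text here
        }
    }
    conn.disconnect()
}

// Naive JSON string escaping, good enough for this sketch only.
private fun jsonQuote(s: String): String =
    "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n") + "\""
```

The early abort and cancellation behavior described above would layer on top of this read loop, for example by checking for thread interruption between SSE lines and closing the connection.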
Was only used as a developer reference (257K files, 5.9GB of git data). The same code is available on GitHub. Removing it reduces .git from 5.9GB to 12MB and git status from 3s to 11ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major changes:
- OpenAIChatService: fetches models from /v1/models, streams chat completions from /v1/chat/completions (works with LM Studio, Ollama, etc.)
- OpenAIAgentService: multi-turn agent loop with 7 tools (read_file, search_files, glob, list_files, str_replace, create_file, bash) using the OpenAI function calling protocol
- Stream.kt: routes to the OpenAI path when the URL is not a Sweep backend; builds the system prompt with current file context, cursor position, and project info
- ModelPickerMenu: fetches models from /v1/models for non-Sweep backends, clears stale Sweep model cache
- WelcomeScreen: new local-first welcome page with setup instructions
- MarkdownBlock: added "// filepath:" comment pattern detection for code blocks, enabling the Apply button on model responses
- Post-processing: extracts <think> blocks, generates CodeReplacement annotations for the Apply button
- Chat works without authentication when the native engine is enabled
- Stop button works for both chat and agent streaming
- Version bumped to 1.30.0

Also fixes:
- WriteIntentReadAction crash on IntelliJ 2025.1+
- HttpURLConnection used instead of HttpClient (fixes localhost timeout)
- Renamed "Sweep API URL" to "OpenAI Compatible API URL"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
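As a rough illustration of the /v1/models discovery used by OpenAIChatService and ModelPickerMenu, listing model ids from an OpenAI-compatible backend can look like the sketch below. The function name and the regex-based parsing are assumptions kept short for readability; a real implementation would use a JSON library.

```kotlin
import java.net.HttpURLConnection
import java.net.URL

// Hypothetical sketch: list model ids from an OpenAI-compatible backend
// (LM Studio, Ollama, and llama-server all serve GET /v1/models).
fun fetchModelIds(baseUrl: String): List<String> {
    val conn = URL("$baseUrl/v1/models").openConnection() as HttpURLConnection
    conn.requestMethod = "GET"
    val body = conn.inputStream.bufferedReader().use { it.readText() }
    conn.disconnect()
    // Response shape: {"object":"list","data":[{"id":"some-model", ...}, ...]}
    return Regex("\"id\"\\s*:\\s*\"([^\"]+)\"")
        .findAll(body)
        .map { it.groupValues[1] }
        .toList()
}
```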
Stream.kt
- Agent mode is now a single completion turn that hands tool calls to SweepAgent.ingestToolCalls + awaitToolCalls. The existing CONTINUE_AGENT path drives multi-turn naturally — no parallel loop, no concurrent Stream.start() race, no manual loop detection.
- New buildOpenAiAgentMessages() converts session messages to OpenAI's tool-calling shape (assistant.tool_calls + tool messages by id), preserving raw JSON arguments via ToolCall.rawText so numbers/booleans aren't string-coerced when echoed back.
- System prompt now appends project rules from SweepConfig (SWEEP.md / AGENTS.md / CLAUDE.md, hierarchical + scoped to context).
- stop() sets cancelledByUser before the streamingJob null-check so the OpenAI path (which has no streamingJob) can be cancelled.

OpenAIAgentService
- Drop dead runAgentLoop / executeTool / DESTRUCTIVE_TOOLS — execution goes through the Sweep tool pipeline now.
- Drop streamWithToolCallsPublic wrapper and clearActiveConnection, drop unused project parameter.

LocalAutocompleteServerManager
- Replace java.net.http health check with HttpURLConnection (HttpClient has localhost timeout issues on macOS).
- Add terminalStartInProgress guard to avoid duplicate server starts when multiple project windows open simultaneously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
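A small sketch of the HttpURLConnection-based health check mentioned above, since the switch away from java.net.http is easy to miss. The port, endpoint path, and timeout values are assumptions (llama-server exposes a /health route), not necessarily the exact values used by LocalAutocompleteServerManager.

```kotlin
import java.net.HttpURLConnection
import java.net.URL

// Hypothetical sketch: probe a locally running llama-server with HttpURLConnection.
// Short timeouts keep the check cheap when nothing is listening yet.
fun isServerHealthy(baseUrl: String = "http://127.0.0.1:8080"): Boolean =
    try {
        val conn = URL("$baseUrl/health").openConnection() as HttpURLConnection
        conn.connectTimeout = 500
        conn.readTimeout = 500
        conn.requestMethod = "GET"
        val healthy = conn.responseCode == 200
        conn.disconnect()
        healthy
    } catch (e: Exception) {
        false
    }
```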
@kjyv Can you explain how to build and use it in the latest versions of JB IDEs?
adi-itgg pushed a commit to adi-itgg/jetbrains-sweep-ai that referenced this pull request on May 8, 2026
bunch of upgrades