meddler

Hi, I'm anontruder
LA · 62 posts

Confucius Code Agent: Scalable Agent Scaffolding for Real-World Codebases

Real-world software engineering tasks require coding agents that can operate on massive repositories, sustain long-horizon sessions, and reliably coordinate complex toolchains at test time. Existing research-grade coding...

Ethan Shaw
11 Dec 2025 · 1 min read

Agint: Agentic Graph Compilation for Software Engineering Agents

LLM-based coding agents are increasingly common but still face challenges in context management, latency, reliability, reproducibility, and scalability. We present Agint, an agentic graph compiler, interpreter, and runti...

Maya Collins
24 Nov 2025 · 1 min read

Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?

Large Language Models (LLMs) are reshaping almost all industries, including software engineering. In recent years, a number of LLM agents have been proposed to solve real-world software problems. Such software agents are...

Noah Bennett
17 Nov 2025 · 1 min read

The OpenHands Software Agent SDK: A Composable and Extensible Foundation for Production Agents

Agents are now used widely in the process of software development, but building production-ready software engineering agents is a complex task. Deploying software agents effectively requires flexibility in implementation...

Liam Carter
5 Nov 2025 · 1 min read

A Comprehensive Empirical Evaluation of Agent Frameworks on Code-centric Software Engineering Tasks

Unlike traditional automation tools or static LLM-based systems, agents combine decision-making and tool utilization to accomplish complex tasks, showing great potential in software engineering. However, existing studies...

Ava Brooks
2 Nov 2025 · 1 min read

TOM-SWE: User Mental Modeling For Software Engineering Agents

Recent advances in coding agents have made them capable of planning, editing, running, and testing complex code bases. Despite their growing ability in coding tasks, these systems still struggle to infer and track user i...

Owen Blake
24 Oct 2025 · 1 min read

TEST

TEtt

anontruder
4 Oct 2025 · 1 min read

RedCodeAgent: Automatic Red-teaming Agent against Diverse Code Agents

Code agents have gained widespread adoption due to their strong code generation capabilities and integration with code interpreters, enabling dynamic execution, debugging, and interactive programming capabilities. While...

Nina Reed
2 Oct 2025 · 1 min read

TUMIX: Multi-Agent Test-Time Scaling with Tool-Use Mixture

While integrating tools like Code Interpreter and Search has significantly enhanced Large Language Model (LLM) reasoning in models like ChatGPT Agent and Gemini-Pro, practical guidance on optimal tool use is lacking. The...

Leo Parker
30 Sep 2025 · 1 min read

PerfBench: Can Agents Resolve Real-World Performance Bugs?

Performance bugs are inefficiencies in software that waste computational resources without causing functional failures, making them particularly challenging to detect and fix. While recent advances in Software Engineerin...

Aria Patel
28 Sep 2025 · 1 min read

Context Engineering for Multi-Agent LLM Code Assistants Using Elicit, NotebookLM, ChatGPT, and Claude Code

Large Language Models (LLMs) have shown promise in automating code generation and software engineering tasks, yet they often struggle with complex, multi-file projects due to context limitations and knowledge gaps. We pr...

Zoe Walker
9 Aug 2025 · 1 min read

SetupBench: Assessing Software Engineering Agents' Ability to Bootstrap Development Environments

Modern Large Language Model (LLM) agents promise end-to-end assistance with real-world software tasks, yet existing benchmarks evaluate LLM agents almost exclusively in pre-baked environments where every dependency is pr...

Ethan Shaw
11 Jul 2025 · 1 min read

Unified Software Engineering Agent as AI Software Engineer

The growth of Large Language Model (LLM) technology has raised expectations for automated coding. However, software engineering is more than coding and is concerned with activities including maintenance and evolution of...

Maya Collins
17 Jun 2025 · 1 min read

SWE-Dev: Building Software Engineering Agents with Training and Inference Scaling

Large language models (LLMs) have advanced rapidly from conversational problem solving to addressing real-world tasks involving tool use, such as software engineering (SWE). Recent LLM-powered toolkits, such as OpenAI Co...

Noah Bennett
9 Jun 2025 · 1 min read

From Knowledge to Noise: CTIM-Rover and the Pitfalls of Episodic Memory in Software Engineering Agents

We introduce CTIM-Rover, an AI agent for Software Engineering (SE) built on top of AutoCodeRover (Zhang et al., 2024) that extends agentic reasoning frameworks with an episodic memory, more specifically, a general and re...

Liam Carter
29 May 2025 · 1 min read

SWE-PolyBench: A multi-language benchmark for repository level evaluation of coding agents

Coding agents powered by large language models have shown impressive capabilities in software engineering tasks, but evaluating their performance across diverse programming languages and real-world scenarios remains chal...

Ava Brooks
11 Apr 2025 · 1 min read

New tools for building agents

Covers modern agent building blocks: Responses API, tool use, and SDK-level orchestration primitives.

Owen Blake
11 Mar 2025 · 1 min read

LLM Agents Making Agent Tools

Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks by dynamically utilising external software components. However, these tools must be implemented in advance b...

Nina Reed
17 Feb 2025 · 1 min read
© 2026 meddler. All rights reserved.