maria [she/her]@lemmy.blahaj.zone

wow! agents sure can do stuff.

Agent Basics

so like - claude code is getting a lot of attention lately. its a coding agent from anthropic that runs on their models.

lemmy just define what an agent is real quick:

agents dont just yap but they actually cook, as in: they can do stuff.
this works by having a secret back and forth with the computer. its essentially doing function calls, but we call them “tool calls” because… why not.

heres an example of what this might look like:

(every “quote” is the “user” role and every bullet point is the assistant response)

(here are some hidden instructions on how to make tool calls) hey, plz put the stock prices of fairphone over last 2 years in some txt plzplz <3 also gimme a kiss when ur done >////<

Sure! First of all, imma look up the stock prices.
<tool>get_stock_prices(company=“fairphone”, start=“01-01-2024”, end=“01-01-2026”, increment=“month”)</tool>

<tool_response>
blablabla some stock prices here…
</tool_response>

Ah, okay, I now have the stock prices, now imma make that txt
<tool>create_file(path=“/fairphone_stock_prices_2025.txt”, content=“the actual file content here”)</tool>

<tool_response>file /fairphone_stock_prices_2025.txt has been created successfully. You can send the file to the user using the send_file(path: str, subtitle: str) command</tool_response>

Now I will send you the file
<tool>send_file(path=“/fairphone_stock_prices_2025.txt”, subtitle=“heres ur file”)</tool>

<tool_response>“/fairphone_stock_prices_2025.txt” has been prepared to send and it will be attached to the last message you send</tool_response>

Hey user, i made that txt u asked for! The file is attached below for download. Oh yea, also kisses u <3
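
if u wanna see that loop as code, heres a minimal python sketch. big caveat: theres no real model in here. fake_model just replays a scripted list of replies, and the two tools are dummies i made up for this post, so it only shows the back and forth, not any real api.

```python
import re

# hypothetical dummy tools, standing in for real ones
def get_stock_prices(company, start, end, increment):
    return f"prices of {company} from {start} to {end} per {increment}"

def create_file(path, content):
    return f"file {path} has been created successfully."

TOOLS = {"get_stock_prices": get_stock_prices, "create_file": create_file}

# matches <tool>name(arg="value", ...)</tool>
TOOL_CALL = re.compile(r"<tool>(\w+)\((.*)\)</tool>", re.DOTALL)
ARGUMENT = re.compile(r'(\w+)="([^"]*)"')

def run_tool_call(message):
    """Run the tool call found in the message. Returns None if there is none."""
    match = TOOL_CALL.search(message)
    if match is None:
        return None
    name, raw_args = match.groups()
    kwargs = dict(ARGUMENT.findall(raw_args))
    return TOOLS[name](**kwargs)

def agent_loop(model, user_message, max_steps=10):
    """Keep asking the model until it answers without a tool call."""
    history = [("user", user_message)]
    for _ in range(max_steps):
        reply = model(history)
        history.append(("assistant", reply))
        result = run_tool_call(reply)
        if result is None:
            break  # no tool call means the agent is done
        # the tool result goes back in as a hidden "user" message
        history.append(("user", f"<tool_response>{result}</tool_response>"))
    return history

# scripted replies, so the sketch runs without any model
SCRIPT = [
    'imma look up the stock prices. <tool>get_stock_prices(company="fairphone", start="01-01-2024", end="01-01-2026", increment="month")</tool>',
    'now imma make that txt <tool>create_file(path="/prices.txt", content="the actual file content here")</tool>',
    "done! kisses u <3",
]

def fake_model(history):
    assistant_turns = sum(1 for role, _ in history if role == "assistant")
    return SCRIPT[assistant_turns]

history = agent_loop(fake_model, "plz put the stock prices of fairphone in some txt")
for role, text in history:
    print(f"{role}: {text}")
```

the only real trick is the loop: as long as the reply contains a tool call, run it and feed the result back. once a reply has no tool call, thats the final answer for the user.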

Agents scaling Problem

this agentic loop works pretty well. but it doesn’t scale well to hundreds - or even thousands of tools.

agents must be fed with tool-definitions, which have:

name
description
arguments (usually with descriptions and types)
example tool uses (optional)

for each individual tool.

now, all this stuff stuffs the LMs context, resulting in

higher costs (because of more input tokens)
a confused model. could you remember hundreds of functions all at once?.. i couldn’t.
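
to get a feel for how fast that adds up, heres a rough python sketch. everything in it is an assumption: the tool definitions are generated dummies, and the "4 characters per token" thing is only a crude rule of thumb, not a real tokenizer. so dont trust the absolute numbers, only the trend.

```python
def render_tool(name, description, arguments):
    """Render one tool definition the way it might appear in a prompt."""
    args = ", ".join(f"{arg}: {arg_type}" for arg, arg_type in arguments.items())
    return f"{name}({args}) # {description}"

def estimate_tokens(text, chars_per_token=4):
    # crude rule of thumb, NOT a real tokenizer
    return len(text) // chars_per_token

def prompt_for(tool_count):
    """Build a prompt chunk containing tool_count dummy tool definitions."""
    lines = [
        render_tool(
            name=f"tool_{i}",
            description="does one specific thing, described in a sentence or two",
            arguments={"company": "str", "increment": "str"},
        )
        for i in range(tool_count)
    ]
    return "\n".join(lines)

for count in (5, 50, 500):
    print(count, "tools ->", estimate_tokens(prompt_for(count)), "tokens (roughly)")
```

the cost grows linearly with the number of tools, and you pay it on every single request, even when the user just says hi.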

aaaaa How do we solve this?

introducing!!!

docs!!! but for language models.

yes! that’s what anthropic is trying to propose with SKILL.md. here’s what that looks like:

u just give the agent a filesystem with some docs and some useful files.
a “skill” is just a directory with a SKILL.md file in it. that file must include a yaml header with name and description
every single SKILL.md file header in the filesystem is automatically loaded into the agents context along with the files location and an instruction “Read any skill file if it appears useful” or something similar

your filesystem may look like this

/
- skills/
  - stock_prices/
    - SKILL.md
    - stock_tools.py
    - report_format.md
  - stack_overflow/
    - SKILL.md
    - stack_overflow_tools.py
  - creative_writing/
    - SKILL.md
    - themes.md
    - preferred_format.md

the /skills/stock_prices/SKILL.md may look like this:

---
name: stock_prices
description: Useful for detailed stock analysis
---
## Getting stock Prices
Use ./stock_tools.py to run code which gets you the stock prices of registered companies:

` ` `python
import sys

# make stock_tools.py importable ("from" is a reserved word in python, so the arguments are start/end)
sys.path.append("/skills/stock_prices")
import stock_tools as stonks

stocks_over_time = stonks.get_prices(company="nvidia", start="01-01-2001", end="01-01-2020", increment="1 month")
stocks_today = stonks.get_price(company="nvidia")

print(f"Stock prices from 2001 til 2020:\n{stocks_over_time}\n\ntodays price: {stocks_today}")
` ` `
no need to read the file itself.

## Creating reports
Read ./report_format.md for the users preferred structure of a stock prices report

[...]

so… we simply decentralize the knowledge and the tools, by just putting them into files, so that things can be read once they become useful.
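
heres a minimal python sketch of the "load only the headers" part. its my guess at how a harness could do it, not how anthropic actually does it. read_header and skill_index are names i made up, and the header parser only handles flat "key: value" lines (real yaml would need a real yaml parser).

```python
import tempfile
from pathlib import Path

def read_header(skill_file):
    """Parse the simple 'key: value' header between the two '---' lines."""
    lines = skill_file.read_text().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    header = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of header, everything after it stays unread
        key, _, value = line.partition(":")
        header[key.strip()] = value.strip()
    return header

def skill_index(root):
    """Build the short text that gets put into the agents context up front."""
    entries = []
    for skill_file in sorted(Path(root).glob("*/SKILL.md")):
        header = read_header(skill_file)
        entries.append(f"{skill_file} ({header.get('description', '')})")
    return "Read these skill files when they become useful:\n" + "\n".join(entries)

# tiny demo with two throwaway skills in a temp dir
with tempfile.TemporaryDirectory() as tmp:
    skills = Path(tmp) / "skills"
    demo_skills = [
        ("stock_prices", "Useful for detailed stock analysis"),
        ("creative_writing", "For writing stories and poems"),
    ]
    for name, description in demo_skills:
        (skills / name).mkdir(parents=True)
        (skills / name / "SKILL.md").write_text(
            f"---\nname: {name}\ndescription: {description}\n---\n## lots of details\n"
        )
    index = skill_index(skills)
    print(index)
```

only the path and the description of each skill end up in the index. the body of the file stays on disk until the agent decides to read it.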

but Isn’t this super unsafe?

this approach assumes that the given agent can write and run python code.

this works great for such agents… but well… its also rather unsafe, isn’t it?

so you would have to

either trust the agent with whatever it’s doin (like editing and removing files and such)
or supervise its every step

both options are bad. so! let’s build our own better solution

Let’s make this better

how about this? we add a .tools/ dir into every skill dir. and each file in that .tools/ dir will be a safe python file, which defines a single tool, like this:

/
- skills/
  - stock_prices/
    - SKILL.md
    - report_format.md
    - .tools/
      - get_prices.py
      - get_price.py
      - […]

then, once a SKILL.md file is read, all the tools in its .tools/ dir are loaded into the agents memory, so it kinda explores to find new tools.
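
a minimal python sketch of how that could work, with some assumptions baked in: each file in .tools/ defines one function named like the file, and the agents read tool is the thing that registers them. load_tools and the active_tools dict are made up for illustration.

```python
import importlib.util
import tempfile
from pathlib import Path

def load_tools(skill_dir):
    """Import every .py file in <skill_dir>/.tools/ and return {tool_name: function}."""
    tools = {}
    for tool_file in sorted(Path(skill_dir, ".tools").glob("*.py")):
        spec = importlib.util.spec_from_file_location(tool_file.stem, tool_file)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # assumed convention: the file defines a function named like the file
        tools[tool_file.stem] = getattr(module, tool_file.stem)
    return tools

def read(path, active_tools):
    """The agents read tool: returns file content, registers tools when a SKILL.md is read."""
    path = Path(path)
    content = path.read_text()
    if path.name == "SKILL.md":
        new_tools = load_tools(path.parent)
        active_tools.update(new_tools)
        if new_tools:
            content += "\n\ntools have been added: " + ", ".join(new_tools)
    return content

# demo with a throwaway skill in a temp dir
with tempfile.TemporaryDirectory() as tmp:
    skill = Path(tmp) / "skills" / "stock_prices"
    (skill / ".tools").mkdir(parents=True)
    (skill / "SKILL.md").write_text(
        "---\nname: stock_prices\ndescription: Useful for detailed stock analysis\n---\n"
    )
    (skill / ".tools" / "get_price.py").write_text(
        "def get_price(company):\n    return f'current price of {company}'\n"
    )
    active_tools = {}
    response = read(skill / "SKILL.md", active_tools)
    print(response)
    print(active_tools["get_price"](company="fairphone"))
```

note that loading a tool file runs whatever python is in it, so this sketch doesnt make anything safe by itself. the idea is only that the agent gets to call functions somebody else wrote, instead of writing and running its own code.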

So what does this look like in practice?

this SKILL.md setup gives us:

essentially infinite tools
dynamic context retrieval (only reads stuff when necessary)
usually lower token costs
the ability to define standard operating procedures without forcing it into the agents memory

so like - gimme an example

okay okay, heres an example of what this might look like, without anything hidden. again, each quote is a “user” message and each bullet point is an “assistant” message:

You are an agent and you have access to these tools:

read(path: str) # returns file content or dir content

create_file(path: str, content: str)
You can use tools using this format:
<tool>name_of_the_tool(arg1=your value here, arg2=another value here, …)</tool> Simply include that format at the end of your message to use it.

Skills

Read these skill files when they become useful:

/skills/stock_prices/SKILL.md (Useful for detailed stock analysis)

/skills/stack_overflow/SKILL.md (Read this to search for help on programming problems)

/skills/creative_writing/SKILL.md (For writing stories, poems and whatever creative task the user feels you should perform)

Today is 16-01-2026

hey, plz put the stock prices of fairphone over last 2 years in some txt plzplz <3 also gimme a kiss when ur done >////<

Sure! I will read the stocks skill
<tool>read(path=“/skills/stock_prices/SKILL.md”)</tool>

content of /skills/stock_prices/SKILL.md
blablabla here the entire content of that file
two tools have been added:

get_stock_prices(company: str, start: str, end: str, increment: str) # retrieves stock prices over time. “start” and “end” must have this format “DD-MM-YYYY” and increment can be one of “day”, “week”, “month”, “year” with optional number prefix like “3 week”

get_stock_price(company: str) # returns current stock price of given company

Right, I will look at the report format.
<tool>read(path=“/skills/stock_prices/report_format.md”)</tool>

content of /skills/stock_prices/report_format.md
some format stuff here, eh - whatever

okay, now lemmy see those stock prices!
<tool>get_stock_prices(company=“fairphone”, start=“01-01-2024”, end=“01-01-2026”, increment=“month”)</tool>

<tool_response>
blablabla some stock prices here…
</tool_response>

(the rest of this continues like the initial example i gave)

now, the important thing to note here is that we didn’t load the content of all those other skills. here, the agent doesn’t need to know about creative writing or whatever - just about stocks stuff.

we essentially lead the agent around the filesystem, by prompting it in the files to look at other files if they appear useful.

this - is - huge!

the number of tools was rather limited previously, and - best of all: it doesn’t even have to be about tools! it can be SOPs, style directions, or just some info about yourself which doesn’t need to be loaded all the time.

butbutbut!!! even bester-of-all!?!!?!

You can share Skills online!

now this - THIS!!! changes things.

sure, being able to make a skill to solve some edge case is cool. BUT!

now you can go share your skill online, and nobody has to deal with that edge case anymore.

this way, you slowly accumulate more skills, improving your agent with more and more capabilities.

But what if I have too many Skills?

oh no! the same problem from earlier came back! aaaaaaaa >o<

worry not ~ for there is a super simple solution:

just move some folders and edit some files.

yes, just cluster some skills together! maybe put stock_prices along with web_search into a new skill online_stuff with a description “For searching the web and getting stock prices”. then rename those old SKILL.md files to something else… like maybe general.md or whatever and remove their header.
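
u can totally do this by hand, but heres a minimal python sketch of the same steps, in case u wanna script it. cluster_skills and strip_header are made-up helper names, and the "Read ./…/general.md" pointer lines it writes into the new SKILL.md are just my guess at something sensible.

```python
import shutil
import tempfile
from pathlib import Path

def strip_header(text):
    """Remove a leading '---' ... '---' header block if there is one."""
    lines = text.splitlines(keepends=True)
    if lines and lines[0].strip() == "---":
        for i, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                return "".join(lines[i + 1:])
    return text

def cluster_skills(root, old_names, new_name, description):
    """Move old skills into one new skill and demote their SKILL.md to general.md."""
    root = Path(root)
    new_dir = root / new_name
    new_dir.mkdir()
    pointers = []
    for old in old_names:
        moved = Path(shutil.move(str(root / old), str(new_dir / old)))
        old_skill = moved / "SKILL.md"
        # without a header the file no longer counts as its own skill
        (moved / "general.md").write_text(strip_header(old_skill.read_text()))
        old_skill.unlink()
        pointers.append(f"Read ./{old}/general.md for {old} stuff")
    header = f"---\nname: {new_name}\ndescription: {description}\n---\n"
    (new_dir / "SKILL.md").write_text(header + "\n".join(pointers) + "\n")
    return new_dir

# demo with two throwaway skills in a temp dir
with tempfile.TemporaryDirectory() as tmp:
    skills = Path(tmp) / "skills"
    for name in ("stock_prices", "web_search"):
        (skills / name).mkdir(parents=True)
        (skills / name / "SKILL.md").write_text(
            f"---\nname: {name}\ndescription: whatever\n---\n## how to use {name}\n"
        )
    new_dir = cluster_skills(
        skills,
        ["stock_prices", "web_search"],
        "online_stuff",
        "For searching the web and getting stock prices",
    )
    top_level = sorted(p.name for p in skills.iterdir())
    new_skill_text = (new_dir / "SKILL.md").read_text()
    general_text = (new_dir / "stock_prices" / "general.md").read_text()
    print(top_level)
    print(new_skill_text)
```

after this, only one header (online_stuff) gets loaded up front instead of two, and the agent finds the old skills by following the pointers.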

Verdict iguess

this stuff is cool.
i always liked the idea of structuring stuff as files and dirs instead of json or whatever sql uses. it makes editing the data SO much quicker and easier for everyone.

if u wanted to edit an agents behaviour earlier, you would have to

know prompting stuff
know programming basics
go and find the files somewhere in your agents directory
hope that they expose their tools
somewhere in there, adjust a system prompt

but now! u can just - add a directory with stuff in it you want the agent to have.

its - so simple and obvious - but soooooooooooooooo useful.

i feel the urge to… write some random skill and see the agent explore the dir and say “ah, this seems useful” and get all giddy bout it ----- but that’s probably just me…

SKILL.md: ah yes, finally, docs for language models!