• Django, Python, LLM

    🤖 I released files-to-claude-xml and new development workflows

    After months of using and sharing this tool via a private gist, I finally carved out some time to release files-to-claude-xml.

    Despite my social media timeline declaring LLMs dead earlier today, I have been using Claude Projects and Artifacts regularly.

    My workflow is to copy a few files into a Claude Project and then create a new chat thread where Claude will help me write tests or build out a few features.

    My files-to-claude-xml script grew out of some research I did, where I stumbled on Anthropic’s Essential tips for long context prompts, which documents how to work around some file upload limits by uploading one big file in Claude’s XML-like format.

    With files-to-claude-xml, I build a list of files that I want to import into a Claude Project. Then, I run it to generate a _claude.xml file, which I drag into Claude. I create a new conversation thread per feature, then copy the finished artifacts out of Claude once my feature or thread is complete.

    After the feature is complete, I delete the _claude.xml file from my project and replace it with an updated copy after I re-run files-to-claude-xml.
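
    To make the workflow concrete, here is a rough sketch of the kind of file such a script could produce. This is not the actual files-to-claude-xml implementation, just a minimal illustration of the XML-like document structure described in Anthropic’s long context tips, assuming the script takes file paths as arguments and writes a single _claude.xml file:

    # sketch.py - a hypothetical illustration, not the real files-to-claude-xml
    import sys
    from pathlib import Path


    def build_claude_xml(paths: list[str]) -> str:
        # Wrap each file in an XML-like <document> block along with its source path
        chunks = ["<documents>"]
        for index, path in enumerate(paths, start=1):
            chunks.append(f'<document index="{index}">')
            chunks.append(f"<source>{path}</source>")
            chunks.append("<document_contents>")
            chunks.append(Path(path).read_text())
            chunks.append("</document_contents>")
            chunks.append("</document>")
        chunks.append("</documents>")
        return "\n".join(chunks)


    if __name__ == "__main__":
        Path("_claude.xml").write_text(build_claude_xml(sys.argv[1:]))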

    Features on the go

    One bonus of using Claude Projects is that once everything is uploaded, I can use the Claude iOS app as a sort-of notes app and development tool. I can start parallel conversation threads and have it work on new ideas and features. Once I get back to my desktop, I can pull these chat conversations up, and if I like the direction of the feature, I might use them. If not, I have wasted no time or effort on them. This also serves as a nice ToDo list.

    New workflows

    I am using this methodology to work further on side projects. Sometimes, I would like to work on something casually while watching Netflix, but after coding during the day, my brain shuts off. Instead of feeling bad that I haven’t added share links to a website or some feature I meant to add last week, I can pair with Claude and work on it together.

    I can also get more done with my lunch hours on projects like DjangoTV than I could have otherwise. Overall, I’m happy to have an on-demand assistant to pair with and work on new features and ideas.

    It’s also quicker to try out new ideas and projects that I would have needed to make time for.

    Alternatives

    Simon Willison wrote files-to-prompt, which I think is also worth trying. I contributed to the discussion, feedback, and document structure for the --cxml feature.

    I wrote files-to-claude-xml before Simon added cxml support, and I had hoped not to release my version.

    However, after trying it out on several projects, my ignore/exclude list grew longer than the list of files I actually wanted to send to Claude. I found it easier to pass an explicit list of files to my tool than to maintain a long list of exclusions.

    Saturday October 12, 2024
  • Python, Ollama, LLM, Today I Learned

    🦙 Ollama Llama 3.1 Red Pajama

    For a few weeks, I told friends I was excited to see if the new Llama 3.1 release was as good as the hype.

    Yesterday, Llama 3.1 was released, and I was impressed that the Ollama project published a release to Homebrew and had the models ready to use.

    ➜ brew install ollama
    
    ➜ ollama serve
    
    # (optionally) I run Ollama as a background service
    ➜ brew services start ollama
    
    # This takes a while (defaults to the llama3.1:8b model)
    ➜ ollama pull llama3.1:latest 
    
    # (optional) This takes a longer time
    ➜ ollama pull llama3.1:70b
    
    # (optional) This takes so long that I skipped it and ordered a CAT6 cable...
    # ollama pull llama3.1:405b
    

    To chat with the model, use the same ollama console command:

    ➜ ollama run llama3.1:latest
    >>> how much is 2+2?
    The answer to 2 + 2 is:
    4!
    
    Accessing Ollama Llama 3.1 with Python
    
    The Ollama project has an ollama-python library (https://github.com/ollama/ollama-python), which I use to build applications.
    
    My demo has a bit of flair because there are a few options, like --stream, that improve the quality of life while waiting for Ollama to return results.
    
    # hello-llama.py
    import typer
    
    from enum import Enum
    from ollama import Client
    from rich import print
    
    
    class Host(str, Enum):
        local = "http://127.0.0.1:11434"
        the_office = "http://the-office:11434"
    
    
    class ModelChoices(str, Enum):
        llama31 = "llama3.1:latest"
        llama31_70b = "llama3.1:70b"
    
    
    def main(
        host: Host = Host.local,
        local: bool = False,
        model: ModelChoices = ModelChoices.llama31,
        stream: bool = False,
    ):
        if local:
            host = Host.local
    
        client = Client(host=host.value)
    
        response = client.chat(
            model=model.value,
            messages=[
                {
                    "role": "user",
                    "content": (
                        "Please riff on the 'Llama Llama Red Pajama' book but using AI terms "
                        "like the 'Ollama' server and the 'Llama 3.1' model. "
                        "Instead of using 'Llama Llama', please use 'Ollama Llama 3.1'."
                    ),
                }
            ],
            stream=stream,
        )
    
        if stream:
            for chunk in response:
                print(chunk["message"]["content"], end="", flush=True)
            print()
    
        else:
            print(f"[yellow]{response['message']['content']}[/yellow]")
    
    if __name__ == "__main__":
        typer.run(main)
    

    Some of my family’s favorite books are the late Anna Dewdney’s Llama Llama books. Please buy and support her work. I can’t read Llama 3.1 and Ollama without thinking of the “Llama Llama Red Pajama” book.

    To set up and run this:

    # Install a few "nice to have" libraries
    ➜ pip install ollama rich typer
    
    # Run our demo
    ➜ python hello-llama.py --stream
    
    Here's a riff on "Llama Llama Red Pajama" but with an AI twist:
    
    **Ollama Llama 3.1, Ollama Llama 3.1**
    Mama said to Ollama Llama 3.1,
    "Dinner's done, time for some learning fun!"
    But Ollama Llama 3.1 didn't wanna play
    With the data sets and algorithms all day.
    
    He wanted to go out and get some rest,
    And dream of neural nets that were truly blessed.
    But Mama said, "No way, young Ollama Llama 3.1,
    You need to train on some more NLP."
    
    Ollama Llama 3.1 got so mad and blue
    He shouted at the cloud, "I don't wanna do this too!"
    But then he remembered all the things he could see,
    On the Ollama server, where his models would be.
    
    So he plugged in his GPU and gave a happy sigh
    And trained on some texts, till the morning light shone high.
    He learned about embeddings and wordplay too,
    And how to chat with humans, that's what he wanted to do.
    
    **The end**
    

    Connecting to Ollama

    I have two Macs running Ollama, and I use Tailscale to bounce between them from anywhere. When I’m at home upstairs, it’s quicker to run a local instance. When I’m on my 2019 MacBook Pro, it’s faster to connect to the office machine.

    The only stumbling block I ran into was needing to set a few ENV variables so that Ollama listens on a port that I can proxy to. This was frustrating to figure out, but I hope it saves you some time.

    ➜ launchctl setenv OLLAMA_HOST 0.0.0.0:11434
    ➜ launchctl setenv OLLAMA_ORIGINS http://*
    
    # Restart the Ollama server to pick up on the ENV vars
    ➜ brew services restart ollama
    
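    With those variables set, the remote machine is reachable over Tailscale from my other Mac. As a quick sanity check, a few lines of Python with the ollama-python client can confirm the connection; “the-office” is just my machine’s Tailscale name from the demo above, so swap in your own host:

    # check-remote.py - a quick sanity check against a remote Ollama instance
    from ollama import Client

    # "the-office" is my Tailscale machine name; adjust to your own
    client = Client(host="http://the-office:11434")

    # Should list the models pulled on the remote machine
    print(client.list())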

    Simon Willison’s LLM tool

    I also like using Simon Willison’s LLM tool, which supports a ton of different AI services via third-party plugins. I like the llm-ollama library, which allows us to connect to our local Ollama instance.

    When working with Ollama, I start with the ollama run command, but I have a few bash scripts that might talk to OpenAI or Claude 3.5, and it’s nice to keep my brain in the same tooling space. LLM is useful for mixing and matching remote and local models.

    To install and use LLM + llm-ollama + Llama 3.1:

    Please note that the Ollama server should already be running as previously outlined.

    # Install llm
    ➜ brew install llm
    
    # Install llm-ollama
    ➜ llm install llm-ollama
    
    # List all of the models available from Ollama
    ➜ llm ollama list-models
    
    # Run a quick prompt against the Llama 3.1 model
    ➜ llm -m llama3.1:latest "how much is 2+2?"
    The answer to 2 + 2 is:
    
    4
    
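    LLM also has a Python API, so the same local models can be scripted outside the CLI. Here is a minimal sketch, assuming llm-ollama is installed and registers the model under the same name shown by llm ollama list-models:

    # llm-hello.py - a minimal sketch of LLM's Python API with llm-ollama
    import llm

    # Resolves through the llm-ollama plugin to the local Ollama server
    model = llm.get_model("llama3.1:latest")

    response = model.prompt("how much is 2+2?")
    print(response.text())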

    Bonus: Mistral Large 2

    While I was working on this post, Mistral AI announced their Mistral Large 2 model (“Large Enough”). The Ollama project released support for it within minutes of the announcement.

    The Mistral Large 2 release is noteworthy because it outperforms Llama 3.1’s 405B parameter model while being under a third of its size. It is also the second GPT-4-class model released in the last two days.

    Check out Simon’s post for more details, including another LLM plugin that offers a different way to access it.

    Wednesday July 24, 2024