DevLog 250423 – Terminal Home & HF Bot Setup

> Log Date: 2025-04-23

Continued building my terminal-themed homepage and began developing a Hugging Face AI bot with plans to customize it for documentation and portfolio use.

Today I worked on refining my terminal homepage styling and layout. There are a few other cleanup tasks I want to tackle soon — mostly reworking older HTML into Astro format, optimizing mobile spacing, and trimming unused scripts.


Custom Hugging Face Bot Setup

I also began creating a personal AI bot on Hugging Face, using `app.py` as the entry point for prompt logic and document loading. Here's a quick-start tutorial to help others set up their own:

1. Clone the Hugging Face Gradio Space

```shell
git clone https://huggingface.co/spaces/your-username/your-space
```

2. Edit app.py with Custom Prompt Logic

Basic template using the `huggingface_hub` Inference API:

```python
import gradio as gr
from huggingface_hub import InferenceClient

client = InferenceClient("HuggingFaceH4/zephyr-7b-beta")

def respond(message, history, system_message, max_tokens, temperature, top_p):
    # Rebuild the Gradio (user, bot) history into the messages format
    # the chat completion endpoint expects.
    messages = [{"role": "system", "content": system_message}]
    for user, bot in history:
        if user:
            messages.append({"role": "user", "content": user})
        if bot:
            messages.append({"role": "assistant", "content": bot})
    messages.append({"role": "user", "content": message})

    # Stream the reply; each chunk carries the newest piece of text in
    # choices[0].delta.content (which can be None on the final chunk).
    response = ""
    for chunk in client.chat_completion(
        messages,
        max_tokens=max_tokens,
        temperature=temperature,
        top_p=top_p,
        stream=True,
    ):
        token = chunk.choices[0].delta.content
        if token:
            response += token
    return response
```
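The history-to-messages conversion is worth keeping as a standalone helper so it can be unit-tested without hitting the API. A small sketch (the function name is my own, not part of the Space template):

```python
def build_messages(message, history, system_message):
    """Rebuild a Gradio (user, bot) history into the messages list
    the chat completion endpoint expects."""
    messages = [{"role": "system", "content": system_message}]
    for user, bot in history:
        if user:
            messages.append({"role": "user", "content": user})
        if bot:
            messages.append({"role": "assistant", "content": bot})
    # The incoming message always goes last, as the newest user turn.
    messages.append({"role": "user", "content": message})
    return messages
```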

3. Upload Local Docs

To give your bot context, upload `.txt`, `.md`, or `.json` files into the root of the Space. You'll need to write a custom parser or embed them directly into the prompt context during inference.

4. Push to Hugging Face

```shell
git add .
git commit -m "Initial bot setup"
git push
```

Next Step: Choosing the Best Model

I'm currently researching the most performant open models. I've been leaning toward the newly announced Llama 4, which looks like a strong performer among open-weight releases.

Once I've chosen the final model, I’ll optimize prompt logic and memory handling for technical doc retrieval, code walkthroughs, and developer help.
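The doc-retrieval step doesn't have to start with embeddings; even crude keyword overlap can rank chunks well enough for a first pass. A minimal sketch (all names are mine, not from any library):

```python
import re

def score(query, chunk):
    """Crude relevance score: count of lowercase word tokens the
    query and the chunk have in common."""
    q = set(re.findall(r"[a-z0-9]+", query.lower()))
    c = set(re.findall(r"[a-z0-9]+", chunk.lower()))
    return len(q & c)

def retrieve(query, chunks, k=3):
    """Return the k chunks sharing the most words with the query;
    only the top results get folded into the prompt context."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return ranked[:k]
```

Swapping this for an embedding model later wouldn't change the calling code, only the `score` function.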


Back to Arynwood Blog