Schnell! (Text to Image Using the Hugging Face API)

Text-to-Image Generator

This project provides a Python script to generate images from text prompts using the Hugging Face Inference API. It leverages the modern black-forest-labs/FLUX.1-schnell model for high-quality, fast image generation without requiring a local GPU.

Get the code on GitHub:

https://github.com/mskogly/Schnell-Text-to-Image-Generator

Todo / could do

  • A WordPress plugin?
  • Swap to another fast model, perhaps a fine-tuned version of FLUX.1-schnell that is a bit better at following prompts (see this thread on Reddit for a nice comparison test of FLUX variations done by the very helpful MushroomCharacter411).

Features

  • Serverless Inference: Uses Hugging Face’s Inference API, so no heavy local hardware is required.
  • High Quality: Defaults to the black-forest-labs/FLUX.1-schnell model.
  • Configurable Resolution: Default resolution is set to 1344×768 (Landscape), but can be customized via command-line arguments.
  • Secure: Uses environment variables for API token management.

Prerequisites

  • Python 3.x
  • A Hugging Face account and API Token (Read access).

Setup

  1. Environment Setup: The project is designed to run in a virtual environment.

     python3 -m venv venv
     source venv/bin/activate

  2. Install Dependencies:

     pip install --upgrade pip
     pip install huggingface_hub python-dotenv Pillow

  3. Configuration: Create a .env file in the project root and add your Hugging Face token:

     HF_TOKEN=hf_your_token_here

Usage

Basic Usage

Generate an image with the default settings (1344×768). The image will be automatically saved to the output/ folder with a timestamp and sanitized prompt as the filename:

./venv/bin/python generate_image.py "A futuristic city skyline at sunset"
# Output: output/2025-11-24_12-10-53_A_futuristic_city_skyline_at_sunset.png

Custom Output Filename

Specify where to save the generated image:

./venv/bin/python generate_image.py "A cute robot cat" --output robot_cat.png

Custom Resolution

Override the default resolution. Note that FLUX.1-schnell works best with dimensions that are multiples of 32.
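If you want to guard against off-grid sizes, a small helper (hypothetical, not part of the script) can snap a requested dimension down to the nearest multiple of 32:

```python
def snap_to_32(value, minimum=256):
    """Round a dimension down to the nearest multiple of 32 (hypothetical helper)."""
    return max(minimum, (value // 32) * 32)

print(snap_to_32(1000))  # 992
print(snap_to_32(1344))  # 1344
```

A value like 1000 becomes 992, while dimensions that are already on the grid (1344, 768, 1024) pass through unchanged.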

Square (1024×1024):

./venv/bin/python generate_image.py "Abstract art" --width 1024 --height 1024

Portrait (768×1344):

./venv/bin/python generate_image.py "A tall cyberpunk tower" --width 768 --height 1344

Troubleshooting

  • 404/410 Errors: These usually mean the model endpoint is temporarily unavailable or moved. The script currently uses black-forest-labs/FLUX.1-schnell, which is supported on the router.
  • Authentication Errors: Ensure your HF_TOKEN is correctly set in the .env file and has valid permissions.
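As a quick local sanity check before blaming the API, you can verify that a token was actually loaded and looks like a Hugging Face user token. These typically start with hf_, so the check below is only a heuristic (it cannot prove the token is valid):

```python
import os

def token_looks_plausible(token):
    """Heuristic check: Hugging Face user tokens typically start with 'hf_'."""
    return bool(token) and token.startswith("hf_")

token = os.getenv("HF_TOKEN", "")
if not token_looks_plausible(token):
    print("HF_TOKEN missing or malformed; check your .env file.")
```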

Web Interface

  1. Install the new Flask dependency (you may already have the virtual environment active):

     pip install flask

  2. Run the Flask server (it loads the same .env file, so your HF_TOKEN is already read):

     python web_app.py

  3. Open http://localhost:5000 in your browser, enter a prompt, adjust resolution/format if needed, then download the generated image via the link that appears.
Example of the sidecar metadata saved alongside each generated image:

{
  "prompt": "A man and a robot are painting together in a ruined building, in the style of Caravaggio",
  "width": 1344,
  "height": 768,
  "format": "jpg",
  "num_inference_steps": 6,
  "seed": 3062171345,
  "model": "black-forest-labs/FLUX.1-schnell",
  "timestamp": "2025-11-24T18:29:20.902530",
  "filename": "2025-11-24_18-29-17_A_man_and_a_robot_are_painting_together_in_a_ruine.jpg"
}
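Because generate_image.py writes this JSON file next to each image, a past run can be replayed by reading its parameters back. A hypothetical helper (not part of the script) might look like:

```python
import json
from pathlib import Path

def load_generation_params(json_path):
    """Read a sidecar metadata file and return kwargs suitable for generate_image()."""
    meta = json.loads(Path(json_path).read_text())
    keys = ("prompt", "width", "height", "format", "num_inference_steps", "seed")
    return {k: meta[k] for k in keys}
```

Passing the returned dict to generate_image() as keyword arguments should reproduce the run, since the seed is recorded and fixed.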

generate_image.py

import os
import argparse
import re
import json
import random
from datetime import datetime
from pathlib import Path
from dotenv import load_dotenv
from huggingface_hub import InferenceClient

# Load environment variables
load_dotenv()

def sanitize_filename(text, max_length=50):
    """Convert text to a safe filename."""
    # Remove or replace unsafe characters
    safe = re.sub(r'[^\w\s-]', '', text)
    # Replace spaces with underscores
    safe = re.sub(r'[-\s]+', '_', safe)
    # Truncate to max length
    return safe[:max_length].strip('_')

def generate_image(prompt, output_file=None, width=1344, height=768, format="jpg", num_inference_steps=4, seed=None):
    token = os.getenv("HF_TOKEN")
    if not token:
        raise ValueError("HF_TOKEN not found in environment variables. Please check your .env file.")

    print(f"Generating image for prompt: '{prompt}'")
    print(f"Resolution: {width}x{height}")
    print(f"Inference steps: {num_inference_steps}")
    
    # Generate random seed if not provided
    if seed is None:
        seed = random.randint(0, 2**32 - 1)
    print(f"Seed: {seed}")
    
    # Create output directory
    output_dir = Path("output")
    output_dir.mkdir(exist_ok=True)
    
    # Generate filename if not provided
    if output_file is None:
        timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        prompt_slug = sanitize_filename(prompt)
        ext = format.lower()
        output_file = output_dir / f"{timestamp}_{prompt_slug}.{ext}"
    else:
        output_file = Path(output_file)
    
    # Initialize the client
    client = InferenceClient(token=token)
    
    try:
        # FLUX.1-schnell is a good candidate for a modern default
        model = "black-forest-labs/FLUX.1-schnell" 
        print(f"Using model: {model}")
        
        image = client.text_to_image(
            prompt, 
            model=model,
            width=width,
            height=height,
            num_inference_steps=num_inference_steps,
            seed=seed
        )
        
        # Save image
        if format.lower() == "jpg":
            # Convert RGBA to RGB for JPG (JPG doesn't support transparency)
            if image.mode == "RGBA":
                rgb_image = image.convert("RGB")
                rgb_image.save(output_file, "JPEG", quality=95)
            else:
                image.save(output_file, "JPEG", quality=95)
        else:
            image.save(output_file)
        print(f"Image saved to {output_file}")
        
        # Save metadata as JSON
        metadata = {
            "prompt": prompt,
            "width": width,
            "height": height,
            "format": format,
            "num_inference_steps": num_inference_steps,
            "seed": seed,
            "model": model,
            "timestamp": datetime.now().isoformat(),
            "filename": str(output_file.name)
        }
        
        metadata_file = output_file.with_suffix('.json')
        with open(metadata_file, 'w') as f:
            json.dump(metadata, f, indent=2)
        print(f"Metadata saved to {metadata_file}")
        
    except Exception as e:
        print(f"Error generating image: {repr(e)}")
        raise  # re-raise so callers (e.g. the Flask app) can report the failure

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generate an image from text using Hugging Face Inference API.")
    parser.add_argument("prompt", type=str, help="The text prompt for image generation")
    parser.add_argument("--output", type=str, default=None, help="Output filename (default: auto-generated with timestamp and prompt)")
    parser.add_argument("--width", type=int, default=1344, help="Image width (default: 1344)")
    parser.add_argument("--height", type=int, default=768, help="Image height (default: 768)")
    parser.add_argument("--format", type=str, default="jpg", choices=["jpg", "png"], help="Output format (default: jpg)")
    parser.add_argument("--steps", type=int, default=4, help="Number of inference steps (default: 4, higher = better quality but slower)")
    parser.add_argument("--seed", type=int, default=None, help="Random seed for reproducibility (default: random)")
    
    args = parser.parse_args()
    
    generate_image(args.prompt, args.output, args.width, args.height, args.format, args.steps, args.seed)
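For reference, the slug logic in sanitize_filename behaves like this (the function is duplicated here so the snippet is self-contained):

```python
import re

def sanitize_filename(text, max_length=50):
    # Same logic as in generate_image.py, copied for a standalone demo.
    safe = re.sub(r'[^\w\s-]', '', text)
    safe = re.sub(r'[-\s]+', '_', safe)
    return safe[:max_length].strip('_')

print(sanitize_filename("A man & a robot: painting!"))  # A_man_a_robot_painting
```

Punctuation is dropped, whitespace runs collapse to single underscores, and the result is capped at 50 characters, which is why long prompts end up truncated in the saved filenames.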

web_app.py

from datetime import datetime
from pathlib import Path
import json

from flask import Flask, render_template_string, request, send_from_directory, url_for

from generate_image import generate_image, sanitize_filename

app = Flask(__name__)

HTML_TEMPLATE = """
<!doctype html>
<title>Text-to-Image</title>
<style>
  body { font-family: system-ui,-apple-system,BlinkMacSystemFont,"Segoe UI",sans-serif; margin: 2rem; background: #0e1117; color: #f5f6fb; }
  .container { max-width: 1400px; margin: 0 auto; }
  form { max-width: 640px; margin-bottom: 2rem; }
  label { display: block; margin-bottom: .25rem; font-weight: 500; }
  input, textarea, select { width: 100%; padding: .5rem; margin-bottom: 1rem; border-radius: 6px; border: 1px solid #333; background: #141922; color: inherit; font-family: inherit; }
  button { padding: .75rem 1.5rem; border: none; border-radius: 6px; background: #0066ff; color: #fff; cursor: pointer; font-weight: 500; }
  button:hover { background: #0052cc; }
  .result { margin-top: 1rem; padding: 1rem; border-radius: 8px; background: #161b27; }
  .error { color: #ff6b6b; }
  .metadata { background: #1a1f2e; padding: 1rem; border-radius: 6px; margin-top: 1rem; font-family: 'Courier New', monospace; font-size: 0.9em; }
  .metadata dt { font-weight: bold; color: #8b92a8; margin-top: 0.5rem; }
  .metadata dd { margin-left: 1rem; color: #d4d7e0; }
  
  /* Gallery styles */
  .gallery-header { margin-top: 3rem; margin-bottom: 1rem; border-top: 2px solid #333; padding-top: 2rem; }
  .gallery { display: grid; grid-template-columns: repeat(auto-fill, minmax(300px, 1fr)); gap: 1.5rem; margin-top: 1rem; }
  .gallery-item { background: #161b27; border-radius: 8px; overflow: hidden; transition: transform 0.2s; }
  .gallery-item:hover { transform: translateY(-4px); }
  .gallery-item img { width: 100%; height: 200px; object-fit: cover; display: block; }
  .gallery-info { padding: 1rem; }
  .gallery-prompt { font-size: 0.9em; margin-bottom: 0.5rem; line-height: 1.4; color: #d4d7e0; }
  .gallery-meta { font-size: 0.75em; color: #8b92a8; }
  .gallery-meta span { display: inline-block; margin-right: 0.75rem; }
  .gallery-link { color: #0066ff; text-decoration: none; font-size: 0.85em; }
  .gallery-link:hover { text-decoration: underline; }
</style>
<div class="container">
  <h1>Text-to-Image Generator</h1>
  <p>Enter a prompt (and optional settings) to generate an image using Hugging Face FLUX.1-schnell. Images and metadata are saved to <code>output/</code>.</p>
  <form method="post">
    <label for="prompt">Prompt</label>
    <textarea id="prompt" name="prompt" rows="3" required>{{ prompt or "" }}</textarea>

    <label for="width">Width</label>
    <input id="width" name="width" type="number" min="256" max="2048" step="32" value="{{ width or 1344 }}">

    <label for="height">Height</label>
    <input id="height" name="height" type="number" min="256" max="2048" step="32" value="{{ height or 768 }}">

    <label for="steps">Inference Steps (higher = better quality, slower)</label>
    <input id="steps" name="steps" type="number" min="1" max="50" value="{{ steps or 4 }}">

    <label for="seed">Seed (leave empty for random)</label>
    <input id="seed" name="seed" type="number" min="0" value="{{ seed or '' }}" placeholder="Random">

    <label for="format">Format</label>
    <select id="format" name="format">
      <option value="jpg" {% if format == "jpg" %}selected{% endif %}>JPEG</option>
      <option value="png" {% if format == "png" %}selected{% endif %}>PNG</option>
    </select>

    <button type="submit">Generate Image</button>
  </form>

  {% if message %}
    <div class="result">
      <p>{{ message }}</p>
      {% if image_url %}
        <p><a href="{{ image_url }}" target="_blank" rel="noreferrer">Open generated image</a></p>
        <p><img src="{{ image_url }}" alt="Generated image" style="max-width:100%; border-radius:8px;"/></p>
        
        {% if metadata %}
        <div class="metadata">
          <h3>Generation Metadata</h3>
          <dl>
            <dt>Prompt:</dt>
            <dd>{{ metadata.prompt }}</dd>
            <dt>Model:</dt>
            <dd>{{ metadata.model }}</dd>
            <dt>Resolution:</dt>
            <dd>{{ metadata.width }}x{{ metadata.height }}</dd>
            <dt>Inference Steps:</dt>
            <dd>{{ metadata.num_inference_steps }}</dd>
            <dt>Seed:</dt>
            <dd>{{ metadata.seed }}</dd>
            <dt>Format:</dt>
            <dd>{{ metadata.format }}</dd>
            <dt>Timestamp:</dt>
            <dd>{{ metadata.timestamp }}</dd>
          </dl>
        </div>
        {% endif %}
      {% endif %}
    </div>
  {% endif %}

  {% if error %}
    <div class="result error">{{ error }}</div>
  {% endif %}

  <div class="gallery-header">
    <h2>Recent Generations ({{ gallery_items|length }})</h2>
  </div>
  
  <div class="gallery">
    {% for item in gallery_items %}
    <div class="gallery-item">
      <a href="{{ item.image_url }}" target="_blank">
        <img src="{{ item.image_url }}" alt="{{ item.metadata.prompt }}" loading="lazy">
      </a>
      <div class="gallery-info">
        <div class="gallery-prompt">{{ item.metadata.prompt }}</div>
        <div class="gallery-meta">
          <span>{{ item.metadata.width }}×{{ item.metadata.height }}</span>
          <span>{{ item.metadata.num_inference_steps }} steps</span>
          <span>{{ item.metadata.format }}</span>
        </div>
        <div style="margin-top: 0.5rem;">
          <a href="{{ item.image_url }}" class="gallery-link" download>Download</a>
        </div>
      </div>
    </div>
    {% endfor %}
  </div>
</div>
"""


def build_output_filename(prompt: str, extension: str) -> Path:
    """Create a timestamped filename based on the prompt."""
    timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
    prompt_slug = sanitize_filename(prompt) or "image"
    return Path("output") / f"{timestamp}_{prompt_slug}.{extension}"


def get_gallery_items(limit=50):
    """Get recent generated images with their metadata."""
    output_dir = Path("output")
    if not output_dir.exists():
        return []
    
    items = []
    # Get all JSON metadata files
    json_files = sorted(output_dir.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True)
    
    for json_file in json_files[:limit]:
        try:
            with open(json_file, 'r') as f:
                metadata = json.load(f)
            
            # Check if the corresponding image exists
            image_file = json_file.with_suffix(f".{metadata.get('format', 'jpg')}")
            if image_file.exists():
                items.append({
                    'metadata': metadata,
                    'image_url': url_for('serve_image', filename=image_file.name),
                    'json_url': url_for('serve_image', filename=json_file.name)
                })
        except Exception as e:
            print(f"Error loading {json_file}: {e}")
            continue
    
    return items


@app.route("/", methods=["GET", "POST"])
def index():
    message = None
    image_url = None
    error = None
    metadata = None
    prompt = request.form.get("prompt", "")
    width = request.form.get("width", "1344")
    height = request.form.get("height", "768")
    steps = request.form.get("steps", "4")
    seed_str = request.form.get("seed", "")
    seed = int(seed_str) if seed_str else None
    format_choice = request.form.get("format", "jpg")

    if request.method == "POST":
        if not prompt.strip():
            error = "Prompt must not be empty."
        else:
            try:
                output_path = build_output_filename(prompt, format_choice)
                output_path.parent.mkdir(exist_ok=True)
                generate_image(
                    prompt=prompt,
                    output_file=output_path,
                    width=int(width),
                    height=int(height),
                    format=format_choice,
                    num_inference_steps=int(steps),
                    seed=seed,
                )
                image_url = url_for("serve_image", filename=output_path.name)
                message = f"Saved to {output_path}"
                
                # Load metadata
                metadata_path = output_path.with_suffix('.json')
                if metadata_path.exists():
                    with open(metadata_path, 'r') as f:
                        metadata = json.load(f)
                        
            except Exception as exc:
                error = f"Could not generate image: {exc}"

    # Get gallery items
    gallery_items = get_gallery_items()

    return render_template_string(
        HTML_TEMPLATE,
        prompt=prompt,
        width=width,
        height=height,
        steps=steps,
        seed=seed,
        format=format_choice,
        message=message,
        image_url=image_url,
        metadata=metadata,
        error=error,
        gallery_items=gallery_items,
    )


@app.route("/output/<path:filename>")
def serve_image(filename):
    return send_from_directory("output", filename)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000, debug=True)
