Tag: Text to image

  • Schnell! (Text to image using the Hugging Face API)

    Text-to-Image Generator

    This project provides a Python script to generate images from text prompts using the Hugging Face Inference API. It leverages the modern black-forest-labs/FLUX.1-schnell model for high-quality, fast image generation without requiring a local GPU.

    Get the code on GitHub:

    https://github.com/mskogly/Schnell-Text-to-Image-Generator

    Todo / could do

    • A WordPress plugin?
    • Swap to another fast model, perhaps a fine-tuned version of FLUX.1-schnell that is a bit better at following prompts (see this thread on Reddit for a nice comparison test of Flux variations done by the very helpful MushroomCharacter411).

    Features

    • Serverless Inference: Uses Hugging Face’s Inference API, so no heavy local hardware is required.
    • High Quality: Defaults to the black-forest-labs/FLUX.1-schnell model.
    • Configurable Resolution: Default resolution is set to 1344×768 (Landscape), but can be customized via command-line arguments.
    • Secure: Uses environment variables for API token management.

    Prerequisites

    • Python 3.x
    • A Hugging Face account and API Token (Read access).

    Setup

    1. Environment Setup: The project is designed to run in a virtual environment.

       python3 -m venv venv
       source venv/bin/activate

    2. Install Dependencies:

       pip install --upgrade pip
       pip install huggingface_hub python-dotenv Pillow

    3. Configuration: Create a .env file in the project root and add your Hugging Face token:

       HF_TOKEN=hf_your_token_here

    Usage

    Basic Usage

    Generate an image with the default settings (1344×768). The image will be automatically saved to the output/ folder with a timestamp and sanitized prompt as the filename:

    ./venv/bin/python generate_image.py "A futuristic city skyline at sunset"
    # Output: output/2025-11-24_12-10-53_A_futuristic_city_skyline_at_sunset.png

    Custom Output Filename

    Specify where to save the generated image:

    ./venv/bin/python generate_image.py "A cute robot cat" --output robot_cat.png

    Custom Resolution

    Override the default resolution. Note that FLUX.1-schnell works best with dimensions that are multiples of 32.

    Square (1024×1024):

    ./venv/bin/python generate_image.py "Abstract art" --width 1024 --height 1024

    Portrait (768×1344):

    ./venv/bin/python generate_image.py "A tall cyberpunk tower" --width 768 --height 1344
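    Since the model prefers dimensions that are multiples of 32, arbitrary sizes can be clamped before being passed in. A small sketch (snap32 is an illustrative helper, not part of the repo):

```python
def snap32(value, minimum=256, maximum=2048):
    """Round a dimension to the nearest multiple of 32, clamped to a sane range."""
    snapped = round(value / 32) * 32
    return max(minimum, min(maximum, snapped))

print(snap32(1000))  # 992
print(snap32(1344))  # 1344
```

    Passing the snapped values as --width and --height keeps any resolution valid for the model.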

    Troubleshooting

    • 404/410 Errors: These usually mean the model endpoint is temporarily unavailable or has moved. The script currently uses black-forest-labs/FLUX.1-schnell, which is supported on the router.
    • Authentication Errors: Ensure your HF_TOKEN is correctly set in the .env file and has valid permissions.
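    For authentication issues, a quick sanity check catches the most common mistakes (token missing, or not a real token) before any API call is made. The check_hf_token helper below is illustrative, not part of the repo; it relies on the fact that Hugging Face user access tokens start with hf_:

```python
import os

def check_hf_token():
    """Return (ok, message) describing the state of the HF_TOKEN variable."""
    token = os.getenv("HF_TOKEN", "")
    if not token:
        return False, "HF_TOKEN is not set - check your .env file"
    if not token.startswith("hf_"):
        return False, "HF_TOKEN does not look like a Hugging Face token (expected hf_ prefix)"
    return True, "HF_TOKEN looks OK"

print(check_hf_token())
```

    This only validates the shape of the token; whether it actually has the right permissions is still decided by the API.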

    Web Interface

    1. Install the Flask dependency (with the virtual environment active):

       pip install flask

    2. Run the Flask server (it loads the same .env file, so your HF_TOKEN is picked up automatically):

       python web_app.py

    3. Open http://localhost:5000 in your browser, enter a prompt, adjust resolution/format if needed, then download the generated image via the link that appears.
    Each generation also writes a JSON metadata sidecar next to the image, for example:

    {
      "prompt": "A man and a robot are painting together in a ruined building, in the style of Caravaggio",
      "width": 1344,
      "height": 768,
      "format": "jpg",
      "num_inference_steps": 6,
      "seed": 3062171345,
      "model": "black-forest-labs/FLUX.1-schnell",
      "timestamp": "2025-11-24T18:29:20.902530",
      "filename": "2025-11-24_18-29-17_A_man_and_a_robot_are_painting_together_in_a_ruine.jpg"
    }
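    Because the seed is stored in the sidecar, any image can be regenerated exactly. A small sketch that turns a metadata file back into the original CLI invocation (rebuild_command is a hypothetical helper, not part of the repo):

```python
import json
from pathlib import Path

def rebuild_command(metadata_path):
    """Reconstruct the generate_image.py invocation from a metadata sidecar."""
    meta = json.loads(Path(metadata_path).read_text())
    return (
        f'./venv/bin/python generate_image.py "{meta["prompt"]}" '
        f'--width {meta["width"]} --height {meta["height"]} '
        f'--format {meta["format"]} --steps {meta["num_inference_steps"]} '
        f'--seed {meta["seed"]}'
    )
```

    Running the returned command reproduces the generation, since FLUX.1-schnell is deterministic for a fixed prompt, resolution, step count, and seed.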

    generate_image.py

    import os
    import argparse
    import re
    import json
    import random
    from datetime import datetime
    from pathlib import Path
    from dotenv import load_dotenv
    from huggingface_hub import InferenceClient
    
    # Load environment variables
    load_dotenv()
    
    def sanitize_filename(text, max_length=50):
        """Convert text to a safe filename."""
        # Remove or replace unsafe characters
        safe = re.sub(r'[^\w\s-]', '', text)
        # Replace spaces with underscores
        safe = re.sub(r'[-\s]+', '_', safe)
        # Truncate to max length
        return safe[:max_length].strip('_')
    
    def generate_image(prompt, output_file=None, width=1344, height=768, format="jpg", num_inference_steps=4, seed=None):
        token = os.getenv("HF_TOKEN")
        if not token:
            raise ValueError("HF_TOKEN not found in environment variables. Please check your .env file.")
    
        print(f"Generating image for prompt: '{prompt}'")
        print(f"Resolution: {width}x{height}")
        print(f"Inference steps: {num_inference_steps}")
        
        # Generate random seed if not provided
        if seed is None:
            seed = random.randint(0, 2**32 - 1)
        print(f"Seed: {seed}")
        
        # Create output directory
        output_dir = Path("output")
        output_dir.mkdir(exist_ok=True)
        
        # Generate filename if not provided
        if output_file is None:
            timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
            prompt_slug = sanitize_filename(prompt)
            ext = format.lower()
            output_file = output_dir / f"{timestamp}_{prompt_slug}.{ext}"
        else:
            output_file = Path(output_file)
        
        # Initialize the client
        client = InferenceClient(token=token)
        
        try:
            # FLUX.1-schnell is a good candidate for a modern default
            model = "black-forest-labs/FLUX.1-schnell" 
            print(f"Using model: {model}")
            
            image = client.text_to_image(
                prompt, 
                model=model,
                width=width,
                height=height,
                num_inference_steps=num_inference_steps,
                seed=seed
            )
            
            # Save image
            if format.lower() == "jpg":
                # Convert RGBA to RGB for JPG (JPG doesn't support transparency)
                if image.mode == "RGBA":
                    rgb_image = image.convert("RGB")
                    rgb_image.save(output_file, "JPEG", quality=95)
                else:
                    image.save(output_file, "JPEG", quality=95)
            else:
                image.save(output_file)
            print(f"Image saved to {output_file}")
            
            # Save metadata as JSON
            metadata = {
                "prompt": prompt,
                "width": width,
                "height": height,
                "format": format,
                "num_inference_steps": num_inference_steps,
                "seed": seed,
                "model": model,
                "timestamp": datetime.now().isoformat(),
                "filename": str(output_file.name)
            }
            
            metadata_file = output_file.with_suffix('.json')
            with open(metadata_file, 'w') as f:
                json.dump(metadata, f, indent=2)
            print(f"Metadata saved to {metadata_file}")
            
        except Exception as e:
            print(f"Error generating image: {repr(e)}")
            # Re-raise so callers (e.g. web_app.py) can surface the failure
            # instead of silently reporting success for a missing file.
            raise
    
    if __name__ == "__main__":
        parser = argparse.ArgumentParser(description="Generate an image from text using Hugging Face Inference API.")
        parser.add_argument("prompt", type=str, help="The text prompt for image generation")
        parser.add_argument("--output", type=str, default=None, help="Output filename (default: auto-generated with timestamp and prompt)")
        parser.add_argument("--width", type=int, default=1344, help="Image width (default: 1344)")
        parser.add_argument("--height", type=int, default=768, help="Image height (default: 768)")
        parser.add_argument("--format", type=str, default="jpg", choices=["jpg", "png"], help="Output format (default: jpg)")
        parser.add_argument("--steps", type=int, default=4, help="Number of inference steps (default: 4, higher = better quality but slower)")
        parser.add_argument("--seed", type=int, default=None, help="Random seed for reproducibility (default: random)")
        
        args = parser.parse_args()
        
        generate_image(args.prompt, args.output, args.width, args.height, args.format, args.steps, args.seed)
    

    web_app.py

    from datetime import datetime
    from pathlib import Path
    import json
    
    from flask import Flask, render_template_string, request, send_from_directory, url_for
    
    from generate_image import generate_image, sanitize_filename
    
    app = Flask(__name__)
    
    HTML_TEMPLATE = """
    <!doctype html>
    <title>Text-to-Image</title>
    <style>
      body { font-family: system-ui,-apple-system,BlinkMacSystemFont,"Segoe UI",sans-serif; margin: 2rem; background: #0e1117; color: #f5f6fb; }
      .container { max-width: 1400px; margin: 0 auto; }
      form { max-width: 640px; margin-bottom: 2rem; }
      label { display: block; margin-bottom: .25rem; font-weight: 500; }
      input, textarea, select { width: 100%; padding: .5rem; margin-bottom: 1rem; border-radius: 6px; border: 1px solid #333; background: #141922; color: inherit; font-family: inherit; }
      button { padding: .75rem 1.5rem; border: none; border-radius: 6px; background: #0066ff; color: #fff; cursor: pointer; font-weight: 500; }
      button:hover { background: #0052cc; }
      .result { margin-top: 1rem; padding: 1rem; border-radius: 8px; background: #161b27; }
      .error { color: #ff6b6b; }
      .metadata { background: #1a1f2e; padding: 1rem; border-radius: 6px; margin-top: 1rem; font-family: 'Courier New', monospace; font-size: 0.9em; }
      .metadata dt { font-weight: bold; color: #8b92a8; margin-top: 0.5rem; }
      .metadata dd { margin-left: 1rem; color: #d4d7e0; }
      
      /* Gallery styles */
      .gallery-header { margin-top: 3rem; margin-bottom: 1rem; border-top: 2px solid #333; padding-top: 2rem; }
      .gallery { display: grid; grid-template-columns: repeat(auto-fill, minmax(300px, 1fr)); gap: 1.5rem; margin-top: 1rem; }
      .gallery-item { background: #161b27; border-radius: 8px; overflow: hidden; transition: transform 0.2s; }
      .gallery-item:hover { transform: translateY(-4px); }
      .gallery-item img { width: 100%; height: 200px; object-fit: cover; display: block; }
      .gallery-info { padding: 1rem; }
      .gallery-prompt { font-size: 0.9em; margin-bottom: 0.5rem; line-height: 1.4; color: #d4d7e0; }
      .gallery-meta { font-size: 0.75em; color: #8b92a8; }
      .gallery-meta span { display: inline-block; margin-right: 0.75rem; }
      .gallery-link { color: #0066ff; text-decoration: none; font-size: 0.85em; }
      .gallery-link:hover { text-decoration: underline; }
    </style>
    <div class="container">
      <h1>Text-to-Image Generator</h1>
      <p>Enter a prompt (and optional settings) to generate an image using Hugging Face FLUX.1-schnell. Images and metadata are saved to <code>output/</code>.</p>
      <form method="post">
        <label for="prompt">Prompt</label>
        <textarea id="prompt" name="prompt" rows="3" required>{{ prompt or "" }}</textarea>
    
        <label for="width">Width</label>
        <input id="width" name="width" type="number" min="256" max="2048" step="32" value="{{ width or 1344 }}">
    
        <label for="height">Height</label>
        <input id="height" name="height" type="number" min="256" max="2048" step="32" value="{{ height or 768 }}">
    
        <label for="steps">Inference Steps (higher = better quality, slower)</label>
        <input id="steps" name="steps" type="number" min="1" max="50" value="{{ steps or 4 }}">
    
        <label for="seed">Seed (leave empty for random)</label>
        <input id="seed" name="seed" type="number" min="0" value="{{ seed or '' }}" placeholder="Random">
    
        <label for="format">Format</label>
        <select id="format" name="format">
          <option value="jpg" {% if format == "jpg" %}selected{% endif %}>JPEG</option>
          <option value="png" {% if format == "png" %}selected{% endif %}>PNG</option>
        </select>
    
        <button type="submit">Generate Image</button>
      </form>
    
      {% if message %}
        <div class="result">
          <p>{{ message }}</p>
          {% if image_url %}
            <p><a href="{{ image_url }}" target="_blank" rel="noreferrer">Open generated image</a></p>
            <p><img src="{{ image_url }}" alt="Generated image" style="max-width:100%; border-radius:8px;"/></p>
            
            {% if metadata %}
            <div class="metadata">
              <h3>Generation Metadata</h3>
              <dl>
                <dt>Prompt:</dt>
                <dd>{{ metadata.prompt }}</dd>
                <dt>Model:</dt>
                <dd>{{ metadata.model }}</dd>
                <dt>Resolution:</dt>
                <dd>{{ metadata.width }}x{{ metadata.height }}</dd>
                <dt>Inference Steps:</dt>
                <dd>{{ metadata.num_inference_steps }}</dd>
                <dt>Seed:</dt>
                <dd>{{ metadata.seed }}</dd>
                <dt>Format:</dt>
                <dd>{{ metadata.format }}</dd>
                <dt>Timestamp:</dt>
                <dd>{{ metadata.timestamp }}</dd>
              </dl>
            </div>
            {% endif %}
          {% endif %}
        </div>
      {% endif %}
    
      {% if error %}
        <div class="result error">{{ error }}</div>
      {% endif %}
    
      <div class="gallery-header">
        <h2>Recent Generations ({{ gallery_items|length }})</h2>
      </div>
      
      <div class="gallery">
        {% for item in gallery_items %}
        <div class="gallery-item">
          <a href="{{ item.image_url }}" target="_blank">
            <img src="{{ item.image_url }}" alt="{{ item.metadata.prompt }}" loading="lazy">
          </a>
          <div class="gallery-info">
            <div class="gallery-prompt">{{ item.metadata.prompt }}</div>
            <div class="gallery-meta">
              <span>{{ item.metadata.width }}×{{ item.metadata.height }}</span>
              <span>{{ item.metadata.num_inference_steps }} steps</span>
              <span>{{ item.metadata.format }}</span>
            </div>
            <div style="margin-top: 0.5rem;">
              <a href="{{ item.image_url }}" class="gallery-link" download>Download</a>
            </div>
          </div>
        </div>
        {% endfor %}
      </div>
    </div>
    """
    
    
    def build_output_filename(prompt: str, extension: str) -> Path:
        """Create a timestamped filename based on the prompt."""
        timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        prompt_slug = sanitize_filename(prompt) or "image"
        return Path("output") / f"{timestamp}_{prompt_slug}.{extension}"
    
    
    def get_gallery_items(limit=50):
        """Get recent generated images with their metadata."""
        output_dir = Path("output")
        if not output_dir.exists():
            return []
        
        items = []
        # Get all JSON metadata files
        json_files = sorted(output_dir.glob("*.json"), key=lambda p: p.stat().st_mtime, reverse=True)
        
        for json_file in json_files[:limit]:
            try:
                with open(json_file, 'r') as f:
                    metadata = json.load(f)
                
                # Check if the corresponding image exists
                image_file = json_file.with_suffix(f".{metadata.get('format', 'jpg')}")
                if image_file.exists():
                    items.append({
                        'metadata': metadata,
                        'image_url': url_for('serve_image', filename=image_file.name),
                        'json_url': url_for('serve_image', filename=json_file.name)
                    })
            except Exception as e:
                print(f"Error loading {json_file}: {e}")
                continue
        
        return items
    
    
    @app.route("/", methods=["GET", "POST"])
    def index():
        message = None
        image_url = None
        error = None
        metadata = None
        prompt = request.form.get("prompt", "")
        width = request.form.get("width", "1344")
        height = request.form.get("height", "768")
        steps = request.form.get("steps", "4")
        seed_str = request.form.get("seed", "")
        seed = int(seed_str) if seed_str else None
        format_choice = request.form.get("format", "jpg")
    
        if request.method == "POST":
            if not prompt.strip():
                error = "Prompt must not be empty."
            else:
                try:
                    output_path = build_output_filename(prompt, format_choice)
                    output_path.parent.mkdir(exist_ok=True)
                    generate_image(
                        prompt=prompt,
                        output_file=output_path,
                        width=int(width),
                        height=int(height),
                        format=format_choice,
                        num_inference_steps=int(steps),
                        seed=seed,
                    )
                    image_url = url_for("serve_image", filename=output_path.name)
                    message = f"Saved to {output_path}"
                    
                    # Load metadata
                    metadata_path = output_path.with_suffix('.json')
                    if metadata_path.exists():
                        with open(metadata_path, 'r') as f:
                            metadata = json.load(f)
                            
                except Exception as exc:
                    error = f"Could not generate image: {exc}"
    
        # Get gallery items
        gallery_items = get_gallery_items()
    
        return render_template_string(
            HTML_TEMPLATE,
            prompt=prompt,
            width=width,
            height=height,
            steps=steps,
            seed=seed,
            format=format_choice,
            message=message,
            image_url=image_url,
            metadata=metadata,
            error=error,
            gallery_items=gallery_items,
        )
    
    
    @app.route("/output/<path:filename>")
    def serve_image(filename):
        return send_from_directory("output", filename)
    
    
    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000, debug=True)