This project is a recreation of the original Flash-based `ulyd-v3.swf` interactive sound art piece from 2005, rebuilt with modern web technologies (p5.js). It has also been expanded with spatial audio: pitch and panning change based on where samples are placed.
## About the Project
Ulyd v3 is an interactive sound sequencer. It features a vertical scanning line that triggers sounds when it passes over objects placed in the main area.
### Features
- **Scanner**: A white line scans down the screen, triggering sounds.
- **Pool**: A collection of sound objects on the right side.
- **Drag & Drop**: Users can drag objects from the pool into the scanner area.
- **Spatial Audio**: Sounds are modulated based on their horizontal position:
  - **Left**: Lower pitch, panned left.
  - **Center**: Normal pitch, centered.
  - **Right**: Higher pitch, panned right.
- **Reactive Backgrounds**: The background atmosphere shifts and fades randomly when sounds are triggered.
- **Sound Groups**:
  - **Yellow Icons**: Bright ping sounds.
  - **Silver Icons**: Glass/sonar sounds.
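The left-to-right modulation can be sketched as a linear mapping from horizontal position to playback rate and stereo pan. This is a hypothetical illustration of the scheme (shown in Python for brevity, not the actual p5.js code; the exact rate range is an assumption):

```python
def spatial_params(x, width, min_rate=0.5, max_rate=1.5):
    """Map a horizontal position to (playback_rate, pan).

    x:     the object's x coordinate in the scanner area
    width: the width of the scanner area
    Returns a rate (lower = deeper pitch) and a pan in [-1, 1].
    """
    t = max(0.0, min(1.0, x / width))            # normalized position, 0 = left edge
    rate = min_rate + t * (max_rate - min_rate)  # left: lower pitch, right: higher
    pan = 2.0 * t - 1.0                          # -1 = hard left, +1 = hard right
    return rate, pan
```

With this mapping an object at the left edge plays at half speed panned hard left, one at the center plays at normal pitch dead center, and one at the right edge plays faster and panned right.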
## Technical Details
- **Library**: [p5.js](https://p5js.org/) for graphics and interaction.
- **Audio**: [p5.sound](https://p5js.org/reference/#/libraries/p5.sound) for audio playback and effects.
- **Compatibility**: Works in all modern browsers (Chrome, Safari, Firefox). Includes fixes for Safari's strict audio autoplay policies.
## How to Embed
1. Upload the contents of this folder to your web server.
2. Use an `<iframe>` to embed `index.html` on your page.
3. Ensure the `allow="autoplay"` attribute is set on the iframe.
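For example (the `src` URL and dimensions are placeholders to adapt to your setup):

```html
<iframe src="https://example.com/ulyd/index.html"
        width="800" height="600"
        allow="autoplay"
        frameborder="0"></iframe>
```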

---

This project provides a Python script to generate images from text prompts using the Hugging Face Inference API. It uses the modern `black-forest-labs/FLUX.1-schnell` model for high-quality, fast image generation without requiring a local GPU.

A possible improvement: swap in another fast model, perhaps a fine-tuned variant of FLUX.1-schnell that follows prompts more closely (see the Reddit comparison of FLUX variants posted by the very helpful MushroomCharacter411).
## Features
- **Serverless Inference**: Uses Hugging Face's Inference API, so no heavy local hardware is required.
- **High Quality**: Defaults to the `black-forest-labs/FLUX.1-schnell` model.
- **Configurable Resolution**: Defaults to 1344×768 (landscape), customizable via command-line arguments.
- **Secure**: Uses environment variables for API token management.
## Prerequisites
- Python 3.x
- A Hugging Face account and an API token with Read access.
## Setup
1. **Environment setup**: The project is designed to run in a virtual environment:
   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```
2. **Configuration**: Create a `.env` file in the project root and add your Hugging Face token:
   ```
   HF_TOKEN=hf_your_token_here
   ```
## Usage
### Basic Usage
Generate an image with the default settings (1344×768). The image is automatically saved to the `output/` folder with a timestamp and a sanitized prompt as the filename:
```bash
./venv/bin/python generate_image.py "A futuristic city skyline at sunset"
# Output: output/2025-11-24_12-10-53_A_futuristic_city_skyline_at_sunset.png
```
### Custom Output Filename
Specify where to save the generated image:
```bash
./venv/bin/python generate_image.py "A cute robot cat" --output robot_cat.png
```
### Custom Resolution
Override the default resolution. Note that FLUX.1-schnell works best with dimensions that are multiples of 32:
```bash
./venv/bin/python generate_image.py "A tall cyberpunk tower" --width 768 --height 1344
```
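If you want to guard against off-grid sizes, a small helper (hypothetical, not part of the script) can snap a dimension to the nearest multiple of 32:

```python
def snap_to_multiple(value, multiple=32):
    """Round a dimension to the nearest multiple of `multiple` (minimum one multiple)."""
    return max(multiple, round(value / multiple) * multiple)
```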
## Troubleshooting
- **404/410 errors**: These usually mean the model endpoint is temporarily unavailable or has moved. The script currently uses `black-forest-labs/FLUX.1-schnell`, which is supported on the router.
- **Authentication errors**: Ensure your `HF_TOKEN` is correctly set in the `.env` file and has valid permissions.
## Web Interface
1. Install the Flask dependency (with the virtual environment active):
   ```bash
   pip install flask
   ```
2. Run the Flask server (it loads the same `.env` file, so your `HF_TOKEN` is picked up automatically):
   ```bash
   python web_app.py
   ```
3. Open http://localhost:5000 in your browser, enter a prompt, adjust resolution/format if needed, then download the generated image via the link that appears.
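The contents of `web_app.py` are not shown here, but a minimal Flask front-end along these lines would match the described workflow. This is a sketch only: the route, form fields, and placeholder response are assumptions, and a real version would call `generate_image` and return a link to the saved file.

```python
from flask import Flask, request

app = Flask(__name__)

# Bare-bones prompt form; the real app also offers resolution/format controls.
FORM = """
<form method="post">
  <input name="prompt" placeholder="Prompt" required>
  <input name="width" value="1344"> x <input name="height" value="768">
  <button type="submit">Generate</button>
</form>
"""

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        prompt = request.form["prompt"]
        width = int(request.form.get("width", 1344))
        height = int(request.form.get("height", 768))
        # A real version would call generate_image(prompt, width=width, height=height)
        # and respond with a download link for the saved file.
        return f"Would generate {width}x{height} image for: {prompt}"
    return FORM

if __name__ == "__main__":
    app.run(port=5000)
```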
Each image is saved alongside a JSON metadata sidecar, for example:

```json
{
  "prompt": "A man and a robot are painting together in a ruined building, in the style of Caravaggio",
  "width": 1344,
  "height": 768,
  "format": "jpg",
  "num_inference_steps": 6,
  "seed": 3062171345,
  "model": "black-forest-labs/FLUX.1-schnell",
  "timestamp": "2025-11-24T18:29:20.902530",
  "filename": "2025-11-24_18-29-17_A_man_and_a_robot_are_painting_together_in_a_ruine.jpg"
}
```
## generate_image.py

```python
import os
import argparse
import re
import json
import random
from datetime import datetime
from pathlib import Path
from dotenv import load_dotenv
from huggingface_hub import InferenceClient

# Load environment variables
load_dotenv()


def sanitize_filename(text, max_length=50):
    """Convert text to a safe filename."""
    # Remove or replace unsafe characters
    safe = re.sub(r'[^\w\s-]', '', text)
    # Replace spaces with underscores
    safe = re.sub(r'[-\s]+', '_', safe)
    # Truncate to max length
    return safe[:max_length].strip('_')


def generate_image(prompt, output_file=None, width=1344, height=768, format="jpg",
                   num_inference_steps=4, seed=None):
    token = os.getenv("HF_TOKEN")
    if not token:
        raise ValueError("HF_TOKEN not found in environment variables. Please check your .env file.")

    print(f"Generating image for prompt: '{prompt}'")
    print(f"Resolution: {width}x{height}")
    print(f"Inference steps: {num_inference_steps}")

    # Generate random seed if not provided
    if seed is None:
        seed = random.randint(0, 2**32 - 1)
    print(f"Seed: {seed}")

    # Create output directory
    output_dir = Path("output")
    output_dir.mkdir(exist_ok=True)

    # Generate filename if not provided
    if output_file is None:
        timestamp = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        prompt_slug = sanitize_filename(prompt)
        ext = format.lower()
        output_file = output_dir / f"{timestamp}_{prompt_slug}.{ext}"
    else:
        output_file = Path(output_file)

    # Initialize the client
    client = InferenceClient(token=token)
    try:
        # FLUX.1-schnell is a good candidate for a modern default
        model = "black-forest-labs/FLUX.1-schnell"
        print(f"Using model: {model}")
        image = client.text_to_image(
            prompt,
            model=model,
            width=width,
            height=height,
            num_inference_steps=num_inference_steps,
            seed=seed
        )

        # Save image
        if format.lower() == "jpg":
            # Convert RGBA to RGB for JPG (JPG doesn't support transparency)
            if image.mode == "RGBA":
                rgb_image = image.convert("RGB")
                rgb_image.save(output_file, "JPEG", quality=95)
            else:
                image.save(output_file, "JPEG", quality=95)
        else:
            image.save(output_file)
        print(f"Image saved to {output_file}")

        # Save metadata as JSON
        metadata = {
            "prompt": prompt,
            "width": width,
            "height": height,
            "format": format,
            "num_inference_steps": num_inference_steps,
            "seed": seed,
            "model": model,
            "timestamp": datetime.now().isoformat(),
            "filename": str(output_file.name)
        }
        metadata_file = output_file.with_suffix('.json')
        with open(metadata_file, 'w') as f:
            json.dump(metadata, f, indent=2)
        print(f"Metadata saved to {metadata_file}")
    except Exception as e:
        print(f"Error generating image: {repr(e)}")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Generate an image from text using Hugging Face Inference API.")
    parser.add_argument("prompt", type=str, help="The text prompt for image generation")
    parser.add_argument("--output", type=str, default=None,
                        help="Output filename (default: auto-generated with timestamp and prompt)")
    parser.add_argument("--width", type=int, default=1344, help="Image width (default: 1344)")
    parser.add_argument("--height", type=int, default=768, help="Image height (default: 768)")
    parser.add_argument("--format", type=str, default="jpg", choices=["jpg", "png"],
                        help="Output format (default: jpg)")
    parser.add_argument("--steps", type=int, default=4,
                        help="Number of inference steps (default: 4, higher = better quality but slower)")
    parser.add_argument("--seed", type=int, default=None,
                        help="Random seed for reproducibility (default: random)")
    args = parser.parse_args()
    generate_image(args.prompt, args.output, args.width, args.height, args.format, args.steps, args.seed)
```