feat(skills): 添加 minimax-music-gen Skill
This commit is contained in:
@@ -0,0 +1,404 @@
|
|||||||
|
---
|
||||||
|
name: minimax-music-gen
|
||||||
|
description: >
|
||||||
|
Use when user wants to generate music, songs, or audio tracks. Triggers on any request
|
||||||
|
involving music creation, song writing, lyrics generation, audio production, or covers.
|
||||||
|
Also triggers when user provides lyrics and wants them turned into a song, or describes
|
||||||
|
a mood/scene and wants background music. Supports multilingual triggers — match equivalent
|
||||||
|
phrases in any language. Do NOT use for music playback of existing files, music theory
|
||||||
|
questions, or music recommendation without generation.
|
||||||
|
license: MIT
|
||||||
|
metadata:
|
||||||
|
version: "1.1"
|
||||||
|
category: creative
|
||||||
|
---
|
||||||
|
|
||||||
|
# MiniMax Music Generation Skill
|
||||||
|
|
||||||
|
Generate songs (vocal or instrumental) using the MiniMax Music API. Supports two creation
|
||||||
|
modes: **Basic** (one-sentence-in, song-out) and **Advanced Control** (edit lyrics, refine
|
||||||
|
prompt, plan before generating).
|
||||||
|
|
||||||
|
## Prerequisites
|
||||||
|
|
||||||
|
- **mmx CLI** (required): Music generation uses the `mmx` command-line tool.
|
||||||
|
|
||||||
|
**Check if installed:**
|
||||||
|
```bash
|
||||||
|
command -v mmx && mmx --version || echo "mmx not found"
|
||||||
|
```
|
||||||
|
|
||||||
|
**Install (requires Node.js):**
|
||||||
|
```bash
|
||||||
|
npm install -g mmx-cli
|
||||||
|
```
|
||||||
|
|
||||||
|
**Authenticate (first time only):**
|
||||||
|
```bash
|
||||||
|
mmx auth login --api-key <your-minimax-api-key>
|
||||||
|
```
|
||||||
|
The API key can be obtained from [MiniMax Platform](https://platform.minimaxi.com/).
|
||||||
|
Credentials are saved to `~/.mmx/credentials.json` and persist across sessions.
|
||||||
|
|
||||||
|
**Verify:**
|
||||||
|
```bash
|
||||||
|
mmx quota show
|
||||||
|
```
|
||||||
|
|
||||||
|
- **Audio player** (recommended): `mpv`, `ffplay`, or `afplay` (macOS built-in) for local
|
||||||
|
playback. `mpv` is preferred for its interactive controls.
|
||||||
|
|
||||||
|
## CLI Tool
|
||||||
|
|
||||||
|
This skill uses the `mmx` CLI for all music generation:
|
||||||
|
|
||||||
|
- **Music Generation**: `mmx music generate` — model: `music-2.6-free`
|
||||||
|
- Supports `--lyrics-optimizer` to auto-generate lyrics from prompt
|
||||||
|
- Supports `--instrumental` for instrumental tracks
|
||||||
|
- Supports `--lyrics` for user-provided lyrics
|
||||||
|
- Structured params: `--genre`, `--mood`, `--vocals`, `--instruments`, `--bpm`, `--key`, `--tempo`, `--structure`, `--references`
|
||||||
|
|
||||||
|
- **Cover**: `mmx music cover` — model: `music-cover-free`
|
||||||
|
- Takes reference audio via `--audio-file <path>` or `--audio <url>`
|
||||||
|
- `--prompt` describes the target cover style
|
||||||
|
|
||||||
|
**Agent flags**: Always add `--quiet --non-interactive` when calling mmx from agents.
|
||||||
|
|
||||||
|
**Pipeline**:
|
||||||
|
- Vocal: `User description -> mmx music generate --lyrics-optimizer -> MP3`
|
||||||
|
- Instrumental: `User description -> mmx music generate --instrumental -> MP3`
|
||||||
|
- Cover: `Source audio + style -> mmx music cover -> MP3`
|
||||||
|
|
||||||
|
## Storage
|
||||||
|
|
||||||
|
All generated music is saved to `~/Music/minimax-gen/`. Create the directory if it doesn't
|
||||||
|
exist. Files are named with a timestamp and a short slug derived from the prompt:
|
||||||
|
`YYYYMMDD_HHMMSS_<slug>.mp3`
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Language & Interaction
|
||||||
|
|
||||||
|
Detect the user's language from their first message and respond in that language for the
|
||||||
|
entire session. This applies to all interaction text, questions, confirmations, and feedback
|
||||||
|
prompts.
|
||||||
|
|
||||||
|
**User-facing text localization rule**:
|
||||||
|
- ALL text shown to the user — including preview labels, field names, confirmations, status
|
||||||
|
messages, playback info, feedback prompts, **and the prompt/description preview** — MUST
|
||||||
|
be fully translated into the user's language.
|
||||||
|
- The **API prompt** sent to the model should always be written in English for best
|
||||||
|
generation quality. However, when previewing the prompt to the user, show a localized
|
||||||
|
description in the user's language instead of the raw English prompt. The English prompt
|
||||||
|
is an internal implementation detail — the user does not need to see it.
|
||||||
|
- The templates below are written in English as reference. At runtime, translate every label
|
||||||
|
and message into the user's detected language.
|
||||||
|
|
||||||
|
**Lyrics language rule**:
|
||||||
|
- Default lyrics language = the user's language. A Chinese-speaking user gets Chinese lyrics;
|
||||||
|
an English-speaking user gets English lyrics.
|
||||||
|
- Only generate lyrics in a different language if the user **explicitly** requests it.
|
||||||
|
- When a different lyrics language is needed, embed it naturally into the vocal or genre
|
||||||
|
description in the prompt. For example, instead of appending "with Korean lyrics", use
|
||||||
|
"featuring a Korean female vocalist" or specify a genre that implies the language (e.g.,
|
||||||
|
"K-pop", "J-rock", "Mandopop", "Latin pop").
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Workflow
|
||||||
|
|
||||||
|
### Step 0: Detect Intent
|
||||||
|
|
||||||
|
Parse the user's message to determine:
|
||||||
|
|
||||||
|
1. **Song category**: vocal (with lyrics), instrumental (no vocals), or cover
|
||||||
|
2. **Creation mode preference**: did they provide detailed requirements (Advanced) or a
|
||||||
|
casual one-liner (Basic)?
|
||||||
|
|
||||||
|
If ambiguous, ask using this decision tree:
|
||||||
|
|
||||||
|
```
|
||||||
|
Q1: What type of music?
|
||||||
|
- Vocal (with lyrics)
|
||||||
|
- Instrumental (no vocals)
|
||||||
|
- Cover
|
||||||
|
|
||||||
|
Q2: Creation mode?
|
||||||
|
- Basic — one-line description, auto-generate
|
||||||
|
- Advanced — edit lyrics, refine prompt, plan
|
||||||
|
```
|
||||||
|
|
||||||
|
If the user gives a clear one-liner like "make me a sad piano piece", skip the questions —
|
||||||
|
infer instrumental + basic mode and proceed.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Step 1: Basic Mode
|
||||||
|
|
||||||
|
**Goal**: User provides a short description, the skill auto-generates everything, then calls
|
||||||
|
the API.
|
||||||
|
|
||||||
|
1. **Expand the description into a prompt**: Take the user's one-liner and expand it into a
|
||||||
|
rich music prompt. Refer to the **Prompt Writing Guide** appendix at the end of this
|
||||||
|
document for style vocabulary, genre/instrument references, and prompt structure.
|
||||||
|
**The API prompt should always be written in English** for best generation quality,
|
||||||
|
regardless of the user's language.
|
||||||
|
|
||||||
|
Follow this pattern:
|
||||||
|
```
|
||||||
|
A [mood] [BPM optional] [genre] song, featuring [vocal description],
|
||||||
|
about [narrative/theme], [atmosphere], [key instruments and production].
|
||||||
|
```
|
||||||
|
|
||||||
|
2. **Show the user a preview** before generating. Translate all labels AND the prompt
|
||||||
|
description into the user's language. The English prompt is only used internally when
|
||||||
|
calling the API — the user should never see it. Example template (English reference —
|
||||||
|
localize everything at runtime):
|
||||||
|
|
||||||
|
```
|
||||||
|
About to generate:
|
||||||
|
Type: Vocal / Instrumental
|
||||||
|
Description: indie folk, melancholy, acoustic guitar, gentle female voice
|
||||||
|
Lyrics: Auto-generated (--lyrics-optimizer)
|
||||||
|
|
||||||
|
Confirm? (press enter to confirm, or tell me what to change)
|
||||||
|
```
|
||||||
|
|
||||||
|
3. **Call mmx**: Generate the music directly.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Step 2: Advanced Control Mode
|
||||||
|
|
||||||
|
**Goal**: User has full control over every parameter before generation.
|
||||||
|
|
||||||
|
1. **Lyrics phase**:
|
||||||
|
- If user provided lyrics: display them formatted with section markers, ask for edits.
|
||||||
|
The final lyrics will be passed via `--lyrics` to mmx.
|
||||||
|
- If user has a theme but no lyrics: will use `--lyrics-optimizer` to auto-generate.
|
||||||
|
- Support iterative editing: "change the second chorus" -> only rewrite that section.
|
||||||
|
- User can also write lyrics themselves and pass via `--lyrics`.
|
||||||
|
|
||||||
|
2. **Prompt phase**:
|
||||||
|
- Generate a recommended prompt based on the lyrics' mood and content.
|
||||||
|
- Present it as editable tags the user can add/remove/modify.
|
||||||
|
- Refer to the **Prompt Writing Guide** appendix for the full vocabulary.
|
||||||
|
|
||||||
|
3. **Advanced planning** (optional, offer but don't force):
|
||||||
|
- Song structure: verse-chorus-verse-chorus-bridge-chorus or custom
|
||||||
|
- BPM suggestion (encode in prompt as tempo descriptor)
|
||||||
|
- Reference style: "something like X style" -> map to prompt tags
|
||||||
|
- Vocal character description
|
||||||
|
|
||||||
|
4. **Final confirmation**: Show complete parameter summary, then generate.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Step 3: Call mmx
|
||||||
|
|
||||||
|
Generate music using the mmx CLI:
|
||||||
|
|
||||||
|
**Vocal with auto-generated lyrics:**
|
||||||
|
```bash
|
||||||
|
mmx music generate \
|
||||||
|
--prompt "<prompt>" \
|
||||||
|
--lyrics-optimizer \
|
||||||
|
--genre "<genre>" --mood "<mood>" --vocals "<vocal style>" \
|
||||||
|
--instruments "<instruments>" --bpm <bpm> \
|
||||||
|
--out ~/Music/minimax-gen/<filename>.mp3 \
|
||||||
|
--quiet --non-interactive
|
||||||
|
```
|
||||||
|
|
||||||
|
**Vocal with user-provided lyrics:**
|
||||||
|
```bash
|
||||||
|
mmx music generate \
|
||||||
|
--prompt "<prompt>" \
|
||||||
|
--lyrics "<lyrics with section markers>" \
|
||||||
|
--genre "<genre>" --mood "<mood>" --vocals "<vocal style>" \
|
||||||
|
--out ~/Music/minimax-gen/<filename>.mp3 \
|
||||||
|
--quiet --non-interactive
|
||||||
|
```
|
||||||
|
|
||||||
|
**Instrumental (no vocal):**
|
||||||
|
```bash
|
||||||
|
mmx music generate \
|
||||||
|
--prompt "<prompt>" \
|
||||||
|
--instrumental \
|
||||||
|
--genre "<genre>" --mood "<mood>" --instruments "<instruments>" \
|
||||||
|
--out ~/Music/minimax-gen/<filename>.mp3 \
|
||||||
|
--quiet --non-interactive
|
||||||
|
```
|
||||||
|
|
||||||
|
Use structured flags (`--genre`, `--mood`, `--vocals`, `--instruments`, `--bpm`, `--key`,
|
||||||
|
`--tempo`, `--structure`, `--references`, `--avoid`, `--use-case`) to give the API
|
||||||
|
fine-grained control instead of cramming everything into `--prompt`.
|
||||||
|
|
||||||
|
Display a progress indicator while waiting. Typical generation takes 30-120 seconds.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Step 4: Playback
|
||||||
|
|
||||||
|
After generation, detect an available audio player and play the file.
|
||||||
|
|
||||||
|
**Detect player:**
|
||||||
|
```bash
|
||||||
|
command -v mpv || command -v ffplay || command -v afplay
|
||||||
|
```
|
||||||
|
|
||||||
|
**Play based on detected player (in priority order):**
|
||||||
|
|
||||||
|
| Player | Command | Controls |
|
||||||
|
|--------|---------|----------|
|
||||||
|
| `mpv` (preferred) | `mpv --no-video ~/Music/minimax-gen/<filename>.mp3` | space = pause/resume, q = quit, left/right = seek |
|
||||||
|
| `ffplay` | `ffplay -nodisp -autoexit ~/Music/minimax-gen/<filename>.mp3` | q = quit |
|
||||||
|
| `afplay` (macOS) | `afplay ~/Music/minimax-gen/<filename>.mp3` | Ctrl+C = stop |
|
||||||
|
| None found | Do not attempt playback | Show file path only |
|
||||||
|
|
||||||
|
After starting playback, tell the user (localize all text):
|
||||||
|
|
||||||
|
```
|
||||||
|
Now playing: <filename>.mp3
|
||||||
|
Saved to: ~/Music/minimax-gen/<filename>.mp3
|
||||||
|
```
|
||||||
|
|
||||||
|
Do NOT show playback controls (e.g. keyboard shortcuts) — they don't work in this
|
||||||
|
environment since the player runs in the background.
|
||||||
|
|
||||||
|
If no player is found (localize all text):
|
||||||
|
|
||||||
|
```
|
||||||
|
No audio player detected.
|
||||||
|
File saved to: ~/Music/minimax-gen/<filename>.mp3
|
||||||
|
Tip: Install mpv for the best playback experience (brew install mpv).
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
### Step 5: Feedback & Iteration
|
||||||
|
|
||||||
|
After playback, ask for feedback:
|
||||||
|
|
||||||
|
```
|
||||||
|
How was this song?
|
||||||
|
1. Love it, keep it!
|
||||||
|
2. Not quite, adjust and regenerate
|
||||||
|
3. Fine-tune lyrics/style then regenerate
|
||||||
|
4. Don't want it, start over
|
||||||
|
```
|
||||||
|
|
||||||
|
Based on feedback:
|
||||||
|
- **Satisfied**: Done. Mention the file path again.
|
||||||
|
- **Adjust & regenerate**: Ask what to change (prompt? lyrics? style?), apply edits,
|
||||||
|
re-run generation. Keep the old file with a `_v1` suffix for comparison.
|
||||||
|
- **Fine-tune**: Enter Advanced Control Mode with the current parameters pre-filled.
|
||||||
|
- **Delete & restart**: Remove the file, go back to Step 0.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Cover Mode
|
||||||
|
|
||||||
|
Generate a cover version of a song based on reference audio. Model: `music-cover-free`.
|
||||||
|
|
||||||
|
**Reference audio requirements**: mp3, wav, flac — duration 6s to 6min, max 50MB.
|
||||||
|
If no lyrics are provided, the original lyrics are extracted via ASR automatically.
|
||||||
|
|
||||||
|
### Workflow
|
||||||
|
|
||||||
|
When the user selects Cover mode:
|
||||||
|
1. Ask for the source audio — a local file path or URL
|
||||||
|
2. Ask for the target cover style (e.g., "acoustic cover, stripped-down, intimate vocal")
|
||||||
|
3. Optionally ask for custom lyrics or lyrics file
|
||||||
|
|
||||||
|
### Commands
|
||||||
|
|
||||||
|
**Cover from local file:**
|
||||||
|
```bash
|
||||||
|
mmx music cover \
|
||||||
|
--prompt "<cover style description>" \
|
||||||
|
--audio-file <source.mp3> \
|
||||||
|
--out ~/Music/minimax-gen/<filename>.mp3 \
|
||||||
|
--quiet --non-interactive
|
||||||
|
```
|
||||||
|
|
||||||
|
**Cover from URL:**
|
||||||
|
```bash
|
||||||
|
mmx music cover \
|
||||||
|
--prompt "<cover style description>" \
|
||||||
|
--audio <source_url> \
|
||||||
|
--out ~/Music/minimax-gen/<filename>.mp3 \
|
||||||
|
--quiet --non-interactive
|
||||||
|
```
|
||||||
|
|
||||||
|
**With custom lyrics (text):**
|
||||||
|
```bash
|
||||||
|
mmx music cover \
|
||||||
|
--prompt "<style>" \
|
||||||
|
--audio-file <source.mp3> \
|
||||||
|
--lyrics "<custom lyrics>" \
|
||||||
|
--out ~/Music/minimax-gen/<filename>.mp3 \
|
||||||
|
--quiet --non-interactive
|
||||||
|
```
|
||||||
|
|
||||||
|
**With custom lyrics (file):**
|
||||||
|
```bash
|
||||||
|
mmx music cover \
|
||||||
|
--prompt "<style>" \
|
||||||
|
--audio-file <source.mp3> \
|
||||||
|
--lyrics-file <lyrics.txt> \
|
||||||
|
--out ~/Music/minimax-gen/<filename>.mp3 \
|
||||||
|
--quiet --non-interactive
|
||||||
|
```
|
||||||
|
|
||||||
|
### Optional flags
|
||||||
|
|
||||||
|
| Flag | Description |
|
||||||
|
|------|-------------|
|
||||||
|
| `--seed <number>` | Random seed 0-1000000 for reproducible results |
|
||||||
|
| `--channel <n>` | `1` (mono) or `2` (stereo, default) |
|
||||||
|
| `--format <fmt>` | `mp3` (default), `wav`, `pcm` |
|
||||||
|
| `--sample-rate <hz>` | Sample rate (default: 44100) |
|
||||||
|
| `--bitrate <bps>` | Bitrate (default: 256000) |
|
||||||
|
|
||||||
|
### After generation
|
||||||
|
Proceed with normal playback and feedback flow (Step 4 & 5).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Error Handling
|
||||||
|
|
||||||
|
| Error | Action |
|
||||||
|
|-------|--------|
|
||||||
|
| mmx not found | `npm install -g mmx-cli` |
|
||||||
|
| mmx auth error (exit code 3) | `mmx auth login` |
|
||||||
|
| Quota exceeded (exit code 4) | Report quota limit, suggest waiting or upgrading |
|
||||||
|
| API timeout (exit code 5) | Retry once, then report failure |
|
||||||
|
| Content filter (exit code 10) | Adjust prompt to avoid filtered content |
|
||||||
|
| Invalid lyrics format | Auto-fix section markers, warn user |
|
||||||
|
| No audio player found | Save file and tell user the path, suggest installing mpv |
|
||||||
|
| Network error | Show error detail, suggest checking connection |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Important Notes
|
||||||
|
|
||||||
|
- **Never reproduce copyrighted lyrics.** When doing covers, always write original lyrics
|
||||||
|
inspired by the song's theme. Explain this to the user.
|
||||||
|
- **Prompt language**: The API prompt works best with English tags. Chinese tags are also
|
||||||
|
acceptable. Mixing is OK.
|
||||||
|
- **Section markers in lyrics**: The API recognizes `[verse]`, `[chorus]`, `[bridge]`,
|
||||||
|
`[outro]`, `[intro]`. Always include them when providing `--lyrics`.
|
||||||
|
- **File management**: If `~/Music/minimax-gen/` has more than 50 files, suggest cleanup
|
||||||
|
when starting a new session.
|
||||||
|
- **Structured params**: Prefer using `--genre`, `--mood`, `--vocals`, `--instruments`,
|
||||||
|
`--bpm` etc. over embedding everything in `--prompt`. This gives the API better control.
|
||||||
|
- **Lyrics language via style**: When the user wants lyrics in a specific language, express
|
||||||
|
it through the vocal description or genre (e.g., "Japanese female vocalist", "Mandopop
|
||||||
|
ballad") rather than appending a language directive to the prompt.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Appendix: Prompt Writing Guide
|
||||||
|
|
||||||
|
See [references/prompt_guide.md](references/prompt_guide.md) for the complete prompt writing guide,
|
||||||
|
including genre/vocal/instrument references and BPM tables.
|
||||||
@@ -0,0 +1,133 @@
|
|||||||
|
# Prompt Writing Guide
|
||||||
|
|
||||||
|
This reference helps construct high-quality music generation prompts.
|
||||||
|
|
||||||
|
## Core Principle
|
||||||
|
|
||||||
|
**Write prompts as vivid English sentences, not comma-separated tags.**
|
||||||
|
|
||||||
|
The API responds best to descriptive, narrative-style prompts that paint a complete picture
|
||||||
|
of the song. Each prompt should read like a creative brief for a musician.
|
||||||
|
|
||||||
|
## Prompt Structure
|
||||||
|
|
||||||
|
A complete prompt follows this sentence pattern:
|
||||||
|
|
||||||
|
```
|
||||||
|
A [mood/emotion] [BPM optional] [genre + sub-genre] [song/piece/track].
|
||||||
|
[Vocal description OR "Instrumental with..." description].
|
||||||
|
[Narrative/theme — what the song is about].
|
||||||
|
[Atmosphere/scene details].
|
||||||
|
[Key instruments and production elements].
|
||||||
|
```
|
||||||
|
|
||||||
|
**Vocal Track Example:**
|
||||||
|
```
|
||||||
|
A melancholic yet defiant Pop-House song, featuring emotional vocals, about
|
||||||
|
lighting a torch in the cold dark night as a form of romantic rebellion,
|
||||||
|
energetic rhythm with synth elements.
|
||||||
|
```
|
||||||
|
|
||||||
|
**Instrumental Example:**
|
||||||
|
```
|
||||||
|
A warm and uplifting 100 BPM indie folk instrumental piece, evoking a sunny
|
||||||
|
afternoon stroll through a small town market, featuring bright acoustic guitar
|
||||||
|
fingerpicking, gentle ukulele strums, light hand claps, and a whistled melody
|
||||||
|
that feels like pure contentment.
|
||||||
|
```
|
||||||
|
|
||||||
|
## How to Build a Prompt Step by Step
|
||||||
|
|
||||||
|
**1. Open with mood + genre (required)**
|
||||||
|
|
||||||
|
| Pattern | Example |
|
||||||
|
|---------|---------|
|
||||||
|
| Single mood | "A melancholic R&B song" |
|
||||||
|
| Contrasting moods | "A melancholic yet defiant Pop-House song" |
|
||||||
|
| With BPM | "A smoky 74 BPM Neo-Soul fusion" |
|
||||||
|
| With era/region | "A laid-back 90 BPM Island Reggae" |
|
||||||
|
| Genre blend | "An Avant-Garde Jazz and Neo-Soul fusion" |
|
||||||
|
|
||||||
|
**2. Describe the vocals (for vocal tracks)**
|
||||||
|
|
||||||
|
Good vocal descriptions:
|
||||||
|
- "featuring smooth emotional vocals"
|
||||||
|
- "Vocals: Ultra-low, gravelly baritone with authentic phrasing"
|
||||||
|
- "Vocals: Sultry, sophisticated male baritone with smooth jazz inflections and breathy delivery"
|
||||||
|
- "Vocals: Ethereal, crystal-clear Enya-style vocals with lush reverb"
|
||||||
|
- "Vocals: Relaxed, soul-flavored vocals with ad-libs and melodic scats"
|
||||||
|
|
||||||
|
Bad (too vague): "female vocal"
|
||||||
|
|
||||||
|
**3. Add narrative/theme (recommended)**
|
||||||
|
|
||||||
|
- "about lighting a torch in the cold dark night as a form of romantic rebellion"
|
||||||
|
- "about letting go of perfectionism and embracing your true self like flowing water"
|
||||||
|
|
||||||
|
For instrumentals, describe the scene: "evoking a sunrise drive along a coastal highway"
|
||||||
|
|
||||||
|
**4. Set the mood/atmosphere (recommended)**
|
||||||
|
|
||||||
|
- "bittersweet but healing mood"
|
||||||
|
- "empowering and self-loving mood"
|
||||||
|
|
||||||
|
**5. Specify production elements (recommended)**
|
||||||
|
|
||||||
|
- "mellow beats with lo-fi elements"
|
||||||
|
- "featuring a warm fretless bassline, shimmering Rhodes piano, and brushed jazz drums"
|
||||||
|
|
||||||
|
## Genre Reference
|
||||||
|
|
||||||
|
| Category | Genres |
|
||||||
|
|----------|--------|
|
||||||
|
| Pop & Dance | Pop, Dance Pop, Electropop, Synth-pop, Dream Pop, K-pop, J-pop, C-pop, City Pop, House, Future Bass, EDM |
|
||||||
|
| Rock & Alt | Rock, Indie Rock, Pop Rock, Post-Rock, Shoegaze, Punk, Metal, Alternative |
|
||||||
|
| R&B/Soul/Funk | R&B, Neo-Soul, Contemporary R&B, Funk, Gospel, Soul |
|
||||||
|
| Hip-Hop | Hip-Hop, Trap, Boom Bap, Lo-fi Hip-Hop, Cloud Rap, Drill, Afrobeats |
|
||||||
|
| Electronic | Ambient, Techno, Drum and Bass, Chillwave, Vaporwave, Amapiano |
|
||||||
|
| Folk/Acoustic | Folk, Indie Folk, Country, Chinese Traditional, Celtic Folk |
|
||||||
|
| Jazz/Blues | Jazz, Smooth Jazz, Jazz Fusion, Bossa Nova, Blues, Avant-Garde Jazz |
|
||||||
|
| Classical | Classical, Orchestral, Cinematic, Film Score, Epic, Neoclassical, Piano Solo |
|
||||||
|
| World | Reggae, Latin, Waltz, Tango, Flamenco |
|
||||||
|
|
||||||
|
## Vocal Style Reference
|
||||||
|
|
||||||
|
| Style | Prompt phrase |
|
||||||
|
|-------|--------------|
|
||||||
|
| Smooth & emotional | "smooth emotional vocals" |
|
||||||
|
| Raw & unpolished | "raw, unpolished vocals shifting between whispers and screams" |
|
||||||
|
| Breathy & intimate | "breathy delivery with intimate phrasing" |
|
||||||
|
| Powerful & soulful | "powerful soulful vocals with gospel inflections" |
|
||||||
|
| Sultry & sophisticated | "sultry, sophisticated baritone with jazz inflections" |
|
||||||
|
| Ethereal & clear | "ethereal, crystal-clear vocals with lush reverb" |
|
||||||
|
| Aggressive & intense | "aggressive vocal delivery with rhythmic intensity" |
|
||||||
|
|
||||||
|
## Instrument & Production Reference
|
||||||
|
|
||||||
|
| Category | Examples |
|
||||||
|
|----------|---------|
|
||||||
|
| Strings & Guitar | acoustic guitar fingerpicking, electric guitar riffs, fretless bass, violin, cello, erhu, guzheng, pipa |
|
||||||
|
| Keys & Synth | piano, Rhodes piano, synth pad, synth lead, arpeggiator, music box, organ |
|
||||||
|
| Drums & Percussion | brushed jazz drums, electronic drums, 808 hi-hats, trap percussion, cajon, bongos |
|
||||||
|
| Wind & Brass | saxophone, trumpet, flute, harmonica, bamboo flute, xiao |
|
||||||
|
| Texture & Effects | vinyl crackle, tape hiss, ambient pads, glitch elements, rain sounds |
|
||||||
|
|
||||||
|
## BPM Reference
|
||||||
|
|
||||||
|
| Feel | BPM | Use in prompt |
|
||||||
|
|------|-----|---------------|
|
||||||
|
| Very slow, meditative | 40-60 | "a meditative 50 BPM..." |
|
||||||
|
| Slow ballad | 60-80 | "a slow 70 BPM ballad..." |
|
||||||
|
| Mid-tempo groove | 80-110 | "a groovy 95 BPM..." |
|
||||||
|
| Upbeat, energetic | 110-130 | "an upbeat 120 BPM..." |
|
||||||
|
| Fast, driving | 130-160 | "a driving 140 BPM..." |
|
||||||
|
|
||||||
|
## Tips for High-Quality Prompts
|
||||||
|
|
||||||
|
1. **Write sentences, not tag lists**: "A melancholic R&B song about..." beats "R&B, sad, slow, piano".
|
||||||
|
2. **Be vivid and specific**: "salvaging memory fragments in space-time" > "sad memories".
|
||||||
|
3. **Describe vocals as a character**: "Sultry baritone with jazz inflections" not "male vocal".
|
||||||
|
4. **Include a scene or vibe**: "A high-end rooftop lounge at night" gives the model a coherent world.
|
||||||
|
5. **Mix detail levels**: Specify 2-3 key instruments precisely, leave the rest to the model.
|
||||||
|
6. **English prompts work best**: Chinese scene descriptions can be mixed in for flavor.
|
||||||
|
7. **For instrumentals**: Replace vocal descriptions with instrument focus and scene narrative.
|
||||||
Reference in New Issue
Block a user