
[DMP 2026]: Evaluate current state of music gen AI models and ideas to integrate them #6610

@sum2it

Ticket Contents

Description

Music Blocks already has a preliminary AI music generation widget (js/widgets/aiwidget.js) that uses the Groq API with the llama3-8b-8192 LLM to generate ABC notation from a text prompt, which is then parsed and converted into Music Blocks block programs via _parseABC. However, this integration is narrow in scope — it relies on a single general-purpose LLM, has known parsing bugs, only supports monophonic melodies, and does not consider dedicated music generation AI models (e.g., Meta's MusicGen/AudioCraft, Google's MusicLM, Magenta, etc.).
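To make the parsing step concrete, here is a minimal sketch of the kind of ABC-note tokenization `_parseABC` performs. This is an illustrative helper written for this ticket, not the actual Music Blocks implementation; the regex covers only accidentals, note letters, octave marks, and simple durations.

```javascript
// Minimal illustrative ABC melody tokenizer (NOT the real _parseABC).
// Matches an optional accidental (^ _ =), a note letter A-G/a-g,
// optional octave marks (, or '), and an optional duration (e.g. "2", "/2").
function tokenizeAbcMelody(abcBody) {
    const noteRe = /([_^=]?)([A-Ga-g])([,']*)(\d*\/?\d*)/g;
    const notes = [];
    let m;
    while ((m = noteRe.exec(abcBody)) !== null) {
        const [, accidental, letter, octaveMarks, duration] = m;
        notes.push({
            pitch: accidental + letter + octaveMarks,
            duration: duration || "1" // default unit note length
        });
    }
    return notes;
}
```

Even this toy version hints at why the real parser is fragile: chords, repeats, triplets, and multi-voice headers all fall outside a single note-level regex.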

This project aims to conduct a thorough evaluation of the current state of music generation AI models, assess their suitability for integration into Music Blocks, and implement one or more improved integrations that are educationally meaningful, robust, and open-source-friendly.

Goals & Mid-Point Milestone

Goals

  • Survey and document the current landscape of music generation AI models (MusicGen/AudioCraft, MusicLM, Magenta, Jukebox, ABC-specific LLMs, etc.) with respect to: output format, licensing, API availability, latency, and educational suitability
  • Audit and document the limitations of the existing AIWidget implementation (hardcoded API key, monophonic-only output, ABC parsing bugs, single model dependency, no error recovery)
  • Propose and prototype at least one improved or alternative AI integration pathway (e.g., a model that outputs MIDI or MusicXML directly, a locally-runnable model, or a multi-model selector)
  • Implement a more robust ABC/MIDI/MusicXML → Music Blocks block conversion pipeline that handles chords, repeats, and multi-voice output correctly
  • Goals Achieved By Mid-point Milestone: Complete the landscape survey with written documentation, finish the audit of the existing AIWidget, and have a working prototype of at least one improved integration that generates multi-voice or chord-capable output and loads it into Music Blocks blocks
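For reference, multi-voice output in ABC notation (one of the target capabilities above) interleaves `V:` voice headers. A small illustrative two-voice tune, not taken from the codebase, looks like this:

```
X:1
T:Two-voice example
M:4/4
L:1/4
K:C
V:1
C D E F | G A B c |
V:2
E F G A | B c d e |
```

A chord-capable converter must treat each `V:` section as a separate Music Blocks voice rather than concatenating the notes into one melody.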

Setup/Installation

  • Clone the Music Blocks repository: https://github.com/sugarlabs/musicblocks
  • Follow the setup guide in README.md to run Music Blocks locally (requires Node.js and a browser)
  • Review the existing AI widget: js/widgets/aiwidget.js and its block definition in js/blocks/WidgetBlocks.js
  • Obtain API keys for the services under evaluation (Groq, Hugging Face, etc.) for prototyping
  • Relevant libraries already in use: ABCJS (ABC notation parsing/playback), Tone.js (audio synthesis), MIDI.js

Expected Outcome

A well-documented evaluation report of music generation AI models suitable for Music Blocks, accompanied by a working, improved AI music generation widget that:

  • Supports multi-voice and chord generation, not just monophonic melodies
  • Has a clean, user-facing interface with prompt hints and clear error handling
  • Reliably converts AI-generated output (ABC, MIDI, or MusicXML) into Music Blocks block programs
  • Does not expose API keys in client-side code
  • Optionally supports model selection or a fallback chain
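One way to keep the API key out of client-side code is to assemble the Groq request on a small server, reading the key from the environment. The sketch below shows only the request-building half, under the assumption that a server route forwards it with `fetch`; the endpoint and model name come from this ticket, while the function name and shape are illustrative.

```javascript
// Hedged sketch: build the Groq chat-completions request server-side so the
// API key is read from the environment, never hardcoded in client JavaScript.
function buildGroqRequest(prompt) {
    const apiKey = process.env.GROQ_API_KEY;
    if (!apiKey) {
        throw new Error("GROQ_API_KEY is not set in the server environment");
    }
    return {
        url: "https://api.groq.com/openai/v1/chat/completions",
        options: {
            method: "POST",
            headers: {
                "Content-Type": "application/json",
                Authorization: `Bearer ${apiKey}`
            },
            body: JSON.stringify({
                model: "llama3-8b-8192",
                messages: [{ role: "user", content: prompt }]
            })
        }
    };
}
```

A thin Express or Node `http` route could call this and relay the response, so the browser only ever talks to the project's own backend.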

Acceptance Criteria

  • A written evaluation document comparing at least four music generation AI models/APIs along five dimensions: output format, license, latency, educational value, and ease of integration
  • The existing AIWidget bugs (broken repeat handling, missing chord support, leftover console.log debug statements, hardcoded API key) are resolved or replaced
  • The new or improved widget correctly loads generated music as Music Blocks block programs for at least two-voice output
  • The API key is not hardcoded in client-side JavaScript
  • All new code passes existing ESLint/Prettier checks and includes JSDoc comments

Implementation Details

  • Language/Stack: JavaScript (ES6+), HTML/CSS, consistent with the existing Music Blocks codebase
  • Existing entry points:
      • Widget: js/widgets/aiwidget.js (AIWidget constructor, _parseABC, makeCanvas, __save)
      • Block: js/blocks/WidgetBlocks.js (AIMusicBlocks class, line ~1563)
      • ABC utilities: js/abc.js (processABCNotes, saveAbcOutput)
      • Widget registration: js/activity.js
  • Current model in use: llama3-8b-8192 via https://api.groq.com/openai/v1/chat/completions, a general-purpose LLM prompted to output ABC notation
  • Candidate models to evaluate: Meta MusicGen (AudioCraft), Google Magenta (music-transformer, melody-rnn), ABC-specific fine-tuned models on Hugging Face, OpenAI's music-capable models
  • Key technical challenge: most dedicated music generation models output audio (WAV/MP3), not symbolic notation (ABC/MIDI/MusicXML). The project must determine which models emit symbolic formats usable by Music Blocks, or design a transcription step
  • API key security: handle the key server-side or via environment injection; do not hardcode it as env.GROQ_API_KEY in client JS
  • ABC → Blocks parser improvements should address the known issues in _parseABC (multi-staff repeat logic, triplet handling, chord support)
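As a starting point for the chord-support gap, the hypothetical helper below (not the existing `_parseABC`) splits an ABC body into single-note tokens and bracketed chords; in ABC, notes inside square brackets such as `[CEG]` sound simultaneously.

```javascript
// Hypothetical sketch of ABC chord handling: bracketed groups become
// "chord" events with simultaneous pitches; everything else stays a note.
function splitAbcChords(abcBody) {
    // First alternative: a [bracketed] chord; second: a single note token.
    const tokenRe = /\[([^\]]+)\]|([_^=]?[A-Ga-g][,']*\d*)/g;
    const events = [];
    let m;
    while ((m = tokenRe.exec(abcBody)) !== null) {
        if (m[1] !== undefined) {
            // Chord: split the bracketed group into individual pitches.
            events.push({
                type: "chord",
                pitches: m[1].match(/[_^=]?[A-Ga-g][,']*/g)
            });
        } else {
            events.push({ type: "note", token: m[2] });
        }
    }
    return events;
}
```

A real fix would also carry chord durations and interact correctly with repeat and voice handling, but separating simultaneous from sequential events is the core structural change the current parser lacks.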

Mockups/Wireframes

No response

Product Name

Music Blocks

Organisation Name

Sugar Labs

Domain

Education

Tech Skills Needed

JavaScript

Mentor(s)

@sum2it @walterbender @omsuneri

Category

Frontend
