Research

AI Framework Translates 2D Images into G-code for AM

Researchers from Carnegie Mellon University have introduced Image2Gcode, an end-to-end deep learning framework that generates printer-ready G-code directly from 2D images, removing the need for computer-aided design (CAD) models or slicing software. Published on arXiv, the study presents a diffusion-transformer model that converts sketches or photographs into executable additive manufacturing instructions, creating a direct link between visual representation and fabrication.

Conventional additive manufacturing workflows require multi-step processes involving CAD modeling, mesh conversion, and slicing to produce G-code. Each step demands specialized expertise and extensive iteration, which slows design modification and limits accessibility. Image2Gcode replaces this process with a direct visual-to-instruction pathway based on a denoising diffusion probabilistic model (DDPM). The system generates structured extrusion trajectories directly from an image, bypassing intermediate STL and CAD files.
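
At a high level, the DDPM reverse process turns pure Gaussian noise into an ordered trajectory step by step. The following is a minimal, illustrative sketch of that sampling loop in NumPy. The denoiser here is a placeholder (the real model is a 1D U-Net conditioned on image features), and the linear beta schedule, sequence length, and channel layout are assumptions for demonstration, not the paper's exact configuration:

```python
import numpy as np

# Placeholder denoiser: Image2Gcode uses a trained 1D U-Net conditioned
# on DinoV2 image features; here we simply predict zero noise.
def denoiser(x_t, t):
    return np.zeros_like(x_t)

def ddpm_sample(T=500, seq_len=128, channels=3, seed=0):
    """Reverse diffusion: start from Gaussian noise and iteratively
    denoise into a (seq_len, channels) trajectory, e.g. (x, y, extrusion)."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, T)           # illustrative linear schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    x = rng.standard_normal((seq_len, channels))  # pure noise at t = T
    for t in reversed(range(T)):
        eps_hat = denoiser(x, t)                  # predicted noise
        coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps_hat) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise      # stochastic reverse step
    return x

traj = ddpm_sample()
print(traj.shape)  # (128, 3): one denoised trajectory
```

In the actual framework, each reverse step is guided by cross-attention to the image embedding, so the final sequence traces the input geometry rather than random motion.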

Input can include a hand-drawn sketch or an object photograph. Image2Gcode extracts visual features, interprets geometric boundaries, and synthesizes continuous extrusion paths. This approach streamlines prototyping, repair, and distributed manufacturing while lowering the barrier to entry for non-expert users. According to the researchers, the framework creates a “direct and interpretable mapping from visual input to native toolpaths,” effectively linking concept and execution within a single computational process.

Image2Gcode Overview. Image via arXiv.

Data-Driven Learning and Architecture

Image2Gcode integrates a pre-trained DinoV2-Small vision transformer, a self-supervised model for large-scale image representation learning, with a 1D U-Net denoising architecture conditioned through multi-scale cross-attention. The DinoV2 encoder, which uses 384-dimensional embeddings, 12 transformer layers, and a 14×14 patch size, extracts hierarchical geometric information that guides the diffusion model in generating coherent G-code.
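
Those encoder numbers imply a concrete token layout. Assuming the standard 224×224 DinoV2 input resolution (the resolution is our assumption; the patch size and embedding dimension are as reported), the geometry works out as follows:

```python
# Token geometry for a DinoV2-Small encoder, assuming a 224x224 input.
# Reported figures: 14x14 patches, 384-dim embeddings, 12 layers.
image_size = 224
patch_size = 14
embed_dim = 384

patches_per_side = image_size // patch_size   # 224 / 14 = 16
num_patches = patches_per_side ** 2           # 256 spatial tokens
tokens = num_patches + 1                      # plus one [CLS] token

print(num_patches, tokens, embed_dim)  # 256 257 384
```

Each of those 256 spatial tokens carries a 384-dimensional feature vector, which is the grid of geometric evidence the cross-attention layers draw on during denoising.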

Training utilized the Slice-100K dataset, which contains over 100,000 aligned STL–G-code pairs. Each sample comprises a rendered slice image and corresponding toolpath trajectory, enabling the model to learn layer-level relationships between geometry and movement. Treating each layer as an independent 2D task reduces computational complexity while maintaining geometric accuracy across variable layer heights and structural features.
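
To make the layer-level pairing concrete, a single training target can be thought of as the layer's G-code reduced to a 2D trajectory. The sketch below parses G1 moves into (x, y, Δe) triples; this is an illustrative reduction, and the exact encoding used by Slice-100K and the paper may differ:

```python
def gcode_layer_to_trajectory(lines):
    """Parse G1 moves into (x, y, delta_e) triples -- an illustrative
    reduction of one layer's G-code to a 2D extrusion trajectory.
    delta_e > 0 means material is deposited along that segment."""
    traj, last = [], {"X": 0.0, "Y": 0.0, "E": 0.0}
    for line in lines:
        if not line.startswith("G1"):
            continue
        # Collect only X/Y/E words; feedrate (F) and others are ignored.
        words = dict((w[0], float(w[1:])) for w in line.split()[1:]
                     if w[0] in "XYE")
        x = words.get("X", last["X"])
        y = words.get("Y", last["Y"])
        e = words.get("E", last["E"])
        traj.append((x, y, e - last["E"]))
        last = {"X": x, "Y": y, "E": e}
    return traj

layer = [
    "G1 X10.0 Y10.0 F1500",   # travel move, no extrusion
    "G1 X20.0 Y10.0 E0.5",    # extrude along the boundary
    "G1 X20.0 Y20.0 E1.0",
]
print(gcode_layer_to_trajectory(layer))
```

Pairing such a trajectory with a rendered image of the same slice gives the supervised signal described above: geometry in, motion out.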

The framework was implemented in PyTorch and trained for 800 epochs with the AdamW optimizer. A cosine noise schedule across 500 diffusion timesteps guided the iterative denoising process. Normalization across spatial and extrusion channels ensured stability and adaptability to different printer configurations. By progressively transforming Gaussian noise into ordered motion commands, the framework produced valid, printer-ready G-code sequences that could be scaled or adjusted without retraining.
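
The cosine noise schedule mentioned here has a standard closed form (from Nichol and Dhariwal's improved-DDPM formulation). A minimal sketch of the schedule and the forward noising step it drives, in plain Python, might look like this; the offset s and the scalar framing are conventional defaults, not values confirmed by the article:

```python
import math
import random

def cosine_alpha_bar(t, T=500, s=0.008):
    """Cumulative signal fraction alpha_bar(t) under the cosine schedule:
    near 1 at t=0 (almost clean data), near 0 at t=T (almost pure noise)."""
    f = math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2
    f0 = math.cos(s / (1 + s) * math.pi / 2) ** 2
    return f / f0

def noisy_sample(x0, t, T=500, rng=random.Random(0)):
    """Forward process: mix a clean (normalized) toolpath value x0 with
    Gaussian noise according to alpha_bar(t). Training asks the model
    to predict the injected noise from this mixture."""
    ab = cosine_alpha_bar(t, T)
    eps = rng.gauss(0.0, 1.0)
    return math.sqrt(ab) * x0 + math.sqrt(1 - ab) * eps

print(round(cosine_alpha_bar(0), 3), round(cosine_alpha_bar(500), 3))
```

Normalizing the spatial and extrusion channels before this mixing step is what keeps the noise scale meaningful across printers with different build volumes and extrusion rates.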

Preprocessing pipeline. Image via arXiv.

Experimental Results and Validation

Evaluation on the Slice-100K validation split showed that Image2Gcode generated geometrically consistent and manufacturable toolpaths. Prints fabricated from the model’s G-code demonstrated strong interlayer bonding, accurate boundaries, and smooth surfaces comparable to traditional slicer outputs. The generated toolpaths reproduced complex infill structures—such as rectilinear, honeycomb, and diagonal hatching—without relying on rule-based programming.

Real-world testing extended to photographs and hand-drawn sketches, representing data distributions distinct from the synthetic training set. Preprocessing extracted shape contours from these inputs, and Image2Gcode successfully generated coherent, printable paths. Fabricated results retained geometric fidelity and functional integrity, validating that pretrained DinoV2 features can bridge the gap between synthetic and real-world inputs.

Quantitative analysis showed a 2.4% reduction in mean travel distance compared to heuristic slicer baselines, indicating improved path efficiency. This reduction did not compromise print quality or mechanical strength, suggesting the model captures geometric regularities that support optimized motion planning.
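
The travel-distance metric itself is simply the summed Euclidean length of the toolpath's segments. The sketch below computes it for two hypothetical paths covering the same four waypoints in different orders; the paths and the resulting percentage are illustrative only, not the paper's data:

```python
import math

def total_travel(path):
    """Total travel distance of a toolpath: sum of Euclidean
    segment lengths over consecutive (x, y) waypoints."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

# Hypothetical paths visiting the same points in different orders.
slicer_path = [(0, 0), (10, 0), (0, 10), (10, 10)]   # crosses the square
model_path  = [(0, 0), (10, 0), (10, 10), (0, 10)]   # walks the perimeter

baseline = total_travel(slicer_path)
ours = total_travel(model_path)
reduction = 100 * (baseline - ours) / baseline
print(f"{reduction:.1f}% shorter")  # → 12.1% shorter
```

Averaging this quantity over a validation set, relative to a slicer baseline, yields the kind of mean-travel-distance comparison the 2.4% figure refers to.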

Generalization to real-world inputs. Image via arXiv.

Learned Variability and Limitations

Toolpath synthesis is a one-to-many mapping problem: several valid extrusion strategies can satisfy identical boundary constraints. Image2Gcode learned to produce multiple feasible infill solutions, demonstrating a non-deterministic treatment of manufacturing constraints. Comparative visualizations revealed that for identical geometries, the model could replace concentric shells with diagonal hatching or hybrid infill architectures while preserving coverage and manufacturability.

Current limitations stem from its 2D slice formulation, which lacks awareness of interlayer dependencies or internal cavities that require coordinated 3D path planning. The authors propose expanding toward hierarchical 3D generation, where a coarse global model defines key cross-sections and Image2Gcode refines them layer-by-layer. Additional conditioning for infill density, mechanical performance, and material usage is also proposed to enhance control over fabrication outcomes.

Future integration with AI-driven manufacturing frameworks such as LLM-3D Print, a multi-agent system for adaptive process control and defect detection, could extend Image2Gcode’s capabilities. Linking the diffusion model to language-based interfaces would enable users to specify goals—such as minimizing print time or improving surface finish—that the system translates into optimized G-code generation.

Novel Infill Pattern Generation. Image via arXiv.

Combining diffusion-based synthesis, pretrained visual perception, and parameter normalization, Image2Gcode establishes a foundation for intent-aware additive manufacturing. Its data-driven architecture connects design, perception, and execution, reducing reliance on manual modeling and enabling fully digital workflows where sketches and photographs transition seamlessly into printed components.


Featured image shows Image2Gcode Overview. Image via arXiv.
