Technologies & Methods Used

As AI tools become more deeply integrated into education, our Applied Data Science project at Politecnico di Torino aimed to answer a forward-looking question: Can large language models (LLMs) support students in learning database design—specifically, by generating Entity-Relationship (ER) models from natural language instructions?
The goal of our project, “Designer AI,” was to evaluate how reliably LLMs can convert textual descriptions of logical data models into structured ER diagrams—automating a task commonly required in database education.
We tested multiple models, including both proprietary (GPT, Claude, Gemini) and open-source (Code Llama, SynthiaIA, Toppy) LLMs. Our pipeline took a two-stage approach: instead of asking the model to produce both the logical design and the output syntax in one shot, we separated concerns. The LLM would first generate the logical structure of the ER model, and a separate syntax module would then convert that structure into a Designer.io-compatible JSON format.
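A minimal sketch of this two-stage separation, with the LLM call stubbed out and an invented JSON schema (the actual Designer.io format and the function names here are illustrative assumptions, not the project's real interfaces):

```python
import json

# Stage 1 (LLM): produce the logical ER structure as plain data.
# This stub stands in for a real prompt to GPT, Claude, etc.;
# the returned schema is a hypothetical example.
def generate_logical_model(description: str) -> dict:
    return {
        "entities": [
            {"name": "Student", "attributes": ["id", "name"]},
            {"name": "Course", "attributes": ["code", "title"]},
        ],
        "relationships": [
            {"name": "enrolls", "between": ["Student", "Course"],
             "cardinality": "N:M"},
        ],
    }

# Stage 2 (deterministic syntax module): convert the logical structure
# into a diagram-tool JSON format. The target schema is invented for
# illustration; the real Designer.io format will differ.
def to_diagram_json(model: dict) -> str:
    nodes = [
        {"id": e["name"], "type": "entity",
         "label": e["name"], "attributes": e["attributes"]}
        for e in model["entities"]
    ]
    edges = [
        {"type": "relationship", "label": r["name"],
         "source": r["between"][0], "target": r["between"][1],
         "cardinality": r["cardinality"]}
        for r in model["relationships"]
    ]
    return json.dumps({"nodes": nodes, "edges": edges}, indent=2)

logical = generate_logical_model("Students enroll in courses.")
print(to_diagram_json(logical))
```

Keeping stage 2 deterministic means syntax errors can never come from the LLM: the model is judged only on the logical design, and the JSON emitter is trivially testable on its own.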
We faced several challenges along the way. Even so, with minimal fine-tuning, GPT-3.5 achieved nearly 85% accuracy in generating correct ER models, and models pre-trained on code (such as Code Llama) generally performed better. There is still room for improvement, particularly through better prompt engineering, deeper fine-tuning, and a larger dataset.
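One plausible way to compute an accuracy figure like the one above is to compare each generated logical model against a reference solution by structural equality of its entity and relationship sets. This scoring criterion is an assumption for illustration, not necessarily the metric the project used:

```python
# Hypothetical scoring sketch: a generated ER model counts as correct
# only if its entities (with attributes) and relationships exactly
# match the reference model's, ignoring ordering.
def er_signature(model: dict) -> tuple:
    entities = frozenset(
        (e["name"], frozenset(e["attributes"]))
        for e in model["entities"]
    )
    relationships = frozenset(
        (r["name"], frozenset(r["between"]), r["cardinality"])
        for r in model["relationships"]
    )
    return (entities, relationships)

def accuracy(generated: list, references: list) -> float:
    # Fraction of generated models structurally identical to their reference.
    correct = sum(
        er_signature(g) == er_signature(r)
        for g, r in zip(generated, references)
    )
    return correct / len(references)

ref = {"entities": [{"name": "Student", "attributes": ["id", "name"]}],
       "relationships": []}
print(accuracy([ref], [ref]))  # → 1.0
```

Normalizing to order-insensitive sets before comparing avoids penalizing a model for listing the same entities or attributes in a different order than the reference.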
This project demonstrated that LLMs are not just useful assistants—they can become effective teaching tools. From schema creation to automated feedback, AI has real potential to enhance how database design is taught and learned.

