GNOME Bugzilla – Bug 600977
Move the data from Glom's document example_rows into CSV files
Last modified: 2009-11-06 19:34:45 UTC
The following is a valid use case, but there are some issues with its current implementation (split off from https://bugzilla.gnome.org/show_bug.cgi?id=600874):

Creating a new project from an example document
===============================================
1. Glom reads all the describing metadata from the document, creating a db backend and filling the db with the outlined table structure (tables, columns, types, relationships, etc.).
2. Glom creates a new document, containing only the metadata from the example.
3. Glom reads the example_rows sections, one for each table. Those may contain data in CSV format (embedded CSV files, if you will). This data is shoved into the newly created database.

Issues
======
* For those example documents, data is stored together with metadata. Normally, Glom documents contain only metadata while the real data lives in the database. Breaking this rule seems arbitrary in this case.
* The way the data is embedded in the example documents requires different encoding rules than the surrounding XML (as it is basically embedded CSV).
* We have to maintain two separate code paths to do the very same thing, namely importing data into a (possibly empty) Glom document (see point 3).
* We have to take care of properly embedding CSV data in XML files, even though this feature is rarely used by normal users. This means the feature will lack testing (it is not exposed to most users), and since it involves non-trivial encoding issues it is also fragile.

Proposal
========
Be strict about the separation of data and metadata. For the example documents, a better approach could be to save the data in CSV files, one for each table. Maybe bundle the .glom file and the .csv files together in an archive, say .glomx.

When creating a new document from a template, instead of point 3 we would import data from the CSV files, using the same functionality that already exists for normal CSV imports (see the sketch below).

This way, we not only maintain less code. We also get better testing for the bundling of data and metadata in example documents (read: it gets reported/fixed faster if broken), since the data import part will break at the same time as the CSV import breaks.
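To make the proposal concrete, here is a minimal, hypothetical sketch of what step 3 could become once the example rows live in plain per-table CSV files inside the bundle. The file name invoices.csv, the table name, and the direct SQL string building are assumptions for illustration only; a real implementation would reuse Glom's existing CSV import code and parameterized database access instead of concatenating SQL.

  // Hypothetical sketch (not Glom's actual code): read one table's example
  // rows from a plain CSV file and turn them into SQL INSERT statements.
  // Assumes simple, unquoted, comma-separated values.
  #include <fstream>
  #include <iostream>
  #include <sstream>
  #include <string>
  #include <vector>

  static std::vector<std::string> split_csv_line(const std::string& line)
  {
    std::vector<std::string> fields;
    std::stringstream ss(line);
    std::string field;
    while (std::getline(ss, field, ','))
      fields.push_back(field);
    return fields;
  }

  int main()
  {
    // Hypothetical file extracted from a .glomx bundle, one CSV per table:
    std::ifstream csv("invoices.csv");
    std::string line;
    while (std::getline(csv, line))
    {
      const std::vector<std::string> values = split_csv_line(line);

      std::string sql = "INSERT INTO invoices VALUES (";
      for (std::size_t i = 0; i < values.size(); ++i)
      {
        if (i)
          sql += ", ";
        sql += "'" + values[i] + "'"; // real code would escape/parameterize
      }
      sql += ");";

      std::cout << sql << std::endl;
    }
    return 0;
  }

The point is only that the per-table CSV files would be ordinary files that the existing CSV importer already understands, so the separate XML-embedded code path and its encoding rules could go away.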
No, I want the example files to be self-contained. I don't want to exchange usually-working code for a new set of problems caused by having interdependent separate files. If we want more testing then we should have more unit tests.