GNOME Bugzilla – Bug 721016
File loading and saving
Last modified: 2014-07-09 11:40:13 UTC
Take the gedit implementation, transform the API to follow the GIO conventions for async operations, and make the code re-usable (e.g. do not use GSettings). You can see my progress here: https://git.gnome.org/browse/gtksourceview/log/?h=wip/loader-saver It's not yet ready for a first review.
I've written some mails to the mailing list: https://mail.gnome.org/archives/gedit-list/2013-December/msg00006.html Here is an update. The file saver in gsv is done, with an adapter in gedit to not break the gedit API for plugins. And the file loader is half done in gsv. The branch in gedit is: https://git.gnome.org/browse/gedit/log/?h=wip/loader-saver I think the GtkSourceFile API is nice, except for one thing: if you change the "encoding" property, save the file, and then an error occurs due to an encoding error, the "encoding" property is inconsistent with the underlying GFile (if the GFile already existed before changing the encoding). It would be better if it was not possible to have such inconsistencies. If it can not be avoided, it should at least be documented, to warn the user of the API. If we provide only the FileLoader and FileSaver classes, we don't have those inconsistencies, but the file properties (encoding, line ending type, etc) are not remembered by gsv, gedit would need to keep those attributes.
Ideally all the GeditDocument features should be migrated to gsv. But lots of plugins use the load and save signals to do various stuff. So the GtkSourceFile API is probably not sufficient. See bug #724247 for an alternate solution to the "load" and "loaded" signals: have a "is-loading" property. If we have a "file" property in GtkSourceBuffer that points to the default GtkSourceFile, it is not convenient to use because the GtkSourceFile properties can not be accessed directly, there is a level of indirection with the "file" property. If we connect to a GtkSourceFile signal, the signal handler must be disconnected when the "file" property changes. So a solution is to provide a signal in GtkSourceBuffer: "file-changed", with a pointer to the previous GtkSourceFile, so we can easily disconnect the old signal handlers. Or the application must keep around the previous GtkSourceFile.
If we choose to put the state in GtkSourceBuffer, then a first step is to have the Loader and Saver classes. The purpose of these classes is to have separate properties than the default ones. So it is more flexible: one can save the buffer to another file, without changing the default properties. Having the Loader and Saver classes also avoid having functions with lots of parameters. With a function with lots of parameters, it is also not possible to add a parameter in the future. So a class is more convenient, a property can be added in the future if needed. If possible, I also want to shorten a little the work of this project, so it is easier to review and merge. One way to shorten the project is to not have the state in GtkSourceBuffer, just have the Loader and Saver classes. With an expansion of the API in mind (without API breaks of course).
Another API idea, that solves the above problems. Have a GtkSourceFileSettings class, with the properties of a file: - location - encoding - compression-type - newline-type - ensure-trailing-newline All those properties are construct-only, so the objects are immutable. GtkSourceFileSettings can have a generic function to change one or more properties and return the modified object. For example: > new_file_settings = > gtk_source_file_settings_dup ("location", new_location, > "encoding", new_encoding, > NULL); For storing the state, GtkSourceBuffer can have a "file-settings" property. Since the file settings is an immutable object, we can connect only to the notify signal of the "file-settings" property, no need to connect to the notify signal of the FileSettings. For the actions, there can be a FileSaver and FileLoader classes. They each would have a "file-settings" property, along with other specific properties. A FileSaver or FileLoader object can be used several times, so if a property implies an error we can change its value and retry.
And the dup() function is not really needed, it was just an idea. Instead, we can take a FileSettings object, show its properties in the UI, the user can change the properties, and when clicking on the OK button, create a new FileSettings object with the new properties.
Well, it's maybe better with all the properties everywhere: in GtkSourceBuffer, FileLoader and FileSaver. The FileSettings is not really convenient in some cases, such as loading a new file, for which we don't know the settings. One idea with the FileSettings is that we can reuse the same object between the file loading and saving. Without the FileSettings, we can have the same effect if the state is stored in GtkSourceBuffer. Anyway, I should try a solution more in depth, see how it goes, the problems encountered when porting the code in gedit (and not only write the adapters between the two APIs, also port the code internally, in the plugins, etc, so we can deprecate the API in GeditDocument).
I continue my monologue. Any feedback would be appreciated at this point. The branch in gsv: https://git.gnome.org/browse/gtksourceview/log/?h=wip/loader-saver The gedit code is not yet ported. I prefer to have another opinion on the gsv API before porting gedit. There are properties in GtkSourceBuffer, and the FileLoader and FileSaver classes are public. The API documentation is almost complete. The best is probably to compile the code and read the API documentation. When creating a FileLoader or FileSaver, the properties of GtkSourceBuffer are copied by default (for the FileLoader it is only for the candidate encodings, as the other properties are auto-detected). When a file is loaded, the Buffer properties are updated. For the FileSaver, it is the case only if the boolean property "main-file" is true. I.e. it is possible to save the buffer in a secondary file without affecting the Buffer properties, which was one of the requirements. If a FileLoader or FileSaver fails the operation, the Buffer properties are not updated. The FileLoader and FileSaver are also more easily extensible. We can add properties to the classes without touching the save_async() and load_async() functions. And those functions don't need a flag parameter, since boolean properties are used for that. A little change: the FileLoader and FileSaver classes can be used several times. It is not a big change in the code, and since they are public classes, it's better in my opinion to support this case. Even if in gedit we don't use this possibility. An example where it could be useful: if the encoding is not correct in a FileSaver, we can adjust the encoding and try again, without destroying and creating a new FileSaver. With the properties in GtkSourceBuffer, there is a little problem though in gedit, since GeditDocument has more or less the same properties, but with different types. So it is needed to remove those properties in GeditDocument. At first, I didn't want to break the gedit API. But if the properties must be removed, why not removing completely the file loading and saving API from GeditDocument? The problem is that it involves more work for porting the plugins etc. And currently the gsv API is probably not sufficient, there is no indication that a GtkSourceBuffer is being loaded for example. And with Python code and string properties and signals names, it is more difficult to port the code (a compilation is not enough to detect all the code to port).
More ramblings. Paolo proposed on IRC other solutions, for example having a LoadAttrs and SaveAttrs helper objects that we pass to load() and save() functions, so we don't have functions with lots of parameters, and we have a good extensibility. But those helper objects would contain only data, no behavior. The FileLoader and FileSaver classes are the same, but with the behavior. One way or the other is almost the same, it doesn't make a big difference. I also thought about a feature: detect very long lines and propose to the user to split them before loading them to the buffer. The load operation could be paused, a signal sent, and the operation could be resumed with a flag to say to split the line. I guess it's easier to do that with the FileLoader than with the LoadAttrs. In object-oriented programming, it's better to keep related data and behavior in one place. So the FileLoader and FileSaver are better. On the other hand, having a class name derived from a verb (i.e. an action) is generally not a good idea, a method is often better. But in our case, the object is used also to configure the operation, not just launch the operation. Exactly like the new GSubprocessLauncher. ===== When seeing the GeditDocument functions, get_mime_type(), get_uri_for_display(), is_local(), get_deleted(), etc. All those functions are related to the file, not the buffer. It is the same for the properties: the encoding, the compression type and the line ending type are all related to the file, it's strange to have those properties on the buffer (the encoding of the buffer is always in UTF-8, for example). If all those properties and methods are added to GtkSourceBuffer, it creates noncommunicating behavior, that is, methods that operate on a proper subset of the data members of the class (here the file). To avoid god classes, it's better to split noncommunicating behavior. I don't say that GtkSourceBuffer is a god class, but if we always add all the features to the same classes, it can become a problem. So I propose to have a GtkSourceFile class, and keep the FileLoader and FileSaver. If a function like get_uri_for_display() is added to GtkSourceBuffer, it can not be used for another file. If it is added to GtkSourceFile, it's more flexible. GtkSourceBuffer can also have a "file" construct-only property, to make the association. With the construct-only, it's easier for connecting to the signals of GtkSourceFile. Since it will always be the same File object for a certain buffer, we don't need to disconnect the signals and reconnect them to a new object (it was one of the secondary problems of the initial GtkSourceFile described above).
(In reply to comment #8) > > When seeing the GeditDocument functions, get_mime_type(), > get_uri_for_display(), is_local(), get_deleted(), etc. All those functions are > related to the file, not the buffer. It is the same for the properties: the > encoding, the compression type and the line ending type are all related to the > file, it's strange to have those properties on the buffer (the encoding of the > buffer is always in UTF-8, for example). > > If all those properties and methods are added to GtkSourceBuffer, it creates > noncommunicating behavior, that is, methods that operate on a proper subset of > the data members of the class (here the file). To avoid god classes, it's > better to split noncommunicating behavior. I don't say that GtkSourceBuffer is > a god class, but if we always add all the features to the same classes, it can > become a problem. > > So I propose to have a GtkSourceFile class, and keep the FileLoader and > FileSaver. > > If a function like get_uri_for_display() is added to GtkSourceBuffer, it can > not be used for another file. If it is added to GtkSourceFile, it's more > flexible. > > GtkSourceBuffer can also have a "file" construct-only property, to make the > association. With the construct-only, it's easier for connecting to the signals > of GtkSourceFile. Since it will always be the same File object for a certain > buffer, we don't need to disconnect the signals and reconnect them to a new > object (it was one of the secondary problems of the initial GtkSourceFile > described above). Yes, this was one of the things I was pondering on IRC... from one side it makes a lot of sense to split, and if we split I do not think buffer should even have a file property. On the other hand there are properties that are not so black and white about where they belong: e.g. the line ending type...
(In reply to comment #9) > Yes, this was one of the things I was pondering on IRC... from one side it > makes a lot of sense to split, and if we split I do not think buffer should > even have a file property. If we make the association with a construct-only, it forces the user to reuse the same File object, which is better. Also, Jesse was interested to maybe move some features of code-assistance to GtkSourceView, and for that it needs the file-buffer association. But of course we can add the association later when we need it. > On the other hand there are properties that are not so black and white about > where they belong: e.g. the line ending type... Currently the line ending type is not used by the buffer. It could be useful, for example for the copy/paste to another application. If one day we want to do that, we can move the property to the buffer and deprecate the one from the file. Do you see other use cases? For the implicit-trailing-newline, it's clearly a buffer property because for a file the trailing newline is always explicit. And the property can be useful for drawing spaces. If the property is moved to the file, it should have a different name, like remove-add-trailing-newline, which is not great. In gedit the property is named ensure-trailing-newline, but it's a bad name. Even for the FileSaver, "ensure" normally doesn't add a new newline if the buffer has already one. But the FileSaver _always_ add a new newline if ensure-trailing-newline is true. So for the FileSaver a better name is add-trailing-newline. And for the FileLoader remove-trailing-newline. But to remember the state between the loader and saver, it is required to store the value in a common property, hence implicit-trailing-newline.
<pbor> I am a bit surprised that the constructor of GtkSourceFile takes a buffer <pbor> I was more thinking along the lines of <pbor> (pseudocode) <pbor> f = source_file_new (gfile) <pbor> loader = loader_new (f, buffer) After some thoughts, it's indeed better to not associate the buffer and the file, so applications can potentially have a many-to-many relationship between them. With a construct-only buffer property in GtkSourceFile, a GtkSourceFile object can not be reused for several buffers, so it is only a one-to-many relationship. The more flexible the better, a library should have as few policy as possible (but it's less the case for GTK+ these last years, it's more oriented to GNOME, but I digress). A use case for having a many-to-one relationship between a set of buffers and a GtkSourceFile: implementing the split view with synchronized buffers (since GtkTextView doesn't handle correctly the split view). The application can have a BufferSyncSet object holding all the buffers and doing the synchronization, with a ref to the GtkSourceFile. If the GtkSourceFile is associated to only one buffer, it doesn't fit well with the BufferSyncSet if there is no main buffer. In gedit we can already have several buffers with the same GFile, when the file is loaded a second time in a different window (in this case the new document is read-only by default). But different GtkSourceFile's are desirable. If the file is saved in one tab and the user switches to the other tab, there will be a info bar saying that the file is externally modified. If the same GtkSourceFile object is used for both tabs, the externally modified notification is not possible. The two buffers are not synchronized, and it is also desirable, because (1) it's simpler (2) the user can keep an old version in one tab. So I think a single GtkSourceFile object for several buffers is desirable only when the buffers are synchronized, which should ideally be avoided if GtkTextView handles correctly the model/view split. Anyway a many-to-many relationship should be possible, there are maybe other use cases. ===== Another thing discussed on IRC, the GFile parameter of the FileSaver is required. The GtkSourceFile:location property is updated only when the file saving succeeds. It is also needed to be able to use the same GtkSourceFile object for the whole lifetime of a buffer, which is simpler when connecting to the GtkSourceFile signals (just the notify signal for the moment). On the other hand the FileLoader doesn't need the GFile parameter. The current code updates the GtkSourceFile:location directly, since once a file loading is started, the buffer contents is directly removed so the previous location is no longer valid. And there is already a gtk_source_file_set_location() public function.
So the buffer parameter is now moved to the FileLoader and FileSaver. Summary from the last IRC conversation: 1) Remove the location property from the FileLoader. If the input-stream is NULL, read the contents from the GtkSourceFile's location. 2) Rename the location property of the FileSaver to target-location. If target-location is NULL, take the GtkSourceFile's location. The rationale is to simplify the constructors to have only one "file" parameter. Having both a GtkSourceFile and a GFile can be a bit disturbing. But after making those changes, I don't really like the new API. A FileLoader or FileSaver should be used for only one load or save operation, including possible errors/reconfiguration/relaunch cycles. It is easy to create a new FileLoader or FileSaver object, and the constraints with the construct-only properties (buffer, input-stream, location) is not really a problem. If a FileLoader or FileSaver is reused with different locations and buffers, it becomes a mess in the application code. Of course we can explain in the documentation to not do that, but it's clearer if the API itself makes these constraints. For the FileLoader, if we remove the "location" construct-only property, the object can be used several times for different locations, by calling gtk_source_file_set_location() and relaunching the loader. On the other hand, the input-stream (still present) is a construct-only property and thus the loader can not be used for different input streams. So the API is not consistent. I think it is also useful to have the "location" property in the loader and saver so we are sure the location doesn't change behind our back. In GeditTab there are some calls to gtk_source_file_loader_get_location() and gtk_source_file_saver_get_location(). Getting the GtkSourceFile and calling gtk_source_file_get_location() is less robust. So I think it's better to keep the "location" properties in both the FileLoader and FileSaver. We can however make the API a bit more convenient by taking the GtkSourceFile's location at construction time if the "location" is NULL. And I can improve the documentation to explain the difference between the two files, why the GFile is also needed, etc.
Pushed after some changes to the FileLoader and FileSaver constructors, and replace gtk_source_encoding_get_from_index() by gtk_source_encoding_foreach().