Hi list, To workaround bug(s) in GVFS, I've now changed the file saving implementation for systems with GIO in SVN trunk. See also:
http://lists.uvena.de/geany-devel/2010-November/003412.html
Please help test this. File permissions should be preserved after saving, and disk space exhaustion should now be handled without losing the previously saved file.
Also, running geany --version (or -V) should now also show 'GIO' if your system has it and Geany knew about it at build time.
Begin forwarded message:
Date: Tue, 02 Nov 2010 17:42:54 +0000 From: ntrel@users.sourceforge.net To: geany-commits@uvena.de Subject: SF.net SVN: geany:[5361] trunk
Revision: 5361 http://geany.svn.sourceforge.net/geany/?rev=5361&view=rev Author: ntrel Date: 2010-11-02 17:42:54 +0000 (Tue, 02 Nov 2010)
Log Message: ----------- Use g_file_replace_contents() if available to save documents - this should help workaround bugs in GVFS. Needs testing.
Nick
Le 02/11/2010 19:11, Nick Treleaven a écrit :
Hi list, To workaround bug(s) in GVFS, I've now changed the file saving implementation for systems with GIO in SVN trunk. See also:
http://lists.uvena.de/geany-devel/2010-November/003412.html
Please help test this. File permissions should be preserved after saving, and disk space exhaustion should now be handled without losing the previously saved file.
Interesting change, let's see what happens now -- this said I don't use remote files often so I will probably not notice any change...
Just to mention (I just saw that the question raised in the original thread, sorry not to have seen it before): you actually need to release GFiles with g_object_unref(). I'd thought it would be quite obvious -- you create an object, you destroy it -- but seems not :) So, here's a (very) small patch for this.
Best regards, Colomban
On Wed, 03 Nov 2010 00:26:00 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 02/11/2010 19:11, Nick Treleaven a écrit :
Hi list, To workaround bug(s) in GVFS, I've now changed the file saving implementation for systems with GIO in SVN trunk. See also:
http://lists.uvena.de/geany-devel/2010-November/003412.html
Please help test this. File permissions should be preserved after saving, and disk space exhaustion should now be handled without losing the previously saved file.
Interesting change, let's see what happens now -- this said I don't use remote files often so I will probably not notice any change...
Actually I should have been clearer - this change affects local files too. I'll list the fixes now:
permissions - was a bug only if using use_safe_file_saving pref, which is usually off anyway.
disk exhaustion - was a bug for local files and probably remote files.
file corruption - were bug(s) in GVFS for remote files, at least one is fixed in newest GVFS; our workaround will hopefully prevent corruption even with a broken GVFS.
Just to mention (I just saw that the question raised in the original thread, sorry not to have seen it before): you actually need to release GFiles with g_object_unref(). I'd thought it would be quite obvious -- you create an object, you destroy it -- but seems not :)
It wasn't clear an object was being created when I checked the docs originally. I just looked at the object hierarchy for GFile and it descends from GInterface, not GObject. Is g_object_unref () callable on a GInterface? I guess so but it would be nice to find some concrete information about it ;-)
Also, the docs for g_file_new_for_path don't mention releasing resources, but the docs for g_file_get_parent *do* say to call g_object_unref. Both functions return a GFile. Is this a bug in the docs for g_file_new_* ?
So, here's a (very) small patch for this.
I haven't applied it just yet, I'll wait for your reply to the above issue ;-)
Nick
Le 04/11/2010 17:26, Nick Treleaven a écrit :
On Wed, 03 Nov 2010 00:26:00 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 02/11/2010 19:11, Nick Treleaven a écrit :
Hi list, To workaround bug(s) in GVFS, I've now changed the file saving implementation for systems with GIO in SVN trunk. See also:
http://lists.uvena.de/geany-devel/2010-November/003412.html
Please help test this. File permissions should be preserved after saving, and disk space exhaustion should now be handled without losing the previously saved file.
Interesting change, let's see what happens now -- this said I don't use remote files often so I will probably not notice any change...
Actually I should have been clearer - this change affects local files too. I'll list the fixes now:
I should have been clearer too -- what I meant is that since I use local files, I don't face issues that GVFS (-fuse?) may have, so as far as the file saving still works, I woudln't notice :)
Just to mention (I just saw that the question raised in the original thread, sorry not to have seen it before): you actually need to release GFiles with g_object_unref(). I'd thought it would be quite obvious -- you create an object, you destroy it -- but seems not :)
It wasn't clear an object was being created when I checked the docs originally. I just looked at the object hierarchy for GFile and it descends from GInterface, not GObject. Is g_object_unref () callable on a GInterface? I guess so but it would be nice to find some concrete information about it ;-)
Well, hum... GObject is a "classic" OO "language" -- it's actually a library, but still. An interface (GInterface [1]) cannot be instantiated by itself. You can only get an "instance" of a GInterface by getting an instance of an object (GObject [2]) that implements this interface.
So, GFile is the main interface of GIO to handle files: it is what you use in the code. But there are many objects that implement it -- GLocalFile is the most basic implementation. These implementations are not exposed to the user, and most of them are in GVFS AFAIK, as (a part of?) the backends.
g_file_new_for_path() actually calls g_vfs_get_file_for_path() with the default VFS (g_vfs_get_default()), which try to create the most appropriate object for the given path, depending on which GVFS backends are available. This object is likely to be a GLocalFile, but may also be any other object implementing GFile -- I don't know their name, but we could imagine GDavFile, GSambaFile, GSshFile or whatever: each would implement all the GFile interface in the way that fit to a particular goal (file on DAV, Samba, SSH, whataver).
Also, the docs for g_file_new_for_path don't mention releasing resources, but the docs for g_file_get_parent *do* say to call g_object_unref. Both functions return a GFile. Is this a bug in the docs for g_file_new_* ?
Yes, it is. Actually I personally don't think it is a real bug because I think there is a difference between these two function: * g_file_new_for_path() is named "new" so it suggests it creates something new (just like g_object_new(), g_socket_new(), etc.). * g_file_get_parent(), however, takes a GFile in argument, and returns another one (the parent, right). It would be possible to think that the original file keeps the parent around, and so may not want the caller to release it.
And -- even if this is not really human-readable, I agree -- the return value of g_file_new_for_path() is marked as "transfer full" which means (thanks to the info popup) "Free data after the code is done". OK, the free function isn't explicitly given, and it's not really consistent to give it for g_file_get_parent() and not g_file_new_for_path().
So yes, it's kinda bug.
So, here's a (very) small patch for this.
I haven't applied it just yet, I'll wait for your reply to the above issue ;-)
I hope my explanations are clear enough for you to decide :)
Regards, Colomban
[1] http://library.gnome.org/devel/gobject/unstable/gtype-non-instantiable-class... [2] http://library.gnome.org/devel/gobject/unstable/gtype-instantiable-classed.h...
On Thu, 4 Nov 2010 20:37:13 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
To workaround bug(s) in GVFS, I've now changed the file saving implementation for systems with GIO in SVN trunk. See also:
http://lists.uvena.de/geany-devel/2010-November/003412.html
Please help test this. File permissions should be preserved after saving, and disk space exhaustion should now be handled without losing the previously saved file.
dd if=/dev/zero of=~/zero bs=4K ; File -> Save:
Displays a message "Error writing to file: No space left on device". The file becomes 0 bytes. Reports that the disk file is newer, and offers to reload it. :) The permissions are OK, even for the empty file.
geany 0.20 (built on Nov 4 2010 with GTK 2.20.1, GLib 2.24.2, GIO)
No special options required to activate the new saving or something?..
No, you have the GIO string printed so the implementation will be g_file_replace_contents.
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Also, if the argument for backups is set perhaps disk exhaustion is handled: http://library.gnome.org/devel/gio/2.25/GFile.html#g-file-replace-contents
Nick
On Thu, 04 Nov 2010 18:14:17 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Interesting change, let's see what happens now -- this said I don't use remote files often so I will probably not notice any change...
Actually I should have been clearer - this change affects local files too. I'll list the fixes now:
I should have been clearer too -- what I meant is that since I use local files, I don't face issues that GVFS (-fuse?) may have, so as far as the file saving still works, I woudln't notice :)
Let's hope so ;-) Thanks for testing.
Just to mention (I just saw that the question raised in the original thread, sorry not to have seen it before): you actually need to release GFiles with g_object_unref(). I'd thought it would be quite obvious -- you create an object, you destroy it -- but seems not :)
It wasn't clear an object was being created when I checked the docs originally. I just looked at the object hierarchy for GFile and it descends from GInterface, not GObject. Is g_object_unref () callable on a GInterface? I guess so but it would be nice to find some concrete information about it ;-)
Well, hum... GObject is a "classic" OO "language" -- it's actually a library, but still. An interface (GInterface [1]) cannot be instantiated by itself. You can only get an "instance" of a GInterface by getting an instance of an object (GObject [2]) that implements this interface.
Thanks for the links. I know OO interfaces in higher level languages, but I'm a bit ignorant about GObject. I'll try to read those pages sometime.
Also, the docs for g_file_new_for_path don't mention releasing resources, but the docs for g_file_get_parent *do* say to call g_object_unref. Both functions return a GFile. Is this a bug in the docs for g_file_new_* ?
Yes, it is. Actually I personally don't think it is a real bug because I think there is a difference between these two function:
- g_file_new_for_path() is named "new" so it suggests it creates
something new (just like g_object_new(), g_socket_new(), etc.).
- g_file_get_parent(), however, takes a GFile in argument, and returns
another one (the parent, right). It would be possible to think that the original file keeps the parent around, and so may not want the caller to release it.
And -- even if this is not really human-readable, I agree -- the return value of g_file_new_for_path() is marked as "transfer full" which means (thanks to the info popup) "Free data after the code is done". OK, the free function isn't explicitly given, and it's not really consistent to give it for g_file_get_parent() and not g_file_new_for_path().
So yes, it's kinda bug.
I wasn't looking at the latest docs, just the version Google found. The "transfer full" helps somewhat, but ideally IMO it would say to unref. Perhaps that's something GInterface users know by convention.
So, here's a (very) small patch for this.
I haven't applied it just yet, I'll wait for your reply to the above issue ;-)
I hope my explanations are clear enough for you to decide :)
Yep, applied.
Nick
On Tue, 2 Nov 2010 18:11:01 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
To workaround bug(s) in GVFS, I've now changed the file saving implementation for systems with GIO in SVN trunk. See also:
http://lists.uvena.de/geany-devel/2010-November/003412.html
Please help test this. File permissions should be preserved after saving, and disk space exhaustion should now be handled without losing the previously saved file.
dd if=/dev/zero of=~/zero bs=4K ; File -> Save:
Displays a message "Error writing to file: No space left on device". The file becomes 0 bytes. Reports that the disk file is newer, and offers to reload it. :) The permissions are OK, even for the empty file.
geany 0.20 (built on Nov 4 2010 with GTK 2.20.1, GLib 2.24.2, GIO)
No special options required to activate the new saving or something?..
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
The documentation of g_file_replace() may state "will try to replace the file in the safest way possible", but I don't believe in miracles, much less in ones from the glib developers.
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Perhaps. But bugs filed to gnome and kde may take years to resolve...
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
Well... stop me if I'm saying bullshit but what about:
backup = path + '~' # or whatever if copy (path, backup_path): # COPY file to the backup if not write (path, data): # write in original file move (backup_path, path) # if write failed, restore backup elif not make_backup: unlink (backup_path) # if no backup is to be done, delete backup
The problems I see are: 1) needs at least 2 times the size of the file in the target directory (but it's the same anytime we bake backups) 2) backup file may have altered permissions/attrs 3) ...so if write fails, we have changed the permissions/attrs on the original file In the worst case, the original file has another name and lost its permissions/attrs.
The advantages I see are: 1) the permissions/attrs are kept on success, as well as hard links 2) no data can be lost (if the FS is reliable)
The key idea is not to move the original file but to copy it. So we can safely overwrite the original (and the keep permissions/attrs), once the backup is done.
Thoughts?
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Perhaps. But bugs filed to gnome and kde may take years to resolve...
...but if nobody files them, they are likely to never be fixed.
Regards, Colomban
On 5 November 2010 09:37, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
Well... stop me if I'm saying bullshit but what about:
backup = path + '~' # or whatever if copy (path, backup_path): # COPY file to the backup if not write (path, data): # write in original file move (backup_path, path) # if write failed, restore backup elif not make_backup: unlink (backup_path) # if no backup is to be done, delete backup
The problems I see are: 1) needs at least 2 times the size of the file in the target directory (but it's the same anytime we bake backups) 2) backup file may have altered permissions/attrs 3) ...so if write fails, we have changed the permissions/attrs on the original file In the worst case, the original file has another name and lost its permissions/attrs.
The advantages I see are: 1) the permissions/attrs are kept on success, as well as hard links 2) no data can be lost (if the FS is reliable)
The key idea is not to move the original file but to copy it. So we can safely overwrite the original (and the keep permissions/attrs), once the backup is done.
Thoughts?
Hi,
It seems to me that g_file_set_contents behavior can be described as "write new file with the name of an existing file" so new file permissions and hard links point to the old file.
Columbans suggestion can be described as "update the contents of an existing file" so old permissions and hard links still point to that file.
"Update the contents" seems much closer to the expected behavior of an editor than does "write a new file", so its good from that point of view.
But from an efficiency point of view its much more work. Probably not a problem on a local filesystem, but on a remote filesystem it requires three transfers of the data instead of one, read the old file and write the backup then write the new file. As I only use remote filesystems on fast networks I can't say how important this is likely to be, but for big files/lots of files it may be a problem, also I'm thinking web sites edited via ftp, ssh etc. This seems to be the use case for most of the performance complaints.
If making a backup file is selected when the file is opened then the write of the backup happens then, uses the data read for the open and doesn't need to happen at save time, it could even be asynchronous so long as it was checked for correct completion before saving. This would reduce the user visible performance impact.
Personally I would implement both and let the user choose (especially as g_file_set_contents exists).
Cheers Lex
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Perhaps. But bugs filed to gnome and kde may take years to resolve...
...but if nobody files them, they are likely to never be fixed.
Regards, Colomban _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
Le 05/11/2010 00:27, Lex Trotman a écrit :
On 5 November 2010 09:37, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
Well... stop me if I'm saying bullshit but what about:
backup = path + '~' # or whatever if copy (path, backup_path): # COPY file to the backup if not write (path, data): # write in original file move (backup_path, path) # if write failed, restore backup elif not make_backup: unlink (backup_path) # if no backup is to be done, delete backup
The problems I see are:
- needs at least 2 times the size of the file in the target directory
(but it's the same anytime we bake backups) 2) backup file may have altered permissions/attrs 3) ...so if write fails, we have changed the permissions/attrs on the original file In the worst case, the original file has another name and lost its permissions/attrs.
Hum, I found another problem (but seems to apply more or less to every safe-file-saving methods): write access to the directory containing the file to save is needed (to create the backup or restore it if write failed).
The advantages I see are:
- the permissions/attrs are kept on success, as well as hard links
- no data can be lost (if the FS is reliable)
The key idea is not to move the original file but to copy it. So we can safely overwrite the original (and the keep permissions/attrs), once the backup is done.
Thoughts?
Hi,
It seems to me that g_file_set_contents behavior can be described as "write new file with the name of an existing file" so new file permissions and hard links point to the old file.
Columbans suggestion can be described as "update the contents of an existing file" so old permissions and hard links still point to that file.
Note that even if it has problems, g_file_replace_contents() seems to keep the attributes quite correctly.
"Update the contents" seems much closer to the expected behavior of an editor than does "write a new file", so its good from that point of view.
But from an efficiency point of view its much more work. Probably not a problem on a local filesystem, but on a remote filesystem it requires three transfers of the data instead of one, read the old file and write the backup then write the new file. As I only use remote filesystems on fast networks I can't say how important this is likely to be, but for big files/lots of files it may be a problem, also I'm thinking web sites edited via ftp, ssh etc. This seems to be the use case for most of the performance complaints.
Well, right. I didn't thought that copy is unlikely to be done by the remote machine (is a filesystem/GVFS-backends clever enough to support copies?)... So yes, it would probably be a significant overhead on non-local files. Needs some testing probably.
If making a backup file is selected when the file is opened then the write of the backup happens then, uses the data read for the open and doesn't need to happen at save time, it could even be asynchronous so long as it was checked for correct completion before saving. This would reduce the user visible performance impact.
Theoretically yes, but I would not do that because another application may have changed the file since we loaded it and we may not have noticed. I think this looses a bit of the usefulness of a backup.
And it would also create a backup for each and every file that gets opened in Geany, not only the ones that gets saved, so.
Personally I would implement both and let the user choose (especially as g_file_set_contents exists).
Why not. Fast & reliable v.s. even more reliable. But it needs to write two different code paths, and if we want to support direct GIO API (which is probably a good idea IMHO, at least for the GNOME desktop with remote FS) as well as standard C API, we need about 4 code paths.
I join here a small program I wrote to test the idea, you can test it and see if you find problems. Note that this isn't well tested, but still, seems to work fine (both in the idea and in the tests I've done -- basically only no space left and normal).
To switch between plain C/GIO, change the value of the USE_GIO constant on the top of the file.
Regards, Colomban
On 11/05/2010 08:42 AM, Colomban Wendling wrote:
But from an efficiency point of view its much more work. Probably not a problem on a local filesystem, but on a remote filesystem it requires three transfers of the data instead of one, read the old file and write the backup then write the new file. As I only use remote filesystems on fast networks I can't say how important this is likely to be, but for big files/lots of files it may be a problem, also I'm thinking web sites edited via ftp, ssh etc. This seems to be the use case for most of the performance complaints.
Well, right. I didn't thought that copy is unlikely to be done by the remote machine (is a filesystem/GVFS-backends clever enough to support copies?)... So yes, it would probably be a significant overhead on non-local files. Needs some testing probably.
At the risk of making an obvious suggestion, why not use "move/rename" instead of "copy"? (I searched the list archives and couldn't find anything.) That eliminates efficiency concerns. And the only problem I can think of is file locking; If a move/rename fails (perhaps the file is locked by another process?) then fall-back to copying the file contents. If the file is locked, though, I wouldn't guarantee that writing the contents of the file would work, anyway.
If making a backup file is selected when the file is opened then the write of the backup happens then, uses the data read for the open and doesn't need to happen at save time, it could even be asynchronous so long as it was checked for correct completion before saving. This would reduce the user visible performance impact.
Theoretically yes, but I would not do that because another application may have changed the file since we loaded it and we may not have noticed. I think this looses a bit of the usefulness of a backup.
Using move/rename avoids this concern.
Personally I would implement both and let the user choose (especially as g_file_set_contents exists).
Why not. Fast& reliable v.s. even more reliable. But it needs to write two different code paths, and if we want to support direct GIO API (which is probably a good idea IMHO, at least for the GNOME desktop with remote FS) as well as standard C API, we need about 4 code paths.
That's one point I haven't researched: does GIO not support move/rename directly? That would be a silly thing to overlook, but I'll admit the possibility.
On 11/05/2010 12:07 PM, Dimitar Zhekov wrote:
On Fri, 05 Nov 2010 10:57:23 -0700 Jason Osterjason.oster@campnavajo.com wrote:
At the risk of making an obvious suggestion, why not use "move/rename" instead of "copy"? (I searched the list archives and couldn't find anything.)
Because when you move/rename filename to filename~, and create a new filename, the permissions and links will be lost.
Understandable. So you would end up with lost permissions/etc. on the first backup, and lose even those on later backups. Which is where the safe glib method comes in, I suppose. Thanks for the clarification.
Le 05/11/2010 20:08, Dimitar Zhekov a écrit :
On Thu, 04 Nov 2010 23:37:28 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a ecrit :
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
Well... stop me if I'm saying bullshit but what about:
No, you are not; I should have been more specific. It's the only way which requires 2x disk space (that is unavoidable) but not 2x I/O. For Geany the latter doesn't matter so much, but glib is a general purpose library, and that's why it includes a safe method that modifies the metadata and links, and an unsafe method (documented as "the safest way possible", no less!), which preserves them.
Our choices are:
- The unsafe old or GIO method. Already supported.
- The safe glib method which modifies the metadata/links. Supported
for GIO = false, easy to include for GIO = true. 3. Our own method, which is safe, but generates 2x the I/O (minimum). 4. Our own method, based on the safe glib method, which preserves at least the ownership and permissions (but even that fails sometimes).
backup = path + '~' # or whatever if copy (path, backup_path): # COPY file to the backup if not write (path, data): # write in original file move (backup_path, path) # if write failed, restore backup elif not make_backup: unlink (backup_path) # if no backup is to be done, delete backup
That'll be fine if backups are on, but as Lex pointed out, does too much I/O if they are off. So for backups = false we can:
a. Create filename-foo, write data to it, abort and unlink on failure. b. Write data to filename. On failure, rename filename-foo to filename, and notify the user that the metadata and links are lost. c. If even the rename fails, which is highly unlikely for files in the same directory, inform the user that the original file is _probably_ lost, and if so, filename-foo contains the data, but lacks the metadata and links.
Hey, that's quite clever :) We would then avoid to have to read the original file. However if the original file size is less than half the new data, the IO are worst [1]... but I think that this is not relevant for a text editor.
[1] assuming read and write has the same cost. No idea if it is likely to be true or not.
(probably, because remote filesystems may do the rename but fail to report it; that's specially mentioned for NFS in rename(2):BUGS)
Ouch. If we can't even trust the FS, what to trust? So I guess, "let's hope it'll work".
(filename-foo and not filename~, because we don't our insurance file containing the _new_ data to be replaced by some accidential backup, be it Geany's (if the user subsequently turned backups on) or cp's, mv's etc.)
The problems I see are:
- needs at least 2 times the size of the file in the target directory (but it's the same anytime we bake backups)
Unavoidable.
- backup file may have altered permissions/attrs
...links, ACL-s etc. Unavoidable.
- ...so if write fails, we have changed the permissions/attrs on the original file
Unavoidable.
In the worst case, the original file has another name and lost its permissions/attrs.
Unavoidable. Alas, no miracles for us.
So this solution seems quite interesting :)
Being a bit lazy, I'd prefer to include the safe glib method in the GIO code path (with the use_safe_file_saving option, as before) and be done with it. So for limited disk space and/or unstable connections, just turn the option on, we even have a GUI for that now. It's a compromise, but what isn't?..
I'm tempted to be of your opinion -- but as already said in another mail, I'm not really concerned as far as local files got saved correctly.
Regards, Colomban
On Fri, 05 Nov 2010 10:57:23 -0700 Jason Oster jason.oster@campnavajo.com wrote:
On 11/05/2010 08:42 AM, Colomban Wendling wrote:
But from an efficiency point of view its much more work. Probably not a problem on a local filesystem, but on a remote filesystem it requires three transfers of the data instead of one, read the old file and write the backup then write the new file.
At the risk of making an obvious suggestion, why not use "move/rename" instead of "copy"? (I searched the list archives and couldn't find anything.)
Because when you move/rename filename to filename~, and create a new filename, the permissions and links will be lost.
On Thu, 04 Nov 2010 23:37:28 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a ecrit :
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
Well... stop me if I'm saying bullshit but what about:
No, you are not; I should have been more specific. It's the only way which requires 2x disk space (that is unavoidable) but not 2x I/O. For Geany the latter doesn't matter so much, but glib is a general purpose library, and that's why it includes a safe method that modifies the metadata and links, and an unsafe method (documented as "the safest way possible", no less!), which preserves them.
Our choices are:
1. The unsafe old or GIO method. Already supported. 2. The safe glib method which modifies the metadata/links. Supported for GIO = false, easy to include for GIO = true. 3. Our own method, which is safe, but generates 2x the I/O (minimum). 4. Our own method, based on the safe glib method, which preserves at least the ownership and permissions (but even that fails sometimes).
backup = path + '~' # or whatever if copy (path, backup_path): # COPY file to the backup if not write (path, data): # write in original file move (backup_path, path) # if write failed, restore backup elif not make_backup: unlink (backup_path) # if no backup is to be done, delete backup
That'll be fine if backups are on, but as Lex pointed out, does too much I/O if they are off. So for backups = false we can:
a. Create filename-foo, write data to it, abort and unlink on failure. b. Write data to filename. On failure, rename filename-foo to filename, and notify the user that the metadata and links are lost. c. If even the rename fails, which is highly unlikely for files in the same directory, inform the user that the original file is _probably_ lost, and if so, filename-foo contains the data, but lacks the metadata and links.
(probably, because remote filesystems may do the rename but fail to report it; that's specially mentioned for NFS in rename(2):BUGS)
(filename-foo and not filename~, because we don't our insurance file containing the _new_ data to be replaced by some accidential backup, be it Geany's (if the user subsequently turned backups on) or cp's, mv's etc.)
The problems I see are:
- needs at least 2 times the size of the file in the target directory (but it's the same anytime we bake backups)
Unavoidable.
- backup file may have altered permissions/attrs
...links, ACL-s etc. Unavoidable.
- ...so if write fails, we have changed the permissions/attrs on the original file
Unavoidable.
In the worst case, the original file has another name and lost its permissions/attrs.
Unavoidable. Alas, no miracles for us.
Being a bit lazy, I'd prefer to include the safe glib method in the GIO code path (with the use_safe_file_saving option, as before) and be done with it. So for limited disk space and/or unstable connections, just turn the option on, we even have a GUI for that now. It's a compromise, but what isn't?..
On Fri, 05 Nov 2010 19:50:59 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 05/11/2010 20:08, Dimitar Zhekov a ecrit :
a. Create filename-foo, write data to it, abort and unlink on failure. [...]
Hey, that's quite clever :) We would then avoid to have to read the original file.
It's a well-known algorithm, but there is a better one, assuming that the underlying I/O system supports open for read-write and truncate. I checked GIO, and it does: g_file_open_readwrite, g_seekable_truncate. :D
1. Open the file for reading and writing. 2. If the new data is longer, append the extra part only (thus claiming the required disk space). If that fails, truncate to the original size and abort. 3. Write the data (without the extra part, if any). 4. If the new data is shorter, truncate.
That's almost 100% safe (there is no apparent reason for truncate to a smaller size to fail), preserves everything, no extra disk space, not even extra I/O.
Of course, not all GIO files will be truncatable, that can be checked with g_seekable_can_truncate. But why g_file_replace() for truncatables isn't implemented like above is beyond me.
Le 05/11/2010 21:33, Dimitar Zhekov a écrit :
On Fri, 05 Nov 2010 19:50:59 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 05/11/2010 20:08, Dimitar Zhekov a ecrit :
a. Create filename-foo, write data to it, abort and unlink on failure. [...]
Hey, that's quite clever :) We would then avoid to have to read the original file.
It's a well-known algorithm, but there is a better one, assuming that the underlying I/O system supports open for read-write and truncate. I checked GIO, and it does: g_file_open_readwrite, g_seekable_truncate. :D
- Open the file for reading and writing.
- If the new data is longer, append the extra part only (thus claiming
the required disk space). If that fails, truncate to the original size and abort. 3. Write the data (without the extra part, if any). 4. If the new data is shorter, truncate.
That's almost 100% safe (there is no apparent reason for truncate to a smaller size to fail), preserves everything, no extra disk space, not even extra I/O.
Even if it seems really interesting, I'm not sure it is really reliable. What if the program terminates (power loss, whatever) during the final write? What if a network connection is broken before we truncate? Since the whole operation is not atomic, it's not 100% safe, and I still believe that the backup is needed.
This said, it could be a good implementation for non-safe file saving, that would be safer than without. But that's probably much work for less gain IMHO.
Regards, Colomban
On 6 November 2010 04:57, Jason Oster jason.oster@campnavajo.com wrote:
On 11/05/2010 08:42 AM, Colomban Wendling wrote:
But from an efficiency point of view its much more work. Probably not a problem on a local filesystem, but on a remote filesystem it requires three transfers of the data instead of one, read the old file and write the backup then write the new file. As I only use remote filesystems on fast networks I can't say how important this is likely to be, but for big files/lots of files it may be a problem, also I'm thinking web sites edited via ftp, ssh etc. This seems to be the use case for most of the performance complaints.
Well, right. I didn't thought that copy is unlikely to be done by the remote machine (is a filesystem/GVFS-backends clever enough to support copies?)...
@Columban,
Also is the remote filesystem clever enough to support local copies?? EG NFS does not IIUC.
Cheers Lex
So yes, it would probably be a significant overhead on non-local files. Needs some testing probably.
At the risk of making an obvious suggestion, why not use "move/rename"
@ Jason,
If you move/rename to the backup you remove the original file, so it would have to be created again (thats what gtk_set_contents does). But the new file is created with different permissions from the original file.
And changing the permissions can be *very* annoying if it keeps taking execute off your scripts each time you edit them :-( Thats what we are trying to avoid.
Cheers Lex
instead of "copy"? (I searched the list archives and couldn't find anything.) That eliminates efficiency concerns. And the only problem I can think of is file locking; If a move/rename fails (perhaps the file is locked by another process?) then fall-back to copying the file contents. If the file is locked, though, I wouldn't guarantee that writing the contents of the file would work, anyway.
If making a backup file is selected when the file is opened then the write of the backup happens then, uses the data read for the open and doesn't need to happen at save time, it could even be asynchronous so long as it was checked for correct completion before saving. This would reduce the user visible performance impact.
Theoretically yes, but I would not do that because another application may have changed the file since we loaded it and we may not have noticed. I think this looses a bit of the usefulness of a backup.
Using move/rename avoids this concern.
Personally I would implement both and let the user choose (especially as g_file_set_contents exists).
Why not. Fast& reliable v.s. even more reliable. But it needs to write two different code paths, and if we want to support direct GIO API (which is probably a good idea IMHO, at least for the GNOME desktop with remote FS) as well as standard C API, we need about 4 code paths.
That's one point I haven't researched: does GIO not support move/rename directly? That would be a silly thing to overlook, but I'll admit the possibility.
Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
The documentation of g_file_replace() may state "will try to replace the file in the safest way possible", but I don't believe in miracles, much less in ones from the glib developers.
While searching to report the bug (see below), I found that this isn't that simple. Actually, g_file_replace() itself seems quite safe, but g_file_replace_contents() isn't.
g_file_replace() use a probably correct backup strategy, and returns a stream to write to in a safe way. This stream will actually point to a temporary file (or likely, seems there are 2 different strategies), and this file will be moved automatically to the final place when the stream get closed/destructed. The problem is that there is no API yet to abort the replacement, and so prevent the final move to be done.
Actually, this is safe if e.g. the caller crashes during write, or anything that prevents the stream to be closed, and so the file to be moved...
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Perhaps. But bugs filed to gnome and kde may take years to resolve...
The bug is already reported, see https://bugzilla.gnome.org/show_bug.cgi?id=602412 Sadly, this bug was reported on 2009-11 and last activity was 2010-04...
Le 04/11/2010 18:53, Nick Treleaven a écrit :
Also, if the argument for backups is set perhaps disk exhaustion is handled: http://library.gnome.org/devel/gio/2.25/GFile.html#g-file-replace-contents
More or less. The backup file will be kept, but the "original" might still be lost. The data isn't completely lost so, but the user might not notice the backup is here (and it will not be restored).
Regards, Colomban
On 6 November 2010 10:43, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
The documentation of g_file_replace() may state "will try to replace the file in the safest way possible", but I don't believe in miracles, much less in ones from the glib developers.
While searching to report the bug (see below), I found that this isn't that simple. Actually, g_file_replace() itself seems quite safe, but g_file_replace_contents() isn't.
Well, it doesn't say much in the doc, but why would it not use g_file_replace and so be just as safe??
g_file_replace() use a probably correct backup strategy, and returns a stream to write to in a safe way. This stream will actually point to a temporary file (or likely, seems there are 2 different strategies), and this file will be moved automatically to the final place when the stream get closed/destructed.
And then we get the same permissions problem that glibs g_file_set_contents has which was where we came in....
The problem is that there is no API yet to abort the replacement, and so prevent the final move to be done.
Actually, this is safe if e.g. the caller crashes during write, or anything that prevents the stream to be closed, and so the file to be moved...
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Perhaps. But bugs filed to gnome and kde may take years to resolve...
The bug is already reported, see https://bugzilla.gnome.org/show_bug.cgi?id=602412 Sadly, this bug was reported on 2009-11 and last activity was 2010-04...
Par for the course, if no one is worried enough to fix it then why should the Gnome guys, its open source after all ....
Le 04/11/2010 18:53, Nick Treleaven a écrit :
Also, if the argument for backups is set perhaps disk exhaustion is handled: http://library.gnome.org/devel/gio/2.25/GFile.html#g-file-replace-contents
More or less. The backup file will be kept, but the "original" might still be lost. The data isn't completely lost so, but the user might not notice the backup is here (and it will not be restored).
I thought the sequence with backup would be:
1. write new contents to temp file, uses space 2. if successful rename old file to backup, uses no space 3. rename temp file to filename, uses no space
so if the write fails for exhaustion the old file is still there but no backup, what we want.
If one of the renames fails the state could be confusing, with a temp file and no old file, or a backup and no new file but that needs a filesystem crash to lose metadata right in the rename, *very* rare especially with journaled filesystems.
Cheers Lex
Regards, Colomban _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
Le 06/11/2010 01:58, Lex Trotman a écrit :
On 6 November 2010 10:43, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
The documentation of g_file_replace() may state "will try to replace the file in the safest way possible", but I don't believe in miracles, much less in ones from the glib developers.
While searching to report the bug (see below), I found that this isn't that simple. Actually, g_file_replace() itself seems quite safe, but g_file_replace_contents() isn't.
Well, it doesn't say much in the doc, but why would it not use g_file_replace and so be just as safe??
g_file_replace() just /starts/ a replace operation [1]. It returns a stream to which you probably want to write data, but it doesn't do it. Getting this stream is safe, and writing to it is also safe.
g_file_replace_contents(), however, also writes to the stream, then close it.
The problem is that if writing to the stream failed, there is no (non-hackish) way to abort the final step of the replacement -- moving the temp file to the original one -- which is done on stream closing.
I know this is a bit confusing at first glance... let's do a small picture:
1) create a stream that points to a temp file 2) write data to this stream 3) move the temp file over the original one
1 is done by g_file_replace(); 2 is done with something like g_output_stream_write(); 3 is "scheduled" by g_file_replace() and is done upon stream closing (g_output_stream_close() or g_object_unref() that calls the former).
...and if 1 succeeded, we can't prevent 3 to happen, even if 2 failed.
A small example is better that thousand words:
ostream = g_file_replace(file, ..., &err); if (ostream) { if (! g_output_stream_write_all (ostream, data, length, ..., &err)) { /* here's the problem, we don't have * g_output_stream_abort_replace() */ } else { /* the final part of the replacement (3) will happen here, when we * close the stream. * note that the g_object_unref() call above would do this * automatically and we do it manually only to get possible * errors. */ success = g_output_stream_close (ostream, &err); } g_object_unref (ostream); }
g_file_replace() use a probably correct backup strategy, and returns a stream to write to in a safe way. This stream will actually point to a temporary file (or likely, seems there are 2 different strategies), and this file will be moved automatically to the final place when the stream get closed/destructed.
And then we get the same permissions problem that glibs g_file_set_contents has which was where we came in....
Theoretically, yes. But GIO seems to keep attributes quite decently (at least the permissions are copied, I haven't checked more).
The problem is that there is no API yet to abort the replacement, and so prevent the final move to be done.
Actually, this is safe if e.g. the caller crashes during write, or anything that prevents the stream to be closed, and so the file to be moved...
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Perhaps. But bugs filed to gnome and kde may take years to resolve...
The bug is already reported, see https://bugzilla.gnome.org/show_bug.cgi?id=602412 Sadly, this bug was reported on 2009-11 and last activity was 2010-04...
Par for the course, if no one is worried enough to fix it then why should the Gnome guys, its open source after all ....
I'd think that maintainers should care a bit more than others, but basically you're right.
Le 04/11/2010 18:53, Nick Treleaven a écrit :
Also, if the argument for backups is set perhaps disk exhaustion is handled: http://library.gnome.org/devel/gio/2.25/GFile.html#g-file-replace-contents
More or less. The backup file will be kept, but the "original" might still be lost. The data isn't completely lost so, but the user might not notice the backup is here (and it will not be restored).
I thought the sequence with backup would be:
- write new contents to temp file, uses space
- if successful rename old file to backup, uses no space
- rename temp file to filename, uses no space
so if the write fails for exhaustion the old file is still there but no backup, what we want.
If one of the renames fails the state could be confusing, with a temp file and no old file, or a backup and no new file but that needs a filesystem crash to lose metadata right in the rename, *very* rare especially with journaled filesystems.
That's (probably, not really checked) right, but that's not really the concern since the one that move/rename files doesn't know of any write error (see above).
Hope it's a little bit more clear -- though I'm not sure I'm really clear at 3 and half AM :D
Regards, Colomban
On 6 November 2010 13:30, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 06/11/2010 01:58, Lex Trotman a écrit :
On 6 November 2010 10:43, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
The documentation of g_file_replace() may state "will try to replace the file in the safest way possible", but I don't believe in miracles, much less in ones from the glib developers.
While searching to report the bug (see below), I found that this isn't that simple. Actually, g_file_replace() itself seems quite safe, but g_file_replace_contents() isn't.
Well, it doesn't say much in the doc, but why would it not use g_file_replace and so be just as safe??
Ok, I wasn't clear, rather than saying replace_contents is just as safe lets say just as unsafe!!!
In both cases there is no way of preventing the temp file being renamed over the original. See where there is lots of ?? below.
g_file_replace() just /starts/ a replace operation [1]. It returns a stream to which you probably want to write data, but it doesn't do it. Getting this stream is safe, and writing to it is also safe.
g_file_replace_contents(), however, also writes to the stream, then close it.
The problem is that if writing to the stream failed, there is no (non-hackish) way to abort the final step of the replacement -- moving the temp file to the original one -- which is done on stream closing.
I know this is a bit confusing at first glance... let's do a small picture:
I understand it I've looked at the source too
- create a stream that points to a temp file
- write data to this stream
- move the temp file over the original one
1 is done by g_file_replace(); 2 is done with something like g_output_stream_write(); 3 is "scheduled" by g_file_replace() and is done upon stream closing (g_output_stream_close() or g_object_unref() that calls the former).
...and if 1 succeeded, we can't prevent 3 to happen, even if 2 failed.
A small example is better that thousand words:
ostream = g_file_replace(file, ..., &err); if (ostream) { if (! g_output_stream_write_all (ostream, data, length, ..., &err)) { /* here's the problem, we don't have * g_output_stream_abort_replace() */
Lets say you are doing this yourself, not in g_file_replace_contents.
???????? Now what are you going to do, the rename will still happen when the stream is closed or unrefed.
So we can't use g_file_replace, have to do it all ourselves :-(
Second the comment above about glib developers :-)
} else { /* the final part of the replacement (3) will happen here, when we * close the stream. * note that the g_object_unref() call above would do this * automatically and we do it manually only to get possible * errors. */ success = g_output_stream_close (ostream, &err); } g_object_unref (ostream); }
g_file_replace() use a probably correct backup strategy, and returns a stream to write to in a safe way. This stream will actually point to a temporary file (or likely, seems there are 2 different strategies), and this file will be moved automatically to the final place when the stream get closed/destructed.
And then we get the same permissions problem that glibs g_file_set_contents has which was where we came in....
Theoretically, yes. But GIO seems to keep attributes quite decently (at least the permissions are copied, I haven't checked more).
In fact if if can't set the permissions it switches to your suggestion of copy the file to the backup and then truncate the original.
Of course that has the associated performance issues but is more correct if the "fast" method didn't work. But note that the backup is now made when the file is opened, not when it is closed.
The problem is that there is no API yet to abort the replacement, and so prevent the final move to be done.
Actually, this is safe if e.g. the caller crashes during write, or anything that prevents the stream to be closed, and so the file to be moved...
Perhaps you might like to file a bug against GIO. Perhaps first use gdb to break on that function just to be absolutely sure.
Perhaps. But bugs filed to gnome and kde may take years to resolve...
The bug is already reported, see https://bugzilla.gnome.org/show_bug.cgi?id=602412 Sadly, this bug was reported on 2009-11 and last activity was 2010-04...
Par for the course, if no one is worried enough to fix it then why should the Gnome guys, its open source after all ....
I'd think that maintainers should care a bit more than others, but basically you're right.
Sure but no one has asked for ages until you did,
They have limited time and use the amount of noise and status of the reporter to decide priorities. If the thread is very quiet then no one wants it badly.
Le 04/11/2010 18:53, Nick Treleaven a écrit :
Also, if the argument for backups is set perhaps disk exhaustion is handled: http://library.gnome.org/devel/gio/2.25/GFile.html#g-file-replace-contents
More or less. The backup file will be kept, but the "original" might still be lost. The data isn't completely lost so, but the user might not notice the backup is here (and it will not be restored).
I thought the sequence with backup would be:
- write new contents to temp file, uses space
- if successful rename old file to backup, uses no space
- rename temp file to filename, uses no space
so if the write fails for exhaustion the old file is still there but no backup, what we want.
If one of the renames fails the state could be confusing, with a temp file and no old file, or a backup and no new file but that needs a filesystem crash to lose metadata right in the rename, *very* rare especially with journaled filesystems.
That's (probably, not really checked) right, but that's not really the concern since the one that move/rename files doesn't know of any write error (see above).
Yes we have to do this ourselves so we don't do the rename when the write fails, which includes having to do all the annoying permissions fixes, checks for non-regular files etc.
According to the comment in GIO source this stuff is copied from gedit. IIRC at the beginning of the loooong series of several threads gedit was noted as working better. Maybe just copy it, we are all GPL here.
In fact all in all, as far as I can see GIO isn't much help for this problem :-)
Hope it's a little bit more clear -- though I'm not sure I'm really clear at 3 and half AM :D
Only 1430 for me.
Lack of sleep is bad for you so I have delayed this reply for some time so hopefully you have gone to bed and I can't be blamed for keeping you up :-D
Cheers Lex
Regards, Colomban _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Fri, 05 Nov 2010 23:08:57 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 05/11/2010 21:33, Dimitar Zhekov a ecrit :
On Fri, 05 Nov 2010 19:50:59 +0100
- Open the file for reading and writing.
- If the new data is longer, append the extra part only (thus claiming
the required disk space). If that fails, truncate to the original size and abort. 3. Write the data (without the extra part, if any). 4. If the new data is shorter, truncate.
That's almost 100% safe (there is no apparent reason for truncate to a smaller size to fail), preserves everything, no extra disk space, not even extra I/O.
Even if it seems really interesting, I'm not sure it is really reliable. What if the program terminates (power loss, whatever) during the final write? What if a network connection is broken before we truncate?
In any given moment, the file is either OK, or it contains the old data followed by garbade. The latter should be reported to the user.
Since the whole operation is not atomic, it's not 100% safe, and I still believe that the backup is needed. This said, it could be a good implementation for non-safe file saving, that would be safer than without.
Of course, I propose this only for use_safe_file_saving = FALSE.
But that's probably much work for less gain IMHO.
I'll write a non-GIO variant first, as a proof of concept. The current non-GIO is buggy anyway. First:
if (G_UNLIKELY(len != bytes_written)) err = errno;
but fwrite() is not guaranteed to set errno, only write() is.
Second, and more important, the result of fclose() is not checked, for a buffered file stream. On lack of disk space, on my system fwrite() happily returns written == len, but fclose() fails. YMMV.
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
On 6 November 2010 20:33, Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
On Fri, 05 Nov 2010 23:08:57 +0100 Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 05/11/2010 21:33, Dimitar Zhekov a ecrit :
On Fri, 05 Nov 2010 19:50:59 +0100
- Open the file for reading and writing.
I'm not sure all non-local filesystems support read/write or they support it via local caching which isn't safe until flushed to the remote system
- If the new data is longer, append the extra part only (thus claiming
the required disk space).
A remote filesystem might not allocate it on the remote system only on the local cache until the file is closed.
If that fails, truncate to the original size and abort. 3. Write the data (without the extra part, if any).
This requires a seekable filesystem, again not sure all remote systems support it.
- If the new data is shorter, truncate.
That's almost 100% safe (there is no apparent reason for truncate to a smaller size to fail), preserves everything, no extra disk space, not even extra I/O.
Even if it seems really interesting, I'm not sure it is really reliable. What if the program terminates (power loss, whatever) during the final write? What if a network connection is broken before we truncate?
In any given moment, the file is either OK, or it contains the old data followed by garbade. The latter should be reported to the user.
Again its not ok if it only exists in the local cache and the write back to the remote fs is interrupted.
Since the whole operation is not atomic, it's not 100% safe, and I still believe that the backup is needed. This said, it could be a good implementation for non-safe file saving, that would be safer than without.
Of course, I propose this only for use_safe_file_saving = FALSE.
Its advantage is minimal transfers and no extra space in the no backup case, but doesn't happen with the backup case, the transfers are the same as just writing the file and so is the disk space.
So if you are feeling really lucky and want to use unsafe saving and no backup and the filesystem supports it then its an advantage.
The problem is that on local filesystems where its safe is where you don't need the performance and on remote filesystems where you need the performance you also need the safety. Can't win.
But that's probably much work for less gain IMHO.
I'll write a non-GIO variant first, as a proof of concept. The current non-GIO is buggy anyway. First:
if (G_UNLIKELY(len != bytes_written)) err = errno;
but fwrite() is not guaranteed to set errno, only write() is.
Even write's behavoir is filesystem dependent IIUC.
Second, and more important, the result of fclose() is not checked, for a buffered file stream. On lack of disk space, on my system fwrite() happily returns written == len, but fclose() fails. YMMV.
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
Yes.
Cheers Lex
-- E-gards: Jimmy _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Sat, 6 Nov 2010 21:31:31 +1100 Lex Trotman elextr@gmail.com wrote:
On 6 November 2010 20:33, Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
- Open the file for reading and writing.
I'm not sure all non-local filesystems support read/write or they support it via local caching which isn't safe until flushed to the remote system [...] The problem is that on local filesystems where its safe is where you don't need the performance and on remote filesystems where you need the performance you also need the safety. Can't win.
Touchdown. OK, that's not worth.
if (G_UNLIKELY(len != bytes_written)) err = errno;
but fwrite() is not guaranteed to set errno, only write() is.
Even write's behavoir is filesystem dependent IIUC.
Yes, but the documented behaviour of write(2) is to set errno, while fwrite(2) does not. It's the same even for MSVCRT.
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
Yes.
Attached. I also rearranged the code a bit to restore the safe saving support for Geany compiled with GIO. Since g_file_replace_contents() is unsafe, there would be no harm have the old safe saving, at least until we resolve the matter.
Le 06/11/2010 13:45, Dimitar Zhekov a écrit :
On Sat, 6 Nov 2010 21:31:31 +1100 Lex Trotman elextr@gmail.com wrote:
On 6 November 2010 20:33, Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
Yes.
Attached. I also rearranged the code a bit to restore the safe saving support for Geany compiled with GIO. Since g_file_replace_contents() is unsafe, there would be no harm have the old safe saving, at least until we resolve the matter.
If I read the patch correctly, it seems not to be against the latest SVN rev, and lacks the g_object_unref(fp) call in the GIO part ;) Otherwise, seems good IMHO -- not tested yet though.
Regards, Colomban
Le 06/11/2010 06:28, Lex Trotman a écrit :
On 6 November 2010 13:30, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 06/11/2010 01:58, Lex Trotman a écrit :
On 6 November 2010 10:43, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
The documentation of g_file_replace() may state "will try to replace the file in the safest way possible", but I don't believe in miracles, much less in ones from the glib developers.
While searching to report the bug (see below), I found that this isn't that simple. Actually, g_file_replace() itself seems quite safe, but g_file_replace_contents() isn't.
Well, it doesn't say much in the doc, but why would it not use g_file_replace and so be just as safe??
Ok, I wasn't clear, rather than saying replace_contents is just as safe lets say just as unsafe!!!
Yes, finally it's unsafe (even though it tries to be safe).
Ironically, notice that the docs of g_file_replace_contents() doesn't speak of any safety.
- create a stream that points to a temp file
- write data to this stream
- move the temp file over the original one
1 is done by g_file_replace(); 2 is done with something like g_output_stream_write(); 3 is "scheduled" by g_file_replace() and is done upon stream closing (g_output_stream_close() or g_object_unref() that calls the former).
...and if 1 succeeded, we can't prevent 3 to happen, even if 2 failed.
A small example is better that thousand words:
ostream = g_file_replace(file, ..., &err); if (ostream) { if (! g_output_stream_write_all (ostream, data, length, ..., &err)) { /* here's the problem, we don't have * g_output_stream_abort_replace() */
Lets say you are doing this yourself, not in g_file_replace_contents.
???????? Now what are you going to do, the rename will still happen when the stream is closed or unrefed.
That's the problem, yep. Same for g_file_replace_contents() BTW, which is basically a wrapper that does the above by itself. So yes, can't use it for safe saving until it get fixed.
BTW there are possible hackish workarounds (GEdit has one, I implemented a test case with another) to simulate an abortion, but it uses some internal implementation details to do so, nothing good. Moreover, it leaves the temporary file on disk.
So we can't use g_file_replace, have to do it all ourselves :-(
Second the comment above about glib developers :-)
...I also feel really strange seeing that there was an effort for g_file_replace() to provide correct safety, but none to be able to use this safety. Moreover when g_file_replace_contents() exists.
Le 04/11/2010 18:53, Nick Treleaven a écrit :
Also, if the argument for backups is set perhaps disk exhaustion is handled: http://library.gnome.org/devel/gio/2.25/GFile.html#g-file-replace-contents
More or less. The backup file will be kept, but the "original" might still be lost. The data isn't completely lost so, but the user might not notice the backup is here (and it will not be restored).
I thought the sequence with backup would be:
- write new contents to temp file, uses space
- if successful rename old file to backup, uses no space
- rename temp file to filename, uses no space
so if the write fails for exhaustion the old file is still there but no backup, what we want.
If one of the renames fails the state could be confusing, with a temp file and no old file, or a backup and no new file but that needs a filesystem crash to lose metadata right in the rename, *very* rare especially with journaled filesystems.
That's (probably, not really checked) right, but that's not really the concern since the one that move/rename files doesn't know of any write error (see above).
Yes we have to do this ourselves so we don't do the rename when the write fails, which includes having to do all the annoying permissions fixes, checks for non-regular files etc.
According to the comment in GIO source this stuff is copied from gedit. IIRC at the beginning of the loooong series of several threads gedit was noted as working better. Maybe just copy it, we are all GPL here.
I've not looked at all the GEdit sources, but at least a part uses GIO's g_file_replace() and a hack to prevent renaming to happen on write error.
But yes, if we can re-use the code, it's just less effort and everybody's happy :)
In fact all in all, as far as I can see GIO isn't much help for this problem :-)
...until it gets fixed, yes.
Hope it's a little bit more clear -- though I'm not sure I'm really clear at 3 and half AM :D
Only 1430 for me.
Lack of sleep is bad for you so I have delayed this reply for some time so hopefully you have gone to bed and I can't be blamed for keeping you up :-D
Thank you very much, Lex! :D
Regards, Colomban
2010/11/6 Dimitar Zhekov dimitar.zhekov@gmail.com:
On Sat, 6 Nov 2010 21:31:31 +1100 Lex Trotman elextr@gmail.com wrote:
On 6 November 2010 20:33, Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
- Open the file for reading and writing.
I'm not sure all non-local filesystems support read/write or they support it via local caching which isn't safe until flushed to the remote system [...] The problem is that on local filesystems where its safe is where you don't need the performance and on remote filesystems where you need the performance you also need the safety. Can't win.
Touchdown. OK, that's not worth.
if (G_UNLIKELY(len != bytes_written)) err = errno;
but fwrite() is not guaranteed to set errno, only write() is.
Even write's behavoir is filesystem dependent IIUC.
Yes, but the documented behaviour of write(2) is to set errno, while fwrite(2) does not.
Actually SUS2 and POSIX both require it to set errno, its just Linux & MS who don't :-S
It's the same even for MSVCRT.
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
Yes.
Attached. I also rearranged the code a bit to restore the safe saving support for Geany compiled with GIO. Since g_file_replace_contents() is unsafe, there would be no harm have the old safe saving, at least until we resolve the matter.
Looks generally ok, but also havn't tried it.
-- E-gards: Jimmy
Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
To summarize a long thread I'd like to propose the following process. Its based on GIO but handling fails better.
In pseudocode (sorry if it wraps in your mailer) since I'm not sure if we want to implement it in GIO calls, C library calls or system calls. See below.
To write a Geany buffer contents to Target_File
1 if Target_File does not exist, create it (try to do this in one step to avoid races) 2 if ( no backup required and unsafe save ) or Target_File just created then 3 open Target_File as target_stream for writing or use created stream 4 write contents to target_stream 5 if not unsafe save then 6 sync target_stream 7 close target_stream 8 else 9 open temp_file as target_stream for write in same directory as Target_File 10 try to set GID, UID and permissions on temp_file to the same as Target_File 11 if temp_file GID, UID and permissions are not the same as Target_File then 12 open Target_File for read and copy Target_file to target_stream 13 close Target_file and target_stream and sync target_stream 14 open Target_File for write as target_stream 15 write contents to target_stream 16 sync and close target_stream 17 if target_stream was writing to temp_file then 18 if backup required then 19 rename Target_File to backup_file 20 rename temp_file to Target_File 21 else 22 if backup required then 23 rename temp_file to backup_file 24 else 25 delete temp_file
The normal paths do only one data transfer and steps involving filesystem interaction are: first time, unsafe - 1, 3, 4, 7 first time safe - 1, 3, 4, 6, 7 unsafe, no backup - 3, 4, 7 with backup - 9, 10, 11, 15, 16, 19, 20 safe no backup - 9, 10, 11, 15, 16, 20
Fallback if it has trouble with permissions require three data transfers and are are: with backup - 9, 10, 12, 13, 14, 15, 16, 23 without backup - 9, 10, 12, 13, 14, 15, 16, 25
If anything goes wrong at any step except line 10 then give the system error message and an error message as below and close target_stream if its open and any other stream that is open.
Note that this error handling leaves temp file and backup files. This is deliberate to give the best chance of still having the old and new data exist in case Geany crashes before we fix the problem, after all running out of disk may crash the whole system or cause the system to close applications.
State of the disk if failure at the indicated line and suggested error message (in quotes):
1 no Target_File "Can't create Target_File, save unsuccessful" 3, 9 Target_File with old contents "Can't open Target_File for writing, save unsuccessful" 4, 6, 7 Target_File with unknown contents "Error writing Target_File, save unsuccessful" 12, 13 Target_File with old contents, temp_file with unknown contents "Error writing temporary file temp_file, Target_File not changed" 14 Target_File with old contents and temp_file the same "Can't open Target_File for writing, save unsuccessful, temporary file temp_file is backup" 15, 16 if target_stream is writing to temp_file Target_File with old contents, temp_file with unknown contents "Error writing temporary file temp_file, Target_File not changed, save unsuccessful" else temp_file with old contents Target_File with unknown contents "Error writing Target_File, save unsuccessful, temporary file temp_file is backup" 19 Target_File with old contents, temp_file with new contents "Can't rename Target_File to backup_file, temporary file temp_file contains saved data" 20 no Target_File, temp_file with new contents, backup_file with old contents "Can't rename temporary file temp_file containing saved data to Target_File" 23 Target_file with new contents, temp_file with old contents "Save successful, but can't rename temporary file temp_file to backup_file" 24 Target_file with new contents, temp_file with old contents "Save successful, but can't delete temporary file temp_file"
Feel free to throw bricks.
Implementing in GIO but just using create, stream open, sync, close not replace or set contents etc could benefit from improved performance of GIO VFS but I'm not sure that lines 10 and 11 can be done in GIO. Otherwise implement with POSIX system calls & use fdatasync that "should" sync critical metadata as well.
Cheers Lex
On Thu, 4 Nov 2010 22:42:04 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
I've now made this change. I saw your patch that also fixes POSIX writing but haven't read the discussion yet. In any case the two changes are unrelated so should not be committed together.
Nick
On Sun, 7 Nov 2010 13:18:48 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
On Thu, 4 Nov 2010 22:42:04 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
I've now made this change. I saw your patch that also fixes POSIX writing but haven't read the discussion yet.
IMHO, there isn't much worth reading, except probably the 25-step saving proposed by Lex (had no time to read it in detail yet).
In any case the two changes are unrelated so should not be committed together.
Here is the POSIX fix only against the latest svn, if you need it.
On Sat, 6 Nov 2010 11:33:41 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
I'll write a non-GIO variant first, as a proof of concept.
First, I feel quite reluctant to use non-GLib/GIO functions for writing instead of g_file_replace_contents because if there is a better implementation, why don't they add it?
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
The current non-GIO is buggy anyway. First:
if (G_UNLIKELY(len != bytes_written)) err = errno;
but fwrite() is not guaranteed to set errno, only write() is.
Second, and more important, the result of fclose() is not checked, for a buffered file stream. On lack of disk space, on my system fwrite() happily returns written == len, but fclose() fails. YMMV.
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
Thanks for finding these problems. I accept fclose should be checked.
I'm not sure why write is better than fwrite - POSIX says both set errno. http://www.opengroup.org/onlinepubs/009695399/functions/fwrite.html
Can you explain why non-buffered I/O is better?
Nick
P.S. Thanks for sending the updated patch.
On Tue, 9 Nov 2010 13:33:26 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
On Sat, 6 Nov 2010 11:33:41 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
I'll write a non-GIO variant first, as a proof of concept.
First, I feel quite reluctant to use non-GLib/GIO functions for writing instead of g_file_replace_contents because if there is a better implementation, why don't they add it?
That was discussed later and proven as not worth.
The current non-GIO is buggy anyway. First:
if (G_UNLIKELY(len != bytes_written)) err = errno;
but fwrite() is not guaranteed to set errno, only write() is.
Second, and more important, the result of fclose() is not checked, for a buffered file stream. On lack of disk space, on my system fwrite() happily returns written == len, but fclose() fails. YMMV.
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
Thanks for finding these problems. I accept fclose should be checked.
I'm not sure why write is better than fwrite - POSIX says both set errno.
Geany runs under Linux and Windows, possibly also under MAC and BSD. None of Linux fwrite(3), MSVCRT, Darwin fwrite(3) and BSD fwrite(3) mentions anything about setting errno.
http://www.opengroup.org/onlinepubs/009695399/functions/fwrite.html
This is the 2004 Edition. Linux conforms to 2001, BSD/Darwin to 1990, MSVCRT to "ANSI"(?)
Can you explain why non-buffered I/O is better?
Depenting on the implementation, fwrite(data, sizeof(gchar), len, fp) may really try to write <len> single-byte blocks. A large number of buffered writes takes some time under MSVCRT, I've seen it taking seconds (YMMV). Swapping <size> and <len> helps for now, because they haven't implemented 2004, which mandates that the data must be written using fputc()...
With binary I/O we have fsync(), which really flushes the data. Even fflush() and fclose() are not guaranteed to do that.
Well, g_file_set_contents() uses buffered I/O, so obviously it's acceptable too. But since we don't write the file line-by-line or something, _why_ do we use buffered I/O in the first place?..
On Tue, 9 Nov 2010 13:33:26 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
P.S. Thanks for sending the updated patch.
Oh, it's probably buggy. I think under Win~1, either setmode(O_BINARY) or open(O_BINARY) should be used, since it converts the EOLN-s even for non-buffered I/O.
On Tue, 9 Nov 2010 20:28:43 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
With binary I/O we have fsync(), which really flushes the data. Even fflush() and fclose() are not guaranteed to do that.
Strange, why is that?
Well, g_file_set_contents() uses buffered I/O, so obviously it's acceptable too.
Yes, it uses fwrite and then errno.
But since we don't write the file line-by-line or something, _why_ do we use buffered I/O in the first place?..
I don't know. But changing implementation often introduces bugs, so I'm reluctant to do so (besides fixing fclose failure).
Nick
On Tue, 9 Nov 2010 18:57:03 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
On Tue, 9 Nov 2010 20:28:43 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
With binary I/O we have fsync(), which really flushes the data. Even fflush() and fclose() are not guaranteed to do that.
Strange, why is that?
fsync() is in section 2 system calls; you can't have such a function in section 3. Closing a file (be it close or fclose) does not flush - maybe for for performance reasons, dunno.
Well, g_file_set_contents() uses buffered I/O, so obviously it's acceptable too.
Yes, it uses fwrite and then errno.
Probably fwrite(), fflush() and then errno? fflush() is guaranteed to set errno AFAIK.
But since we don't write the file line-by-line or something, _why_ do we use buffered I/O in the first place?..
I don't know. But changing implementation often introduces bugs, so I'm reluctant to do so (besides fixing fclose failure).
Well the current implementation is not exactly bug-free. :)
Hi again Nick,
On 10 November 2010 00:33, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Sat, 6 Nov 2010 11:33:41 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
I'll write a non-GIO variant first, as a proof of concept.
First, I feel quite reluctant to use non-GLib/GIO functions for writing instead of g_file_replace_contents because if there is a better implementation, why don't they add it?
Don't know why they havn't fixed it, but looking at the bug file it seems they can't agree on the interface for the fix and then it all went quiet.
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
My post with all the steps is simply a copy of what g_file_replace, g_stream_write and g_stream_close do, but allowing us to not rename a broken temp file over our old file.
Whilst this is complicated, it is ****really important**** because failing to write the output safely defeats the purpose of an editor.
We can't use GIO since there is no way of getting a writable stream except g_file_replace, at least that I can find.
Note: Gedit uses g_file_replace and an implementation detail of g_stream_close to hack around the problem. Big comment in the source. If you want the maintenance headache of checking every release of GIO has the same implementation then you can use this as well.
The current non-GIO is buggy anyway. First:
if (G_UNLIKELY(len != bytes_written)) err = errno;
but fwrite() is not guaranteed to set errno, only write() is.
Second, and more important, the result of fclose() is not checked, for a buffered file stream. On lack of disk space, on my system fwrite() happily returns written == len, but fclose() fails. YMMV.
If not anything else, we should use non-buffered I/O, with fsync(), and check the result of close() anyway.
Thanks for finding these problems. I accept fclose should be checked.
I'm not sure why write is better than fwrite - POSIX says both set errno.
And man on Linux says fwrite does not set errno!!!! Unix standards are great, everyone can have one :-S
You would have to enforce POSIX behavior and I'm not sure thats portable.
Cheers Lex
http://www.opengroup.org/onlinepubs/009695399/functions/fwrite.html
Can you explain why non-buffered I/O is better?
Nick
P.S. Thanks for sending the updated patch. _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Tue, 9 Nov 2010 21:27:07 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
Well, g_file_set_contents() uses buffered I/O, so obviously it's acceptable too.
Yes, it uses fwrite and then errno.
Probably fwrite(), fflush() and then errno? fflush() is guaranteed to set errno AFAIK.
No:
errno = 0;
n_written = fwrite (contents, 1, length, file);
if (n_written < length) { save_errno = errno;
g_set_error (err, G_FILE_ERROR, g_file_error_from_errno (save_errno), _("Failed to write file '%s': fwrite() failed: % s"), display_name, g_strerror (save_errno));
fclose (file);
Although this is better than Geany because at least the error is reported even if errno is not set.
But since we don't write the file line-by-line or something, _why_ do we use buffered I/O in the first place?..
I don't know. But changing implementation often introduces bugs, so I'm reluctant to do so (besides fixing fclose failure).
Well the current implementation is not exactly bug-free. :)
OK, so we do need a fix so that failed writing is reported to the user, can't rely on errno.
Nick
On Wed, 10 Nov 2010 10:13:29 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
Nick
On Wed, 10 Nov 2010 13:25:56 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
But since we don't write the file line-by-line or something, _why_ do we use buffered I/O in the first place?..
I don't know. But changing implementation often introduces bugs, so I'm reluctant to do so (besides fixing fclose failure).
Well the current implementation is not exactly bug-free. :)
OK, so we do need a fix so that failed writing is reported to the user, can't rely on errno.
I've now committed a fix so that any write failure should be reported to the user even if errno is 0. Please test.
fclose check is fixed now too.
I decided to stay with buffered I/O as that may be more portable.
Nick
On Wed, 10 Nov 2010 13:25:56 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
On Tue, 9 Nov 2010 21:27:07 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
Well, g_file_set_contents() uses buffered I/O, so obviously it's acceptable too.
Yes, it uses fwrite and then errno.
Probably fwrite(), fflush() and then errno? fflush() is guaranteed to set errno AFAIK.
No:
errno = 0; n_written = fwrite (contents, 1, length, file); if (n_written < length)
{ save_errno = errno;
g_set_error (err, G_FILE_ERROR, g_file_error_from_errno (save_errno), _("Failed to write file '%s': fwrite() failed: %
s"), display_name, g_strerror (save_errno));
fclose (file);
Ah, it sets errno = 0 first, assuming that fwrite() may fail to do so.
I've now committed a fix so that any write failure should be reported to the user even if errno is 0. Please test.
It won't be 0 unless you assign it. A successful library call does not clear errno.
Why fake 0 errno as EIO? g_file_set_contents() doesn't, and *strerror() returns something for 0, usually "Error 0".
fclose check is fixed now too.
If fwrite() and fclose() both fail, I'd prefer to see the errno of fwrite(), even if 0 (g_file_set_contents() does that). Hmmm, it'll be nice to display the filename as UTF-8 too, plus the name of the failed function, and we have a GError ready...
Mind if I try to improve the error handling, using gfileutils.c as a source? There will be 3 new translation strings.
On Wed, 10 Nov 2010 20:02:53 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
Ah, it sets errno = 0 first, assuming that fwrite() may fail to do so.
I've now committed a fix so that any write failure should be reported to the user even if errno is 0. Please test.
It won't be 0 unless you assign it. A successful library call does not clear errno.
Ok, but at least errors are now shown to the user.
But we should set errno=0 in case a confusing error message is shown.
Why fake 0 errno as EIO? g_file_set_contents() doesn't, and *strerror() returns something for 0, usually "Error 0".
GLib says to use g_strerror, which says 'Success', which is confusing.
fclose check is fixed now too.
If fwrite() and fclose() both fail, I'd prefer to see the errno of fwrite(), even if 0 (g_file_set_contents() does that).
Ok, but it might not set errno.
Hmmm, it'll be nice to display the filename as UTF-8 too, plus the name of the failed function, and we have a GError ready...
Mind if I try to improve the error handling, using gfileutils.c as a source? There will be 3 new translation strings.
Sure, have a go.
Nick
On Wed, 10 Nov 2010 18:14:04 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
On Wed, 10 Nov 2010 20:02:53 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
Why fake 0 errno as EIO? g_file_set_contents() doesn't, and *strerror() returns something for 0, usually "Error 0".
GLib says to use g_strerror, which says 'Success', which is confusing.
It is, but "Input/output error" usually means a failed disk. (o_O) POSIX says "Some physical input or output error has occurred."
Mind if I try to improve the error handling, using gfileutils.c as a source? There will be 3 new translation strings.
Sure, have a go.
Here. errno 0 still displays "Success" at the end of the messages, but at least they start "Failed to"... Well, it's identical to glib, including the paranoid resetting of errno before each call.
Tested with writing to /boza, full disk and a fake close error.
On Wed, 10 Nov 2010 21:23:18 +0200 Dimitar Zhekov dimitar.zhekov@gmail.com wrote:
Why fake 0 errno as EIO? g_file_set_contents() doesn't, and *strerror() returns something for 0, usually "Error 0".
GLib says to use g_strerror, which says 'Success', which is confusing.
It is, but "Input/output error" usually means a failed disk. (o_O) POSIX says "Some physical input or output error has occurred."
Ok, so it's not great.
Mind if I try to improve the error handling, using gfileutils.c as a source? There will be 3 new translation strings.
Sure, have a go.
Here. errno 0 still displays "Success" at the end of the messages, but at least they start "Failed to"... Well, it's identical to glib, including the paranoid resetting of errno before each call.
Applied, thanks.
Tested with writing to /boza, full disk and a fake close error.
Thanks for testing.
Nick
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this, or it's a bug in my GLib version, I don't know. I changed the code of the callback to check mtime before setting doc->file_disk_status to CHANGE, and file saving is now works correctly for me. If you think this change is meaningful, consider the patch I attach. Apart of the change in the callback it contains some other staff (which you may wish to not commit).
Best regards, Eugene.
On 14 November 2010 20:31, Eugene Arshinov earshinov@gmail.com wrote:
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this,
Possibly, g_file_replace_contents writes to the temp file and then renames the temp file to the old file, but why two??.
The interesting thing is why doesn't any change to the file trigger the monitor no matter how its written?? Why does it only happen for GIO IO??
or it's a bug in my
GLib version, I don't know.
Which version??
I changed the code of the callback to
check mtime before setting doc->file_disk_status to CHANGE, and file saving is now works correctly for me. If you think this change is meaningful, consider the patch I attach. Apart of the change in the callback it contains some other staff (which you may wish to not commit).
Well thats one way of telling the monitor that we know about the changes because we caused them :-)
Cheers Lex
Best regards, Eugene. _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Sun, 14 Nov 2010 21:24:39 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 20:31, Eugene Arshinov earshinov@gmail.com wrote:
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this,
Possibly, g_file_replace_contents writes to the temp file and then renames the temp file to the old file, but why two??.
The interesting thing is why doesn't any change to the file trigger the monitor no matter how its written?? Why does it only happen for GIO IO??
I was not fully correct. There were 2 CHANGE notifications and 1 "unknown" notification between them. Here is the output I get after opening, changing and saving a file:
Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: IGNORE -> OK Geany-INFO: monitor_file_changed_cb: event: UNKNOWN; status: OK -> OK Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: OK -> OK
Note that with my patch the output is different from trunk, but things should be clear. I'll investigate the unknown notification a bit later today, maybe it is caused by the rename.
or it's a bug in my
GLib version, I don't know.
Which version??
Recent 2.26.0
I changed the code of the callback to
check mtime before setting doc->file_disk_status to CHANGE, and file saving is now works correctly for me. If you think this change is meaningful, consider the patch I attach. Apart of the change in the callback it contains some other staff (which you may wish to not commit).
Well thats one way of telling the monitor that we know about the changes because we caused them :-)
Cheers Lex
Best regards, Eugene. _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On 14 November 2010 21:37, Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 21:24:39 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 20:31, Eugene Arshinov earshinov@gmail.com wrote:
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this,
Possibly, g_file_replace_contents writes to the temp file and then renames the temp file to the old file, but why two??.
The interesting thing is why doesn't any change to the file trigger the monitor no matter how its written?? Why does it only happen for GIO IO??
I was not fully correct. There were 2 CHANGE notifications and 1 "unknown" notification between them. Here is the output I get after opening, changing and saving a file:
Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: IGNORE -> OK Geany-INFO: monitor_file_changed_cb: event: UNKNOWN; status: OK -> OK Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: OK -> OK
Since what is called change here is actually last_change_hint, you could actually get it several times because its only "probably" the last change. The unknown could be move/rename delete or just plain change (probably *not* last??).
Note that with my patch the output is different from trunk, but things should be clear. I'll investigate the unknown notification a bit later today, maybe it is caused by the rename.
I'm not sure how g_file_monitor would report an atomic rename of a temp file over the file we are monitoring?? The directory entry is changed and the old inode and data deleted if this was the last hard link.
You would have to test it or look at the source.
But I still don't understand why using g_file_set_contents or plain fwrite doesn't trigger the monitor?? I can't see anywhere in document.c where it muzzles the monitor whilst saving.
Cheers but Confused. Lex
or it's a bug in my
GLib version, I don't know.
Which version??
Recent 2.26.0
I changed the code of the callback to
check mtime before setting doc->file_disk_status to CHANGE, and file saving is now works correctly for me. If you think this change is meaningful, consider the patch I attach. Apart of the change in the callback it contains some other staff (which you may wish to not commit).
Well thats one way of telling the monitor that we know about the changes because we caused them :-)
Cheers Lex
Best regards, Eugene. _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On 11 November 2010 00:27, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Wed, 10 Nov 2010 10:13:29 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
But Geany doesn't make a backup, make_backup is hard coded to false. Needs to be set to true.
Or maybe it needs a hidden pref make_backup_when_using_unsafe_GIO_file_saving :-)
Cheers Lex
Nick _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Sun, 14 Nov 2010 22:02:31 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 21:37, Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 21:24:39 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 20:31, Eugene Arshinov earshinov@gmail.com wrote:
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this,
Possibly, g_file_replace_contents writes to the temp file and then renames the temp file to the old file, but why two??.
The interesting thing is why doesn't any change to the file trigger the monitor no matter how its written?? Why does it only happen for GIO IO??
I was not fully correct. There were 2 CHANGE notifications and 1 "unknown" notification between them. Here is the output I get after opening, changing and saving a file:
Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: IGNORE -> OK Geany-INFO: monitor_file_changed_cb: event: UNKNOWN; status: OK -> OK Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: OK -> OK
Since what is called change here is actually last_change_hint, you could actually get it several times because its only "probably" the last change.
Yes, and it also means that the original implementation, which ignores exactly one hint callback receives, is wrong. The comments near the #define actually tell that GIO based file monitoring is not completely stable, but when I ran an older version of Geany (maybe rev. 5380) with the #define turned on, saving seemed to work fine. Though there is a high probability that I'm wrong because I used that version only for testing new patches (i.e., rarely). Anyway, seems that I need to find a version where saving began to "fail".
The unknown could be move/rename delete or just plain change (probably *not* last??).
Note that with my patch the output is different from trunk, but things should be clear. I'll investigate the unknown notification a bit later today, maybe it is caused by the rename.
I'm not sure how g_file_monitor would report an atomic rename of a temp file over the file we are monitoring?? The directory entry is changed and the old inode and data deleted if this was the last hard link.
You would have to test it or look at the source.
Yes, there's a need for experiments.
But I still don't understand why using g_file_set_contents or plain fwrite doesn't trigger the monitor?? I can't see anywhere in document.c where it muzzles the monitor whilst saving.
Seems strange for me too.
Best regards, Eugene.
Some results after preliminary investigation:
On Sun, 14 Nov 2010 14:25:54 +0300% Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 22:02:31 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 21:37, Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 21:24:39 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 20:31, Eugene Arshinov earshinov@gmail.com wrote:
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this,
Possibly, g_file_replace_contents writes to the temp file and then renames the temp file to the old file, but why two??.
The interesting thing is why doesn't any change to the file trigger the monitor no matter how its written?? Why does it only happen for GIO IO??
I was not fully correct. There were 2 CHANGE notifications and 1 "unknown" notification between them. Here is the output I get after opening, changing and saving a file:
Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: IGNORE -> OK Geany-INFO: monitor_file_changed_cb: event: UNKNOWN; status: OK -> OK Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: OK -> OK
Since what is called change here is actually last_change_hint, you could actually get it several times because its only "probably" the last change.
Yes, and it also means that the original implementation, which ignores exactly one hint callback receives, is wrong. The comments near the #define actually tell that GIO based file monitoring is not completely stable, but when I ran an older version of Geany (maybe rev. 5380) with the #define turned on, saving seemed to work fine. Though there is a high probability that I'm wrong because I used that version only for testing new patches (i.e., rarely). Anyway, seems that I need to find a version where saving began to "fail".
Okay, it still fails at r5380. The change is caused by g_file_replace_contents() which was introduced before than that.
The unknown could be move/rename delete or just plain change (probably *not* last??).
Note that with my patch the output is different from trunk, but things should be clear. I'll investigate the unknown notification a bit later today, maybe it is caused by the rename.
I'm not sure how g_file_monitor would report an atomic rename of a temp file over the file we are monitoring?? The directory entry is changed and the old inode and data deleted if this was the last hard link.
You would have to test it or look at the source.
Yes, there's a need for experiments.
r5395 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2 Geany-Message: monitor_file_changed_cb: FILE_CHANGED Geany-INFO: monitor_file_changed_cb: event: 3 previous file status: 0 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 0 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
r5065 (before g_file_replace_contents()) Geany-INFO: monitor_file_changed_cb: event: 0 previous file status: 2 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
Events: 0 - CHANGED 1 - CHANGES_HINT (handled in Geany) 3 - CREATED
It's natural that CREATED is reported, but it's still unclear why the first CHANGES_HINT is sent. Going to investigate further…
But I still don't understand why using g_file_set_contents or plain fwrite doesn't trigger the monitor?? I can't see anywhere in document.c where it muzzles the monitor whilst saving.
Seems strange for me too.
Actually g_file_replace_contents(), g_file_set_contents(), and fwrite() - are called in document.c:write_data_to_disk() - which is only called in document.c:save_doc() - which is only called in document_save_file() - which itself sets file_disk_status to FILE_IGNORE.
So we ignore the first change hint despite of the function that is finally called.
Best regards, Eugene.
On Sun, 14 Nov 2010 19:29:08 +0300% Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 14:25:54 +0300% Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 22:02:31 +1100% Lex Trotman elextr@gmail.com wrote:
I'm not sure how g_file_monitor would report an atomic rename of a temp file over the file we are monitoring?? The directory entry is changed and the old inode and data deleted if this was the last hard link.
You would have to test it or look at the source.
Yes, there's a need for experiments.
r5395 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2 Geany-Message: monitor_file_changed_cb: FILE_CHANGED Geany-INFO: monitor_file_changed_cb: event: 3 previous file status: 0 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 0 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
r5065 (before g_file_replace_contents()) Geany-INFO: monitor_file_changed_cb: event: 0 previous file status: 2 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
Events: 0 - CHANGED 1 - CHANGES_HINT (handled in Geany) 3 - CREATED
It's natural that CREATED is reported, but it's still unclear why the first CHANGES_HINT is sent. Going to investigate further…
Okay, [1] is the function of GLib 2.26.0 which is finally called in my case to open a stream for writing. In this particular case the original file is opened, but the fd is never used: instead a temporary file is opened and its fd is returned, and the original fd is just closed, hence the first meaningless change hint.
By the way, the function hasn't changed in GLib's "trunk" since 2.26.0.
[1] http://nopaste.geany.org/p/mefaf389
Best regards, Eugene.
On 11 November 2010 00:27, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Wed, 10 Nov 2010 10:13:29 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
Attached is a patch to actually make the backup :-)
Nick _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On 15 November 2010 03:29, Eugene Arshinov earshinov@gmail.com wrote:
Some results after preliminary investigation:
On Sun, 14 Nov 2010 14:25:54 +0300% Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 22:02:31 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 21:37, Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 21:24:39 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 20:31, Eugene Arshinov earshinov@gmail.com wrote:
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this,
Possibly, g_file_replace_contents writes to the temp file and then renames the temp file to the old file, but why two??.
The interesting thing is why doesn't any change to the file trigger the monitor no matter how its written?? Why does it only happen for GIO IO??
I was not fully correct. There were 2 CHANGE notifications and 1 "unknown" notification between them. Here is the output I get after opening, changing and saving a file:
Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: IGNORE -> OK Geany-INFO: monitor_file_changed_cb: event: UNKNOWN; status: OK -> OK Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: OK -> OK
Since what is called change here is actually last_change_hint, you could actually get it several times because its only "probably" the last change.
Yes, and it also means that the original implementation, which ignores exactly one hint callback receives, is wrong. The comments near the #define actually tell that GIO based file monitoring is not completely stable, but when I ran an older version of Geany (maybe rev. 5380) with the #define turned on, saving seemed to work fine. Though there is a high probability that I'm wrong because I used that version only for testing new patches (i.e., rarely). Anyway, seems that I need to find a version where saving began to "fail".
Okay, it still fails at r5380. The change is caused by g_file_replace_contents() which was introduced before than that.
The unknown could be move/rename delete or just plain change (probably *not* last??).
Note that with my patch the output is different from trunk, but things should be clear. I'll investigate the unknown notification a bit later today, maybe it is caused by the rename.
I'm not sure how g_file_monitor would report an atomic rename of a temp file over the file we are monitoring?? The directory entry is changed and the old inode and data deleted if this was the last hard link.
You would have to test it or look at the source.
Yes, there's a need for experiments.
r5395 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2
Probably the change of atime, though I would have expected that to be an attribute change event.
Geany-Message: monitor_file_changed_cb: FILE_CHANGED Geany-INFO: monitor_file_changed_cb: event: 3 previous file status: 0
Create when the rename is done, OK
Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 0 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
Thats another change of atime when the file descriptor is closed (on Unix its closed *after* the rename).
r5065 (before g_file_replace_contents()) Geany-INFO: monitor_file_changed_cb: event: 0 previous file status: 2 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
And with safe file saving (ie g_file_set_contents) it just gets:
Geany-INFO: monitor_file_changed_cb: event: 3 previous file status: 2
Because it only does the rename it doesn't do the other opens and closes.
Its not the Glib version, it happens for me on 2.24.1 as well.
The above looks to me like Glib is working properly, Geany just needs to learn to ignore changes of its own making better.
Your mtime patch looks sensible to me, since we really don't want to reload unless the content changed. We don't care about changes to metadata since we don't use it. Except I suggest:
changed = doc->priv->mtime < st.st_mtime;
be != instead of < in case someone restores an old version using a tool that also restores the mtime.
Events: 0 - CHANGED 1 - CHANGES_HINT (handled in Geany) 3 - CREATED
It's natural that CREATED is reported, but it's still unclear why the first CHANGES_HINT is sent. Going to investigate further…
But I still don't understand why using g_file_set_contents or plain fwrite doesn't trigger the monitor?? I can't see anywhere in document.c where it muzzles the monitor whilst saving.
Seems strange for me too.
Actually g_file_replace_contents(), g_file_set_contents(), and fwrite()
- are called in document.c:write_data_to_disk()
- which is only called in document.c:save_doc()
- which is only called in document_save_file()
- which itself sets file_disk_status to FILE_IGNORE.
So we ignore the first change hint despite of the function that is finally called.
Well spotted.
Cheers Lex
PS I've sent a patch for an option to enable g_file_replace_contents making backups, that could give other sequences of events since the old file is now renamed before the temp file is renamed.
Best regards, Eugene.
Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Mon, 15 Nov 2010 09:41:53 +1100% Lex Trotman elextr@gmail.com wrote:
On 15 November 2010 03:29, Eugene Arshinov earshinov@gmail.com wrote:
Some results after preliminary investigation:
On Sun, 14 Nov 2010 14:25:54 +0300% Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 22:02:31 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 21:37, Eugene Arshinov earshinov@gmail.com wrote:
On Sun, 14 Nov 2010 21:24:39 +1100% Lex Trotman elextr@gmail.com wrote:
On 14 November 2010 20:31, Eugene Arshinov earshinov@gmail.com wrote: > Hi. > > I don't know whether it was this change which caused this, > but after I updated recently to r5395 and turned on the > #define in document.c which controls using GIO file > monitor, each time I save a document (I only use local > filesystems) I get a dialog telling me the file was > changed. In debug output coming from > document.c:monitor_file_changed_cb() I see that CHANGE > notification is sent twice after a file is saved. Maybe > it's g_file_replace_contents() which cause this,
Possibly, g_file_replace_contents writes to the temp file and then renames the temp file to the old file, but why two??.
The interesting thing is why doesn't any change to the file trigger the monitor no matter how its written?? Why does it only happen for GIO IO??
I was not fully correct. There were 2 CHANGE notifications and 1 "unknown" notification between them. Here is the output I get after opening, changing and saving a file:
Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: /tmp/temp.xml : XML (UTF-8) Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: IGNORE -> OK Geany-INFO: monitor_file_changed_cb: event: UNKNOWN; status: OK -> OK Geany-INFO: monitor_file_changed_cb: event: CHANGED; status: OK -> OK
Since what is called change here is actually last_change_hint, you could actually get it several times because its only "probably" the last change.
Yes, and it also means that the original implementation, which ignores exactly one hint callback receives, is wrong. The comments near the #define actually tell that GIO based file monitoring is not completely stable, but when I ran an older version of Geany (maybe rev. 5380) with the #define turned on, saving seemed to work fine. Though there is a high probability that I'm wrong because I used that version only for testing new patches (i.e., rarely). Anyway, seems that I need to find a version where saving began to "fail".
Okay, it still fails at r5380. The change is caused by g_file_replace_contents() which was introduced before than that.
The unknown could be move/rename delete or just plain change (probably *not* last??).
Note that with my patch the output is different from trunk, but things should be clear. I'll investigate the unknown notification a bit later today, maybe it is caused by the rename.
I'm not sure how g_file_monitor would report an atomic rename of a temp file over the file we are monitoring?? The directory entry is changed and the old inode and data deleted if this was the last hard link.
You would have to test it or look at the source.
Yes, there's a need for experiments.
r5395 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2
Probably the change of atime, though I would have expected that to be an attribute change event.
Geany-Message: monitor_file_changed_cb: FILE_CHANGED Geany-INFO: monitor_file_changed_cb: event: 3 previous file status: 0
Create when the rename is done, OK
Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 0 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
Thats another change of atime when the file descriptor is closed (on Unix its closed *after* the rename).
r5065 (before g_file_replace_contents()) Geany-INFO: monitor_file_changed_cb: event: 0 previous file status: 2 Geany-INFO: monitor_file_changed_cb: event: 1 previous file status: 2 Geany-Message: monitor_file_changed_cb: FILE_CHANGED
And with safe file saving (ie g_file_set_contents) it just gets:
Geany-INFO: monitor_file_changed_cb: event: 3 previous file status: 2
Because it only does the rename it doesn't do the other opens and closes.
Its not the Glib version, it happens for me on 2.24.1 as well.
The above looks to me like Glib is working properly, Geany just needs to learn to ignore changes of its own making better.
Your mtime patch looks sensible to me, since we really don't want to reload unless the content changed. We don't care about changes to metadata since we don't use it. Except I suggest:
changed = doc->priv->mtime < st.st_mtime;
be != instead of < in case someone restores an old version using a tool that also restores the mtime.
Agree, but then we also need to change a similar check in document_check_disk_status():
else if (doc->priv->mtime < st.st_mtime) { monitor_reload_file(doc); doc->priv->mtime = st.st_mtime; ret = TRUE; }
Best regards, Eugene.
On Mon, 15 Nov 2010 08:57:51 +1100 Lex Trotman elextr@gmail.com wrote:
On 11 November 2010 00:27, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Wed, 10 Nov 2010 10:13:29 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
Attached is a patch to actually make the backup :-)
Whoever wants safe saving may set use_safe_file_saving=true, and that now works for GIO. Isn't unsafe_save_backup practically the same?..
2010/11/16 Dimitar Zhekov dimitar.zhekov@gmail.com:
On Mon, 15 Nov 2010 08:57:51 +1100 Lex Trotman elextr@gmail.com wrote:
On 11 November 2010 00:27, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Wed, 10 Nov 2010 10:13:29 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
Attached is a patch to actually make the backup :-)
Whoever wants safe saving may set use_safe_file_saving=true, and that now works for GIO. Isn't unsafe_save_backup practically the same?..
Well no, GIO (correctly) is only used in the unsafe path.
Second, GIO file save has potential performance benefits for remote files, but is unsafe and networks are even more unsafe, making the backup mitigates some of the risks.
Third, making a backup also mitigates against some user errors of saving something you didn't mean to because it keeps the backup (filename~)
Cheers Lex
-- E-gards: Jimmy _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Mon, 15 Nov 2010 08:57:51 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
Attached is a patch to actually make the backup :-)
Ok. We do have some backup functionality in the Save Actions plugin, but anyway this pref may be more efficient.
I think either: 1. unsafe_save_backup should be renamed to gio_unsafe_save_backup to avoid confusion. 2. call the pref make_backup and make it work for all save implementations, and maybe have a GUI option for it.
Nick
On 19 November 2010 06:04, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Mon, 15 Nov 2010 08:57:51 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
Attached is a patch to actually make the backup :-)
Ok. We do have some backup functionality in the Save Actions plugin, but anyway this pref may be more efficient.
Yes, save actions is fine on a local disk and much more flexible since you can choose name and location for backups, but it copies the file which is a performance problem for remote filesystems which was one of the initiating complaints on the long "safe file save" thread IIRC.
I think either:
- unsafe_save_backup should be renamed to gio_unsafe_save_backup to
avoid confusion.
Ok, attached.
- call the pref make_backup and make it work for all save
implementations, and maybe have a GUI option for it.
To do this we would almost be writing the whole of GIO with the algorithm I outlined previously. So we probably should do that instead. For now lets just offer this easy option that provides improved performance on remote filesystems.
Cheers Lex
Nick _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
And this time with the attachment :-S
Cheers Lex
On 19 November 2010 08:20, Lex Trotman elextr@gmail.com wrote:
On 19 November 2010 06:04, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Mon, 15 Nov 2010 08:57:51 +1100 Lex Trotman elextr@gmail.com wrote:
Also, g_file_replace_contents does have a make_backup argument we could provide an option for. This might handle the disk exhaustion problem.
To summarise the long thread for you Nick, g_file_replace_contents and g_file_replace can't be used because there is NO WAY to stop g_stream_close doing the rename of the temp file over the old file even if the write to the temp file fails. So you can get a broken output file.
I know. But at the least the user has the backup file then.
Attached is a patch to actually make the backup :-)
Ok. We do have some backup functionality in the Save Actions plugin, but anyway this pref may be more efficient.
Yes, save actions is fine on a local disk and much more flexible since you can choose name and location for backups, but it copies the file which is a performance problem for remote filesystems which was one of the initiating complaints on the long "safe file save" thread IIRC.
I think either:
- unsafe_save_backup should be renamed to gio_unsafe_save_backup to
avoid confusion.
Ok, attached.
- call the pref make_backup and make it work for all save
implementations, and maybe have a GUI option for it.
To do this we would almost be writing the whole of GIO with the algorithm I outlined previously. So we probably should do that instead. For now lets just offer this easy option that provides improved performance on remote filesystems.
Cheers Lex
Nick _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Fri, 19 Nov 2010 08:20:58 +1100 Lex Trotman elextr@gmail.com wrote:
On 19 November 2010 06:04, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Mon, 15 Nov 2010 08:57:51 +1100 Lex Trotman elextr@gmail.com wrote:
I think either:
- unsafe_save_backup should be renamed to gio_unsafe_save_backup to
avoid confusion.
Ok, attached.
Applied, thanks. I put the new pref field last just in case anyone is accessing the other fields from a plugin.
- call the pref make_backup and make it work for all save
implementations, and maybe have a GUI option for it.
To do this we would almost be writing the whole of GIO with the algorithm I outlined previously. So we probably should do that instead. For now lets just offer this easy option that provides improved performance on remote filesystems.
Fine with just the patch. I was only suggesting copying the existing file first for the backup when using non-gio saving.
Nick
On 24 November 2010 03:27, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Fri, 19 Nov 2010 08:20:58 +1100 Lex Trotman elextr@gmail.com wrote:
On 19 November 2010 06:04, Nick Treleaven nick.treleaven@btinternet.com wrote:
On Mon, 15 Nov 2010 08:57:51 +1100 Lex Trotman elextr@gmail.com wrote:
I think either:
- unsafe_save_backup should be renamed to gio_unsafe_save_backup to
avoid confusion.
Ok, attached.
Applied, thanks. I put the new pref field last just in case anyone is accessing the other fields from a plugin.
- call the pref make_backup and make it work for all save
implementations, and maybe have a GUI option for it.
To do this we would almost be writing the whole of GIO with the algorithm I outlined previously. So we probably should do that instead. For now lets just offer this easy option that provides improved performance on remote filesystems.
Fine with just the patch. I was only suggesting copying the existing file first for the backup when using non-gio saving.
Sure, what I realise I wasn't clearly saying is that when you *really* need the backup is on remote filesystems, but then a copy costs you two transfers on top of the one to save the file, too slow. So you should try renaming first and only copy if you need to and by the time you have done that you have reimplemented a poor version of GIO.
Cheers Lex
Nick _______________________________________________ Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Sun, 14 Nov 2010 12:31:59 +0300, Eugene wrote:
Hey, sorry for my late reply.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed.
I'm not surprised this code broke. I don't maintain it anymore and didn't expect anyone is still using it. I even think we should remove it as it never really worked as it should, AFAIR. The only reason I didn't yet remove is that it is hard for me as I spent so much time on it :). But as I don't have time to spend on it to get it actually working, we either remove it or someone else wants to maintain/improve/fix it.
But as said, so far, I'm not surprised it's dead and I don't care about fixing it. Sorry.
Regards, Enrico
2010/11/29 Enrico Tröger enrico.troeger@uvena.de:
On Sun, 14 Nov 2010 12:31:59 +0300, Eugene wrote:
Hey, sorry for my late reply.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed.
I'm not surprised this code broke. I don't maintain it anymore and didn't expect anyone is still using it. I even think we should remove it as it never really worked as it should, AFAIR. The only reason I didn't yet remove is that it is hard for me as I spent so much time on it :). But as I don't have time to spend on it to get it actually working, we either remove it or someone else wants to maintain/improve/fix it.
But as said, so far, I'm not surprised it's dead and I don't care about fixing it. Sorry.
Well, Eugenes solution looks ok, if he provides a clean patch with the last things discussed it could be applied to unstable for testing.
Cheers Lex
Regards, Enrico
-- Get my GPG key from http://www.uvena.de/pub.asc
Geany-devel mailing list Geany-devel@uvena.de http://lists.uvena.de/cgi-bin/mailman/listinfo/geany-devel
On Mon, 29 Nov 2010 09:18:56 +1100 Lex Trotman elextr@gmail.com wrote:
I'm not surprised this code broke. I don't maintain it anymore and didn't expect anyone is still using it. I even think we should remove it as it never really worked as it should, AFAIR. The only reason I didn't yet remove is that it is hard for me as I spent so much time on it :). But as I don't have time to spend on it to get it actually working, we either remove it or someone else wants to maintain/improve/fix it.
But as said, so far, I'm not surprised it's dead and I don't care about fixing it. Sorry.
Well, Eugenes solution looks ok, if he provides a clean patch with the last things discussed it could be applied to unstable for testing.
Yes, or perhaps it could go to trunk - I think USE_GIO_FILEMON is not documented anywhere so it shouldn't affect ordinary users.
Nick
Hello Enrico, Lex, and Nick :)
Sorry for the delay.
On Mon, 29 Nov 2010 12:26:22 +0000% Nick Treleaven nick.treleaven@btinternet.com wrote:
On Mon, 29 Nov 2010 09:18:56 +1100 Lex Trotman elextr@gmail.com wrote:
I'm not surprised this code broke. I don't maintain it anymore and didn't expect anyone is still using it. I even think we should remove it as it never really worked as it should, AFAIR. The only reason I didn't yet remove is that it is hard for me as I spent so much time on it :). But as I don't have time to spend on it to get it actually working, we either remove it or someone else wants to maintain/improve/fix it.
I turned on the #define only for an experiment, and it worked nicely for me (maybe because I don't use remote filesystems). I like that updated files are colored, so I can easily see which are changed/removed (btw, "Reload all" command would be useful). The code broke only after g_file_replace_contents was introduced because the code didn't account that there can be several change notifications for single "write". The not-so-large patch I attached fixes this bug.
But as said, so far, I'm not surprised it's dead and I don't care about fixing it. Sorry.
Well, Eugenes solution looks ok, if he provides a clean patch with the last things discussed it could be applied to unstable for testing.
You mean changing from "doc->priv->mtime < st.st_mtime" to "doc->priv->mtime != st.st_mtime"? Yes, as I already wrote before, I think it's meaningful, so I'll send an updated patch if needed.
Also, if we check mtime on change notification, maybe we no longer need FILE_IGNORE constant. This will cost us an additional mtime check.
Yes, or perhaps it could go to trunk - I think USE_GIO_FILEMON is not documented anywhere so it shouldn't affect ordinary users.
I agree :)
Best regards, Eugene.
On Sun, 14 Nov 2010 12:31:59 +0300 Eugene Arshinov earshinov@gmail.com wrote:
Hi.
I don't know whether it was this change which caused this, but after I updated recently to r5395 and turned on the #define in document.c which controls using GIO file monitor, each time I save a document (I only use local filesystems) I get a dialog telling me the file was changed. In debug output coming from document.c:monitor_file_changed_cb() I see that CHANGE notification is sent twice after a file is saved. Maybe it's g_file_replace_contents() which cause this, or it's a bug in my GLib version, I don't know. I changed the code of the callback to check mtime before setting doc->file_disk_status to CHANGE, and file saving is now works correctly for me. If you think this change is meaningful, consider the patch I attach. Apart of the change in the callback it contains some other staff (which you may wish to not commit).
Best regards, Eugene.
Hello.
I've just created an entry in the patch tracker https://sourceforge.net/tracker/?func=detail&aid=3345697&group_id=153444&atid=787793
Regards, Eugene.