Le 05/11/2010 00:27, Lex Trotman a écrit :
On 5 November 2010 09:37, Colomban Wendling lists.ban@herbesfolles.org wrote:
Le 04/11/2010 21:42, Dimitar Zhekov a écrit :
On Thu, 4 Nov 2010 17:53:47 +0000 Nick Treleaven nick.treleaven@btinternet.com wrote:
So it seems that function doesn't handle disk exhaustion safely. (But this is no worse than before for Geany).
Before we had safe or unsafe file saving. Let's keep that choice, shall we? I can write the patch, it's only a few lines.
For more than 20 years now, the only safe save is to write the data into a temporary file in the same directory, and then rename it over the target file. Even if the rename fails, you still have the temporary file. That changes the onwership and permissions, which is exactly the behaviour of the safe g_file_set_contents().
Well... stop me if I'm saying bullshit but what about:
backup = path + '~' # or whatever if copy (path, backup_path): # COPY file to the backup if not write (path, data): # write in original file move (backup_path, path) # if write failed, restore backup elif not make_backup: unlink (backup_path) # if no backup is to be done, delete backup
The problems I see are:
- needs at least 2 times the size of the file in the target directory
(but it's the same anytime we bake backups) 2) backup file may have altered permissions/attrs 3) ...so if write fails, we have changed the permissions/attrs on the original file In the worst case, the original file has another name and lost its permissions/attrs.
Hum, I found another problem (but seems to apply more or less to every safe-file-saving methods): write access to the directory containing the file to save is needed (to create the backup or restore it if write failed).
The advantages I see are:
- the permissions/attrs are kept on success, as well as hard links
- no data can be lost (if the FS is reliable)
The key idea is not to move the original file but to copy it. So we can safely overwrite the original (and the keep permissions/attrs), once the backup is done.
Thoughts?
Hi,
It seems to me that g_file_set_contents behavior can be described as "write new file with the name of an existing file" so new file permissions and hard links point to the old file.
Columbans suggestion can be described as "update the contents of an existing file" so old permissions and hard links still point to that file.
Note that even if it has problems, g_file_replace_contents() seems to keep the attributes quite correctly.
"Update the contents" seems much closer to the expected behavior of an editor than does "write a new file", so its good from that point of view.
But from an efficiency point of view its much more work. Probably not a problem on a local filesystem, but on a remote filesystem it requires three transfers of the data instead of one, read the old file and write the backup then write the new file. As I only use remote filesystems on fast networks I can't say how important this is likely to be, but for big files/lots of files it may be a problem, also I'm thinking web sites edited via ftp, ssh etc. This seems to be the use case for most of the performance complaints.
Well, right. I didn't thought that copy is unlikely to be done by the remote machine (is a filesystem/GVFS-backends clever enough to support copies?)... So yes, it would probably be a significant overhead on non-local files. Needs some testing probably.
If making a backup file is selected when the file is opened then the write of the backup happens then, uses the data read for the open and doesn't need to happen at save time, it could even be asynchronous so long as it was checked for correct completion before saving. This would reduce the user visible performance impact.
Theoretically yes, but I would not do that because another application may have changed the file since we loaded it and we may not have noticed. I think this looses a bit of the usefulness of a backup.
And it would also create a backup for each and every file that gets opened in Geany, not only the ones that gets saved, so.
Personally I would implement both and let the user choose (especially as g_file_set_contents exists).
Why not. Fast & reliable v.s. even more reliable. But it needs to write two different code paths, and if we want to support direct GIO API (which is probably a good idea IMHO, at least for the GNOME desktop with remote FS) as well as standard C API, we need about 4 code paths.
I join here a small program I wrote to test the idea, you can test it and see if you find problems. Note that this isn't well tested, but still, seems to work fine (both in the idea and in the tests I've done -- basically only no space left and normal).
To switch between plain C/GIO, change the value of the USE_GIO constant on the top of the file.
Regards, Colomban