Bug 756540 - push to remote repo
Status: RESOLVED WONTFIX
Product: ostree
Classification: Infrastructure
Component: general
Version: unspecified
Hardware: Other Linux
Importance: Normal normal
Target Milestone: ---
Assigned To: OSTree maintainer(s)
Depends on:
Blocks:
 
 
Reported: 2015-10-13 23:32 UTC by Dan Nicholson
Modified: 2018-08-17 19:00 UTC
See Also:
GNOME target: ---
GNOME version: ---



Description Dan Nicholson 2015-10-13 23:32:48 UTC
I didn't see another bug for this. Currently the way you "push" a local repository to a remote is via rsync or some other out-of-band mechanism. Unfortunately, non-ostree mechanisms have no idea about the structure of the repository and can only blindly copy the repository recursively.

This means the best you can do is commit builds directly into the public repository or have a build repo that has a 1:1 correspondence to the public repository. I.e., you can "push" with rsync without --delete, but you're still likely to update refs files you didn't want to. This is especially an issue if the public repo is to host binaries for multiple architectures, since in that case it becomes much more difficult to keep each architecture's build repo appropriately synchronized.

It would be awesome if ostree supported a push API like git does. Ideally it would just use ssh as a transport to remain agnostic of the auth mechanism, but I have no idea if this would require a full protocol. Here's info on git's protocols, FWIW:

https://github.com/git/git/blob/master/Documentation/technical/pack-protocol.txt
Comment 1 Colin Walters 2015-10-14 02:08:32 UTC
This would be best discussed on the list I think - Bugzilla is better for more focused issues.
Comment 2 Colin Walters 2015-10-14 02:11:49 UTC
But briefly, here is something I can copy/paste into a reply on the list:

The lack of push was intentional. People out there push binaries into production environments that came from developer laptops, which I think is crazy.  There *are* valid use cases for pushing binaries directly to stage without going through a clean CI/CD builder, but IMO not production.

For a delivery server doing binary-level promotion, you can use `pull-local` today to migrate commits between two local repositories, or have the target repository do a pull.

That ends up with a broken history on the production repo, so the next step is to pull the commit, make a new commit with that same tree content, then discard the pulled commit.  This needs documenting and should probably have some built-in workflow.
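
A minimal sketch of that promotion flow, assuming the staging and production branches have different names. The `promote` function and its arguments are hypothetical, and the pulled commit object itself stays in the repo until a later prune.

```shell
#!/bin/sh
# Hypothetical helper: promote a commit from a staging repo into a production
# repo as a fresh commit on the production branch.
# Usage: promote STAGE_REPO PROD_REPO STAGE_REF PROD_REF
promote() {
    stage_repo=$1 prod_repo=$2 stage_ref=$3 prod_ref=$4

    # Copy the staged commit and its objects into the production repo.
    ostree --repo="$prod_repo" pull-local "$stage_repo" "$stage_ref" || return 1

    # Re-commit the same tree on the production branch, so the new commit's
    # parent is the previous production commit, not the staging history.
    ostree --repo="$prod_repo" commit --branch="$prod_ref" \
        --tree=ref="$stage_ref" --subject="Promote $stage_ref" || return 1

    # Drop the pulled ref (note: --delete matches refs by prefix).
    ostree --repo="$prod_repo" refs --delete "$stage_ref"
}
```

For example, `promote /srv/stage-repo /srv/prod-repo exampleos/42/x86_64/stage exampleos/42/x86_64/standard` (paths and the stage ref name are made up).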
Comment 3 Dan Nicholson 2016-03-21 22:32:54 UTC
As mentioned in https://mail.gnome.org/archives/ostree-list/2015-December/msg00007.html, I did a proof of concept in https://github.com/dbnicholson/ostree-push.

Still need to look at integrating it into ostree proper, but just wanted to keep this bug up to date.
Comment 4 Colin Walters 2016-03-23 19:03:07 UTC
One thing I've been thinking is that with a move to GitHub it's easier to create multiple repos under the same "ostreedev" organization.

I have https://github.com/cgwalters/ostree-scripts/, which might be good to unify into a directory of rel-eng ostree components in particular.  A command like "ostree-releng", maybe? Dunno.
Comment 5 Sam Spilsbury 2016-06-22 23:00:10 UTC
I've started to look at this again for Endless.

I think the first chunk of this work will be to allow the use of ssh as the transport protocol for pulls. After having discussed this with Dan, I think there are two ways to do this:
 1) Use gvfs to mount the remote with sftp and attempt to treat the remote like a local filesystem, shuffling objects across the wire.
 2) Use ssh as an intermediary and spawn a smart "server" on the remote end to enumerate the objects and send them over.

Dan and I discussed the options earlier. Dan mentioned that #1 looks simple, but tends to get tricky considering that gvfs opens up D-Bus connections and this might not work so well in server environments. #2 will be a little more involved, but ultimately opens the door to other "smart server" features, such as enumeration of objects on the server side so that there's no ping-ponging to figure out which objects to send over.
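
The server-side enumeration idea in #2 comes down to a set difference: given the object names each side has, send only what the remote lacks. A rough sketch, assuming each side can produce a plain list with one object name per line. The `missing_objects` helper and the list format are hypothetical, not part of ostree.

```shell
# Hypothetical helper: print the object names listed in LOCAL_LIST that are
# absent from REMOTE_LIST (one object name per line in each file).
# Usage: missing_objects LOCAL_LIST REMOTE_LIST
missing_objects() {
    local_sorted=$(mktemp) remote_sorted=$(mktemp)
    LC_ALL=C sort -u "$1" > "$local_sorted"
    LC_ALL=C sort -u "$2" > "$remote_sorted"
    # comm -23 keeps only the lines unique to the first (local) input.
    LC_ALL=C comm -23 "$local_sorted" "$remote_sorted"
    rm -f "$local_sorted" "$remote_sorted"
}
```

With one exchange of lists, the sender knows the full set to transfer instead of asking about each object in turn.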

What are the maintainers' thoughts on this? I'm going to start working up a prototype for what #2 will look like, but I can think of other ways if a smarter server is clearly out of scope for the project.
Comment 6 Colin Walters 2016-06-27 18:36:51 UTC
I just tried https://github.com/libfuse/sshfs and it works fine to write to an archive repo.

So for example if you do builds in a `bare-user` repo locally (including checking out buildroots), then:

mkdir -p mnt
sshfs user@export-server:/srv mnt
ostree --repo=mnt/repo pull-local build-repo exampleos/42/x86_64/standard

Incremental updates here work because we already did the checksum locally, so we just `fstatat(checksum)` in the remote repo and we don't write if it already exists.
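
To illustrate why that existence check is enough, here is a sketch of how a checksum maps to a path in the repo, assuming the usual layout of `objects/<first two hex digits>/<rest>.<type>` (with `filez` as the type for file content in an archive repo). The helper names are hypothetical.

```shell
# Print the repo-relative path of an object, given its checksum and its type
# suffix (commit, dirtree, dirmeta, or filez).
object_path() {
    checksum=$1 objtype=$2
    rest=${checksum#??}            # checksum without its first two characters
    prefix=${checksum%"$rest"}     # the first two characters
    printf 'objects/%s/%s.%s\n' "$prefix" "$rest" "$objtype"
}

# Succeed if the object is already present in the repo at REPO.
# Usage: object_exists REPO CHECKSUM TYPE
object_exists() {
    [ -e "$1/$(object_path "$2" "$3")" ]
}
```

Because the name is derived from the content, a file that already exists at that path never needs to be rewritten.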
Comment 7 Dan Nicholson 2016-06-27 18:58:36 UTC
Should we take that to mean that you don't want ssh support natively in ostree?
Comment 8 Colin Walters 2016-06-27 20:31:45 UTC
I'm happy to talk about it, but both `sshfs` and `pull-local` already exist, and seem to implement #1 well.  (We couldn't actually use gvfs without a lot of work since I've been porting the ostree core *away* from gio)

So that comes down to #2.  Advantages here would be:

 - The server can manage concurrency
 - The server can implement security (don't allow pushers to unlink objects), or possibly re-verify checksums
 - Likely some possible pipelining and wire efficiency gains
 - Could try to tweak static deltas to be non-static for this

But the disadvantage is a lot of new code.  Is there anything I'm missing?
Comment 9 Dan Nicholson 2016-06-27 22:14:29 UTC
I guess convenience and security are the big ones for wanting the feature. `ostree push origin` and not having to put the GPG private key on the public server are things I think are very nice. Using sshfs does fill those gaps, but obviously managing an sshfs mount dance is kinda lame.

Does `pull-local` do transactions and locking? Can I do concurrent `pull-local`s?
Comment 10 Colin Walters 2016-06-27 22:25:22 UTC
It'd probably be a ~15-20 line shell script to make `ostree-push` that used sshfs and had a trap handler to `fusermount -u`, right?
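
Something along these lines, as an untested sketch. The function name and argument order are made up, and it assumes `sshfs`, `fusermount` and `ostree` are installed on the pushing machine.

```shell
# Hypothetical wrapper: push REF from LOCAL_REPO into the repo at REMOTE
# (e.g. user@export-server:/srv/repo) through a temporary sshfs mount.
# Usage: ostree_push_sshfs LOCAL_REPO REMOTE REF
ostree_push_sshfs() (
    local_repo=$1 remote=$2 ref=$3
    mnt=$(mktemp -d) || exit 1
    # Until the mount succeeds, only the empty mount point needs cleaning up.
    trap 'rmdir "$mnt"' EXIT
    sshfs "$remote" "$mnt" || exit 1
    # From here on, unmount before removing the mount point.
    trap 'fusermount -u "$mnt"; rmdir "$mnt"' EXIT
    ostree --repo="$mnt" pull-local "$local_repo" "$ref"
    status=$?
    exit "$status"
)
```

The body runs in a subshell, so the EXIT trap fires when the function finishes rather than when the calling shell exits.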

Why would the GPG private key go on the public server?  You can certainly sign on the private server too, it's just generating a `.commitmeta` file that *should* get copied with `pull-local`...though hm, I just noticed there's no `--mirror` option for `pull-local`.  Will look at that.

As far as concurrency...I need to write something in the manual, but basically it's fine to have unsynchronized concurrent commits to different branches.

What does need synchronization is:
 - metadata like `ostree summary -u`
 - prune (this is the big one, so e.g. to do a prune you'd need to disable ssh access temporarily for writers)
Comment 11 Dan Nicholson 2016-06-27 22:40:10 UTC
`pull-local` already mirrors by putting the "remote" refs directly into the destination repo's namespace.

With sshfs you wouldn't need the private key on the server. I was saying why I wanted the ssh setup in the first place. I was able to get away without the private key on the public server for a long time, but now flatpak is requiring that the summary file be signed. Which I think would also work with sshfs.

The other thing I envisioned for a push mode was for the server side to manage these types of concurrency issues. E.g., if the parent changes before your push completes, then the transaction fails like git. Again, this does seem like it can be managed in the pull sense with sshfs, though.
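
As a toy model of that "fail if the parent changed" rule, here is a compare-and-swap on a plain file standing in for a ref. This is not an ostree command; the function name is hypothetical, and a real server would also need a lock around the check and the write.

```shell
# Toy model: set the ref stored in REF_FILE to NEW only if it still contains
# EXPECTED_PARENT; otherwise reject the update, as git does for a stale push.
# Usage: update_ref_if_parent REF_FILE EXPECTED_PARENT NEW
update_ref_if_parent() {
    ref_file=$1 expected_parent=$2 new=$3
    current=$(cat "$ref_file" 2>/dev/null)
    if [ "$current" != "$expected_parent" ]; then
        echo "rejected: ref is at '$current', expected '$expected_parent'" >&2
        return 1
    fi
    # Write to a temporary file, then rename, so readers never see a partial ref.
    printf '%s\n' "$new" > "$ref_file.tmp" && mv "$ref_file.tmp" "$ref_file"
}
```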
Comment 12 Dan Nicholson 2016-06-27 23:57:24 UTC
Here's said shell implementation - https://github.com/dbnicholson/ostree-push/blob/master/ostree-push.sh. I haven't tried it on our real repos, but it does do what I'd expect.
Comment 13 Dan Nicholson 2016-08-25 16:42:38 UTC
I tried the pull-local into sshfs on one of our real repos and it was incredibly slow. It seems like it takes a lot of time scanning commits on the "remote" repo, which involves doing thousands of stats over ssh. In one test, it took 30 minutes to push a commit that the remote already had.

So, I had to scrap that idea. I might try to keep working on the original python script that has a custom protocol over ssh. That would require duplicating much of the traversal and pull logic, unfortunately. What's there now is kind of a toy pull.

Another idea I had with rsync that's nasty but is probably faster than the sshfs route:

1. On the remote, clone the real repo to a temporary repo. By choosing a good path (e.g., in the repo's tmp/), this can be done quickly with hardlinks.
2. Rsync the objects, deltas and desired refs to the temporary repo.
3. Run a pull-local on the remote from the temporary repo to the real repo with the desired refs.

One drawback is that the temporary repo would need to be created as a full mirror of the real repo to avoid rsync sending lots of unrelated objects over the network.
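
The three steps above could be sketched roughly as follows. This is untested against a real repo and rests on several assumptions: both repos use the same archive mode, a `pull-local` with no refs clones every ref, paths contain no single quotes, and deltas are left out for brevity. The function name and the staging path are hypothetical.

```shell
# Hypothetical helper: push REF from LOCAL_REPO to REMOTE_REPO on HOST via a
# temporary staging repo inside the remote repo's tmp/ directory.
# Usage: push_via_staging LOCAL_REPO HOST REMOTE_REPO REF
push_via_staging() {
    local_repo=$1 host=$2 remote_repo=$3 ref=$4
    staging="$remote_repo/tmp/push-staging"

    # 1. Clone the real repo into the staging repo on the remote. On the same
    #    filesystem, pull-local can hardlink objects instead of copying them.
    ssh "$host" "ostree --repo='$staging' init --mode=archive-z2 &&
                 ostree --repo='$staging' pull-local '$remote_repo'" || return 1

    # 2. Send objects the remote lacks, then the ref. -R with a /./ marker
    #    keeps the path after the marker and creates missing directories.
    rsync -aR --ignore-existing "$local_repo/./objects" "$host:$staging/" || return 1
    rsync -aR "$local_repo/./refs/heads/$ref" "$host:$staging/" || return 1

    # 3. Pull the ref from the staging repo into the real repo on the remote.
    ssh "$host" "ostree --repo='$remote_repo' pull-local '$staging' '$ref'"
    status=$?

    # Remove the staging repo whether or not the pull succeeded.
    ssh "$host" "rm -rf '$staging'"
    return "$status"
}
```

Objects are content-addressed, so `--ignore-existing` is safe for them; the ref file is sent separately because it must be overwritten.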
Comment 14 Colin Walters 2016-08-25 17:57:59 UTC
(In reply to Dan Nicholson from comment #13)
> I tried the pull-local into sshfs on one of our real repos and it was
> incredibly slow. It seems like it takes a lot of time scanning commits on
> the "remote" repo, which involves doing thousands of stats over ssh. In one
> test, it took 30 minutes to push a commit that the remote already had.

That sounds really strange... yes, there's a logic bug in pull here.  If the commit is already on the target end without commitpartial, we still end up traversing.
Comment 15 Colin Walters 2016-11-23 14:55:58 UTC
I played around with sshfs again, and https://github.com/ostreedev/ostree/pull/564 definitely fixes the "pull existing commit" path, it's now instant.

However, unless we do work in both ostree and sshfs to make it pipelined/multithreaded for the lstat() lookups, it's probably going to ultimately be slower than rsync.

Basically ostree assumes lstat() is cheap, but it's not on a network filesystem.
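
A back-of-the-envelope estimate shows how quickly sequential lookups add up when each one costs a network round trip. The figures in the example call are illustrative, not measurements from this bug.

```shell
# Estimate the seconds spent on sequential lookups: COUNT lookups, each
# costing one round trip of RTT_MS milliseconds, with no pipelining.
# Usage: lookup_seconds COUNT RTT_MS
lookup_seconds() {
    echo $(( $1 * $2 / 1000 ))
}

lookup_seconds 36000 50   # 36,000 lstat() calls at 50 ms each: prints 1800
```

That is 30 minutes spent purely waiting on latency, which pipelining or issuing the lookups in parallel would cut down.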
Comment 16 André Klapper 2018-08-17 19:00:24 UTC
OSTree moved to GitHub a while ago.
Furthermore, GNOME Bugzilla will be shut down and replaced by gitlab.gnome.org.

If the problem reported in this Bugzilla ticket is still valid, please report it to https://github.com/ostreedev/ostree/issues instead. Thank you!

Closing this report as WONTFIX as part of Bugzilla Housekeeping.