kota's memex

Gemini user interactivity?

I've seen some chatter on the gemini mailing list recently about a perceived issue with gemini's limited user interaction. Specifically, the argument is that more features are needed in order to create "gemini apps", comment systems, payment systems, or for users of the various community capsules to create posts without needing a web browser.

Gemini as of today will not scale in interactive usability, because user interaction is severely limited. There is a statement that Gemini is only for consumption. However, we have to be realistic that for usability of consumption to stay high, user cannot be asked to live Gemini browser for HTTPS powered web browser to submit a post. That breaks user flow and makes Gemini not worth it. Why to come back to Gemini if I just used standard browser to do something and in the process I compromised all the advantages listed above? -- sergei.gnezdov at gmail.com

Sergei's post

Consider a website, gemini://example.org, where users can set up accounts. It uses TLS certificates for authentication and provides important settings through the Gemini interface. For example, one can delete their account by visiting a certain URL: perhaps gemini://example.org/account/delete. Although this makes sense, you may already begin to understand the problem at hand. Malicious Gemini pages (or parts thereof) can contain links to such locations. Depending upon the user's Gemini client configuration, it may not show them the URL they are going to (e.g. Amfora, I think), and they may accidentally delete their account at gemini://example.org (or perform any other action involuntarily) this way. -- nothien at uber.space

Nothien's post

I think both are coming to perfectly logical conclusions about how difficult it would be to create applications over gemini, and that doing so would lead to security issues such as the cross-site request forgery attack described by Nothien. Sergei also made good points about how annoying it is to switch to a web browser for these tasks. The problem is that they're viewing gemini as a replacement for the things http does. Many others have responded more succinctly than I can on the mailing list. I encourage you to go read them; you can even read them over gemini thanks to sloum's lovely archive.

Question 1.6: Do you really think you can replace the web?

Not for a minute! Nor does anybody involved with Gemini want to destroy Gopherspace. Gemini is not intended to replace either Gopher or the web, but to co-exist peacefully alongside them as one more option which people can freely choose to use if it suits them. In the same way that some people currently serve the same content via gopher and the web, people will be able to "bihost" or "trihost" content on whichever combination of protocols they think offer the best match to their technical, philosophical and aesthetic requirements and those of their intended audience. - Gemini FAQ

Project Gemini FAQ


I wrote briefly last month about my suggested "solution" to this issue:

In my mind gemini exists to wall off a space away from advertisers and billionaire news sites. It exists for the rest of us. For document sharing, story telling, and interesting new ideas. A possible future would involve replacing portions of the old web with small purposeful protocols and tools that have learned from the mistakes that allowed the web to be captured.

I want to talk about the "small purposeful protocols" part. Gemini does one thing and does it well: it serves static files with special consideration for a lightweight hypertext format. So the use cases mentioned by Sergei and Nothien don't really fit into gemini's stated use case, but that doesn't mean they aren't useful technological goals or forms of human communication we should preserve. (However, there are some elements of the web that I think we shouldn't preserve, but that's for another day.)

SSH Apps?

For many applications the best case scenario is creating a custom built protocol on top of UDP or TCP, along with applications for different kinds of users on different platforms which use your protocol. Email, IRC, and git are obvious examples of this. I would love to see some more work on video streaming (we can do better than youtube clones), simple p2p file sharing (like magic wormhole), and perhaps even a clever protocol for "web forums" generally. All that said, I think it makes sense, at least in the short term, to have a good general purpose application protocol: something for sharing an application's interface over a network for others to use. Over time http transitioned from a lightweight hypertext protocol, much like gemini, to fill this gap. Newer versions of http are optimized more for applications than for text. In fact, the newish binary http/2 protocol had these stated design goals:

Support common existing use cases of HTTP, such as desktop web browsers, mobile web browsers, web APIs, web servers at various scales, proxy servers, reverse proxy servers, firewalls, and content delivery networks.

There is actually another existing protocol that's largely overlooked for this purpose. It was designed as a general purpose application protocol with security in mind from the start. I'm talking about SSH of course!

Most of us only really use ssh to log in to a remote machine, start a shell, and execute commands. Some may have also used it for forwarding arbitrary ports, or even using remote X11 applications, but it can also forward domain sockets, and create pretty much any kind of fast and secure communication channel. I should point out that ssh is a protocol and OpenSSH is the extremely popular client/server implementation you've probably used. I also want to quickly dispel a popular misconception: ssh is not an implementation of telnet with cryptography provided by SSL, but rather is its own robust protocol.
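As a rough sketch of those less common uses, here is how they might look with the OpenSSH client. The host, ports, and socket paths are placeholders I made up for illustration, and the commands are wrapped in small functions so nothing connects until you call one.

```shell
# HOST is a placeholder; point it at a machine you can log in to.
HOST="user@example.org"

# Forward local port 8080 to port 80 on the remote machine.
forward_port() { ssh -N -L 8080:localhost:80 "$HOST"; }

# Run a remote graphical program on the local display
# (the server must allow X11 forwarding for this to work).
remote_x11() { ssh -X "$HOST" xclock; }

# Forward a local unix domain socket to one on the remote machine.
# Both socket paths are made up for illustration.
forward_socket() { ssh -N -L /tmp/local.sock:/run/remote.sock "$HOST"; }
```

Calling forward_port, for example, would keep a tunnel open until interrupted, since -N tells ssh not to run a remote command.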

The functionality of the ssh protocol is actually somewhat comparable to tls (which gemini is built on top of), but with a number of useful application features not present in tls, such as multiplexing many secondary sessions into a single ssh connection. Many of the more useful software architectural styles used in http, such as Representational state transfer (REST) or GraphQL, can be implemented directly with ssh. Graphical clients (or interesting specialized clients) can be written to use these ssh applications, but there's the nice advantage that ssh was actually somewhat designed for this and great cli clients and servers already exist. I don't propose that ssh is a great protocol for all types of web uses or applications. As I said earlier, I'd like to see "small purposeful protocols" used more and more, but as a more generalized application protocol ssh is worth looking into more.


For a fun little example I threw together "boggle over ssh". Boggle is a favorite board game of mine. It's played using a grid of lettered dice, in which players try to find words using sequences of adjacent letters. Take the following board for example:

| E | U | E | B |
| A | S | L | M |
| E | W | I | P |
| U | A | N | B |

Starting with the first E we can move down to A, across to S, diagonally up to the E in the top row, and down to L, spelling easel. We could also start with W, move left to E, and get the word weasel. Traditionally, words must be 3 letters minimum and you may not use the same letter cube twice in a single word. Variations exist of course. Anyway, this game was written for BSD some time ago and exists on Linux distros in the bsdgames package. You can play the game over ssh in a terminal with the following command:

ssh boggle@kota.nz
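To make the rules above concrete, here is a minimal sketch of a path checker for the example board. It is not how the bsdgames program works internally; the BOARD variable, the spell_path name, and the cell numbering (0 to 15, left to right, top to bottom) are all my own assumptions.

```shell
# Hypothetical helper: the 4x4 example board, flattened row by row.
BOARD="EUEBASLMEWIPUANB"

# spell_path INDEX... : print the word spelled by visiting the given cells
# (0-15, row-major), or return 1 if the path breaks the rules.
spell_path() {
    word="" seen=" " prev=""
    for cell in "$@"; do
        # Reject anything that is not a cell number on the board.
        case "$cell" in
            ''|*[!0-9]*) return 1 ;;
        esac
        [ "$cell" -le 15 ] || return 1
        # Each letter cube may be used only once per word.
        case "$seen" in
            *" $cell "*) return 1 ;;
        esac
        if [ -n "$prev" ]; then
            # Adjacent means at most one row and one column away.
            dr=$(( cell / 4 - prev / 4 ))
            dc=$(( cell % 4 - prev % 4 ))
            [ "${dr#-}" -le 1 ] && [ "${dc#-}" -le 1 ] || return 1
        fi
        word="$word$(printf '%s' "$BOARD" | cut -c "$(( cell + 1 ))")"
        seen="$seen$cell "
        prev="$cell"
    done
    # Words must be at least 3 letters long.
    [ "${#word}" -ge 3 ] || return 1
    printf '%s\n' "$word"
}
```

With this numbering, spell_path 0 4 5 2 6 prints EASEL and spell_path 9 8 4 5 2 6 prints WEASEL, while a path that jumps two columns or revisits a cube fails. Checking that the result is a real word is a separate step.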

Careful observers, or those living far from my server (in California sadly, I don't have the means to self host in Aotearoa quite yet), will notice a slight input delay. That's because this "boggle ssh app" is "server side rendered". In plain terms, that means the whole board is calculated on the server and then sent to the client. This isn't ideal for fast paced applications. It would be preferable if the board were rendered on the client side and word attempts were sent to the server to be verified as good or bad.
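The server half of that split could be very small. Below is a sketch of a verifier that reads one word attempt per line and answers good or bad. The line-based format, the verify_words name, and the idea of passing a word list file are assumptions of mine, not part of the demo, and it only checks the dictionary, not whether the word can be traced on the board.

```shell
# verify_words DICT : read one word attempt per line on stdin and answer
# "good" or "bad" for each, based on a plain word list (one word per line).
verify_words() {
    dict="$1"
    while IFS= read -r attempt; do
        # Normalise case so "Weasel" and "weasel" are treated alike.
        attempt=$(printf '%s' "$attempt" | tr '[:upper:]' '[:lower:]')
        # -x matches whole lines, -F disables regex, -q only sets the status.
        if [ "${#attempt}" -ge 3 ] && grep -qxF -- "$attempt" "$dict"; then
            printf 'good %s\n' "$attempt"
        else
            printf 'bad %s\n' "$attempt"
        fi
    done
}
```

A client would then only send short lines of text over the connection and draw everything else locally, so the round trip happens once per word instead of once per keypress.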

In the http world this is where javascript comes in: the server sends some code to a web browser, which is then able to run the application locally. Sending and storing data on the client is a very complicated process, in part due to its nature and security implications, but also because web developers are fighting against the original design of http, which was never intended for this use case.

Now for lots of the "interactive gemini" use cases mentioned before, the server side rendering approach is perfectly reasonable. Simple tui menus with input prompts can be created for configuring settings, file transfer is pretty straightforward, and for more complex usage an additional ssh api can be developed allowing for gui clients (which would be provided as an optional download on the gemini capsule). I imagine most of these applications would require your gemini client cert to authenticate. If I wanted I could update my boggle application to allow entering a username and show a little high score page on gemini. I didn't feel that was needed for this demo, but it's easily doable.
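A "simple tui menu with input prompts" really can be just a few lines of shell. This is a sketch only: the settings_menu name, the menu entries, and the replies are placeholders, and nothing here stores any data.

```shell
# settings_menu : a tiny text menu that could be run as a forced command.
# Menu entries and actions are placeholders for illustration.
settings_menu() {
    while :; do
        printf '1) Show high scores\n2) Change display name\nq) Quit\n> '
        # End of input (the client hung up) closes the session.
        IFS= read -r choice || return 0
        case "$choice" in
            1) printf 'No scores recorded yet.\n' ;;
            2) printf 'New name: '
               IFS= read -r name || return 0
               printf 'Name set to %s\n' "$name" ;;
            q) printf 'Goodbye.\n'; return 0 ;;
            *) printf 'Unknown option: %s\n' "$choice" ;;
        esac
    done
}
```

A script that calls settings_menu could be placed behind the same kind of forced command I use for boggle, so a user connecting over ssh would see the menu and nothing else.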

There are some downsides to this approach. First of all, mobile support is simply not good with the method I've shown. I think the solution is to create a standard form + menu interface such that general purpose clients (graphical or otherwise) could be written and provide good mobile support. Lots more thought needs to be put into this. I'd be interested to hear what others think. I'm gonna try out that cool gemrefinder tool made recently by sandra, but I also have an email with kota as the username at nilsu.org.

How did I do it?

If you're interested, here's how I set up that boggle game server. My server is currently running Debian. I'm simply using a feature of sshd to force an ssh tty connection into running a particular command and rejecting any other methods of connection. The user doesn't require a password and the session is terminated when the program closes. Do note that should a vulnerability exist in the boggle game that would be an issue, but any web server accepting user input suffers from this, and by using well tested tools like ssh we eliminate whole classes of vulnerabilities off the bat. Obviously, the boggle user should have very limited permissions, and ideally for anything bigger you would sandbox this user completely so it can only access the files and binaries needed for its usage.

Install the bsdgames package with the boggle program:

sudo apt install bsdgames

Append the following to /etc/ssh/sshd_config:

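# Lock the boggle account down to a single forced command.
Match User boggle
	# No X11 or TCP tunnels through this account.
	X11Forwarding no
	AllowTcpForwarding no
	# Ignore ~/.ssh/rc so nothing runs before the game.
	PermitUserRC no
	# Always run boggle, whatever command the client asks for.
	ForceCommand "/usr/games/boggle"
	# Let anyone in with an empty password.
	PasswordAuthentication yes
	PermitEmptyPasswords yes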

Create a boggle user:

sudo adduser boggle
sudo passwd -d boggle

Configure PAM to allow blank password logins:

sudo sed -i 's/nullok_secure/nullok/' /etc/pam.d/common-auth
sudo systemctl restart sshd


Your post was pretty interesting!

I have a comment on one small part of your post:

perhaps even a clever protocol for "web forums" generally.

There's already a protocol for that! It's called Usenet (built on the UUCP protocol much like Gemini is built on the TLS protocol): https://en.wikipedia.org/wiki/Usenet https://en.wikipedia.org/wiki/UUCP

While the old Usenet network is inactive and filled with spam, there's not much stopping a new Usenet network from rising. It's federated, lightweight (but not too minimalist), easy to set up; it really is a lot like an ActivityPub-Gemini fusion for internet forums. Write a few fancy clients rather than the 30-40 year old existing clients, host a few federated servers (independent from the existing Usenet network), generate some hype, and I could see it being a Fediverse for forums.

Just an interesting tidbit that I wanted to share.


That's really interesting! I've heard of Usenet before, but I think I got into computers just a little too late to have used it in its prime. I'm gonna have to look into it a lot more. I've always personally liked forums more than the ActivityPub/twitter microblogging stuff.