I’ve been playing with an idea that would involve running a machine over a delay-tolerant mesh network. The thing is, each packet is precious and needs to be pretty much self-contained in that situation, while modern systems assume SSH-like continuous interaction with the user.
Has anyone heard of anything pre-existing that would work here? I figured if anyone would know about situations where each character is expensive, it would be you folks.
Like MOSH? https://mosh.org/ Mosh has some predictive output and will resume sessions automatically.
Or more like tmux/screen? Has some fancy “nohup” like functions.
Mosh plus tmux is amazing.
That’s really helpful. Thank you! MOSH might work, I’ll have to play around with it.
Could you go into more detail about the tmux functions? If it’s a way to write everything to files instead of a STDOUT in a predictable way, that would be great, since each packet could be a (compressed) shell script that explicitly includes which data to send back, if any.
No, tmux does not redirect to a file. Though ‘>’ and ‘script’ do.
Tmux is like ‘screen’ and can be wrapped with ‘byobu’.
I mean, I guess you could just programmatically insert a > after every command. That’s actually a pretty good idea. It’s kind of obvious now that you mention it, haha!
It would be better if the tools expected to be used this way, but as a quick kludge for a project about something else it’s probably sufficient.
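A minimal sketch of that kludge, assuming Python on the sending side. The file names (`out_N.txt`, `err_N.txt`) and the `RETURN_MANIFEST` convention are made up for illustration and are not part of any existing tool. Each command gets its own redirect, the script names which outputs should come back, and the whole thing is compressed into one packet body:

```python
import shlex
import zlib

def build_packet(commands, keep):
    """Return a compressed shell script where every command writes to its own file.

    commands: list of shell command strings
    keep: indices of the commands whose stdout should be sent back
    """
    lines = ["#!/bin/sh"]
    for i, cmd in enumerate(commands):
        # Send stdout and stderr of each command to predictable file names.
        lines.append(f"{cmd} > out_{i}.txt 2> err_{i}.txt")
    # Hypothetical convention: list the files to return, one per line.
    manifest = " ".join(shlex.quote(f"out_{i}.txt") for i in keep)
    lines.append(f"printf '%s\\n' {manifest} > RETURN_MANIFEST")
    script = "\n".join(lines) + "\n"
    return zlib.compress(script.encode("utf-8"), 9)

def read_packet(packet):
    """Inverse of build_packet: recover the shell script text."""
    return zlib.decompress(packet).decode("utf-8")

if __name__ == "__main__":
    packet = build_packet(["uname -a", "df -h"], keep=[1])
    print(len(packet), "bytes on the wire")
    print(read_packet(packet))
```

For very short scripts the zlib header can outweigh the savings, so it would be worth measuring before committing to compression for every packet.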
The first step is to make it work (at all, even badly).
Ask NASA
Do they post their software somewhere? What they use for space probes is exactly what I would need, but I kind of figured it would be a trade secret.
I was kidding. 😁 There’s nothing real time about C&C with space probes. And the ISS is close enough I doubt it’s an issue.
Yeah, I want not real time. The goal of having containers in the first place is to enable as much as possible without needing to put a human in the loop, since you have no idea how long each packet will spend in transit.
If I could emulate Curiosity’s onboard computer that would be a decent starting point.
In that case it might not hurt to reach out to some NASA email addresses. The people who write that stuff are, after all, nerds like us, and would probably be happy to share whatever they are allowed to share.
It’s funded by taxes so, security issues aside, there shouldn’t be a lot of trade secrets.
Government agencies, in my experience, tend to believe in security through obscurity; even the ones that don’t worry about spies as much as NASA. That said, maybe it’s worth a shot. I’ll have to figure out who’s the best person to bug.
I only know that the laptops on the ISS are Thinkpads that run on Debian Oldstable.
But that’s not really helpful for you.
That’s funny. I’m on a really old laptop right now, and I’m running oldstable. Even going up from oldold broke it a bit.
It’s still plenty fast. Moore’s law is a bit of a paper tiger at this point.
The ‘ed’ editor was designed for high latency networks. I would pull on that thread. That is, in your shoes, I would read up on ‘ed’ and related tools.
Delightful!
“Of course, on the system I administrate, vi is symlinked to ed. Emacs has been replaced by a shell script which 1) Generates a syslog message at level LOG_EMERG; 2) reduces the user’s disk quota by 100K; and 3) RUNS ED!!!”
Gave me a giggle. That 100k loss has got to hurt for a user who still tries to run ‘vi’ on a classic system, I imagine.
Edit:
Another gem:
“Ed is generous enough to flag errors, yet prudent enough not to overwhelm the novice with verbosity.”
I would pull on that thread. That is, in your shoes
Directions unclear; shoelaces tangled
Ed is great (in this context). I think there’s been posts about it on here before. It’s just a text editor, though.
Yeah. I’ve had mentors regale me with stories of other tools they used alongside ‘ed’, but I wasn’t listening very attentively. Hopefully that’s something that can be dug out of the history of the Internet.
I would definitely choose the old reliable stuff over something new and fancy, if I had this use case.
Secure Scuttlebutt is (was?) a protocol for high-latency communication between occasionally-networked humans. Pro: https://scuttlebutt.nz/; con (not read in detail): https://derctuo.github.io/notes/secure-scuttlebutt.html. I think it was supposed to be able to spread messages over Bluetooth, assuming a sufficiently connected web of nodes between person A and person B. Public keys were identities, and were bound to devices; unfortunately people may have multiple devices, or change devices over time, so this was a hindrance.
IPFS was supposed to be the Interplanetary File System. I think that was just because whatever pieces of content you ask for, you also cache, as part of the design: you keep a copy on the near side of the small high-latency pipe. But that’s mostly about file transfer, not interactivity.
UUCP was definitely made in a time where a latency of days for delivery of email or netnews was common.
In the early days of CGI, the Web was just one way people imagined interacting with applications; another way was email. RFC 3834 has some recommendations for people who are going to automate email responses. There used to be services you could email a URL to, and receive the web page back as an email.
Using ed (in my experience) involves looking up the screen, or up the roll of paper on your teletype, to see what the lines of your file were, and imagine what they are now, given the changes you’ve wrought to them since they were printed, and then turn them into what they should be. With Mars rovers you have a simulation that you issue your command to, before sending it off to Mars. With correspondence chess you might keep a physical chessboard for each game you have going, and/or send a form back and forth that keeps track of several moves.
People used to do computation at universities and businesses by writing programs at their desks, submitting them to be typed on punchcards, and receiving printouts some time later. They would “desk check” their programs before sending them in, because each compute job took a couple days to come back.
I mention all these because, in an extreme censorship environment, any local state (session history on paper, an app on a smartphone, an odd device) might not be good to have around. So usability may require reducing the total amount of state that a command carries. The current working directory at the time a command is run changes the meaning and outcome of the command; you may not remember that directory in a day or two. The vocabulary and syntax of command-line switches are easy to look up in online manuals - but are there offline manuals? I don’t know if this avenue of inquiry helps you, but it’s interesting to think about for a moment.
Thanks for the effortpost! Scuttlebutt in particular is similar in spirit, although I agree with the blog post that the implementation sounds funny. One conceptual difference, I think, is Scuttlebutt sounding fully decentralised, which necessarily introduces an O(n^2) kind of overhead. Hubs could operate more like the content distribution networks that already exist in really locked-down countries, which are proven to work, just with the new protocol as a lower risk way of getting to the end user. Their own page is loading blank for me, unfortunately.
Public keys were identities, and were bound to devices; unfortunately people may have multiple devices, or change devices over time, so this was a hindrance.
I’m not sure why they even added that, haha. How hard is moving a private key? I’m also imagining it would be pretty routine to just discard a key-identity and make a new one, for anonymity’s sake.
I mention all these because, in an extreme censorship environment, any local state (session history on paper, an app on a smartphone, an odd device) might not be good to have around. So usability may require reducing the total amount of state that a command carries. The current working directory at the time a command is run changes the meaning and outcome of the command; you may not remember that directory in a day or two. The vocabulary and syntax of command-line switches are easy to look up in online manuals - but are there offline manuals? I don’t know if this avenue of inquiry helps you, but it’s interesting to think about for a moment.
Some local state is probably necessary for usability. I mean, at the very least you need to have the software, which is probably illegal itself. The trick, as always with contraband, is either hiding it or not getting searched in the first place. In emergency situations having a way to securely delete everything quickly is the best that can be done, I think.
I don’t expect the average user would be writing shell scripts themselves. There should be user-friendly frontends for common tasks like email messaging, but that doesn’t help developers. A certain level of statelessness at the hub end would be good, just to avoid unwanted interactions like that. Maybe execution always starts with the same environment variables in the same directory, and your payload bootstraps other shell scripts or actual programs needed to add context.
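To make the "same environment variables, same directory" idea concrete, here is a rough sketch in Python. It is not a container and provides no isolation. It only shows the payload starting from a known state, and the values in `FIXED_ENV` and the helper name `run_payload` are placeholders chosen for illustration:

```python
import subprocess
import tempfile

# Hypothetical fixed starting environment; every payload sees exactly this.
FIXED_ENV = {"PATH": "/usr/bin:/bin", "LANG": "C", "HOME": "/nonexistent"}

def run_payload(script_text, timeout=60):
    """Run a shell payload in a fresh, empty directory with a fixed environment."""
    with tempfile.TemporaryDirectory() as workdir:
        result = subprocess.run(
            ["/bin/sh", "-c", script_text],
            cwd=workdir,          # always start in an empty directory
            env=FIXED_ENV,        # nothing inherited from the hub's own shell
            capture_output=True,
            text=True,
            timeout=timeout,      # a stuck payload must not block the hub forever
        )
    return result.returncode, result.stdout, result.stderr

if __name__ == "__main__":
    print(run_payload('echo "$LANG"; pwd; ls'))
```

Because nothing carries over between runs, a payload that needs context has to bring it along or bootstrap it itself, which matches the self-contained packet idea.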
As long as you’re using TCP (what SSH uses) or a similar protocol, you should be able to deal with a situation like that. You’d mainly need to ensure that your client and server are tuned to meet your needs. With TCP, every packet is considered important and if the receiver does not acknowledge receipt, the sender will resend.
I’m not talking a lot of latency, I’m talking snail-mail levels. Hours probably won’t even be unusual, because hops will happen partly by sneakernet as people move around with their nodes. The concept is distributed burst radio for extreme censorship environments.
The point of the containers in the first place is to make as much as possible work offline, without the user having to be in the loop.
Oh that’s interesting. I might suggest looking at implementations of IP over Avian Carriers (IPoAC). And I do mean that seriously. The idea started as an April Fools RFC but some people have actually implemented it. Basically, just using a different physical layer.
Yeah, that’s probably worth a look. Good suggestion. There’s also delay-tolerant protocols for space and similar, but I don’t know if any of them define an endpoint, as opposed to just a transport layer.
Indeed. I’d really suggest going for something based upon Internet Protocol, with any software that you need at endpoints to read and/or transmit. I might poke about at some ideas on the weekend (long holiday). What languages are you thinking to use?
Probably Rust, although I’m not married to it. I’m just at the planning stage right now, though.
One open question is if you can use a fairly standard transceiver like a Bluetooth chip, or if you need an SDR. Obviously they weren’t designed with this in mind, but maybe there’s a profile that’s close enough.
Packets should have a few kilobytes of payload so you can fit a postquantum cryptographic artifact. Thankfully, even with a BCH code, it seems doable to fit that much in a 1-second burst in a standard amateur radio voice channel, for testing. (In actual clandestine use I’d expect you’d want to go as wide as the hardware can support)
As envisioned there would be someone operating a hub, which might have actual network access through some means, and on which the containers run. They would send out runners to collect traffic from busy public spaces which might serve as hubs for burst activity, and dump outgoing packets, all without giving up any locations.
Accounts with their own small container would be opened by sending in a public key, and then further communication would be by standard symmetric algorithm - except in testing, because that’s an amateur radio no-no, so just signed cleartext. ID would be derived from signature fingerprint, as I have been thinking about it. I have a lightweight hash scheme in mind that would allow awarding of credit for retransmitting packets in a way that couldn’t be cheated.
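As a rough illustration of deriving an ID from a key fingerprint (not the actual scheme described above, which isn't spelled out here), one could hash the public key bytes and keep a short prefix. The choice of SHA-256, the 10-byte truncation, and the base32 rendering are all assumptions for this sketch:

```python
import base64
import hashlib

def key_id(public_key_bytes, length=10):
    """Derive a short, printable ID from raw public key bytes.

    length is the number of digest bytes kept; 10 bytes gives 16 base32 characters.
    """
    digest = hashlib.sha256(public_key_bytes).digest()
    return base64.b32encode(digest[:length]).decode("ascii").lower()

if __name__ == "__main__":
    # Placeholder bytes standing in for a real public key.
    print(key_id(b"\x01" * 32))
```

A shorter ID saves bytes in every packet but raises the odds of two keys colliding, so the truncation length is a trade-off to tune against the expected number of accounts.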
You’d want to have some ability to detect and move around jamming, or just other people’s bursts. That’s more hardware research, basically.
I’ve got a few things that I need to get done in the next few days (hopefully mostly sorted today) but you’ve got me rather intrigued with this as a puzzle. I’ll see if I can get some time to sketch some thoughts out and maybe some high-level implementation of some bits in Python (it’s faster to POC things).
A few quick thoughts:
- I think that an existing or novel protocol built on top of the Internet Protocol is likely the way to go. Following the OSI model, you can target Layer 4, with some simple stuff for higher layers. Client/Server (possibly the same binary) and associated automation should handle Layers 1-3 (translating between different carriers for Layers 1 and 2, and handling routing of data packets in Layer 3).
- Message routing strategies and their impact on OpSec are worth consideration. By this I mean: broadcast-only vs targeted-only vs both vs hybrid. All four have trade-offs.
Broadcast-only: Makes it harder to know the intended destination of the message. Conversely, by being routed to either all known addresses or all approved addresses, it can be more vulnerable to interception by a compromised endpoint.
Targeted-only: May be harder to intercept as the path that a packet takes should result in it hitting fewer potential endpoints. Conversely, some form of addressing is necessary to know, at the least, the next hop in transit. This makes tracing the intended endpoint, as well as network hops much easier (ex. running a traceroute).
Both: Gains the advantages and disadvantages of both approaches, depending on which mode the data is transmitted in. Ensuring that data is transmitted correctly becomes important and has implications on the requirement of maintaining known good versions of the client/server software to avoid unintentional or malicious improper routing.
Hybrid: Could take many forms but the one that comes to my mind is a multilevel hub and spoke architecture (I’ll draw this out). Basically, you end up having 2-3 “modes” for a client/server: hub, spoke, and endpoint. One or more client/servers operating in a hub “mode” act like traditional servers, kinda like a bulletin board, holding packets for local delivery or transmission to another hub. Client/servers in the spoke mode act as hops between hubs. Client/servers in the endpoint mode are the actual intended destination (this could be combined with the spoke mode). To protect endpoint identity, the destination could be part of the encrypted data packet allowing an endpoint to attempt to decrypt packets received from a hub locally, making it harder to know which endpoint a message is intended for. This does still require greater visibility of hub addresses for routing.
- Encryption of packets is vital. Supporting some modularity might be of value so as to allow use of simpler cryptography for PoC, but the protocol should ensure that it is possible to break backward compatibility (normally NOT what you want to do for networking protocols but avoiding an “it’s an old code but still checks out” situation would be more important).
- Amateur radio should be avoided both in PoC and hypothetical “production” use cases. The ban on encryption is insurmountable there and illegal use of encryption could lead to heightened visibility because the FCC, historically, does not fuck around with illegal radio signals. This means all wireless should be below 1W in the US, in bands that are legal for unlicensed use.
- Any physical layer that supports arbitrary data transfers should be possible. The implementation to support it would be part of the client/server. So Bluetooth, 802.11, LoRa, sneakernet, and many others could hypothetically be supported. Again, though, this relies on the protocol’s stack being able to understand it, either directly or translated by another component.
- A web of trust may be a good approach for authentication and identity.
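The hybrid idea above, where an endpoint tries every packet a hub holds and keeps only the ones it can open, can be sketched with the standard library alone. This is a stand-in rather than real trial decryption: instead of a public-key scheme it uses a keyed tag (HMAC) over a random nonce with a secret shared between sender and endpoint, and the packet layout and function names are assumptions for illustration:

```python
import hashlib
import hmac
import os

NONCE_LEN = 16
TAG_LEN = 16

def make_packet(recipient_secret, body):
    """Prefix the body with a nonce and a tag only the recipient can recompute."""
    nonce = os.urandom(NONCE_LEN)
    tag = hmac.new(recipient_secret, nonce, hashlib.sha256).digest()[:TAG_LEN]
    return nonce + tag + body

def is_for_me(my_secret, packet):
    """Endpoint-side check: recompute the tag with our own secret and compare."""
    nonce = packet[:NONCE_LEN]
    tag = packet[NONCE_LEN:NONCE_LEN + TAG_LEN]
    expected = hmac.new(my_secret, nonce, hashlib.sha256).digest()[:TAG_LEN]
    return hmac.compare_digest(tag, expected)

if __name__ == "__main__":
    alice, bob = os.urandom(32), os.urandom(32)
    packet = make_packet(alice, b"(encrypted body would go here)")
    print(is_for_me(alice, packet), is_for_me(bob, packet))
```

Since the nonce is fresh each time, two packets for the same endpoint carry unrelated tags, so a hub or eavesdropper without the secret cannot group packets by destination. The cost is that every endpoint has to check every packet it sees.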
Darn, I have to go now. Apologies for the considerable latency there might be getting back to you on this, haha!
Alright, I’m back.
I was talking about amateur radio (in general) as a physical layer because I’m familiar with it, and know it can support short, wide-enough bursts with total radio silence in between. That’s an important requirement because if you’re loud continuously, in the “prod” case, jackboots with a yagi will show up and arrest you. Spies use fast, wide digital radio transmissions a bit like this in really locked-down countries, just not networked together in any way.
If more end-user hardware - or even a non-RF medium - would work, great, no issue. Like you said, there’s no way to support too many assuming they’re safe.
For routing, I would suggest no incoming transmission (or “transmission” if it’s really a hardwire connection) is ignored, but when to rebroadcast is left flexible for the user, who will be able to assess risk and likelihood of success getting closer to the destination in a way no reasonable software could.
Hybrid: Could take many forms but the one that comes to my mind is a multilevel hub and spoke architecture (I’ll draw this out). Basically, you end up having 2-3 “modes” for a client/server: hub, spoke, and endpoint. One or more client/servers operating in a hub “mode” act like traditional servers, kinda like a bulletin board, holding packets for local delivery or transmission to another hub. Client/servers in the spoke mode act as hops between hubs. Client/servers in the endpoint mode are the actual intended destination (this could be combined with the spoke mode). To protect endpoint identity, the destination could be part of the encrypted data packet allowing an endpoint to attempt to decrypt packets received from a hub locally, making it harder to know which endpoint a message is intended for. This does still require greater visibility of hub addresses for routing.
Yeah, so a hub just makes good sense - with such a modest network capacity relative to hardware capabilities, why not gather as much in one place as possible? Because one hub might get busted or just fall to some version of enshittification, it should be easy enough for a user to switch, but I think it’s the best choice of central organising principle.
Other than anonymity, is there a reason to separate out spokes from endpoints? One thing I already have worked out is a system where the hub can keep track of who has helped transmit things (in a cheat proof way), and could simply give credit for traffic moved, offsetting whatever cost there is to use it (ISPs aren’t usually free to start with, and this one is a safety risk to operate). The bandwidth overhead is literally just a key ID (address) and a hash per hop.
I figured switching keys frequently would be enough to ensure a degree of anonymity, since it’s completely pseudonymous. We don’t have a guarantee packets will arrive in order or in any reasonable timeframe, but if we did I’d suggest rolling through keys by count or timestamp.
A web of trust may be a good approach for authentication and identity.
I don’t really have anything to add there. Proving identity beyond just “I hold this key” is out of the scope of what I was considering. I’d probably go about it the same way I would over a more traditional network, if it came up.
Edit: Oh, and I’m not really sure how well this all dovetails into IP. If it can, that’s great, of course.