Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch to single port and MDNS broadcast per device for all projects #91

Closed
gmaclennan opened this issue Mar 8, 2023 · 8 comments
Closed

Comments

@gmaclennan
Copy link
Member

Currently our network discovery assumes a separate connection for each project (each listening on a different port). In addition we need to discover other Mapeo devices not on the same project for project invites.

Because all mdns packets are visible to anyone on the network, we can't actually trust that a connection on a given connection is actually on the same project. In Hyperswarm connections in server mode (which all Mapeo devices will be) do not have a list of topics anyway.

I suggest the following changes to our discovery:

  1. A single listening port for local connections.
  2. All devices advertise the open port on LAN under a common service type name e.g. _mapeo._tcp.
  3. [maybe] Projects that a device is part of are advertised as a TXT record pj=

I'm not sure advertising projects is necessary? Other devices could still connect to it and discover through the protomux handshake whether it is actually part of project. If we are connecting to devices anyway (for being able to receive project invites, and perhaps updates) then there is no cost to this.

If we do advertise projects, we should use a shortened version of project discovery key, to keep within the limits of DNS-SD TXT records. If we use 8-bytes (64 bits) that allows up to 30 projects to be advertised, which seems like a more reasonable limit than 7 if using full 32-byte keys. There is no big negative consequence of a key collision here, since I device would try to connect and fail because they don't have the actual project key.

Also to consider, is whether we have two separate connections for local vs. internet connections? The advantage of having two is to easily identify local vs. internet traffic, however we can't guarantee that a connection on the port advertised on the LAN is actually local (we are just assuming that port is not known outside the LAN), so we probably want a separate check for incoming connections (e.g. check the IP).

@gmaclennan
Copy link
Member Author

Also to consider, is whether we have two separate connections for local vs. internet connections?

I think we should have two separate connections, and block non-local from the mdns socket, and local from the hyperswarm socket. Only connections on the mdns socket should have the RPC (invite) protocol muxed.

@ximenabb ximenabb removed the sprint 9 label Jun 26, 2023
@gmaclennan gmaclennan changed the title Proposed changes to local discovery (and possibly network discovery) Switch to single port and MDNS broadcast per device for all projects Jul 12, 2023
@sethvincent
Copy link
Contributor

sethvincent commented Aug 8, 2023

Right now discovery works like this:

  • there's one tcp server for mdns that can be used across projects
  • there's a broadcast for each topic (project id) as service subtypes, a neat alternative to txt records that don't have such a restrictive length limit
  • hyperswarm currently uses its own, default server, also one across projects

So it seems like there's separate issues here:

  • switch to only one broadcast that advertises a device (this is a great change)
  • remove topics from the Discovery class and expect our protomux handshake will take care of connecting devices to the correct projects/invites
  • split out hyperswarm connections somehow to treat them differently? and how do the differences between an always-on server peer and a remote mobile device complicate things?
  • is there a need to pass a custom server to hyperswarm? that means passing our own instance of hyperdht
  • remove the PeerInfo class?

We should confirm which of these we want to pursue, which we don't, and make a few separate issues that group related code problems listed above as this one is too broad.

@gmaclennan
Copy link
Member Author

Frustratingly I spent 30 mins writing a detailed response and got stung by a github bug that lost my text. Will try again with the time I have:

  • switch to only one broadcast that advertises a device (this is a great change)

Yes, agreed

  • remove topics from the Discovery class and expect our protomux handshake will take care of connecting devices to the correct projects/invites

Yes I think this makes sense.

  • split out hyperswarm connections somehow to treat them differently? and how do the differences between an always-on server peer and a remote mobile device complicate things?

We want to achieve 3 things (damn I had 4 in the first draft but can't remember them now):

  1. Control (via the Mapeo API) over any internet traffic, which means turning hyperswarm discovery on/off and not accepting any non-local connections on any open ports.
  2. We only want the RPC protomux channel to be added to local connections
  3. We want to avoid duplicate connections (via MDNS and Hyperswarm if the user is also doing Hyperswarm internet discovery).

I think the way to achieve this is by blocking connections to/from local devices in hyperswarm. We can block incoming connection via the firewall (requires this patch until merged). Not sure about outgoing connections?

  • is there a need to pass a custom server to hyperswarm? that means passing our own instance of hyperdht

Not sure if this is needed, but may be.

  • remove the PeerInfo class?

If this makes sense, I'm not currently familiar enough with the code.

We should confirm which of these we want to pursue, which we don't, and make a few separate issues that group related code problems listed above as this one is too broad.

Yes, definitely. Can you do this and create estimates for each task for Sprint 4?

Related issues and prs:

@sethvincent
Copy link
Contributor

mdns: advertise a device
hyperswarm: advertise a device? each project?

Advertising a device over hyperswarm seems potentially weird. I'm not able to visualize how that would work and am a little skeptical about it in general.

And if we're advertising each project over hyperswarm but each device over mdns that seems like it could get messy.

@sethvincent
Copy link
Contributor

sethvincent commented Aug 9, 2023

Made two issues based on what we've talked about: #160 #161

The other related issues listed above seem to cover most of the topics discussed here, and we should figure out what hyperswarm will advertise and then (potentially) open an issue for that.

@gmaclennan
Copy link
Member Author

And if we're advertising each project over hyperswarm but each device over mdns that seems like it could get messy.

With hyperswarm you are advertising your device and a list of topics you are interested in. When you receive an incoming connection you don't know what topics the connecting peer is interested in, if any. Outgoing connections will only be made to peers that are advertising topics that you are interested in - although that is no guarantee they actually have they public key associated with the discovery id used for the topic.

So with hyperswarm you treat connections as "probably interested in the topics I'm advertising" then mux replication streams, and the capability check for each replication stream will validate shared knowledge of hypercore keys.

With the change to a single MDNS service we're essentially advertising a single topic of interest "Mapeo" and all clients interested in that will connect to each other.

In both cases we need to mux replication streams to actually check if a peer has the shared keys. The "topics" in hyperswarm are just a way of filtering devices in the DHT to only connect to those who probably have the stuff you are interested in. For local connections we actually want to connect to all Mapeo devices (so that invites work), and we can assume that the number of connections is small enough to connect to all local devices, so we don't need to filter with topics/ project keys.

@sethvincent
Copy link
Contributor

sethvincent commented Aug 10, 2023

Right! So I think we'll need this:

One issue each for implementing separate classes like MdnsDiscovery and HyperswarmDiscovery (or other names).

Those tasks will be easiest to write as new code that can copy/paste any useful bits from the existing discovery implementation. Because of that, all the existing issues are somewhat extraneous and could actually be acceptance criteria for two larger issues.

I think the two issues could look something like this:

Feel free to edit anything in those.

Issues to close in favor of the above two issues

@gmaclennan @tomasciccola I'll let you all review this and make edits to open issues and close the ones listed here, to give you the chance to change the plan if needed.

Also

This issue may need to be reconsidered based on how we track peers in the new implementation: #12

@sethvincent sethvincent removed their assignment Aug 24, 2023
@sethvincent
Copy link
Contributor

closing this as other issues cover tasks more specifically.

@github-project-automation github-project-automation bot moved this from 🔖 Todo to ✅ Done in Mapeo - Sprint 2023 (Archived) Aug 24, 2023
@sethvincent sethvincent removed their assignment Sep 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

3 participants