Default branch: MAIN
Revision 1.35, Tue Feb 13 12:22:09 2024 UTC (3 months, 3 weeks ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_7_5_BASE, OPENBSD_7_5, HEAD
Changes since 1.34: +1 -2 lines
Merge struct route and struct route_in6. Use a common struct route for both inet and inet6. Unfortunately struct sockaddr is shorter than sockaddr_in6, so netinet/in.h has to be exposed from net/route.h. Struct route has to be bsd visible for userland as netstat kvm code inspects inp_route. Internet PCB and TCP SYN cache can use a plain struct route now. All specific sockaddr types for inet and inet6 are embedded there. OK claudio@
Revision 1.34, Sat Dec 23 10:52:54 2023 UTC (5 months, 2 weeks ago) by bluhm
Branch: MAIN
Changes since 1.33: +3 -1 lines
Backout always allocate per-CPU statistics counters for network interface descriptor. It panics during attach of em(4) device at boot.
Revision 1.33, Fri Dec 22 23:01:50 2023 UTC (5 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.32: +1 -3 lines
Always allocate per-CPU statistics counters for network interface descriptor. We have a mess in network interface statistics. Only pseudo drivers do per-CPU counters allocation, all other network devices use the old `if_data'. The network stack partially uses per-CPU counters and partially uses `if_data', but the protection is inconsistent: sometimes counters are accessed with exclusive netlock, sometimes with shared netlock, sometimes with kernel lock but without netlock, sometimes with other locks. To make network interface statistics more consistent, always allocate per-CPU counters at interface attachment time and use them instead of `if_data'. At this step only move counters allocation to the if_attach() internals. The `if_data' removal will be performed with the following diffs to make review and tests easier. ok bluhm
Revision 1.32, Thu Nov 23 23:45:10 2023 UTC (6 months, 2 weeks ago) by dlg
Branch: MAIN
Changes since 1.31: +8 -7 lines
avoid passing weird mbuf chains to pf when pushing out a veb. pf expects the ip header to be in the first mbuf of the chain we pass to pf_test, but in some situations the ethernet header is the only data in the first mbuf. after we remove the ethernet header, the first mbuf had no data in it which confused pf. fix this by passing all packets to ip_check on output as well as input. ip input handlers do all the necessary m_pullups. found by Mark Patruck.
Revision 1.31, Tue May 16 14:32:54 2023 UTC (12 months, 3 weeks ago) by jan
Branch: MAIN
CVS Tags: OPENBSD_7_4_BASE, OPENBSD_7_4
Changes since 1.30: +2 -2 lines
Use separate IFCAPs for LRO and TSO. This diff introduces separate capabilities for TCP offloading. We split this into LRO (large receive offloading) and TSO (TCP segmentation offloading). LRO can be turned on/off via tcprecvoffload option of ifconfig and is not inherited to sub interfaces. TSO is inherited by sub interfaces to signal this hardware offloading capability to the network stack. With tweaks from bluhm, claudio and dlg ok bluhm, claudio
Revision 1.30, Mon Feb 27 09:35:32 2023 UTC (15 months, 2 weeks ago) by jan
Branch: MAIN
CVS Tags: OPENBSD_7_3_BASE, OPENBSD_7_3
Changes since 1.29: +3 -1 lines
Turn off TSO if interface is added to layer 2 devices. ok bluhm@, claudio@
Revision 1.29, Wed Jun 1 17:34:13 2022 UTC (2 years ago) by sashan
Branch: MAIN
CVS Tags: OPENBSD_7_2_BASE, OPENBSD_7_2
Changes since 1.28: +3 -3 lines
callers to pf(4) must continue to run with packet as returned by firewall. OK dlg@
Revision 1.28, Sun May 15 21:37:29 2022 UTC (2 years ago) by bluhm
Branch: MAIN
Changes since 1.27: +9 -16 lines
Use strncmp() and IFNAMSIZ for if_xname in veb(4) consistently. OK dlg@
Revision 1.27, Sun May 15 03:54:07 2022 UTC (2 years ago) by deraadt
Branch: MAIN
Changes since 1.26: +2 -2 lines
gcc insists the decl for veb_ports_free also use inline
Revision 1.26, Sun May 15 03:18:41 2022 UTC (2 years ago) by dlg
Branch: MAIN
Changes since 1.25: +332 -78 lines
avoid calling if_enqueue from an smr critical section. claudio@ is right that as a rule of thumb it is a bad idea to call arbitrary code from an smr crit section because the scope of what is called is very hard to keep in your head. in this particular case sashan@ points out that if_enqueue can call vport handlers, which calls if_vinput, which will push a packet into the network stack, which will call pf and try to take an rwlock. you can't sleep in an smr crit section. SMRs in this situation are protecting references to ports in the list of span and actual ports attached to a veb. when we needed to send an unknown unicast, broadcast, or multicast packet the code would SMR_TAILQ_FOREACH over all the ports, duplicating the mbuf and calling if_enqueue against the port. span port handling is basically the same, but we unconditionally send to them. this replaces the SMR_TAILQ with maps (arrays) of ports. the veb port map data structure contains a struct refcnt and the number of ports. the forwarding paths use an SMR crit section to get a reference to the map, increase the refcnt, and then leave the smr crit section before iterating over the array of ports in the map. after the iteration they release the refcnt. this does add a couple of atomic ops in the forwarding path, but only in the uncommon case (most packets are (should be) to known unicast addresses), and it's only one set of ops for all ports instead of ops per port. the known unicast case follows this pattern too. reported by Barbaros Bilek on bugs@. fix tested by me and hrvoje popovski. ok claudio@ sashan@ bluhm@ (who also did a lot of the initial analysis)
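The take-a-reference-then-iterate pattern described in this message can be sketched in userland C. This is an illustration under stated assumptions, not the kernel code: `struct port_map`, `map_create()`, `map_rele()`, `flood()` and `count_tx()` are hypothetical names, C11 atomics stand in for the kernel's struct refcnt, and the SMR critical section is only marked by comments because the sketch is single threaded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the port map: a refcount plus an array. */
struct port_map {
	atomic_uint	refs;	/* 1 for the owner, +1 per active reader */
	unsigned int	nports;
	int		ports[];	/* stand-in for port pointers */
};

struct port_map *
map_create(const int *ports, unsigned int nports)
{
	struct port_map *m;
	unsigned int i;

	m = malloc(sizeof(*m) + nports * sizeof(m->ports[0]));
	if (m == NULL)
		return (NULL);
	atomic_init(&m->refs, 1);
	m->nports = nports;
	for (i = 0; i < nports; i++)
		m->ports[i] = ports[i];
	return (m);
}

/* Drop a reference; returns 1 if it was the last one and the map is freed. */
int
map_rele(struct port_map *m)
{
	if (atomic_fetch_sub(&m->refs, 1) == 1) {
		free(m);
		return (1);
	}
	return (0);
}

int tx_total;	/* demo state for count_tx() */

/* Demo transmit callback: a real one could sleep, which is the point. */
void
count_tx(int port)
{
	tx_total += port;
}

/* Send to every port in the map; returns the number of ports visited. */
unsigned int
flood(struct port_map *current, void (*tx)(int))
{
	struct port_map *m;
	unsigned int i;

	/* a kernel would enter the SMR read section here */
	m = current;
	atomic_fetch_add(&m->refs, 1);
	/* ...and leave it here, before calling any arbitrary code */

	for (i = 0; i < m->nports; i++)
		tx(m->ports[i]);

	map_rele(m);
	return (i);
}
```

The cost is one increment and one decrement per flooded packet regardless of the number of ports, which matches the trade-off the message describes.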
Revision 1.25, Tue Jan 4 06:32:39 2022 UTC (2 years, 5 months ago) by yasuoka
Branch: MAIN
CVS Tags: OPENBSD_7_1_BASE, OPENBSD_7_1
Changes since 1.24: +2 -2 lines
Add `ipsec_flows_mtx' mutex(9) to protect `ipsp_ids_*' list and trees. ipsp_ids_lookup() returns `ids' with bumped reference counter. original diff from mvs ok mvs
Revision 1.24, Tue Dec 28 23:13:20 2021 UTC (2 years, 5 months ago) by dlg
Branch: MAIN
Changes since 1.23: +1 -2 lines
whitespace tweak, no functional change.
Revision 1.23, Tue Dec 28 23:10:58 2021 UTC (2 years, 5 months ago) by dlg
Branch: MAIN
Changes since 1.22: +5 -0 lines
it doesnt make sense to configure a vport as a span port.
Revision 1.22, Tue Dec 28 23:10:30 2021 UTC (2 years, 5 months ago) by dlg
Branch: MAIN
Changes since 1.21: +47 -24 lines
move away from using the M_PROTO1 flag to prevent loops with vports. if a vlan interface is configured on a vport interface, vlan(4) will take the packet away from ether_input before the veb bridge input handler gets to clear M_PROTO1. this leaves the flag on the mbuf as it goes through the l3 stacks. if it goes back out a vport into a veb, the presence of M_PROTO1 means the packet ends up getting dropped, which is unexpected. this diff specialises vport handling by veb even more to avoid the problem the flag was handling. vports get their own bridge input handler that skips veb processing completely because a packet being received on a vport can only occur if a veb has decided to forward it there and has already processed it. when the stack sends a packet out a vport interface, then we do actual veb bridge input handling. bug reported on misc@ and the fix tested by Simon Baker
Revision 1.21, Mon Nov 8 04:15:46 2021 UTC (2 years, 7 months ago) by dlg
Branch: MAIN
Changes since 1.20: +7 -4 lines
veb rules are an smr list, so traversal should be in an smr crit section reported by stsp@ an earlier diff was tested by and ok stsp@ ok jmatthew@
Revision 1.20, Wed Jul 7 20:19:01 2021 UTC (2 years, 11 months ago) by sashan
Branch: MAIN
CVS Tags: OPENBSD_7_0_BASE, OPENBSD_7_0
Changes since 1.19: +31 -4 lines
tell ether_input() to call pf_test() outside of smr_read sections, because smr_read sections don't play well with sleeping locks in pf(4). OK bluhm@
Revision 1.19, Wed Jun 2 00:44:18 2021 UTC (3 years ago) by dlg
Branch: MAIN
Changes since 1.18: +32 -9 lines
use ipv4_check and ipv6_check to, well, check ip headers before running pf. unlike bridge(4), these checks are only run when the packet is entering the veb/tpmr topology. the assumption is that only valid IP packets end up inside the topology so we don't have to check them when they're leaving. ok bluhm@ sashan@
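To give a sense of what checking an IP header before handing a packet to a filter involves, here is a minimal userland sketch. It is not the kernel's ipv4_check: `ipv4_basic_check()` is a hypothetical name, it works on a flat byte buffer rather than an mbuf chain, and it deliberately omits the header checksum and address sanity checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the IPv4 header length in bytes if the
 * basic structural checks pass, or -1 if the packet should be dropped.
 */
int
ipv4_basic_check(const uint8_t *pkt, size_t len)
{
	size_t hlen, totlen;

	assert(pkt != NULL);

	if (len < 20)			/* shorter than a minimal header */
		return (-1);
	if ((pkt[0] >> 4) != 4)		/* version field must be 4 */
		return (-1);

	hlen = (size_t)(pkt[0] & 0x0f) * 4;	/* IHL is in 32-bit words */
	if (hlen < 20 || hlen > len)
		return (-1);

	totlen = ((size_t)pkt[2] << 8) | pkt[3];	/* network byte order */
	if (totlen < hlen || totlen > len)
		return (-1);

	return ((int)hlen);
}
```

Running checks like these once at the edge is what lets the rest of the topology assume well-formed IP packets.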
Revision 1.18, Thu May 27 03:43:23 2021 UTC (3 years ago) by dlg
Branch: MAIN
Changes since 1.17: +3 -1 lines
ajacoutot says i missed copying some bits from bridge for divert-to.
Revision 1.17, Wed May 26 02:38:01 2021 UTC (3 years ago) by dlg
Branch: MAIN
Changes since 1.16: +11 -1 lines
support divert-to when pf applies it to a packet. when a divert-to rule applies to a packet, pf doesnt take the packet away and shove it in the socket directly. pf marks the packet, and then ip (or ipv6) input processing looks at the mark and picks the local socket to queue it on. because veb operates at layer 2, ip input processing only occurred if the packet was destined to go into a vport interface. bridge(4) handles this by checking if the packet has the pf divert-to mark set on it and calls ip input if it's set. this copies the semantic to veb. this allows divert-to to steal (take?) packets going over a veb and process them on a local socket. reported by ajacoutot@
Revision 1.16, Wed Mar 10 10:21:48 2021 UTC (3 years, 3 months ago) by jsg
Branch: MAIN
CVS Tags: OPENBSD_6_9_BASE, OPENBSD_6_9
Changes since 1.15: +2 -2 lines
spelling ok gnezdo@ semarie@ mpi@
Revision 1.15, Fri Mar 5 06:44:09 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.14: +10 -6 lines
pass the uint64_t dst ethernet address from ether_input to bridges. tested on amd64 and sparc64.
Revision 1.14, Wed Mar 3 00:00:03 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.13: +2 -3 lines
clean up span ports as span ports, not bridge ports. the visible result of this is that span ports aren't made promisc like bridge ports. when cleaning up a span port, trying to take promisc off it screwed up the refs, and it makes the underlying interface not able to be promisc when it should be promisc. found by dave voutila
Revision 1.13, Tue Mar 2 23:40:06 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.12: +4 -3 lines
fix an assert in veb_p_ioctl() that failed when called by a span port. veb_p_ioctl() is used by both veb bridge and veb span ports, but it had an assert to check that it was being called by a veb bridge port. this extends the check so using it on a span port doesnt cause a panic. found by dave voutila
Revision 1.12, Fri Feb 26 01:57:20 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.11: +22 -3 lines
try to do a better job of filtering 802.1 reserved group addresses. if the bridge is supposed to carry vlan packets, assume it's an s-vlan component that should allow certain group addresses to cross between "customer" bridges. i should probably let some of these groups fall back through to the calling ether_input rather than drop them.
Revision 1.11, Fri Feb 26 01:42:47 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.10: +26 -26 lines
use uint64_ts for ethernet addresses in the src/dst bits of rules.
Revision 1.10, Fri Feb 26 01:28:51 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.9: +8 -22 lines
use a uint64_t for the ethernet address in the etherbridge table. testing has shown up to a 30% improvement in the veb forwarding rate with this change. an earlier diff was tested by hrvoje popovski tested on amd64 and sparc64
Revision 1.9, Fri Feb 26 00:16:41 2021 UTC (3 years, 3 months ago) by deraadt
Branch: MAIN
Changes since 1.8: +3 -3 lines
gcc is more strict about union decls ok dlg
Revision 1.8, Wed Feb 24 01:20:03 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.7: +49 -1 lines
add support for adding and deleting address table entries.
Revision 1.7, Tue Feb 23 23:42:17 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.6: +5 -1 lines
handle ifconfig veb0 flush with etherbridge_flush, like bpe and nvgre
Revision 1.6, Tue Feb 23 11:40:28 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.5: +287 -1 lines
make a start on transparent ipsec interception, based on bridge(4). i found the Transparent Network Security Policy Enforcement paper by angelos and jason was useful for understanding the background and why you'd want to do this. the implementation is a little bit different to the bridge one because i've tweaked the order that pf and ipsec processing happens, depending on which direction the packet is going over the bridge. bridge always runs ipsec processing before pf, no matter which direction the packet is going. packets going into veb, pf runs first and then ipsec input processing is allowed to happen. in the outgoing direction ipsec happens first and then pf. pf runs before ipsec in the inbound direction so pf can apply policy to ipsec encapsulated packets before they hit pf. this allows you to apply policy to both the encrypted and unencrypted packets in both directions. the code is disabled for now. this is mostly because i want veb(4) to have a good chance at operating outside the netlock, and i'm pretty sure the ipsec stack isn't ready for that yet. the other reason why it's disabled is getting a test setup is effort, but i want to sleep.
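The direction-dependent ordering described in this message can be written down as a tiny table. This is only a sketch of the stated ordering, with hypothetical names (`enum dir`, `enum stage`, `stage_order()`); it says nothing about how the disabled kernel code is actually structured.

```c
#include <assert.h>
#include <stddef.h>

enum dir { DIR_IN, DIR_OUT };		/* into or out of the veb */
enum stage { STAGE_PF, STAGE_IPSEC };

/* Fill `order' with the two processing stages in the order they run. */
void
stage_order(enum dir d, enum stage order[2])
{
	assert(order != NULL);

	if (d == DIR_IN) {
		/* inbound: pf first, then ipsec input processing */
		order[0] = STAGE_PF;
		order[1] = STAGE_IPSEC;
	} else {
		/* outbound: ipsec first, then pf */
		order[0] = STAGE_IPSEC;
		order[1] = STAGE_PF;
	}
}
```

By contrast, the message says bridge(4) always runs ipsec before pf, which in this sketch would be the `DIR_OUT` ordering for both directions.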
Revision 1.5, Tue Feb 23 07:29:07 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.4: +2 -2 lines
use link0 to allow vlans to cross the bridge.
Revision 1.4, Tue Feb 23 05:23:02 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.3: +24 -1 lines
implement support for the blocknonip port flag.
Revision 1.3, Tue Feb 23 05:01:00 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.2: +48 -1 lines
add support for setting and getting bridge port flags.
Revision 1.2, Tue Feb 23 04:40:27 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.1: +21 -1 lines
filter MAC Bridge component Reserved address. im considering converting ethernet addresses into uint64_ts to make comparisons (and masking) easier. im trialling it here, and it doesn't seem like the worst.
Revision 1.1, Tue Feb 23 03:30:04 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
add veb(4), a Virtual Ethernet Bridge driver. my intention is to replace bridge(4), but the way it works is different enough from bridge that a name change is justified to distinguish them. it also makes it easier to commit it to the tree and work on it in parallel to bridge, and allows a window of migration. the main difference between veb(4) and bridge(4) is how they use interfaces as ports. veb takes over interfaces completely and only uses them to receive and transmit ethernet packets. bridge also uses each interface as a port to the ethernet segment it's connected to, but also tries to continue supporting the use of the interface as a way to talk to the network stack on the local system. supporting the use of interfaces for both external and local communication is where most of my confusion with bridge comes from, both when i'm trying to operate it and also understand the code. changing this semantic is where most of the simplification in veb comes from compared to bridge. because veb takes over interfaces, the ethernet network set up on a veb is isolated from the host network stack. by default veb does not interact with pf or the ip (and mpls) stacks. to enable pf for ip frames going over veb ports link1 on the veb interface must be set. to have the stack interact with a veb network, vport interfaces must be created and added as ports to a veb. the vport interface driver is provided as part of veb, and is handled specially by veb. veb usually prevents the use of ports by the stack for sending and receiving packets, but that's why vports exist, so veb has special handling for them. veb already supports a lot of the other features that bridge has, including bridge rules and protected domains, but i got tired of working out of the tree and stopped implementing them. the main outstanding features are better address table management, the blocknonip flag on ports, transparent ipsec interception, and spanning tree.
i may not bother with spanning tree unless someone tells me that they actually use it. the core ethernet learning bridge functionality is provided by the etherbridge code that was factored out of nvgre and bpe. veb is already (a lot) faster than bridge, and is better prepared to operate in parallel on multiple CPUs concurrently. thanks to hrvoje popovski for testing some earlier versions of this. discussed with many ok patrick@ jmatthew@