Default branch: MAIN
Revision 1.36 / (download) - annotate - [select for diffs], Mon Apr 22 13:30:22 2024 UTC (5 weeks, 6 days ago) by bluhm
Branch: MAIN
CVS Tags: HEAD
Changes since 1.35: +5 -1 lines
Diff to previous 1.35 (colored)
Show pf fragment reassembly counters. Fragment counts and statistics are stored in struct pf_status. From there pfctl(8) and systat(1) collect and show them. Note that pfctl -s info needs the -v switch to show fragments. As fragment reassembly has its own mutex, also grab this in pf ioctl(2) and sysctl(2) code. input claudio@; OK henning@
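The counter scheme described above can be sketched in userland C: counters live in a shared status struct, updates happen under the subsystem's own mutex, and the export path takes the same mutex to copy a consistent snapshot. All names here are illustrative, not the kernel's.

```c
#include <pthread.h>

/* Hypothetical stand-in for the fragment fields in struct pf_status. */
struct frag_status {
	unsigned long reassembled;	/* packets successfully reassembled */
	unsigned long dropped;		/* fragments dropped */
};

static struct frag_status frag_status;
static pthread_mutex_t frag_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Reassembly code updates counters while holding its own mutex. */
void
frag_account(int ok)
{
	pthread_mutex_lock(&frag_mtx);
	if (ok)
		frag_status.reassembled++;
	else
		frag_status.dropped++;
	pthread_mutex_unlock(&frag_mtx);
}

/* The ioctl(2)/sysctl(2) export path grabs the same mutex and copies a
 * snapshot instead of reading the live counters unlocked. */
struct frag_status
frag_snapshot(void)
{
	struct frag_status snap;

	pthread_mutex_lock(&frag_mtx);
	snap = frag_status;
	pthread_mutex_unlock(&frag_mtx);
	return snap;
}
```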
Revision 1.35 / (download) - annotate - [select for diffs], Mon Jan 1 22:16:51 2024 UTC (5 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_7_5_BASE,
OPENBSD_7_5
Changes since 1.34: +8 -2 lines
Diff to previous 1.34 (colored)
Protect link between pf and inp with mutex. Introduce a global mutex to protect the pointers between pf state key and internet PCB. Then in_pcbdisconnect() and in_pcbdetach() do not need the exclusive netlock anymore. Use a bunch of read-once unlocked accesses to reduce the performance impact. OK sashan@
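The pattern can be sketched like this: a single global mutex guards the pair of pointers linking the two objects, and readers use one unlocked load to skip the mutex entirely in the common "not linked" case, re-checking under the lock before touching anything. Structure and function names are made up for the example.

```c
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the kernel's READ_ONCE: a single volatile load. */
#define READ_ONCE(x) (*(volatile __typeof__(x) *)&(x))

struct inpcb;
struct state_key { struct inpcb *sk_inp; };
struct inpcb     { struct state_key *inp_sk; };

static pthread_mutex_t link_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Both pointers are only written under the global link mutex. */
void
link_attach(struct state_key *sk, struct inpcb *inp)
{
	pthread_mutex_lock(&link_mtx);
	sk->sk_inp = inp;
	inp->inp_sk = sk;
	pthread_mutex_unlock(&link_mtx);
}

void
link_detach(struct inpcb *inp)
{
	/* Unlocked read-once: skip the mutex if nothing is linked. */
	if (READ_ONCE(inp->inp_sk) == NULL)
		return;
	pthread_mutex_lock(&link_mtx);
	if (inp->inp_sk != NULL) {	/* re-check under the lock */
		inp->inp_sk->sk_inp = NULL;
		inp->inp_sk = NULL;
	}
	pthread_mutex_unlock(&link_mtx);
}
```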
Revision 1.34 / (download) - annotate - [select for diffs], Thu Jul 6 04:55:05 2023 UTC (10 months, 4 weeks ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_7_4_BASE,
OPENBSD_7_4
Changes since 1.33: +22 -13 lines
Diff to previous 1.33 (colored)
big update to pfsync to try and clean up locking in particular. moving pf forward has been a real struggle, and pfsync has been a constant source of pain. we have been papering over the problems for a while now, but it reached the point that it needed a fundamental restructure, which is what this diff is.

the big headliner changes in this diff are:

- pfsync specific locks: this is the whole reason for this diff. rather than rely on NET_LOCK or KERNEL_LOCK or whatever, pfsync now has its own locks to protect its internal data structures. this is important because pfsync runs a bunch of timeouts and tasks to push pfsync packets out on the wire, or when it's handling requests generated by incoming pfsync packets, both of which happen outside pf itself running. having pfsync specific locks around pfsync data structures makes the mutations of these data structures a lot more explicit and auditable.

- partitioning: to enable future parallelisation of the network stack, this rewrite includes support for pfsync to partition states into different "slices". these slices run independently, ie, the states collected by one slice are serialised into a separate packet from the states collected and serialised by another slice. states are mapped to pfsync slices based on the pf state hash, which is the same hash that the rest of the network stack and multiq hardware uses.

- no more pfsync called from netisr: pfsync used to be called from netisr to try and bundle packets, but now that there are multiple pfsync slices this doesn't make sense. instead it uses tasks in the softnet tqs.

- improved bulk transfer handling: there are shiny new state machines around both the bulk transmit and receive handling. pfsync used to do horrible things to carp demotion counters, but now it is very predictable and returns the counters back where they started.

- better tdb handling: the tdb handling was pretty hairy, but hrvoje has kicked this around a lot with ipsec and sasyncd and we've found and fixed a bunch of issues as a result of that testing.

- mpsafe pf state purges: this was committed previously, but because the locks pfsync relied on weren't clear this just caused a ton of bugs. as part of this diff it's now reliable, and moves a big chunk of work out from under KERNEL_LOCK, which in turn improves the responsiveness and throughput of a firewall even if you're not using pfsync.

there's a bunch of other little changes along the way, but the above are the big ones. hrvoje has done performance testing with this diff and notes a big improvement when pfsync is not in use. performance when pfsync is enabled is about the same, but i'm hoping the slices mean we can scale along with pf as it improves.

lots (months) of testing by me and hrvoje on pfsync boxes. tests and ok sashan@. deraadt@ says this is a good time to put it in
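The slice partitioning can be sketched very simply: the pf state hash (the same hash the rest of the stack and multiq hardware use) selects a slice, so a given state always lands on the same slice and each slice can serialise its states independently. Names and the slice count are illustrative.

```c
#include <stdint.h>

#define NSLICES 4	/* illustrative; must be a power of two for the mask */

/* Minimal stand-in for a pfsync slice's private data. */
struct slice {
	unsigned int pending;	/* states queued in this slice */
};

static struct slice slices[NSLICES];

/* Map a state to its slice using the state hash, so the same state is
 * always handled by the same slice. */
struct slice *
slice_for_hash(uint32_t hash)
{
	return &slices[hash & (NSLICES - 1)];
}
```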
Revision 1.33 / (download) - annotate - [select for diffs], Wed May 10 22:42:51 2023 UTC (12 months, 3 weeks ago) by sashan
Branch: MAIN
Changes since 1.32: +4 -1 lines
Diff to previous 1.32 (colored)
nat-to may fail to insert a state due to a conflict on the chosen source port number. This is typically indicated by a 'wire key attach failed on...' message when pf(4) debugging is enabled. The problem is caused by a glitch in pf_get_sport(), which fails to discover the conflict in advance. In order to fix it we must also calculate the toeplitz hash in pf_get_sport() to initialize the lookup key properly. the bug has been kindly reported by joosepm _von_ gmail _dot_ com OK dlg@
Revision 1.32 / (download) - annotate - [select for diffs], Mon May 8 23:52:36 2023 UTC (12 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.31: +2 -3 lines
Diff to previous 1.31 (colored)
fix up some formatting in the pf_state_list comment.
Revision 1.31 / (download) - annotate - [select for diffs], Fri Apr 28 14:08:38 2023 UTC (13 months ago) by sashan
Branch: MAIN
Changes since 1.30: +25 -1 lines
Diff to previous 1.30 (colored)
This change speeds up the DIOCGETRULE ioctl(2) which pfctl(8) uses to retrieve rules from the kernel. The current implementation requires roughly O(n^2/2) operations to read the complete rule set, because each DIOCGETRULE operation must iterate over the previous n rules to find the (n + 1)-th rule to read. To address the issue the diff introduces a pf_trans structure to keep a pointer to the next rule to read, so the reading process does not need to iterate from the beginning of the rule set to reach the next rule. All transactions opened by a process get closed either when the process is done (has read all rules) or when the /dev/pf device is closed. the diff also comes with lots of improvements from dlg@ and kn@ OK dlg@, kn@
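The before/after can be sketched with a linked list of rules: without a cursor, fetching rule n walks the first n-1 rules again each time (quadratic in total), while a transaction that remembers the next rule makes the full read linear. The structures here are illustrative, not the kernel's.

```c
#include <stddef.h>

struct rule { struct rule *next; int nr; };

/* Old style: every DIOCGETRULE-like call restarts from the head and
 * walks n links to reach the n-th rule. */
struct rule *
get_rule_nth(struct rule *head, int n)
{
	struct rule *r = head;

	while (r != NULL && n-- > 0)
		r = r->next;
	return r;
}

/* New style: a per-open transaction keeps a cursor, so each call just
 * returns the next rule and advances. */
struct trans { struct rule *cursor; };

struct rule *
trans_get_next(struct trans *t)
{
	struct rule *r = t->cursor;

	if (r != NULL)
		t->cursor = r->next;
	return r;
}
```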
Revision 1.30 / (download) - annotate - [select for diffs], Fri Jan 6 17:44:34 2023 UTC (16 months, 3 weeks ago) by sashan
Branch: MAIN
CVS Tags: OPENBSD_7_3_BASE,
OPENBSD_7_3
Changes since 1.29: +20 -1 lines
Diff to previous 1.29 (colored)
PF_ANCHOR_STACK_MAX is insufficient protection against stack overflow. On amd64 the stack overflows for an anchor rule with depth ~30. The tricky thing is that the 'safe' depth varies depending on the kind of packet processed by pf_match_rule(). For example, for a local outbound TCP packet the stack overflows when recursion in pf_match_rule() reaches depth 24. Instead of lowering PF_ANCHOR_STACK_MAX to 20 and hoping it will be enough on all platforms and for all packets, I'd like to stop calling pf_match_rule() recursively. This commit brings back the pf_anchor_stackframe array we used to have back in 2017. It also revives patrick@'s idea to pre-allocate the stack frame arrays from per-cpu memory. OK kn@
Revision 1.29 / (download) - annotate - [select for diffs], Wed Jan 4 10:31:55 2023 UTC (16 months, 4 weeks ago) by dlg
Branch: MAIN
Changes since 1.28: +5 -1 lines
Diff to previous 1.28 (colored)
move the pf_state_tree_id type from pfvar.h to pfvar_priv.h. the pf_state_tree_id type is private to the kernel. while here, move it from being an RB tree to an RBT tree. this saves about 12k in pf.o on amd64. ok sashan@
Revision 1.28 / (download) - annotate - [select for diffs], Wed Jan 4 02:00:49 2023 UTC (16 months, 4 weeks ago) by dlg
Branch: MAIN
Changes since 1.27: +5 -1 lines
Diff to previous 1.27 (colored)
move the pf_state_tree rb tree type from pfvar.h to pfvar_priv.h the pf_state_tree types are kernel private, and are not used by userland. make build agrees with me. while here, move the pf_state_tree from the RB macros to the RBT functions. this shaves about 13k off pf.o on amd64. ok sashan@
Revision 1.27 / (download) - annotate - [select for diffs], Thu Dec 22 05:59:27 2022 UTC (17 months, 1 week ago) by dlg
Branch: MAIN
Changes since 1.26: +3 -1 lines
Diff to previous 1.26 (colored)
use stoeplitz to generate a hash/flowid for state keys. the hash will be used to partition work in pf and pfsync in the future, and right now it is used as the first comparison in the rb tree state lookup.

using stoeplitz means that pf will hash traffic the same way that hardware using a stoeplitz key will hash incoming traffic on rings. stoeplitz is also used by the tcp stack to generate a flow id, which is used to pick which transmit ring is used on nics with multiple queues too. using the same algorithm throughout the stack encourages affinity of packets to rings and softnet threads the whole way through.

using the hash as the first comparison in the state rb tree comparison should encourage faster traversal of the state tree by having all the address/port bits summarised into the single hash value. however, tests by hrvoje popovski don't show performance changing. on the plus side, if this change is free from a performance point of view then it makes the future steps more straightforward.

discussed at length at h2k22. tested by sashan@ and hrvoje popovski. ok tb@ sashan@ claudio@ jmatthew@
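The "hash as first comparison" idea can be sketched as a tree compare function: most comparisons are decided by a single integer compare on the precomputed hash, and only hash collisions fall back to comparing the full address/port material. The struct layout is illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative state key: a precomputed hash plus the raw key bytes. */
struct key {
	uint32_t hash;		/* e.g. a stoeplitz hash over addrs/ports */
	uint8_t  addrs[32];	/* the actual address/port material */
};

/* Tree comparison: compare the hash first, so the common case is one
 * integer compare; only collisions need the full memcmp. */
int
key_cmp(const struct key *a, const struct key *b)
{
	if (a->hash < b->hash)
		return -1;
	if (a->hash > b->hash)
		return 1;
	/* rare: hashes equal, fall back to the full comparison */
	return memcmp(a->addrs, b->addrs, sizeof(a->addrs));
}
```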
Revision 1.26 / (download) - annotate - [select for diffs], Wed Dec 21 02:23:10 2022 UTC (17 months, 1 week ago) by dlg
Branch: MAIN
Changes since 1.25: +10 -9 lines
Diff to previous 1.25 (colored)
prefix pf_state_key and pf_state_item struct members to make them more unique. this makes searching for the struct members easier, which in turn makes tweaking code around them a lot easier too. sk_refcnt in particular would have been a lot nicer to fiddle with than just refcnt, because pf_state structs also have a refcnt, which is annoying. tweaks and ok sashan@ reads ok kn@
Revision 1.25 / (download) - annotate - [select for diffs], Mon Dec 19 04:35:34 2022 UTC (17 months, 2 weeks ago) by dlg
Branch: MAIN
Changes since 1.24: +26 -1 lines
Diff to previous 1.24 (colored)
move pf_state_item and pf_state_key structs from pfvar.h to pfvar_priv.h. both of these are kernel private data structures and do not need to be visible to userland. moving them to pfvar_priv.h makes this explicit, and makes it less scary to tweak them in the future. ok deraadt@ kn@ sashan@
Revision 1.24 / (download) - annotate - [select for diffs], Fri Dec 16 02:05:45 2022 UTC (17 months, 2 weeks ago) by dlg
Branch: MAIN
Changes since 1.23: +3 -3 lines
Diff to previous 1.23 (colored)
always keep pf_state_keys attached to pf_states. pf_state structures don't contain ip addresses, protocols, ports, etc. that information is stored in a pf_state_key struct, which is used to wire a state into the state table. when things like pfsync or the pf state ioctls want to export information about a state, particularly the addresses on it, they need the pf_state_key struct to read from.

before this diff the code assumed that when a state was removed from the state tables it could throw the pf_state_key structs away as part of that removal. this diff changes it so once pf_state_insert succeeds, a pf_state will keep its references to the pf_state_key structs until the pf_state struct itself is being destroyed. this allows anything that holds a reference to a pf_state to also look at the pf_state_key structs, because they're now effectively an immutable part of the pf_state struct.

this is by far the simplest and most straightforward fix for pfsync crashing on pf_state_key dereferences we've come up with so far. it has been made possible by the addition of reference counts to pf_state and pf_state_key structs, which allows us to properly account for this adjusted lifecycle for pf_state_keys on pf_state structs.

sashan@ and i have been kicking this diff around for a couple of weeks now. ok sashan@ jmatthew@
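The adjusted lifecycle can be sketched with a tiny refcount model: the state takes its own reference on the key at insert time, table removal no longer drops it, and only final destruction of the state releases it, so anyone holding the state may safely read the key. The refcount API and names are illustrative, not pf's.

```c
#include <stdlib.h>

/* Illustrative stand-ins for the refcounted structs. */
struct state_key { int refcnt; };
struct state     { struct state_key *key; };

struct state_key *
key_ref(struct state_key *sk)
{
	sk->refcnt++;
	return sk;
}

void
key_unref(struct state_key *sk)
{
	if (--sk->refcnt == 0)
		free(sk);
}

/* On successful insert, the state takes its own reference on the key. */
void
state_insert(struct state *st, struct state_key *sk)
{
	st->key = key_ref(sk);
}

/* Removal from the state table no longer touches st->key; only final
 * destruction of the state releases its reference. */
void
state_destroy(struct state *st)
{
	key_unref(st->key);
	st->key = NULL;
	free(st);
}
```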
Revision 1.23 / (download) - annotate - [select for diffs], Fri Nov 25 20:27:53 2022 UTC (18 months, 1 week ago) by bluhm
Branch: MAIN
Changes since 1.22: +4 -2 lines
Diff to previous 1.22 (colored)
revert pf.c r1.1152 again: move pf_purge out from under the kernel lock. Using systqmp for pf_purge creates a deadlock between pf_purge() and ixgbe_stop() and possibly other drivers. On systqmp pf(4) needs the netlock, which the interface ioctl(2) is holding. ix(4) waits in sched_barrier(), which is also scheduled on the systqmp task queue. Removing the netlock from pf_purge() as a quick fix caused other problems. backout suggested by deraadt@
Revision 1.22 / (download) - annotate - [select for diffs], Thu Nov 24 00:04:32 2022 UTC (18 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.21: +1 -2 lines
Diff to previous 1.21 (colored)
Remove netlock assertion within PF_LOCK(). The netlock should be taken first, but only if both locks are taken. ok dlg@ sashan@
Revision 1.21 / (download) - annotate - [select for diffs], Fri Nov 11 17:12:30 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.20: +2 -2 lines
Diff to previous 1.20 (colored)
me and my text editor are not getting along today
Revision 1.20 / (download) - annotate - [select for diffs], Fri Nov 11 16:12:08 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.19: +2 -4 lines
Diff to previous 1.19 (colored)
try pf.c r1.1143 again: move pf_purge out from under the kernel lock. this also avoids holding NET_LOCK too long.

the main change is done by running the purge tasks in systqmp instead of systq. the pf state list was recently reworked so iteration over the states can be done without blocking insertions. however, scanning a lot of states can still take a lot of time, so this also makes the state list scanner yield if it has spent too much time running. the other purge tasks for source nodes, rules, and fragments have been moved to their own timeout/task pair to simplify the time accounting.

in my environment, before this change pf purges often took 10 to 50ms. the softclock thread that runs next to it often took a similar amount of time, presumably because they ended up spinning waiting for each other. after this change the pf purges are more like 6 to 12ms, and don't block softclock. most of the variability in the runs now seems to come from contention on the net lock.

tested by me sthen@ chris@. ok sashan@ kn@ claudio@. the diff was backed out because it made things a bit more racey, but sashan@ has squashed those races this week. let's try it again.
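The yielding scanner can be sketched as a resumable scan: each run processes at most a budget's worth of states, remembers where it stopped, and the next run continues from there, so no single run holds the cpu (or its locks) for the whole list. Here a simple item budget stands in for the kernel's time check, and all names are illustrative.

```c
#include <stddef.h>

struct st { struct st *next; int expired; };

/* Scan at most `budget` states starting at *resume, counting expired
 * ones into *purged; update *resume so the next run continues where
 * this one stopped. Returns 1 when the whole list has been scanned. */
int
purge_scan(struct st **resume, int budget, int *purged)
{
	struct st *s = *resume;

	while (s != NULL && budget-- > 0) {
		if (s->expired)
			(*purged)++;	/* a real purge would unlink here */
		s = s->next;
	}
	*resume = s;	/* NULL means the scan completed */
	return s == NULL;
}
```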
Revision 1.19 / (download) - annotate - [select for diffs], Fri Nov 11 15:02:31 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.18: +4 -2 lines
Diff to previous 1.18 (colored)
add a mutex to struct pf_state and init it. nothing is protected by it yet but it will allow us to provide consistent updates to individual states without relying on a global lock. getting that right between the packet processing in pf itself, pfsync, the pf purge code, the ioctl paths, etc is not worth the required contortions. while pf_state does grow, it doesn't use more cachelines on machines where we will want to run in parallel with a lot of states. stolen from and ok sashan@
Revision 1.18 / (download) - annotate - [select for diffs], Fri Nov 11 12:50:45 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.17: +32 -32 lines
Diff to previous 1.17 (colored)
kn points out that brackets are not parentheses
Revision 1.17 / (download) - annotate - [select for diffs], Fri Nov 11 12:36:05 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.16: +2 -2 lines
Diff to previous 1.16 (colored)
fix a misuse of vi.
Revision 1.16 / (download) - annotate - [select for diffs], Fri Nov 11 12:29:32 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.15: +32 -32 lines
Diff to previous 1.15 (colored)
kn@ points out that lock annotations are usually wrapped in ()
Revision 1.15 / (download) - annotate - [select for diffs], Fri Nov 11 12:06:17 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.14: +41 -32 lines
Diff to previous 1.14 (colored)
steal a change by sashan@ to say which bits of pf_state need which locks. not all members are annotated yet, but that's because there's no clear protection for them yet. ok sashan@
Revision 1.14 / (download) - annotate - [select for diffs], Fri Nov 11 11:02:35 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.13: +8 -1 lines
Diff to previous 1.13 (colored)
rewrite the pf_state_peer_ntoh and pf_state_peer_hton macros as functions. i can read this code as functions, but it takes too much effort as macros.
Revision 1.13 / (download) - annotate - [select for diffs], Fri Nov 11 10:55:48 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.12: +48 -1 lines
Diff to previous 1.12 (colored)
move struct pf_state from pfvar.h to pfvar_priv.h. we (sashan) are going to add a mutex to the pf_state struct, but a mutex is a kernel data structure that changes shape depending on things like whether MULTIPROCESSOR is enabled, and should therefore not be visible to userland. when we added a mutex to pf_state, compiling pfctl failed because it doesn't know what a mutex is and it can't know which version of it the current kernel is running with. moving struct pf_state to pfvar_priv.h makes it clear it is a private kernel only data structure, and avoids this leak into userland. tested by me and make build ok sashan@
Revision 1.12 / (download) - annotate - [select for diffs], Mon Nov 7 16:35:12 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.11: +2 -1 lines
Diff to previous 1.11 (colored)
revert "move pf_purge out from under the kernel lock". hrvoje popovski showed me pfsync blowing up with this. im backing it out quickly in case something else at the hackathon makes it harder to do later. kn@ agrees
Revision 1.11 / (download) - annotate - [select for diffs], Mon Nov 7 12:56:38 2022 UTC (18 months, 3 weeks ago) by dlg
Branch: MAIN
Changes since 1.10: +2 -3 lines
Diff to previous 1.10 (colored)
move pf_purge out from under the kernel lock and avoid hogging the cpu. this also avoids holding NET_LOCK too long.

the main change is done by running the purge tasks in systqmp instead of systq. the pf state list was recently reworked so iteration over the states can be done without blocking insertions. however, scanning a lot of states can still take a lot of time, so this also makes the state list scanner yield if it has spent too much time running. the other purge tasks for source nodes, rules, and fragments have been moved to their own timeout/task pair to simplify the time accounting.

in my environment, before this change pf purges often took 10 to 50ms. the softclock thread that runs next to it often took a similar amount of time, presumably because they ended up spinning waiting for each other. after this change the pf purges are more like 6 to 12ms, and don't block softclock. most of the variability in the runs now seems to come from contention on the net lock.

tested by me sthen@ chris@. ok sashan@ kn@ claudio@
Revision 1.10 / (download) - annotate - [select for diffs], Fri Apr 29 08:58:49 2022 UTC (2 years, 1 month ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_7_2_BASE,
OPENBSD_7_2
Changes since 1.9: +4 -1 lines
Diff to previous 1.9 (colored)
IGMP and ICMP6 MLD packets always have the router alert option set. pf blocked IPv4 options and IPv6 option headers by default. This forced users to set allow-opts in pf rules. Better to let multicast work by default. Detect router alerts by parsing IP options and hop-by-hop headers. If the packet has only this option and is a multicast control packet, do not block it due to bad options. tested by otto@; OK sashan@
Revision 1.9 / (download) - annotate - [select for diffs], Fri Apr 8 18:17:24 2022 UTC (2 years, 1 month ago) by bluhm
Branch: MAIN
Changes since 1.8: +3 -3 lines
Diff to previous 1.8 (colored)
Rename the pf state assert locked macro consistently. OK sashan@
Revision 1.8 / (download) - annotate - [select for diffs], Sun Jan 2 22:36:04 2022 UTC (2 years, 4 months ago) by jsg
Branch: MAIN
CVS Tags: OPENBSD_7_1_BASE,
OPENBSD_7_1
Changes since 1.7: +2 -2 lines
Diff to previous 1.7 (colored)
spelling ok jmc@ reads ok tb@
Revision 1.7 / (download) - annotate - [select for diffs], Wed Jun 23 06:53:52 2021 UTC (2 years, 11 months ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_7_0_BASE,
OPENBSD_7_0
Changes since 1.6: +109 -1 lines
Diff to previous 1.6 (colored)
augment the global pf state list with its own locks. before this, things that iterated over the global list of pf states had to take the net, pf, or pf state locks. in particular, the ioctls that dump the state table took the net and pf state locks before iterating over the states and using copyout to export them to userland. when we tried replacing the use of rwlocks with mutexes under the pf locks, this blew up because you can't sleep when holding a mutex and there's a sleeping lock used inside copyout.

this diff introduces two locks around the global state list: a mutex that protects the head and tail of the list, and an rwlock that protects the links between elements in the list. inserts on the state list only occur during packet handling and can be done by taking the mutex and putting the state on the tail before releasing the mutex. iterating over states is only done from thread/process contexts, so we can take a read lock, then the mutex to get a snapshot of the head and tail pointers, and then keep the read lock to iterate between the head and tail pointers. because it's a read lock we can then take other sleeping locks (eg, the one inside copyout) without (further) gymnastics.

the pf state purge code takes the rwlock exclusively and the mutex to remove elements from the list. this allows the ioctls and purge code to loop over the list concurrently and largely without blocking the creation of states when pf is processing packets. pfsync also iterates over the state list when doing bulk sends, which the state purge code needs to be careful around.

ok sashan@
Revision 1.6 / (download) - annotate - [select for diffs], Tue Feb 9 14:06:19 2021 UTC (3 years, 3 months ago) by patrick
Branch: MAIN
CVS Tags: OPENBSD_6_9_BASE,
OPENBSD_6_9
Changes since 1.5: +1 -16 lines
Diff to previous 1.5 (colored)
Activate use of PF_LOCK() by removing the WITH_PF_LOCK ifdefs. Silence from the network group ok sashan@
Revision 1.5 / (download) - annotate - [select for diffs], Tue Sep 11 07:53:38 2018 UTC (5 years, 8 months ago) by sashan
Branch: MAIN
CVS Tags: OPENBSD_6_8_BASE,
OPENBSD_6_8,
OPENBSD_6_7_BASE,
OPENBSD_6_7,
OPENBSD_6_6_BASE,
OPENBSD_6_6,
OPENBSD_6_5_BASE,
OPENBSD_6_5,
OPENBSD_6_4_BASE,
OPENBSD_6_4
Changes since 1.4: +36 -1 lines
Diff to previous 1.4 (colored)
- moving state lookup outside of PF_LOCK(). this change adds a pf_state_lock rw-lock, which protects the consistency of the state table in PF. The code delivered in this change is guarded by 'WITH_PF_LOCK', which is still undefined. People who are willing to experiment and want to run it must do two things: - compile the kernel with -DWITH_PF_LOCK - bump NET_TASKQ from 1 to ... the sky is the limit (just select some sensible value for the number of tasks your system is able to handle) OK bluhm@
Revision 1.4 / (download) - annotate - [select for diffs], Sun Aug 6 13:16:11 2017 UTC (6 years, 9 months ago) by mpi
Branch: MAIN
CVS Tags: OPENBSD_6_3_BASE,
OPENBSD_6_3,
OPENBSD_6_2_BASE,
OPENBSD_6_2
Changes since 1.3: +6 -1 lines
Diff to previous 1.3 (colored)
Reduce contention on the NET_LOCK() by moving the logic of the pfpurge thread to a task running on the `softnettq`. Tested and inputs from Hrvoje Popovski. ok visa@, sashan@
Revision 1.3 / (download) - annotate - [select for diffs], Mon Jun 5 22:18:28 2017 UTC (6 years, 11 months ago) by sashan
Branch: MAIN
Changes since 1.2: +35 -1 lines
Diff to previous 1.2 (colored)
- let's add PF_LOCK(). To enable PF_LOCK(), you must add 'option WITH_PF_LOCK' to your kernel configuration. The code does not do much currently; it's just a very small step towards MP. O.K. henning@, mikeb@, mpi@
Revision 1.2 / (download) - annotate - [select for diffs], Tue Nov 22 19:29:54 2016 UTC (7 years, 6 months ago) by procter
Branch: MAIN
CVS Tags: OPENBSD_6_1_BASE,
OPENBSD_6_1
Changes since 1.1: +11 -21 lines
Diff to previous 1.1 (colored)
Fold the union pf_headers buffer into struct pf_pdesc (enabled by pfvar_priv.h). Prevent pf_socket_lookup() from reading uninitialised header buffers on fragments. OK bluhm@ sashan@
Revision 1.1 / (download) - annotate - [select for diffs], Wed Oct 26 21:07:22 2016 UTC (7 years, 7 months ago) by bluhm
Branch: MAIN
Put union pf_headers and struct pf_pdesc into a separate header file pfvar_priv.h. The pf_headers had to be defined in multiple .c files before. In pfvar.h it would have unknown storage size, and that file is included in too many places. The idea is to have a private pf header that is only included in the pf part of the kernel. For now it contains pf_pdesc and pf_headers; it may be extended later. discussion, input and OK henning@ procter@ sashan@