OpenBSD CVS

CVS log for src/sys/net/if_var.h


Default branch: MAIN


Revision 1.132 / (download) - annotate - [select for diffs], Sat Dec 23 10:52:54 2023 UTC (5 months, 2 weeks ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_7_5_BASE, OPENBSD_7_5, HEAD
Changes since 1.131: +4 -1 lines
Diff to previous 1.131 (colored)

Backout always allocate per-CPU statistics counters for network
interface descriptor.  It panics during attach of em(4) device at
boot.

Revision 1.131 / (download) - annotate - [select for diffs], Fri Dec 22 23:01:50 2023 UTC (5 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.130: +1 -4 lines
Diff to previous 1.130 (colored)

Always allocate per-CPU statistics counters for network interface
descriptor.

We have a mess in network interface statistics. Only pseudo drivers
do per-CPU counters allocation, all other network devices use the old
`if_data'. The network stack partially uses per-CPU counters and
partially uses `if_data', but the protection is inconsistent: sometimes
counters are accessed with exclusive netlock, sometimes with shared
netlock, sometimes with kernel lock but without netlock, and sometimes
with other locks.

To make network interface statistics more consistent, always allocate
per-CPU counters at interface attachment time and use them instead of
`if_data'. At this step, only move counter allocation to the if_attach()
internals. The `if_data' removal will be performed in the following
diffs to make review and tests easier.

ok bluhm

Revision 1.130 / (download) - annotate - [select for diffs], Sat Nov 11 14:24:03 2023 UTC (6 months, 4 weeks ago) by bluhm
Branch: MAIN
Changes since 1.129: +4 -4 lines
Diff to previous 1.129 (colored)

Pass constant struct sockaddr to interface lookup functions.

OK mvs@

Revision 1.129 / (download) - annotate - [select for diffs], Fri Jul 7 08:05:02 2023 UTC (11 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_7_4_BASE, OPENBSD_7_4
Changes since 1.128: +3 -1 lines
Diff to previous 1.128 (colored)

Fix path MTU discovery for TCP LRO/TSO when forwarding.

When doing LRO (Large Receive Offload), the drivers, currently ix(4)
and lo(4) only, record an upper bound of the size of the original
packets in ph_mss.  When sending, either stack or hardware must
chop the packets with TSO (TCP Segmentation Offload) to that size.
That means we have to call tcp_if_output_tso() before ifp->if_output().
Put that logic into if_output_tso() to avoid code duplication.  As
TCP packets on the wire do not get larger that way, path MTU discovery
should still work.

tested by and OK jan@
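
A minimal sketch of the chopping arithmetic described above, assuming the recorded upper bound is used as a plain segment size. The helper name tso_segment_count() is hypothetical and is not the kernel's tcp_if_output_tso(); it only shows how many packets result when a large payload is cut back to that size.

```c
#include <stddef.h>

/*
 * Hypothetical helper: count the segments produced when payload_len
 * bytes are chopped into pieces of at most mss bytes each.
 */
size_t
tso_segment_count(size_t payload_len, size_t mss)
{
	if (mss == 0)
		return 0;	/* no usable bound was recorded */
	return (payload_len + mss - 1) / mss;	/* ceiling division */
}
```

No segment exceeds the recorded bound, which matches the log's point that packets on the wire do not get larger.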

Revision 1.128 / (download) - annotate - [select for diffs], Wed Jun 28 11:49:49 2023 UTC (11 months, 1 week ago) by kn
Branch: MAIN
Changes since 1.127: +2 -2 lines
Diff to previous 1.127 (colored)

use refcnt API for multicast addresses, add tracepoint:refcnt:ifmaddr probe

Replace hand-rolled reference counting with refcnt_init(9) and hook it up
with a new dt(4) probe.

OK bluhm mvs
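
For readers unfamiliar with the idea, the following is a userland model of an atomic reference counter, written with C11 atomics. It is only a sketch: the rc_model_* names are hypothetical and this is not the kernel's refcnt implementation, which also provides the dt(4) hook mentioned above.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct rc_model {
	atomic_uint refs;
};

/* The creator holds the first reference. */
void
rc_model_init(struct rc_model *r)
{
	atomic_init(&r->refs, 1);
}

void
rc_model_take(struct rc_model *r)
{
	atomic_fetch_add(&r->refs, 1);
}

/* Returns true when the caller dropped the last reference. */
bool
rc_model_rele(struct rc_model *r)
{
	return atomic_fetch_sub(&r->refs, 1) == 1;
}
```

Only the caller that sees true from rc_model_rele() may free the object, which is what a hand-rolled integer counter cannot guarantee under concurrent access.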

Revision 1.127 / (download) - annotate - [select for diffs], Tue May 30 08:30:01 2023 UTC (12 months, 1 week ago) by jsg
Branch: MAIN
Changes since 1.126: +2 -2 lines
Diff to previous 1.126 (colored)

spelling
ok jmc@ guenther@ tb@

Revision 1.126 / (download) - annotate - [select for diffs], Sun May 7 16:23:23 2023 UTC (13 months ago) by bluhm
Branch: MAIN
Changes since 1.125: +5 -3 lines
Diff to previous 1.125 (colored)

In preparation for TSO in software, clean up the fragment code.  Use
if_output_ml() to send mbuf lists to interfaces.  This can be used
for TSO, fragments, ARP and ND6.  Rename variable fml to ml.  In
pf_route6() split the if else block.  Put the safety check (hlen +
firstlen < tlen) into ip_fragment().  It makes the code correct in
case the packet is too short to be fragmented.  This should not
happen, but other functions also have this logic.
No functional change.  OK sashan@

Revision 1.125 / (download) - annotate - [select for diffs], Tue Apr 18 22:01:24 2023 UTC (13 months, 3 weeks ago) by mvs
Branch: MAIN
Changes since 1.124: +3 -2 lines
Diff to previous 1.124 (colored)

Remove kernel lock from ifa_ifwithaddr() and ifa_ifwithdstaddr().
Netlock protects `if_list', `ifa_list' and returned `ifa' dereference,
so put netlock assertion within.

Please note, rtable_setsource() doesn't destroy the data pointed to by
`ar_source'. This is the `ifa_addr' data belonging to `ifa', and the
exclusive netlock is required to destroy it. So the kernel lock is not
required within rt_setsource(). Take netlock in the rt_setsource() caller
to make `ifa' dereference safe.

Suggestions and ok by bluhm@

Revision 1.124 / (download) - annotate - [select for diffs], Tue Apr 18 22:00:19 2023 UTC (13 months, 3 weeks ago) by mvs
Branch: MAIN
Changes since 1.123: +2 -2 lines
Diff to previous 1.123 (colored)

Document `ifnetlist' locking.

We use both kernel and net lock to protect `ifnetlist'. This means we
do modifications with both locks held, but for read-only access only one
lock is required. Some places doing an `ifnetlist' foreach loop are
protected by the kernel lock, and a context switch can't be introduced
there. This is the exception, so an "XXXSMP:" comment was added.

Proposed and ok by bluhm@

Revision 1.123 / (download) - annotate - [select for diffs], Wed Apr 5 19:35:23 2023 UTC (14 months ago) by bluhm
Branch: MAIN
Changes since 1.122: +3 -1 lines
Diff to previous 1.122 (colored)

ARP has a queue of packets that should be sent after name resolution.
ND6 held only a single packet.  Unify the logic and add an mbuf
hold queue to struct llinfo_nd6.  This is MP safe and queue limits
are tracked with atomic operations.  New function if_mqoutput() has
common code for ARP and ND6.  ln_saddr6 holds the source address
of the requesting packet.  That is easier than fiddling with mbuf
queue in nd6_ns_output().
OK kn@
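
One way a queue limit can be tracked with atomic operations is sketched below. This is an assumption about the general technique, not the code from this commit: the names hold_reserve(), hold_release() and HOLD_LIMIT are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define HOLD_LIMIT 2u		/* hypothetical global limit */

static atomic_uint hold_total;	/* packets currently held */

/* Try to account for one more held packet; fail if over the limit. */
bool
hold_reserve(void)
{
	if (atomic_fetch_add(&hold_total, 1) >= HOLD_LIMIT) {
		atomic_fetch_sub(&hold_total, 1);	/* roll back */
		return false;
	}
	return true;
}

/* Called when a held packet is sent or dropped. */
void
hold_release(void)
{
	atomic_fetch_sub(&hold_total, 1);
}
```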

Revision 1.122 / (download) - annotate - [select for diffs], Wed Nov 23 14:50:59 2022 UTC (18 months, 2 weeks ago) by kn
Branch: MAIN
CVS Tags: OPENBSD_7_3_BASE, OPENBSD_7_3
Changes since 1.121: +1 -2 lines
Diff to previous 1.121 (colored)

Remove unused struct ifnet's *if_afdata[] and struct domain's dom_if{at,de}tach()

Both made obsolete through struct ifnet's previous *if_nd addition.

IPv6 Neighbour Discovery handles per-interface data directly, nothing
else uses this generic domain API anymore.

Outside of _KERNEL, but nothing in base uses them, either.

OK bluhm mvs claudio

Revision 1.121 / (download) - annotate - [select for diffs], Wed Nov 23 14:48:27 2022 UTC (18 months, 2 weeks ago) by kn
Branch: MAIN
Changes since 1.120: +2 -1 lines
Diff to previous 1.120 (colored)

Add *if_nd to struct ifnet, call nd6_if{at,de}tach() directly

*if_afdata[] and struct domain's dom_if{at,de}tach() are only used with
IPv6 Neighbour Discovery in6_dom{at,de}tach(), which allocate/init and
free single struct nd_ifinfo.

Set up a new ND-specific *if_nd member directly to avoid yet another
layer of indirection and thus make the generic domain API obsolete.

The per-interface data is only accessed in nd6.c and nd6_nbr.c through
the ND_IFINFO() macro;  it is allocated and freed exactly once during
interface at/detach, so document it as [I]mmutable.

OK bluhm mvs claudio

Revision 1.120 / (download) - annotate - [select for diffs], Mon Nov 14 22:06:26 2022 UTC (18 months, 3 weeks ago) by kn
Branch: MAIN
Changes since 1.119: +11 -11 lines
Diff to previous 1.119 (colored)

Document global interface group list locking

The per-interface group list is protected by the net lock and already
documented as such.

The global interface group list `ifg_head' is also protected by the net
lock and all access to it (all within if.c) take it accordingly.

Feedback OK mvs

Revision 1.119 / (download) - annotate - [select for diffs], Thu Nov 10 12:46:19 2022 UTC (19 months ago) by kn
Branch: MAIN
Changes since 1.118: +2 -2 lines
Diff to previous 1.118 (colored)

typofix; ok dlg

Revision 1.118 / (download) - annotate - [select for diffs], Tue Nov 8 18:43:22 2022 UTC (19 months ago) by kn
Branch: MAIN
Changes since 1.117: +14 -13 lines
Diff to previous 1.117 (colored)

Document ifc_list immutability

Move up to comment explaining different locks to account for all structs.

OK millert mvs

Revision 1.117 / (download) - annotate - [select for diffs], Thu Sep 8 10:22:06 2022 UTC (21 months ago) by kn
Branch: MAIN
CVS Tags: OPENBSD_7_2_BASE, OPENBSD_7_2
Changes since 1.116: +2 -2 lines
Diff to previous 1.116 (colored)

Rename global ifnet TAILQ

Naming the list like the struct itself makes for awful grepping.
Call the global variable "ifnetlist" from now on.

There used to be kvm(3) consumers in base picking up this symbol, but those
have long been converted to other interfaces.

A few potential ports users remain, same deal as sys/net/if_var.h r1.116
"Remove struct ifnet's unused if_switchport member":  they get bumped.

Previous users pointed out by deraadt
OK bluhm

Revision 1.116 / (download) - annotate - [select for diffs], Tue Aug 30 18:23:58 2022 UTC (21 months, 1 week ago) by kn
Branch: MAIN
Changes since 1.115: +1 -2 lines
Diff to previous 1.115 (colored)

Remove struct ifnet's unused if_switchport member

This is a switch(4) left-over.

Even though it is defined under _KERNEL, a few ports do define it and
include <net/if_var.h>, so this removal warrants a REVISION bump for all
potential ports consumers (once ports bulk machines run on a snapshot
containing this commit).

OK mvs

Revision 1.115 / (download) - annotate - [select for diffs], Mon Aug 29 07:51:45 2022 UTC (21 months, 1 week ago) by bluhm
Branch: MAIN
Changes since 1.114: +3 -2 lines
Diff to previous 1.114 (colored)

Use struct refcnt for interface address reference counting.
There was a crash due to use after free of the ifa although it is
ref counted.  As ifa_refcnt was a simple integer increment, there
may be a path where multiple CPUs access it concurrently.  So change
to struct refcnt which is MP safe and provides dt(4) leak debugging.
Link level address for IPsec enc(4) and various MPLS interfaces is
special.  There ifa is part of struct sc.  Use refcount anyway and
add a panic to detect use after free.
bug report stsp@; OK mvs@

Revision 1.114 / (download) - annotate - [select for diffs], Sat Feb 20 04:55:52 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_7_1_BASE, OPENBSD_7_1, OPENBSD_7_0_BASE, OPENBSD_7_0, OPENBSD_6_9_BASE, OPENBSD_6_9
Changes since 1.113: +3 -1 lines
Diff to previous 1.113 (colored)

add p2p_input, like ether_input but for l3 tunnel interfaces.

the l3 protocol input to push the packet is based on a value in
m->m_pkthdr.ph_family, which tunnel drivers should set before calling
if_vinput.

add p2p_bpf_mtap to call bpf_mtap_af also using m->m_pkthdr.ph_family.

Revision 1.113 / (download) - annotate - [select for diffs], Sat Feb 20 04:35:41 2021 UTC (3 years, 3 months ago) by dlg
Branch: MAIN
Changes since 1.112: +2 -1 lines
Diff to previous 1.112 (colored)

give interfaces an if_bpf_mtap handler.

the network stack is now responsible for calling bpf for packets
that the interface receives, and we so far got away with using
bpf_mtap_ether for everything. this doesn't work if layer 3 input
goes through the same functions, so letting drivers specify the
appropriate bpf mtap function means they will be able to cope.

Revision 1.112 / (download) - annotate - [select for diffs], Wed Jul 29 12:09:31 2020 UTC (3 years, 10 months ago) by mvs
Branch: MAIN
CVS Tags: OPENBSD_6_8_BASE, OPENBSD_6_8
Changes since 1.111: +2 -2 lines
Diff to previous 1.111 (colored)

Interface index is an unsigned integer. Fix the places where it is
referenced as signed. u_int is used within pipex(4) for consistency with
other code.

ok dlg@ mpi@

Revision 1.111 / (download) - annotate - [select for diffs], Fri Jul 24 18:17:15 2020 UTC (3 years, 10 months ago) by mvs
Branch: MAIN
Changes since 1.110: +4 -3 lines
Diff to previous 1.110 (colored)

Use interface index instead of pointer to `ifnet' in carp(4).

ok sashan@

Revision 1.110 / (download) - annotate - [select for diffs], Wed Jul 22 02:16:02 2020 UTC (3 years, 10 months ago) by dlg
Branch: MAIN
Changes since 1.109: +2 -7 lines
Diff to previous 1.109 (colored)

deprecate interface input handler lists, just use one input function.

the interface input handler lists were originally set up to help
us during the initial mpsafe network stack work. at the time not all
the virtual ethernet interfaces (vlan, svlan, bridge, trunk, etc)
were mpsafe, so we wanted a way to avoid them by default, and only
take the kernel lock hit when they were specifically enabled on the
interface. since then, they have been fixed up to be mpsafe.

i could leave the list in place, but it has some semantic problems.
because virtual interfaces filter packets based on the order they
were attached to the parent interface, you can get packets taken
away in surprising ways, especially when you reboot and netstart
does something different to what you did by hand. by hardcoding the
order that things like vlan and bridge get to look at packets, we
can document the behaviour and get consistency.

it also means we can get rid of a use of SRPs which were difficult
to replace with SMRs. the interface input handler list is an SRPL,
which we would like to deprecate. it turns out that you can sleep
during stack processing, which you're not supposed to do with SRPs
or SMRs, but SRPs are a lot more forgiving and it worked.

lastly, it turns out that this code is faster than the input list
handling, so lots of winning all around.

special thanks to hrvoje popovski and aaron bieber for testing.
this has been in snaps as part of a larger diff for over a week.

Revision 1.109 / (download) - annotate - [select for diffs], Fri Jul 10 13:26:42 2020 UTC (3 years, 11 months ago) by patrick
Branch: MAIN
Changes since 1.108: +1 -8 lines
Diff to previous 1.108 (colored)

Change users of IFQ_SET_MAXLEN() and IFQ_IS_EMPTY() to use the "new" API.

ok dlg@ tobhe@

Revision 1.108 / (download) - annotate - [select for diffs], Fri Jul 10 13:23:34 2020 UTC (3 years, 11 months ago) by patrick
Branch: MAIN
Changes since 1.107: +1 -6 lines
Diff to previous 1.107 (colored)

Change users of IFQ_PURGE() to use the "new" API.

ok dlg@ tobhe@

Revision 1.107 / (download) - annotate - [select for diffs], Fri Jul 10 13:22:22 2020 UTC (3 years, 11 months ago) by patrick
Branch: MAIN
Changes since 1.106: +1 -12 lines
Diff to previous 1.106 (colored)

Change users of IFQ_DEQUEUE(), IFQ_ENQUEUE() and IFQ_LEN() to use the
"new" API.

ok dlg@ tobhe@

Revision 1.106 / (download) - annotate - [select for diffs], Sat Jul 4 08:06:08 2020 UTC (3 years, 11 months ago) by anton
Branch: MAIN
Changes since 1.105: +5 -5 lines
Diff to previous 1.105 (colored)

It's been agreed upon that global locks should be expressed using
capital letters in locking annotations. Therefore harmonize the existing
annotations.

Also, if multiple locks are required they should be delimited using
commas.

ok mpi@

Revision 1.105 / (download) - annotate - [select for diffs], Tue May 12 08:49:54 2020 UTC (4 years, 1 month ago) by jan
Branch: MAIN
Changes since 1.104: +2 -1 lines
Diff to previous 1.104 (colored)

Set timeout(9) to refill the receive ring descriptors if the amount of
descriptors runs below the low watermark.

The em(4) firmware seems not to work properly with just a few descriptors in
the receive ring.  Thus, we use the low water mark as an indicator instead of
zero descriptors, which causes deadlocks.

ok kettenis@
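
The refill decision described above can be modelled as a simple comparison. This is a sketch under assumptions: struct rxr_model and rxr_model_needs_refill() are hypothetical names, and the real ring accounting has more state than shown.

```c
#include <stdbool.h>

struct rxr_model {
	unsigned int alive;	/* descriptors currently on the ring */
	unsigned int lwm;	/* low watermark */
};

/*
 * Arm the refill timeout as soon as the ring runs below the low
 * watermark, rather than waiting until it is completely empty.
 */
bool
rxr_model_needs_refill(const struct rxr_model *r)
{
	return r->alive < r->lwm;
}
```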

Revision 1.104 / (download) - annotate - [select for diffs], Sun Apr 12 07:02:43 2020 UTC (4 years, 1 month ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_6_7_BASE, OPENBSD_6_7
Changes since 1.103: +2 -2 lines
Diff to previous 1.103 (colored)

say if_pcount needs NET_LOCK instead of the kernel lock.

if_pcount is only touched in ifpromisc(), and ifpromisc() needs
NET_LOCK anyway because it also modifies if_flags.

suggested by mpi@
ok visa@

Revision 1.103 / (download) - annotate - [select for diffs], Fri Nov 8 07:16:29 2019 UTC (4 years, 7 months ago) by dlg
Branch: MAIN
Changes since 1.102: +7 -4 lines
Diff to previous 1.102 (colored)

convert interface address change hooks to tasks and a task_list.

this follows what's been done for detach and link state hooks, and
makes handling of hooks generally more robust.

address hooks are a bit different to detach/link state hooks in
that there's only a few things that register hooks (carp, pf, vxlan),
but a lot of places to run the hooks (lots of ipv4 and ipv6 address
configuration).

an address hook cookie was in struct pfi_kif, which is part of the
pf abi. rather than break pfctl -sI, this maintains the void * used
for the cookie and uses it to store a task, which is then used as
intended with the new api.

Revision 1.102 / (download) - annotate - [select for diffs], Thu Nov 7 07:36:32 2019 UTC (4 years, 7 months ago) by dlg
Branch: MAIN
Changes since 1.101: +4 -2 lines
Diff to previous 1.101 (colored)

turn the linkstate hooks into a task list, like the detach hooks.

this is largely mechanical, except for carp. this moves the addition
of the carp link state hook after we're committed to using the new
interface as a carpdev. because the add can't fail, we avoid a
complicated unwind dance. also, this tweaks the carp linkstate hook
so it only updates the relevant carp interface, not all of the
carpdevs on the parent.

hrvoje popovski has tested an early version of this diff and it's
generally ok, but there's some splasserts that this diff fires that
i'll fix in an upcoming diff.

ok claudio@

Revision 1.101 / (download) - annotate - [select for diffs], Wed Nov 6 03:51:26 2019 UTC (4 years, 7 months ago) by dlg
Branch: MAIN
Changes since 1.100: +5 -2 lines
Diff to previous 1.100 (colored)

replace the hooks used with if_detachhooks with a task list.

the main semantic change is that things registering detach hooks
have to allocate and set a task structure that then gets added to
the list. this means if the task is allocated up front (eg, as part
of carps softc or bridges port structure), it avoids the possibility
that adding a hook can fail. a lot of drivers weren't checking for
failure, and unwinding state in the event of failure in other parts
was error prone.

while doing this i discovered that the list operations have to be
in a particular order, but drivers weren't doing that consistently
either. this diff wraps the list ops up so you have to seriously
go out of your way to screw them up.

i've also sprinkled some NET_ASSERT_LOCKED around the list operations
so we can make sure there's no potential for the list to be corrupted,
especially while it's being run.

hrvoje popovski has tested this a bit, and some issues he discovered
have been fixed.

ok sashan@
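
The point about up-front allocation can be illustrated with a small userland model. It is a sketch only: the hook_* names are hypothetical and this is not the kernel's task API. Because the caller owns the storage for each entry, adding a hook needs no allocation and therefore cannot fail.

```c
#include <stddef.h>

struct hook_task {
	void (*fn)(void *);
	void *arg;
	struct hook_task *next;
};

struct hook_list {
	struct hook_task *head;
};

void
hook_task_set(struct hook_task *t, void (*fn)(void *), void *arg)
{
	t->fn = fn;
	t->arg = arg;
	t->next = NULL;
}

/* Cannot fail: the task is embedded in the caller's own structure. */
void
hook_add(struct hook_list *l, struct hook_task *t)
{
	t->next = l->head;
	l->head = t;
}

void
hook_del(struct hook_list *l, struct hook_task *t)
{
	struct hook_task **pp;

	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == t) {
			*pp = t->next;
			t->next = NULL;
			return;
		}
	}
}

void
hooks_run(struct hook_list *l)
{
	struct hook_task *t;

	for (t = l->head; t != NULL; t = t->next)
		t->fn(t->arg);
}

/* Example hook: bump the integer passed as argument. */
void
hook_count(void *arg)
{
	(*(int *)arg)++;
}
```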

Revision 1.100 / (download) - annotate - [select for diffs], Wed Jun 26 09:36:06 2019 UTC (4 years, 11 months ago) by claudio
Branch: MAIN
CVS Tags: OPENBSD_6_6_BASE, OPENBSD_6_6
Changes since 1.99: +2 -1 lines
Diff to previous 1.99 (colored)

Create IF_WWAN_DEFAULT_PRIORITY which is lower than
IF_WIRELESS_DEFAULT_PRIORITY and use it in umb(4) as default prio.
OK kettenis@, sthen@

Revision 1.99 / (download) - annotate - [select for diffs], Sun Apr 28 22:15:57 2019 UTC (5 years, 1 month ago) by mpi
Branch: MAIN
Changes since 1.98: +2 -2 lines
Diff to previous 1.98 (colored)

Removes the KERNEL_LOCK() from bridge(4)'s output fast-path.

This redefines the ifp <-> bridge relationship.  No lock can currently
be used across the multiple contexts where the bridge has tentacles to
protect a pointer, so use an interface index.

Tested by various, ok dlg@, visa@
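
A minimal model of why an index is safer than a stored pointer: the index is resolved at the time of use, so a detached interface yields NULL instead of a dangling pointer. All idx_* names are hypothetical, and reference counting on the returned interface is deliberately omitted from this sketch.

```c
#include <stddef.h>

#define IDX_MAXIF 8	/* hypothetical table size */

struct idx_ifnet {
	unsigned int if_index;
};

static struct idx_ifnet *idx_map[IDX_MAXIF];

void
idx_attach(struct idx_ifnet *ifp)
{
	if (ifp->if_index < IDX_MAXIF)
		idx_map[ifp->if_index] = ifp;
}

void
idx_detach(unsigned int idx)
{
	if (idx < IDX_MAXIF)
		idx_map[idx] = NULL;
}

/* Resolve an index; returns NULL if the interface is gone. */
struct idx_ifnet *
idx_get(unsigned int idx)
{
	return idx < IDX_MAXIF ? idx_map[idx] : NULL;
}
```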

Revision 1.98 / (download) - annotate - [select for diffs], Mon Apr 22 03:26:16 2019 UTC (5 years, 1 month ago) by dlg
Branch: MAIN
Changes since 1.97: +2 -1 lines
Diff to previous 1.97 (colored)

add if_vinput so pseudo (ethernet) interfaces can bypass ifiqs

if_vinput assumes that the interface that it's called against uses
per cpu counters so it can count input packets, but basically does
all the things that if_input and ifiq_input do. the main difference
is it assumes the network stack is already running and runs the
interface input handlers directly. this is instead of queuing the
packets for a nettq to run.

ifiqs aren't free, especially when they only run per packet like
they do on pseudo interfaces. this allows that overhead to be
bypassed.

Revision 1.97 / (download) - annotate - [select for diffs], Fri Apr 19 07:38:02 2019 UTC (5 years, 1 month ago) by dlg
Branch: MAIN
Changes since 1.96: +6 -1 lines
Diff to previous 1.96 (colored)

provide factored out txhprio and rxhprio checks

l2 and l3 drivers do the same thing all the time, so reduce the
chance of error by doing the checks once and making it available
for drivers to call instead of rolling on their own again.

Revision 1.96 / (download) - annotate - [select for diffs], Tue Apr 16 04:04:19 2019 UTC (5 years, 1 month ago) by dlg
Branch: MAIN
Changes since 1.95: +5 -1 lines
Diff to previous 1.95 (colored)

have another go at tx mitigation

the idea is to call the hardware transmit routine less since in a
lot of cases posting a producer ring update to the chip is (very)
expensive. it's better to do it for several packets instead of each
packet, hence calling this tx mitigation.

this diff defers the call to the transmit routine to a network
taskq, or until a backlog of packets has built up. dragonflybsd
uses 16 as the size of its backlog, so i'm copying them for now.

i've tried this before, but previous versions caused deadlocks. i
discovered that the deadlocks in the previous version were from
ifq_barrier calling taskq_barrier against the nettq. interfaces
generally hold NET_LOCK while calling ifq_barrier, but the tq might
already be waiting for the lock we hold.

this version just doesn't have ifq_barrier call taskq_barrier. it
instead relies on the IFF_RUNNING flag and normal ifq serialiser
barrier to guarantee the start routine won't be called when an
interface is going down. the taskq_barrier is only used during
interface destruction to make sure the task struct won't get used
in the future, which is already done without the NET_LOCK being
held.

tx mitigation provides a nice performance bump in some setups. up
to 25% in some cases.

tested by tb@ and hrvoje popovski (who's running this in production).
ok visa@
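
The deferral policy described above can be sketched as follows. This is a simplified userland model with hypothetical txm_* names: the expensive start routine runs immediately once a backlog of 16 packets has built up, and is otherwise left to a deferred task, which the test harness would simulate by calling txm_start() itself.

```c
#include <stdbool.h>

#define TXM_BACKLOG 16	/* backlog size quoted in the log message */

struct txm_queue {
	unsigned int pending;	/* packets queued but not yet started */
	unsigned int starts;	/* calls to the (expensive) start routine */
	bool deferred;		/* a deferred start has been requested */
};

/* Stand-in for the hardware transmit routine: drains the backlog. */
void
txm_start(struct txm_queue *q)
{
	if (q->pending == 0)
		return;
	q->starts++;
	q->pending = 0;
	q->deferred = false;
}

void
txm_enqueue(struct txm_queue *q)
{
	q->pending++;
	if (q->pending >= TXM_BACKLOG)
		txm_start(q);		/* backlog built up: start now */
	else
		q->deferred = true;	/* a taskq would call txm_start() */
}
```

In this model sixteen enqueues cost one start call instead of sixteen, which is the saving the commit is after.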

Revision 1.95 / (download) - annotate - [select for diffs], Sun Mar 31 13:58:18 2019 UTC (5 years, 2 months ago) by mpi
Branch: MAIN
CVS Tags: OPENBSD_6_5_BASE, OPENBSD_6_5
Changes since 1.94: +6 -3 lines
Diff to previous 1.94 (colored)

Document that it is safe to dereference `if_softc' when the caller has
a valid reference to the corresponding `ifp'.

ok visa@

Revision 1.94 / (download) - annotate - [select for diffs], Wed Jan 9 01:14:21 2019 UTC (5 years, 5 months ago) by dlg
Branch: MAIN
Changes since 1.93: +3 -1 lines
Diff to previous 1.93 (colored)

split if_enqueue up so drivers can replace ifq handling if needed

if_enqueue() still makes sure packets get handled by pf on the way
out, and seen by bridge if needed. however instead of falling through
to ifq mapping and output, it now calls a function pointer in the
ifnet struct. that pointer defaults to the ifq handling, but drivers
can override it to bypass ifq processing.

the most obvious users of the function pointer will be virtual
interfaces, eg, vlan(4). ifqs are good if you need to serialise
access to the thing that transmits packets (like hardware rings on
nics), or mitigate the number of times you do ring processing, but
neither of those things are desirable on vlan interfaces. ideally
vlan could transmit on any cpu without having packets serialised
by its own ifq before being pushed down to an arbitrary number of
rings on the parent interface. bypassing ifqs means the driver can
push the vlan tag on concurrently and push down to the parent from
any cpu.

ok mpi@
no objection from claudio@
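
The split can be pictured as a function pointer with a default. The sketch below is a userland model with hypothetical enq_* names and an int standing in for a packet; it does not reproduce the real struct ifnet layout.

```c
#include <stddef.h>

struct enq_ifnet;
typedef int (*enq_fn)(struct enq_ifnet *, int);

struct enq_ifnet {
	enq_fn if_enqueue;	/* NULL until attach picks a default */
	int queued;		/* packets that took the ifq path */
	int direct;		/* packets that bypassed the ifq */
};

/* Default path: hand the packet to the interface queue. */
int
enq_ifq(struct enq_ifnet *ifp, int pkt)
{
	(void)pkt;
	ifp->queued++;
	return 0;
}

/* Driver override, e.g. for a virtual interface. */
int
enq_direct(struct enq_ifnet *ifp, int pkt)
{
	(void)pkt;
	ifp->direct++;
	return 0;
}

void
enq_attach(struct enq_ifnet *ifp)
{
	if (ifp->if_enqueue == NULL)
		ifp->if_enqueue = enq_ifq;
}

int
enq_if_enqueue(struct enq_ifnet *ifp, int pkt)
{
	/* common work (pf, bridge) would happen here for every driver */
	return ifp->if_enqueue(ifp, pkt);
}
```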

Revision 1.93 / (download) - annotate - [select for diffs], Thu Dec 20 10:26:36 2018 UTC (5 years, 5 months ago) by claudio
Branch: MAIN
Changes since 1.92: +2 -2 lines
Diff to previous 1.92 (colored)

Make this not hz dependent by using timeout_add_sec().  Also rename the
define to IFNET_SLOWTIMO since it is no longer a hz divisor.
OK visa@ bluhm@ kn@

Revision 1.92 / (download) - annotate - [select for diffs], Wed Dec 19 05:31:28 2018 UTC (5 years, 5 months ago) by dlg
Branch: MAIN
Changes since 1.91: +1 -2 lines
Diff to previous 1.91 (colored)

get rid of a prototype for if_enqueue_try()

it isn't implemented, and is never called.

Revision 1.91 / (download) - annotate - [select for diffs], Tue Dec 11 22:08:57 2018 UTC (5 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.90: +23 -1 lines
Diff to previous 1.90 (colored)

add optional per-cpu counters for interface stats.

these exist so interfaces that want to do mpsafe work outside the
ifq machinery have a place to allocate and update stats in. the
generic ioctl handling for getting stats to userland knows how to
roll the new per cpu stats into the rest before export.

ok visa@
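
A rough userland model of per-CPU statistics, assuming a fixed CPU count and two counters. The pcpu_* names are hypothetical and this is not the kernel's counters API; the sketch only shows that each CPU updates its own slot and that a reader rolls the slots together before export.

```c
#include <stdint.h>

#define PCPU_NCPU 4	/* hypothetical CPU count */

enum { PCPU_IPACKETS, PCPU_IBYTES, PCPU_NCOUNTERS };

struct pcpu_counters {
	uint64_t c[PCPU_NCPU][PCPU_NCOUNTERS];
};

/* Each CPU only ever touches its own row, so no lock is shared. */
void
pcpu_add(struct pcpu_counters *pc, unsigned int cpu, int which, uint64_t v)
{
	pc->c[cpu % PCPU_NCPU][which] += v;
}

/* Roll the per-CPU values into one total for export. */
uint64_t
pcpu_read(const struct pcpu_counters *pc, int which)
{
	uint64_t sum = 0;
	unsigned int cpu;

	for (cpu = 0; cpu < PCPU_NCPU; cpu++)
		sum += pc->c[cpu][which];
	return sum;
}
```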

Revision 1.90 / (download) - annotate - [select for diffs], Mon Sep 10 16:18:34 2018 UTC (5 years, 9 months ago) by sashan
Branch: MAIN
CVS Tags: OPENBSD_6_4_BASE, OPENBSD_6_4
Changes since 1.89: +1 -2 lines
Diff to previous 1.89 (colored)

- if_cloners list is populated at boot time only and then becomes
  immutable, so we can let go of if_cloners_lock.

OK tb@, claudio@, bluhm@, kn@, henning@

Revision 1.89 / (download) - annotate - [select for diffs], Wed Jan 10 23:50:39 2018 UTC (6 years, 5 months ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_6_3_BASE, OPENBSD_6_3
Changes since 1.88: +2 -2 lines
Diff to previous 1.88 (colored)

get rid of struct carp_if by moving the srpl into struct ifnet if_carp.

currently carp uses a struct carp_if to hold an srp list head, which
is accessed by both if_carp in struct ifnet, and via the if input
handlers list.

this gets rid of some indirection by making if_carp itself the list
head, rather than a pointer to the list head via a struct carp_if.
it also makes accessing the list consistent by only using if_carp
to get to it.

ok mpi@

Revision 1.88 / (download) - annotate - [select for diffs], Mon Jan 8 23:05:21 2018 UTC (6 years, 5 months ago) by bluhm
Branch: MAIN
Changes since 1.87: +9 -2 lines
Diff to previous 1.87 (colored)

Convert IF_CLONE_INITIALIZER() into C99 initializer.
OK mpi@
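
For illustration, this is what a C99 designated-initializer macro of this kind can look like. The structure and macro below are hypothetical stand-ins (struct clone_model, CLONE_MODEL_INITIALIZER), not the real struct if_clone definition.

```c
#include <stddef.h>

struct clone_model {
	const char *ifc_name;
	size_t ifc_namelen;
	int (*ifc_create)(struct clone_model *, int);
	int (*ifc_destroy)(int);
};

/* Fields are named, so member order in the struct no longer matters. */
#define CLONE_MODEL_INITIALIZER(name, create, destroy)	\
{							\
	.ifc_name	= name,				\
	.ifc_namelen	= sizeof(name) - 1,		\
	.ifc_create	= create,			\
	.ifc_destroy	= destroy,			\
}

static int
demo_create(struct clone_model *ifc, int unit)
{
	(void)ifc;
	return unit;
}

static int
demo_destroy(int unit)
{
	return unit;
}

struct clone_model demo_cloner =
    CLONE_MODEL_INITIALIZER("demo", demo_create, demo_destroy);
```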

Revision 1.87 / (download) - annotate - [select for diffs], Thu Jan 4 10:48:02 2018 UTC (6 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.86: +5 -5 lines
Diff to previous 1.86 (colored)

Include timeout & tasks in 'struct ifnet' instead of always allocating
them as M_TEMP.

ok visa@

Revision 1.86 / (download) - annotate - [select for diffs], Tue Jan 2 12:52:17 2018 UTC (6 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.85: +42 -40 lines
Diff to previous 1.85 (colored)

Move the NET_LOCK() inside the switch and start documenting which field
is protected by which lock.

ok bluhm@, visa@

Revision 1.85 / (download) - annotate - [select for diffs], Fri Dec 15 01:37:30 2017 UTC (6 years, 5 months ago) by dlg
Branch: MAIN
Changes since 1.84: +7 -3 lines
Diff to previous 1.84 (colored)

add ifiqueues for mp safety and nics with multiple rx rings.

currently there is a single mbuf_queue per interface, which all
rings on a nic shove packets onto. while the list inside this queue
is protected by a mutex, the counters around it (ie, ipackets,
ibytes, idrops) are not. this means updates can be lost, and reading
the statistics is also inconsistent. having a single queue means
that busy rx rings can dominate and then starve the others.

ifiqueue structs are like ifqueue structs. they provide per ring
queues, and independent counters for each ring. when ifdata is read
for userland, these counters are aggregated. having a queue per
ring now allows for per ring backpressure to be applied. MCLGETI
will have its day again.

right now we assume every interface wants an input queue and
unconditionally provide one. individual interfaces can opt into
more.

i'm not completely happy about the shape of this atm, but shuffling
it around more makes the diff bigger.

ok visa@

Revision 1.84 / (download) - annotate - [select for diffs], Fri Nov 17 03:51:32 2017 UTC (6 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.83: +3 -1 lines
Diff to previous 1.83 (colored)

add if_rxr_livelocked so rxr users can request backpressure themselves.

right now the rx ring moderation code makes a decision globally
that a machine is livelocked, and uses that to apply backpressure
on all the rx rings. we're moving toward having the network stack
run on multiple cpus, and fed from multiple rx rings. if_rxr_livelocked
lets a driver apply backpressure explicitly if something tells it
that whatever is consuming previous packets cannot keep up.

while here expose the current ring watermark with if_rxr_cwm.

tweaks and ok visa@

Revision 1.83 / (download) - annotate - [select for diffs], Tue Oct 31 22:05:12 2017 UTC (6 years, 7 months ago) by sashan
Branch: MAIN
Changes since 1.82: +1 -2 lines
Diff to previous 1.82 (colored)

- add one more softnet taskq
  NOTE: code still runs with single softnet task.  change definition of
  SOFTNET_TASKS in net/if.c, if you want to have more than one softnet task

OK mpi@, OK phessler@

Revision 1.82 / (download) - annotate - [select for diffs], Thu Oct 12 09:14:16 2017 UTC (6 years, 8 months ago) by mpi
Branch: MAIN
Changes since 1.81: +1 -4 lines
Diff to previous 1.81 (colored)

Move sysctl_mq() where it can safely mess with mbuf queue internals.

ok visa@, bluhm@, deraadt@

Revision 1.81 / (download) - annotate - [select for diffs], Mon May 8 08:46:39 2017 UTC (7 years, 1 month ago) by rzalamena
Branch: MAIN
CVS Tags: OPENBSD_6_2_BASE, OPENBSD_6_2
Changes since 1.80: +2 -1 lines
Diff to previous 1.80 (colored)

Added initial IPv6 multicast routing support for multiple rdomains:

* don't share mifs (multicast interface) between rdomains
* allow multiple routing sockets connected at the same time if they are
  in different rdomains.

ok bluhm@

Revision 1.80 / (download) - annotate - [select for diffs], Tue Jan 24 03:57:35 2017 UTC (7 years, 4 months ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_6_1_BASE, OPENBSD_6_1
Changes since 1.79: +7 -2 lines
Diff to previous 1.79 (colored)

add support for multiple transmit ifqueues per network interface.

an ifq to transmit a packet is picked by the current traffic
conditioner (ie, priq or hfsc) by providing an index into an array
of ifqs. by default interfaces get a single ifq but can ask for
more using if_attach_queues().

the vast majority of our drivers still think there's a 1:1 mapping
between interfaces and transmit queues, so their if_start routines
take an ifnet pointer instead of a pointer to the ifqueue struct.
instead of changing all the drivers in the tree, drivers can opt
into using an if_qstart routine and setting the IFXF_MPSAFE flag.
the stack provides a compatibility wrapper from the new if_qstart
handler to the previous if_start handlers if IFXF_MPSAFE isn't set.

enabling hfsc on an interface configures it to transmit everything
through the first ifq. any other ifqs are left configured as priq,
but unused, when hfsc is enabled.

getting this in now so everyone can kick the tyres.

ok mpi@ visa@ (who provided some tweaks for cnmac).
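
The compatibility wrapper idea can be modelled in a few lines. This is a sketch with hypothetical mq_* names: a per-queue start handler that simply calls the legacy per-interface handler. The mq_pick() mapping (flow modulo queue count) is an assumption made for the example, not the selection logic of priq or hfsc.

```c
struct mq_ifnet;

struct mq_ifqueue {
	struct mq_ifnet *ifq_if;	/* owning interface */
	unsigned int ifq_idx;		/* position in the ifq array */
};

struct mq_ifnet {
	void (*if_start)(struct mq_ifnet *);	/* legacy handler */
	void (*if_qstart)(struct mq_ifqueue *);	/* per-queue handler */
	int started;
};

/* Wrapper: adapt the per-queue call to a legacy per-interface one. */
void
mq_compat_qstart(struct mq_ifqueue *ifq)
{
	ifq->ifq_if->if_start(ifq->ifq_if);
}

/* A legacy driver start routine that only knows about the ifnet. */
void
mq_legacy_start(struct mq_ifnet *ifp)
{
	ifp->started++;
}

/* Hypothetical queue selection: an index into the array of ifqs. */
unsigned int
mq_pick(unsigned int flow, unsigned int nifqs)
{
	return nifqs ? flow % nifqs : 0;
}
```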

Revision 1.79 / (download) - annotate - [select for diffs], Sat Jan 21 01:32:19 2017 UTC (7 years, 4 months ago) by patrick
Branch: MAIN
Changes since 1.78: +2 -2 lines
Diff to previous 1.78 (colored)

Make the if_flags member unsigned.  This was prompted by clang
complaining that assigning the MULTICAST flag, which sets the
uppermost bit, would invert the meaning of MULTICAST flag's
numeric value.

ok claudio@ deraadt@ tom@ visa@
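
The effect clang warned about can be reproduced in isolation. The sketch assumes a 16-bit field whose uppermost bit carries the flag; the widen_* helpers are hypothetical and exist only to show what happens when such a field is promoted to int.

```c
#include <stdint.h>

/* Unsigned field: the top bit stays a large positive value. */
int
widen_unsigned(uint16_t flags)
{
	return flags;
}

/* Signed field: a set top bit sign-extends to a negative value. */
int
widen_signed(int16_t flags)
{
	return flags;
}
```

With the top bit set, the unsigned variant yields 32768 while the signed one yields a negative number, so comparisons against the flag's numeric value behave differently.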

Revision 1.78 / (download) - annotate - [select for diffs], Fri Jan 6 14:01:19 2017 UTC (7 years, 5 months ago) by rzalamena
Branch: MAIN
Changes since 1.77: +2 -1 lines
Diff to previous 1.77 (colored)

Remove the global viftable vector that holds the virtual interfaces
configuration and instead use ifnet to store the configuration and
counters. With this we can safely use multicast routing daemons on
multiple domains without vif id collisions.

ok mpi@

Revision 1.77 / (download) - annotate - [select for diffs], Mon Nov 14 10:32:46 2016 UTC (7 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.76: +1 -2 lines
Diff to previous 1.76 (colored)

Automatically create a default lo(4) interface per rdomain.

In order to stop abusing lo0 for all rdomains, a new loopback interface
will be created every time a rdomain is created.  The unit number will
be the same as the rdomain, i.e. lo1 will be attached to rdomain 1.

If this loopback interface is already in use it won't be possible to create
the corresponding rdomain.

In order to know which lo(4) interface is attached to a rdomain, its index
is stored in the rtable/rdomain map.

This is long overdue since the introduction of rtable/rdomain.  It also
fixes a recent regression due to resetting the rdomain of an incoming
packet reported by semarie@, Andreas Bartelt and Nils Frohberg.

ok claudio@

Revision 1.76 / (download) - annotate - [select for diffs], Tue Nov 8 10:46:05 2016 UTC (7 years, 7 months ago) by mpi
Branch: MAIN
Changes since 1.75: +1 -2 lines
Diff to previous 1.75 (colored)

RIP ifa_ifwithnet()

ok vgross@

Revision 1.75 / (download) - annotate - [select for diffs], Sun Sep 4 15:46:39 2016 UTC (7 years, 9 months ago) by reyk
Branch: MAIN
Changes since 1.74: +2 -2 lines
Diff to previous 1.74 (colored)

When auto-creating an interface when opening a /dev/{tun,tap,switch}
device, inherit the rdomain from the calling process.  This adds an
rdomain argument to if_clone_create().

OK mpi@ henning@

Revision 1.74 / (download) - annotate - [select for diffs], Sat Sep 3 09:55:44 2016 UTC (7 years, 9 months ago) by mpi
Branch: MAIN
Changes since 1.73: +3 -1 lines
Diff to previous 1.73 (colored)

Use per-ifp tasks to process incoming packets.

Reduce the number of if_get/if_put from one per packet to one per ring
since we now know that all the packets are coming from the same interface.

Improves forwarding performance by 10Kpps in Hrvoje Popovski's test setup.

ok bluhm@, henning@, dlg@

Revision 1.73 / (download) - annotate - [select for diffs], Thu Sep 1 10:06:33 2016 UTC (7 years, 9 months ago) by goda
Branch: MAIN
Changes since 1.72: +2 -1 lines
Diff to previous 1.72 (colored)

Import switch(4), an in-kernel OpenFlow switch which can work alone.
switch(4) currently supports OpenFlow 1.3.5.
Currently, it's disabled by the kernel config.

With help from yasuoka@ reyk@ jsg@.

ok deraadt@ yasuoka@ reyk@ henning@

Revision 1.72 / (download) - annotate - [select for diffs], Fri Jun 10 20:33:29 2016 UTC (8 years ago) by vgross
Branch: MAIN
CVS Tags: OPENBSD_6_0_BASE, OPENBSD_6_0
Changes since 1.71: +2 -1 lines
Diff to previous 1.71 (colored)

Add the "llprio" field to struct ifnet, and the corresponding keyword
to ifconfig.

"llprio" allows one to set the priority of packets that do not go through
pf(4), as the case is for arp(4) or bpf(4).

ok sthen@ mikeb@

Revision 1.71 / (download) - annotate - [select for diffs], Fri Apr 15 05:05:21 2016 UTC (8 years, 1 month ago) by dlg
Branch: MAIN
Changes since 1.70: +1 -2 lines
Diff to previous 1.70 (colored)

remove ml_filter, mq_filter, niq_filter.

they're currently unused, so no functional change.

Revision 1.70 / (download) - annotate - [select for diffs], Wed Apr 13 11:41:15 2016 UTC (8 years, 1 month ago) by mpi
Branch: MAIN
Changes since 1.69: +1 -2 lines
Diff to previous 1.69 (colored)

We're always ready!  So send IFQ_SET_READY() to the bitbucket.

Revision 1.69 / (download) - annotate - [select for diffs], Fri Dec 18 14:02:15 2015 UTC (8 years, 5 months ago) by visa
Branch: MAIN
CVS Tags: OPENBSD_5_9_BASE, OPENBSD_5_9
Changes since 1.68: +1 -2 lines
Diff to previous 1.68 (colored)

Remove leftover prototype.

ok mpi@

Revision 1.68 / (download) - annotate - [select for diffs], Wed Dec 9 15:05:51 2015 UTC (8 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.67: +1 -11 lines
Diff to previous 1.67 (colored)

Keep all ether prototypes in one place.

Revision 1.67 / (download) - annotate - [select for diffs], Wed Dec 9 03:22:39 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.66: +2 -1 lines
Diff to previous 1.66 (colored)

rework the if_start mpsafe serialisation so it can serialise arbitrary work

work is represented by struct task.

the start routine is now wrapped by a task which is serialised by the
infrastructure. if_start_barrier has been renamed to ifq_barrier and
is now implemented as a task that gets serialised with the start
routine.

this also adds an ifq_restart() function. it serialises a call to
ifq_clr_oactive and calls the start routine again. it exists to
avoid a race that kettenis@ identified in between when a start
routine discovers there's no space left on a ring, and when it calls
ifq_set_oactive. if the txeof side of the driver empties the ring
and calls ifq_clr_oactive in between the above calls in start, the
queue will be marked oactive and the stack will never call the start
routine again.

by serialising the ifq_set_oactive call in the start routine and
ifq_clr_oactive calls we avoid that race.

tested on various nics
ok mpi@

Revision 1.66 / (download) - annotate - [select for diffs], Tue Dec 8 10:14:58 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.65: +1 -3 lines
Diff to previous 1.65 (colored)

if_stop is unused, so kill it.

ok mpi@

Revision 1.65 / (download) - annotate - [select for diffs], Tue Dec 8 10:06:12 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.64: +3 -72 lines
Diff to previous 1.64 (colored)

split the interface send queue (struct ifqueue) implementation out.

the intention is to make it more clear what belongs to a transmit
queue and what belongs to an interface.

suggested by and ok mpi@

Revision 1.64 / (download) - annotate - [select for diffs], Sat Dec 5 16:24:59 2015 UTC (8 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.63: +4 -4 lines
Diff to previous 1.63 (colored)

Keep kernel definitions under _KERNEL to unbreak ports that include
<net/if_var.h> because some other operating systems have defines in
there.

ok jasper@

Revision 1.63 / (download) - annotate - [select for diffs], Thu Dec 3 21:11:53 2015 UTC (8 years, 6 months ago) by sashan
Branch: MAIN
Changes since 1.62: +2 -1 lines
Diff to previous 1.62 (colored)

ip_send()/ip6_send() allow PF to send response packet in ipsoftnet task.
this avoids current recursion to pf_test() function. the change also
switches icmp_error()/icmp6_error() to use ip_send()/ip6_send() so
they are safe for PF.

The idea comes from Markus Friedl. bluhm, mikeb and mpi helped me
a lot to get it into shape.

OK bluhm@, mpi@

Revision 1.62 / (download) - annotate - [select for diffs], Thu Dec 3 16:27:32 2015 UTC (8 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.61: +2 -2 lines
Diff to previous 1.61 (colored)

Use SRPL_HEAD() and SRPL_ENTRY() to be consistent with, and allow
falling back to, a SLIST.

ok dlg@, jasper@

Revision 1.61 / (download) - annotate - [select for diffs], Thu Dec 3 12:22:51 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.60: +2 -2 lines
Diff to previous 1.60 (colored)

rework if_start to allow nics to provide an mpsafe start routine.

existing start routines will still be called under the kernel lock
and at IPL_NET.

mpsafe start routines will be serialised so only one instance of
each interface's function will be running in the kernel at any point
in time. this guarantees packets will be dequeued in order, and the
start routines don't have to lock against themselves because if_start
does it for them.

the code to do that is based on the scsi runqueue code.

this also provides an if_start_barrier() function that should wait
until any currently running instances of if_start have finished.

a driver can opt in to the mpsafe if_start call by doing the following:

1. setting ifp->if_xflags = IFXF_MPSAFE
2. only calling if_start() instead of its own start routine
3. clearing IFF_RUNNING before calling if_start_barrier() on its way down
4. only using IFQ_DEQUEUE (not ifq_deq_begin/commit/rollback)

to simplify the implementation the tx mitigation code has been removed.

tested by several
ok mpi@ jmatthew@
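
the serialisation idea can be sketched in userland with C11 atomics. this
is a model under assumptions, not the committed code: the model_* names are
hypothetical, <stdatomic.h> stands in for the kernel's atomic operations,
and the counter scheme is one common way to get "only one instance running"
semantics. the caller that moves the counter from 0 to 1 runs the start
routine; everyone else only records that more work is wanted.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userland model of a serialised start routine. */
struct model_ifq {
	atomic_uint	serializer;	/* outstanding requests to run start */
	unsigned int	runs;		/* how often the start routine ran */
	unsigned int	depth;		/* nesting level, must never exceed 1 */
	unsigned int	reenter;	/* requests to issue from inside start */
};

static void model_if_start(struct model_ifq *);

/* Stand-in for a driver start routine; it may ask for another run. */
static void
model_drv_start(struct model_ifq *ifq)
{
	ifq->depth++;
	assert(ifq->depth == 1);	/* never runs concurrently or nested */
	ifq->runs++;
	if (ifq->reenter > 0) {
		ifq->reenter--;
		model_if_start(ifq);	/* returns at once, work is deferred */
	}
	ifq->depth--;
}

static void
model_if_start(struct model_ifq *ifq)
{
	/* only the caller that takes the counter from 0 to 1 does the work */
	if (atomic_fetch_add(&ifq->serializer, 1) != 0)
		return;

	/* keep running until every recorded request has been consumed */
	do {
		model_drv_start(ifq);
	} while (atomic_fetch_sub(&ifq->serializer, 1) > 1);
}
```

a request made while the start routine is running is not lost: it bumps
the counter, and the running instance loops once more on its behalf.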

Revision 1.60 / (download) - annotate - [select for diffs], Wed Dec 2 08:03:00 2015 UTC (8 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.59: +1 -6 lines
Diff to previous 1.59 (colored)

Remove forward declarations that are no longer needed, times and APIs are
changing.

Revision 1.59 / (download) - annotate - [select for diffs], Fri Nov 27 15:00:12 2015 UTC (8 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.58: +1 -7 lines
Diff to previous 1.58 (colored)

Keep lo(4) definitions inside if_loop.c

Revision 1.58 / (download) - annotate - [select for diffs], Wed Nov 25 03:10:00 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.57: +20 -1 lines
Diff to previous 1.57 (colored)

replace IFF_OACTIVE manipulation with mpsafe operations.

there are two things shared between the network stack and drivers
in the send path: the send queue and the IFF_OACTIVE flag. the send
queue is now protected by a mutex. this diff makes the oactive
functionality mpsafe too.

IFF_OACTIVE is part of if_flags. there are two problems with that.
firstly, if_flags is a short and we don't have any MI atomic operations
to manipulate a short. secondly, while we could make the IFF_OACTIVE
operations mpsafe, all changes to other flags would have to be made
safe at the same time, otherwise a read-modify-write cycle on their
updates could clobber the oactive change.

instead, this moves the oactive mark into struct ifqueue and provides
an API for changing it. there's ifq_set_oactive, ifq_clr_oactive,
and ifq_is_oactive. these are modelled on ifsq_set_oactive,
ifsq_clr_oactive, and ifsq_is_oactive in dragonflybsd.

this diff includes changes to all the drivers manipulating IFF_OACTIVE
to now use the ifq_{set,clr,is}_oactive API too.

ok kettenis@ mpi@ jmatthew@ deraadt@
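
a rough userland sketch of why the mark gets its own word. the model_*
names are hypothetical and C11 atomics stand in for whatever the kernel
uses; the point is only that a flag living alone in its own word can be
set and cleared without a read-modify-write on unrelated flags.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical model: the oactive mark lives in the queue, not in if_flags. */
struct model_ifqueue {
	atomic_uint	ifq_oactive;	/* no other flag shares this word */
};

/* Mark the transmit ring as full; the stack should stop calling start. */
static void
model_ifq_set_oactive(struct model_ifqueue *ifq)
{
	atomic_store(&ifq->ifq_oactive, 1);
}

/* The ring has space again; start may be called. */
static void
model_ifq_clr_oactive(struct model_ifqueue *ifq)
{
	atomic_store(&ifq->ifq_oactive, 0);
}

static unsigned int
model_ifq_is_oactive(struct model_ifqueue *ifq)
{
	return (atomic_load(&ifq->ifq_oactive));
}
```

because each update is a plain store to a dedicated word, a concurrent
change to some other interface flag cannot clobber it.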

Revision 1.57 / (download) - annotate - [select for diffs], Mon Nov 23 15:53:35 2015 UTC (8 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.56: +1 -4 lines
Diff to previous 1.56 (colored)

There's no longer a need to include <net/hfsc.h> in <net/if_var.h>

Revision 1.56 / (download) - annotate - [select for diffs], Sat Nov 21 01:08:50 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.55: +1 -3 lines
Diff to previous 1.55 (colored)

simplify ifq_deq_rollback by only having it unlock.

hfsc needed a rollback ifqop to requeue the mbuf because it used
ml_dequeue in the begin op. now it uses MBUF_LIST_FIRST to get a
ref to the first mbuf in deq_begin.

now the disciplines don't need a rollback op, so ifq_deq_rollback
can be simplified to just releasing the mutex.

based on a discussion with kenjiro cho

Revision 1.55 / (download) - annotate - [select for diffs], Fri Nov 20 11:15:07 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.54: +6 -6 lines
Diff to previous 1.54 (colored)

i made a mistake. rename ifq_enq and ifq_deq to ifq_enqueue and ifq_dequeue

fixing it now before i regret it more.

Revision 1.54 / (download) - annotate - [select for diffs], Fri Nov 20 03:35:23 2015 UTC (8 years, 6 months ago) by dlg
Branch: MAIN
Changes since 1.53: +60 -109 lines
Diff to previous 1.53 (colored)

shuffle struct ifqueue so in flight mbufs are protected by a mutex.

the code is refactored so the IFQ macros call newly implemented ifq
functions. the ifq code is split so each discipline (priq and hfsc
in our case) is an opaque set of operations that the common ifq
code can call. the common code does the locking, accounting (ifq_len
manipulation), and freeing of the mbuf if the discipline's enqueue
function rejects it. they're kind of like bufqs in the block layer
with their fifo and nscan disciplines.

the new api also supports atomic switching of disciplines at runtime.
the hfsc setup in pf_ioctl.c has been tweaked to build a complete
hfsc_if structure which it attaches to the send queue in a single
operation, rather than attaching to the interface up front and
building up a list of queues.

the send queue is now mutexed, which raises the expectation that
packets can be enqueued or purged on one cpu while another cpu is
dequeueing them in a driver for transmission. a lot of drivers use
IFQ_POLL to peek at an mbuf and attempt to fit it on the ring before
committing to it with a later IFQ_DEQUEUE operation. if the mbuf
gets freed in between the POLL and DEQUEUE operations, fireworks
will ensue.

to avoid this, the ifq api introduces ifq_deq_begin, ifq_deq_rollback,
and ifq_deq_commit. ifq_deq_begin allows a driver to take the ifq
mutex and get a reference to the mbuf they wish to try and tx. if
there's space, they can ifq_deq_commit it to remove the mbuf and
release the mutex. if there's no space, ifq_deq_rollback simply
releases the mutex. this api was developed to make updating the
drivers using IFQ_POLL easy, instead of having to do significant
semantic changes to avoid POLL that we cannot test on all the
hardware.

the common code has been tested pretty hard, and all the driver
modifications are straightforward except for de(4). if that breaks
it can be dealt with later.

ok mpi@ jmatthew@
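
the begin/commit/rollback pattern can be modelled in userland as follows.
this is a sketch under assumptions: all model_* names are hypothetical, an
int flag stands in for the ifq mutex, and "ring space" is reduced to a
counter. it shows the shape of a start routine that peeks at a packet while
holding the lock and only removes it once it is known to fit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland model; not the kernel's mbuf or ifqueue. */
struct model_mbuf {
	struct model_mbuf	*next;
	int			 len;	/* ring slots this packet needs */
};

struct model_ifq {
	struct model_mbuf	*head;
	struct model_mbuf	*tail;
	unsigned int		 len;
	int			 locked;	/* stands in for the ifq mutex */
};

static void
model_enqueue(struct model_ifq *ifq, struct model_mbuf *m)
{
	assert(!ifq->locked);
	m->next = NULL;
	if (ifq->tail != NULL)
		ifq->tail->next = m;
	else
		ifq->head = m;
	ifq->tail = m;
	ifq->len++;
}

/* Take the lock and return a reference to the first packet, if any. */
static struct model_mbuf *
model_deq_begin(struct model_ifq *ifq)
{
	assert(!ifq->locked);
	ifq->locked = 1;
	if (ifq->head == NULL) {
		ifq->locked = 0;	/* nothing to send, drop the lock */
		return (NULL);
	}
	return (ifq->head);
}

/* The packet fits: remove it from the queue and release the lock. */
static void
model_deq_commit(struct model_ifq *ifq, struct model_mbuf *m)
{
	assert(ifq->locked && ifq->head == m);
	ifq->head = m->next;
	if (ifq->head == NULL)
		ifq->tail = NULL;
	m->next = NULL;
	ifq->len--;
	ifq->locked = 0;
}

/* No space: leave the packet where it is and release the lock. */
static void
model_deq_rollback(struct model_ifq *ifq, struct model_mbuf *m)
{
	assert(ifq->locked && ifq->head == m);
	ifq->locked = 0;
}

/* Shape of a driver start routine converted from POLL + DEQUEUE. */
static unsigned int
model_drv_start(struct model_ifq *ifq, int *ring_space)
{
	struct model_mbuf *m;
	unsigned int sent = 0;

	while ((m = model_deq_begin(ifq)) != NULL) {
		if (m->len > *ring_space) {
			model_deq_rollback(ifq, m);
			break;
		}
		model_deq_commit(ifq, m);
		*ring_space -= m->len;
		sent++;
	}
	return (sent);
}
```

since the packet cannot be purged between begin and commit, the window in
which the old POLL then DEQUEUE sequence could see a freed mbuf is closed.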

Revision 1.53 / (download) - annotate - [select for diffs], Wed Nov 18 13:58:02 2015 UTC (8 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.52: +3 -1 lines
Diff to previous 1.52 (colored)

Factorize the bits to check if a L2 route is connected, whether it is
attached to a carp(4) or bridge(4) member, to not dereference rt_ifp
directly.

ok visa@

Revision 1.52 / (download) - annotate - [select for diffs], Wed Nov 11 10:23:23 2015 UTC (8 years, 7 months ago) by mpi
Branch: MAIN
Changes since 1.51: +2 -2 lines
Diff to previous 1.51 (colored)

Store the index of the lo0 interface instead of a pointer to its
descriptor.

Allows getting rid of two if_ref() calls in the output paths.

ok dlg@

Revision 1.51 / (download) - annotate - [select for diffs], Sun Oct 25 11:58:11 2015 UTC (8 years, 7 months ago) by mpi
Branch: MAIN
Changes since 1.50: +7 -5 lines
Diff to previous 1.50 (colored)

Introduce if_rtrequest(), the successor of ifa_rtrequest().

L2 resolution depends on the protocol (encoded in the route entry) and
an ``ifp''.  Not having to care about an ``ifa'' makes our life easier
in our MP effort.  Fewer dependencies between data structures implies
fewer headaches.

Discussed with bluhm@, ok claudio@

Revision 1.50 / (download) - annotate - [select for diffs], Sat Oct 24 10:52:05 2015 UTC (8 years, 7 months ago) by reyk
Branch: MAIN
Changes since 1.49: +4 -1 lines
Diff to previous 1.49 (colored)

Add pair(4), a vether-based virtual Ethernet driver to interconnect
rdomains and bridges on the local system.  This can be used to route
through local rdomains, to create L2 devices (like trunks) between
them, and many other things.

Discussed with many, with input from mpi@
OK sthen@ phessler@ yasuoka@ mikeb@

Revision 1.49 / (download) - annotate - [select for diffs], Thu Oct 22 17:48:34 2015 UTC (8 years, 7 months ago) by mpi
Branch: MAIN
Changes since 1.48: +1 -2 lines
Diff to previous 1.48 (colored)

Kill link_rtrequest(), introduced in 1990 to "fix" the result
of rt_getifa() when adding link level route from outside the
kernel.

ok claudio@

Revision 1.48 / (download) - annotate - [select for diffs], Mon Oct 12 13:17:58 2015 UTC (8 years, 8 months ago) by dlg
Branch: MAIN
Changes since 1.47: +2 -3 lines
Diff to previous 1.47 (colored)

the pattr argument to IFQ_ENQUEUE is unused, so let's get rid of it.

also the comment above IFQ_ENQUEUE that says the pattr argument is unused.

ok mpi@

Revision 1.47 / (download) - annotate - [select for diffs], Mon Oct 5 15:19:29 2015 UTC (8 years, 8 months ago) by uebayasi
Branch: MAIN
Changes since 1.46: +2 -1 lines
Diff to previous 1.46 (colored)

Add ifi_oqdrops and its alias to struct if_data.

Necessary bumps in Ports will be handled by sthen@.

OK mpi@ dlg@

Revision 1.46 / (download) - annotate - [select for diffs], Wed Sep 30 11:33:51 2015 UTC (8 years, 8 months ago) by dlg
Branch: MAIN
Changes since 1.45: +3 -2 lines
Diff to previous 1.45 (colored)

sleep until all references to an interface have been released during detach.

this is done by moving to the refcnt api and using refcnt_finalize.

tested by Hrvoje Popovski
ok mpi@

Revision 1.45 / (download) - annotate - [select for diffs], Mon Sep 28 08:24:53 2015 UTC (8 years, 8 months ago) by mpi
Branch: MAIN
Changes since 1.44: +1 -2 lines
Diff to previous 1.44 (colored)

Remove "if_tp" from the "struct ifnet".

Instead of violating a layer of abstraction by keeping per pseudo-driver
information in "struct ifnet", the port trunk is now passed as a cookie
to the interface input handler (ifih).

The time of per pseudo-driver hack in the network stack is over!

ok mikeb@

Revision 1.44 / (download) - annotate - [select for diffs], Sun Sep 27 05:23:50 2015 UTC (8 years, 8 months ago) by dlg
Branch: MAIN
Changes since 1.43: +4 -3 lines
Diff to previous 1.43 (colored)

pull the m_freem calls out of hfsc_enqueue by having IFQ_ENQUEUE free
the mbuf in both the hfsc and priq error paths.

ok mikeb@ mpi@ claudio@ henning@

Revision 1.43 / (download) - annotate - [select for diffs], Sun Sep 13 17:53:44 2015 UTC (8 years, 8 months ago) by mpi
Branch: MAIN
Changes since 1.42: +1 -3 lines
Diff to previous 1.42 (colored)

There's no point in abstracting ifp->if_output() as long as pf_test()
needs to see lo0 in the output path.

ok claudio@

Revision 1.42 / (download) - annotate - [select for diffs], Sun Sep 13 09:58:03 2015 UTC (8 years, 8 months ago) by kettenis
Branch: MAIN
Changes since 1.41: +2 -1 lines
Diff to previous 1.41 (colored)

Run the interface watchdog timer routine as a task such that we have process
context.

ok mpi@, claudio@

Revision 1.41 / (download) - annotate - [select for diffs], Sat Sep 12 20:26:06 2015 UTC (8 years, 9 months ago) by mpi
Branch: MAIN
Changes since 1.40: +3 -1 lines
Diff to previous 1.40 (colored)

Stop overwriting the rt_ifp pointer of RTF_LOCAL routes with lo0ifp.

Use instead the RTF_LOCAL flag to loop local traffic back to the
corresponding protocol queue.

With this change rt_ifp is now always the same as rt_ifa->ifa_ifp.

ok claudio@

Revision 1.40 / (download) - annotate - [select for diffs], Sat Sep 12 13:34:12 2015 UTC (8 years, 9 months ago) by mpi
Branch: MAIN
Changes since 1.39: +3 -1 lines
Diff to previous 1.39 (colored)

Introduce if_input_local(), a function to feed local traffic back to
the protocol queues.

It basically does what looutput() was doing but having a generic
function will allow us to get rid of the loopback hack overwriting
the rt_ifp field of RTF_LOCAL routes.

ok mikeb@, dlg@, claudio@

Revision 1.39 / (download) - annotate - [select for diffs], Thu Sep 10 18:11:05 2015 UTC (8 years, 9 months ago) by dlg
Branch: MAIN
Changes since 1.38: +3 -3 lines
Diff to previous 1.38 (colored)

rework how we store and manage the interface index to ifp map in preparation of using SRPs as a backend for if_get.

this also tries to document how things work and what if index 0 is for.

ok mpi@ claudio@

Revision 1.38 / (download) - annotate - [select for diffs], Thu Sep 10 16:41:30 2015 UTC (8 years, 9 months ago) by mikeb
Branch: MAIN
Changes since 1.37: +6 -4 lines
Diff to previous 1.37 (colored)

pass a cookie argument to interface input handlers that can be used
to pass additional context or transient data with the similar life
time.

ok mpi, suggestions, hand holding and ok from dlg

Revision 1.37 / (download) - annotate - [select for diffs], Thu Sep 10 14:56:41 2015 UTC (8 years, 9 months ago) by dlg
Branch: MAIN
Changes since 1.36: +2 -1 lines
Diff to previous 1.36 (colored)

include srp.h so userland can understand struct srpl.

noticed by deraadt@

Revision 1.36 / (download) - annotate - [select for diffs], Thu Sep 10 13:32:19 2015 UTC (8 years, 9 months ago) by dlg
Branch: MAIN
Changes since 1.35: +5 -11 lines
Diff to previous 1.35 (colored)

move the if input handler list to an SRP list.

instead of having every driver that manipulates the ifih list
understand SRPLs, this moves that processing into if_ih_insert and
if_ih_remove functions.

we rely on the kernel lock to serialise the modifications to the
list.

tested by mpi@
ok mpi@ claudio@ mikeb@

Revision 1.35 / (download) - annotate - [select for diffs], Wed Sep 9 16:01:10 2015 UTC (8 years, 9 months ago) by dlg
Branch: MAIN
Changes since 1.34: +2 -1 lines
Diff to previous 1.34 (colored)

introduce reference counts for interfaces (ie, struct ifnet *ifp).

if_get can get a reference to an ifp, but it never releases that
reference. this provides an if_put function that can be used to
decrement the refcount.

we cannot come up with a scheme for letting the network stack run on
one (or many) cpus while ioctls are pulling interfaces down on another
cpu without refcounts for the interfaces.

if_put is going in now so we can go through the stack and put the
necessary calls to it in, and then we'll backfill this implementation
to actually check the refcounts when the interface detaches.

ok mpi@ mikeb@ claudio@
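
a minimal sketch of the get/put pairing, assuming a fixed-size index map
and a plain (non-atomic, single-threaded) counter. the model_* names are
hypothetical, and the real implementation has to be safe across cpus,
which this model does not attempt.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NIFS	8	/* assumed size of the index map */

/* Hypothetical model of an interface descriptor with a reference count. */
struct model_ifnet {
	unsigned int	if_index;
	unsigned int	if_refcnt;
};

static struct model_ifnet *model_if_map[MODEL_NIFS];

/* Look an interface up by index and take a reference on it. */
static struct model_ifnet *
model_if_get(unsigned int idx)
{
	struct model_ifnet *ifp;

	if (idx >= MODEL_NIFS)
		return (NULL);
	ifp = model_if_map[idx];
	if (ifp != NULL)
		ifp->if_refcnt++;
	return (ifp);
}

/* Release a reference taken by model_if_get(); NULL is tolerated. */
static void
model_if_put(struct model_ifnet *ifp)
{
	if (ifp == NULL)
		return;
	assert(ifp->if_refcnt > 0);
	ifp->if_refcnt--;
}
```

every successful get must be matched by exactly one put, so that a later
detach can tell when nobody is using the interface any more.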

Revision 1.34 / (download) - annotate - [select for diffs], Thu Jul 2 09:40:02 2015 UTC (8 years, 11 months ago) by mpi
Branch: MAIN
CVS Tags: OPENBSD_5_8_BASE, OPENBSD_5_8
Changes since 1.33: +3 -3 lines
Diff to previous 1.33 (colored)

By design if_input_process() needs to hold a reference on the receiving
ifp in order to access its ifih handlers.

So get rid of if_get() in the various ifih handlers; we know the ifp is
live at this point.

ok dlg@

Revision 1.33 / (download) - annotate - [select for diffs], Tue Jun 30 13:54:42 2015 UTC (8 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.32: +2 -2 lines
Diff to previous 1.32 (colored)

Rename if_output() into if_enqueue() to avoid confusion with comments
talking about (*ifp->if_output)().

ok claudio@, dlg@

Revision 1.32 / (download) - annotate - [select for diffs], Tue Jun 2 13:23:55 2015 UTC (9 years ago) by mpi
Branch: MAIN
Changes since 1.31: +1 -2 lines
Diff to previous 1.31 (colored)

RIP ether_input_mbuf().

Revision 1.31 / (download) - annotate - [select for diffs], Thu May 28 11:57:33 2015 UTC (9 years ago) by mpi
Branch: MAIN
Changes since 1.30: +1 -11 lines
Diff to previous 1.30 (colored)

Kill unused IF_INPUT_ENQUEUE().

ok dlg@

Revision 1.30 / (download) - annotate - [select for diffs], Tue May 26 11:39:07 2015 UTC (9 years ago) by mpi
Branch: MAIN
Changes since 1.29: +3 -3 lines
Diff to previous 1.29 (colored)

Now that the Ethernet header is always passed as part of the mbuf, kill
the second (unused) argument of the input packet handlers.

ok dlg@

Revision 1.29 / (download) - annotate - [select for diffs], Tue May 19 11:09:24 2015 UTC (9 years ago) by mpi
Branch: MAIN
Changes since 1.28: +2 -1 lines
Diff to previous 1.28 (colored)

Take vlan(4) out of ether_input().

To keep the list of input handlers short, multiple vlans share the
same ifih.

if_input_process() now looks if the interface of a mbuf changed to
make sure the corresponding handlers are executed.  This is a hack
and will be improved later.

ok dlg@

Revision 1.28 / (download) - annotate - [select for diffs], Mon May 18 13:32:28 2015 UTC (9 years ago) by reyk
Branch: MAIN
Changes since 1.27: +2 -2 lines
Diff to previous 1.27 (colored)

Move the rdomain from struct ifnet into struct if_data.  This way it
will be exported to userland with the existing sysctl, getifaddrs()
and routing socket (if_msghdr.ifm_data) interfaces that expose
if_data.  All programs and daemons - Apps - that call the
SIOCGIFRDOMAIN ioctl in a getifaddrs() loop or after receiving an
interface message on the routing socket can now remove the pointless
additional ioctl.  In base, that could be: dhclient, isakmpd, dhcpd,
dhcrelay, ntpd, ospfd, ripd, ifconfig.

No ABI breakage because it uses a previously unused pad field in if_data.

OK mpi@ deraadt@

Revision 1.27 / (download) - annotate - [select for diffs], Fri May 15 11:53:06 2015 UTC (9 years ago) by claudio
Branch: MAIN
Changes since 1.26: +2 -1 lines
Diff to previous 1.26 (colored)

Give carp(4) interfaces their own low priority. The change should not
change behaviour for now but will allow to share the same address with
the parent interface without major hacks.
OK mpi@

Revision 1.26 / (download) - annotate - [select for diffs], Fri May 15 10:15:13 2015 UTC (9 years ago) by mpi
Branch: MAIN
Changes since 1.25: +2 -1 lines
Diff to previous 1.25 (colored)

Introduce if_output(), a function to do the last steps before enqueuing
a packet on the sending queue of an interface.

Tested by many, thanks a lot!

ok dlg@, claudio@

Revision 1.25 / (download) - annotate - [select for diffs], Thu Apr 23 09:45:24 2015 UTC (9 years, 1 month ago) by dlg
Branch: MAIN
Changes since 1.24: +5 -5 lines
Diff to previous 1.24 (colored)

replace the use of struct ifqueue in pipex with mbuf_queues.

this has a slight semantic change. previously pipex would only
process up to 128 packets on the input and output queues at a time
and would reschedule the softint if there were any left. now it
mq_delists the current set of pending packets and only processes
them. if anything is added to the queues later they'll cause the
softint to run again.

this in turn lets us deprecate sysctl_ifq since nothing uses it
anymore. because niqueues are mostly wrappers around mbuf_queues,
we can provide sysctl_mq and just #define sysctl_niq to it.

pipex bits are ok yasuoka@
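
the semantic change can be illustrated with a small userland model. the
model_* names are hypothetical and there is no locking here; the sketch
only shows that taking the whole pending list in one step fixes the batch,
so packets added afterwards wait for the next run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a packet list; not the kernel's mbuf_queue. */
struct model_mbuf {
	struct model_mbuf	*next;
};

struct model_list {
	struct model_mbuf	*head;
	struct model_mbuf	*tail;
	unsigned int		 len;
};

static void
model_enqueue(struct model_list *q, struct model_mbuf *m)
{
	m->next = NULL;
	if (q->tail != NULL)
		q->tail->next = m;
	else
		q->head = m;
	q->tail = m;
	q->len++;
}

/* Detach everything currently pending and leave the queue empty. */
static struct model_list
model_delist(struct model_list *q)
{
	struct model_list batch = *q;

	q->head = NULL;
	q->tail = NULL;
	q->len = 0;
	return (batch);
}
```

a handler would walk the returned batch to completion; anything enqueued
in the meantime lands on the now-empty queue and triggers another run.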

Revision 1.24 / (download) - annotate - [select for diffs], Tue Apr 7 10:46:20 2015 UTC (9 years, 2 months ago) by mpi
Branch: MAIN
Changes since 1.23: +3 -3 lines
Diff to previous 1.23 (colored)

Do not pass an `ifp' argument to interface input handlers since it
might be overwritten by pseudo-drivers.

ok dlg@, henning@

Revision 1.23 / (download) - annotate - [select for diffs], Wed Apr 1 04:00:55 2015 UTC (9 years, 2 months ago) by dlg
Branch: MAIN
Changes since 1.22: +2 -3 lines
Diff to previous 1.22 (colored)

create a taskq for network tasks to run in. cut ether_input_mbuf
and if_input up so the work ether_input does gets run on the task
instead of directly from hardware receive handlers.

this is a step toward letting hardware drivers run without biglock
by shoving the work the stack does which needs that lock sideways.

general agreement at s2k15
ok mpi@ kettenis@ claudio@

Revision 1.22 / (download) - annotate - [select for diffs], Wed Mar 25 11:49:02 2015 UTC (9 years, 2 months ago) by dlg
Branch: MAIN
Changes since 1.21: +26 -3 lines
Diff to previous 1.21 (colored)

introduce code for network input queues. these are to replace the
use of struct ifqueue for things handled by softnet. they instead
use an mbuf_queue (yay mpsafe) and wrap up the schednetisr and
if_congestion handling.

ok mpi@

Revision 1.21 / (download) - annotate - [select for diffs], Wed Mar 18 12:23:15 2015 UTC (9 years, 2 months ago) by dlg
Branch: MAIN
Changes since 1.20: +2 -5 lines
Diff to previous 1.20 (colored)

remove the congestion handling from struct ifqueue.

it's only used for the ip and ip6 network stack input queues, so it
seems unfair that every instance of ifqueue has to carry a pointer
around for this specific use case.

this moves the congestion marker to a kernel global. if we detect
that we're congested, we assume the whole system is busy and punish
all input queues.

marking a system as congested is done by setting the global to the
current value of ticks. as the system moves away from that value,
it moves away from being congested until the comparison fails.

written at s2k15
ok henning@ beck@ bluhm@ claudio@
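
the tick comparison can be sketched like this. it is a model only: the
model_* names are hypothetical, and the length of the window
(MODEL_WINDOW) is an assumption, since the log does not say how far ticks
must advance before the comparison fails.

```c
#include <assert.h>

#define MODEL_WINDOW	1	/* assumed: ticks we stay congested for */

static int model_ticks;		/* stands in for the kernel tick counter */
static int model_congestion = -(MODEL_WINDOW + 1);	/* start uncongested */

/* Mark the whole system as congested "now". */
static void
model_set_congested(void)
{
	model_congestion = model_ticks;
}

/* Congested while ticks has not moved further than the window. */
static int
model_is_congested(void)
{
	int diff = model_ticks - model_congestion;

	return (diff >= 0 && diff <= MODEL_WINDOW);
}
```

nothing ever has to clear the marker: it simply ages out as ticks moves
on, which is why a single global can serve every input queue.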

Revision 1.20 / (download) - annotate - [select for diffs], Mon Feb 9 03:09:57 2015 UTC (9 years, 4 months ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_5_7_BASE, OPENBSD_5_7
Changes since 1.19: +3 -2 lines
Diff to previous 1.19 (colored)

tweak the new if_input function so it takes an mbuf_list instead
of a single mbuf. this forces us to batch work between the hardware
rx handlers and the stack.

this includes a conversion of bge from ether_input to if_input.

ok claudio@ pelikan@ mpi@

Revision 1.19 / (download) - annotate - [select for diffs], Sun Feb 8 06:00:52 2015 UTC (9 years, 4 months ago) by mpi
Branch: MAIN
Changes since 1.18: +15 -2 lines
Diff to previous 1.18 (colored)

Introduce if_input(), a function to pass packets dequeued from a
receiving ring to the stack.

if_input() is at the moment a drop-in replacement for ether_input_mbuf()
but will let us stack pseudo-driver in a nice way in order to no longer
call ether_input() recursively.

ok pelikan@, reyk@, blambert@, henning@

Revision 1.18 / (download) - annotate - [select for diffs], Fri Feb 6 06:42:36 2015 UTC (9 years, 4 months ago) by henning
Branch: MAIN
Changes since 1.17: +59 -61 lines
Diff to previous 1.17 (colored)

since I just touched this file and thus cause an almost full recompile of
everything in the kernel for everybody anyway, can as well use the
opportunity to move the block with the IF_* macros down next to the IFQ_*
versions; has always been slightly confusing - was like that due to the
long gone ALTQ versions of these macros. claudio agrees.

Revision 1.17 / (download) - annotate - [select for diffs], Fri Feb 6 06:38:08 2015 UTC (9 years, 4 months ago) by henning
Branch: MAIN
Changes since 1.16: +0 -3 lines
Diff to previous 1.16 (colored)

g/c unused IFQ_INC_LEN, IFQ_DEC_LEN and IFQ_INC_DROPS, ok claudio

Revision 1.16 / (download) - annotate - [select for diffs], Thu Dec 18 15:29:30 2014 UTC (9 years, 5 months ago) by krw
Branch: MAIN
Changes since 1.15: +3 -1 lines
Diff to previous 1.15 (colored)

Change the link state change routing message generation to a taskq.
One less workq to worry about.

Tweaks from many. ok mpi@ mikeb@

Revision 1.15 / (download) - annotate - [select for diffs], Mon Dec 8 10:46:14 2014 UTC (9 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.14: +1 -2 lines
Diff to previous 1.14 (colored)

There's no good reason to keep in "struct ifnet" a pointer that's only
used by enc(4) devices to attach their routes.

ok sthen@, mikeb@

Revision 1.14 / (download) - annotate - [select for diffs], Fri Dec 5 15:50:04 2014 UTC (9 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.13: +1 -26 lines
Diff to previous 1.13 (colored)

Explicitly include <net/if_var.h> instead of pulling it in <net/if.h>.

ok mikeb@, krw@, bluhm@, tedu@

Revision 1.13 / (download) - annotate - [select for diffs], Mon Dec 1 15:06:54 2014 UTC (9 years, 6 months ago) by mikeb
Branch: MAIN
Changes since 1.12: +3 -1 lines
Diff to previous 1.12 (colored)

Make every interface with a watchdog register its own slow timeout

This removes the system wide if_slowtimo timeout and lets every
interface with a valid if_watchdog method register its own in
order to get rid of the ifnet loop in the softclock context and
avoid further complications with concurrent access to the ifnet
list.

ok deraadt, input and ok mpi, looked at by claudio

Revision 1.12 / (download) - annotate - [select for diffs], Tue Jul 8 04:02:14 2014 UTC (9 years, 11 months ago) by dlg
Branch: MAIN
CVS Tags: OPENBSD_5_6_BASE, OPENBSD_5_6
Changes since 1.11: +12 -1 lines
Diff to previous 1.11 (colored)

introduce the if_rxr api. it is intended to pull the rx ring accounting
out of the mbuf layer, and break the assumption that an interface will
only have a single ring per mbuf cluster size.

mpi@ is ok with moving this forward

Revision 1.11 / (download) - annotate - [select for diffs], Mon May 26 08:33:48 2014 UTC (10 years ago) by mpi
Branch: MAIN
Changes since 1.10: +2 -2 lines
Diff to previous 1.10 (colored)

Document that this reference counter is not generic.  It indicates how
many route entries are pointing to this address.

Revision 1.10 / (download) - annotate - [select for diffs], Mon May 5 11:44:33 2014 UTC (10 years, 1 month ago) by mpi
Branch: MAIN
Changes since 1.9: +2 -1 lines
Diff to previous 1.9 (colored)

Use a custom ifa_rtrequest function for point-to-point interfaces
instead of relying on hacks in nd6_rtrequest() to add a route to
loopback for each address configured on such interfaces.

While here document that abusing lo0 for local traffic is not safe
for interfaces in a non-default rdomain.

Tested by claudio@, jca@ and sthen@, ok sthen@

Revision 1.9 / (download) - annotate - [select for diffs], Wed Apr 23 09:30:57 2014 UTC (10 years, 1 month ago) by mpi
Branch: MAIN
Changes since 1.8: +1 -3 lines
Diff to previous 1.8 (colored)

You don't want to use ifa_ifwithroute(), it exists for the routing
craziness only.

Revision 1.8 / (download) - annotate - [select for diffs], Sat Apr 19 15:57:25 2014 UTC (10 years, 1 month ago) by henning
Branch: MAIN
Changes since 1.7: +2 -4 lines
Diff to previous 1.7 (colored)

ifnet's if_snd becomes a regular ifqueue instead of ifaltq, the need to
keep ifqueue and ifaltq in sync is gone and thus the comment obsolete,
and finally there is no more need to include if_altq.h either

Revision 1.7 / (download) - annotate - [select for diffs], Sat Apr 19 11:26:10 2014 UTC (10 years, 1 month ago) by henning
Branch: MAIN
Changes since 1.6: +0 -69 lines
Diff to previous 1.6 (colored)

the altq versions of the IFQ_* macros can finally go. chances of this
file becoming readable increase.

Revision 1.6 / (download) - annotate - [select for diffs], Thu Mar 27 10:39:23 2014 UTC (10 years, 2 months ago) by mpi
Branch: MAIN
Changes since 1.5: +4 -3 lines
Diff to previous 1.5 (colored)

Stop mixing interface address flags with routing entry ones.

Instead of always copying ifa_flags to the routing entry flags when
creating a route by calling rtinit(), explicitly pass the RTF_CLONING
flag when required.  This means ifa_flags are now *only* used to check
if an address has an associated route that was created by the kernel
auto-magically.

ok benno@

Revision 1.5 / (download) - annotate - [select for diffs], Thu Mar 20 13:19:06 2014 UTC (10 years, 2 months ago) by mpi
Branch: MAIN
Changes since 1.4: +1 -10 lines
Diff to previous 1.4 (colored)

Do not pull <sys/tree.h> unconditionally in <net/if.h>, only the address
tree and the 80211 nodes need it.

ok henning@, mikeb@

Revision 1.4 / (download) - annotate - [select for diffs], Tue Jan 21 10:18:26 2014 UTC (10 years, 4 months ago) by mpi
Branch: MAIN
CVS Tags: OPENBSD_5_5_BASE, OPENBSD_5_5
Changes since 1.3: +2 -2 lines
Diff to previous 1.3 (colored)

Do not clean the multicast records of an interface when it is destroyed
(unplugged).  Even if it makes no sense to keep them around if the
interface is no more, we cannot safely remove them since pcb multicast
options might keep a pointer to them.

Fixes a use after free introduced by the multicast address linking
rewrite and reported by Alexey Suslikov, thanks!

ok claudio@

Revision 1.3 / (download) - annotate - [select for diffs], Thu Nov 28 11:05:18 2013 UTC (10 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.2: +1 -9 lines
Diff to previous 1.2 (colored)

IFAFREE() was resurrected from the dead with the new bandwidth subsystem,
send it back to the Attic.

Revision 1.2 / (download) - annotate - [select for diffs], Thu Nov 28 10:16:44 2013 UTC (10 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.1: +12 -1 lines
Diff to previous 1.1 (colored)

Change the way protocol multicast addresses are linked to an interface.

Instead of linking multicast records to the first configured address of
the corresponding protocol, making this address and its position in the
global list special, add them to a new list directly linked to the
interface descriptor.

This new multicast address list is similar to the address list: all its
elements contain a protocol agnostic part.  This design lets us join a
multicast group without necessarily having a configured address.  That
means IPv6 multicast kludges are no longer needed.

Another benefit is to be able to add and remove an IP address from an
interface without worrying about multicast records.  That means that the
global IPv4 list is no longer needed since the first configured address
of an interface is no longer special.

This new list might also be extended in the future to contain the
link-layer addresses used to configure hardware filters.

Tested by sthen@ and weerd@, ok mikeb@

Revision 1.1 / (download) - annotate - [select for diffs], Thu Nov 21 17:32:12 2013 UTC (10 years, 6 months ago) by mikeb
Branch: MAIN

split kernel parts of the if.h into a separate header file if_var.h
which allows us to modify ifnet structure in a relatively safe way;
discussed with deraadt, ok mpi
