OpenBSD CVS

CVS log for src/sys/kern/uipc_socket.c



Default branch: MAIN


Revision 1.335 / (download) - annotate - [select for diffs], Fri May 17 19:11:14 2024 UTC (3 weeks, 3 days ago) by mvs
Branch: MAIN
CVS Tags: HEAD
Changes since 1.334: +84 -127 lines

Turn sblock() into an `sb_lock' rwlock(9) wrapper for all sockets.

Unify behaviour for all sockets. sblock() should now always be taken
before solock() in all involved paths, such as sosend(), soreceive(),
sorflush() and sosplice(). sblock() is a fine-grained lock which
serializes socket send and receive routines on the `so_rcv' or `so_snd'
buffers. Waiting for the netlock while holding sblock() is not a big
problem.

This unification removes a lot of temporary "sb_flags & SB_MTXLOCK" code
from the sockets layer. It also makes the "solock()" and "sblock()" lock
order straightforward: no more solock() -> sblock() -> sounlock() ->
solock() -> sbunlock() -> sounlock() chains in the sosend() and
soreceive() paths. Finally, it brings witness(4) support to sblock(),
including for NFS-involved sockets, which is useful.

Since the witness(4) support was introduced to sblock() with this diff,
some new witness reports appeared.

bulk(1) tests by tb, ok bluhm

Revision 1.334 / (download) - annotate - [select for diffs], Fri May 17 19:02:04 2024 UTC (3 weeks, 3 days ago) by mvs
Branch: MAIN
Changes since 1.333: +2 -1 lines

Switch AF_KEY sockets to the new locking scheme.

The simplest case. Nothing to change in sockets layer, only set
SB_MTXLOCK on socket buffers.

ok bluhm

Revision 1.333 / (download) - annotate - [select for diffs], Fri May 3 17:43:09 2024 UTC (5 weeks, 3 days ago) by mvs
Branch: MAIN
Changes since 1.332: +32 -27 lines

Push solock() deep down to sosend() and remove it from soreceive() paths
for unix(4) sockets.

The transmission path of unix(4) sockets is already half-unlocked because
the connected peer is not locked by solock() during the sbappend*() call.
Use `sb_mtx' mutex(9) and `sb_lock' rwlock(9) to protect both `so_snd' and
`so_rcv'.

Since the `so_snd' is protected by `sb_mtx' mutex(9) the re-locking
is not required in uipc_rcvd().

Do direct `so_rcv' dispose and cleanup in sofree(). This socket is
almost dead and unlinked from everywhere, including the spliced peer, so
a concurrent sotask() thread will just exit. This is required to keep the
lock order between `i_lock' and `sb_lock'. It also removes re-locking
from sofree() for all sockets.

SB_OWNLOCK became redundant with SB_MTXLOCK, so remove it. SB_MTXLOCK
was kept because checks against SB_MTXLOCK within sb*() routines are more
consistent.

Feedback and ok bluhm

Revision 1.332 / (download) - annotate - [select for diffs], Thu May 2 11:55:31 2024 UTC (5 weeks, 4 days ago) by mvs
Branch: MAIN
Changes since 1.331: +2 -2 lines

Pass `sosp' instead of `so' to sblock() when locking `so_snd' within
sosplice().

ok bluhm

Revision 1.331 / (download) - annotate - [select for diffs], Tue Apr 30 17:59:15 2024 UTC (5 weeks, 6 days ago) by mvs
Branch: MAIN
Changes since 1.330: +36 -11 lines

Push solock() down to sosend() for SOCK_RAW sockets.

Raw sockets are the simplest inet sockets, so use them to start landing
`sb_mtx' mutex(9) protection for the `so_snd' buffer. Now solock() is
taken only around pru_send*(); the rest of sosend() is serialized by
sblock() and `sb_mtx'. The unlocked SS_ISCONNECTED check is fine, because
rip{,6}_send() checks it. Also, previously SS_ISCONNECTED could be
lost due to the solock() release around the following m_getuio().

ok bluhm

Revision 1.330 / (download) - annotate - [select for diffs], Mon Apr 15 21:31:29 2024 UTC (8 weeks ago) by mvs
Branch: MAIN
Changes since 1.329: +61 -62 lines

Don't take solock() in soreceive() for udp(4) sockets.

These sockets are not connection oriented, they don't call pru_rcvd(),
but they have splicing ability and they set `so_error'.

Splicing ability is the biggest problem. However, we can hold `sb_mtx'
around `ssp_socket' modifications together with solock(). So `sb_mtx'
is sufficient for the isspliced() check in soreceive(). The
unlocked `so_sp' dereference is fine, because we set it only once for
the whole socket lifetime and we do this before the `ssp_socket'
assignment.

We also need to take sblock() before splicing sockets, so sosplice()
and soreceive() are both serialized. Since `sb_mtx' is required to
unsplice sockets too, it also serializes somove() with soreceive()
regardless of the somove() caller.

The sosplice() was reworked to accept standalone sblock() for udp(4)
sockets.

soreceive() performs an unlocked `so_error' check and modification.
Previously, we had no ability to predict which concurrent soreceive()
or sosend() thread would fail and clear `so_error'. With this unlocked
access we could have sosend() and soreceive() threads which fail
together.

`so_error' is stored in the local `error2' variable because `so_error'
could be overwritten by a concurrent sosend() thread.

Tested and ok bluhm
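
The `error2' idea from revision 1.330 can be illustrated with a small
userland sketch. The field is read once into a local, and only the local
is tested and returned, so a later overwrite of the field cannot change
the value the caller sees. `toy_socket' and toy_take_error() are
hypothetical names; the sketch shows the snapshot pattern only and makes
no claim about how the kernel guarantees the atomicity of the access.

```c
#include <assert.h>
#include <errno.h>

/* Toy stand-in for the one field of interest. */
struct toy_socket {
	int so_error;
};

/*
 * Return the pending error, if any, and clear it.  The field is copied
 * into `error2' exactly once; the test and the return value both use
 * the copy rather than re-reading the field.
 */
static int
toy_take_error(struct toy_socket *so)
{
	int error2 = so->so_error;

	if (error2 != 0)
		so->so_error = 0;
	return error2;
}
```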

Revision 1.329 / (download) - annotate - [select for diffs], Thu Apr 11 13:32:51 2024 UTC (8 weeks, 4 days ago) by mvs
Branch: MAIN
Changes since 1.328: +79 -32 lines

Don't take solock() in soreceive() for SOCK_RAW inet sockets.

For inet sockets solock() is the netlock wrapper, so soreceive() could
be performed simultaneously with exclusively locked code paths.

These sockets are not connection oriented, they don't call pru_rcvd(),
they can't be spliced, they don't set `so_error'. Nothing to protect
with solock() in soreceive() path.

The `so_rcv' buffer is protected by `sb_mtx' mutex(9), but since it is
released, sblock() is required to serialize concurrent soreceive() and
sorflush() threads. The current sblock() is a kind of rwlock(9)
implementation, so introduce `sb_lock' rwlock(9) and use it directly for
that purpose.

sorflush() and its callers were refactored to avoid solock() for raw
inet sockets. This was done to avoid stopping packet processing.

Tested and ok bluhm.

Revision 1.328 / (download) - annotate - [select for diffs], Wed Apr 10 12:04:41 2024 UTC (2 months ago) by mvs
Branch: MAIN
Changes since 1.327: +2 -10 lines

Remove `head' socket re-locking in sonewconn().

uipc_attach() releases solock() because it should be taken after
`unp_gc_lock' rwlock(9) which protects the `unp_link' list. For this
reason, the listening `head' socket should be unlocked too while
sonewconn() calls uipc_attach(). This could be reworked because now
`so_rcv' sockbuf relies on `sb_mtx' mutex(9).

The last `unp_link' foreach loop within unp_gc() discards sockets
previously marked as UNP_GCDEAD. These sockets are not accessed from
userland. The only exception is the sosend() threads of connected
sending peers, but they only sbappend*() mbuf(9) to `so_rcv'. So it's
enough to unlink the mbuf(9) chain with `sb_mtx' held and discard it
lockless.

Please note, the existing SS_NEWCONN_WAIT logic was never used because
the listening unix(4) socket is protected from concurrent unp_detach() by
the vnode(9) lock; however, `head' was re-locked every time.

ok bluhm

Revision 1.327 / (download) - annotate - [select for diffs], Tue Apr 2 14:23:15 2024 UTC (2 months, 1 week ago) by claudio
Branch: MAIN
Changes since 1.326: +2 -1 lines

Implement SO_ACCEPTCONN in getsockopt(2)
Requested by robert@
OK mvs@ millert@ deraadt@
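
From userland, the option added in revision 1.327 is queried with the
standard getsockopt(2) call. A minimal sketch, where is_listening() is a
hypothetical helper name and the return convention is an assumption of
this example:

```c
#include <assert.h>
#include <sys/socket.h>

/*
 * Return 1 if `fd' is a listening socket, 0 if it is not,
 * and -1 on error (errno is set by getsockopt(2)).
 */
static int
is_listening(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, SOL_SOCKET, SO_ACCEPTCONN, &val, &len) == -1)
		return -1;
	return val != 0;
}
```

A caller would typically use this on a descriptor it inherited, for
example to decide whether accept(2) is meaningful on it.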

Revision 1.326 / (download) - annotate - [select for diffs], Tue Apr 2 12:21:39 2024 UTC (2 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.325: +3 -3 lines

Remove wrong "temporary udp error" comment in filt_so{read,write}(). Not
only udp(4) sockets set and check `so_error'.

No functional changes.

ok bluhm

Revision 1.325 / (download) - annotate - [select for diffs], Sun Mar 31 14:01:28 2024 UTC (2 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.324: +9 -1 lines

Allow listen(2) only on sockets of type SOCK_STREAM or SOCK_SEQPACKET.
listen(2) man(1) page clearly prohibits sockets of other types.

Reported-by: syzbot+00450333592fcd38c6fe@syzkaller.appspotmail.com

ok bluhm
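
The type check described in revision 1.325 amounts to a small switch. A
sketch under assumptions: toy_listen_check() is a hypothetical name, and
returning EOPNOTSUPP follows the error documented in listen(2) for
socket types that do not support the operation; the log itself does not
state which errno the kernel returns.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>

/* Return 0 if listen(2) makes sense for this socket type, else an errno. */
static int
toy_listen_check(int so_type)
{
	switch (so_type) {
	case SOCK_STREAM:
	case SOCK_SEQPACKET:
		return 0;
	default:
		return EOPNOTSUPP;	/* assumed, per listen(2) */
	}
}
```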

Revision 1.324 / (download) - annotate - [select for diffs], Sun Mar 31 13:50:00 2024 UTC (2 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.323: +9 -1 lines

Mark `so_rcv' sockbuf of udp(4) sockets as SB_OWNLOCK.

sbappend*() and soreceive() of SB_MTXLOCK marked sockets use `sb_mtx'
mutex(9) for protection, while the buffer usage check and the
corresponding sbwait() sleep are still serialized by solock(). Mark
udp(4) as SB_OWNLOCK to avoid solock() serialization and rely on `sb_mtx'
mutex(9). The `sb_state' and `sb_flags' modifications must be protected
by `sb_mtx' too.

ok bluhm

Revision 1.323 / (download) - annotate - [select for diffs], Wed Mar 27 22:47:53 2024 UTC (2 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.322: +18 -7 lines

Introduce SB_OWNLOCK to mark sockets whose `so_rcv' buffer is modified
outside the socket lock.

`sb_mtx' mutex(9) is used for this case and it should not be released
between the `so_rcv' usage check and the corresponding sbwait() sleep.
Otherwise a wakeup() could sometimes be lost.

ok bluhm
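
The lost-wakeup concern in revision 1.323 is the classic "check, then
sleep" race. A userland analogue, assuming POSIX threads:
pthread_cond_wait() releases the mutex and goes to sleep as one atomic
step, so an append cannot slip in between the emptiness check and the
sleep. `toy_sockbuf', toy_sbinit(), toy_sbappend() and toy_soreceive()
are hypothetical names, and the kernel's sbwait() is not implemented
this way; only the invariant is the same.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct toy_sockbuf {
	pthread_mutex_t sb_mtx;	/* plays the role of `sb_mtx' */
	pthread_cond_t sb_cv;	/* what a sleeping receiver waits on */
	int sb_cc;		/* bytes queued */
};

static void
toy_sbinit(struct toy_sockbuf *sb)
{
	pthread_mutex_init(&sb->sb_mtx, NULL);
	pthread_cond_init(&sb->sb_cv, NULL);
	sb->sb_cc = 0;
}

/* Producer: queue data and wake a sleeper, all under the mutex. */
static void
toy_sbappend(struct toy_sockbuf *sb, int len)
{
	pthread_mutex_lock(&sb->sb_mtx);
	sb->sb_cc += len;
	pthread_cond_signal(&sb->sb_cv);
	pthread_mutex_unlock(&sb->sb_mtx);
}

/* Consumer: the check and the sleep happen without dropping the mutex. */
static int
toy_soreceive(struct toy_sockbuf *sb)
{
	int len;

	pthread_mutex_lock(&sb->sb_mtx);
	while (sb->sb_cc == 0)
		pthread_cond_wait(&sb->sb_cv, &sb->sb_mtx);
	len = sb->sb_cc;
	sb->sb_cc = 0;
	pthread_mutex_unlock(&sb->sb_mtx);
	return len;
}
```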

Revision 1.322 / (download) - annotate - [select for diffs], Tue Mar 26 09:46:47 2024 UTC (2 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.321: +31 -15 lines

Use `sb_mtx' to protect `so_rcv' receive buffer of unix(4) sockets.

This makes re-locking unnecessary in the uipc_*send() paths, because
it's enough to lock one socket to prevent the peer from concurrent
disconnection. As a little bonus, one unix(4) socket can perform
simultaneous transmission and reception, with one exception for
uipc_rcvd(), which still requires the re-lock for connection oriented
sockets.

The socket lock is not held while filt_soread() and filt_soexcept() are
called from uipc_*send() through sorwakeup(). However, the unlocked
access to `so_options', `so_state' and `so_error' is fine.

The receiving socket can't be or become a listening socket. It also can't
be disconnected concurrently. This makes the SO_ACCEPTCONN,
SS_ISDISCONNECTED and SS_ISCONNECTED bits immutable: the first two stay
clear and the last stays set.

`so_error' is set on the peer sockets only by unp_detach(), which also
can't be called concurrently on sending socket.

This is also true for filt_fiforead() and filt_fifoexcept(). For other
callers like kevent(2) or doaccept() the socket lock is still held.

ok bluhm

Revision 1.321 / (download) - annotate - [select for diffs], Fri Mar 22 17:34:11 2024 UTC (2 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.320: +1 -2 lines

Use sorflush() instead of direct unp_scan(..., unp_discard) to discard
dead unix(4) sockets.

The difference between direct unp_scan() and sorflush() is the mbuf(9)
chain. In the first case it is still linked to `so_rcv', in the second it
is not. This is required to make `sb_mtx' mutex(9) the only `so_rcv'
sockbuf protection and remove socket re-locking from most of the
uipc_*send() paths. The unlinked mbuf(9) chain doesn't require any
protection, so this allows the sleeping unp_discard() to run lockless.

Also, the mbuf(9) chain of the discarded socket still contains addresses
of file descriptors and it is much safer to unlink it before FRELE()
them. This is the reason to commit this diff standalone.

ok bluhm

Revision 1.320 / (download) - annotate - [select for diffs], Mon Feb 12 22:48:27 2024 UTC (3 months, 3 weeks ago) by mvs
Branch: MAIN
CVS Tags: OPENBSD_7_5_BASE, OPENBSD_7_5
Changes since 1.319: +5 -4 lines

Pass protosw instead of domain structure to soalloc() to get real
`pr_type'. The corresponding domain is referenced as `pr_domain'.
Otherwise dp->dom_protosw->pr_type of inet sockets always points
to inetsw[0].

ok bluhm

Revision 1.319 / (download) - annotate - [select for diffs], Sun Feb 11 21:36:49 2024 UTC (3 months, 4 weeks ago) by mvs
Branch: MAIN
Changes since 1.318: +4 -4 lines

Release `sb_mtx' mutex(9) before sbunlock().

ok bluhm

Revision 1.318 / (download) - annotate - [select for diffs], Sun Feb 11 18:14:26 2024 UTC (3 months, 4 weeks ago) by mvs
Branch: MAIN
Changes since 1.317: +23 -11 lines

Use `sb_mtx' instead of `inp_mtx' in receive path for inet sockets.

In soreceive(), we only touch the `so_rcv' socket buffer, which has its
own `sb_mtx' mutex(9) for protection. So, we can avoid solock() in this
path - it's enough to hold `sb_mtx' in soreceive() and around the
corresponding sbappend*(). But not right now :)

Currently we use shared netlock for some inet sockets in the soreceive()
path. To protect the `so_rcv' buffer we use `inp_mtx' mutex(9) and
pru_lock() to acquire this mutex(9) in the socket layer. But the `inp_mtx'
mutex belongs to the PCB. We initialize the socket before the PCB, and
tcp(4) sockets could exist without a PCB, so use `sb_mtx' mutex(9) to
protect sockbuf stuff.

This diff mechanically replaces `inp_mtx' by `sb_mtx' in the receive
path, only for sockets which already use `inp_mtx'. All other sockets
are left as is. They will be converted later.

Since `sb_mtx' is optional, the new SB_MTXLOCK flag is introduced. If
this flag is set in `sb_flags', the `sb_mtx' mutex(9) should be taken.
New sb_mtx_lock() and sb_mtx_unlock() were introduced to hide this check.
They are temporary and will be replaced by mtx_enter() when this whole
area is converted to `sb_mtx' mutex(9).

Also, the new sbmtxassertlocked() function is introduced to throw the
corresponding assertion for SB_MTXLOCK marked buffers. Currently only
sbappendaddr() calls it. This function is also temporary and will be
replaced by MTX_ASSERT_LOCKED() later.

ok bluhm
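
The conditional wrappers described in revision 1.318 can be sketched as
follows. This is a userland illustration, not the kernel code: the
`toy_' names are hypothetical, the flag value is made up, a pthread
mutex stands in for mutex(9), and the `sb_locked' field exists only so
the example can observe whether the mutex was taken.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define TOY_SB_MTXLOCK	0x01	/* hypothetical value */

struct toy_sockbuf {
	pthread_mutex_t sb_mtx;
	int sb_flags;
	int sb_locked;		/* illustration only */
};

static void
toy_sbinit(struct toy_sockbuf *sb, int flags)
{
	pthread_mutex_init(&sb->sb_mtx, NULL);
	sb->sb_flags = flags;
	sb->sb_locked = 0;
}

/* Take the mutex only for buffers that opted in via the flag. */
static void
toy_sb_mtx_lock(struct toy_sockbuf *sb)
{
	if (sb->sb_flags & TOY_SB_MTXLOCK) {
		pthread_mutex_lock(&sb->sb_mtx);
		sb->sb_locked = 1;
	}
}

static void
toy_sb_mtx_unlock(struct toy_sockbuf *sb)
{
	if (sb->sb_flags & TOY_SB_MTXLOCK) {
		sb->sb_locked = 0;
		pthread_mutex_unlock(&sb->sb_mtx);
	}
}
```

Hiding the flag test inside the wrappers keeps call sites identical for
converted and unconverted sockets, which is what makes the wrappers easy
to delete once every socket uses the mutex.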

Revision 1.317 / (download) - annotate - [select for diffs], Mon Feb 5 20:21:38 2024 UTC (4 months ago) by mvs
Branch: MAIN
Changes since 1.316: +7 -5 lines

Use `sb_mtx' mutex(9) to protect `sb_timeo_nsecs'. In most places
solock() is still held because other 'sockbuf' members require it, but
in so{g,s}etopt() paths solock() is avoided.

ok bluhm

Revision 1.316 / (download) - annotate - [select for diffs], Sat Feb 3 22:50:08 2024 UTC (4 months ago) by mvs
Branch: MAIN
Changes since 1.315: +100 -87 lines

Rework socket buffers locking for shared netlock.

Shared netlock is not sufficient to call so{r,w}wakeup(). The following
sowakeup() modifies `sb_flags' and knote(9) stuff. Unfortunately, we
can't call so{r,w}wakeup() with `inp_mtx' mutex(9) because sowakeup()
also calls pgsigio() which grabs kernel lock.

However, `so*_filtops' callbacks only perform read-only access to the
socket stuff, so it is enough to hold shared netlock only, but the klist
stuff needs to be protected.

This diff introduces `sb_mtx' mutex(9) to protect sockbuf. For now
`sb_mtx' is used to protect only `sb_flags' and `sb_klist'.

Now we have soassertlocked_readonly() and soassertlocked(). The first
one is happy if only shared netlock is held, while the second wants
`so_lock' or pru_lock() to be held together with shared netlock.

To keep soassertlocked*() assertions soft, we need to know the mutex(9)
state, so the new mtx_owned() macro was introduced. Also, the new optional
(*pru_locked)() handler reports the state of pru_lock().

Tests and ok from bluhm.

Revision 1.315 / (download) - annotate - [select for diffs], Fri Jan 26 18:24:23 2024 UTC (4 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.314: +7 -5 lines

Unlock listen(2). `somaxconn_local' and `sominconn_local' are used to
cache the respective values, as we do in other places.

ok bluhm

Revision 1.314 / (download) - annotate - [select for diffs], Fri Jan 12 10:48:03 2024 UTC (4 months, 4 weeks ago) by bluhm
Branch: MAIN
Changes since 1.313: +5 -5 lines

Send UDP packets in parallel.

Sending UDP packets via datagram socket is MP safe now.  Same applies
to raw IPv4 and IPv6, and divert sockets.  Switch sosend() from
exclusive net lock to shared net lock in combination with per socket
lock.  TCP and GRE still use exclusive net lock.

tested by otto@ and florian@
OK mvs@

Revision 1.313 / (download) - annotate - [select for diffs], Thu Jan 11 14:15:11 2024 UTC (4 months, 4 weeks ago) by bluhm
Branch: MAIN
Changes since 1.312: +4 -4 lines

Use domain name for socket lock.

Syzkaller with witness complains about lock ordering of pf lock
with socket lock.  Socket lock for inet is taken before pf lock.
Pf lock is taken before socket lock for route.  This is a false
positive as route and inet socket locks are distinct.  Witness does
not know this.  Name the socket lock like the domain of the socket,
then rwlock name is used in witness lo_name subtype.  Make domain
names more consistent for locking, they were not used anyway.
Regardless of witness problem, unique lock name for each socket
type make sense.

Reported-by: syzbot+34d22dcbf20d76629c5a@syzkaller.appspotmail.com
Reported-by: syzbot+fde8d07ba74b69d0adfe@syzkaller.appspotmail.com
OK mvs@

Revision 1.312 / (download) - annotate - [select for diffs], Tue Dec 19 21:34:22 2023 UTC (5 months, 3 weeks ago) by bluhm
Branch: MAIN
Changes since 1.311: +3 -2 lines

Release inpcb mutex while calling sbwait().

As sbwait() may sleep, holding any mutex is not allowed.  Call
pru_unlock() before sbwait() in soreceive().

Bug spotted by sashan@; OK sashan@ mvs@

Revision 1.311 / (download) - annotate - [select for diffs], Tue Dec 19 01:11:21 2023 UTC (5 months, 3 weeks ago) by bluhm
Branch: MAIN
Changes since 1.310: +3 -4 lines

soreceive() must not hold mutex when calling sblock().

In my recent commit I missed that sblock() may sleep while soreceive()
holds the inpcb mutex.  Call pru_lock() after sblock().

Reported-by: syzbot+f79c896ec019553655a0@syzkaller.appspotmail.com
Reported-by: syzbot+08b6f1102e429b2d4f84@syzkaller.appspotmail.com
OK mvs@

Revision 1.310 / (download) - annotate - [select for diffs], Mon Dec 18 13:11:20 2023 UTC (5 months, 3 weeks ago) by bluhm
Branch: MAIN
Changes since 1.309: +11 -1 lines

Run bind(2) system call in parallel.

For protocols that care about locking, use the shared net lock to
call sobind().  Use the per socket rwlock together with shared net
lock.  This affects protocols UDP, raw IP, and divert.  Move the
inpcb mutex locking into soreceive(), it is only used there.  Add
a comment to describe the current implementation of inpcb locking.

OK mvs@ sashan@

Revision 1.309 / (download) - annotate - [select for diffs], Tue Aug 8 22:07:25 2023 UTC (10 months ago) by mvs
Branch: MAIN
CVS Tags: OPENBSD_7_4_BASE, OPENBSD_7_4
Changes since 1.308: +5 -8 lines

Merge SO_BINDANY cases from both switch blocks within sosetopt(). This
time SO_LINGER case is separated, so there is no reason for dedicated
switch block.

ok bluhm

Revision 1.308 / (download) - annotate - [select for diffs], Tue Aug 8 22:06:27 2023 UTC (10 months ago) by mvs
Branch: MAIN
Changes since 1.307: +14 -33 lines

Merge SO_SND* with corresponding SO_RCV* cases within sosetopt(). The
only difference is the socket buffer.

As a bonus, in the future solock() will be easily replaced by sblock()
instead of pushing it down to each SO_SND* and SO_RCV* case.

ok bluhm

Revision 1.307 / (download) - annotate - [select for diffs], Thu Aug 3 09:49:08 2023 UTC (10 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.306: +44 -24 lines

Move solock() down to sosetopt(). A part of standalone sblock() work.
This movement is required because the buffer related SO_SND* and SO_RCV*
socket options should be protected with sblock(). However, standalone
sblock() has a different lock order with solock() and the `so_snd' and
`so_rcv' buffers. At least sblock() for the `so_snd' buffer will always
be taken before solock() in the sosend() path.

The (*pr_ctloutput)() call was removed from the SOL_SOCKET level 'else'
branch. Except for the SO_RTABLE case, where it is handled in a special
way, this call is a no-op.

For SO_SND* and SO_RCV* cases solock() will be replaced by sblock() in
the future.

Feedback from bluhm

Tested by bluhm naddy

ok bluhm

Revision 1.306 / (download) - annotate - [select for diffs], Sat Jul 22 14:30:39 2023 UTC (10 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.305: +3 -2 lines

Add `sb_state' output to sobuf_print(). It contains SS_CANTSENDMORE,
SS_ISSENDING, SS_CANTRCVMORE and SS_RCVATMARK bits. Also do `sb_flags'
output as hex, it contains flags too.

ok kn bluhm

Revision 1.305 / (download) - annotate - [select for diffs], Tue Jul 4 22:28:24 2023 UTC (11 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.304: +7 -8 lines

Introduce SBL_WAIT and SBL_NOINTR sblock() flags.

This refactoring is another step towards standalone socket buffer
locking. sblock() uses the M_WAITOK and M_NOWAIT flags passed as the third
argument together with the SB_NOINTR flag in `sb_flags' to control
sleep behaviour. To perform uninterruptible acquisition, the SB_NOINTR
flag should be set before the sblock() call. `sb_flags' modification
requires holding solock() around sblock()/sbunlock(), which makes a
standalone call impossible.

Also, `sb_flags' modifications outside sblock()/sbunlock() make the
uninterruptible acquisition code quite large. Currently only sorflush()
does this (and forgets to restore the SB_NOINTR flag, so a
shutdown(SHUT_RDWR) call permanently modifies socket locking behaviour)
and this does not look like a big problem. But with standalone socket
buffer locking there will be many such places, so this large construction
is unwanted.

Introduce new SBL_NOINTR flag passed as third sblock() argument. The
sblock() acquisition will be uninterruptible when existing SB_NOINTR
flag is set on `sb_flags' or SBL_NOINTR was passed.

The M_WAITOK and M_NOWAIT flags belong to malloc(9). It has no M_NOINTR
flag and there is no reason to introduce it. So for consistency reasons
introduce the new SBL_WAIT and use it together with SBL_NOINTR instead of
M_WAITOK and M_NOINTR respectively.

ok bluhm
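
The flag handling described in revision 1.305 can be sketched as two
predicates. The flag values and the `toy_' names below are hypothetical;
only the rule itself comes from the log: acquisition is uninterruptible
when SB_NOINTR is set in `sb_flags' or SBL_NOINTR is passed by the
caller.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_SB_NOINTR	0x40	/* hypothetical `sb_flags' bit */
#define TOY_SBL_WAIT	0x01	/* hypothetical sblock() flag */
#define TOY_SBL_NOINTR	0x02	/* hypothetical sblock() flag */

/* May the caller sleep while waiting for the buffer lock? */
static bool
toy_sblock_may_sleep(int flags)
{
	return (flags & TOY_SBL_WAIT) != 0;
}

/* Is the wait uninterruptible by signals? */
static bool
toy_sblock_nointr(int sb_flags, int flags)
{
	return (sb_flags & TOY_SB_NOINTR) != 0 ||
	    (flags & TOY_SBL_NOINTR) != 0;
}
```

Passing the request per call means the caller no longer has to modify
`sb_flags', and so no longer needs the socket lock just to ask for an
uninterruptible wait.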

Revision 1.304 / (download) - annotate - [select for diffs], Fri Jun 30 11:52:11 2023 UTC (11 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.303: +2 -2 lines

Use "newcon" instead of "netlck" as identifier of the sleep reason while
awaiting concurrent sonewconn() threads termination.

ok bluhm

Revision 1.303 / (download) - annotate - [select for diffs], Fri Apr 28 12:53:42 2023 UTC (13 months, 1 week ago) by bluhm
Branch: MAIN
Changes since 1.302: +8 -3 lines

Add a membar_consumer() for the taskq_create() in sosplice().  Membar
producer and consumer must come in pair and the latter was missing.
Also move the code a bit to make clear which check is needed for
what.
OK mvs@
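
The producer/consumer pairing mentioned in revision 1.303 can be
illustrated with C11 atomics. This is an analogy, not the kernel code:
a release store and an acquire load are used here as stand-ins for
membar_producer() and membar_consumer(), and `toy_taskq', toy_publish()
and toy_consume() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_taskq {
	int tq_ready;
};

static struct toy_taskq toy_tq_storage;
static _Atomic(struct toy_taskq *) toy_tq;	/* NULL until published */

/* Producer: finish initialisation, then publish the pointer. */
static void
toy_publish(void)
{
	toy_tq_storage.tq_ready = 1;
	atomic_store_explicit(&toy_tq, &toy_tq_storage,
	    memory_order_release);
}

/*
 * Consumer: the acquire load pairs with the release store above, so
 * once the pointer is seen, the initialised contents are seen too.
 * Returns -1 if nothing has been published yet.
 */
static int
toy_consume(void)
{
	struct toy_taskq *tq;

	tq = atomic_load_explicit(&toy_tq, memory_order_acquire);
	return tq != NULL ? tq->tq_ready : -1;
}
```

Without the consumer half, a reader could observe the pointer but still
read stale contents, which is why the two barriers must come as a pair.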

Revision 1.302 / (download) - annotate - [select for diffs], Mon Apr 24 09:20:09 2023 UTC (13 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.301: +13 -15 lines

Don't check `so_sp' within sofree(). The following isspliced() and
issplicedback() already have this check.

ok bluhm@

Revision 1.301 / (download) - annotate - [select for diffs], Fri Feb 10 14:34:17 2023 UTC (15 months, 4 weeks ago) by visa
Branch: MAIN
CVS Tags: OPENBSD_7_3_BASE, OPENBSD_7_3
Changes since 1.300: +2 -2 lines

Adjust knote(9) API

Make knote(9) lock the knote list internally, and add knote_locked(9)
for the typical situation where the list is already locked.

Remove the KNOTE(9) macro to simplify the API.

Manual page OK jmc@
OK mpi@ mvs@

Revision 1.300 / (download) - annotate - [select for diffs], Thu Feb 2 09:35:07 2023 UTC (16 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.299: +16 -12 lines

Move the rest of common socket initialization within soalloc().

ok visa@

Revision 1.299 / (download) - annotate - [select for diffs], Fri Jan 27 21:01:59 2023 UTC (16 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.298: +23 -6 lines

Push solock() down to sogetopt(). It is not required in most cases.
Also, some cases could be protected with solock_shared().

ok bluhm@

Revision 1.298 / (download) - annotate - [select for diffs], Fri Jan 27 18:46:34 2023 UTC (16 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.297: +9 -9 lines

Replace the selinfo structure by klist in sockbuf. No reason to keep it,
selinfo is just a wrapper around klist. netstat(1) and libkvm use the
socket structure, but don't touch so_{snd,rcv}.sb_sel.

ok visa@

Revision 1.297 / (download) - annotate - [select for diffs], Mon Jan 23 18:34:24 2023 UTC (16 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.296: +8 -8 lines

Move SS_ISSENDING flag to `sb_state'. It should belong to the send
buffer as the SS_CANTSENDMORE flag.

ok bluhm@

Revision 1.296 / (download) - annotate - [select for diffs], Mon Jan 23 18:33:34 2023 UTC (16 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.295: +7 -7 lines

In somove() rename `state' variable to `rcvstate' to make code more
readable. No functional changes.

Proposed by and ok bluhm@

Revision 1.295 / (download) - annotate - [select for diffs], Sun Jan 22 12:05:44 2023 UTC (16 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.294: +16 -13 lines

Move SS_CANTRCVMORE and SS_RCVATMARK bits from `so_state' to `sb_state' of
the receive buffer. As was done for the SS_CANTSENDMORE bit, the
definitions are kept as is, but now these bits belong to the `sb_state' of
the receive buffer. `sb_state' is ORed with `so_state' when socket data is
exported to userland.

ok bluhm@

Revision 1.294 / (download) - annotate - [select for diffs], Sat Jan 21 11:23:23 2023 UTC (16 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.293: +11 -9 lines

Introduce per-sockbuf `sb_state' to use it with SS_CANTSENDMORE.

Currently, the socket buffer lock requires solock() to be held. As a part
of the socket buffers standalone locking work, move the socket state bits
which represent its buffers' state to per-buffer state.

Unlike the previous reverted diff, the SS_CANTSENDMORE definition is left
as is, but it is used only with `sb_state'. `sb_state' is ORed with the
original `so_state' when the socket's data is exported to userland, so the
ABI is kept as it was.

Inputs from deraadt@.

ok bluhm@
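
The ABI-preserving export described in revision 1.294 can be sketched as
a plain OR of the state words. The bit values and the `toy_' names are
hypothetical; only the idea of merging the per-buffer `sb_state' back
into `so_state' on export comes from the log.

```c
#include <assert.h>

#define TOY_SS_ISCONNECTED	0x002	/* hypothetical value */
#define TOY_SS_CANTSENDMORE	0x010	/* hypothetical value */

struct toy_sockbuf {
	int sb_state;
};

struct toy_socket {
	int so_state;
	struct toy_sockbuf so_snd;
	struct toy_sockbuf so_rcv;
};

/*
 * Userland keeps seeing one state word: the socket's own bits plus
 * the bits that now live in each buffer.
 */
static int
toy_export_state(const struct toy_socket *so)
{
	return so->so_state | so->so_snd.sb_state | so->so_rcv.sb_state;
}
```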

Revision 1.293 / (download) - annotate - [select for diffs], Mon Dec 12 08:30:22 2022 UTC (17 months, 4 weeks ago) by tb
Branch: MAIN
Changes since 1.292: +8 -10 lines

Revert sb_state changes to unbreak tree.

Revision 1.292 / (download) - annotate - [select for diffs], Sun Dec 11 21:19:08 2022 UTC (17 months, 4 weeks ago) by mvs
Branch: MAIN
Changes since 1.291: +11 -9 lines

Currently, the socket buffer lock requires solock() to be held. As a part
of the socket buffers standalone locking work, move the socket state bits
which represent its buffers' state to per-buffer state. Introduce
`sb_state' and turn SS_CANTSENDMORE into SBS_CANTSENDMORE. This bit will
be processed on the `so_snd' buffer only.

Move SS_CANTRCVMORE and SS_RCVATMARK bits with separate diff to make
review easier and exclude possible so_rcv/so_snd mistypes.

Also, don't adjust the remaining SS_* bits right now.

ok millert@

Revision 1.291 / (download) - annotate - [select for diffs], Mon Nov 28 21:39:28 2022 UTC (18 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.290: +2 -4 lines

Simplify return path of (*pr_ctloutput)() return value in sogetopt().

ok guenther@ kn@

Revision 1.290 / (download) - annotate - [select for diffs], Mon Oct 3 16:43:52 2022 UTC (20 months, 1 week ago) by bluhm
Branch: MAIN
Changes since 1.289: +6 -5 lines

System calls should not fail due to temporary memory shortage in
malloc(9) or pool_get(9).
Pass down a wait flag to pru_attach().  During syscall socket(2)
it is ok to wait, this logic was missing for internet pcb.  Pfkey
and route sockets were already waiting.
sonewconn() must not wait when called during TCP 3-way handshake.
This logic has been preserved.  Unix domain stream socket connect(2)
can wait until the other side has created the socket to accept.
OK mvs@

Revision 1.289 / (download) - annotate - [select for diffs], Mon Sep 5 14:56:08 2022 UTC (21 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_7_2_BASE, OPENBSD_7_2
Changes since 1.288: +10 -10 lines

Use shared netlock in soreceive().  The UDP and IP divert layer
provide locking of the PCB.  If that is possible, use shared instead
of exclusive netlock in soreceive().  The PCB mutex provides a per
socket lock against multiple soreceive() running in parallel.
Release and regrab both locks in sosleep_nsec().
OK mvs@

Revision 1.288 / (download) - annotate - [select for diffs], Sun Sep 4 09:04:27 2022 UTC (21 months ago) by bluhm
Branch: MAIN
Changes since 1.287: +3 -2 lines

Use pru_send function to check socket splicing compatibility.  Only
checking socket type is not sufficient as it could splice together
unix and inet sockets resulting in crashes.  As splicing is about
sending, the same send function looks like a good criterion.
Reported-by: syzbot+fc6901d63d858d5dd00a@syzkaller.appspotmail.com
Reported-by: syzbot+0e026f1bf8b259c6395e@syzkaller.appspotmail.com
OK gnezdo@
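
The compatibility test from revision 1.288 boils down to comparing
function pointers. A sketch with hypothetical `toy_' types and handlers;
the EPROTONOSUPPORT return value is an assumption of this example, not
something the log states.

```c
#include <assert.h>
#include <errno.h>

struct toy_socket;
typedef int (*toy_send_fn)(struct toy_socket *, int);

struct toy_usrreqs {
	toy_send_fn pru_send;
};

struct toy_socket {
	int so_type;
	const struct toy_usrreqs *so_usrreqs;
};

/* Two unrelated send handlers; the bodies are placeholders. */
static int toy_inet_send(struct toy_socket *so, int len) { (void)so; return len; }
static int toy_unix_send(struct toy_socket *so, int len) { (void)so; return -len; }

static const struct toy_usrreqs toy_inet_usrreqs = { toy_inet_send };
static const struct toy_usrreqs toy_unix_usrreqs = { toy_unix_send };

/* Same socket type is not enough; the send handlers must match too. */
static int
toy_splice_check(const struct toy_socket *so, const struct toy_socket *sosp)
{
	if (so->so_type != sosp->so_type)
		return EPROTONOSUPPORT;
	if (so->so_usrreqs->pru_send != sosp->so_usrreqs->pru_send)
		return EPROTONOSUPPORT;
	return 0;
}
```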

Revision 1.287 / (download) - annotate - [select for diffs], Sat Sep 3 13:29:33 2022 UTC (21 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.286: +2 -2 lines

Fix socket splicing between inet and inet6 sockets broken by PRU_CONTROL
request splitting to (*pru_control)().

ok bluhm@

Revision 1.286 / (download) - annotate - [select for diffs], Sun Aug 28 18:43:12 2022 UTC (21 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.285: +4 -4 lines

Don't check `so_pcb' with PR_WANTRCVD flag. tcp(4) sockets are the only
sockets which could have NULL `so_pcb' and we handle this case within
tcp_rcvd() handler.

ok bluhm@

Revision 1.285 / (download) - annotate - [select for diffs], Fri Aug 26 16:17:38 2022 UTC (21 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.284: +4 -4 lines

Move PRU_RCVD request to (*pru_rcvd)().

ok bluhm@

Revision 1.284 / (download) - annotate - [select for diffs], Sun Aug 21 16:22:17 2022 UTC (21 months, 2 weeks ago) by mvs
Branch: MAIN
Changes since 1.283: +5 -6 lines

Change soabort() return value to void. We are never interested in it.

ok bluhm@

Revision 1.283 / (download) - annotate - [select for diffs], Mon Aug 15 09:11:38 2022 UTC (21 months, 3 weeks ago) by mvs
Branch: MAIN
Changes since 1.282: +3 -3 lines

Introduce 'pr_usrreqs' structure and move existing user-protocol
handlers into it. We want to split existing (*pr_usrreq)() to multiple
short handlers for each PRU_ request as it was already done for
PRU_ATTACH and PRU_DETACH. This is the preparation step, (*pr_usrreq)()
split will be done with the following diffs.

Based on reverted diff from guenther@.

ok bluhm@

Revision 1.282 / (download) - annotate - [select for diffs], Sun Aug 14 01:58:28 2022 UTC (21 months, 4 weeks ago) by jsg
Branch: MAIN
Changes since 1.281: +1 -3 lines
Diff to previous 1.281 (colored)

remove unneeded includes in sys/kern
ok mpi@ miod@

Revision 1.281 / (download) - annotate - [select for diffs], Sat Aug 13 21:01:46 2022 UTC (21 months, 4 weeks ago) by mvs
Branch: MAIN
Changes since 1.280: +22 -41 lines
Diff to previous 1.280 (colored)

Introduce the pru_*() wrappers for corresponding (*pr_usrreq)() calls.

This is helpful for the following (*pr_usrreq)() split to multiple
handlers. But right now this makes code more readable.

Also add '#ifndef _SYS_SOCKETVAR_H_' to sys/socketvar.h. This prevents the
collisions when both sys/protosw.h and sys/socketvar.h are included
together. Both 'socket' and 'protosw' structures are required to be
defined before pru_*() wrappers, so we need to include sys/socketvar.h in
sys/protosw.h.

ok bluhm@

Revision 1.280 / (download) - annotate - [select for diffs], Mon Jul 25 07:28:22 2022 UTC (22 months, 2 weeks ago) by visa
Branch: MAIN
Changes since 1.279: +2 -2 lines
Diff to previous 1.279 (colored)

Replace selwakeup() with KNOTE() in socket event activation

Let's try this again now that the kernel locking issue in nfsrv_rcv()
has been fixed.

The previous attempt of the conversion triggered hangs on NFS servers.
This was probably caused by the removal of the kernel-locked section
just prior to the socket upcall. The section had masked a locking error
in NFS code.

Revision 1.279 / (download) - annotate - [select for diffs], Fri Jul 1 09:56:17 2022 UTC (23 months, 1 week ago) by mvs
Branch: MAIN
Changes since 1.278: +77 -5 lines
Diff to previous 1.278 (colored)

Make fine grained unix(4) domain sockets locking. Use the per-socket
`so_lock' rwlock(9) instead of global `unp_lock' which locks the whole
layer.

The PCBs of unix(4) sockets are linked to each other and we need to lock
them both. This introduces a lock ordering problem: while thread (1)
holds the lock on `so1' and tries to lock `so2', thread (2) could hold
the lock on `so2' and try to lock `so1'. To solve this we always lock
sockets in a strict order.

For the sockets which are already accessible from userland, we always
lock the socket with the smallest memory address first. Sometimes we need
to unlock a socket before locking its peer, and then lock it again.

We use reference counters to prevent destruction of the connected peer
during the relock. We also handle the case where the peer socket was
replaced by another socket.

For the newly connected sockets, which are not yet exported to the
userland by accept(2), we always lock the listening socket `head' first.
This allows us to avoid unwanted relock within accept(2) syscall.
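
The address-ordering rule described above can be sketched in userland C.
This is only an illustration under stated assumptions: `struct obj',
order_pair(), lock_pair() and unlock_pair() are hypothetical names,
pthread mutexes stand in for the kernel's rwlock(9), and the relock and
reference counting cases are not covered.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for a socket that carries its own lock. */
struct obj {
	pthread_mutex_t lock;
};

/* Reorder the pair so that *a holds the lower memory address. */
static void
order_pair(struct obj **a, struct obj **b)
{
	assert(*a != NULL && *b != NULL && *a != *b);
	if ((uintptr_t)*a > (uintptr_t)*b) {
		struct obj *tmp = *a;

		*a = *b;
		*b = tmp;
	}
}

/*
 * Lock two distinct objects, lowest address first. Every thread takes
 * the locks in the same order, so two threads cannot each hold one
 * lock while waiting for the other.
 */
static void
lock_pair(struct obj *a, struct obj *b)
{
	order_pair(&a, &b);
	pthread_mutex_lock(&a->lock);
	pthread_mutex_lock(&b->lock);
}

/* Unlock order does not matter for deadlock avoidance. */
static void
unlock_pair(struct obj *a, struct obj *b)
{
	pthread_mutex_unlock(&a->lock);
	pthread_mutex_unlock(&b->lock);
}
```

Calling lock_pair(&x, &y) and lock_pair(&y, &x) from different threads
acquires the two mutexes in the same order in both cases.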

ok claudio@

Revision 1.278 / (download) - annotate - [select for diffs], Mon Jun 6 14:45:41 2022 UTC (2 years ago) by claudio
Branch: MAIN
Changes since 1.277: +55 -54 lines
Diff to previous 1.277 (colored)

Simplify solock() and sounlock(). There is no reason to return a value
for the lock operation and to pass a value to the unlock operation.
sofree() still needs an extra flag to know if sounlock() should be called
or not. But sofree() is called less often and mostly without keeping the lock.
OK mpi@ mvs@

Revision 1.277 / (download) - annotate - [select for diffs], Mon May 9 14:49:55 2022 UTC (2 years, 1 month ago) by visa
Branch: MAIN
Changes since 1.276: +2 -2 lines
Diff to previous 1.276 (colored)

Revert "Replace selwakeup() with KNOTE() in pipe and socket event activation."

The commit caused hangs with NFS.

Reported by ajacoutot@ and naddy@

Revision 1.276 / (download) - annotate - [select for diffs], Fri May 6 13:09:41 2022 UTC (2 years, 1 month ago) by visa
Branch: MAIN
Changes since 1.275: +2 -2 lines
Diff to previous 1.275 (colored)

Replace selwakeup() with KNOTE() in pipe and socket event activation.

OK mpi@

Revision 1.275 / (download) - annotate - [select for diffs], Fri Feb 25 23:51:03 2022 UTC (2 years, 3 months ago) by guenther
Branch: MAIN
CVS Tags: OPENBSD_7_1_BASE, OPENBSD_7_1
Changes since 1.274: +5 -5 lines
Diff to previous 1.274 (colored)

Reported-by: syzbot+1b5b209ce506db4d411d@syzkaller.appspotmail.com
Revert the pr_usrreqs move: syzkaller found a NULL pointer deref
and I won't be available to monitor for followup issues for a bit

Revision 1.274 / (download) - annotate - [select for diffs], Fri Feb 25 08:36:01 2022 UTC (2 years, 3 months ago) by guenther
Branch: MAIN
Changes since 1.273: +5 -5 lines
Diff to previous 1.273 (colored)

Move pr_attach and pr_detach to a new structure pr_usrreqs that can
then be shared among protosw structures, following the same basic
direction as NetBSD and FreeBSD for this.

Split PRU_CONTROL out of pr_usrreq into pru_control, giving it the
proper prototype to eliminate the previously necessary casts.

ok mvs@ bluhm@

Revision 1.273 / (download) - annotate - [select for diffs], Wed Feb 16 13:16:10 2022 UTC (2 years, 3 months ago) by visa
Branch: MAIN
Changes since 1.272: +21 -161 lines
Diff to previous 1.272 (colored)

Reduce code duplication in socket event filters.

OK mpi@

Revision 1.272 / (download) - annotate - [select for diffs], Sun Feb 13 12:58:46 2022 UTC (2 years, 3 months ago) by visa
Branch: MAIN
Changes since 1.271: +5 -5 lines
Diff to previous 1.271 (colored)

Rename knote_modify() to knote_assign()

This avoids verb overlap with f_modify.

Revision 1.271 / (download) - annotate - [select for diffs], Fri Dec 24 06:50:16 2021 UTC (2 years, 5 months ago) by visa
Branch: MAIN
Changes since 1.270: +14 -2 lines
Diff to previous 1.270 (colored)

Make poll/select version of filt_solisten() more similar to soo_poll().

OK mpi@

Revision 1.270 / (download) - annotate - [select for diffs], Mon Dec 13 14:56:55 2021 UTC (2 years, 5 months ago) by visa
Branch: MAIN
Changes since 1.269: +7 -8 lines
Diff to previous 1.269 (colored)

Revise EVFILT_EXCEPT filters

Restrict the circumstances where EVFILT_EXCEPT filters trigger:
* when out-of-band data is present and NOTE_OOB is requested.
* when the channel is fully closed and consumer is poll(2).

This should clarify the logic and suppress events that kqueue-based
poll(2) does not expect.

OK mpi@

Revision 1.269 / (download) - annotate - [select for diffs], Thu Nov 11 16:35:09 2021 UTC (2 years, 6 months ago) by mvs
Branch: MAIN
Changes since 1.268: +13 -13 lines
Diff to previous 1.268 (colored)

Destroy protocol control block before performing `so_q0' and `so_q' queues
cleanup.

The dying socket is already unlinked from the file descriptor layer, but
still accessible from the stack or from the file system layer. We need to
unlink the socket to prevent concurrent connection while the dying socket
is unlocked during `so_q0' or `so_q' queues cleanup or during
(*pr_detach)(). This unlocking will appear with the upcoming fine grained
locked sockets diffs.

ok bluhm@

Revision 1.268 / (download) - annotate - [select for diffs], Sat Nov 6 05:26:33 2021 UTC (2 years, 7 months ago) by visa
Branch: MAIN
Changes since 1.267: +14 -3 lines
Diff to previous 1.267 (colored)

Allocate socket and initialize so_lock in one place

This makes witness(4) use a single lock type for tracking so_lock.
Previously, so_lock was covered by two distinct lock types because there
were separate rw_init() initializers in socreate() and sonewconn().

OK kettenis@

Revision 1.267 / (download) - annotate - [select for diffs], Sun Oct 24 07:02:47 2021 UTC (2 years, 7 months ago) by visa
Branch: MAIN
Changes since 1.266: +45 -13 lines
Diff to previous 1.266 (colored)

Set klist lock for sockets to make socket event filters MP-safe

The filterops instances already provide f_modify and f_process
callbacks with proper internal locking. Locking of socket klists
has been the missing detail for MP-safety.

OK mpi@

Revision 1.266 / (download) - annotate - [select for diffs], Fri Oct 22 15:11:32 2021 UTC (2 years, 7 months ago) by mpi
Branch: MAIN
Changes since 1.265: +80 -11 lines
Diff to previous 1.265 (colored)

Make EVFILT_EXCEPT handling separate from the read filter.

This is a change of behavior and events won't be generated if there
is something to read on the fd.  Only EV_EOF or NOTE_OOB will now
be reported.

While here, add a new filter for FIFOs supporting EV_EOF and __EV_HUP.

ok visa@

Revision 1.265 / (download) - annotate - [select for diffs], Thu Oct 14 23:05:10 2021 UTC (2 years, 7 months ago) by mvs
Branch: MAIN
Changes since 1.264: +3 -1 lines
Diff to previous 1.264 (colored)

Release solock() before calling unp_externalize().

A little step forward to make UNIX domain sockets locking fine grained.
The closest goal is to introduce the new rwlock(9) and use it to protect
garbage collector data. This leaves the existing `unp_lock' rwlock(9),
which covers the whole layer, for per-socket data only and allows
replacing it with the per-socket `so_lock' in further diffs.

Except for the file descriptor table, unp_externalize() operates on the
garbage collector data only, such as `unp_rights' and `unp_msgcount'
directly and `unp_deferred' through unp_discard(). I want to introduce the
new garbage collector rwlock(9) with a separate diff, so `unp_lock' is
still taken within unp_externalize() around garbage collector data access.
But right now the M_WAITOK allocation is no longer done with the rwlock(9)
held. Also the useless M_WAITOK allocation and fdplock()/fdpunlock() dances
are removed from the error path. The `unp_lock' and fdplock() are not
taken together within unp_externalize(), but unp_internalize() still
enforces the `unp_lock' -> fdplock() lock order. This remains the only
such place and will be changed with the upcoming unp_internalize() and
garbage collector rwlock(9) diffs.

ok bluhm@

Revision 1.264 / (download) - annotate - [select for diffs], Mon Jul 26 05:51:13 2021 UTC (2 years, 10 months ago) by mpi
Branch: MAIN
CVS Tags: OPENBSD_7_0_BASE, OPENBSD_7_0
Changes since 1.263: +10 -10 lines
Diff to previous 1.263 (colored)

Pass a socket pointer to various socket buffer routines in preparation for
per-socket locking.

No functional change.

Revision 1.263 / (download) - annotate - [select for diffs], Fri May 28 16:24:53 2021 UTC (3 years ago) by visa
Branch: MAIN
Changes since 1.262: +147 -21 lines
Diff to previous 1.262 (colored)

Add f_modify and f_process callbacks to socket filterops.

This makes kqueue use the extended callback interface with socket event
filters. Now one level of nested kernel locking is avoided, and the
callbacks run without splhigh().

The filterops no longer check NOTE_SUBMIT, and use a fixed locking
pattern instead. The f_event routines are always called with solock(),
whereas f_modify and f_process are always called without the lock.

OK mpi@

Revision 1.262 / (download) - annotate - [select for diffs], Tue May 25 22:45:09 2021 UTC (3 years ago) by bluhm
Branch: MAIN
Changes since 1.261: +5 -3 lines
Diff to previous 1.261 (colored)

As network features are not added dynamically, the domain structures
are constant.  Having more const makes MP review easier.  More
pointers are mapped read-only in the kernel image.
OK deraadt@ mvs@

Revision 1.261 / (download) - annotate - [select for diffs], Thu May 13 19:43:11 2021 UTC (3 years ago) by mvs
Branch: MAIN
Changes since 1.260: +2 -2 lines
Diff to previous 1.260 (colored)

Do `so_rcv' cleanup with sblock() held.

solock() should be taken before sblock(). soreceive() grabs solock() and
then locks `so_rcv'. But later it releases solock() before calling
uimove(9). So a concurrent thread which performs soshutdown() could break
the soreceive() loop. But `so_rcv' is still locked by sblock(), so this
soshutdown() thread will sleep in sorflush() at the sblock() call. The
soshutdown() thread doesn't release solock() after the sblock() call, so
it does not matter where `so_rcv' is released - it will stay locked until
the solock() release.

That's why this strange looking code works fine. Moving sbunlock() to
just after the `so_rcv' cleanup affects nothing but makes the code
consistent and easier to understand.

ok mpi@

Revision 1.260 / (download) - annotate - [select for diffs], Thu May 13 18:06:54 2021 UTC (3 years ago) by mvs
Branch: MAIN
Changes since 1.259: +3 -3 lines
Diff to previous 1.259 (colored)

Use NULL instead of 0 for mbuf(9) pointers.

ok millert@

Revision 1.259 / (download) - annotate - [select for diffs], Sat May 1 16:13:12 2021 UTC (3 years, 1 month ago) by mvs
Branch: MAIN
Changes since 1.258: +2 -1 lines
Diff to previous 1.258 (colored)

Implement per-socket `so_lock' rwlock(9) and use it to protect routing
(PF_ROUTE) sockets. This can be done because we have no cases where one
thread should lock two sockets simultaneously.

Compared to the previous version, rtm_senddesync_timer() execution was
moved to process context.

For now `so_lock' is used for routing sockets only, but in the future
it will be used for other socket types too.

tested by claudio@

ok claudio@ bluhm@

Revision 1.258 / (download) - annotate - [select for diffs], Mon Apr 26 08:21:35 2021 UTC (3 years, 1 month ago) by claudio
Branch: MAIN
Changes since 1.257: +1 -2 lines
Diff to previous 1.257 (colored)

Revert per-socket `so_lock' rwlock(9) and use it to protect routing
(PF_ROUTE) sockets. There is a locking issue with timeouts that needs
to be fixed.
Requested by deraadt@

Revision 1.257 / (download) - annotate - [select for diffs], Sun Apr 25 00:00:34 2021 UTC (3 years, 1 month ago) by mvs
Branch: MAIN
Changes since 1.256: +2 -1 lines
Diff to previous 1.256 (colored)

Implement per-socket `so_lock' rwlock(9) and use it to protect routing
(PF_ROUTE) sockets. This can be done because we have no cases where one
thread should lock two sockets simultaneously.

For now `so_lock' is used for routing sockets only, but in the future
it will be used for other socket types too.

ok bluhm@

Revision 1.256 / (download) - annotate - [select for diffs], Wed Feb 24 13:19:48 2021 UTC (3 years, 3 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_6_9_BASE, OPENBSD_6_9
Changes since 1.255: +2 -3 lines
Diff to previous 1.255 (colored)

In sorflush() use m_purge() instead of handrolling it.
no objections mvs@

Revision 1.255 / (download) - annotate - [select for diffs], Thu Feb 18 11:40:19 2021 UTC (3 years, 3 months ago) by mvs
Branch: MAIN
Changes since 1.254: +7 -6 lines
Diff to previous 1.254 (colored)

Release mbuf(9) chain with a simple m_freem(9) loop in sorflush().
Passing a local copy of the socket to sbrelease() is too complicated just
to free the receive buffer. We don't allocate a large object on the stack.
Also we don't pass an unlocked socket to soassertlocked() within sbdrop().
This was not triggered because we lock the whole layer with one lock.

Also sorflush() is now private to kern/uipc_socket.c, so its definition
was adjusted accordingly.

ok claudio@ mpi@

Revision 1.254 / (download) - annotate - [select for diffs], Sun Jan 17 05:23:34 2021 UTC (3 years, 4 months ago) by visa
Branch: MAIN
Changes since 1.253: +1 -7 lines
Diff to previous 1.253 (colored)

Replace SB_KNOTE and sb_flagsintr with direct checking of klist.

OK mpi@ as part of a larger diff

Revision 1.253 / (download) - annotate - [select for diffs], Sat Jan 9 15:30:38 2021 UTC (3 years, 5 months ago) by bluhm
Branch: MAIN
Changes since 1.252: +2 -3 lines
Diff to previous 1.252 (colored)

If the loop check in somove(9) goes to release without setting an
error, a broadcast mbuf will stay in the socket buffer forever.
This is bad as multiple mbufs can use up all the space.  Better
report ELOOP, dissolve splicing, and let userland handle it.
OK anton@

Revision 1.252 / (download) - annotate - [select for diffs], Fri Dec 25 12:59:52 2020 UTC (3 years, 5 months ago) by visa
Branch: MAIN
Changes since 1.251: +4 -4 lines
Diff to previous 1.251 (colored)

Refactor klist insertion and removal

Rename klist_{insert,remove}() to klist_{insert,remove}_locked().
These functions assume that the caller has locked the klist. The current
state of locking remains intact because the kernel lock is still used
with all klists.

Add new functions klist_insert() and klist_remove() that lock the klist
internally. This allows some code simplification.

OK mpi@

Revision 1.251 / (download) - annotate - [select for diffs], Sat Dec 12 11:48:54 2020 UTC (3 years, 5 months ago) by jan
Branch: MAIN
Changes since 1.250: +3 -3 lines
Diff to previous 1.250 (colored)

Rename the macro MCLGETI to MCLGETL and remove the dead parameter ifp.

OK dlg@, bluhm@
No Opinion mpi@
Not against it claudio@

Revision 1.250 / (download) - annotate - [select for diffs], Tue Nov 17 14:45:42 2020 UTC (3 years, 6 months ago) by claudio
Branch: MAIN
Changes since 1.249: +7 -6 lines
Diff to previous 1.249 (colored)

Fix handling of MSG_PEEK in soreceive() for the case where an empty
mbuf is encountered in a seqpacket socket.

This diff uses the fact that setting orig_resid to 0 makes soreceive()
return instead of looping back with the intent to sleep for more data.
orig_resid is now always set to 0 in the control message case (instead of
only if controlp is defined). This is the same behaviour as for the PR_NAME
case.  Additionally orig_resid is set to 0 in the data reader when MSG_PEEK
is used.

Tested in snaps for a while and by anton@

Reported-by: syzbot+4b0e9698b344b0028b14@syzkaller.appspotmail.com

Revision 1.249 / (download) - annotate - [select for diffs], Tue Sep 29 11:48:54 2020 UTC (3 years, 8 months ago) by claudio
Branch: MAIN
Changes since 1.248: +5 -7 lines
Diff to previous 1.248 (colored)

Move the solock() call outside of solisten(). The reason is that the
so_state and splice checks were done without the proper lock which is
incorrect. This is similar to sobind() and soconnect(), which also require
the caller to hold the socket lock.
Found by, with and OK mvs@, OK mpi@

Revision 1.248 / (download) - annotate - [select for diffs], Fri Aug 7 14:35:38 2020 UTC (3 years, 10 months ago) by cheloha
Branch: MAIN
CVS Tags: OPENBSD_6_8_BASE, OPENBSD_6_8
Changes since 1.247: +3 -2 lines
Diff to previous 1.247 (colored)

sosplice(9): fully validate idle timeout

The socket splice idle timeout is a timeval, so we need to check that
tv_usec is both non-negative and less than one million.  Otherwise it
isn't in canonical form.

We can check for this with timerisvalid(3).

benno@ says this shouldn't break anything in base.

ok benno@, bluhm@

Revision 1.247 / (download) - annotate - [select for diffs], Mon Jun 22 13:14:32 2020 UTC (3 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.246: +19 -3 lines
Diff to previous 1.246 (colored)

Extend kqueue interface with EVFILT_EXCEPT filter.

This filter, already implemented in macOS and Dragonfly BSD, returns
exceptional conditions like the reception of out-of-band data.

The functionality is similar to poll(2)'s POLLPRI & POLLRDBAND and
it can be used by the kqfilter-based poll & select implementation.

ok millert@ on a previous version, ok visa@

Revision 1.246 / (download) - annotate - [select for diffs], Thu Jun 18 14:05:21 2020 UTC (3 years, 11 months ago) by mvs
Branch: MAIN
Changes since 1.245: +5 -5 lines
Diff to previous 1.245 (colored)

Compare `so' and `sosp' types just after obtaining `sosp'. We can't splice
sockets from different domains so there is no reason to have locking and memory
allocation in this error path. Also in this case only `so' will be locked by
solock() so we should avoid `sosp' modification.

ok mpi@

Revision 1.245 / (download) - annotate - [select for diffs], Mon Jun 15 15:29:40 2020 UTC (3 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.244: +9 -1 lines
Diff to previous 1.244 (colored)

Set __EV_HUP when the conditions matching poll(2)'s POLLHUP are found.

This is only done in poll-compatibility mode, when __EV_POLL is set.

ok visa@, millert@

Revision 1.244 / (download) - annotate - [select for diffs], Sun Apr 12 16:15:18 2020 UTC (4 years, 1 month ago) by anton
Branch: MAIN
CVS Tags: OPENBSD_6_7_BASE, OPENBSD_6_7
Changes since 1.243: +9 -1 lines
Diff to previous 1.243 (colored)

In sosplice(), temporarily release the socket lock before calling
FRELE() as the last reference could be dropped which in turn will cause
soclose() to be called where the socket lock is unconditionally
acquired. Note that this is only a problem for sockets protected by the
non-recursive NET_LOCK() right now.

ok mpi@ visa@

Reported-by: syzbot+7c805a09545d997b924d@syzkaller.appspotmail.com

Revision 1.243 / (download) - annotate - [select for diffs], Tue Apr 7 13:27:51 2020 UTC (4 years, 2 months ago) by visa
Branch: MAIN
Changes since 1.242: +6 -6 lines
Diff to previous 1.242 (colored)

Abstract the head of knote lists. This allows extending the lists,
for example, with locking assertions.

OK mpi@, anton@

Revision 1.231.2.1 / (download) - annotate - [select for diffs], Thu Mar 12 19:33:37 2020 UTC (4 years, 2 months ago) by tb
Branch: OPENBSD_6_5
Changes since 1.231: +9 -3 lines
Diff to previous 1.231 (colored) next main 1.232 (colored)

Fix unlimited recursion caused by local outbound bcast/mcast packet
sent via spliced socket.

Reported-by: syzbot+2f9616f39d3f3b281cfb@syzkaller.appspotmail.com

OK bluhm@

OpenBSD 6.5 errata 033 (6.5/033_sosplice.patch.sig)

Revision 1.234.2.1 / (download) - annotate - [select for diffs], Thu Mar 12 19:33:35 2020 UTC (4 years, 2 months ago) by tb
Branch: OPENBSD_6_6
Changes since 1.234: +9 -3 lines
Diff to previous 1.234 (colored) next main 1.235 (colored)

Fix unlimited recursion caused by local outbound bcast/mcast packet
sent via spliced socket.

Reported-by: syzbot+2f9616f39d3f3b281cfb@syzkaller.appspotmail.com

OK bluhm@

OpenBSD 6.6 errata 023 (6.6/023_sosplice.patch.sig)

Revision 1.242 / (download) - annotate - [select for diffs], Wed Mar 11 22:21:28 2020 UTC (4 years, 3 months ago) by sashan
Branch: MAIN
Changes since 1.241: +9 -3 lines
Diff to previous 1.241 (colored)

Fix unlimited recursion caused by local outbound bcast/mcast packet
sent via spliced socket.

Reported-by: syzbot+2f9616f39d3f3b281cfb@syzkaller.appspotmail.com

OK bluhm@

Revision 1.241 / (download) - annotate - [select for diffs], Thu Feb 20 16:56:52 2020 UTC (4 years, 3 months ago) by visa
Branch: MAIN
Changes since 1.240: +4 -4 lines
Diff to previous 1.240 (colored)

Replace field f_isfd with field f_flags in struct filterops to allow
adding more filter properties without cluttering the struct.

OK mpi@, anton@

Revision 1.240 / (download) - annotate - [select for diffs], Fri Feb 14 14:32:44 2020 UTC (4 years, 3 months ago) by mpi
Branch: MAIN
Changes since 1.239: +1 -3 lines
Diff to previous 1.239 (colored)

Push the KERNEL_LOCK() inside pgsigio() and selwakeup().

The 3 subsystems: signal, poll/select and kqueue can now be addressed
separately.

Note that bpf(4) and audio(4) currently delay the wakeups to a separate
context in order to respect the KERNEL_LOCK() requirement.  Sockets (UDP,
TCP) and pipes spin to grab the lock for the same reasons.

ok anton@, visa@

Revision 1.239 / (download) - annotate - [select for diffs], Wed Jan 15 13:17:35 2020 UTC (4 years, 4 months ago) by mpi
Branch: MAIN
Changes since 1.238: +30 -14 lines
Diff to previous 1.238 (colored)

Keep socket timeout intervals in nsecs and use them with tsleep_nsec(9).

Introduce and use TIMEVAL_TO_NSEC() to convert SO_RCVTIMEO/SO_SNDTIMEO
specified values into nanoseconds.  As a side effect it is now possible
to specify a timeout larger than (USHRT_MAX / 100) seconds.
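
The conversion described above can be sketched as follows. tv_to_nsec()
is a hypothetical helper, not the kernel's TIMEVAL_TO_NSEC(), and the
saturation to UINT64_MAX on overflow is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/time.h>

/*
 * Hypothetical helper: convert a canonical, non-negative timeval into
 * nanoseconds. Values too large for 64 bits saturate to UINT64_MAX
 * (an assumption of this sketch).
 */
static uint64_t
tv_to_nsec(const struct timeval *tv)
{
	assert(tv->tv_sec >= 0);
	assert(tv->tv_usec >= 0 && tv->tv_usec < 1000000);

	/* Largest tv_sec that cannot overflow once tv_usec is added. */
	if ((uint64_t)tv->tv_sec >
	    (UINT64_MAX - 999999000ULL) / 1000000000ULL)
		return UINT64_MAX;

	return (uint64_t)tv->tv_sec * 1000000000ULL +
	    (uint64_t)tv->tv_usec * 1000ULL;
}
```

For example, a timeout of 700.5 seconds becomes 700500000000 ns, a
value that fits easily in 64 bits.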

To keep code simple `so_linger' now represents a number of seconds with
0 meaning no timeout or 'infinity'.

Yes, the 0 -> INFSLP API change makes conversions complicated as many
timeout holders are still memset()'d.

Inputs from cheloha@ and bluhm@, ok bluhm@

Revision 1.238 / (download) - annotate - [select for diffs], Tue Dec 31 13:48:32 2019 UTC (4 years, 5 months ago) by visa
Branch: MAIN
Changes since 1.237: +21 -7 lines
Diff to previous 1.237 (colored)

Use C99 designated initializers with struct filterops. In addition,
make the structs const so that the data are put in .rodata.
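
A minimal sketch of the pattern, using a hypothetical `struct demo_ops'
rather than the real struct filterops, whose members are not shown
here. Members left out of a designated initializer are zero-initialized,
and a const object with static storage can be placed in read-only data
by the toolchain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operations table, for illustration only. */
struct demo_ops {
	int	flags;
	int	(*attach)(int);
	void	(*detach)(int);
	int	(*event)(int, long);
};

static int
demo_attach(int fd)
{
	return fd >= 0 ? 0 : -1;
}

static int
demo_event(int fd, long hint)
{
	(void)fd;
	return hint != 0;
}

/*
 * C99 designated initializers name each member explicitly, so the
 * table stays correct if members are reordered. `detach' is omitted
 * and therefore NULL.
 */
static const struct demo_ops demo_read_ops = {
	.flags	= 1,
	.attach	= demo_attach,
	.event	= demo_event,
};

/* Call the event handler of a table; the handler must be set. */
static int
demo_dispatch(const struct demo_ops *ops, int fd, long hint)
{
	assert(ops != NULL && ops->event != NULL);
	return ops->event(fd, hint);
}
```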

OK mpi@, deraadt@, anton@, bluhm@

Revision 1.237 / (download) - annotate - [select for diffs], Thu Dec 12 16:33:02 2019 UTC (4 years, 5 months ago) by visa
Branch: MAIN
Changes since 1.236: +16 -3 lines
Diff to previous 1.236 (colored)

Reintroduce socket locking inside socket event filters.

Tested by anton@, sashan@
OK mpi@, anton@, sashan@

Revision 1.236 / (download) - annotate - [select for diffs], Mon Dec 2 21:47:54 2019 UTC (4 years, 6 months ago) by cheloha
Branch: MAIN
Changes since 1.235: +2 -3 lines
Diff to previous 1.235 (colored)

Revert "timeout(9): switch to tickless backend"

It appears to have caused major performance regressions all over the
network stack.

Reported by bluhm@

ok deraadt@

Revision 1.235 / (download) - annotate - [select for diffs], Tue Nov 26 15:27:08 2019 UTC (4 years, 6 months ago) by cheloha
Branch: MAIN
Changes since 1.234: +4 -3 lines
Diff to previous 1.234 (colored)

timeout(9): switch to tickless backend

Rebase the timeout wheel on the system uptime clock.  Timeouts are now
set to run at or after an absolute time as returned by nanouptime(9).
Timeouts are thus "tickless": they expire at a real time on that clock
instead of at a particular value of the global "ticks" variable.

To facilitate this change the timeout struct's .to_time member becomes a
timespec.  Hashing timeouts into a bucket on the wheel changes slightly:
we build a 32-bit hash with 25 bits of seconds (.tv_sec) and 7 bits of
subseconds (.tv_nsec).  7 bits of subseconds means the width of the
lowest wheel level is now 2 seconds on all platforms and each bucket in
that lowest level corresponds to 1/128 seconds on the uptime clock.
These values were chosen to closely align with the current 100hz
hardclock(9) typical on almost all of our platforms.  At 100hz a bucket
is currently ~1/100 seconds wide on the lowest level and the lowest
level itself is ~2.56 seconds wide.  Not a huge change, but a change
nonetheless.
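
The hash layout described above can be sketched as follows.
timespec_hash() is a hypothetical helper and the exact bit placement is
an assumption of this sketch; only the 25/7 split comes from the
message. The 2 second width of the lowest level is consistent with 256
buckets per level (256 / 128 = 2, and 256 / 100 = 2.56 at 100hz), but
that bucket count is inferred from the numbers above rather than stated.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define SEC_BITS	25		/* bits of seconds kept */
#define SUBSEC_BITS	7		/* bits of subseconds kept */
#define NSEC_PER_SEC	1000000000L

/*
 * Hypothetical helper: build a 32-bit hash from the low 25 bits of
 * tv_sec and 7 bits of subseconds, so that one hash step corresponds
 * to 1/128 of a second.
 */
static uint32_t
timespec_hash(const struct timespec *ts)
{
	uint32_t sec, sub;

	assert(ts->tv_nsec >= 0 && ts->tv_nsec < NSEC_PER_SEC);

	sec = (uint32_t)ts->tv_sec & ((1u << SEC_BITS) - 1);
	/* Scale nanoseconds to the range 0..127. */
	sub = (uint32_t)(((uint64_t)ts->tv_nsec << SUBSEC_BITS) /
	    NSEC_PER_SEC);

	return (sec << SUBSEC_BITS) | sub;
}
```

With this layout, one full second advances the hash by 128 and half a
second advances it by 64.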

Because a bucket no longer corresponds to a single tick more than one
bucket may be dumped during an average timeout_hardclock_update() call.
On 100hz platforms you now dump ~2 buckets.  On 64hz machines (sh) you
dump ~4 buckets.  On 1024hz machines (alpha) you dump only 1 bucket,
but you are doing extra work in softclock() to reschedule timeouts
that aren't due yet.

To avoid changing current behavior all timeout_add*(9) interfaces
convert their timeout interval into ticks, compute an equivalent
timespec interval, and then add that interval to the timestamp of
the most recent timeout_hardclock_update() call to determine an
absolute deadline.  So all current timeouts still "use" ticks,
but the ticks are faked in the timeout layer.

A new interface, timeout_at_ts(9), is introduced here to bypass this
backwardly compatible behavior.  It will be used in subsequent diffs
to add absolute timeout support for userland and to clean up some of
the messier parts of kernel timekeeping, especially at the syscall
layer.

Because timeouts are based against the uptime clock they are subject to
NTP adjustment via adjtime(2) and adjfreq(2).  Unless you have a crazy
adjfreq(2) adjustment set this will not change the expiration behavior
of your timeouts.

Tons of design feedback from mpi@, visa@, guenther@, and kettenis@.
Additional amd64 testing from anton@ and visa@.  Octeon testing from visa@.
macppc testing from me.

Positive feedback from deraadt@, ok visa@

Revision 1.234 / (download) - annotate - [select for diffs], Mon Jul 22 15:34:07 2019 UTC (4 years, 10 months ago) by robert
Branch: MAIN
CVS Tags: OPENBSD_6_6_BASE
Branch point for: OPENBSD_6_6
Changes since 1.233: +9 -1 lines
Diff to previous 1.233 (colored)

implement SO_DOMAIN and SO_PROTOCOL so that the domain and the protocol
can also be retrieved with getsockopt(2)
it looks like these will also be in the next issue of posix:
http://austingroupbugs.net/view.php?id=840#c2263

ok claudio@, sthen@

Revision 1.233 / (download) - annotate - [select for diffs], Thu Jul 11 11:57:35 2019 UTC (4 years, 11 months ago) by bluhm
Branch: MAIN
Changes since 1.232: +3 -4 lines
Diff to previous 1.232 (colored)

listen(2) should return EINVAL if the socket is connected.
This behavior matches NetBSD, POSIX, and our own man page.
Fix whitespace while here.
from Moritz Buhl; OK millert@

Revision 1.232 / (download) - annotate - [select for diffs], Thu Jul 4 17:42:17 2019 UTC (4 years, 11 months ago) by bluhm
Branch: MAIN
Changes since 1.231: +27 -11 lines
Diff to previous 1.231 (colored)

Remove a useless kernel lock from the TCP socket splicing path.
When send buffer space in the drain socket becomes available, a
task is added to move data, and the userland was also informed.
The latter is not useful as this would mix a kernel and user stream.
So programs do not wait for this event.  Avoid calling sowakeup()
from sowwakeup(), this also reduces grabbing the kernel lock.  Instead
inform the userland about the write event when the splicing is
dissolved in sounsplice().
OK claudio@

Revision 1.227.2.2 / (download) - annotate - [select for diffs], Tue Dec 18 15:10:14 2018 UTC (5 years, 5 months ago) by bluhm
Branch: OPENBSD_6_4
Changes since 1.227.2.1: +11 -3 lines
Diff to previous 1.227.2.1 (colored) to branchpoint 1.227 (colored) next main 1.228 (colored)

When using MSG_WAITALL, soreceive() can sleep while processing the
receive buffer of a stream socket.  Then a new pair of control and
data mbuf can be appended to the mbuf queue.  In this case, terminate
the loop with a short read to prevent a panic.  Userland should
read the control message with the next system call.
found by Greg Steuck; OK claudio@ deraadt@
Reported-by: syzbot+613db18acc3d2149ab94@syzkaller.appspotmail.com

OpenBSD 6.4 errata 009

Revision 1.218.2.2 / (download) - annotate - [select for diffs], Tue Dec 18 15:09:12 2018 UTC (5 years, 5 months ago) by bluhm
Branch: OPENBSD_6_3
Changes since 1.218.2.1: +11 -3 lines
Diff to previous 1.218.2.1 (colored) to branchpoint 1.218 (colored) next main 1.219 (colored)

When using MSG_WAITALL, soreceive() can sleep while processing the
receive buffer of a stream socket.  Then a new pair of control and
data mbuf can be appended to the mbuf queue.  In this case, terminate
the loop with a short read to prevent a panic.  Userland should
read the control message with the next system call.
found by Greg Steuck; OK claudio@ deraadt@
Reported-by: syzbot+613db18acc3d2149ab94@syzkaller.appspotmail.com

OpenBSD 6.3 errata 026

Revision 1.231 / (download) - annotate - [select for diffs], Mon Dec 17 16:46:59 2018 UTC (5 years, 5 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_6_5_BASE
Branch point for: OPENBSD_6_5
Changes since 1.230: +11 -3 lines
Diff to previous 1.230 (colored)

When using MSG_WAITALL, soreceive() can sleep while processing the
receive buffer of a stream socket.  Then a new pair of control and
data mbuf can be appended to the mbuf queue.  In this case, terminate
the loop with a short read to prevent a panic.  Userland should
read the control message with the next system call.
OK claudio@ deraadt@

Revision 1.230 / (download) - annotate - [select for diffs], Fri Nov 30 09:23:31 2018 UTC (5 years, 6 months ago) by claudio
Branch: MAIN
Changes since 1.229: +2 -2 lines
Diff to previous 1.229 (colored)

Trivial MH_ALIGN/M_ALIGN to m_align conversions.
OK bluhm@

Revision 1.218.2.1 / (download) - annotate - [select for diffs], Thu Nov 29 17:05:00 2018 UTC (5 years, 6 months ago) by bluhm
Branch: OPENBSD_6_3
Changes since 1.218: +12 -10 lines
Diff to previous 1.218 (colored)

When using MSG_PEEK to peek into packets, skip control messages holding
SCM_RIGHTS instead of sending them to userland, since they hold kernel
internal data and it does not make sense to externalize it.
In unp_internalize() check the length more carefully, preventing an
underflow in a later calculation. Use the same CMSG_LEN(0) check
that other cmsghdr handlers implement.
from claudio@

OpenBSD 6.3 errata 025
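
The length check can be illustrated in userland.  This is a sketch of
the idea only, not the code in unp_internalize(); cmsg_payload_len()
is a hypothetical name.  Without the comparison against CMSG_LEN(0),
the unsigned subtraction would wrap around for a too-short header.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper: compute the payload length of a control
 * message.  Returns 0 and stores the length in *out, or EINVAL if
 * cmsg_len is smaller than the header itself.
 */
int
cmsg_payload_len(const struct cmsghdr *cm, size_t *out)
{
	assert(cm != NULL && out != NULL);
	/* Reject lengths below the header size before subtracting. */
	if ((size_t)cm->cmsg_len < (size_t)CMSG_LEN(0))
		return EINVAL;
	*out = (size_t)cm->cmsg_len - (size_t)CMSG_LEN(0);
	return 0;
}
```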

Revision 1.227.2.1 / (download) - annotate - [select for diffs], Thu Nov 29 17:02:22 2018 UTC (5 years, 6 months ago) by bluhm
Branch: OPENBSD_6_4
Changes since 1.227: +12 -10 lines
Diff to previous 1.227 (colored)

When using MSG_PEEK to peek into packets, skip control messages holding
SCM_RIGHTS instead of sending them to userland, since they hold kernel
internal data and it does not make sense to externalize it.
In unp_internalize() check the length more carefully, preventing an
underflow in a later calculation.  Use the same CMSG_LEN(0) check
that other cmsghdr handlers implement.
from claudio@

OpenBSD 6.4 errata 006

Revision 1.229 / (download) - annotate - [select for diffs], Wed Nov 21 16:50:49 2018 UTC (5 years, 6 months ago) by claudio
Branch: MAIN
Changes since 1.228: +12 -10 lines
Diff to previous 1.228 (colored)

When using MSG_PEEK to peek into packets, skip control messages holding
SCM_RIGHTS instead of sending them to userland, since they hold kernel
internal data and it does not make sense to externalize it.
OK deraadt@, guenther@, visa@
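
The rule can be modelled as a small predicate.  This is a simplified
sketch, assuming the decision depends only on the receive flags and the
control message header; the kernel operates on mbufs, and
control_visible() is a hypothetical name.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: return 1 if a control message should be copied
 * out to userland, 0 if it should be skipped.  SCM_RIGHTS messages
 * are skipped while peeking.
 */
int
control_visible(const struct cmsghdr *cm, int flags)
{
	assert(cm != NULL);
	if ((flags & MSG_PEEK) && cm->cmsg_level == SOL_SOCKET &&
	    cm->cmsg_type == SCM_RIGHTS)
		return 0;
	return 1;
}
```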

Revision 1.228 / (download) - annotate - [select for diffs], Mon Nov 19 13:15:37 2018 UTC (5 years, 6 months ago) by visa
Branch: MAIN
Changes since 1.227: +7 -5 lines
Diff to previous 1.227 (colored)

Utilize sigio with sockets.

OK mpi@

Revision 1.227 / (download) - annotate - [select for diffs], Tue Aug 21 12:34:11 2018 UTC (5 years, 9 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_6_4_BASE
Branch point for: OPENBSD_6_4
Changes since 1.226: +9 -5 lines
Diff to previous 1.226 (colored)

If the control message of IP_SENDSRCADDR did not fit into the socket
buffer together with a UDP packet, sosend(9) returned EWOULDBLOCK.
As it is a persistent problem, EMSGSIZE is the correct error code.
Split the AF_UNIX case into a separate condition and do not change
its logic.  For atomic protocols, check that both data and control
message length fit into the socket buffer.
original bug report from Alexander Markert
discussed with jca@; OK vgross@
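
The distinction between the two error codes can be sketched as below.
This is a simplified model, not the sosend() code: the function name
and its parameters are assumptions, and the AF_UNIX special case is
left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical model for an atomic (datagram) protocol.
 *   resid - bytes of data to send
 *   clen  - bytes of control message
 *   hiwat - total capacity of the send buffer
 *   space - bytes currently free in the send buffer
 * Returns 0 if the message can be queued now, EMSGSIZE if it can
 * never fit, or EWOULDBLOCK if it only has to wait for space.
 */
int
atomic_send_check(size_t resid, size_t clen, size_t hiwat, size_t space)
{
	assert(space <= hiwat);
	/* Written as subtractions so that the sum cannot overflow. */
	if (resid > hiwat || clen > hiwat - resid)
		return EMSGSIZE;	/* persistent: retrying cannot help */
	if (clen > space || resid > space - clen)
		return EWOULDBLOCK;	/* transient: space may free up */
	return 0;
}
```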

Revision 1.226 / (download) - annotate - [select for diffs], Mon Jul 30 12:22:14 2018 UTC (5 years, 10 months ago) by mpi
Branch: MAIN
Changes since 1.225: +7 -11 lines
Diff to previous 1.225 (colored)

Use FNONBLOCK instead of SS_NBIO to check/indicate that the I/O mode
for sockets is non-blocking.

This allows us to G/C SS_NBIO.  Having to keep the two flags in sync
in a mp-safe way is complicated.

This change introduces a behavior change in sosplice(): it can now
always block.  However, this should not matter much because the socket
lock is taken beforehand.

ok bluhm@, benno@, visa@

Revision 1.225 / (download) - annotate - [select for diffs], Thu Jul 5 14:45:07 2018 UTC (5 years, 11 months ago) by visa
Branch: MAIN
Changes since 1.224: +16 -4 lines
Diff to previous 1.224 (colored)

Serialize the sosplice taskq allocation. This prevents an unlikely
duplicate allocation that could happen in the future when each socket
has a dedicated lock. Right now, the code path is serialized also by
the NET_LOCK() (and the KERNEL_LOCK()).

OK mpi@

Revision 1.224 / (download) - annotate - [select for diffs], Thu Jun 14 08:46:09 2018 UTC (5 years, 11 months ago) by bluhm
Branch: MAIN
Changes since 1.223: +6 -4 lines
Diff to previous 1.223 (colored)

In soclose() and soaccept() convert the KASSERT(SS_NOFDREF) back
to a panic message.  The latter prints socket pointer and type to
help debugging.
OK mpi@

Revision 1.223 / (download) - annotate - [select for diffs], Wed Jun 6 06:55:22 2018 UTC (6 years ago) by mpi
Branch: MAIN
Changes since 1.222: +28 -23 lines
Diff to previous 1.222 (colored)

Pass the socket to sounlock(); this prepares the terrain for per-socket
locking.

ok visa@, bluhm@

Revision 1.222 / (download) - annotate - [select for diffs], Wed Jun 6 06:47:01 2018 UTC (6 years ago) by mpi
Branch: MAIN
Changes since 1.221: +3 -5 lines
Diff to previous 1.221 (colored)

Assert that a pfkey or routing socket is referenced by a `fp' instead
of calling sofree(), when its PCB is detached.

This is different from TCP which does not always detach `inpcb's from
sockets.  In the pfkey & routing case calling sofree() there is a noop
whereas for TCP it's needed to free closed connections.

Having fewer sofree() makes it easier to understand the code and move
the locks down.

ok visa@

Revision 1.221 / (download) - annotate - [select for diffs], Tue May 8 15:03:27 2018 UTC (6 years, 1 month ago) by bluhm
Branch: MAIN
Changes since 1.220: +20 -8 lines
Diff to previous 1.220 (colored)

Socket splicing can delay operations by task or timeout.  Introduce
soreaper() that is scheduled onto the timer thread.  soput() is
scheduled from there onto the sosplice task thread.  After that it
is safe to pool_put() the socket and splicing data structures.
OK mpi@ visa@

Revision 1.220 / (download) - annotate - [select for diffs], Sun Apr 8 18:57:39 2018 UTC (6 years, 2 months ago) by guenther
Branch: MAIN
Changes since 1.219: +4 -4 lines
Diff to previous 1.219 (colored)

AF_LOCAL was a failed attempt (by POSIX?) to seem less UNIX-specific, but
AF_UNIX is both the historical _and_ standard name, so prefer and recommend
it in the headers, manpages, and kernel.

ok miller@ deraadt@ schwarze@

Revision 1.219 / (download) - annotate - [select for diffs], Tue Mar 27 08:27:29 2018 UTC (6 years, 2 months ago) by mpi
Branch: MAIN
Changes since 1.218: +4 -5 lines
Diff to previous 1.218 (colored)

Use a goto to merge multiple error blocks in sosplice().

ok bluhm@

Revision 1.218 / (download) - annotate - [select for diffs], Thu Mar 1 14:11:11 2018 UTC (6 years, 3 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_6_3_BASE
Branch point for: OPENBSD_6_3
Changes since 1.217: +27 -4 lines
Diff to previous 1.217 (colored)

When socket splicing is involved, delay the pool_put() after the
splicing thread has finished sotask() with the socket to be freed.
Use after free reported and fix successfully tested by Rivo Nurges.
discussed with mpi@

Revision 1.217 / (download) - annotate - [select for diffs], Mon Feb 19 11:35:41 2018 UTC (6 years, 3 months ago) by mpi
Branch: MAIN
Changes since 1.216: +4 -3 lines
Diff to previous 1.216 (colored)

Grab solock() inside soconnect2() instead of asserting for it to be held.

ok millert@

Revision 1.216 / (download) - annotate - [select for diffs], Mon Feb 19 08:59:52 2018 UTC (6 years, 3 months ago) by mpi
Branch: MAIN
Changes since 1.215: +3 -3 lines
Diff to previous 1.215 (colored)

Remove almost unused `flags' argument of suser().

The account flag `ASU' will no longer be set but that makes suser()
mpsafe since it no longer messes with a per-process field.

No objection from millert@, ok tedu@, bluhm@

Revision 1.215 / (download) - annotate - [select for diffs], Wed Jan 10 18:14:34 2018 UTC (6 years, 5 months ago) by bluhm
Branch: MAIN
Changes since 1.214: +3 -2 lines
Diff to previous 1.214 (colored)

Mark sosplice task mp safe, do not grab kernel lock for tcp output.
OK mpi@

Revision 1.214 / (download) - annotate - [select for diffs], Tue Jan 9 15:14:23 2018 UTC (6 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.213: +2 -2 lines
Diff to previous 1.213 (colored)

Change `so_state' and `so_error' to unsigned int such that they can
be atomically read from any context.

ok bluhm@, visa@

Revision 1.213 / (download) - annotate - [select for diffs], Tue Jan 2 12:54:07 2018 UTC (6 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.212: +3 -7 lines
Diff to previous 1.212 (colored)

Do not memset() the whole structure in sorflush() to keep `sb_flagsintr'
untouched.

ok bluhm@, visa@

Revision 1.212 / (download) - annotate - [select for diffs], Tue Dec 19 09:29:37 2017 UTC (6 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.211: +5 -5 lines
Diff to previous 1.211 (colored)

Remove unnecessary unlock/lock dance when following a goto.

ok bluhm@

Revision 1.211 / (download) - annotate - [select for diffs], Mon Dec 18 10:07:55 2017 UTC (6 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.210: +3 -16 lines
Diff to previous 1.210 (colored)

Revert grabbing the socket lock in kqueue(2) filters.

This change exposed or created a situation where a CPU became
unresponsive while holding the KERNEL_LOCK().  This led to lockups and
even with MP_LOCKDEBUG it was not clear what happened to this CPU.

These situations have been experienced by dhill@ with dcrwallet and jcs@
with syncthing.  Both applications are written in Go and do kevent(2)
& networking across multiple threads.

Revision 1.210 / (download) - annotate - [select for diffs], Sun Dec 10 11:31:54 2017 UTC (6 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.209: +16 -15 lines
Diff to previous 1.209 (colored)

Move SB_SPLICE, SB_WAIT and SB_SEL to `sb_flags', serialized by solock().

SB_KNOTE remains the only bit set on `sb_flagsintr' as it is set/unset in
contexts related to kqueue(2) where we'd like to avoid grabbing solock().

While here add some KERNEL_LOCK()/UNLOCK() dances around selwakeup() and
csignal() to mark which remaining functions need to be addressed in the
socket layer.

ok visa@, bluhm@

Revision 1.209 / (download) - annotate - [select for diffs], Thu Nov 23 13:45:46 2017 UTC (6 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.208: +5 -5 lines
Diff to previous 1.208 (colored)

Constify protocol tables and remove an assert now that ip_deliver() is
mp-safe.

ok bluhm@, visa@

Revision 1.208 / (download) - annotate - [select for diffs], Thu Nov 23 13:42:53 2017 UTC (6 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.207: +12 -12 lines
Diff to previous 1.207 (colored)

We want `sb_flags' to be protected by the socket lock rather than the
KERNEL_LOCK(), so change asserts accordingly.

This is now possible since sblock()/sbunlock() are always called with
the socket lock held.

ok bluhm@, visa@

Revision 1.207 / (download) - annotate - [select for diffs], Sat Nov 4 14:13:53 2017 UTC (6 years, 7 months ago) by mpi
Branch: MAIN
Changes since 1.206: +16 -3 lines
Diff to previous 1.206 (colored)

Make it possible for multiple threads to enter kqueue_scan() in parallel.

This is a requirement to use a sleeping lock inside kqueue filters.
It is now possible, but not recommended, to sleep inside ``f_event''.

Threads iterating over the list of pending events are now recognizing
and skipping other threads' markers.  knote_acquire() and knote_release()
must be used to "own" a knote to make sure no other thread is sleeping
with a reference on it.

Acquire and marker logic taken from DragonFly but the KERNEL_LOCK()
is still serializing the execution of the kqueue code.

This also enables the NET_LOCK() in socket filters.

Tested by abieber@ & juanfra@, run by naddy@ in a bulk, ok visa@, bluhm@

Revision 1.206 / (download) - annotate - [select for diffs], Thu Nov 2 14:01:18 2017 UTC (6 years, 7 months ago) by florian
Branch: MAIN
Changes since 1.205: +4 -3 lines
Diff to previous 1.205 (colored)

Move PRU_DETACH out of pr_usrreq into per proto pr_detach
functions to pave way for more fine grained locking.

Suggested by, comments & OK mpi

Revision 1.205 / (download) - annotate - [select for diffs], Fri Sep 15 19:29:28 2017 UTC (6 years, 8 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_6_2_BASE, OPENBSD_6_2
Changes since 1.204: +3 -3 lines
Diff to previous 1.204 (colored)

Coverity complains that top == NULL was checked and further down
top->m_pkthdr.len was accessed without check.  See CID 1452933.
In fact top cannot be NULL there and the condition was always false.
m_getuio() never reserved space for the header.  The correct
check is m == top to find the first mbuf.
OK visa@

Revision 1.204 / (download) - annotate - [select for diffs], Mon Sep 11 11:15:52 2017 UTC (6 years, 9 months ago) by bluhm
Branch: MAIN
Changes since 1.203: +5 -2 lines
Diff to previous 1.203 (colored)

Coverity complains that the return value of sblock() is not checked
in sorflush(), but in other places it is.  See CID 1453099.  The
flags SB_NOINTR and M_WAITOK should avoid failure.  Put an assert
there to be sure.
OK visa@ mpi@

Revision 1.203 / (download) - annotate - [select for diffs], Fri Sep 1 15:05:31 2017 UTC (6 years, 9 months ago) by mpi
Branch: MAIN
Changes since 1.202: +28 -53 lines
Diff to previous 1.202 (colored)

Change sosetopt() to no longer free the mbuf it receives and change
all the callers to call m_freem(9).

Support from deraadt@ and tedu@, ok visa@, bluhm@

Revision 1.202 / (download) - annotate - [select for diffs], Tue Aug 22 09:13:36 2017 UTC (6 years, 9 months ago) by mpi
Branch: MAIN
Changes since 1.201: +4 -17 lines
Diff to previous 1.201 (colored)

Make sogetopt(9) caller responsible for allocating an MT_SOOPTS mbuf.

Move a blocking memory allocation out of the socket lock and create
a simpler alloc/free pattern to review.  Now both m_get() and m_free()
are in the same place.

Discussed with bluhm@.

Encouragements from deraadt@ and tedu@, ok kettenis@, florian@, visa@

Revision 1.201 / (download) - annotate - [select for diffs], Thu Aug 10 19:20:43 2017 UTC (6 years, 10 months ago) by mpi
Branch: MAIN
Changes since 1.200: +4 -4 lines
Diff to previous 1.200 (colored)

Move the solock()/sounlock() dance outside of sobind().

ok phessler@, visa@, bluhm@

Revision 1.200 / (download) - annotate - [select for diffs], Thu Aug 10 16:48:25 2017 UTC (6 years, 10 months ago) by bluhm
Branch: MAIN
Changes since 1.199: +6 -6 lines
Diff to previous 1.199 (colored)

The socket field so_proto can never be NULL.  Remove the checks.
OK mpi@ visa@

Revision 1.199 / (download) - annotate - [select for diffs], Wed Aug 9 14:22:58 2017 UTC (6 years, 10 months ago) by mpi
Branch: MAIN
Changes since 1.198: +10 -30 lines
Diff to previous 1.198 (colored)

Move the socket lock "above" sosetopt(), sogetopt() and sosplice().

Protect the fields modified by sosetopt() and simplify the dance
with the stars.

ok bluhm@

Revision 1.152.2.1 / (download) - annotate - [select for diffs], Wed Aug 2 22:22:55 2017 UTC (6 years, 10 months ago) by bluhm
Branch: OPENBSD_6_0
Changes since 1.152: +16 -5 lines
Diff to previous 1.152 (colored) next main 1.153 (colored)

If pool_get() sleeps while allocating additional memory for socket
splicing, another process may allocate it in the meantime.  Then
one of the splicing structures leaked in sosplice().  Recheck that
no struct sosplice exists after a potential sleep.
reported by Ilja Van Sprundel
errata 038

Revision 1.181.4.1 / (download) - annotate - [select for diffs], Tue Aug 1 22:34:18 2017 UTC (6 years, 10 months ago) by bluhm
Branch: OPENBSD_6_1
Changes since 1.181: +16 -5 lines
Diff to previous 1.181 (colored) next main 1.182 (colored)

If pool_get() sleeps while allocating additional memory for socket
splicing, another process may allocate it in the meantime.  Then
one of the splicing structures leaked in sosplice().  Recheck that
no struct sosplice exists after a potential sleep.
reported by Ilja Van Sprundel
errata 025

Revision 1.198 / (download) - annotate - [select for diffs], Thu Jul 27 12:05:36 2017 UTC (6 years, 10 months ago) by mpi
Branch: MAIN
Changes since 1.197: +2 -1 lines
Diff to previous 1.197 (colored)

Assert that the KERNEL_LOCK() is held prior to calling csignal() and
selwakeup().

ok bluhm@

Revision 1.197 / (download) - annotate - [select for diffs], Mon Jul 24 15:07:39 2017 UTC (6 years, 10 months ago) by mpi
Branch: MAIN
Changes since 1.196: +6 -7 lines
Diff to previous 1.196 (colored)

Extend the scope of the socket lock to protect `so_state' in connect(2).

As a side effect, soconnect() and soconnect2() now expect a locked socket,
so update all the callers.

ok bluhm@

Revision 1.196 / (download) - annotate - [select for diffs], Thu Jul 20 09:49:45 2017 UTC (6 years, 10 months ago) by bluhm
Branch: MAIN
Changes since 1.195: +16 -5 lines
Diff to previous 1.195 (colored)

If pool_get() sleeps while allocating additional memory for socket
splicing, another process may allocate it in the meantime.  Then
one of the splicing structures leaked in sosplice().  Recheck that
no struct sosplice exists after a potential sleep.
reported by Ilja Van Sprundel; OK mpi@
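
The recheck-after-sleep pattern can be modelled in userland.  This is a
sketch under assumptions, not the sosplice() code: the struct names,
sock_ensure_splice() and the two demo allocators are hypothetical, and
the allocator callback stands in for a pool_get() that may sleep.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct splice_state { int unused; };	/* stand-in for struct sosplice */
struct sock { struct splice_state *sp; };	/* sp is allocated lazily */

typedef struct splice_state *(*alloc_fn)(struct sock *);

/* Set by racing_alloc() to the structure the "other thread" installed. */
struct splice_state *race_winner;

/*
 * Make sure so->sp exists.  The allocator may sleep, so so->sp is
 * checked again afterwards and a duplicate is freed instead of leaked.
 * Returns 0 on success, -1 if the allocation failed.
 */
int
sock_ensure_splice(struct sock *so, alloc_fn alloc)
{
	struct splice_state *sp;

	assert(so != NULL);
	if (so->sp != NULL)
		return 0;
	sp = alloc(so);			/* may sleep */
	if (sp == NULL)
		return -1;
	if (so->sp == NULL)
		so->sp = sp;		/* nobody else allocated it */
	else
		free(sp);		/* lost the race, drop ours */
	return 0;
}

/* Demo allocator with no concurrent activity. */
struct splice_state *
plain_alloc(struct sock *so)
{
	(void)so;
	return calloc(1, sizeof(struct splice_state));
}

/* Demo allocator simulating another thread winning during the sleep. */
struct splice_state *
racing_alloc(struct sock *so)
{
	race_winner = calloc(1, sizeof(struct splice_state));
	so->sp = race_winner;
	return calloc(1, sizeof(struct splice_state));
}
```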

Revision 1.195 / (download) - annotate - [select for diffs], Thu Jul 20 08:23:43 2017 UTC (6 years, 10 months ago) by mpi
Branch: MAIN
Changes since 1.194: +14 -9 lines
Diff to previous 1.194 (colored)

Prepare filt_soread() to be locked.  No functional change.

ok bluhm@, claudio@, visa@

Revision 1.194 / (download) - annotate - [select for diffs], Thu Jul 13 16:19:38 2017 UTC (6 years, 10 months ago) by bluhm
Branch: MAIN
Changes since 1.193: +4 -3 lines
Diff to previous 1.193 (colored)

Do not unlock the netlock in the goto out error path before it has
been acquired in sosend().  Fixes a kernel lock assertion panic.
OK visa@ mpi@

Revision 1.193 / (download) - annotate - [select for diffs], Sat Jul 8 09:19:02 2017 UTC (6 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.192: +2 -11 lines
Diff to previous 1.192 (colored)

Revert grabbing the socket lock in kqueue filters.

It is unsafe to sleep while iterating the list of pending events in
kqueue_scan().

Reported by abieber@ and juanfra@

Revision 1.192 / (download) - annotate - [select for diffs], Tue Jul 4 12:58:32 2017 UTC (6 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.191: +26 -25 lines
Diff to previous 1.191 (colored)

Always hold the socket lock when calling sblock().

Implicitly protects `so_state' with the socket lock in sosend().

ok visa@, bluhm@

Revision 1.191 / (download) - annotate - [select for diffs], Mon Jul 3 08:29:24 2017 UTC (6 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.190: +23 -14 lines
Diff to previous 1.190 (colored)

Protect `so_state', `so_error' and `so_qlen' with the socket lock in
kqueue filters.

ok millert@, bluhm@, visa@

Revision 1.190 / (download) - annotate - [select for diffs], Tue Jun 27 12:02:43 2017 UTC (6 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.189: +7 -1 lines
Diff to previous 1.189 (colored)

Add missing solock()/sounlock() dances around sbreserve().

While here document an abuse of parent socket's lock.

Problem reported by krw@, analysis and ok bluhm@

Revision 1.189 / (download) - annotate - [select for diffs], Mon Jun 26 09:32:31 2017 UTC (6 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.188: +19 -13 lines
Diff to previous 1.188 (colored)

Assert that the corresponding socket is locked when manipulating socket
buffers.

This is one step towards unlocking the TCP input path.  Note that all the
functions asserting for the socket lock are not necessarily MP-safe.
All the fields of 'struct socket' aren't protected.

Introduce a new kernel-only kqueue hint, NOTE_SUBMIT, to be able to
tell when a filter needs to lock the underlying data structures.  Logic
and name taken from NetBSD.

Tested by Hrvoje Popovski.

ok claudio@, bluhm@, mikeb@

Revision 1.188 / (download) - annotate - [select for diffs], Tue Jun 20 17:13:21 2017 UTC (6 years, 11 months ago) by bluhm
Branch: MAIN
Changes since 1.187: +2 -2 lines
Diff to previous 1.187 (colored)

In ddb print socket bit field so_state in hex to match SS_ defines.

Revision 1.187 / (download) - annotate - [select for diffs], Tue Jun 20 09:10:04 2017 UTC (6 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.186: +2 -2 lines
Diff to previous 1.186 (colored)

Convert sodidle() to timeout_set_proc(9), it needs a process context
to grab the rwlock.

Problem reported by Rivo Nurges.

ok bluhm@

Revision 1.186 / (download) - annotate - [select for diffs], Wed May 31 08:55:10 2017 UTC (7 years ago) by markus
Branch: MAIN
Changes since 1.185: +4 -0 lines
Diff to previous 1.185 (colored)

new socketoption SO_ZEROIZE: zero out all mbufs sent over socket
ok deraadt bluhm

Revision 1.185 / (download) - annotate - [select for diffs], Sat May 27 18:50:53 2017 UTC (7 years ago) by claudio
Branch: MAIN
Changes since 1.184: +3 -2 lines
Diff to previous 1.184 (colored)

Push the NET_LOCK down into PF_KEY so that it can be treated like PF_ROUTE.
Only pfkeyv2_send() needs the NET_LOCK() so grab it at the start and release
at the end.  This should allow pushing the locks down in other places.
OK mpi@, bluhm@

Revision 1.184 / (download) - annotate - [select for diffs], Mon May 15 13:00:10 2017 UTC (7 years ago) by mpi
Branch: MAIN
Changes since 1.183: +4 -3 lines
Diff to previous 1.183 (colored)

so_splicelen needs to be protected by the socket lock.  We are now
safe since we're always holding the KERNEL_LOCK() but we want to move
away from that.

Suggested by and ok bluhm@

Revision 1.183 / (download) - annotate - [select for diffs], Mon May 15 12:26:00 2017 UTC (7 years ago) by mpi
Branch: MAIN
Changes since 1.182: +4 -2 lines
Diff to previous 1.182 (colored)

Enable the NET_LOCK(), take 3.

Recursions are still marked as XXXSMP.

ok deraadt@, bluhm@

Revision 1.182 / (download) - annotate - [select for diffs], Sun Apr 2 23:40:08 2017 UTC (7 years, 2 months ago) by deraadt
Branch: MAIN
Changes since 1.181: +3 -3 lines
Diff to previous 1.181 (colored)

Less convoluted code in soshutdown()
ok guenther

Revision 1.181 / (download) - annotate - [select for diffs], Fri Mar 17 17:19:16 2017 UTC (7 years, 2 months ago) by mpi
Branch: MAIN
CVS Tags: OPENBSD_6_1_BASE
Branch point for: OPENBSD_6_1
Changes since 1.180: +2 -4 lines
Diff to previous 1.180 (colored)

Revert the NET_LOCK() and bring back pf's contention lock for release.

For the moment the NET_LOCK() is always taken by threads running under
KERNEL_LOCK().  That means it doesn't buy us anything except a possible
deadlock that we did not spot.  So make sure this doesn't happen, we'll
have plenty of time in the next release cycle to stress test it.

ok visa@

Revision 1.180 / (download) - annotate - [select for diffs], Mon Mar 13 20:18:21 2017 UTC (7 years, 2 months ago) by claudio
Branch: MAIN
Changes since 1.179: +4 -4 lines
Diff to previous 1.179 (colored)

Move PRU_ATTACH out of the pr_usrreq functions into pr_attach.
Attach is quite a different thing to the other PRU functions and
this should make locking a bit simpler. This also removes the ugly
hack on how proto was passed to the attach function.
OK bluhm@ and mpi@ on a previous version

Revision 1.179 / (download) - annotate - [select for diffs], Tue Mar 7 09:23:27 2017 UTC (7 years, 3 months ago) by mpi
Branch: MAIN
Changes since 1.178: +3 -2 lines
Diff to previous 1.178 (colored)

Do not grab the NET_LOCK() for routing sockets operations.

The only function that needs the lock is rtm_output() as it messes with
the routing table.  So grab the lock there since it is safe to sleep
in a process context.

ok bluhm@

Revision 1.178 / (download) - annotate - [select for diffs], Fri Mar 3 09:41:20 2017 UTC (7 years, 3 months ago) by mpi
Branch: MAIN
Changes since 1.177: +2 -6 lines
Diff to previous 1.177 (colored)

Prevent a recursion in the socket layer.

Always defer soreceive() to an nfsd(8) process instead of doing it in
the 'softnet' thread.  Avoiding this recursion ensures that we do not
introduce a new sleeping point by releasing and grabbing the netlock.

Tested by many, committing now in order to find possible performance
regression.

Revision 1.177 / (download) - annotate - [select for diffs], Tue Feb 14 09:46:21 2017 UTC (7 years, 3 months ago) by mpi
Branch: MAIN
Changes since 1.176: +65 -66 lines
Diff to previous 1.176 (colored)

Wrap the NET_LOCK() into a per-socket solock() that does nothing for
unix domain sockets.

This should prevent the multiple deadlocks related to unix domain sockets.

Inputs from millert@ and bluhm@, ok bluhm@

Revision 1.176 / (download) - annotate - [select for diffs], Wed Feb 1 20:59:47 2017 UTC (7 years, 4 months ago) by dhill
Branch: MAIN
Changes since 1.175: +21 -11 lines
Diff to previous 1.175 (colored)

In sogetopt, preallocate an mbuf to avoid using sleeping mallocs with
the netlock held.  This also changes the prototypes of the *ctloutput
functions to take an mbuf instead of an mbuf pointer.

help, guidance from bluhm@ and mpi@
ok bluhm@

Revision 1.175 / (download) - annotate - [select for diffs], Fri Jan 27 20:31:42 2017 UTC (7 years, 4 months ago) by bluhm
Branch: MAIN
Changes since 1.174: +2 -2 lines
Diff to previous 1.174 (colored)

In sosend() the size of the control message for file descriptor
passing is checked.  As the data type has changed in unp_internalize(),
the calculation has to be adapted in sosend().
Found by relayd regress test on i386.
OK millert@

Revision 1.174 / (download) - annotate - [select for diffs], Thu Jan 26 00:08:50 2017 UTC (7 years, 4 months ago) by bluhm
Branch: MAIN
Changes since 1.173: +2 -2 lines
Diff to previous 1.173 (colored)

Do not hold the netlock while pool_get() may sleep.  It is not
necessary to lock code that initializes a new socket structure
before it has been linked to any global list.
OK mpi@

Revision 1.173 / (download) - annotate - [select for diffs], Wed Jan 25 16:45:50 2017 UTC (7 years, 4 months ago) by bluhm
Branch: MAIN
Changes since 1.172: +2 -3 lines
Diff to previous 1.172 (colored)

As NET_LOCK() is a read/write lock, it can sleep in sotask().  So
the TASKQ_CANTSLEEP flag is no longer valid for the splicing thread.
OK mikeb@

Revision 1.172 / (download) - annotate - [select for diffs], Wed Jan 25 06:15:50 2017 UTC (7 years, 4 months ago) by mpi
Branch: MAIN
Changes since 1.171: +15 -8 lines
Diff to previous 1.171 (colored)

Enable the NET_LOCK(), take 2.

Recursions are currently known and marked as XXXSMP.

Please report any assert to bugs@

Revision 1.171 / (download) - annotate - [select for diffs], Thu Dec 29 12:12:43 2016 UTC (7 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.170: +4 -7 lines
Diff to previous 1.170 (colored)

Change NET_LOCK()/NET_UNLOCK() to be simple wrappers around
splsoftnet()/splx() until the known issues are fixed.

In other words, stop using a rwlock since it creates a deadlock when
chrome is used.

Issue reported by Dimitris Papastamos and kettenis@

ok visa@

Revision 1.170 / (download) - annotate - [select for diffs], Tue Dec 20 21:15:36 2016 UTC (7 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.169: +11 -11 lines
Diff to previous 1.169 (colored)

Grab the NET_LOCK() in so{s,g}etopt(), pffasttimo() and pfslowtimo().

ok rzalamena@, bluhm@

Revision 1.169 / (download) - annotate - [select for diffs], Mon Dec 19 08:36:49 2016 UTC (7 years, 5 months ago) by mpi
Branch: MAIN
Changes since 1.168: +64 -60 lines
Diff to previous 1.168 (colored)

Introduce the NET_LOCK(), a rwlock used to serialize accesses to the parts
of the network stack that are not yet ready to be executed in parallel or
where new sleeping points are not possible.

This first pass replaces all the entry points leading to ip_output(). This
is done to not introduce new sleeping points when trying to acquire ART's
write lock, needed when a new L2 entry is created via the RT_RESOLVE.

Inputs from and ok bluhm@, ok dlg@

Revision 1.168 / (download) - annotate - [select for diffs], Tue Nov 29 10:22:30 2016 UTC (7 years, 6 months ago) by jsg
Branch: MAIN
Changes since 1.167: +4 -7 lines
Diff to previous 1.167 (colored)

m_free() and m_freem() test for NULL.  Simplify callers which had their own
NULL tests.

ok mpi@

Revision 1.167 / (download) - annotate - [select for diffs], Wed Nov 23 13:05:53 2016 UTC (7 years, 6 months ago) by bluhm
Branch: MAIN
Changes since 1.166: +11 -3 lines
Diff to previous 1.166 (colored)

Some socket splicing tests on loopback hang with large mbufs and
reduced buffer size.  If the send buffer size is less than the size
of a single mbuf, it will never fit.  So if the send buffer is
empty, split the large mbuf and move only a part.
OK claudio@
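
The decision described above can be modelled with plain byte counts.
This is a simplified sketch, not the somove() code; the function name
and parameters are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: how many bytes of an mbuf of length `mlen' to
 * move into a send buffer holding `sb_cc' bytes with capacity
 * `sb_hiwat'.
 */
size_t
splice_move_len(size_t mlen, size_t sb_cc, size_t sb_hiwat)
{
	size_t space;

	assert(sb_cc <= sb_hiwat);
	space = sb_hiwat - sb_cc;
	if (mlen <= space)
		return mlen;	/* the whole mbuf fits */
	if (sb_cc == 0)
		return space;	/* empty buffer: split, move only a part */
	return 0;		/* wait until the buffer drains */
}
```

Without the middle case, an mbuf larger than the whole send buffer
would return 0 forever, which matches the hang seen in the tests.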

Revision 1.166 / (download) - annotate - [select for diffs], Tue Nov 22 10:29:39 2016 UTC (7 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.165: +29 -13 lines
Diff to previous 1.165 (colored)

Enforce that pr_ctloutput is called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@

Revision 1.165 / (download) - annotate - [select for diffs], Mon Nov 21 09:09:06 2016 UTC (7 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.164: +3 -1 lines
Diff to previous 1.164 (colored)

Enforce that pr_usrreq functions are called at IPL_SOFTNET.

This will allow us to keep locking simple as soon as we trade
splsoftnet() for a rwlock.

ok bluhm@, claudio@

Revision 1.164 / (download) - annotate - [select for diffs], Mon Nov 14 08:45:30 2016 UTC (7 years, 6 months ago) by mpi
Branch: MAIN
Changes since 1.163: +8 -11 lines
Diff to previous 1.163 (colored)

Remove splnet() from socket kqueue code.

splnet() was necessary when link state changes were executed from
hardware interrupt handlers.  Nowadays all the changes are serialized
by the KERNEL_LOCK(), so assert that it is held instead.

ok mikeb@

Revision 1.163 / (download) - annotate - [select for diffs], Thu Oct 6 19:09:08 2016 UTC (7 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.162: +1 -8 lines
Diff to previous 1.162 (colored)

Remove redundant comments that say a function must be called at
splsoftnet() if the function does a splsoftassert(IPL_SOFTNET)
anyway.

Revision 1.162 / (download) - annotate - [select for diffs], Thu Oct 6 17:02:10 2016 UTC (7 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.161: +8 -9 lines
Diff to previous 1.161 (colored)

Separate splsoftnet() from variable initialization.
From mpi@'s netlock diff; OK mikeb@

Revision 1.161 / (download) - annotate - [select for diffs], Tue Sep 20 14:27:43 2016 UTC (7 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.160: +10 -4 lines
Diff to previous 1.160 (colored)

Protect soshutdown() with splsoftnet() to define one layer where
we enter networking code.  Fixes an splassert() found by David Hill.
OK mikeb@

Revision 1.160 / (download) - annotate - [select for diffs], Tue Sep 20 11:11:44 2016 UTC (7 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.159: +11 -12 lines
Diff to previous 1.159 (colored)

Add some spl softnet assertions that will help us to find the right
places for the upcoming network lock.  This might trigger some
asserts, but we have to find the missing code paths.
OK mpi@

Revision 1.159 / (download) - annotate - [select for diffs], Thu Sep 15 02:00:16 2016 UTC (7 years, 8 months ago) by dlg
Branch: MAIN
Changes since 1.158: +5 -6 lines
Diff to previous 1.158 (colored)

all pools have their ipl set via pool_setipl, so fold it into pool_init.

the ioff argument to pool_init() is unused and has been for many
years, so this replaces it with an ipl argument. because the ipl
will be set on init we no longer need pool_setipl.

most of these changes have been done with coccinelle using the spatch
below. cocci sucks at formatting code though, so i fixed that by hand.

the manpage and subr_pool.c bits i did myself.

ok tedu@ jmatthew@

@ipl@
expression pp;
expression ipl;
expression s, a, o, f, m, p;
@@
-pool_init(pp, s, a, o, f, m, p);
-pool_setipl(pp, ipl);
+pool_init(pp, s, a, ipl, f, m, p);

Revision 1.158 / (download) - annotate - [select for diffs], Tue Sep 13 07:50:36 2016 UTC (7 years, 8 months ago) by mpi
Branch: MAIN
Changes since 1.157: +3 -3 lines
Diff to previous 1.157 (colored)

Do not raise splsoftnet() recursively in soaccept().

This is not an issue right now, but it will become one when a
non-recursive lock is used.

ok claudio@

Revision 1.157 / (download) - annotate - [select for diffs], Sat Sep 3 14:09:58 2016 UTC (7 years, 9 months ago) by bluhm
Branch: MAIN
Changes since 1.156: +3 -1 lines
Diff to previous 1.156 (colored)

If sosend() cannot allocate a large cluster, try a small one as
fallback.
OK claudio@
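
A userland sketch of the fallback, assuming a generic allocator
callback in place of the mbuf cluster allocator.  cluster_get() and
alloc_small_only() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Allocator signature used by the sketch; returns NULL on failure. */
typedef void *(*alloc_fn)(size_t);

/*
 * Try `large' bytes first and, if that fails, retry with `small'.
 * On success the granted size is stored in *got.
 */
void *
cluster_get(alloc_fn alloc, size_t large, size_t small, size_t *got)
{
	void *p;

	assert(small <= large && got != NULL);
	if ((p = alloc(large)) != NULL) {
		*got = large;
		return p;
	}
	if ((p = alloc(small)) != NULL)
		*got = small;
	return p;
}

/* Demo allocator that pretends large clusters are exhausted. */
void *
alloc_small_only(size_t len)
{
	return len > 2048 ? NULL : malloc(len);
}
```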

Revision 1.156 / (download) - annotate - [select for diffs], Sat Sep 3 11:13:36 2016 UTC (7 years, 9 months ago) by yasuoka
Branch: MAIN
Changes since 1.155: +3 -1 lines
Diff to previous 1.155 (colored)

Return immediately when m_getuio() fails because of an invalid uio parameter.

ok mikeb bluhm claudio

Revision 1.155 / (download) - annotate - [select for diffs], Thu Aug 25 14:13:19 2016 UTC (7 years, 9 months ago) by bluhm
Branch: MAIN
Changes since 1.154: +48 -9 lines
Diff to previous 1.154 (colored)

Spliced TCP sockets become faster when the output part is running
as its own task thread.  This is inspired by userland copy where a
process also has to go through the scheduler.  This gives the socket
buffer a chance to be filled up and tcp_output() is called less
often and with bigger chunks.
When two kernel tasks share all the workload, the current scheduler
implementation will hang userland processes on single cpu machines.
As a workaround put a yield() into the splicing thread after each
task execution.  This reduces the number of calls of tcp_output()
even more.
OK tedu@ mpi@

Revision 1.154 / (download) - annotate - [select for diffs], Thu Aug 25 13:59:16 2016 UTC (7 years, 9 months ago) by bluhm
Branch: MAIN
Changes since 1.153: +5 -9 lines
Diff to previous 1.153 (colored)

Completely revert the M_WAIT change on the cluster allocation and
bring back the behaviour of rev 1.72.  Although allocating small
mbufs when allocating an mbuf cluster fails seems suboptimal, this
should not be changed as a side effect when introducing m_getuio().
OK claudio@

Revision 1.153 / (download) - annotate - [select for diffs], Mon Aug 22 10:23:42 2016 UTC (7 years, 9 months ago) by claudio
Branch: MAIN
Changes since 1.152: +82 -53 lines
Diff to previous 1.152 (colored)

Refactor the uio to mbuf code out of sosend and start to make use of
MCLGETI and large mbuf clusters. This should speed up local connections
a fair bit. OK dlg@ and bluhm@ (after reverting the M_WAIT change on the
cluster allocation)

Revision 1.141.4.1 / (download) - annotate - [select for diffs], Thu Jul 21 14:31:29 2016 UTC (7 years, 10 months ago) by tedu
Branch: OPENBSD_5_8
Changes since 1.141: +13 -2 lines
Diff to previous 1.141 (colored) next main 1.142 (colored)

I forgot to commit the 5.8 version of the splice fix.
Reminded by Florian Riehm
backport splice loop fix:
On localhost a user program may create a socket splicing loop.
After writing data into this loop, it was spinning forever causing
a kernel hang.  Detect the loop by counting how often the same mbuf
is spliced.  If that happens 128 times, assume that there is a loop
and abort the splicing with ELOOP.
Bug found by tedu@;  OK tedu@ millert@ benno@

Revision 1.149.2.1 / (download) - annotate - [select for diffs], Thu Jul 14 02:56:15 2016 UTC (7 years, 10 months ago) by tedu
Branch: OPENBSD_5_9
Changes since 1.149: +11 -2 lines
Diff to previous 1.149 (colored) next main 1.150 (colored)

backport splice loop fix:
On localhost a user program may create a socket splicing loop.
After writing data into this loop, it was spinning forever causing
a kernel hang.  Detect the loop by counting how often the same mbuf
is spliced.  If that happens 128 times, assume that there is a loop
and abort the splicing with ELOOP.
Bug found by tedu@;  OK tedu@ millert@ benno@

Revision 1.152 / (download) - annotate - [select for diffs], Mon Jun 13 21:24:43 2016 UTC (7 years, 11 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_6_0_BASE
Branch point for: OPENBSD_6_0
Changes since 1.151: +11 -2 lines
Diff to previous 1.151 (colored)

On localhost a user program may create a socket splicing loop.
After writing data into this loop, it was spinning forever causing
a kernel hang.  Detect the loop by counting how often the same mbuf
is spliced.  If that happens 128 times, assume that there is a loop
and abort the splicing with ELOOP.
Bug found by tedu@;  OK tedu@ millert@ benno@
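
The loop detection described above amounts to a per-packet counter
with a hard limit.  A minimal sketch, assuming hypothetical names
(struct pkt, SPLICE_MAXLOOP and splice_account() are not taken from
the kernel source):

```c
#include <errno.h>

#define SPLICE_MAXLOOP	128	/* hypothetical name for the limit in the log */

struct pkt {			/* hypothetical stand-in for a packet header */
	unsigned int loopcnt;	/* how often this packet has been spliced */
};

/*
 * Account for one more splice of the same packet.
 * Returns 0 while under the limit and ELOOP once it has been reached.
 */
static int
splice_account(struct pkt *p)
{
	if (p->loopcnt >= SPLICE_MAXLOOP)
		return ELOOP;
	p->loopcnt++;
	return 0;
}
```

The caller would abort the splice on a non-zero return instead of
moving the data again.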

Revision 1.151 / (download) - annotate - [select for diffs], Sun Jun 12 21:42:47 2016 UTC (8 years ago) by bluhm
Branch: MAIN
Changes since 1.150: +2 -2 lines
Diff to previous 1.150 (colored)

Fix format string in ddb show socket.

Revision 1.150 / (download) - annotate - [select for diffs], Mon Mar 14 23:08:06 2016 UTC (8 years, 2 months ago) by krw
Branch: MAIN
Changes since 1.149: +2 -2 lines
Diff to previous 1.149 (colored)

Change a bunch of (<blah> *)0 to NULL.

ok beck@ deraadt@

Revision 1.149 / (download) - annotate - [select for diffs], Fri Jan 15 11:58:34 2016 UTC (8 years, 4 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_5_9_BASE
Branch point for: OPENBSD_5_9
Changes since 1.148: +15 -9 lines
Diff to previous 1.148 (colored)

Improve the socket panic messages further.  claudio@ wants to see
the socket type and dlg@ is interested in the pointers for ddb show
socket.
OK deraadt@ dlg@

Revision 1.148 / (download) - annotate - [select for diffs], Fri Jan 15 11:30:03 2016 UTC (8 years, 4 months ago) by dlg
Branch: MAIN
Changes since 1.147: +2 -1 lines
Diff to previous 1.147 (colored)

print TAILQ_NEXT(so, so_qe) too

Revision 1.147 / (download) - annotate - [select for diffs], Fri Jan 15 11:21:58 2016 UTC (8 years, 4 months ago) by dlg
Branch: MAIN
Changes since 1.146: +87 -1 lines
Diff to previous 1.146 (colored)

add a "show socket" command to ddb

should help inspecting socket issues in the future.

enthusiasm from mpi@ bluhm@ deraadt@

Revision 1.146 / (download) - annotate - [select for diffs], Wed Jan 13 21:39:39 2016 UTC (8 years, 4 months ago) by bluhm
Branch: MAIN
Changes since 1.145: +7 -7 lines
Diff to previous 1.145 (colored)

To make bug hunting easier, print more information in the soreceive()
and somove() panic messages.
OK phessler@ benno@ deraadt@ mpi@

Revision 1.145 / (download) - annotate - [select for diffs], Wed Jan 6 10:06:50 2016 UTC (8 years, 5 months ago) by stefan
Branch: MAIN
Changes since 1.144: +27 -30 lines
Diff to previous 1.144 (colored)

Prevent integer overflows in sosend() and soreceive() by converting
min()+uiomovei() to ulmin()+uiomove() and re-arranging space computations
in sosend(). The soreceive() part was also reported by Martin Natano.

ok bluhm@ and also discussed with tedu@
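
Why a narrower min() is a problem can be shown with fixed-width types.
This is a sketch only: min_u32() and min_u64() are hypothetical
stand-ins for a minimum taken on unsigned int and on unsigned long
(assuming a 64-bit long), and the chunk_*() helpers are invented for
the demonstration.

```c
#include <stdint.h>

/* Hypothetical 32-bit minimum. */
static uint32_t
min_u32(uint32_t a, uint32_t b)
{
	return (a < b) ? a : b;
}

/* Hypothetical 64-bit minimum. */
static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return (a < b) ? a : b;
}

/* Bytes to copy when the residual count is narrowed to 32 bits first. */
static uint64_t
chunk_narrow(uint64_t resid, uint32_t space)
{
	return min_u32((uint32_t)resid, space);	/* high bits are lost here */
}

/* Bytes to copy when the comparison is done at full width. */
static uint64_t
chunk_wide(uint64_t resid, uint32_t space)
{
	return min_u64(resid, space);
}
```

For a residual count of 0x100000001 and 4096 bytes of space, the
narrow variant computes 1 while the wide variant computes 4096.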

Revision 1.144 / (download) - annotate - [select for diffs], Sat Dec 5 10:11:53 2015 UTC (8 years, 6 months ago) by tedu
Branch: MAIN
Changes since 1.143: +1 -5 lines
Diff to previous 1.143 (colored)

remove stale lint annotations

Revision 1.143 / (download) - annotate - [select for diffs], Fri Oct 30 19:47:40 2015 UTC (8 years, 7 months ago) by bluhm
Branch: MAIN
Changes since 1.142: +2 -4 lines
Diff to previous 1.142 (colored)

Let m_resethdr() clear the whole mbuf packet header, not only the
pf part.  This allows reusing this function in socket splicing.
Reset the mbuf flags that are related to the packet header, but
preserve the data flags.
pair(4) tested by reyk@; sosplice(9) tested by bluhm@; OK mikeb@ reyk@

Revision 1.142 / (download) - annotate - [select for diffs], Mon Aug 24 14:28:25 2015 UTC (8 years, 9 months ago) by bluhm
Branch: MAIN
Changes since 1.141: +3 -2 lines
Diff to previous 1.141 (colored)

Items from pool sosplice_pool are obtained in process context and released
in soft interrupt context.  So the pool needs an IPL_SOFTNET protection.
This fixes a panic: mtx_enter: locking against myself.
While there, call pool_setipl() also for socket_pool.  Although
this pool uses explicit spl protection around pool_get() and
pool_put(), it is better to specify the IPL it is operating on.
OK mpi@ mikeb@

Revision 1.141 / (download) - annotate - [select for diffs], Wed Jul 8 07:21:50 2015 UTC (8 years, 11 months ago) by mpi
Branch: MAIN
CVS Tags: OPENBSD_5_8_BASE
Branch point for: OPENBSD_5_8
Changes since 1.140: +5 -5 lines
Diff to previous 1.140 (colored)

MFREE(9) is dead, long live m_freem(9)!

ok bluhm@, claudio@, dlg@

Revision 1.140 / (download) - annotate - [select for diffs], Tue Jun 30 15:30:17 2015 UTC (8 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.139: +3 -3 lines
Diff to previous 1.139 (colored)

Get rid of the undocumented & temporary* m_copy() macro added for
compatibility with 4.3BSD in September 1989.

*Pick your own definition for "temporary".

ok bluhm@, claudio@, dlg@

Revision 1.139 / (download) - annotate - [select for diffs], Tue Jun 16 11:09:39 2015 UTC (8 years, 11 months ago) by mpi
Branch: MAIN
Changes since 1.138: +2 -2 lines
Diff to previous 1.138 (colored)

Store a unique ID, an interface index, rather than a pointer to the
receiving interface in the packet header of every mbuf.

The interface pointer should now be retrieved when necessary with
if_get().  If a NULL pointer is returned by if_get(), the interface
has probably been destroyed/removed and the mbuf should be freed.

Such mechanism will simplify garbage collection of mbufs and limit
problems with dangling ifp pointers.

Tested by jmatthew@ and krw@, discussed with many.

ok mikeb@, bluhm@, dlg@

Revision 1.138 / (download) - annotate - [select for diffs], Wed May 6 08:52:17 2015 UTC (9 years, 1 month ago) by mpi
Branch: MAIN
Changes since 1.137: +2 -2 lines
Diff to previous 1.137 (colored)

Pass a thread pointer instead of its file descriptor table to getsock(9).

Diff from Vitaliy Makkoveev.

Manpage tweak and ok millert@

Revision 1.137 / (download) - annotate - [select for diffs], Sat Mar 14 03:38:51 2015 UTC (9 years, 3 months ago) by jsg
Branch: MAIN
Changes since 1.136: +1 -2 lines
Diff to previous 1.136 (colored)

Remove some includes that include-what-you-use claims don't
have any directly used symbols.  Tested for indirect use by compiling
amd64/i386/sparc64 kernels.

ok tedu@ deraadt@

Revision 1.136 / (download) - annotate - [select for diffs], Tue Feb 10 21:56:10 2015 UTC (9 years, 4 months ago) by miod
Branch: MAIN
CVS Tags: OPENBSD_5_7_BASE, OPENBSD_5_7
Changes since 1.135: +4 -4 lines
Diff to previous 1.135 (colored)

First step towards making uiomove() take a size_t size argument:
- rename uiomove() to uiomovei() and update all its users.
- introduce uiomove(), which is similar to uiomovei() but with a size_t.
- rewrite uiomovei() as an uiomove() wrapper.
ok kettenis@

Revision 1.135 / (download) - annotate - [select for diffs], Thu Dec 11 19:21:57 2014 UTC (9 years, 6 months ago) by tedu
Branch: MAIN
Changes since 1.134: +3 -3 lines
Diff to previous 1.134 (colored)

convert bcopy to memcpy/memmove. ok krw

Revision 1.134 / (download) - annotate - [select for diffs], Mon Nov 3 17:20:46 2014 UTC (9 years, 7 months ago) by bluhm
Branch: MAIN
Changes since 1.133: +51 -22 lines
Diff to previous 1.133 (colored)

Put the socket splicing fields into a separate struct sosplice that
gets only allocated when needed.  This way struct socket shrinks
from 472 to 392 bytes on amd64.  When splicing gets active, another
88 bytes are allocated for struct sosplice.
OK dlg@

Revision 1.133 / (download) - annotate - [select for diffs], Tue Sep 9 02:07:17 2014 UTC (9 years, 9 months ago) by guenther
Branch: MAIN
Changes since 1.132: +2 -9 lines
Diff to previous 1.132 (colored)

Delete the SS_ISCONFIRMING flag that supported delayed connection
confirmation: it was only used for netiso, which was deleted a *decade* ago

ok mpi@ claudio@  ports scan by sthen@

Revision 1.132 / (download) - annotate - [select for diffs], Mon Sep 8 06:24:13 2014 UTC (9 years, 9 months ago) by jsg
Branch: MAIN
Changes since 1.131: +1 -2 lines
Diff to previous 1.131 (colored)

remove unneeded route.h includes
ok miod@ mpi@

Revision 1.131 / (download) - annotate - [select for diffs], Sun Aug 31 01:42:36 2014 UTC (9 years, 9 months ago) by guenther
Branch: MAIN
Changes since 1.130: +2 -2 lines
Diff to previous 1.130 (colored)

Add additional kernel interfaces for setting close-on-exec on fds
when creating them: pipe2(), dup3(), accept4(), MSG_CMSG_CLOEXEC,
SOCK_CLOEXEC.  Includes SOCK_NONBLOCK support.

ok matthew@

Revision 1.130 / (download) - annotate - [select for diffs], Sun Jul 13 15:52:38 2014 UTC (9 years, 11 months ago) by tedu
Branch: MAIN
CVS Tags: OPENBSD_5_6_BASE, OPENBSD_5_6
Changes since 1.129: +4 -4 lines
Diff to previous 1.129 (colored)

bzero -> memset. for the speeds.

Revision 1.129 / (download) - annotate - [select for diffs], Wed Jul 9 15:43:33 2014 UTC (9 years, 11 months ago) by tedu
Branch: MAIN
Changes since 1.128: +2 -2 lines
Diff to previous 1.128 (colored)

spelling

Revision 1.128 / (download) - annotate - [select for diffs], Sun Jun 8 14:17:52 2014 UTC (10 years ago) by miod
Branch: MAIN
Changes since 1.127: +3 -2 lines
Diff to previous 1.127 (colored)

Use memcpy to copy the sogetopt() SO_SPLICE off_t value, for it may not be
correctly aligned. Similar in spirit to 1.119.
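
The memcpy approach can be sketched as follows.  The helper names
load_i64() and store_i64() are hypothetical, and int64_t stands in
for off_t here:

```c
#include <stdint.h>
#include <string.h>

/* Read a 64-bit value from a buffer position that may be unaligned. */
static int64_t
load_i64(const unsigned char *p)
{
	int64_t v;

	memcpy(&v, p, sizeof(v));	/* byte copy, no alignment requirement */
	return v;
}

/* Write a 64-bit value to a buffer position that may be unaligned. */
static void
store_i64(unsigned char *p, int64_t v)
{
	memcpy(p, &v, sizeof(v));
}
```

Dereferencing a cast pointer at such a position could fault on
strict-alignment architectures, which the byte copy avoids.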

Revision 1.127 / (download) - annotate - [select for diffs], Mon Apr 7 10:04:17 2014 UTC (10 years, 2 months ago) by mpi
Branch: MAIN
Changes since 1.126: +15 -11 lines
Diff to previous 1.126 (colored)

Retire kernel support for SO_DONTROUTE, this time without breaking
localhost connections.

The plan is to always use the routing table for addresses and routes
resolutions, so there is no future for an option that wants to bypass
it.  This option has never been implemented for IPv6 anyway, so let's
just remove the IPv4 bits that you weren't aware of.

Tested at least by lteo@, guenther@ and chrisz@, ok mikeb@, benno@

Revision 1.126 / (download) - annotate - [select for diffs], Sun Mar 30 21:54:48 2014 UTC (10 years, 2 months ago) by guenther
Branch: MAIN
Changes since 1.125: +3 -3 lines
Diff to previous 1.125 (colored)

Eliminates struct pcred by moving the real and saved ugids into
struct ucred; struct process then directly links to the ucred

Based on a discussion at c2k10 or so before noting that FreeBSD and
NetBSD did this too.

ok matthew@

Revision 1.125 / (download) - annotate - [select for diffs], Fri Mar 28 08:33:51 2014 UTC (10 years, 2 months ago) by sthen
Branch: MAIN
Changes since 1.124: +10 -14 lines
Diff to previous 1.124 (colored)

revert "Retire kernel support for SO_DONTROUTE" diff, which does bad things
for localhost connections. discussed with deraadt@

Revision 1.124 / (download) - annotate - [select for diffs], Thu Mar 27 13:27:28 2014 UTC (10 years, 2 months ago) by mpi
Branch: MAIN
Changes since 1.123: +15 -11 lines
Diff to previous 1.123 (colored)

Retire kernel support for SO_DONTROUTE, since the plan is to always
use the routing table there's no future for an option that wants to
bypass it.  This option has never been implemented for IPv6 anyway,
so let's just remove the IPv4 bits that you weren't aware of.

Tested by florian@, man pages inputs from jmc@, ok benno@

Revision 1.123 / (download) - annotate - [select for diffs], Tue Mar 18 07:01:21 2014 UTC (10 years, 2 months ago) by guenther
Branch: MAIN
Changes since 1.122: +2 -2 lines
Diff to previous 1.122 (colored)

When creating a unix socket, save the PID for pf's log(user), even when
not in the original thread.

ok matthew@

Revision 1.122 / (download) - annotate - [select for diffs], Tue Jan 21 23:57:56 2014 UTC (10 years, 4 months ago) by guenther
Branch: MAIN
CVS Tags: OPENBSD_5_5_BASE, OPENBSD_5_5
Changes since 1.121: +2 -1 lines
Diff to previous 1.121 (colored)

Don't leak kernel stack in timeval padding in getsockopt(SO_{SND,RCV}TIMEO)

ok mikeb@ deraadt@
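
A common way to avoid this kind of leak is to zero the whole structure
before filling in its members.  The sketch below assumes a
hypothetical struct tv_like and helper fill_timeout(); it is not the
code from this revision.

```c
#include <stdint.h>
#include <string.h>

struct tv_like {		/* hypothetical; most ABIs pad this after usec */
	int64_t	sec;
	int32_t	usec;
};

/* Build the reply in a zeroed struct, then copy the whole struct out. */
static void
fill_timeout(void *out, int64_t sec, int32_t usec)
{
	struct tv_like tv;

	memset(&tv, 0, sizeof(tv));	/* clears any padding bytes as well */
	tv.sec = sec;
	tv.usec = usec;
	memcpy(out, &tv, sizeof(tv));
}
```

Without the memset(), the padding bytes would carry whatever was
previously on the stack.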

Revision 1.121 / (download) - annotate - [select for diffs], Sat Jan 11 14:33:48 2014 UTC (10 years, 5 months ago) by bluhm
Branch: MAIN
Changes since 1.120: +2 -2 lines
Diff to previous 1.120 (colored)

When I created UDP socket splicing, I added the goto nextpkt loop
to splice multiple UDP packets in the m_nextpkt list.  Some profiling
with TCP splicing showed that checking so_rcv.sb_mb is wrong.  It
causes several useless runs through the loop.  Better check for
nextrecord which contains the original m_nextpkt value of the mbuf.
OK mikeb@

Revision 1.120 / (download) - annotate - [select for diffs], Tue Dec 10 21:44:50 2013 UTC (10 years, 6 months ago) by mikeb
Branch: MAIN
Changes since 1.119: +3 -3 lines
Diff to previous 1.119 (colored)

dead assignment;  from david hill, ok claudio

Revision 1.119 / (download) - annotate - [select for diffs], Tue Aug 27 03:32:11 2013 UTC (10 years, 9 months ago) by deraadt
Branch: MAIN
Changes since 1.118: +9 -8 lines
Diff to previous 1.118 (colored)

Manipulate timevals separately, not inside a mbuf.  Alignment constraints
miod ran into.
ok miod matthew

Revision 1.118 / (download) - annotate - [select for diffs], Fri Apr 5 08:25:30 2013 UTC (11 years, 2 months ago) by tedu
Branch: MAIN
CVS Tags: OPENBSD_5_4_BASE, OPENBSD_5_4
Changes since 1.117: +11 -14 lines
Diff to previous 1.117 (colored)

remove some obsolete casts

Revision 1.117 / (download) - annotate - [select for diffs], Thu Apr 4 18:13:43 2013 UTC (11 years, 2 months ago) by bluhm
Branch: MAIN
Changes since 1.116: +3 -1 lines
Diff to previous 1.116 (colored)

Do not allow the listen(2) syscall for an already connected socket.
This would create a weird set of states in TCP.  FreeBSD has the
same check.
Issue found by and OK guenther@

Revision 1.116 / (download) - annotate - [select for diffs], Wed Mar 27 15:41:04 2013 UTC (11 years, 2 months ago) by bluhm
Branch: MAIN
Changes since 1.115: +16 -18 lines
Diff to previous 1.115 (colored)

Move soidle() into the big #ifdef SOCKET_SPLICE block to have it
all in one place.  Saves one additional #ifdef, no functional change.
OK mikeb@

Revision 1.115 / (download) - annotate - [select for diffs], Tue Mar 19 20:07:14 2013 UTC (11 years, 2 months ago) by bluhm
Branch: MAIN
Changes since 1.114: +5 -3 lines
Diff to previous 1.114 (colored)

After a socket splicing timeout is fired, a network interrupt can
unsplice() the sockets before soidle() goes to splsoftnet.  In this
case, unsplice() was called twice.  So check whether splicing still
exists within the splsoftnet protection.
Uvm fault in sounsplice() reported by keith at scott-land dot net.
OK claudio@

Revision 1.114 / (download) - annotate - [select for diffs], Sat Feb 16 14:34:52 2013 UTC (11 years, 3 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_5_3_BASE, OPENBSD_5_3
Changes since 1.113: +8 -1 lines
Diff to previous 1.113 (colored)

Fix a bug in udp socket splicing in case a packet gets diverted and
spliced and routed to loopback.  The content of the pf header in
the mbuf was keeping the divert information on its way.  Reinitialize
the whole packet header of the mbuf and remove the mbuf tags when
the packet gets spliced.
OK claudio@ markus@

Revision 1.113 / (download) - annotate - [select for diffs], Thu Jan 17 16:30:10 2013 UTC (11 years, 4 months ago) by bluhm
Branch: MAIN
Changes since 1.112: +82 -12 lines
Diff to previous 1.112 (colored)

Expand the socket splicing functionality from TCP to UDP.  Merge
the code relevant for UDP from sosend() and soreceive() into somove().
That allows the kernel to directly transfer the UDP data from one
socket to another.
OK claudio@

Revision 1.112 / (download) - annotate - [select for diffs], Tue Jan 15 21:48:32 2013 UTC (11 years, 4 months ago) by bluhm
Branch: MAIN
Changes since 1.111: +5 -2 lines
Diff to previous 1.111 (colored)

Pass an EFBIG error to user land when the maximum splicing length
has been reached.  This creates a read event on the spliced source
socket that can be noticed with select(2).  So the kernel passes
control to the relay process immediately.  This could be used to
log the end of an http request within a persistent connection.
deraadt@ reyk@ mikeb@ like the idea

Revision 1.111 / (download) - annotate - [select for diffs], Tue Jan 15 11:12:57 2013 UTC (11 years, 4 months ago) by bluhm
Branch: MAIN
Changes since 1.110: +7 -7 lines
Diff to previous 1.110 (colored)

Changing the socket buffer flags sb_flags was not interrupt safe
as |= and &= are non-atomic operations.  To avoid additional locks,
put the flags that have to be accessed from interrupt into a separate
sb_flagsintr 32 bit integer field.  sb_flagsintr is protected by
splsoftnet.
Input from miod@ deraadt@; OK deraadt@

Revision 1.110 / (download) - annotate - [select for diffs], Mon Dec 31 13:46:49 2012 UTC (11 years, 5 months ago) by bluhm
Branch: MAIN
Changes since 1.109: +2 -4 lines
Diff to previous 1.109 (colored)

Put the #ifdef SOCKBUF_DEBUG around sbcheck() into a SBCHECK macro.
That is consistent to the SBLASTRECORDCHK and SBLASTMBUFCHK macros.
OK markus@

Revision 1.109 / (download) - annotate - [select for diffs], Fri Oct 5 01:30:28 2012 UTC (11 years, 8 months ago) by yasuoka
Branch: MAIN
Changes since 1.108: +2 -2 lines
Diff to previous 1.108 (colored)

add send(2) MSG_DONTWAIT support which enables us to choose nonblocking
or blocking for each send(2) call.

diff from UMEZAWA Takeshi
ok bluhm

Revision 1.108 / (download) - annotate - [select for diffs], Thu Sep 20 12:34:18 2012 UTC (11 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.107: +21 -21 lines
Diff to previous 1.107 (colored)

In somove() free the mbufs when necessary instead of freeing them
in the release path.  Especially accessing m in a KDASSERT() could
go wrong.
OK claudio@

Revision 1.107 / (download) - annotate - [select for diffs], Wed Sep 19 20:00:32 2012 UTC (11 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.106: +3 -3 lines
Diff to previous 1.106 (colored)

When a socket is spliced, it must not wake up userland for reading.
There was a small race in sorwakeup() where that could happen if
we slept before the SB_SPLICE flag was set.
ok claudio@

Revision 1.106 / (download) - annotate - [select for diffs], Wed Sep 19 19:41:29 2012 UTC (11 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.105: +2 -2 lines
Diff to previous 1.105 (colored)

In somove() make the call to pr_usrreq(PRU_RCVD) under the same
conditions as in soreceive().  My goal is to make socket splicing
less protocol dependent.
ok claudio@

Revision 1.105 / (download) - annotate - [select for diffs], Mon Sep 17 14:33:56 2012 UTC (11 years, 8 months ago) by bluhm
Branch: MAIN
Changes since 1.104: +13 -13 lines
Diff to previous 1.104 (colored)

Fix indent white spaces.

Revision 1.104 / (download) - annotate - [select for diffs], Sun Jul 22 18:11:54 2012 UTC (11 years, 10 months ago) by guenther
Branch: MAIN
CVS Tags: OPENBSD_5_2_BASE, OPENBSD_5_2
Changes since 1.103: +2 -2 lines
Diff to previous 1.103 (colored)

unp_dispose() walks not just the mbuf chain (m_next) but also the packet
chain (m_nextpkt), so the mbuf passed to it must be disconnected completely
from the socket buffer's chains.

Problem noticed by yasuoka@; tweak from krw@, ok deraadt@

Revision 1.103 / (download) - annotate - [select for diffs], Tue Jul 10 11:42:53 2012 UTC (11 years, 11 months ago) by guenther
Branch: MAIN
Changes since 1.102: +4 -6 lines
Diff to previous 1.102 (colored)

For setsockopt(SO_{SND,RCV}TIMEO), convert the timeval to ticks using
tvtohz() so that the rounding is correct and we don't time out a tick early

ok claudio@
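
The rounding issue can be illustrated with a simplified conversion.
This is a sketch under assumptions: the helper names are hypothetical,
hz must be between 1 and 1000000, and the real tvtohz() may differ in
detail.

```c
#include <stdint.h>

/* Truncating conversion: any partial tick is dropped. */
static uint64_t
usec_to_ticks_trunc(uint64_t usec, uint32_t hz)
{
	uint64_t tick_usec = 1000000 / hz;	/* length of one tick */

	return usec / tick_usec;
}

/* Rounding-up conversion: a partial tick counts as a full one. */
static uint64_t
usec_to_ticks_ceil(uint64_t usec, uint32_t hz)
{
	uint64_t tick_usec = 1000000 / hz;

	return (usec + tick_usec - 1) / tick_usec;
}
```

At 100 ticks per second, a 15 ms timeout truncates to 1 tick (10 ms)
and so fires early, while rounding up gives 2 ticks.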

Revision 1.102 / (download) - annotate - [select for diffs], Tue Jul 10 09:40:25 2012 UTC (11 years, 11 months ago) by claudio
Branch: MAIN
Changes since 1.101: +12 -6 lines
Diff to previous 1.101 (colored)

Try to clean up the macro magic because of socket splicing. Since struct
socket is no longer affected by option SOCKET_SPLICE we can simplify the
code. OK bluhm@

Revision 1.101 / (download) - annotate - [select for diffs], Sat Jul 7 18:48:19 2012 UTC (11 years, 11 months ago) by bluhm
Branch: MAIN
Changes since 1.100: +9 -4 lines
Diff to previous 1.100 (colored)

Fix two races in socket splicing.  When somove() gets called from
sosplice() to move the data already there, it might sleep in
m_copym().
Another process must not unsplice during that sleep, so also lock
the receive buffer when sosplice is called with fd -1.
The same sleep can allow network interrupts to modify the socket
buffer.  So use sbsync() to write back modifications within the
loop instead of fixing the socket buffer after the loop.
OK claudio@

Revision 1.100 / (download) - annotate - [select for diffs], Tue Apr 24 16:35:08 2012 UTC (12 years, 1 month ago) by deraadt
Branch: MAIN
Changes since 1.99: +13 -3 lines
Diff to previous 1.99 (colored)

In sosend() for AF_UNIX control message sending, correctly calculate
the size (internalized ones can be larger on some architectures) for
fitting into the socket.  Avoid getting confused by sb_hiwat as well.
This fixes a variety of issues where sendmsg() would fail to deliver
a fd set or fail to wait; even leading to file leakage.
Worked on this with claudio for about a week...

Revision 1.99 / (download) - annotate - [select for diffs], Sun Apr 22 05:43:14 2012 UTC (12 years, 1 month ago) by guenther
Branch: MAIN
Changes since 1.98: +4 -4 lines
Diff to previous 1.98 (colored)

Add struct proc * argument to FRELE() and FILE_SET_MATURE() in
anticipation of further changes to closef().  No binary change.

ok krw@ miod@ deraadt@

Revision 1.98 / (download) - annotate - [select for diffs], Fri Mar 23 15:51:26 2012 UTC (12 years, 2 months ago) by guenther
Branch: MAIN
Changes since 1.97: +3 -3 lines
Diff to previous 1.97 (colored)

Make rusage totals, itimers, and profile settings per-process instead
of per-rthread.  Handling of per-thread tick and runtime counters
inspired by how FreeBSD does it.

ok kettenis@

Revision 1.97 / (download) - annotate - [select for diffs], Sat Mar 17 10:16:41 2012 UTC (12 years, 2 months ago) by dlg
Branch: MAIN
Changes since 1.96: +1 -3 lines
Diff to previous 1.96 (colored)

remove IP_JUMBO, SO_JUMBO, and RTF_JUMBO.

no objection from mcbride@ krw@ markus@ deraadt@

Revision 1.96 / (download) - annotate - [select for diffs], Wed Mar 14 21:27:01 2012 UTC (12 years, 2 months ago) by kettenis
Branch: MAIN
Changes since 1.95: +66 -37 lines
Diff to previous 1.95 (colored)

Close a race that would corrupt a sockbuf because the code that externalizes
an SCM_RIGHTS message may sleep.  Bits and pieces from NetBSD with some
simplifications by yours truly.

Fixes the "receive 1" panic seen by many.

ok guenther@, claudio@

Revision 1.95 / (download) - annotate - [select for diffs], Tue Aug 23 13:44:58 2011 UTC (12 years, 9 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_5_1_BASE, OPENBSD_5_1
Changes since 1.94: +2 -2 lines
Diff to previous 1.94 (colored)

Prevent a socket splicing timeout error in one direction from
also being added to the other direction.
ok mikeb@

Revision 1.94 / (download) - annotate - [select for diffs], Mon Jul 4 00:33:36 2011 UTC (12 years, 11 months ago) by mikeb
Branch: MAIN
CVS Tags: OPENBSD_5_0_BASE, OPENBSD_5_0
Changes since 1.93: +31 -6 lines
Diff to previous 1.93 (colored)

Implement an idle timeout for the socket splicing.  A new `sp_idle'
field of the `splice' structure can be used to specify a period of
inactivity after which splicing will be dissolved.  An ETIMEDOUT error
retrieved with SO_ERROR indicates that the idle timeout has expired.
With comments from and OK bluhm.

Revision 1.93 / (download) - annotate - [select for diffs], Sat Jul 2 22:20:08 2011 UTC (12 years, 11 months ago) by nicm
Branch: MAIN
Changes since 1.92: +2 -2 lines
Diff to previous 1.92 (colored)

kqueue attach functions should return an errno or 0, not a plain 1. Fix
the obvious cases to return EINVAL and ENXIO.

ok tedu deraadt

Revision 1.92 / (download) - annotate - [select for diffs], Mon May 2 13:48:38 2011 UTC (13 years, 1 month ago) by mikeb
Branch: MAIN
Changes since 1.91: +28 -1 lines
Diff to previous 1.91 (colored)

recognize SO_RTABLE socket option at the SOL_SOCKET level;
discussed with and ok claudio

Revision 1.91 / (download) - annotate - [select for diffs], Tue Apr 19 22:33:08 2011 UTC (13 years, 1 month ago) by bluhm
Branch: MAIN
Changes since 1.90: +25 -30 lines
Diff to previous 1.90 (colored)

Put splice cleanup code into a common function sounsplice().
ok claudio@

Revision 1.90 / (download) - annotate - [select for diffs], Mon Apr 4 21:08:26 2011 UTC (13 years, 2 months ago) by claudio
Branch: MAIN
Changes since 1.89: +9 -7 lines
Diff to previous 1.89 (colored)

Plug mbuf leaks in SO_PEERCRED by not double allocating mbufs into
the same variable. Leak found with dlg's magic mbuf leakage finder.
OK henning@, deraadt@

Revision 1.89 / (download) - annotate - [select for diffs], Mon Apr 4 11:10:26 2011 UTC (13 years, 2 months ago) by claudio
Branch: MAIN
Changes since 1.88: +13 -3 lines
Diff to previous 1.88 (colored)

If the socket was half closed then don't let userland change the
socketbuffer size of the closed side since on half close the high
watermark was set to 0.
OK blambert@

Revision 1.88 / (download) - annotate - [select for diffs], Mon Mar 14 01:06:20 2011 UTC (13 years, 3 months ago) by bluhm
Branch: MAIN
Changes since 1.87: +6 -2 lines
Diff to previous 1.87 (colored)

When a process reads from a spliced socket that already got an
end-of-file but still has data in the receive buffer, soreceive()
should block until all data has been moved.
To make kqueue work with socket splicing, it has to report spliced
sockets as non-readable.
ok deraadt@

Revision 1.87 / (download) - annotate - [select for diffs], Sat Mar 12 18:31:41 2011 UTC (13 years, 3 months ago) by bluhm
Branch: MAIN
Changes since 1.86: +8 -1 lines
Diff to previous 1.86 (colored)

There existed a race when a process was trying to read from a spliced
socket.  soreceive() releases splsoftnet for uiomove().  In that
moment, somove() could pull the mbuf from the receive buffer.  After
that, soreceive removed the mbuf again.  The corrupted length
accounting resulted in a panic.
The fix is to block read calls in soreceive() until splicing has
been finished.
just commit deraadt@

Revision 1.86 / (download) - annotate - [select for diffs], Mon Feb 28 16:29:42 2011 UTC (13 years, 3 months ago) by bluhm
Branch: MAIN
CVS Tags: OPENBSD_4_9_BASE, OPENBSD_4_9
Changes since 1.85: +2 -2 lines
Diff to previous 1.85 (colored)

When the maximum splice length has been reached, send out the data
immediately by unsetting the SS_ISSENDING flag.  This prevents a
possible 5 seconds delay in socket splicing.
ok markus@; commit it deraadt@

Revision 1.85 / (download) - annotate - [select for diffs], Fri Jan 7 17:50:42 2011 UTC (13 years, 5 months ago) by bluhm
Branch: MAIN
Changes since 1.84: +359 -2 lines
Diff to previous 1.84 (colored)

Add socket option SO_SPLICE to splice together two TCP sockets.
The data received on the source socket will automatically be sent
on the drain socket.  This allows writing relay daemons with zero
data copy.
ok markus@

Revision 1.84 / (download) - annotate - [select for diffs], Fri Sep 24 02:59:45 2010 UTC (13 years, 8 months ago) by claudio
Branch: MAIN
Changes since 1.83: +5 -3 lines
Diff to previous 1.83 (colored)

TCP send and recv buffer scaling.
Send buffer is scaled by not accounting unacknowledged on the wire
data against the buffer limit. Receive buffer scaling is done similarly
to FreeBSD -- measure the delay * bandwidth product and base the
buffer on that. The problem is that our RTT measurement is coarse
so it overshoots on low delay links. This does not matter that much
since the recvbuffer is almost always empty.
Add a back pressure mechanism to control the amount of memory
assigned to socketbuffers that kicks in when 80% of the cluster
pool is used.
Increases the download speed from 300kB/s to 4.4MB/s on ftp.eu.openbsd.org.

Based on work by markus@ and djm@.

OK dlg@, henning@, put it in deraadt@

Revision 1.83 / (download) - annotate - [select for diffs], Sat Jul 3 04:44:51 2010 UTC (13 years, 11 months ago) by guenther
Branch: MAIN
CVS Tags: OPENBSD_4_8_BASE, OPENBSD_4_8
Changes since 1.82: +1 -2 lines
Diff to previous 1.82 (colored)

Fix the naming of interfaces and variables for rdomains and rtables
and make it possible to bind sockets (including listening sockets!)
to rtables and not just rdomains.  This changes the name of the
system calls, socket option, and ioctl.  After building with this
you should remove the files /usr/share/man/cat2/[gs]etrdomain.0.

Since this removes the existing [gs]etrdomain() system calls, the
libc major is bumped.

Written by claudio@, criticized^Wcritiqued by me

Revision 1.82 / (download) - annotate - [select for diffs], Fri Jul 2 19:57:15 2010 UTC (13 years, 11 months ago) by tedu
Branch: MAIN
Changes since 1.81: +1 -8 lines
Diff to previous 1.81 (colored)

remove support for compat_sunos (and m68k4k).  ok deraadt guenther

Revision 1.81 / (download) - annotate - [select for diffs], Thu Jul 1 18:47:45 2010 UTC (13 years, 11 months ago) by deraadt
Branch: MAIN
Changes since 1.80: +2 -2 lines
Diff to previous 1.80 (colored)

SO_PEERCRED should return ENOTCONN when the sockets are not connected

Revision 1.80 / (download) - annotate - [select for diffs], Wed Jun 30 19:57:05 2010 UTC (13 years, 11 months ago) by deraadt
Branch: MAIN
Changes since 1.79: +18 -1 lines
Diff to previous 1.79 (colored)

Add getsockopt SOL_SOCKET SO_PEERCRED support. This behaves similarly to
getpeereid(2), but also supplies the remote pid.  This is supplied in
a 'struct sockpeercred' (unlike Linux -- they showed how little they
know about real unix by calling theirs 'struct ucred').
ok guenther ajacoutot

Revision 1.79 / (download) - annotate - [select for diffs], Sat Oct 31 12:00:08 2009 UTC (14 years, 7 months ago) by fgsch
Branch: MAIN
CVS Tags: OPENBSD_4_7_BASE, OPENBSD_4_7
Changes since 1.78: +2 -2 lines
Diff to previous 1.78 (colored)

Use suser when possible. Suggested by miod@.
miod@ deraadt@ ok.

Revision 1.78 / (download) - annotate - [select for diffs], Mon Aug 10 16:49:38 2009 UTC (14 years, 10 months ago) by thib
Branch: MAIN
Changes since 1.77: +1 -1 lines
Diff to previous 1.77 (colored)

Don't use char arrays for sleep wchans and reuse them.
just use strings and make things unique.

ok claudio@

Revision 1.77 / (download) - annotate - [select for diffs], Fri Jun 5 00:05:21 2009 UTC (15 years ago) by claudio
Branch: MAIN
CVS Tags: OPENBSD_4_6_BASE, OPENBSD_4_6
Changes since 1.76: +3 -1 lines
Diff to previous 1.76 (colored)

Initial support for routing domains. This allows interfaces to be bound to
an alternate routing table, separating them from other interfaces in distinct
routing tables. The same network can now be used in any domain at the same
time without causing conflicts.
This diff is mostly mechanical and adds the necessary rdomain checks across
net and netinet. L2 and IPv4 are mostly covered; pf and IPv6 are still missing.
input and tested by jsg@, phessler@ and reyk@. "put it in" deraadt@

Revision 1.76 / (download) - annotate - [select for diffs], Sun Mar 15 19:40:41 2009 UTC (15 years, 2 months ago) by miod
Branch: MAIN
Changes since 1.75: +3 -3 lines
Diff to previous 1.75 (colored)

Introduce splsoftassert(), similar to splassert() but for soft interrupt
levels. This will allow for platforms where soft interrupt levels do not
map to real hardware interrupt levels to have soft ipl values overlapping
hard ipl values without breaking spl asserts.

Revision 1.75 / (download) - annotate - [select for diffs], Sun Feb 22 07:47:22 2009 UTC (15 years, 3 months ago) by otto
Branch: MAIN
CVS Tags: OPENBSD_4_5_BASE, OPENBSD_4_5
Changes since 1.74: +5 -3 lines
Diff to previous 1.74 (colored)

fix PR 6082: do not create more fd's than will fit in the message on
the receiving side when passing fd's. ok deraadt@ kettenis@

Revision 1.74 / (download) - annotate - [select for diffs], Tue Jan 13 13:36:12 2009 UTC (15 years, 5 months ago) by blambert
Branch: MAIN
Changes since 1.73: +3 -3 lines
Diff to previous 1.73 (colored)

Change sbreserve() to return 0 on success, 1 on failure, as god intended.
This sort of breaking with traditional and expected behavior annoys me.

"yes!" henning@

Revision 1.73 / (download) - annotate - [select for diffs], Thu Oct 9 16:00:05 2008 UTC (15 years, 8 months ago) by deraadt
Branch: MAIN
Changes since 1.72: +3 -3 lines
Diff to previous 1.72 (colored)

Change sb_timeo to unsigned, so that even if some calculation (ie. n * HZ)
becomes a very large number it will not wrap the short into a negative
number and screw up timeouts.  It will simply become a max of 65535.  Since
this happens when HZ is cranked to a high number, this will still only take
n seconds, or less.  Safer than crashing.
Prompted by PR 5511
ok guenther
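
A small sketch of the arithmetic involved, assuming a timeout of n seconds is converted to ticks and stored in an unsigned short. The helper name and the explicit clamp are illustrative and are not taken from the kernel source.

```c
#include <limits.h>

/*
 * Convert a timeout in seconds to clock ticks, saturating at the
 * largest value an unsigned short can hold instead of wrapping.
 * `hz` stands in for the kernel's HZ (ticks per second).
 */
unsigned short
timeo_ticks(unsigned long seconds, unsigned long hz)
{
	unsigned long ticks = seconds * hz;

	return ticks > USHRT_MAX ? USHRT_MAX : (unsigned short)ticks;
}
```

With hz at 1000, a 40 second timeout is 40000 ticks: too large for a signed 16-bit short, but representable once the field is unsigned. A 100 second timeout saturates at 65535 ticks, so it expires early rather than turning negative.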

Revision 1.72 / (download) - annotate - [select for diffs], Thu Aug 7 17:43:37 2008 UTC (15 years, 10 months ago) by reyk
Branch: MAIN
Changes since 1.71: +2 -2 lines
Diff to previous 1.71 (colored)

don't wait for a free mbuf cluster in sosend() and enter the existing
error handler that was never used before.  this fixes a bug where a
userland process might hang if the system ran out of mbuf clusters, or
even cause other unexpected behaviour in the network drivers.

this bug is very old - it is also found in rev 1.1/stevens v2/44lite2/...

discussed with many
ok markus@ thib@ dlg@

Revision 1.71 / (download) - annotate - [select for diffs], Sat Jun 14 10:55:21 2008 UTC (16 years ago) by mk
Branch: MAIN
CVS Tags: OPENBSD_4_4_BASE, OPENBSD_4_4
Changes since 1.70: +2 -3 lines
Diff to previous 1.70 (colored)

A bunch of pool_get() + bzero() -> pool_get(..., .. | PR_ZERO)
conversions that should shave a few bytes off the kernel.

ok henning, krw, jsing, oga, miod, and thib (``even though i usually prefer
FOO|BAR''); thanks for looking.

Revision 1.70 / (download) - annotate - [select for diffs], Fri May 23 15:51:12 2008 UTC (16 years ago) by thib
Branch: MAIN
Changes since 1.69: +19 -16 lines
Diff to previous 1.69 (colored)

Deal with the situation when TCP nfs mounts timeout and processes
get hung in nfs_reconnect() because they do not have the proper
privileges to bind to a socket, by adding a struct proc * argument
to sobind() (and the *_usrreq() routines, and finally in{6}_pcbbind)
and do the sobind() with proc0 in nfs_connect.

OK markus@, blambert@.
"go ahead" deraadt@.

Fixes an issue reported by bernd@ (Tested by bernd@).
Fixes PR5135 too.

Revision 1.69 / (download) - annotate - [select for diffs], Fri May 9 02:52:15 2008 UTC (16 years, 1 month ago) by markus
Branch: MAIN
Changes since 1.68: +10 -1 lines
Diff to previous 1.68 (colored)

Add SO_BINDANY socket option from BSD/OS.

The option allows a socket to be bound to addresses which are not
local to the machine.  In order to receive packets for these addresses
SO_BINDANY needs to be combined with matching outgoing pf(4) divert
rules, see pf.conf(5).

ok beck@

Revision 1.68 / (download) - annotate - [select for diffs], Fri May 2 06:49:32 2008 UTC (16 years, 1 month ago) by ckuethe
Branch: MAIN
Changes since 1.67: +3 -1 lines
Diff to previous 1.67 (colored)

Make the SO_TIMESTAMP sockopt work. When set, this allows the user to
get a timestamp of when the datagram was accepted (by udp(4), for
example) rather than having to take a timestamp with gettimeofday(2)
when recv(2) returns - possibly several hundreds of microseconds later.
May be of use to those interested in precision network timing schemes
or QoS for media applications. Tested on alpha, amd64, i386 and sparc64.
manpage suggestions from jmc, ok deraadt

Revision 1.67 / (download) - annotate - [select for diffs], Thu Dec 20 17:16:50 2007 UTC (16 years, 5 months ago) by chl
Branch: MAIN
CVS Tags: OPENBSD_4_3_BASE, OPENBSD_4_3
Changes since 1.66: +2 -2 lines
Diff to previous 1.66 (colored)

Remove an obsolete nfs kludge, spotted by Frank Denis (many thanks), also there in NetBSD and FreeBSD trees.

Tested by thib@ who found that it shaved 18min wall clock time off copying a 20G file.

Been in snaps for some time

"looks ok" markus@ "makes sense" blambert@ ok claudio@ thib@

Revision 1.66 / (download) - annotate - [select for diffs], Mon Feb 26 23:53:33 2007 UTC (17 years, 3 months ago) by kurt
Branch: MAIN
CVS Tags: OPENBSD_4_2_BASE, OPENBSD_4_2, OPENBSD_4_1_BASE, OPENBSD_4_1
Changes since 1.65: +2 -1 lines
Diff to previous 1.65 (colored)

exclude control data from the number of bytes returned by FIONREAD ioctl()
by adding a sb_datacc count to sockbuf that counts data excluding
MT_CONTROL and MT_SONAME mbuf types.  w/help from deraadt@.
okay deraadt@ claudio@
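
For reference, this is roughly how a userland program queries the count that this change affects. The wrapper name bytes_readable() is made up for the example.

```c
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/*
 * Return the number of bytes available to read on `fd` as reported by
 * the FIONREAD ioctl, or -1 on error.
 */
int
bytes_readable(int fd)
{
	int n;

	if (ioctl(fd, FIONREAD, &n) == -1)
		return -1;
	return n;
}
```

After this change the value is meant to reflect payload bytes only, so a caller can size its read buffer from it without counting control messages or stored addresses.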

Revision 1.65 / (download) - annotate - [select for diffs], Wed Feb 14 00:53:48 2007 UTC (17 years, 4 months ago) by jsg
Branch: MAIN
Changes since 1.64: +2 -2 lines
Diff to previous 1.64 (colored)

Consistently spell FALLTHROUGH to appease lint.
ok kettenis@ cloder@ tom@ henning@

Revision 1.64 / (download) - annotate - [select for diffs], Sat Jun 10 17:05:17 2006 UTC (18 years ago) by beck
Branch: MAIN
CVS Tags: OPENBSD_4_0_BASE, OPENBSD_4_0
Changes since 1.63: +3 -5 lines
Diff to previous 1.63 (colored)

allow SO_SNDBUF and SO_RCVBUF setsockopts on existing sockets to succeed
for any value that is not an increase in size when we are under mbuf pressure,
rather than only succeeding when setting the value to the 4k minimum.
ok markus@, henning@

Revision 1.63 / (download) - annotate - [select for diffs], Sat Mar 4 22:40:15 2006 UTC (18 years, 3 months ago) by brad
Branch: MAIN
Changes since 1.62: +2 -2 lines
Diff to previous 1.62 (colored)

With the exception of two other small uncommitted diffs this moves
the remainder of the network stack from splimp to splnet.

ok miod@

Revision 1.62 / (download) - annotate - [select for diffs], Thu Jan 5 05:05:06 2006 UTC (18 years, 5 months ago) by jsg
Branch: MAIN
CVS Tags: OPENBSD_3_9_BASE, OPENBSD_3_9
Changes since 1.61: +26 -62 lines
Diff to previous 1.61 (colored)

ansi/deregister

Revision 1.61 / (download) - annotate - [select for diffs], Fri Sep 16 16:44:43 2005 UTC (18 years, 8 months ago) by deraadt
Branch: MAIN
Changes since 1.60: +1 -3 lines
Diff to previous 1.60 (colored)

backout until we find a socket state for init

Revision 1.60 / (download) - annotate - [select for diffs], Sat Sep 10 19:13:32 2005 UTC (18 years, 9 months ago) by deraadt
Branch: MAIN
Changes since 1.59: +3 -1 lines
Diff to previous 1.59 (colored)

upon shutdown(), if socket is unconnected return ENOTCONN; ok millert

Revision 1.59 / (download) - annotate - [select for diffs], Thu Aug 11 18:20:10 2005 UTC (18 years, 10 months ago) by millert
Branch: MAIN
CVS Tags: OPENBSD_3_8_BASE, OPENBSD_3_8
Changes since 1.58: +11 -7 lines
Diff to previous 1.58 (colored)

Use SHUT_* values directly in soshutdown() instead of converting
to FREAD/FWRITE.  OK deraadt@

Revision 1.58 / (download) - annotate - [select for diffs], Fri May 27 17:16:13 2005 UTC (19 years ago) by dhartmei
Branch: MAIN
Changes since 1.57: +2 -1 lines
Diff to previous 1.57 (colored)

add a field to struct socket that stores the pid of the process that
created the socket, and populate it. ok bob@, henning@

Revision 1.57 / (download) - annotate - [select for diffs], Fri May 27 04:55:27 2005 UTC (19 years ago) by mcbride
Branch: MAIN
Changes since 1.56: +3 -1 lines
Diff to previous 1.56 (colored)

Experimental support for opportunistic use of jumbograms where only some hosts
on the local network support them.

This adds a new socket option, SO_JUMBO, and a new route flag,
RTF_JUMBO. If _both_ the socket option is set and the route for the host
has RTF_JUMBO set, ip_output will fragment the packet to the largest
possible size for the link, ignoring the card's MTU.

The semantics of this feature will be evolving rapidly; talk to us
if you intend to use it.

ok deraadt@ marius@

Revision 1.56 / (download) - annotate - [select for diffs], Thu Nov 18 15:09:07 2004 UTC (19 years, 6 months ago) by markus
Branch: MAIN
CVS Tags: OPENBSD_3_7_BASE, OPENBSD_3_7
Changes since 1.55: +3 -5 lines
Diff to previous 1.55 (colored)

enable receive() accounting and use uio_procp for send() accounting, too
ok deraadt, jared, djm

Revision 1.55 / (download) - annotate - [select for diffs], Thu Sep 16 13:11:01 2004 UTC (19 years, 8 months ago) by markus
Branch: MAIN
Changes since 1.54: +6 -1 lines
Diff to previous 1.54 (colored)

add hint for lower layer that a sosend() is in progress (SS_ISSENDING)
inspired by a posting from David Borman and similar changes in net/freebsd
ok mcbride

Revision 1.54 / (download) - annotate - [select for diffs], Wed Jul 28 15:12:55 2004 UTC (19 years, 10 months ago) by millert
Branch: MAIN
CVS Tags: OPENBSD_3_6_BASE, OPENBSD_3_6
Changes since 1.53: +8 -1 lines
Diff to previous 1.53 (colored)

Call dom_dispose() for any SCM_RIGHTS message that went through the
read path rather than recv.  Previously, if an fd was passed via
sendmsg() but was consumed by the receiver via read() the ref count
was incremented and never decremented and so the ref count would
never reach zero even when there were no longer any processes holding
the file open (this was especially bad for locked fds).
OK markus@ and art@

Revision 1.27.4.8 / (download) - annotate - [select for diffs], Sat Jun 5 23:13:02 2004 UTC (20 years ago) by niklas
Branch: SMP
Changes since 1.27.4.7: +20 -12 lines
Diff to previous 1.27.4.7 (colored) to branchpoint 1.27 (colored) next main 1.28 (colored)

Merge with the trunk

Revision 1.53 / (download) - annotate - [select for diffs], Mon Apr 19 22:39:07 2004 UTC (20 years, 1 month ago) by deraadt
Branch: MAIN
CVS Tags: SMP_SYNC_B, SMP_SYNC_A
Changes since 1.52: +12 -4 lines
Diff to previous 1.52 (colored)

also use sbcheckreserve() for setsockopt of SO_SNDBUF and SO_RCVBUF

Revision 1.52 / (download) - annotate - [select for diffs], Thu Apr 1 23:56:05 2004 UTC (20 years, 2 months ago) by tedu
Branch: MAIN
Changes since 1.51: +10 -10 lines
Diff to previous 1.51 (colored)

use NULL for ptrs.  parts from Joris Vink

Revision 1.27.4.7 / (download) - annotate - [select for diffs], Thu Feb 19 10:56:38 2004 UTC (20 years, 3 months ago) by niklas
Branch: SMP
Changes since 1.27.4.6: +4 -4 lines
Diff to previous 1.27.4.6 (colored) to branchpoint 1.27 (colored)

Merge of current from two weeks ago into the SMP branch

Revision 1.51 / (download) - annotate - [select for diffs], Mon Jul 21 22:44:50 2003 UTC (20 years, 10 months ago) by tedu
Branch: MAIN
CVS Tags: OPENBSD_3_5_BASE, OPENBSD_3_5, OPENBSD_3_4_BASE, OPENBSD_3_4
Changes since 1.50: +4 -4 lines
Diff to previous 1.50 (colored)

remove caddr_t casts.  it's just silly to cast something when the function
takes a void *.  convert uiomove to take a void * as well.  ok deraadt@

Revision 1.27.4.6 / (download) - annotate - [select for diffs], Sat Jun 7 11:03:40 2003 UTC (21 years ago) by ho
Branch: SMP
Changes since 1.27.4.5: +2 -6 lines
Diff to previous 1.27.4.5 (colored) to branchpoint 1.27 (colored)

Sync SMP branch to -current

Revision 1.50 / (download) - annotate - [select for diffs], Mon Jun 2 23:28:07 2003 UTC (21 years ago) by millert
Branch: MAIN
Changes since 1.49: +2 -6 lines
Diff to previous 1.49 (colored)

Remove the advertising clause in the UCB license which Berkeley
rescinded 22 July 1999.  Proofed by myself and Theo.

Revision 1.39.2.4 / (download) - annotate - [select for diffs], Mon May 19 22:31:57 2003 UTC (21 years ago) by tedu
Branch: UBC
Changes since 1.39.2.3: +12 -10 lines
Diff to previous 1.39.2.3 (colored) to branchpoint 1.39 (colored) next main 1.40 (colored)

sync

Revision 1.27.4.5 / (download) - annotate - [select for diffs], Fri Mar 28 00:41:27 2003 UTC (21 years, 2 months ago) by niklas
Branch: SMP
Changes since 1.27.4.4: +87 -17 lines
Diff to previous 1.27.4.4 (colored) to branchpoint 1.27 (colored)

Sync the SMP branch with 3.3

Revision 1.49 / (download) - annotate - [select for diffs], Mon Feb 3 21:22:09 2003 UTC (21 years, 4 months ago) by deraadt
Branch: MAIN
CVS Tags: UBC_SYNC_A, OPENBSD_3_3_BASE, OPENBSD_3_3
Changes since 1.48: +9 -9 lines
Diff to previous 1.48 (colored)

knf

Revision 1.48 / (download) - annotate - [select for diffs], Wed Nov 27 19:39:15 2002 UTC (21 years, 6 months ago) by millert
Branch: MAIN
Changes since 1.47: +2 -2 lines
Diff to previous 1.47 (colored)

Avoid possible wraparound when checking timeout size; mickey@ OK

Revision 1.47 / (download) - annotate - [select for diffs], Wed Nov 27 13:31:09 2002 UTC (21 years, 6 months ago) by mickey
Branch: MAIN
Changes since 1.46: +3 -1 lines
Diff to previous 1.46 (colored)

fix an underflow in socket timeout calculations.
(see http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/32827).
itojun@ ok

Revision 1.39.2.3 / (download) - annotate - [select for diffs], Tue Oct 29 00:36:44 2002 UTC (21 years, 7 months ago) by art
Branch: UBC
Changes since 1.39.2.2: +75 -9 lines
Diff to previous 1.39.2.2 (colored) to branchpoint 1.39 (colored)

sync to -current

Revision 1.46 / (download) - annotate - [select for diffs], Thu Aug 8 19:18:12 2002 UTC (21 years, 10 months ago) by provos
Branch: MAIN
CVS Tags: UBC_SYNC_B, OPENBSD_3_2_BASE, OPENBSD_3_2
Changes since 1.45: +70 -4 lines
Diff to previous 1.45 (colored)

redo socketbuf speedup.

Revision 1.45 / (download) - annotate - [select for diffs], Thu Aug 8 18:26:37 2002 UTC (21 years, 10 months ago) by todd
Branch: MAIN
Changes since 1.44: +3 -69 lines
Diff to previous 1.44 (colored)

backout the tree break. ok pb@, art@

Revision 1.44 / (download) - annotate - [select for diffs], Thu Aug 8 17:07:32 2002 UTC (21 years, 10 months ago) by provos
Branch: MAIN
Changes since 1.43: +70 -4 lines
Diff to previous 1.43 (colored)

socket buf speedup from thorpej@netbsd, okay art@ ericj@:

Make insertion of data into socket buffers O(C):
* Keep pointers to the first and last mbufs of the last record in the
  socket buffer.
* Use the sb_lastrecord pointer in the sbappend*() family of functions
  to avoid traversing the packet chain to find the last record.
* Add a new sbappend_stream() function for stream protocols which
  guarantee that there will never be more than one record in the
  socket buffer.  This function uses the sb_mbtail pointer to perform
  the data insertion.  Make TCP use sbappend_stream().

On a profiling run, this makes sbappend of a TCP transmission using
a 1M socket buffer go from 50% of the time to .02% of the time. Thanks
to Bill Sommerfeld and YAMAMOTO Takashi for their debugging
assistance!
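
The core of the speedup is a cached tail pointer, which can be sketched outside the kernel as follows. The struct and function names here are simplified stand-ins invented for the example, not the real mbuf or sockbuf definitions.

```c
#include <stddef.h>

struct buf {			/* hypothetical stand-in for an mbuf */
	struct buf *next;
	int len;
};

struct queue {			/* hypothetical stand-in for a sockbuf */
	struct buf *head;
	struct buf *tail;	/* plays the role of sb_mbtail */
	int cc;			/* total bytes queued */
};

/*
 * Append `b` in constant time by following the cached tail pointer
 * instead of walking the chain from the head.
 */
void
queue_append(struct queue *q, struct buf *b)
{
	b->next = NULL;
	if (q->tail == NULL)
		q->head = b;
	else
		q->tail->next = b;
	q->tail = b;
	q->cc += b->len;
}
```

Without the tail pointer every append has to traverse the whole chain, which is why a large (1M) socket buffer made the cost so visible in the profile.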

Revision 1.43 / (download) - annotate - [select for diffs], Tue Jun 11 05:07:43 2002 UTC (22 years ago) by art
Branch: MAIN
Changes since 1.42: +6 -6 lines
Diff to previous 1.42 (colored)

splassert where necessary

Revision 1.39.2.2 / (download) - annotate - [select for diffs], Tue Jun 11 03:29:40 2002 UTC (22 years ago) by art
Branch: UBC
Changes since 1.39.2.1: +6 -2 lines
Diff to previous 1.39.2.1 (colored) to branchpoint 1.39 (colored)

Sync UBC branch to -current

Revision 1.42 / (download) - annotate - [select for diffs], Sat May 11 00:06:33 2002 UTC (22 years, 1 month ago) by deraadt
Branch: MAIN
Changes since 1.41: +3 -1 lines
Diff to previous 1.41 (colored)

track egid/rgid on bound/connected sockets too (pf will use this)

Revision 1.27.4.4 / (download) - annotate - [select for diffs], Wed Mar 6 02:13:23 2002 UTC (22 years, 3 months ago) by niklas
Branch: SMP
Changes since 1.27.4.3: +4 -3 lines
Diff to previous 1.27.4.3 (colored) to branchpoint 1.27 (colored)

Merge in trunk

Revision 1.41 / (download) - annotate - [select for diffs], Tue Feb 5 22:04:43 2002 UTC (22 years, 4 months ago) by nordin
Branch: MAIN
CVS Tags: OPENBSD_3_1_BASE, OPENBSD_3_1
Changes since 1.40: +4 -2 lines
Diff to previous 1.40 (colored)

Do range check on SO_LINGER, closes pr#2375. art@ ok

Revision 1.39.2.1 / (download) - annotate - [select for diffs], Thu Jan 31 22:55:41 2002 UTC (22 years, 4 months ago) by niklas
Branch: UBC
Changes since 1.39: +2 -3 lines
Diff to previous 1.39 (colored)

Merge in -current, builds on i386, otherwise untested

Revision 1.40 / (download) - annotate - [select for diffs], Wed Jan 23 00:39:48 2002 UTC (22 years, 4 months ago) by art
Branch: MAIN
Changes since 1.39: +2 -3 lines
Diff to previous 1.39 (colored)

Pool deals fairly well with physical memory shortage, but it doesn't deal
well (not at all) with shortages of the vm_map where the pages are mapped
(usually kmem_map).

Try to deal with it:
 - group all information the backend allocator for a pool in a separate
   struct. The pool will only have a pointer to that struct.
 - change the pool_init API to reflect that.
 - link all pools allocating from the same allocator on a linked list.
 - Since an allocator is responsible to wait for physical memory it will
   only fail (waitok) when it runs out of its backing vm_map, carefully
   drain pools using the same allocator so that va space is freed.
   (see comments in code for caveats and details).
 - change pool_reclaim to return if it actually succeeded to free some
   memory, use that information to make draining easier and more efficient.
 - get rid of PR_URGENT, no one uses it.

Revision 1.27.4.3 / (download) - annotate - [select for diffs], Wed Dec 5 01:02:39 2001 UTC (22 years, 6 months ago) by niklas
Branch: SMP
Changes since 1.27.4.2: +42 -22 lines
Diff to previous 1.27.4.2 (colored) to branchpoint 1.27 (colored)

Merge in -current

Revision 1.39 / (download) - annotate - [select for diffs], Wed Nov 28 17:18:00 2001 UTC (22 years, 6 months ago) by ericj
Branch: MAIN
CVS Tags: UBC_BASE
Branch point for: UBC
Changes since 1.38: +8 -12 lines
Diff to previous 1.38 (colored)


avoid possible infinite loop in sosend() on 64bit systems. - from netbsd
art@ ok

Revision 1.38 / (download) - annotate - [select for diffs], Tue Nov 27 22:53:19 2001 UTC (22 years, 6 months ago) by provos
Branch: MAIN
Changes since 1.37: +24 -6 lines
Diff to previous 1.37 (colored)

change socket allocation to pool allocator; from netbsd; okay niklas@

Revision 1.37 / (download) - annotate - [select for diffs], Tue Nov 27 17:55:39 2001 UTC (22 years, 6 months ago) by provos
Branch: MAIN
Changes since 1.36: +7 -3 lines
Diff to previous 1.36 (colored)

fix an error in sosend() that could make a transient error permanent.
verified with both netbsd and freebsd.

from netbsd:
Tue Jun  8 02:39:57 1999 UTC by thorpej

In sosend(), if so_error is set, clear it before returning the error to
the process (i.e. pre-Reno behavior).  The 4.4BSD behavior (introduced
in Reno) caused transient errors to stick incorrectly.

This is from PR #7640 (Havard Eidnes), cross-checked w/ FreeBSD, where
Bill Fenner committed the same fix (as described in a comment in the
Vat sources, by Van Jacobson).
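
The behavior described above amounts to a fetch-and-clear on the pending error. A minimal sketch, using a made-up structure rather than the kernel's struct socket:

```c
struct sock_state {		/* hypothetical; only the field we need */
	int so_error;		/* pending asynchronous error, 0 if none */
};

/*
 * Return the pending error and clear it, so that a transient error is
 * reported to the caller once instead of sticking to the socket.
 */
int
take_so_error(struct sock_state *so)
{
	int error = so->so_error;

	so->so_error = 0;
	return error;
}
```

A second call after the first returns 0, which is the difference from the sticky 4.4BSD behavior.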

Revision 1.36 / (download) - annotate - [select for diffs], Tue Nov 27 15:51:36 2001 UTC (22 years, 6 months ago) by provos
Branch: MAIN
Changes since 1.35: +6 -4 lines
Diff to previous 1.35 (colored)

change socket connection queues to use TAILQ_

from NetBSD:
Wed Jan  7 23:47:08 1998 UTC by thorpej

Make insertion and removal of sockets from the partial and incoming
connections queues O(C) rather than O(N).
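
As an illustration of why TAILQ_ gives constant-time insertion and removal, here is a small sketch using the <sys/queue.h> macros. The struct conn type and the helper names are invented for the example.

```c
#include <stddef.h>
#include <sys/queue.h>

struct conn {				/* hypothetical pending connection */
	int id;
	TAILQ_ENTRY(conn) link;		/* forward and backward linkage */
};
TAILQ_HEAD(connq, conn);

/* Add at the tail; the head structure tracks the last element. */
void
connq_enqueue(struct connq *q, struct conn *c)
{
	TAILQ_INSERT_TAIL(q, c, link);
}

/* Unlink an arbitrary element without walking the queue. */
void
connq_remove(struct connq *q, struct conn *c)
{
	TAILQ_REMOVE(q, c, link);
}
```

Each element carries its own back link, so removing a connection from the middle of the queue does not require a search from the head.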

Revision 1.27.4.2 / (download) - annotate - [select for diffs], Wed Jul 4 10:48:45 2001 UTC (22 years, 11 months ago) by niklas
Branch: SMP
Changes since 1.27.4.1: +70 -69 lines
Diff to previous 1.27.4.1 (colored) to branchpoint 1.27 (colored)

Merge in -current from two days ago in the SMP branch.
As usual with merges, they do not indicate progress, so do not hold
your breath for working SMP, and do not mail me and ask about the
state of it.  It has not changed.  There is work ongoing, but very, very
slowly.  The commit is done in parts as to not lock up the tree in too
big chunks at a time.

Revision 1.35 / (download) - annotate - [select for diffs], Fri Jun 22 14:14:09 2001 UTC (22 years, 11 months ago) by deraadt
Branch: MAIN
CVS Tags: OPENBSD_3_0_BASE, OPENBSD_3_0
Changes since 1.34: +68 -68 lines
Diff to previous 1.34 (colored)

KNF

Revision 1.34 / (download) - annotate - [select for diffs], Fri May 25 22:08:23 2001 UTC (23 years ago) by itojun
Branch: MAIN
Changes since 1.33: +3 -2 lines
Diff to previous 1.33 (colored)

recover old accept(2) behavior (no ECONNABORTED) for unix domain socket.
it is to be friendly with postfix daemon-to-daemon communication
(not 100% sure which behavior is correct, specwise).  patch similar to netbsd.

Revision 1.27.4.1 / (download) - annotate - [select for diffs], Mon May 14 22:32:45 2001 UTC (23 years, 1 month ago) by niklas
Branch: SMP
Changes since 1.27: +128 -1 lines
Diff to previous 1.27 (colored)

merge in approximately 2.9 into SMP branch

Revision 1.33 / (download) - annotate - [select for diffs], Tue Mar 6 19:42:43 2001 UTC (23 years, 3 months ago) by provos
Branch: MAIN
CVS Tags: OPENBSD_2_9_BASE, OPENBSD_2_9
Changes since 1.32: +6 -4 lines
Diff to previous 1.32 (colored)

different fix, we still need to deliver EV_EOF; from jlemon@freebsd.org

Revision 1.32 / (download) - annotate - [select for diffs], Tue Mar 6 17:06:23 2001 UTC (23 years, 3 months ago) by provos
Branch: MAIN
Changes since 1.31: +4 -1 lines
Diff to previous 1.31 (colored)

fix a kqueue related panic triggered by shutdown, okay art@

Revision 1.31 / (download) - annotate - [select for diffs], Thu Mar 1 20:54:34 2001 UTC (23 years, 3 months ago) by provos
Branch: MAIN
Changes since 1.30: +34 -26 lines
Diff to previous 1.30 (colored)

port kqueue changes from freebsd, plus all required openbsd glue.
okay deraadt@, millert@
from jlemon@freebsd.org:
extend kqueue down to the device layer, backwards compatible approach
suggested by peter@freebsd.org

Revision 1.30 / (download) - annotate - [select for diffs], Wed Feb 7 12:20:42 2001 UTC (23 years, 4 months ago) by itojun
Branch: MAIN
Changes since 1.29: +2 -2 lines
Diff to previous 1.29 (colored)

return ECONNABORTED, if the socket (tcp connection for example)
is disconnected by RST right before accept(2).  fixes NetBSD PR 10698/12027.
checked with SUSv2, XNET 5.2, and Stevens (unix network programming
vol 1 2nd ed) section 5.11.
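
On the application side, a server usually treats this error as non-fatal and simply tries again. A sketch of such a loop follows; the helper name accept_retry() is hypothetical and the retry policy is an assumption, not something this commit mandates.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Call accept(2) on the listening socket `lfd`, retrying when the
 * pending connection was aborted by the peer or the call was
 * interrupted.  Returns the new descriptor, or -1 on any other error.
 */
int
accept_retry(int lfd)
{
	int fd;

	do {
		fd = accept(lfd, NULL, NULL);
	} while (fd == -1 && (errno == ECONNABORTED || errno == EINTR));
	return fd;
}
```

Any other errno value is passed back to the caller unchanged.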

Revision 1.29 / (download) - annotate - [select for diffs], Tue Jan 23 02:18:55 2001 UTC (23 years, 4 months ago) by itojun
Branch: MAIN
Changes since 1.28: +3 -1 lines
Diff to previous 1.28 (colored)

when the peer is disconnected before accept(2) is issued,
do not return junk data in mbuf (= sockaddr on accept(2)'s 2nd arg).
set the length to zero.

behavior checked with bsdi and freebsd.
partial solution to NetBSD PR 12027 and 10698 (need more investigation).

Revision 1.28 / (download) - annotate - [select for diffs], Thu Nov 16 20:02:19 2000 UTC (23 years, 6 months ago) by provos
Branch: MAIN
Changes since 1.27: +113 -1 lines
Diff to previous 1.27 (colored)

support kernel event queues, from FreeBSD by Jonathan Lemon,
okay art@, millert@

Revision 1.27 / (download) - annotate - [select for diffs], Thu Oct 14 08:18:49 1999 UTC (24 years, 8 months ago) by cmetz
Branch: MAIN
CVS Tags: kame_19991208, SMP_BASE, OPENBSD_2_8_BASE, OPENBSD_2_8, OPENBSD_2_7_BASE, OPENBSD_2_7, OPENBSD_2_6_BASE, OPENBSD_2_6
Branch point for: SMP
Changes since 1.26: +16 -6 lines
Diff to previous 1.26 (colored)

Fix for PR 871.

This fix is taken from BSD/OS (the file in question being BSD licensed).

It continues to remove a datagram from a socket receive buffer even if there is
an error on the copy-out, so as to leave the buffer in a reasonable state.
Before, the kernel would stop in mid-receive if the copy-out failed, and the
buffer's structural requirements would be violated (since the start of a
datagram must be an address iff ).

Note that if the user provides any invalid addresses as arguments to a
recvmsg(), the datagram at the front of the buffer will be discarded. The more
correct behavior would be not to remove this datagram if the arguments are
invalid. Implementing this behavior requires a lot of significant changes, and
socket receives are a critical path.

Also included are two simple and fairly obvious fixes from the same source.
If non-blocking I/O is set, it makes sure the receive is non-blocking. It also
fixes a slightly over-aggressive optimization.

Revision 1.26 / (download) - annotate - [select for diffs], Fri Feb 19 15:06:52 1999 UTC (25 years, 3 months ago) by millert
Branch: MAIN
CVS Tags: OPENBSD_2_5_BASE, OPENBSD_2_5
Changes since 1.25: +21 -10 lines
Diff to previous 1.25 (colored)

fixed patch for accept/select race; mycroft@netbsd.org

Revision 1.25 / (download) - annotate - [select for diffs], Thu Feb 18 22:56:58 1999 UTC (25 years, 3 months ago) by deraadt
Branch: MAIN
Changes since 1.24: +13 -20 lines
Diff to previous 1.24 (colored)

undo select/accept patch, which causes full listen queues apparently

Revision 1.24 / (download) - annotate - [select for diffs], Fri Feb 5 00:40:22 1999 UTC (25 years, 4 months ago) by deraadt
Branch: MAIN
Changes since 1.23: +6 -3 lines
Diff to previous 1.23 (colored)

support MSG_BCAST and MSG_MCAST

Revision 1.23 / (download) - annotate - [select for diffs], Thu Jan 21 03:27:42 1999 UTC (25 years, 4 months ago) by millert
Branch: MAIN
Changes since 1.22: +20 -13 lines
Diff to previous 1.22 (colored)

Fixes select(2)/accept(2) race condition which permits DoS; mycroft@netbsd.org

Revision 1.22 / (download) - annotate - [select for diffs], Tue Jul 28 00:13:07 1998 UTC (25 years, 10 months ago) by millert
Branch: MAIN
CVS Tags: OPENBSD_2_4_BASE, OPENBSD_2_4
Changes since 1.21: +5 -4 lines
Diff to previous 1.21 (colored)

Return EINVAL when msg_iovlen or iovcnt <= 0; Make uio_resid unsigned (size_t) and don't return EINVAL if it is < 0 in sys_{read,write}.  Remove check for uio_resid < 0 uiomove() now that uio_resid is unsigned and bracket remaining panics with #ifdef DIAGNOSTIC.  vn_rdwr() must now take a size_t * as its 9th argument so change that and clean up uses of vn_rdwr().  Fixes 549 + more

Revision 1.21 / (download) - annotate - [select for diffs], Sat Feb 14 10:55:09 1998 UTC (26 years, 4 months ago) by deraadt
Branch: MAIN
CVS Tags: OPENBSD_2_3_BASE, OPENBSD_2_3
Changes since 1.20: +3 -2 lines
Diff to previous 1.20 (colored)

add separate so_euid & so_ruid to struct socket, so that identd is still fast.. Sigh. I will change this again later

Revision 1.20 / (download) - annotate - [select for diffs], Tue Jan 6 23:49:48 1998 UTC (26 years, 5 months ago) by deraadt
Branch: MAIN
Changes since 1.19: +3 -3 lines
Diff to previous 1.19 (colored)

so_linger is in seconds

Revision 1.19 / (download) - annotate - [select for diffs], Sat Nov 15 19:57:51 1997 UTC (26 years, 7 months ago) by deraadt
Branch: MAIN
Changes since 1.18: +3 -1 lines
Diff to previous 1.18 (colored)

for shutdown(2), if "how" is not 0-2, return EINVAL

Revision 1.18 / (download) - annotate - [select for diffs], Tue Nov 11 18:22:49 1997 UTC (26 years, 7 months ago) by deraadt
Branch: MAIN
Changes since 1.17: +4 -2 lines
Diff to previous 1.17 (colored)

MSG_EOR on SOCK_STREAM is invalid; wollman

Revision 1.17 / (download) - annotate - [select for diffs], Sun Aug 31 20:42:24 1997 UTC (26 years, 9 months ago) by deraadt
Branch: MAIN
CVS Tags: OPENBSD_2_2_BASE, OPENBSD_2_2
Changes since 1.16: +1 -6 lines
Diff to previous 1.16 (colored)

for non-tty TIOCSPGRP/F_SETOWN/FIOSETOWN pgid setting calls, store uid
and euid as well, then deliver them using new csignal() interface
which ensures that pgid setting process is permitted to signal the
pgid process(es). Thanks to newsham@aloha.net for extensive help and
discussion.

Revision 1.16 / (download) - annotate - [select for diffs], Sun Aug 31 06:29:35 1997 UTC (26 years, 9 months ago) by deraadt
Branch: MAIN
Changes since 1.15: +5 -3 lines
Diff to previous 1.15 (colored)

mbuf leak repair; mycroft@netbsd

Revision 1.15 / (download) - annotate - [select for diffs], Sun Jun 29 18:14:35 1997 UTC (26 years, 11 months ago) by deraadt
Branch: MAIN
Changes since 1.14: +5 -3 lines
Diff to previous 1.14 (colored)

constrain lowwater <= highwater

Revision 1.14 / (download) - annotate - [select for diffs], Mon Jun 23 01:42:04 1997 UTC (26 years, 11 months ago) by deraadt
Branch: MAIN
Changes since 1.13: +2 -2 lines
Diff to previous 1.13 (colored)

oops

Revision 1.13 / (download) - annotate - [select for diffs], Mon Jun 23 00:22:03 1997 UTC (26 years, 11 months ago) by deraadt
Branch: MAIN
Changes since 1.12: +11 -4 lines
Diff to previous 1.12 (colored)

for SO_SND*/SO_RCV*, clip low-end of parameter to 1

Revision 1.12 / (download) - annotate - [select for diffs], Fri Jun 6 11:12:13 1997 UTC (27 years ago) by deraadt
Branch: MAIN
Changes since 1.11: +2 -2 lines
Diff to previous 1.11 (colored)

SO_SNDTIMEO tv_usec calc error; stevens, vol2, p548

Revision 1.11 / (download) - annotate - [select for diffs], Fri Feb 28 04:03:45 1997 UTC (27 years, 3 months ago) by angelos
Branch: MAIN
CVS Tags: OPENBSD_2_1_BASE, OPENBSD_2_1
Changes since 1.10: +1 -11 lines
Diff to previous 1.10 (colored)

Moved IPsec socket state to the PCB.

Revision 1.10 / (download) - annotate - [select for diffs], Fri Feb 28 03:20:38 1997 UTC (27 years, 3 months ago) by angelos
Branch: MAIN
Changes since 1.9: +9 -4 lines
Diff to previous 1.9 (colored)

New variables for system-wide security default levels.

Revision 1.9 / (download) - annotate - [select for diffs], Fri Feb 28 02:56:50 1997 UTC (27 years, 3 months ago) by angelos
Branch: MAIN
Changes since 1.8: +7 -1 lines
Diff to previous 1.8 (colored)

IPsec socket API additions.

Revision 1.8 / (download) - annotate - [select for diffs], Mon Dec 16 14:30:17 1996 UTC (27 years, 6 months ago) by deraadt
Branch: MAIN
Changes since 1.7: +3 -1 lines
Diff to previous 1.7 (colored)

uiomove not checked for failure; wpaul@skynet.ctr.columbia.edu

Revision 1.7 / (download) - annotate - [select for diffs], Fri Sep 20 22:53:10 1996 UTC (27 years, 8 months ago) by deraadt
Branch: MAIN
CVS Tags: OPENBSD_2_0_BASE, OPENBSD_2_0
Changes since 1.6: +12 -4 lines
Diff to previous 1.6 (colored)

`solve' the syn bomb problem as well as currently known; add sysctl's for
SOMAXCONN (kern.somaxconn), SOMINCONN (kern.sominconn), and TCPTV_KEEP_INIT
(net.inet.tcp.keepinittime). when this is not enough (ie. overfull), start
doing tail drop, but slightly prefer the same port.

Revision 1.6 / (download) - annotate - [select for diffs], Sat Aug 24 04:56:36 1996 UTC (27 years, 9 months ago) by deraadt
Branch: MAIN
Changes since 1.5: +2 -4 lines
Diff to previous 1.5 (colored)

change to so_uid, also fix a missing credential found by dm

Revision 1.5 / (download) - annotate - [select for diffs], Wed Aug 14 07:26:21 1996 UTC (27 years, 10 months ago) by deraadt
Branch: MAIN
Changes since 1.4: +2 -2 lines
Diff to previous 1.4 (colored)

incorrect size calculation in mbuf copying, netbsd pr#2692; fix from explorer@flame.org

Revision 1.4 / (download) - annotate - [select for diffs], Mon Aug 5 01:00:53 1996 UTC (27 years, 10 months ago) by deraadt
Branch: MAIN
Changes since 1.3: +4 -1 lines
Diff to previous 1.3 (colored)

struct socket gets so_ucred; permit only same uid or root to do port takeover.

Revision 1.3 / (download) - annotate - [select for diffs], Sun Mar 3 17:20:19 1996 UTC (28 years, 3 months ago) by niklas
Branch: MAIN
Changes since 1.2: +38 -43 lines
Diff to previous 1.2 (colored)

From NetBSD: 960217 merge

Revision 1.2 / (download) - annotate - [select for diffs], Sun Mar 3 04:44:06 1996 UTC (28 years, 3 months ago) by mickey
Branch: MAIN
Changes since 1.1: +1 -1 lines
Diff to previous 1.1 (colored)

from NetBSD: so it compiles now again ;)

Revision 1.1.1.1 / (download) - annotate - [select for diffs] (vendor branch), Wed Oct 18 08:52:47 1995 UTC (28 years, 8 months ago) by deraadt
CVS Tags: netbsd_1_1
Changes since 1.1: +0 -0 lines
Diff to previous 1.1 (colored)

initial import of NetBSD tree

Revision 1.1 / (download) - annotate - [select for diffs], Wed Oct 18 08:52:47 1995 UTC (28 years, 8 months ago) by deraadt
Branch: MAIN

Initial revision
