- 28 Jan, 2008 40 commits
-
-
Pavel Emelyanov authored
They are all collected in the net/ipv4/devinet.c file and mostly use the IPV4_DEVCONF_DFLT macro. So I add the net parameter to it and patch users accordingly. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
This is the core. Add all and default pointers on the netns_ipv4 and register a new pernet subsys to initialize them. Also add the ctl_table_header to register the net.ipv4.ip_forward ctl. I don't allocate additional memory for init_net, but use global devinets. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
Some handers and strategies of devinet sysctl tables need to know the net to propagate the ctl change to all the net devices. I use the (currently unused) extra2 pointer on the tables to get it. Holding the reference on the struct net is not possible, because otherwise we'll get a net->ctl_table->net circular dependency. But since the ctl tables are unregistered during the net destruction, this is safe to get it w/o additional protection. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
This one will need to set the IPV4_DEVCONF_ALL(PROXY_ARP), but there's no ways to get the net right in place, so we have to pull one from the inet_ioctl's struct sock. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
Currently, this function is void, so failures in creating sysctls for new/renamed devices are not reported to anywhere. Fixing this is another complex (needed?) task, but this return value is needed during the namespaces creation to handle the case, when we failed to create "all" and "default" entries. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
The ipv4 will store its parameters inside this structure. This one is empty now, but it will be eventually filled. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Pavel Emelyanov authored
The _find calls calculate the hash value using the xfrm_state_hmask, without the xfrm_state_lock. But the value of this mask can change in the _resize call under the state_lock, so we risk to fail in finding the desired entry in hash. I think, that the hash value is better to calculate under the state lock. Signed-off-by: Pavel Emelyanov <xemul@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Matthias Kaehlcke authored
PPP synchronous tty channel driver: convert the semaphore dead_sem to a completion Signed-off-by: Matthias Kaehlcke <matthias.kaehlcke@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
Now that external users may increment the counters directly, we need to ensure that udp_stats_in6 is always available. Otherwise we'd either have to requrie the external users to be built as modules or ipv6 to be built-in. This isn't too bad because udp_stats_in6 is just a pair of pointers plus an EXPORT, e.g., just 40 (16 + 24) bytes on x86-64. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
YOSHIFUJI Hideaki authored
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This introduces a CCMPS field for setting a CCID-specific upper bound on the application payload size, as is defined in RFC 4340, section 14. Only the TX CCID is considered in setting this limit, since the RX CCID generates comparatively small (DCCP-Ack) feedback packets. The CCMPS field includes network and transport layer header lengths. The only current CCMPS customer is CCID4 (via RFC 4828). A wrapper is used to allow querying the CCMPS even at times where the CCID modules may not have been fully negotiated yet. In dccp_sync_mss() the variable `mss_now' has been renamed into `cur_mps', to reflect that we are dealing with an MPS, but not an MSS. Since the DCCP code closely follows the TCP code, the identifiers `dccp_sync_mss' and `dccps_mss_cache' have been kept, as they have direct TCP counterparts. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
The patch makes the registration messages of CCID 2/3 a bit more informative: instead of repeating the CCID number as currently done, "CCID: Registered CCID 2 (ccid2)" or "CCID: Registered CCID 3 (ccid3)", the descriptive names of the CCID's (from RFCs) are now used: "CCID: Registered CCID 2 (TCP-like)" and "CCID: Registered CCID 3 (TCP-Friendly Rate Control)". To allow spaces in the name, the slab name string has been changed to refer to the numeric CCID identifier, using the same format as before. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This adds documentation for the ccid_operations structure. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Denis V. Lunev authored
There are several thresholds for trie fib hash management. They are used in the code as a constants. Make them constants from the compiler point of view. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Schmidt authored
This is similar to the change already done for IPIP tunnels. Once created, a SIT tunnel can't be bound to another device. To reproduce: # create a tunnel: ip tunnel add tunneltest0 mode sit remote 10.0.0.1 dev eth0 # try to change the bounding device from eth0 to eth1: ip tunnel change tunneltest0 dev eth1 # show the result: ip tunnel show tunneltest0 tunneltest0: ipv6/ip remote 10.0.0.1 local any dev eth0 ttl inherit Notice the bound device has not changed from eth0 to eth1. This patch fixes it. When changing the binding, it also recalculates the MTU according to the new bound device's MTU. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Michal Schmidt authored
This is similar to the change already done for IPIP tunnels. Once created, a GRE tunnel can't be bound to another device. To reproduce: # create a tunnel: ip tunnel add tunneltest0 mode gre remote 10.0.0.1 dev eth0 # try to change the bounding device from eth0 to eth1: ip tunnel change tunneltest0 dev eth1 # show the result: ip tunnel show tunneltest0 tunneltest0: gre/ip remote 10.0.0.1 local any dev eth0 ttl inherit Notice the bound device has not changed from eth0 to eth1. This patch fixes it. When changing the binding, it also recalculates the MTU according to the new bound device's MTU. Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Denis V. Lunev authored
This makes the code in the inet6_rt_notify more straightforward and provides groud for namespace passing. Signed-off-by: Denis V. Lunev <den@openvz.org> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
Further testing shows that my ICMP relookup patch can cause xfrm_lookup to return zero on error which isn't very nice since it leads to the caller dying on null pointer dereference. The bug is due to not setting err to ENOENT just before we leave xfrm_lookup in case of no policy. This patch moves the err setting to where it should be. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This implements [RFC 4340, p. 32]: "any feature negotiation options received on DCCP-Data packets MUST be ignored". Also added a FIXME for further processing, since the code currently (wrongly) classifies empty Confirm options as invalid - this needs to be resolved in a separate patch. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This removes several `XXX' references which indicate a missing support for non-1-byte feature values: this is unnecessary, as all currently known (standardised) SP feature values are 1-byte quantities. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This removes two inlines which were both called in a single function only: 1) dccp_feat_change() is always called with either DCCPO_CHANGE_L or DCCPO_CHANGE_R as argument * from dccp_set_socktopt_change() via do_dccp_setsockopt() with DCCP_SOCKOPT_CHANGE_R/L * from __dccp_feat_init() via dccp_feat_init() also with DCCP_SOCKOPT_CHANGE_R/L. Hence the dccp_feat_is_valid_type() is completely unnecessary and always returns true. 2) Due to (1), the length test reduces to 'len >= 4', which in turn makes dccp_feat_is_valid_length() unnecessary. Furthermore, the inline function dccp_feat_is_reserved() was unfolded, since only called in a single place. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This provides a separate routine to insert options during the initial handshake. The main purpose is to conduct feature negotiation, for the moment the only user is the timestamp echo needed for the (CCID3) handshake RTT sample. Padding of options has been put into a small separate routine, to be shared among the two functions. This could also be used as a generic routine to finish inserting options. Also removed an `XXX' comment since its content was obvious. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
In DCCP, timestamps can occur on packets anytime, CCID3 uses a timestamp(/echo) on the Request/Response exchange. This patch addresses the following situation: * timestamps are recorded on the listening socket; * Responses are sent from dccp_request_sockets; * suppose two connections reach the listening socket with very small time in between: * the first timestamp value gets overwritten by the second connection request. This is not really good, so this patch separates timestamps into * those which are received by the server during the initial handshake (on dccp_request_sock); * those which are received by the client or the client after connection establishment. As before, a timestamp of 0 is regarded as indicating that no (meaningful) timestamp has been received (in addition, a warning message is printed if hosts send 0-valued timestamps). The timestamp-echoing now works as follows: * when a timestamp is present on the initial Request, it is placed into dreq, due to the call to dccp_parse_options in dccp_v{4,6}_conn_request; * when a timestamp is present on the Ack leading from RESPOND => OPEN, it is copied over from the request_sock into the child cocket in dccp_create_openreq_child; * timestamps received on an (established) dccp_sock are treated as before. Since Elapsed Time is measured in hundredths of milliseconds (13.2), the new dccp_timestamp() function is used, as it is expected that the time between receiving the timestamp and sending the timestamp echo will be very small against the wrap-around time. As a byproduct, this allows smaller timestamping-time fields. Furthermore, inserting the Timestamp Echo option has been taken out of the block starting with '!dccp_packet_without_ack()', since Timestamp Echo can be carried on any packet (5.8 and 13.3). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Acked-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This adds option-parsing code to processing of Acks in the listening state on request_socks on the server, serving two purposes (i) resolves a FIXME (removed); (ii) paves the way for feature-negotiation during connection-setup. There is an intended subtlety here with regard to dccp_check_req: Parsing options happens only after testing whether the received packet is a retransmitted Request. Otherwise, if the Request contained (a possibly large number of) feature-negotiation options, recomputing state would have to happen each time a retransmitted Request arrives, which opens the door to an easy DoS attack. Since in a genuine retransmission the options should not be different from the original, reusing the already computed state seems better. The other point is - if there are timestamp options on the Request, they will not be answered; which means that in the presence of retransmission (likely due to loss and/or other problems), the use of Request/Response RTT sampling is suspended, so that startup problems here do not propagate. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
The option parsing code currently only parses on full sk's. This causes a problem for options sent during the initial handshake (in particular timestamps and feature-negotiation options). Therefore, this patch extends the option parsing code with an additional argument for request_socks: if it is non-NULL, options are parsed on the request socket, otherwise the normal path (parsing on the sk) is used. Subsequent patches, which implement feature negotiation during connection setup, make use of this facility. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This replaces 4 individual assignments for `len' with a single one, placed where the control flow of those 4 leads to. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This adds a socket option and signalling support for the case where the server holds timewait state on closing the connection, as described in RFC 4340, 8.3. Since holding timewait state at the server is the non-usual case, it is enabled via a socket option. Documentation for this socket option has been added. The setsockopt statement has been made resilient against different possible cases of expressing boolean `true' values using a suggestion by Ian McDonald. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This removes another Fixme, using the TCP maximum RTO rather than the value specified by the DCCP specification. Across the sections in RFC 4340, 64 seconds is consistently suggested as maximum RTO backoff value; and this is the value which is now used. I have checked both termination cases for retransmissions of Close/CloseReq: with the default value 15 of `retries2', and an initial icsk_retransmit = 0, it takes about 614 seconds to declare a non-responding peer as dead, after which the final terminating Reset is sent. With the TCP maximum RTO value of 120 seconds it takes (as might be expected) almost twice as long, about 23 minutes. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
When performing active close, RFC 4340, 8.3. requires to retransmit the Close/CloseReq with a backoff-retransmit timer starting at intially 2 RTTs. This patch shifts the existing code for active-close retransmit timer into output.c, so that the retransmit timer is started when the first Close/CloseReq is sent. Previously, the timer was started when, after releasing the socket in dccp_close(), the actively-closing side had not yet reached the CLOSED/TIMEWAIT state. The patch further reduces the initial timeout from 3 seconds to the required 2 RTTs, where - in absence of a known RTT - the fallback value specified in RFC 4340, 3.4 is used. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Daniel Lezcano authored
Removed useless and buggy __exit section in the different ipv6 subsystems. Otherwise they will be called inside an init section during rollbacking in case of an error in the protocol initialization. Signed-off-by: Daniel Lezcano <dlezcano@fr.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Gerrit Renker authored
This patch performs two changes: 1) Close the write-end in addition to the read-end when a fin-like segment (Close or CloseReq) is received by DCCP. This accounts for the fact that DCCP, in contrast to TCP, does not have a half-close. RFC 4340 says in this respect that when a fin-like segment has been sent there is no guarantee at all that any further data will be processed. Thus this patch performs SHUT_WR in addition to the SHUT_RD when a fin-like segment is encountered. 2) Minor change: I noted that code appears twice in different places and think it makes sense to put this into a self-contained function (dccp_enqueue()). Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Ian McDonald <ian.mcdonald@jandi.co.nz> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
My previous patch made the wait flag take the opposite value to what it should be. This patch fixes that. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-
Herbert Xu authored
The caller must hold the RTNL so let's check it in unregister_netdevice. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>
-