- 27 Apr, 2009 40 commits
-
-
Eric Moore authored
Bug fix in the broadcast primative async event code where the driver would stop sending tm queries after the first queury was completed. This was due driver not reseting the tm_cmds.status field back to MPT2_CMD_NOT_USED after completing a task management request. An addtional fix adding sanity check to insure sas_device->starget set to NULL. During multipath testing fail over/fail back, the mid layer was holding onto sdev longer than the fail back period, thus starget was getting set to NULL for device being added. Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Eric Moore authored
The Dell branding along with the VID, DID, SSVID, SSDID following the LSI branding that contains the card firmware/chip/bios versions. If the SSDID is not known but it is a Dell HBA, the driver will print the SSDID instead of the Dell branding string. Nothing will be printed for non Dell HBAs Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Eric Moore authored
The driver name needs to be at the beginining of the driver_version string in MPT2IOCINFO ioctl. This is the same behaviour is there already in the mptsas driver. Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Eric Moore authored
The driver is not freeing message frame when returning failure from _ctl_do_task_abort. If you call this function 500 times when its unable to find an active task mid, you end up with no message frames. Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Eric Moore authored
There is a bug in firmware where the reply message frame says there is a 16kb sense buffer, when in reality its only 20 bytes. This fix insures the memcpy action doesn't corrupte the memory beyond the 90 bytes allocated in the scsi command for sense buffer. Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Eric Moore authored
The poison sanity check on the reply_post_free register needs to be by 32bit, not 64bit. The poison check is there because its possible that the driver read the 1st 32bit before the 2nd 32bit has been written to by firmware. In other words, this handles race between driver reading the 64 bit register, and it being dma'd across pci memory from controller firmware as two 32bit pci writes. Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Eric Moore authored
The current magic number is shared with mptsas driver. This to be unique to fix issues with register_ioctls32_conversion in older kernels. We are making this change across all versions of the sas2.0 drivers. Signed-off-by: Eric Moore <eric.moore@lsi.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Dave Hansen authored
Shifting an unsigned char implicitly casts it to a signed int. This caused 'lba' to sign-extend and Linux would then try READ CAPACITY 16 which was not supported by at least one drive. Using the get_unaligned_be*() helpers keeps us from having to worry about how the extension might occur. Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com> Reviewed-by: Matthew Wilcox <willy@linux.intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Brian King authored
The ata_sas_slave_configure was changed such that it now allocates some memory for a drain buffer for ATAPI devices. Fixup the ipr driver such that we no longer make this call with interrupts disabled. Signed-off-by: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Joe Eykholt authored
The FCoE forwarder (FCF) would be selected, but then would soon time out after three advertisements were missed. This would be 24 seconds by default, or 3 times the keep-alive interval configured on the switch. The cause was that the multicast address for all FIP E-nodes was never added. Signed-off-by: Joe Eykholt <jeykholt@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Robert Love authored
When building with a .config generated from 'make allmodconfig' some build warnings are generated. This patch corrects the warnings, adds a FC_FID_NONE (= 0) enumeration for FC-IDs and cleans up one variable naming to meet our variable naming conventions. For example, fc_lport's should be named "lport," not "lp." Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Chris Leech authored
Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Chris Leech authored
These probably never should have been exported. If they were needed outside of the fcoe module, they would have been moved to libfcoe. Signed-off-by: Chris Leech <christopher.leech@intel.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Dan Carpenter authored
sk_buff pointers should use kfree_skb() instead of vanilla kfree(). Found by smatch (http://repo.or.cz/w/smatch.git). Signed-off-by: Dan Carpenter <error27@gmail.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Abhijeet Joglekar authored
When a delete event is queued for an rport, set state to NONE so that no other processing is done on the rport as it is being removed. Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Abhijeet Joglekar authored
After lport_destroy, the local port should not be used again. Transition to state NONE, any incoming frames or link up should not transition out of this state since we are deleting exchange table and cleaning up the local port. Also, mark link as down. Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Abhijeet Joglekar authored
We want to generate the rport queue event (from the logoff) before flushing the queue otherwise the event may still be in the queue when we logoff. Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Abhijeet Joglekar authored
Rogue ports are currently not tracked on any list. The only reference to them is through any outstanding exchanges pending on the rogue ports. If the module is removed while a retry is set on a rogue port (say a Plogi retry for instance), this retry is not cancelled because there is no reference to the rogue port in the discovery rports list. Thus the local port can clean itself up, delete the exchange pool, and then the rogue port timeout can fire and try to start up another exchange. This patch tracks the rogue ports in a new list disc->rogue_rports. Creating a new list instead of using the disc->rports list keeps remote port code change to a minimum. 1) Whenever a rogue port is created, it is immediately added to the disc->rogue_rports list. 2) When the rogues port goes to ready, it is removed from the rogue list and the real remote port is added to the disc->rports list 3) The removal of the rogue from the disc->rogue_rports list is done in the context of the fc_rport_work() workQ thread in discovery callback. 4) Real rports are removed from the disc->rports list like before. Lookup is done only in the real rports list. This avoids making large changes to the remote port code. 5) In fc_disc_stop_rports, the rogues list is traversed in addition to the real list to stop the rogue ports and issue logoffs on them. This way, rogue ports get cleaned up when the local port goes away. 6) rogue remote ports are not removed from the list right away, but removed late in fc_rport_work() context, multiple threads can find the same remote port in the list and call rport_logoff(). Rport_logoff() only continues with the logoff if port is not in NONE state, thus preventing multiple logoffs and multiple list deletions. 7) Since the rport is removed from the disc list at a later stage (in the disc callback), incoming frames can find the rport even if rport_logoff() has been called on the rport. When rport_logoff() is called, the rport state is set to NONE, and we are trying to cancel all exchanges and retries on that port. While in this state, if an incoming Plogi/Prli/Logo or other frames match the rport, we should not reply because the rport is in the NONE state. Just drop the frame, since the rport will be deleted soon in the disc callback (fc_rport_work) 8) In fc_disc_single(), remove rport lookup and call to fc_disc_del_target. fc_disc_single() is called from recv_rscn_req() where rport lookup and rport_logoff is already done. Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Abhijeet Joglekar authored
For instance, if there is a Plogi pending (remote port is in Plogi state), and the state changes to say NONE (because the port is being logged off), then when the Plogi resp times out, do not start a retry. This patch partially reverts an earlier patch (libfc: check for err when recv and state is incorrect), by moving the state check back to before checking for error. However, if the state does not match, then there is an additional check to see if its an error ptr or a real frame before jumping to err or out respectively. Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Abhijeet Joglekar authored
gpn_ft_resp processing currently does not hold the discovery lock. disc_done() thus gets called from gpn_ft_resp or from gpn_ft_parse without the lock held. This then sets disc->pending to zero or calls gpn_ft_req() without disc_lock held. - Hold disc mutex during gpn_ft resp processing - In disc_done, release the disc mutex while calling lport callback Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com> Signed-off-by: Robert Love <robert.w.love@intel.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
kxie@chelsio.com authored
(version 2) Fixed a bug in calculating ddp map range when search for free entries: it was going beyond the end by one, thus corrupting gl_skb[0]. Signed-off-by: Karen Xie <kxie@chelsio.com> Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
When doing a lot (128) of large writes (256K) we can hit the cxgb3_snd_win check pretty easily. The driver's xmit thread then takes 100% of the cpu. The driver should not be returning -EAGAIN for this problem. It should be returing -ENOBUFS, then when the window is opened again it should queue the xmit thread (it already wakes the xmit thread). Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
cxgb3i was setting can_queue to only 128 commands, and was setting the can_queue and cmd_per_lun to the same value. This sets the can_queue to 1024 commands, and sets the cmd_per_lun to a safer default of 32. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
Set target can queue limit to the number of preallocated session tasks we have. This along with the cxgb3i can_queue patch will fix a throughput problem where it could only queue one LU worth of data at a time. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
If a command's scsi cmd pdu setup fails then we can just fail the IO to the scsi layer. If a DATA_OUT for a R2T fails then we will want to drop the session, because it means we got a bad request from the target (iscsi protocol error). This patch has us propogate the error upwards so libiscsi_tcp or libiscsi can decide what the best action is to take. It also fixes a bug where we could try to grab the session lock while holding it, because if iscsi_tcp drops the session in the pdu setup callout the session lock is held when setting up the scsi cmd pdu. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Christof Schmitt authored
The zfcp_port might have been removed, while the FC fast_io_fail timer is still running and could trigger the terminate_rport_io callback. Set the pointer to the zfcp_port to NULL and check accordingly before using it. Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Christof Schmitt authored
Before dropping the reference count with zfcp_adapter_put, increase it with zfcp_adapter_get when issuing cfdc requests. Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin Petermann authored
If this problem appears zfcp ports cannot be de-queued since it is checked for a zero refcount. The port reference counting is wrong for existing zfcp ports when e.g. an adapter gets on-line again. During port scanning the reference counting for existing ports should not be changed. Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin Petermann authored
The current sbal counting can be wrong if a fsf request is waiting for free sbals and at the same time qdio request queue is shutdown and re-opened. Revering a previous patch fixes this issue. Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Christof Schmitt authored
When the abort handler cannot find a pending FSF request, the request completion could just be running. This means we cannot return SUCCESS, since this would lead to call to scsi_done after exiting the SCSI error handler which is not allowed. Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Swen Schillig authored
A remote port remains in error state even if we receive a RSCN stating that the connection is re-established. The port recovery is not started due to a flag which is not reset. The solution is to clear the flag in question before we trigger a ERP. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Christof Schmitt authored
Error codes specific to the control file requests are evaluated by the actcli tool, so don't report -ENXIO for those. Generic problems are still checked for outside the command specific handler. Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin Petermann authored
On some hardware it can take some time to add a unit. If some remove this unit during this process the remove will fail. Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Swen Schillig authored
The remote port remains in error state even if the connection is re-established. A wrong precondition check was performed on the port status leading to a cancellation of the port reopen. Remove the pre-req check because it's not required and better handled within the ERP. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Swen Schillig authored
The ERP thread is performing a task before it is executing the corresponding down on the semaphore. The response handler of the just started exchange config should wait for the completion by performing a down on this semaphore. Since this semaphore is still positive from the ERP enqueue the handler won't wait and therefore the exchange config will always fail leaving the adapter in error. The problem can be solved by performing the down on the semaphore before starting an ERP task. This is the logically correct order. Only walk the ERP loop if there is a task to perform. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Swen Schillig authored
The nameserver port might be in state online when the adapter is offlined. On adapter reactivation the nameserver port is not re-opened due to the PORT_ONLINE status. This results in an unsuccessful recovery. In forcing the nameserver port status to offline on all adapter offline events this issue is prevented. Waiting for the reference count to drop to zero in zfcp_wka_port_offline is not required, so remove it. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Swen Schillig authored
When running the scsi_scan from the zfcp workqueue and the target device does not respond, the zfcp workqueue can block until the scsi_scan hits a timeout. Move the work to the scsi host workqueue, since this one is also used for the scan from the SCSI midlayer. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Christof Schmitt authored
Fix problem that zfcp_fsf_exchange_config_data_sync and zfcp_fsf_exchange_config_data_sync could try to call zfcp_fsf_req_free with a NULL pointer. Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Swen Schillig authored
Since we're setting the host port type now to FC_PORTTYPE_NPIV for adapters running in NPIV mode we should allow this port type for auto-port scanning as well. Signed-off-by: Swen Schillig <swen@vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin Petermann authored
Avoid referencing a fsf request after sending it in fcp_fsf_req_send, it might have already completed and deallocated. Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com> Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-