- 15 Oct, 2008 1 commit
-
-
James Bottomley authored
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
- 13 Oct, 2008 38 commits
-
-
Mike Christie authored
We must be using the bh spin locking functions in iscsi_eh_device_reset becuase the session lock interacts with a thread and softirq. This patch also fixes up a bogus comment and check in fail_command, because no one drops the lock (bnx2i did but it is not going upstream yet and there were other refcount changes for that). Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
Some wires got crossed on some patches and I messed up in the code below when rebuilding a patch. We want to be checking if flag equaled the value indicating if we killing the session due to final logout or if we just trying to relogin. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
The segment->done functions return a iscsi error value which gives a lot more info than conn failed, so this patch has us return that value. I also add a new one for xmit failures. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
I had this in my patchset to add target reset support, but it got dropped due to patching conflicts. This initial patch just renames the function and users. We are actually just dropping the session, and so this does not have anything to do with the host exactly. It does for software iscsi because we allocate a host per session, but for cxgb3i this makes no sense. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
Some endpoint code was using unsigned int and some was using uint64_t. This converts it all to uint64_t. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
If the driver knows when hardware is removed like with cxgb3i, bnx2i, qla4xxx and iser then we will want to remove the sessions/devices that are bound to that device before removing the host. cxgb3i and in the future bnx2i will remove the host and that will remove all the sessions on the hba. iser can call iscsi_kill_session when it gets an event that indicates that a hca is removed. And when qla4xxx is hooked in to the lib (it is only hooked into the class right now) it can call iscsi remove host like the partial offload card drivers. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
iscsi_tcp was updating the exp_statsn (exp_statsn acknowledges status and tells the target it is ok to let the resources for a iscsi pdu to be reused) before it got all the data for pdu read into OS buffers. Data corruption was occuring if something happens to a packet and the network layer requests a retransmit, and the initiator has told the target about the udpated exp_statsn ack, then the target may be sending data from a buffer it has reused for a new iscsi pdu. This fixes the problem by having the LLD (iscsi_tcp in this case) just handle the transferring of data, and has libiscsi handle the processing of status (libiscsi completion processing is done after LLD data transfers are complete). Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin K. Petersen authored
For some reason these messages ended up being printed with KERN_INFO rendering them invisible to pretty much everyone. Switch to KERN_NOTICE. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin K. Petersen authored
The old detection code couldn't handle all possible combinations of DIX and DIF. This version does, giving priority to DIX if the controller is capable. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin K. Petersen authored
Now that we no longer use protection_type as trigger for preparing protected CDBs we can remove the places that set it to zero. This allows userland to see which protection type the device is formatted with regardless of whether the HBA supports DIF or not. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Martin K. Petersen authored
Use the same logic to prepare RD/WRPROTECT and the protection operation. Fixes a corner case where we could issue an unprotected CDB and yet tell the HBA to do DIF to the drive. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Bottomley authored
There's a target reset bug. This loop: for (id = 0; id <= shost->max_id; id++) { Never terminates if shost->max_id is set to ~0, like aic94xx does. It's also pretty inefficient since you mostly have compact target numbers, but the max_id can be very high. The best way would be to sort the recovery list by target id and skip them if they're equal, but even a worst case O(N^2) traversal is probably OK here, so fix it by finding the next highest target number (assuming n+1) and terminating when there isn't one. Cc: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Added support for new sysfs attributes: lpfc_stat_data_ctrl and lpfc_max_scsicmpl_time. The attributes control statistical reporting of io load. Added support for new fc vendor events for error reporting. Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Added new sysfs attribute lpfc_max_scsicmpl_time. Attribute, when enabled, will control target queue depth based on I/O completion time. Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Revert the target busy response in favor of the transport disrupted response for node state transitions. Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
We do not need to set REQ_NOMERGE because when the module calls blk_execute_rq -> blk_execute_rq_nowait, blk_execute_rq_nowait sets it for us. This brings all the modules in sync for those bits. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Add support for MSI-X Multi-Message interrupts. We use different vectors for fast-path interrupts (i/o) and slow-patch interrupts (discovery, etc). Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
[jejb: drop rejecting hunk altered by target busy patches] Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Add support for PCI-EEH permanent-disabling a device via lpfc_pci_remove_one() Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Miscellaneous Fixes: - Fix the wrong variable name used for checking node active usage status - Fix numerous duplicate log message numbers - Fix change KERN_WARNING messages to KERN_INFO. - Stop sending erroneous LOGO to fabric after vport is already terminated - Fix HBQ allocates that were kalloc'ing w/ GFP_KERNEL while holding a lock. - Fix gcc 4.3.2 compiler warnings and a sparse warning - Fix bugs in handling unsolicited ct event queue - Reorder some of the initial link up checks, to remove odd VPI states. - Correct poor VPI handling - Add debug messages - Expand Update_CFG mailbox definition - Fix handling of VPD data offsets - Reorder loopback flags - convert to use offsetof() Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Update driver for new SLI-3 features: - interrupt enhancements - lose adapter doorbell writes - inlining support for FCP_Ixx cmds Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Miscellaneous Discovery fixes: - Fix rejection followed by acceptance in handling RPL and RPS unsolicited events - Fix for vport delete crash - Fix PLOGI vs ADISC race condition Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
James Smart authored
Signed-off-by: James Smart <james.smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
This checks the errors the scsi-ml determined were retryable and returns if we should fast fail it based on the request fail fast flags. Without the patch, drivers like lpfc, qla2xxx and fcoe would return DID_ERROR for what it determines is a temporary communication problem. There is no loss of connectivity at that time and the driver thinks that it would be fast to retry at the driver level. SCSI-ml will however sees fast fail on the request and DID_ERROR and will fast fail the io. This will then cause dm-multipath to fail the path and possibley switch target controllers when we should be retrying at the scsi layer. We also were fast failing device errors to dm multiapth when unless the scsi_dh modules think otherwis we want to retry at the scsi layer because multipath can only retry the IO like scsi should have done. multipath is a little dumber though because it does not what the error was for and assumes that it should fail the paths. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
Multipath is best at handling transport errors. If it gets a device error then there is not much the multipath layer can do. It will just access the same device but from a different path. This patch breaks up failfast into device, transport and driver errors. The multipath layers (md and dm mutlipath) only ask the lower levels to fast fail transport errors. The user of failfast, read ahead, will ask to fast fail on all errors. Note that blk_noretry_request will return true if any failfast bit is set. This allows drivers that do not support the multipath failfast bits to continue to fail on any failfast error like before. Drivers like scsi that are able to fail fast specific errors can check for the specific fail fast type. In the next patch I will convert scsi. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Cc: Jens Axboe <jens.axboe@oracle.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
This has qla2xxx use the new transport error values instead of DID_BUS_BUSY. I am not sure if all the errors in qla_isr.c I changed are transport related. We end up blocking/deleting the rport for all of them so it is better to use the new transport error since the fc classs will decide when to fail the IO. With this patch if I pull a cable then IO that had reached the driver, will be failed with DID_TRANSPORT_DISRUPTED (not including tape). The fc class will then fail the IO when the fast io fail tmo has fired, and the driver will flush any other commands running. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Cc: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
If the target is blocked and fast io fail tmo has not fired then we requeue with DID_TRANSPORT_DISRUPTED. Once that tmo fires we fail with DID_TRANSPORT_FAILFAST. v2 - seperate from "fc class: unblock target after calling terminate callback" to make it easier to review. - Add JamesS's ack from list. v2 - initial patch Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: James Smart <James.Smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
This patch converts the iscsi drivers to the new host byte values. v2 Drop some conversions. Want to avoid conflicts with other patches. v1 initial patch. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
Currently, if there is a transport problem the iscsi drivers will return outstanding commands (commands being exeucted by the driver/fw/hw) with DID_BUS_BUSY and block the session so no new commands can be queued. Commands that are caught between the failure handling and blocking are failed with DID_IMM_RETRY or one of the scsi ml queuecommand return values. When the recovery_timeout fires, the iscsi drivers then fail IO with DID_NO_CONNECT. For fcp, some drivers will fail some outstanding IO (disk but possibly not tape) with DID_BUS_BUSY or DID_ERROR or some other value that causes a retry and hits the scsi_error.c failfast check, block the rport, and commands caught in the race are failed with DID_IMM_RETRY. Other drivers, may hold onto all IO and wait for the terminate_rport_io or dev_loss_tmo_callbk to be called. The following patches attempt to unify what upper layers will see drivers like multipath can make a good guess. This relies on drivers being hooked into their transport class. This first patch just defines two new host byte errors so drivers can return the same value for when a rport/session is blocked and for when the fast_io_fail_tmo fires. The idea is that if the LLD/class detects a problem and is going to block a rport/session, then if the LLD wants or must return the command to scsi-ml, then it can return it with DID_TRANSPORT_DISRUPTED. This will requeue the IO into the same scsi queue it came from, until the fast io fail timer fires and the class decides what to do. When using multipath and the fast_io_fail_tmo fires then the class can fail commands with DID_TRANSPORT_FAILFAST or drivers can use DID_TRANSPORT_FAILFAST in their terminate_rport_io callbacks or the equivlent in iscsi if we ever implement more advanced recovery methods. A LLD, like lpfc, could continue to return DID_ERROR and then it will hit the normal failfast path, so drivers do not have fully be ported to work better. The point of the patches is that upper layers will not see a failure that could be recovered from while the rport/session is blocked until fast_io_fail_tmo/recovery_timeout fires. V3 Remove some comments. V2 Fixed patch/diff errors and renamed DID_TRANSPORT_BLOCKED to DID_TRANSPORT_DISRUPTED. V1 initial patch. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
The fc class now calls scsi_target_unblock after calling the terminate callback, so this patch removes the calls from the drivers. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
When we block a rport and the driver implements the terminate callback we will fail IO that was running quickly. However IO that was in the scsi_device/block queue sits there until the dev_loss_tmo fires, and this can make it look like IO is lost because new IO will get executed but that IO stuck in the blocked queue sits there for some time longer. With this patch when the fast io fail tmo fires, we will fail the blocked IO and any new IO. This patch also allows all drivers to partially support the fast io fail tmo. If the terminate io callback is not implemented, we will still fail blocked IO and any new IO, so multipath can handle that. This patch also allows the fc and iscsi classes to implement the same behavior. The timers are just unfornately named differently. This patch also fixes the problem where drivers were unblocking the target in their terminate callback, which was needed for rport removal, but for fast io fail timeout it would cause IO to bounce arround the scsi/block layer and the LLD queuecommand. And it for drivers that could have IO stuck but did not have a terminate callback the unblock calls in the class will fix them. v2. - fix up bit setting style to meet JamesS's pref. - Broke out new host byte error changes to make it easier to read. - added JamesS's ack from list. v1 - initial patch Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: James Smart <James.Smart@emulex.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
We do want to call right back into the queuecommand during the race, so we can just use SCSI_MLQUEUE_TARGET_BUSY. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: James Smart <James.Smart@Emulex.Com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
For the conditions below we do not want the queuecommand function to call us right back, so return SCSI_MLQUEUE_TARGET_BUSY. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
If the fcport is not online then we do not want to block IO to all ports on the host. We just want to stop IO on port not online, so we should be using the SCSI_MLQUEUE_TARGET_BUSY return value. For the case where we race with the rport memset initialization we do not want the queuecommand to be called again so we can just use SCSI_MLQUEUE_TARGET_BUSY for this. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: Andrew Vasquez <andrew.vasquez@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
When qla4xxx begins recovery and the iscsi class is firing up to handle it, we need to retrn SCSI_MLQUEUE_TARGET_BUSY from the driver instead of host busy, because the session recovery only affects the one target. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Acked-by: David C Somayajulu <david.somayajulu@qlogic.com> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
Mike Christie authored
SCSI-ml manages the queueing limits for the device and host, but does not do so at the target level. However something something similar can come in userful when a driver is transitioning a transport object to the the blocked state, becuase at that time we do not want to queue io and we do not want the queuecommand to be called again. The patch adds code similar to the exisiting SCSI_ML_*BUSY handlers. You can now return SCSI_MLQUEUE_TARGET_BUSY when we hit a transport level queueing issue like the hw cannot allocate some resource at the iscsi session/connection level, or the target has temporarily closed or shrunk the queueing window, or if we are transitioning to the blocked state. bnx2i, when they rework their firmware according to netdev developers requests, will also need to be able to limit queueing at this level. bnx2i will hook into libiscsi, but will allocate a scsi host per netdevice/hba, so unlike pure software iscsi/iser which is allocating a host per session, it cannot set the scsi_host->can_queue and return SCSI_MLQUEUE_HOST_BUSY to reflect queueing limits on the transport. The iscsi class/driver can also set a scsi_target->can_queue value which reflects the max commands the driver/class can support. For iscsi this reflects the number of commands we can support for each session due to session/connection hw limits, driver limits, and to also reflect the session/targets's queueing window. Changes: v1 - initial patch. v2 - Fix scsi_run_queue handling of multiple blocked targets. Previously we would break from the main loop if a device was added back on the starved list. We now run over the list and check if any target is blocked. v3 - Rediff for scsi-misc. Signed-off-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>
-
- 12 Oct, 2008 1 commit
-
-
Randy Dunlap authored
Remove ending ':' from some of the Topic lines for consistency. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-