Commits · 6f92a7a0aff413cdf42955a187647e3736ebd8f3 · linux / linux-davinci

27 Apr, 2009 40 commits

[SCSI] mpt2sas: fix hotplug event processing · 6f92a7a0

Eric Moore authored Apr 21, 2009

Here's a fix for hotplug events. The useage of queue_delayed_work seems
to broke the fifo for processing of firmware events. After several iterations
of adding and removing cabling connected to jbods, the devices are not
getting added becuase kernel thread is activited out of order.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

6f92a7a0

[SCSI] mpt2sas : release diagnotic buffers prior host reset · 99bb214b

Eric Moore authored Apr 21, 2009

Diagnostic buffer support is already there in the driver.  This support allows
applications to pull ring buffers from controller firmware for debugging
firmware related issues.

What this patch does is sends reqeust to firmware to release the buffers prior
to host reset.   This will allow what ever debug info is there prior to reset
to be dma'd to host memory. With out this fix, some of the debug data would
been lost.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

99bb214b

[SCSI] mpt2sas : Broadcast Primative AEN bug fix · 8901cbb4

Eric Moore authored Apr 21, 2009

Bug fix in the broadcast primative async event code where the driver would
stop sending tm queries after the first queury was completed. This was due
driver not reseting the tm_cmds.status field back to MPT2_CMD_NOT_USED after
completing a task management request.

An addtional fix adding sanity check to insure sas_device->starget set to NULL.
During multipath testing fail over/fail back, the mid layer was holding onto
sdev longer than the fail back period, thus starget was getting set to NULL
for device being added.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

8901cbb4

[SCSI] mpt2sas : Identify Dell series-7 adapters at driver load time · f0f9cc1f

Eric Moore authored Apr 21, 2009

The Dell branding along with the VID, DID, SSVID, SSDID following the LSI
branding that contains the card firmware/chip/bios versions. If the SSDID
is not known but it is a Dell HBA, the driver will print the SSDID instead
of the Dell branding string. Nothing will be printed for non Dell HBAs
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

f0f9cc1f

[SCSI] mpt2sas : driver name needs to be in the MPT2IOCINFO ioctl · e5f9bb19

Eric Moore authored Apr 21, 2009

The driver name needs to be at the beginining of the driver_version string in
MPT2IOCINFO ioctl.  This is the same behaviour is there already in the mptsas
driver.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

e5f9bb19

[SCSI] mpt2sas : running out of message frames · 77bdd9ee

Eric Moore authored Apr 21, 2009

The driver is not freeing message frame when returning failure from
_ctl_do_task_abort.   If you call this function 500 times when its unable
to find an active task mid, you end up with no message frames.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

77bdd9ee

[SCSI] mpt2sas : fix oops when firmware sends large sense buffer size · 0d04df9b

Eric Moore authored Apr 21, 2009

There is a bug in firmware where the reply message frame says there is a
16kb sense buffer, when in reality its only 20 bytes. This fix insures
the memcpy action doesn't corrupte the memory beyond the 90 bytes allocated in
the scsi command for sense buffer.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

0d04df9b

[SCSI] mpt2sas : the sanity check in base_interrupt needs to be on dword boundary · 03ea1115

Eric Moore authored Apr 21, 2009

The poison sanity check on the reply_post_free register needs to be by 32bit,
not 64bit. The poison check is there because its possible that the driver read
the 1st 32bit before the 2nd 32bit has been written to by firmware. In other
words, this handles race between driver reading the 64 bit register, and it
being dma'd across pci memory from controller firmware as two 32bit pci writes.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

03ea1115

[SCSI] mpt2sas : unique ioctl magic number · fd01825c

Eric Moore authored Apr 21, 2009

The current magic number is shared with mptsas driver. This to be unique to
fix issues with register_ioctls32_conversion in older kernels. We are making
this change across all versions of the sas2.0 drivers.
Signed-off-by: Eric Moore <eric.moore@lsi.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

fd01825c

[SCSI] fix sign extension with 1.5TB usb-storage LBD=y · 8f76d151

Dave Hansen authored Apr 21, 2009

Shifting an unsigned char implicitly casts it to a signed int.  This
caused 'lba' to sign-extend and Linux would then try READ CAPACITY 16
which was not supported by at least one drive.  Using the
get_unaligned_be*() helpers keeps us from having to worry about how the
extension might occur.
Signed-off-by: Dave Hansen <dave@linux.vnet.ibm.com>
Reviewed-by: Matthew Wilcox <willy@linux.intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

8f76d151

[SCSI] ipr: Fix sleeping function called with interrupts disabled · dd406ef8

Brian King authored Apr 22, 2009

The ata_sas_slave_configure was changed such that it now allocates
some memory for a drain buffer for ATAPI devices. Fixup the ipr
driver such that we no longer make this call with interrupts disabled.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

dd406ef8

[SCSI] fcoe: fip: add multicast filter to receive FIP advertisements. · 6401bdca

Joe Eykholt authored Apr 21, 2009

The FCoE forwarder (FCF) would be selected, but then would soon time
out after three advertisements were missed.  This would be 24 seconds
by default, or 3 times the keep-alive interval configured on the switch.

The cause was that the multicast address for all FIP E-nodes
was never added.
Signed-off-by: Joe Eykholt <jeykholt@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

6401bdca

[SCSI] libfc: Fix compilation warnings with allmodconfig · a29e7646

Robert Love authored Apr 21, 2009

When building with a .config generated from 'make allmodconfig'
some build warnings are generated. This patch corrects the warnings,
adds a FC_FID_NONE (= 0) enumeration for FC-IDs and cleans up one
variable naming to meet our variable naming conventions. For example,
fc_lport's should be named "lport," not "lp."
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

a29e7646

[SCSI] fcoe: fix spelling typos and bad comments · dd3fd72e

Chris Leech authored Apr 21, 2009

Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

dd3fd72e

[SCSI] fcoe: don't export functions that are internal to fcoe · fc224a5b

Chris Leech authored Apr 21, 2009

These probably never should have been exported.
If they were needed outside of the fcoe module, they
would have been moved to libfcoe.
Signed-off-by: Chris Leech <christopher.leech@intel.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

fc224a5b

[SCSI] fcoe: kfree() -> kfree_skb() · 3caf02ee

Dan Carpenter authored Apr 21, 2009

sk_buff pointers should use kfree_skb() instead of vanilla kfree().

Found by smatch (http://repo.or.cz/w/smatch.git).
Signed-off-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

3caf02ee

[SCSI] libfc: whenever queueing delete ev for rport, set state to NONE · 55c7a60c

Abhijeet Joglekar authored Apr 21, 2009

When a delete event is queued for an rport, set state to NONE so that no
other processing is done on the rport as it is being removed.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

55c7a60c

[SCSI] libfc: Change state to NONE in fc_lport_destroy · bbf15669

Abhijeet Joglekar authored Apr 21, 2009

After lport_destroy, the local port should not be used again. Transition
to state NONE, any incoming frames or link up should not transition out
of this state since we are deleting exchange table and cleaning up the
local port. Also, mark link as down.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

bbf15669

[SCSI] libfc: During fabric logoff, flush the rport Q after logging off dns port · a0fd2e49

Abhijeet Joglekar authored Apr 21, 2009

We want to generate the rport queue event (from the logoff)
before flushing the queue otherwise the event may still be
in the queue when we logoff.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

a0fd2e49

[SCSI] libfc: Track rogue remote ports · b4c6f546

Abhijeet Joglekar authored Apr 21, 2009

Rogue ports are currently not tracked on any list. The only reference
to them is through any outstanding exchanges pending on the rogue ports.
If the module is removed while a retry is set on a rogue port
(say a Plogi retry for instance), this retry is not cancelled because there
is no reference to the rogue port in the discovery rports list. Thus the
local port can clean itself up, delete the exchange pool, and then the
rogue port timeout can fire and try to start up another exchange.

This patch tracks the rogue ports in a new list disc->rogue_rports. Creating
a new list instead of using the disc->rports list keeps remote port code
change to a minimum.

1) Whenever a rogue port is created, it is immediately added to the
disc->rogue_rports list.

2) When the rogues port goes to ready, it is removed from the rogue list
and the real remote port is added to the disc->rports list

3) The removal of the rogue from the disc->rogue_rports list is done in
the context of the fc_rport_work() workQ thread in discovery callback.

4) Real rports are removed from the disc->rports list like before. Lookup
is done only in the real rports list. This avoids making large changes
to the remote port code.

5) In fc_disc_stop_rports, the rogues list is traversed in addition to the
real list to stop the rogue ports and issue logoffs on them. This way, rogue
ports get cleaned up when the local port goes away.

6) rogue remote ports are not removed from the list right away, but
removed late in fc_rport_work() context, multiple threads can find the same
remote port in the list and call rport_logoff(). Rport_logoff() only
continues with the logoff if port is not in NONE state, thus preventing
multiple logoffs and multiple list deletions.

7) Since the rport is removed from the disc list at a later stage
(in the disc callback), incoming frames can find the rport even if
rport_logoff() has been called on the rport. When rport_logoff() is called,
the rport state is set to NONE, and we are trying to cancel all exchanges
and retries on that port. While in this state, if an incoming
Plogi/Prli/Logo or other frames match the rport, we should not reply
because the rport is in the NONE state. Just drop the frame, since the
rport will be deleted soon in the disc callback (fc_rport_work)

8) In fc_disc_single(), remove rport lookup and call to fc_disc_del_target.
fc_disc_single() is called from recv_rscn_req() where rport lookup
and rport_logoff is already done.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

b4c6f546

[SCSI] libfc: Do not retry if the new state is not the same as old state · 76f6804e

Abhijeet Joglekar authored Apr 21, 2009

For instance, if there is a Plogi pending (remote port is in Plogi state),
and the state changes to say NONE (because the port is being logged off),
then when the Plogi resp times out, do not start a retry.

This patch partially reverts an earlier patch (libfc: check for err when
recv and state is incorrect), by moving the state check back to before
checking for error. However, if the state does not match, then there is
an additional check to see if its an error ptr or a real frame before
jumping to err or out respectively.
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

76f6804e

[SCSI] libfc: Hold disc mutex while processing gpn ft resp · 0d228c0f

Abhijeet Joglekar authored Apr 21, 2009

gpn_ft_resp processing currently does not hold the discovery lock.
disc_done() thus gets called from gpn_ft_resp or from gpn_ft_parse
without the lock held. This then sets disc->pending to zero or calls
gpn_ft_req() without disc_lock held.

- Hold disc mutex during gpn_ft resp processing
- In disc_done, release the disc mutex while calling lport callback
Signed-off-by: Abhijeet Joglekar <abjoglek@cisco.com>
Signed-off-by: Robert Love <robert.w.love@intel.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

0d228c0f

[SCSI] cxgb3i: fix ddp map overrun · a53922dd

kxie@chelsio.com authored Apr 21, 2009

(version 2)

Fixed a bug in calculating ddp map range when search for free entries:
it was going beyond the end by one, thus corrupting gl_skb[0].
Signed-off-by: Karen Xie <kxie@chelsio.com>
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

a53922dd

[SCSI] cxgb3i: fix cpu use abuse during writes · 1393109f

Mike Christie authored Apr 21, 2009

When doing a lot (128) of large writes (256K) we can hit the cxgb3_snd_win
check pretty easily. The driver's xmit thread then takes 100% of the cpu.

The driver should not be returning -EAGAIN for this problem. It should
be returing -ENOBUFS, then when the window is opened again it should
queue the xmit thread (it already wakes the xmit thread).
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

1393109f

[SCSI] cxgb3i: fix can_queue and cmd_per_lun initialization · dd0af9f9

Mike Christie authored Apr 21, 2009

cxgb3i was setting can_queue to only 128 commands, and was
setting the can_queue and cmd_per_lun to the same value.

This sets the can_queue to 1024 commands, and sets the cmd_per_lun
to a safer default of 32.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

dd0af9f9

[SCSI] cxgb3i, iser, iscsi_tcp: set target can queue · 6b5d6c44

Mike Christie authored Apr 21, 2009

Set target can queue limit to the number of preallocated
session tasks we have.

This along with the cxgb3i can_queue patch will fix a throughput
problem where it could only queue one LU worth of data at a time.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

6b5d6c44

[SCSI] iscsi_tcp: don't fire conn error if pdu init fails · 9a6510eb

Mike Christie authored Apr 21, 2009

If a command's scsi cmd pdu setup fails then we can just fail
the IO to the scsi layer. If a DATA_OUT for a R2T fails then
we will want to drop the session, because it means we got a
bad request from the target (iscsi protocol error).

This patch has us propogate the error upwards so libiscsi_tcp
or libiscsi can decide what the best action is to take. It
also fixes a bug where we could try to grab the session lock
while holding it, because if iscsi_tcp drops the session in the
pdu setup callout the session lock is held when setting up the
scsi cmd pdu.
Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

9a6510eb

[SCSI] zfcp: Fix oops when port disappears · 70932935

Christof Schmitt authored Apr 17, 2009

The zfcp_port might have been removed, while the FC fast_io_fail timer
is still running and could trigger the terminate_rport_io callback.
Set the pointer to the zfcp_port to NULL and check accordingly
before using it.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

70932935

[SCSI] zfcp: Reference counting for cfdc requests · 3869bb6e

Christof Schmitt authored Apr 17, 2009

Before dropping the reference count with zfcp_adapter_put, increase it
with zfcp_adapter_get when issuing cfdc requests.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

3869bb6e

[SCSI] zfcp: Fix port reference counting · 6ab35c07

Martin Petermann authored Apr 17, 2009

If this problem appears zfcp ports cannot be de-queued since it is
checked for a zero refcount. The port reference counting is wrong for
existing zfcp ports when e.g. an adapter gets on-line again. During
port scanning the reference counting for existing ports should not be
changed.
Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

6ab35c07

[SCSI] zfcp: revert previous patch for sbal counting · 7001f0c4

Martin Petermann authored Apr 17, 2009

The current sbal counting can be wrong if a fsf request is
waiting for free sbals and at the same time qdio request queue
is shutdown and re-opened. Revering a previous patch fixes this
issue.
Signed-off-by: Martin Petermann <martin.petermann@de.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

7001f0c4

[SCSI] zfcp: Fix abort handler for completions in progress · c6936e7f

Christof Schmitt authored Apr 17, 2009

When the abort handler cannot find a pending FSF request, the request
completion could just be running. This means we cannot return SUCCESS,
since this would lead to call to scsi_done after exiting the SCSI
error handler which is not allowed.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

c6936e7f

[SCSI] zfcp: no port recovery after ADISC request timeout · 5b43e719

Swen Schillig authored Apr 17, 2009

A remote port remains in error state even if we receive a RSCN
stating that the connection is re-established. The port recovery
is not started due to a flag which is not reset.
The solution is to clear the flag in question before we trigger a ERP.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

5b43e719

[SCSI] zfcp: Let actcli handle control file errors · f7306bf6

Christof Schmitt authored Apr 17, 2009

Error codes specific to the control file requests are evaluated by the
actcli tool, so don't report -ENXIO for those. Generic problems are
still checked for outside the command specific handler.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

f7306bf6

[SCSI] zfcp: remove unit will fail if add unit is not finished · 048225e3

Martin Petermann authored Apr 17, 2009

On some hardware it can take some time to add a unit. If
some remove this unit during this process the remove will
fail.
Signed-off-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

048225e3

[SCSI] zfcp: no port recovery after storage side error inject · d81ad31c

Swen Schillig authored Apr 17, 2009

The remote port remains in error state even if the connection
is re-established. A wrong precondition check was performed on
the port status leading to a cancellation of the port reopen.
Remove the pre-req check because it's not required and better
handled within the ERP.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

d81ad31c

[SCSI] zfcp: avoid false ERP complete due to sema race · 94ab4b38

Swen Schillig authored Apr 17, 2009

The ERP thread is performing a task before it is executing the
corresponding down on the semaphore. The response handler of the
just started exchange config should wait for the completion by
performing a down on this semaphore. Since this semaphore is still
positive from the ERP enqueue the handler won't wait and therefore
the exchange config will always fail leaving the adapter in error.
The problem can be solved by performing the down on the semaphore
before starting an ERP task. This is the logically correct order.
Only walk the ERP loop if there is a task to perform.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

94ab4b38

[SCSI] zfcp: Set WKA-port to offline on adapter deactivation · 828bc121

Swen Schillig authored Apr 17, 2009

The nameserver port might be in state online when the adapter is
offlined. On adapter reactivation the nameserver port is not
re-opened due to the PORT_ONLINE status. This results in an
unsuccessful recovery. In forcing the nameserver port status
to offline on all adapter offline events this issue is prevented.

Waiting for the reference count to drop to zero in
zfcp_wka_port_offline is not required, so remove it.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

828bc121

[SCSI] zfcp: Dont block zfcp_wq with scan · 92d5193b

Swen Schillig authored Apr 17, 2009

When running the scsi_scan from the zfcp workqueue and the target
device does not respond, the zfcp workqueue can block until the
scsi_scan hits a timeout. Move the work to the scsi host workqueue,
since this one is also used for the scan from the SCSI midlayer.
Signed-off-by: Swen Schillig <swen@vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

92d5193b

[SCSI] zfcp: Dont call zfcp_fsf_req_free on NULL pointer · ada81b74

Christof Schmitt authored Apr 17, 2009

Fix problem that zfcp_fsf_exchange_config_data_sync and
zfcp_fsf_exchange_config_data_sync could try to call zfcp_fsf_req_free
with a NULL pointer.
Reviewed-by: Martin Petermann <martin@linux.vnet.ibm.com>
Signed-off-by: Christof Schmitt <christof.schmitt@de.ibm.com>
Signed-off-by: James Bottomley <James.Bottomley@HansenPartnership.com>

ada81b74