Age | Commit message (Collapse) | Author |
|
Force full-speed (12 Mbit/s) operation if CONFIG_USB_GADGET_DUALSPEED
is not defined.
The controller is capable of high-speed (480 Mbit/s) operation,
but some designs may require the use of lower-speed operation.
Signed-off-by: Eric Nelson <eric.nelson@boundarydevices.com>
|
|
This allows the USB code to determine whether a USB bus reset was issued,
which in turn allows the code to differentiate between a detach (return
to shell prompt) and a board reset/reboot request.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
Almost all of ci_udc.c uses variable name "ep" for a struct usb_ep and
"ci_ep" for a struct ci_ep. This is nice and consistent, and helps people
know what type a variable is without searching for the declaration.
handle_ep_complete() doesn't do this, so fix it to be consistent.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
struct ci_req is a purely software structure, and needs no specific
memory alignment. Hence, allocate it with calloc() rather than
memalign(). The use of memalign() was left-over from when struct ci_req
was going to hold the aligned bounce buffer, but this is now dynamically
allocated.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
There's no need to store an array of QTD pointers in the controller.
Since the calculation is so simple, just have ci_get_qtd() perform it
at run-time, rather than pre-calculating everything.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
2 QTDs are allocated for each EP. The current allocation scheme aligns
the first QTD in each pair, but simply adds the struct size to calculate
the second QTD's address. This will result in a non-cache-aligned
addresss IF the system's ARCH_DMA_MINALIGN is not 32 bytes (i.e. the
size of struct ept_queue_item).
Similarly, the original ilist_ent_sz calculation aligned the value to
ARCH_DMA_MINALIGN but didn't take the USB HW's 32-byte alignment
requirement into account. This doesn't cause a practical issue unless
ARCH_DMA_MINALIGN < 32 (which I suspect is quite unlikely), but we may
as well fix the code to be explicit, so it's obviously completely
correct.
The new value of ILIST_ENT_SZ takes all alignment requirements into
account, so we can simplify ci_{flush,invalidate}_qtd() by simply using
that macro rather than calling roundup().
Similarly, the calculation of controller.items[i] can be simplified,
since each QTD is evenly spaced at its individual alignment requirement,
rather than each pair being aligned, and entries within the pair being
spaced apart only by structure size.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
This will allow functions other than ci_udc_probe() to make use of the
constants in a future change.
This in turn requires converting the const int variables to #defines,
since the initialization of one global const int can't depend on the
value of another const int; the compiler thinks it's non-constant if
that dependency exists.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
Fix ci_ep_submit_next_request()'s ZLP transmission code to explicitly
call ci_get_qtd() to find the address of the other QTD to use. This
will allow us to correctly align each QTD individually in the future,
which may involve leaving a gap between the QTDs.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
ci_udc_probe() initializes a pair of QHs and QTDs for each EP. After
each pair has been initialized, the pair is cache-flushed. The
conversion from QH/QTD index [0..2*NUM_END_POINTS) to EP index
[0..NUM_ENDPOINTS] is incorrect; it simply subtracts 1 (which yields
the QH/QTD index of the first entry in the pair) rather than dividing
by two (which scales the range). Fix this.
On my system, this avoids cache debug prints due to requests to flush
unaligned ranges. This is caused because the flush calls happen before
the items[] array entries are initialized for all but EP0.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
s/ot/to/
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
ci_udc.c's usb_gadget_unregister_driver() doesn't call driver->unbind()
unlike other USB gadget drivers. Fix it to do this.
Without this, when ether.c's CDC Ethernet device is torn down,
eth_unbind() is never called, so dev->gadget is never set to NULL.
For some reason, usb_eth_halt() is called both at the end of the first
use of the Ethernet device, and prior to any subsequent use. Since
dev->gadget is never cleared, all calls to usb_eth_halt() attempt to
stop, disconnect, and clean up the device, resulting in double cleanup,
which hangs U-Boot on my Tegra device at least.
ci_udc allocates its own singleton EP0 request object, and cleans it up
during usb_gadget_unregister_driver(). This appears necessary when using
the USB gadget framework in U-Boot, since that does not allocate/free
the EP0 request. However, the CDC Ethernet driver *does* allocate and
free its own EP0 requests. Consequently, we must protect
ci_ep_free_request() against double-freeing the request.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
Sometimes, a zero-length packet is required at the end of an IN
transaction so that the host knows the device is done sending data.
Enhance ci_udc to send a zlp when necessary. See the comments for
more details.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
usb_gadget_unregister_driver() is called to tear down the USB device mode
stack. Fix the driver to stop the USB HW (which causes any attached host
to notice the disappearance of the device), and free all allocations
(which obviously prevents memory leaks).
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
If allocation of the ep0 req fails, clean up all the allocations that
were made in ci_udc_probe().
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
ci_ep_alloc_request() avoids allocating multiple request objects for ep0
by keeping a record of the first req allocated for ep0, and always
returning that instead of allocating a new req. However, if this req is
ever freed, the record of the previous allocation is not cleared, so
ci_ep_alloc_request() will keep returning this stale pointer. Fix
ci_ep_free_request() to clear the record of the previous allocation.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
ci_pullup()'s !is_on path contains a cut/paste copy of udc_disconnect().
Remove the duplication by simply calling udc_disconnect() instead.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
handle_setup() currently assumes that the response to a Setup transaction
will be an OUT transaction, and any subsequent packet (if any) will be an
IN transaction. This appears to be valid in many cases; both USB
enumeration and Mass Storage work OK with this restriction. However, DFU
uses ep0 to transfer data in both directions. This renders the assumption
invalid; when sending data from device to host, the Data Stage is an IN
transaction, and the Status Stage is an OUT transaction. Enhance
handle_setup() to deduce the correct direction for the USB transactions
based on Setup transaction data.
ep0's request object only needs to be automatically re-queued when the
Data Stage completes, in order to implement the Status Stage. Once the
Status Stage transaction is complete, there is no need to re-queue the
USB request, so don't do that.
Don't sent USB request completion callbacks for Status Stage transactions.
These were queued by ci_udc itself, and only serve to confuse the USB
function code. For example, f_dfu attempts to interpret the 0-length data
buffers for Status Stage transactions as DFU packets. These buffers
contain stale data from the previous transaction. This causes f_dfu to
complain about a sequence number mismatch.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
Allocate ep0's USB request object when the UDC driver is probed. This
solves a couple of issues in the current code:
a) A request object always exists for ep0. Prior to this patch, if setup
transactions arrived in an unexpected order, handle_setup() would need
to reply to a setup transaction before any ep0 usb_req was created.
This issue was introduced in commit 2813006fecda "usb: ci_udc: allow
multiple buffer allocs per ep."
b) handle_ep_complete no longer /has/ to queue the ep0 request again
after every single request completion. This is currently required, since
handle_setup() assumes it can find some request object in ep0's request
queue. This patch doesn't actually stop handle_ep_complete() from always
requeueing the request, but the next patch will.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
ci_udc currently points ep->desc at separate descriptors for IN and OUT.
These descriptors only differ in the ep address IN/OUT field. Modify the
code to use a single descriptor, and change that descriptor's ep address
to indicate IN/OUT as required. This removes some data duplication.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
The flipping of ep0 between IN and OUT relies on ci_ep_queue() consuming
the current IN/OUT setting immediately. If this is deferred to a later
point when the req is pulled out of ci_req->queue, then the IN/OUT
setting may have been changed since the req was queued, and state will
get out of sync. This condition doesn't occur today, but could if bugs
were introduced later, and this error-check will save a lot of debugging
time.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
ci_udc only allocates a single QTD structure per EP. All data needs to be
extracted from the DTD prior to calling ci_ep_submit_next_request(), since
that fills the QTD with next transaction's parameters. Fix
handle_ep_complete() to extract the transaction (remaining) length before
kicking off the next transaction.
In practice, this only causes writes to UMS devices to fail for me. I may
have tested the final versions of my previous ci_udc patch only with
reads. More recently, I had patches applied locally that allocated a QTD
per USB request rather than per USB EP, although since that doesn't give
any performance benefit, I'm dropping those.
Fixes: 2813006fecda ("usb: ci_udc: allow multiple buffer allocs per ep")
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
Modify ci_ep_alloc_request() to return a dynamically allocated request
object, rather than a singleton that's part of the endpoint. This
requires moving various state from the endpoint structure to the request
structure, since we need one copy per request.
The "fast bounce buffer" b_fast is removed by this change rather than
moved to the request object. Instead, we enhance the bounce buffer logic
in ci_bounce()/ci_debounce() to keep the bounce buffer around between
request submissions. This avoids the need to allocate an arbitrarily-
sized bounce buffer up-front, yet avoids incurring the allocation
overhead each time a request is submitted.
A future enhancement would be to actually submit multiple requests to HW
at once. The Linux driver shows that this is possible. That might improve
throughput (depending on the USB protocol in use), since USB could be
performing a transfer to one HW buffer in parallel with whatever SW
actions U-Boot performs on another buffer. However, I have not made this
change as part of this patch, in order to keep SW changes related to
buffer management separate from any change in the way the HW is
programmed.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
Tegra's USB controller appears to be a variant of the ChipIdea
controller; perhaps derived from it, or simply a different version of
the IP core to what U-Boot supports today.
In this variant, at least the following difference are present:
- Some registers are moved about.
- Setup transaction completion is reported in a separate 'epsetupstat'
register, rather than in 'epstat' (which still exists, perhaps for
other transaction types).
- USB connection speed is reported in a separate 'hostpc1_devlc'
register, rather than 'portsc'.
- The registers used by ci_udc.c begin at offset 0x130 from the USB
register base, rather than offset 0x140. However, this is handled
by the associated EHCI controller driver, since the register address
is stored in controller.ctrl->hcor.
Introduce define CONFIG_CI_UDC_HAS_HOSTPC to indicate which variant of
the controller should be supported. The "HAS_HOSTPC" part of this name
mirrors the similar "has_hostpc" field used by the Linux EHCI controller
core to represent the presence/absence of the hostpc1_devlc register.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
usb_gadget_register_driver() currently unconditionally programs PORTSC
to select a ULPI PHY. This is incorrect on at least the Tegra boards I
am testing with, which use a UTMI PHY for the OTG ports. Make the PHY
selection code conditional upon the specific EHCI controller that is in
use.
Ideally, I believe that the PHY initialization code should be part of
ehci_hcd_init() in the relevant EHCI controller driver, or some board-
specific function that ehci_hcd_init() calls.
For MX6, I'm not sure this PHY initialization code is correct even before
this patch, since ehci-mx6's ehci_hcd_init() already configures PORTSC to
a board-specific value, and it seems likely that the code in ci_udc.c is
incorrectly undoing this. Perhaps this is not an issue if the PHY
selection register bits aren't implemented on this instance of the MX6
USB controller?
ehci-mxs.c doens't appear to touch PORTSC, so this code is likely still
required there.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
At least drivers/usb/gadget/storage_common.c expects that ep->req.actual
contain the number of bytes actually transferred. (At least in practice,
I observed it failing to work correctly unless this was the case).
However, ci_udc.c modifies ep->req.length instead. I assume that .length
is supposed to represent the allocated buffer size, whereas .actual is
supposed to represent the actual number of bytes transferred. In the OUT
transaction case, this may happen simply because the host sends a smaller
packet than the max possible size, which is quite legal. In the IN case,
transferring fewer bytes than requested could presumably happen as an
error.
Modify handle_ep_complete() to write to .actual rather than modifying
.length.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
ci_ep_queue() currently only fills in the page0/page1 fields in the
queue item. If the buffer is larger than 4KiB (unaligned) or 8KiB
(page-aligned), then this prevents the HW from knowing where to write
the balance of the data.
Fix this by initializing all 5 pageN pointers, which allows up to
16KiB (potentially non-page-aligned) buffers.
Signed-off-by: Stephen Warren <swarren@nvidia.com>
|
|
The mv_udc is not marvell-specific anymore. The mv_udc is used to drive
generic ChipIdea CI13xxx series OTG cores, so rename the driver to ci_udc
instead.
Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Eric Nelson <eric.nelson@boundarydevices.com>
Cc: Stefano Babic <sbabic@denx.de>
|