This is the mail archive of the ecos-bugs@sourceware.org mailing list for the eCos project.



[Bug 1001897] lpc2xxx CAN driver improvements / enhancements


Please do not reply to this email, use the link below.

http://bugs.ecos.sourceware.org/show_bug.cgi?id=1001897

--- Comment #3 from Uwe Kindler <uwe_kindler@web.de> ---
(In reply to comment #2)
> I've no experience with LPC2xxx but with LPC1765 that has the same CAN cell
> from the LPC2xxx.
> 
> AFAIK, ICR_BUS_ERR shows an error on the bus, it may not be always a BUS OFF
> condition.

The CAN ISR is triggered by various warning and error conditions. To find out
which one occurred, lpc2xxx_can_getevent tests the ICR register for the
individual flags. The flag ICR_BUS_ERR is bit 7 of ICR and indicates the bus
error interrupt.

This is what the LPC2xxx manual says about error handling:

manual snippet ------------------>
The CAN Controllers count and handle transmit and receive errors as specified
in CAN Spec 2.0B. The Transmit and Receive Error Counters are incremented for
each detected error and are decremented when operation is error-free. If the
Transmit Error counter contains 255 and another error occurs, the CAN
Controller is forced into a state called Bus-Off. In this state, the following
register bits are set: BS in CANSR, BEI and EI in CANIR if these are enabled,
and RM in CANMOD. RM resets and disables much of the CAN Controller. Also at
this time the Transmit Error Counter is set to 127 and the Receive
Error Counter is cleared. Software must next clear the RM bit. Thereafter the
Transmit Error Counter will count down 128 occurrences of the Bus Free
condition (11 consecutive recessive bits).
<--------------------

So the ICR_BUS_ERR flag is set if BEI (Bus error interrupt) is enabled.
According to the manual, if this interrupt occurs, the BS bit in CANSR (status
register) is set. The manual says:

These bits are identical to the BS bit in the GSR (Global Status Register)

Bit 7 in GSR (Global Status Register) is the Bus Status flag. This is what the
manual says about this flag:

Bus Status: the CAN controller is currently prohibited from
bus activity because the Transmit Error Counter reached
its limiting value of 255.

That means the bus error interrupt occurs if there is a Bus-off condition.
For all other error or warning interrupts there are other interrupt flags. So
the bus error is a Bus-off condition, and the flag ICR_BUS_ERR tests for this
Bus-off condition.
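
As an illustration of the flag test described above, here is a minimal sketch
in C. It is not the driver code: the function name and the event flag value are
hypothetical stand-ins, and only the position of ICR_BUS_ERR (bit 7) is taken
from the text.

```c
#include <stdint.h>

/* Bit 7 of ICR is the bus error interrupt flag (BEI), as stated above. */
#define ICR_BUS_ERR (1u << 7)

/* Hypothetical stand-in for CYGNUM_CAN_EVENT_BUS_OFF; the value is made up. */
#define EXAMPLE_EVENT_BUS_OFF 0x0001u

/* Map one snapshot of the ICR register to the event flags it implies.
 * Only the bus error flag is handled in this sketch. */
static uint16_t example_icr_to_event_flags(uint32_t icr)
{
    uint16_t flags = 0;

    if (icr & ICR_BUS_ERR) {
        flags |= EXAMPLE_EVENT_BUS_OFF;
    }
    return flags;
}
```

A real handler would test the remaining ICR bits in the same way and OR the
matching event flags into the result.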


> 
> To know about BUS OFF, you must check bit 7 of the GSR.

Yes, but in case of an interrupt we need to check bit 7 of ICR (Interrupt and
Capture Register). Bit 7 (BEI - Bus error interrupt) is set if bit 7 in GSR
(Bus-off condition) is set.

> 
> If you immediately clears the counters, the DSR can't know about the
> counters value and has no way to help diagnose the problem occurring on the
> bus.

The counter values are cleared in the lpc2xxx_can_getevent function, so the
DSR has already been executed by then. In the lpc2xxx_can_getevent function the
event flag is set in case of a Bus-off condition (pevent->flags |=
CYGNUM_CAN_EVENT_BUS_OFF) and propagated to the application code that will
receive this event. In case of a Bus-off condition the application doesn't need
to read the error counters, because the error counter registers are 8 bits wide
and a Bus-off condition occurs if the error counter contains 255 and another
error occurs (that means the counter overflows). So in case of a Bus-off error
the error counters do not contain "valid" counter values anymore. Normally the
application does not need to care about the error counters, because the LPC2xxx
CAN controller has status flags and interrupts for all important conditions.

1. Warning interrupt if the counters rise above the warning limit (>96)
2. Error passive interrupt if the counters rise above the error passive level (>127)
3. Bus-off interrupt if the TX counter overflows (>255)

The error counters are normally used by the internal CAN controller logic to
track the controller's warning and error states.
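
To make the relation between counter values and controller states concrete,
here is a small sketch in C. All names are hypothetical. It uses the CAN 2.0
thresholds (error passive above 127, bus-off when the TX counter exceeds 255)
and takes the warning limit as a parameter. Treating "reached the limit" as
>= is an assumption of this sketch. Plain unsigned values are used instead of
the 8-bit registers so that the overflow case can be represented.

```c
/* Hypothetical state names for this sketch, not taken from the driver. */
typedef enum {
    EXAMPLE_ERROR_ACTIVE,
    EXAMPLE_ERROR_WARNING,
    EXAMPLE_ERROR_PASSIVE,
    EXAMPLE_BUS_OFF
} example_can_state;

/* Classify the controller state from logical error counts. */
static example_can_state example_classify(unsigned tx_errors,
                                          unsigned rx_errors,
                                          unsigned warning_limit)
{
    unsigned worst = (tx_errors > rx_errors) ? tx_errors : rx_errors;

    if (tx_errors > 255u) {
        return EXAMPLE_BUS_OFF;       /* only the TX counter leads to bus-off */
    }
    if (worst > 127u) {
        return EXAMPLE_ERROR_PASSIVE; /* either counter above 127 */
    }
    if (worst >= warning_limit) {
        return EXAMPLE_ERROR_WARNING; /* assumption: limit reached means >= */
    }
    return EXAMPLE_ERROR_ACTIVE;
}
```

The application normally never needs such a function, because the controller
reports each of these transitions through its own interrupt flag.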


> The LPC1765 irq system is different but I don't see why a MCU would do that,
> the problem must be elsewhere because the CAN controller is expected to exit
> the bus off condition by itself, at least if there is activity on the bus.

Here is a small snippet from the LPC2xxx manual describing what the controller
does in case of a Bus-off condition:


manual snippet ------------------->
If the Transmit Error counter contains 255 and another error occurs, the CAN
Controller is forced into a state called Bus-Off. In this state, the following
register bits are set: BS in CANSR, BEI and EI in CANIR if these are enabled,
and RM in CANMOD. RM resets and disables much of the CAN Controller. Also at
this time the Transmit Error Counter is set to 127 and the Receive Error
Counter is cleared.
<----------------------

That means: 

1. Bus error interrupt
2. RM in CANMOD set (RM - Reset Mode resets and disables much of the CAN
Controller)
3. TX counter is set to 127 and RX counter is cleared.

That means, according to the manual, the hardware should do exactly what my
patch does in case of a Bus-off condition. The problem is that, although it is
written in the manual, it does not happen on my LPC2xxx. Via debug output I can
see the following:

1. Bus error interrupt occurs (BS in CANSR and GSR is set)
2. RM in CANMOD is NOT set (controller remains active)
3. TX counter is NOT set to 127 and RX counter is NOT cleared.

So the hardware acts differently than the manual states. I could not find
anything in the errata sheets and I don't know if this also happens for newer
(i.e. LPC3xxx) devices, but for the LPC2294 controller on the Olimex board
this is reality. Because the controller does not enter RM (Reset Mode) and
because the counters are not cleared by hardware, the bus error interrupt will
happen again immediately, as soon as the ISR / DSR processing has finished.
This blocks the application from running because the ISR / DSR code fires
again and again. So my patch simply does what is written in the manual:

1. Set the controller into reset mode (RM bit)
2. Set the TX counter to 127 and clear the RX counter.

The only additional step my code does is clearing the RM bit, so the
controller leaves the reset mode again. Because the TX counter value is 127, it
takes a while until the TX counter overflows again and the next bus error
interrupt occurs. During this time the application code can run and will
receive the CYGNUM_CAN_EVENT_BUS_OFF event. As long as the bus off condition
exists, the application will receive the CYGNUM_CAN_EVENT_BUS_OFF event again
and again, each time the TX error counter overflows. But as soon as the cause
of the bus off condition goes away (i.e. if the node is physically connected to
the bus again), the controller automatically recovers from this bus off
condition.
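
The sequence described above can be sketched in C as follows. This is a model
for illustration, not the patch itself: the struct stands in for the real
CANMOD and error counter registers, and all names are hypothetical. The second
function estimates how many further transmit errors fit before the next
overflow, assuming that each transmit error adds 8 to the TX counter (as in the
CAN 2.0 fault confinement rules) and that no successful transmission
decrements it in between.

```c
#include <stdint.h>

/* Simplified model of the controller state; not the real register layout. */
typedef struct {
    uint8_t reset_mode; /* models the RM bit in CANMOD */
    uint8_t tx_err;     /* models the Transmit Error Counter */
    uint8_t rx_err;     /* models the Receive Error Counter */
} example_can_regs;

/* Do in software what the manual says the hardware does on Bus-off,
 * then leave reset mode so the controller can recover. */
static void example_bus_off_recover(example_can_regs *regs)
{
    regs->reset_mode = 1; /* 1. enter reset mode */
    regs->tx_err = 127;   /* 2. TX counter back to 127 ... */
    regs->rx_err = 0;     /*    ... and RX counter cleared */
    regs->reset_mode = 0; /* 3. clear RM: leave reset mode again */
}

/* Number of consecutive transmit errors until the counter exceeds 255,
 * assuming +8 per error and no decrements in between. */
static unsigned example_tx_errors_until_bus_off(unsigned tx_err)
{
    unsigned count = 0;

    while (tx_err <= 255u) {
        tx_err += 8u;
        count++;
    }
    return count;
}
```

Under these assumptions a counter restarted at 127 needs 17 consecutive
transmit errors before it overflows again, which is the window in which the
application gets to run.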

> What your patch does it to stop the CAN controller to send error frames as
> soon as a single error, or any kind, is detected, which probably breaks the
> CAN spec (which may be normal with CANopen, I don't know, but in that case
> the driver becomes specific to CANopen).
> 

No, my patch does exactly what is written in the LPC2xxx manual. The Bus-off
condition automatically stops the controller from sending error frames. This is
what the bus off condition is made for: ensuring that a broken node does not
destroy the whole CAN communication. My patch sets the TX counter back to 127.
This ensures that the controller stays in error passive mode, which means it
sends only Passive Error Flags on the bus. A Passive Error Flag comprises 6
recessive bits and will not destroy other bus traffic. Here is a good article
about CAN bus error handling:

http://www.kvaser.com/zh/about-can/the-can-protocol/23.html

So my patch does:

1. Ensure that the application is not blocked by bus error ISR/DSR
2. The application gets informed about bus off condition via
CYGNUM_CAN_EVENT_BUS_OFF.
3. The controller stays in error passive mode - sends only 6 recessive error
bits, and will not destroy other bus traffic
4. The controller properly recovers from bus off condition
5. The controller behaves as described in the user manual

-- 
You are receiving this mail because:
You are the assignee for the bug.

