avl_sweden

Reputation: 622

Linux can-bus excessive retransmit

I'm working on a project involving an embedded Linux device with CAN bus support.

I've noticed that if I try to send a CAN-packet without having anything attached to the CAN-bus, the transmit is automatically reattempted by the kernel an unlimited number of times. I can verify this using a scope - the same message is automatically transmitted over and over. This retransmission persists even if I shut down the process which created the message, and even if this process only ever attempts to transmit one single message.

My question is - is this normal behaviour for the Linux CAN stack? My worry is that if something ever goes wrong in the device and it erroneously concludes that it is alone on the bus, it might swamp the bus and make it unusable for the other bus participants. I would have expected there to be some sort of retry limit.

The device is running Linux 4.14.48, and the CAN controller is a Philips SJA1000.

Upvotes: 1

Views: 1912

Answers (2)

R Virzi

Reputation: 11

Short answer is yes - if a missing ACK is the only transmit error, the transmit error counter will stop at 128 and the node will not go into BUS OFF. It will keep retransmitting forever. This happened to me as well, and I just turned off the retransmit function from the processor side. Not sure if that is a CAN standard function or not.
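
To illustrate why the counter stops at 128, here is a minimal sketch of the CAN fault confinement rules for a node that is alone on the bus. The class and method names are hypothetical, and the model only covers the ACK error case: the transmit error counter rises by 8 per failed transmission while the node is error active, and is left unchanged once the node is error passive and sees no dominant bit while sending its passive error flag.

```python
from dataclasses import dataclass

ERROR_PASSIVE_LIMIT = 128  # counter at or above this: error passive
BUS_OFF_LIMIT = 256        # counter at or above this: bus off


@dataclass
class LoneTransmitter:
    """Hypothetical model of a node whose frames are never acknowledged."""

    tec: int = 0  # transmit error counter

    @property
    def state(self) -> str:
        if self.tec >= BUS_OFF_LIMIT:
            return "bus-off"
        if self.tec >= ERROR_PASSIVE_LIMIT:
            return "error-passive"
        return "error-active"

    def send_without_ack(self) -> None:
        """One transmission attempt that ends in an ACK error."""
        if self.state == "error-active":
            self.tec += 8
        # Error passive, ACK error, nobody else driving the bus:
        # the counter is not incremented, so bus off is never reached.


node = LoneTransmitter()
for _ in range(1000):
    node.send_without_ack()

print(node.tec, node.state)  # prints "128 error-passive"
```

In this model the node goes error passive after 16 failed attempts and then stays there no matter how many more retransmissions follow.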

Upvotes: 1

Lundin

Reputation: 213832

What you are seeing is likely error frames. Compliant behavior is this:

  • Node is error active. It attempts to send a data frame but gets no ACK bit set, since nobody is listening to it.
  • It will send out an error frame, which pretty much only consists of 6 dominant bits to purposely break bit stuffing.
  • The controller will re-attempt to send the message. If the new attempt also gets no ACK, another error frame will be sent. This will keep repeating automatically.
  • Once the transmit error counter reaches 128 (it goes up by 8 for each failed transmission), the node will go error passive, where it will still send error frames, but now with recessive level so that it doesn't disrupt other traffic.
  • Once the counter exceeds 255, the node will go bus off and shut up completely.

This should all be handled by the CAN controller hardware, not by the OS. You might need to reset or power cycle the SJA1000 once it goes bus off. If it never goes bus off, then something in the driver code might be continuously resetting the CAN controller after a certain number of errors.
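
One way to check which state the controller actually reaches is to look at the error frames that SocketCAN delivers to user space. The following is a rough sketch, not a tested tool: the constants are copied from linux/can.h and linux/can/error.h, the function names are made up for illustration, and the interface name "can0" is an assumption. Which events a given driver reports may vary.

```python
import socket
import struct

# Values copied from linux/can.h and linux/can/error.h.
CAN_ERR_FLAG = 0x20000000
CAN_ERR_MASK = 0x1FFFFFFF
CAN_FRAME_FMT = "=IB3x8s"  # can_id, length, padding, 8 data bytes

ERROR_CLASSES = {
    0x001: "tx-timeout",
    0x002: "lost-arbitration",
    0x004: "controller",
    0x008: "protocol",
    0x010: "transceiver",
    0x020: "no-ack",
    0x040: "bus-off",
    0x080: "bus-error",
    0x100: "restarted",
}

# Meaning of data[1] when the "controller" class bit is set.
CONTROLLER_STATUS = {
    0x01: "rx-overflow",
    0x02: "tx-overflow",
    0x04: "rx-warning",
    0x08: "tx-warning",
    0x10: "rx-passive",
    0x20: "tx-passive",
    0x40: "active",
}


def decode_error_frame(raw: bytes) -> list:
    """Return readable names for one raw struct can_frame, or [] if it is not an error frame."""
    can_id, _length, data = struct.unpack(CAN_FRAME_FMT, raw)
    if not can_id & CAN_ERR_FLAG:
        return []
    names = [name for bit, name in ERROR_CLASSES.items() if can_id & bit]
    if can_id & 0x004:
        names += [name for bit, name in CONTROLLER_STATUS.items() if data[1] & bit]
    return names


def open_error_socket(ifname: str = "can0") -> socket.socket:
    """Open a raw CAN socket that also receives error frames (Linux only)."""
    sock = socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW)
    sock.setsockopt(socket.SOL_CAN_RAW, socket.CAN_RAW_ERR_FILTER, CAN_ERR_MASK)
    sock.bind((ifname,))
    return sock


# Synthetic example: a controller state change to "transmit error passive".
example = struct.pack(CAN_FRAME_FMT, CAN_ERR_FLAG | 0x004, 8, bytes([0, 0x20, 0, 0, 0, 0, 0, 0]))
print(decode_error_frame(example))  # prints "['controller', 'tx-passive']"
```

On the device, something like `decode_error_frame(open_error_socket().recv(16))` in a loop would show whether the node stops at error passive, reaches bus off, or gets restarted.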

Mind that microcontroller implementations might act the same and reset upon errors too, since that's typically the only way to re-establish communication after a bus off. This depends on the nature of the CAN application.

Upvotes: 2
