  • SheeldS Contributor

Automotive Cybersecurity Testing Tools. Building Tools for the CAN Bus

Why, in an emerging field such as automotive cyber-security, you sometimes need to build your own automotive cyber security testing tools to get the job done.

Automotive cybersecurity testing tools. Why I don’t test my software with anyone else’s

Automotive cybersecurity testing tools come in all shapes and sizes. When it comes to testing and validating communication systems, you can find a variety of testing equipment for each protocol and standard you want to test.

Some are free, and some are expensive and considered professional equipment. Some can emulate the states of the communication, and some will test your device for accuracy and performance. But sometimes, especially in an emerging and evolving field such as automotive cybersecurity, commercially available solutions don’t do what you need them to do.

This article’s focus is the CAN bus, and in particular, CAN bus intrusion detection systems (IDS) used in in-vehicle automotive cybersecurity.

What parameters are we looking for in CAN testing equipment?

Accuracy: Automotive cybersecurity testing tools should send only the planned messages and nothing more. Although this sounds trivial, it’s not. For example, some Ethernet testing systems (software-based ones in particular) carry in-band control traffic on the same interface that is being tested.

Active protocol emulation: This is relevant for some CAN bus protocols; however, most communications are stateless broadcasts. It is more relevant for specific cases, such as for diagnostics or protocols such as SAE J1939 for trucks and other heavy-duty vehicles.

Error frames: Recreation of an error state on the bus might be relevant in some cases since we assume a vehicle may have an error on the bus from time to time (and we want to avoid a false alarm). On the other hand, an error state (on the bus or error frame) may indicate an attack if the attacker gains access to a CAN controller (on the bus) and uses it in a way that generates errors (whether intentionally or unintentionally).

Recording and replaying: Accurate recording and replaying of vehicle communications. Vehicle communications are often predictable, and messages are relayed at predefined rates. As can be seen in Arilou’s patents, Cerrado's IDS inspects, among other things, the timing of messages to determine whether an anomaly occurred.

Additional parameters: Ease of use and reliability. Can we use it programmatically? Other considerations include maximal recording length, support for database formats, number of buses, scalability (price), software tool support (e.g. Linux), and acceptance in the industry.

What is considered accurate timing for CAN bus?

Typically, CAN-bus runs in a vehicle at rates of 125kbps to 500kbps. Long messages are around 120 bits when considering bit-stuffing, and short messages can be around 60 bits. A typical CAN controller runs on an embedded processor running an RTOS.
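To make these frame lengths concrete, here is a minimal Python sketch that estimates an upper bound on the on-wire length of a classical CAN data frame and the time it occupies the bus. The function names are illustrative, and the bound assumes the commonly used worst-case bit-stuffing formula and includes the 3-bit interframe space, so it lands slightly above the rough figures quoted above.

```python
def worst_case_frame_bits(data_bytes: int, extended_id: bool = False) -> int:
    """Upper bound on the length of a classical CAN data frame, in bits.

    Assumption: worst-case bit stuffing, interframe space included.
    """
    if not 0 <= data_bytes <= 8:
        raise ValueError("classical CAN carries 0 to 8 data bytes")
    # Stuffable bits excluding data: SOF, arbitration, control and CRC fields.
    # 34 bits for an 11-bit identifier, 54 bits for a 29-bit identifier.
    header_and_crc = 54 if extended_id else 34
    stuffable = header_and_crc + 8 * data_bytes
    # Worst case: one stuff bit per 4 bits after the first 5 identical bits.
    max_stuff_bits = (stuffable - 1) // 4
    # 13 fixed bits are never stuffed: CRC delimiter, ACK slot, ACK delimiter,
    # 7-bit end of frame, and the 3-bit interframe space.
    return stuffable + 13 + max_stuff_bits


def frame_time_s(bits: int, bitrate_bps: int) -> float:
    """Time the frame occupies the bus, in seconds."""
    return bits / bitrate_bps


if __name__ == "__main__":
    for n in (0, 8):
        bits = worst_case_frame_bits(n)
        print(n, "data bytes:", bits, "bits,",
              frame_time_s(bits, 500_000) * 1e6, "us at 500 kbps")
```

An 8-byte frame therefore occupies a 500 kbps bus for a few hundred microseconds, which is the same order of magnitude as the jitter figures discussed in this section.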

Message-send timing: Typical high-priority messages (those that win the arbitration process) on a 500kbps bus with a well-designed ECU have a jitter of <100μs. Low-priority messages (which may wait in the queue), slower buses, and passing through gateways may result in jitter on the order of 1ms.
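One way to check a recording against these figures is to measure how far the gaps between consecutive occurrences of a periodic message stray from the nominal period. The sketch below is only an illustration: it assumes jitter is defined as the worst deviation of an inter-arrival gap from the median gap, and the function name is hypothetical.

```python
from statistics import median


def period_jitter(timestamps_s):
    """Return (nominal_period_s, worst_deviation_s) for one periodic CAN ID.

    timestamps_s: receive timestamps of a single message ID, in seconds,
    in the order they were recorded.
    """
    if len(timestamps_s) < 3:
        raise ValueError("need at least three timestamps")
    gaps = [later - earlier
            for earlier, later in zip(timestamps_s, timestamps_s[1:])]
    nominal = median(gaps)  # robust estimate of the configured period
    worst = max(abs(gap - nominal) for gap in gaps)
    return nominal, worst


if __name__ == "__main__":
    # A 10 ms message whose third occurrence arrives 50 us late.
    stamps = [0.0, 0.010, 0.02005, 0.030, 0.040]
    nominal, worst = period_jitter(stamps)
    print(f"period {nominal * 1e3:.3f} ms, worst deviation {worst * 1e6:.1f} us")
```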

Clock skew and jitter: Clock skew and jitter may result in inter-bit timing change and other effects. Low-level clocking anomalies can be found using low-level signal analysis as covered by Cerrado's PIPS system and relevant Cerrado patents.

Inter-bus timing: Some messages may be a result of other messages passing through a gateway. Let us consider message A on bus 1 (125kbps) arriving at the gateway. As a result, the gateway sends message B on bus 2 (500kbps). It will be anomalous if we see message B before message A. As can be seen in the appendix, standard and even professional equipment will typically result in message reordering in this scenario.

Why would you need testing equipment for CAN bus IDS and derived testing system requirements?

There are several use cases:

Demo: A demo of a CAN bus anomaly detection system. This is the simplest scenario since we control the environment and what we test. Any automotive cybersecurity testing tool used here should be predictable and reliable, with some cap on the maximum jitter. However, real-time accuracy is not a must.

Vehicle integration: With vehicle integration, test equipment is used for 2 purposes:

  • Accurate and long recording, where we will require real-time, accurate recording of communications on several buses.

  • Accurate real-time replay on multiple buses.

Bench testing with real ECUs: In some scenarios, real ECUs or embedded devices are integrated on a test bench. An example of this is PASTA by Toyota. This scenario is very similar to the vehicle integration scenario.

Bench testing: In some cases, to make a test predictable, repeatable, and easy to use, a vehicle recording is made and replayed as a test. In these cases, both training and testing are done using the same replay equipment. It doesn’t have to be accurate on both buses as training can use the same system if each replay has the same repeatable errors (and is therefore consistent).

Internal development and testing: In these cases, we will want the testing tools to be accurate on all buses. They should be scalable, and therefore inexpensive.

Attack generation: Can be done in any of the mentioned scenarios and even in a standalone vehicle attack scenario. Usually, timing is not as important as a fast response to a message. Other desirable capabilities include programming the device for more advanced attacks, actively dropping messages from other participants on the bus, and generating errors at a precisely synchronized time.

What automotive cybersecurity testing tools are available on the market?

We will only go over equipment relevant to automotive cybersecurity testing tools that we’ve used, tested and have a professional opinion on. This review is based on personal experience and should be viewed as such.

First, let us consider software support. Each professional tool comes with its own software suite. However, for interoperability and ease of use, it’s often advantageous to use a third-party software tool.

BUSmaster: A software tool by Bosch and ETAS. It’s free, open-source, supports many hardware (HW) interfaces, and is easy to use. Its programmatic capability (node simulation) is a great advantage, although the programming/compiling interface is challenging when performing advanced operations. It only supports Windows. As for timing capability, recording accuracy depends on the hardware’s ability to timestamp incoming messages. Replay accuracy depends both on the accuracy of the OS and on an intrinsic problem: the tool adds a delay to every message send, which accumulates as timing drift.

As can be seen in the source, it doesn’t take into account processing time and the delay of event calling.
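The difference between drift and bounded jitter can be illustrated with a small model. The sketch below is not BUSmaster's code; it is a simplified simulation with hypothetical function names, assuming a fixed per-send overhead. The first scheduler waits for the recorded gap between messages and ignores its own overhead, so the error accumulates. The second schedules each message against the replay start time, so the error stays bounded.

```python
def replay_relative(offsets_s, overhead_s):
    """Send times when the tool waits for the gap since the previous message.

    The per-send overhead is never subtracted, so it accumulates as drift.
    """
    now, previous, sent = 0.0, 0.0, []
    for offset in offsets_s:
        now += offset - previous  # wait for the recorded inter-message gap
        now += overhead_s         # processing and event-call delay
        sent.append(now)
        previous = offset
    return sent


def replay_absolute(offsets_s, overhead_s):
    """Send times when each message is scheduled against the replay start."""
    now, sent = 0.0, []
    for offset in offsets_s:
        now = max(now, offset)    # wait until the absolute deadline
        now += overhead_s         # the same overhead, but it does not add up
        sent.append(now)
    return sent


if __name__ == "__main__":
    offsets = [i * 0.010 for i in range(1, 1001)]  # 1000 messages, 10 ms apart
    overhead = 100e-6                              # assumed 100 us per send
    drifted = replay_relative(offsets, overhead)
    bounded = replay_absolute(offsets, overhead)
    print("relative scheduling error at end:", drifted[-1] - offsets[-1])
    print("absolute scheduling error at end:", bounded[-1] - offsets[-1])
```

In this model, the relative scheduler ends up about 0.1 s late after 1000 messages, while the absolute scheduler is never more than the single-send overhead late.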

Linux: Has comprehensive support for CAN bus through its SocketCAN drivers, which provide a standard network interface for a CAN device. Besides the kernel module, it supports many HW modules and has comprehensive protocol support, sending and recording capabilities, virtual CAN gateways, virtual CAN devices, and much more. It is open-source and free. If the HW supports HW timestamping of incoming messages, it will receive an accurate timestamp on the receiving side. Using Python, it is easy to program and use. On the downside, the sending of messages is not accurate and depends on the OS. This is true even for VCAN (the virtual CAN device): although it is a simulation model controlled entirely by the OS, the user cannot control the timestamps, and it uses OS timestamping.
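As a small illustration of how approachable SocketCAN is from Python, the sketch below packs and unpacks classical CAN frames using only the standard library. The frame layout follows the `struct can_frame` format used in the Python `socket` module documentation. The interface name `vcan0` and the helper names are assumptions, and `send_on` only works on a Linux host where that interface exists.

```python
import socket
import struct

# Layout of struct can_frame: 32-bit CAN ID, 8-bit length, 3 padding bytes,
# 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"


def build_can_frame(can_id: int, data: bytes) -> bytes:
    """Pack a CAN ID and up to 8 data bytes into a raw SocketCAN frame."""
    if len(data) > 8:
        raise ValueError("classical CAN carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data),
                       data.ljust(8, b"\x00"))


def dissect_can_frame(frame: bytes):
    """Unpack a raw SocketCAN frame into (can_id, length, data)."""
    can_id, length, data = struct.unpack(CAN_FRAME_FMT, frame)
    return can_id, length, data[:length]


def send_on(interface: str, can_id: int, data: bytes) -> None:
    """Send one frame on a SocketCAN interface (Linux only).

    The send time is decided by the OS, not by the caller.
    """
    with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) as sock:
        sock.bind((interface,))
        sock.send(build_can_frame(can_id, data))


# Example, assuming a virtual interface named vcan0 has been created:
# send_on("vcan0", 0x123, b"\x01\x02")
```

Note that nothing in this interface lets the caller specify when the frame leaves the controller, which is the limitation described above.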

Specialized HW devices

Vector: Vector Informatik devices are the gold standard in the industry for CAN testing devices. As such, using Vector is extremely expensive. Vector provides a library and API for others to use (Windows only for now). BUSmaster supports Vector devices; Linux, however, is not supported.

Vector software tools provide wide support and capabilities, some of which are unique to Vector. The receiving side of Vector devices has HW timestamping. The standard Vector devices we used do not have timed HW buffers for message replay, and as such depend on the OS’s timing capability, which is not accurate enough.

Vector claims the VN8900 series is designed for exact timing. We have not had a chance to test this equipment yet, and we have not encountered it in the industry so far. The CANalyzer SW with a device from the VN1600 series did not exhibit timing drift (unlike BUSmaster) but did exhibit SW-based jitter, as can be seen in figure 2. We did not test it in an extreme scenario (e.g. what would happen if we connected another device to the internal USB bus, etc.).

As for reliability, despite the price, we had a Vector OBD cable break just like any other OBD cable. In summary, we must support Vector since it is a standard in the industry; however, its timing is not accurate enough for replay-based IDS testing.

PEAK-System Technik: Peak provides professional equipment for CAN bus. It is not a standard in the industry and therefore it is reasonably priced. Peak devices work with BUSmaster, Linux, and Peak’s own software. For receiving messages, they have accurate HW-based message timestamping. As with Vector, message replay timing is based on the operating system; therefore, it is not accurate enough for replay-based IDS testing.

Professional devices are not enough

As we’re at the leading edge of a new discipline, if we want automotive cybersecurity testing tools tailored to our needs (that are scalable, accurate, and easy to use) we’re going to have to make our own. Arilou has developed internal replay and recording tools based on commercially available HW.

The replay tool achieves <100μs jitter on multiple CAN buses. In order to perform at that level, an embedded real-time OS is used. The replay equipment buffers messages together with their transmit timestamps, and the send timing is managed by the embedded HW and not by the PC’s OS.

Recording is flexible since most of the professional tools support HW timestamping. In addition to physical CAN message sending, control over VCAN timing in Linux was implemented for simulation and internal testing purposes. It required both kernel-side driver changes and changes on the Python side.

Appendix: Inter-bus message re-ordering analysis:

When receiving messages over CAN bus devices, HW timestamps are created by the CAN controller (and its clock) when the message finishes being received. This method of receiving has three problems:

  1. The clock must be synchronized between buses.

  2. If timestamping is performed during the receiving interrupt (and not in a modified CAN controller), then while one message is being processed, other received messages will get a delayed timestamp.

  3. The timestamp is added at the end of the message and not at the start of the message.

Since timestamp recording problems 1 and 2 are something the industry is aware of, we will concentrate on problem 3.

Let us consider the following scenario:

ECU1 sends an 8-byte message (let us assume a 125-bit message) over bus A, through the gateway (GW) to ECU2 over bus B.

Sending a 125-bit message on bus A at 125kbps takes $T_1 - T_0 = 1\,\mathrm{ms}$. At $T_1$, both the GW and the testing equipment have received the complete message and timestamp it with $T_1$.

At $T_2$ the GW starts sending the message. Usually, the processing time $PT = T_2 - T_1$ is much smaller than the message send-time. Sending the message on bus B at 500kbps takes $T_3 - T_2 = 0.25\,\mathrm{ms}$. ECU2 and our testing equipment receive the message and timestamp it with $T_3$.

The time between the frames as recorded by the testing equipment is $T_3 - T_1 = 0.25\,\mathrm{ms} + PT \approx 0.25\,\mathrm{ms}$, whereas the real time difference between the starts of the two messages is $T_2 - T_0 = 1\,\mathrm{ms} + PT \approx 1\,\mathrm{ms}$, meaning we have a $0.75\,\mathrm{ms}$ timing measurement error between the messages.

Now we would like to replay this scenario to test our IDS. First, we will send the message on bus A’ at 125kbps then on bus B’ at 500kbps. We expect the message to first be received on bus A’ and then on bus B’.

Using the same notation as before, with an apostrophe denoting the testing bus, the replay simulator starts sending the message on bus A’ at $T'_0$ and on bus B’ at $T'_2$. The message is received and timestamped on bus A’ at $T'_1$ and on bus B’ at $T'_3$. We can see we have message reordering. Why?

We already have a $\sim 0.75\,\mathrm{ms}$ time shift from the sniffing side. On the replay side, the recorded end-of-message timestamps are used as send start times ($T'_0 = T_1$ and $T'_2 = T_3$), so we have the same time shift again, accounting for a total of $\sim 1.5\,\mathrm{ms}$ time shift. This results in $T'_1 - T'_3 \approx 0.5\,\mathrm{ms}$, with the message on bus B’ timestamped before the message on bus A’, instead of the $T_3 - T_1 \approx 0.25\,\mathrm{ms}$ we have on the sniffing side. Thus we have message reordering.
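The arithmetic of this scenario can be reproduced with a few lines of Python. This is a sketch under the appendix's own assumptions (125-bit frames, negligible gateway processing time), with the additional assumption that the replay tool uses each recorded end-of-frame timestamp directly as the send start time. Variable names are illustrative.

```python
BITS = 125          # frame length after bit stuffing, as assumed above
RATE_A = 125_000    # bus A bit rate, bits per second
RATE_B = 500_000    # bus B bit rate, bits per second
PT = 0.0            # gateway processing time, assumed negligible


def end_of_frame(start_s: float, bits: int, bitrate_bps: int) -> float:
    """Time at which a frame started at start_s has been fully received."""
    return start_s + bits / bitrate_bps


# Original traffic in the vehicle.
t0 = 0.0
t1 = end_of_frame(t0, BITS, RATE_A)   # message recorded on bus A
t2 = t1 + PT                          # gateway starts forwarding
t3 = end_of_frame(t2, BITS, RATE_B)   # message recorded on bus B

# Naive replay: recorded end timestamps are used as send start times.
t0_replay, t2_replay = t1, t3
t1_replay = end_of_frame(t0_replay, BITS, RATE_A)
t3_replay = end_of_frame(t2_replay, BITS, RATE_B)

print("recorded gap  T3 - T1  [ms]:", (t3 - t1) * 1e3)
print("replayed gap  T1' - T3' [ms]:", (t1_replay - t3_replay) * 1e3)
print("reordered:", t3_replay < t1_replay)
```

With these numbers, the message on bus B’ completes about 0.5 ms before the message on bus A’, even though it was recorded 0.25 ms after it.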

How do we fix the problem?

The replay equipment software should compensate for the timing issue by virtually timestamping the beginning of the message. The reason we don’t want it on the sniffer side is that standard CAN controllers can only timestamp at the end of the message, therefore in order to have (during the IDS training phase) the same timestamp we have during actual vehicle integration, we have to leave the original sniffed timestamp.

Subtracting the message send-time from each message’s recorded timestamp is required. Taking into account the number of bits after bit stuffing $N_b$ and the bus rate $Baud$ results in:

$$\Delta T = \frac{N_b}{Baud}$$

This will also solve time differences on the same bus originating from variable message length, which may not cause messages to reorder but will cause timing differences.
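A minimal sketch of this compensation on the replay side might look as follows. It is not the actual tool: the `RecordedFrame` type and function names are hypothetical, and it assumes the stuffed frame length $N_b$ is known or can be estimated for every recorded frame. The recorded timestamps are left untouched, and only the scheduled send start time is shifted back by $\Delta T$.

```python
from dataclasses import dataclass


@dataclass
class RecordedFrame:
    bus: str                 # name of the bus the frame was recorded on
    end_timestamp_s: float   # sniffer timestamp, taken at the end of the frame
    wire_bits: int           # N_b: frame length in bits after bit stuffing


def replay_start_time(frame: RecordedFrame, bitrate_bps: int) -> float:
    """Virtual start-of-frame time: end timestamp minus Delta T = N_b / Baud."""
    return frame.end_timestamp_s - frame.wire_bits / bitrate_bps


def schedule(frames, bitrates):
    """Return (start_time_s, frame) pairs ordered by compensated start time."""
    plan = [(replay_start_time(f, bitrates[f.bus]), f) for f in frames]
    return sorted(plan, key=lambda pair: pair[0])


if __name__ == "__main__":
    recording = [
        RecordedFrame("A", 0.001, 125),    # recorded at T1 on bus A
        RecordedFrame("B", 0.00125, 125),  # recorded at T3 on bus B
    ]
    for start, frame in schedule(recording, {"A": 125_000, "B": 500_000}):
        print(f"bus {frame.bus}: start sending at {start * 1e3:.3f} ms")
```

For the appendix scenario, the frame on bus A is scheduled at 0 ms and the frame on bus B at 1 ms, so the replayed frames complete at the recorded 1 ms and 1.25 ms and the original order is preserved.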

To summarize, even seemingly simple scenarios such as sniffing and replaying messages can present nontrivial challenges. To be precise, sometimes the commercially available solutions are not good enough.

*Cerrado previously known as Arilou
