SAE J1939 Cybersecurity. How to Secure Trucks and other Commercial Vehicles
Securing truck, heavy-duty, and other commercial-vehicle CAN bus networks is challenging, but not impossible.
*Updated: May 3rd, 2021
Why does SAE J1939 cybersecurity need a different approach?
At some point, you have probably had one of those terrible suits that are too loose in one place and breathtakingly tight in another. And just as we could all benefit from a dress suit that is tailor-made to our exact size, so too can a customized cybersecurity solution benefit in-vehicle networks.
Customization is necessary in an environment in which a baggy fit (or high network overhead in this case) might mean the need for millions of dollars in additional hardware costs over the lifetime of a full vehicle production cycle.
Not all networks and protocols are created equal. Accordingly, we must tailor cybersecurity protection to specific network characteristics. Threat models, anticipated attack vectors: these must be measured, chalked, and cut to fit. The SAE J1939 CAN bus protocol –designed specifically for commercial vehicles– is no different.
Understanding the Layers
You can divide the field into five layers:
Individual ECU (Electronic Control Unit)
IVN (In-Vehicle Network)
EE (Electrical and Electronic) architecture
Connected vehicle
Connected fleet
In this article, we are looking mainly at the IVN layer. This layer will be addressed in upcoming standards SAE J1939-91A (in-vehicle network security) and J1939-91B (telematics interfaces, e.g. secure bidirectional OTA (over-the-air) updates, ExVe (extended vehicle), and ITS (intelligent transportation systems) among others).
The main reason for this approach is that we see this layer as the most vulnerable, with the highest risks and lowest inherited protection. Other layers (such as securing ECU hardware and embedded software) are not specific to J1939. Higher layers (such as connected vehicle and connected fleet) are more generic and have much in common with passenger vehicle cybersecurity.
This article derives from our field experience and understanding of SAE J1939 cybersecurity. It reflects our expert professional view of the subject, rather than just being an explanation of the standards. The conclusions and proposed courses of action you will find later in this article are not academic, but realistic examples that you can use as guidelines for implementation.
Who are the threat actors targeting commercial vehicles?
Know your enemy.
When you consider which threat actors are motivated to attack commercial vehicles, it’s important to look beyond normal civilian considerations. Trucks, heavy-duty, and other commercial vehicles often form part of critical economic and social infrastructure, including heavy industry, agriculture, emergency, and military services.
The three main categories you should look at when considering potential attackers are:
Criminals, looking for ransom or wishing to inflict physical damage. They may want to harm the reputation of your business or use your connected devices as a vector to hack government infrastructure or private individuals. Their motive may be immediate financial gain, but industrial espionage should also be considered.
Terrorists, nation-states, anarchists, activists, or any group/individual with a political or ideological background. Their primary goal is to cause as much damage as possible.
The owner or end user, who might attempt to hack the vehicle themselves, for example to use clone ECU replacements or chip tuning. This category is often overlooked, but you should keep it in mind because it can ultimately harm your bottom line if you’re a fleet owner. This is especially true now that the automotive industry is looking to leverage in-vehicle data and IVI ecosystems as new revenue streams.
How do attack surfaces and threat models for SAE J1939 cybersecurity differ from regular CAN?
We will not look at individual ECUs as they can be considered just another hardware/software embedded device. While no doubt there are specific vulnerabilities and exposures in some devices, here we are focusing exclusively on scenarios using the J1939 protocol.
Attack Surfaces and Threat Models
The preferred attack method would be remote. This type of attack can be achieved from anywhere and leaves very few traces –if any.
Attaching a physical device is an option that provides full access to the bus, but this has the risk of being detected. Injecting malware into an ECU is a more viable option. An attacker who can gain physical access to a vehicle has more effective and efficient methods at their disposal, but still, the cyber-attack has many virtues.
In contrast to passenger vehicles, which use regular CAN bus (with flavors per individual make and model due to proprietary implementations by OEMs), J1939 is well standardized (in production terms). SAE J1939 architectures share a pool of modular components, with little variation of integration between vehicles.
In many cases, ECUs are interchangeable between vehicles manufactured by different OEMs. This similarity optimizes development costs and re-use factors. It also means that once hackers identify a kill chain, they can leverage it to attack many vehicles from multiple OEMs. This characteristic offers clear financial benefits to manufacturers but is an obvious threat from a cybersecurity perspective.
Other cases for J1939 include modular vehicle attachments such as agricultural equipment or trailers. These can serve as a vector to attack the parent vehicle (in this instance, a tractor). Gaining access to a trailer left overnight is much simpler, say, than to a tractor that is better guarded. The trailer is likely used by many service vehicles, and thus can infect many, if not all of them. These, in turn, can be infected by compromised service vehicles creating a viral effect.
It is worth noting that attacks originating from the cloud (such as fleet management systems) or arriving over Wi-Fi and Bluetooth are also possible. However, as they do not relate directly to SAE J1939 cybersecurity, we will look at them in a future article.
Regardless of the attack method, we should assume the enemy is among us, and that somehow the attacker can inject rogue traffic into the J1939 network.
SAE J1939 cybersecurity attack vectors and scenarios
J1939 uses 29-bit extended CAN identifiers with its own format. Nothing in this format authenticates the origin of a message.
Any spoofing or impersonation attack is possible, since any ECU can send any message ID it wishes and there is no way to ensure that a message originated from the designated ECU.
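To make the point concrete, here is a minimal Python sketch of how a 29-bit J1939 identifier breaks down into priority, PGN, destination, and source address. The function and class names are our own for illustration, and the PGN and addresses are just example values. The source address is simply a field the sender fills in, so two different nodes can produce bit-identical identifiers.

```python
from typing import NamedTuple

class J1939Id(NamedTuple):
    priority: int
    pgn: int
    destination: int   # 0xFF (global) when the PGN is a broadcast (PDU2) one
    source: int        # claimed by the sender; nothing on the bus verifies it

def decode_id(can_id: int) -> J1939Id:
    """Split a 29-bit identifier into its J1939 fields."""
    priority = (can_id >> 26) & 0x7
    page_bits = (can_id >> 24) & 0x3      # extended data page and data page
    pdu_format = (can_id >> 16) & 0xFF
    pdu_specific = (can_id >> 8) & 0xFF
    source = can_id & 0xFF
    if pdu_format < 240:                  # PDU1: PS carries the destination address
        pgn = (page_bits << 16) | (pdu_format << 8)
        return J1939Id(priority, pgn, pdu_specific, source)
    pgn = (page_bits << 16) | (pdu_format << 8) | pdu_specific
    return J1939Id(priority, pgn, 0xFF, source)

def encode_id(priority: int, pgn: int, destination: int, source: int) -> int:
    """Build an identifier; any node can put any source address here."""
    pdu_format = (pgn >> 8) & 0xFF
    pdu_specific = destination if pdu_format < 240 else pgn & 0xFF
    return (((priority & 0x7) << 26) | (((pgn >> 16) & 0x3) << 24)
            | (pdu_format << 16) | ((pdu_specific & 0xFF) << 8) | (source & 0xFF))

# Example values only: the same identifier built by the designated ECU
# and by a rogue node that claims the same source address.
genuine = encode_id(priority=3, pgn=0xF004, destination=0xFF, source=0x00)
spoofed = encode_id(priority=3, pgn=0xF004, destination=0xFF, source=0x00)
```

A receiver that sees `genuine` and `spoofed` on the bus has no field it can use to tell them apart.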
For example, any ECU can inject messages that would tamper with the power train, engine control, or instrument cluster ECUs. Misuse of diagnostics messages can stop a trailer dead in its tracks. Complex scenarios, such as MITM (Man In The Middle), may also be staged.
Passenger vehicle configuration remains fixed for the vast majority of cases after it leaves the OEM assembly line, and via the dealer, reaches its owner. This means that the security mechanism imposed at this point remains valid and relevant for the whole lifecycle of the vehicle.
This is not the case with commercial vehicles as they have a much more complex and dynamic lifecycle. The dealer is entitled to make changes, add optional components, etc. The same applies to the vocational integrator, the fleet owner, or any independent owner.
Equipment can be mixed and matched, and you will see this more often than not with agricultural equipment. Even worse, these are usually shared among many vehicles, enabling the spread of malware to any connected device, and vice versa. An epidemic effect is highly likely in this sort of situation.
Aftermarket Fleet Management Equipment Installation
These are devices added by fleet owners to monitor and control their fleet. In some cases, regulations require the installation of driver-hours recording ELDs (Electronic Logging Devices) and other telematics equipment.
In all such cases, where these devices have not been part of the OEM cybersecurity control process –nor part of its supply chain– cyber-protection is not guaranteed.
As a result, such devices can be compromised and used as a means to introduce malware into the vehicle.
Specific Embedded Software Issues
There are several types of vulnerabilities when implementing a protocol or a standard:
Inherent protocol vulnerabilities, where a protocol is defined in a vulnerable way – we can attack the system just by reading the specification.
Implementation vulnerabilities or software bugs, for example, buffer overflow.
A poorly defined/complex protocol that leads to implementation vulnerabilities, or code flow that exposes the protocol to attacks.
As we are focusing on the J1939 protocol, we won’t cover generic embedded issues or general ECU-related vulnerabilities. J1939 is a complex protocol (compared to the standard CAN bus), so we shall discuss the third vulnerability type. To explore it in any depth, we need to look at actual code.
DISCLAIMER: The code used for this vulnerability assessment –although freely available on the internet as open-source for educational purposes– will not be referenced at the request of the code owner. All vulnerabilities found were reported to the owner to help make it better. This is not amateur code, having been developed by an authority in the field.
The code targets various Arduino platforms, including ARM-based ones, and therefore has some small-footprint requirements.
General Pitfalls Seen in the Code
A common flaw is the use of the J1939 sequence number (SN) as an index into the message data without checking the following:
0 < SN <= 255 (SN = 0 may easily cause a buffer overflow)
SN <= number of frames
SN should go up by one – no reordering in CAN bus
Another is accepting the message size and the number of frames without a sanity check:
Number of frames = ceil(message_size/7)
Also, when the message buffer is smaller than the maximal size of 1785 bytes, no special care is taken to prevent buffer overflow:
Validate the number of frames and message size to match the buffer size.
When the buffer size is not a multiple of 7, pay special attention to the last copy instruction.
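The checks listed above can be sketched in a few lines of Python. This is a minimal illustration rather than the code that was assessed; the function names and the `last_sn` bookkeeping are our own assumptions.

```python
import math

BYTES_PER_FRAME = 7
MAX_TP_SIZE = 1785          # 255 frames * 7 bytes

def header_is_sane(message_size: int, num_frames: int,
                   buffer_size: int = MAX_TP_SIZE) -> bool:
    """Check the announced size and frame count against each other and the buffer."""
    if not 0 < message_size <= min(buffer_size, MAX_TP_SIZE):
        return False
    return num_frames == math.ceil(message_size / BYTES_PER_FRAME)

def sequence_is_sane(sn: int, last_sn: int, num_frames: int) -> bool:
    """SN must be in range and exactly one above the previous one (no reordering)."""
    return 0 < sn <= min(num_frames, 255) and sn == last_sn + 1

def bytes_to_copy(sn: int, message_size: int) -> int:
    """How many of the 7 data bytes belong to the message for this frame."""
    offset = (sn - 1) * BYTES_PER_FRAME
    return max(0, min(BYTES_PER_FRAME, message_size - offset))
```

`bytes_to_copy` covers the last-frame case: for a 256-byte message, frame 37 contributes only 4 bytes, so copying all 7 would run past a 256-byte buffer.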
State Machine Timeouts and Decisions
This area is harder to pin down, since every decision may have side effects. In general, use a ‘what-if’ approach to cover as many attack scenarios as possible.
Message Override Vulnerability
In standard CAN bus attacks, when overriding a message, we usually either send a message very frequently or disable the sending ECU (by entering a diagnostics session for example). Sending frequent messages creates a substantial footprint, and may be visible. Also, a diagnostic session may require us to fill in a lot of these.
The J1939 transport protocol enables us to override a message without the sender or receiver being aware. In this attack, we will override the message after it has been transmitted by the sender, making it a virtual MITM attack.
We anticipate that a legitimate ECU will send a TP.CM BAM message (these are periodic messages) and send our own TP.CM BAM message first, with the number of packets (byte 4) doubled, assuming that the number of frames is smaller than 128. After the legitimate sender sends the whole message (TP.CM BAM and the TP.DT data messages), we send our data. Neither the receiver nor the sender knows the message has been modified. Furthermore, the attacker can look at the original before sending their own.
How it works:
If the BAM messages state machine is in an idle state, and we receive a TP.CM message with command BAM(32), we read the BAM message data and move to the next state. Any further BAM messages will be ignored until the message is completed, or a timeout occurs. So, the second BAM message is ignored.
There is no sanity check between the message length and the number of packets. The write pointer to the frame is deduced from the message sequence number, without validation. When the attacker sends the messages with the sequence numbers starting from sequence 1, it will override the reassembled message data. Finally, since we have doubled the number of expected frames, only now will the TP (transport protocol) transaction finish and the message be received by the higher layers.
Several mitigations could prevent the attack:
Comparing the number of frames and message length.
Comparing the number of frames/message length with known PGN message length.
Comparing the message sequence number with a stored value since there is no reordering on the CAN bus.
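The following Python sketch simulates the override against a simplified BAM receiver, with and without two of the mitigations above (frame count versus message length, and strictly increasing sequence numbers). It is our own model, not the assessed code: the class, the helper functions, the payloads, and the PGN value are all hypothetical.

```python
import math

BAM = 32
BYTES_PER_FRAME = 7

class BamSession:
    """Simplified BAM receiver; hardened=True enables the two checks."""
    def __init__(self, hardened: bool):
        self.hardened = hardened
        self.active = False
        self.delivered = None

    def on_tp_cm(self, data: bytes) -> None:
        if self.active or data[0] != BAM:
            return                          # further BAMs are ignored mid-session
        size = data[1] | (data[2] << 8)
        frames = data[3]
        if self.hardened and frames != math.ceil(size / BYTES_PER_FRAME):
            return                          # frame count does not match length
        self.size, self.frames, self.received = size, frames, 0
        self.buf = bytearray(frames * BYTES_PER_FRAME)
        self.active = True

    def on_tp_dt(self, data: bytes) -> None:
        if not self.active:
            return
        sn = data[0]
        if self.hardened and sn != self.received + 1:
            self.active = False             # out-of-order SN: drop the session
            return
        offset = (sn - 1) * BYTES_PER_FRAME # write position taken from the SN
        self.buf[offset:offset + BYTES_PER_FRAME] = data[1:8]
        self.received += 1
        if self.received == self.frames:
            self.delivered = bytes(self.buf[:self.size])
            self.active = False

def bam(size: int, frames: int, pgn: int = 0xFEE3) -> bytes:   # PGN is arbitrary
    return bytes([BAM, size & 0xFF, size >> 8, frames, 0xFF,
                  pgn & 0xFF, (pgn >> 8) & 0xFF, pgn >> 16])

def dt(sn: int, chunk: bytes) -> bytes:
    return bytes([sn]) + chunk.ljust(BYTES_PER_FRAME, b"\xff")

def run(hardened: bool) -> bytes:
    rx = BamSession(hardened)
    legit, forged = b"LEGIT-PAYLOAD!", b"FORGED-PAYLOAD"       # 14 bytes each
    rx.on_tp_cm(bam(14, 4))                 # attacker first, frame count doubled
    rx.on_tp_cm(bam(14, 2))                 # legitimate BAM
    for payload in (legit, forged):         # legitimate data, then attacker data
        for sn in (1, 2):
            rx.on_tp_dt(dt(sn, payload[(sn - 1) * 7: sn * 7]))
    return rx.delivered
```

In this model `run(False)` delivers the forged payload to the upper layer, while `run(True)` rejects the attacker's announcement and delivers the legitimate one.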
PTP (peer-to-peer) Sender DOS (denial-of-service)
Although DOS (denial-of-service) attacks in SAE J1939 cybersecurity are trivial, we will highlight this one since it could have been easily avoided.
The TP (transport protocol) defines a PTP session that requires RTS (Request To Send) messages, and that the receiver should send a CTS (Clear To Send) message.
When it sees an RTS request (even one addressed to another ECU), the attacker keeps sending CTS replies. This prevents the sender from transmitting the data, and continues until the receiver times out and sends an abort message.
How it works:
When receiving a CTS reply, the sender restarts its sending timer, which prevents the timeout from occurring.
To mitigate this, the sender should never reset the timer when it receives repeated CTS messages.
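The timer behaviour can be illustrated with a small Python simulation. This is a simplified model under our own assumptions: the class name, the 100 ms flood period, and the 1250 ms timeout are illustrative values, not taken from the assessed code.

```python
class RtsSender:
    """Sender-side timer for an RTS/CTS session (simplified)."""
    def __init__(self, timeout_ms: int, reset_on_repeat: bool):
        self.timeout_ms = timeout_ms
        self.reset_on_repeat = reset_on_repeat   # True models the flawed behaviour
        self.deadline = None
        self.cts_seen = False
        self.aborted = False

    def send_rts(self, now_ms: int) -> None:
        self.deadline = now_ms + self.timeout_ms
        self.cts_seen = False
        self.aborted = False

    def on_cts(self, now_ms: int) -> None:
        if self.aborted:
            return
        # Only the first CTS may extend the deadline in the hardened variant.
        if not self.cts_seen or self.reset_on_repeat:
            self.deadline = now_ms + self.timeout_ms
        self.cts_seen = True

    def tick(self, now_ms: int) -> bool:
        if not self.aborted and self.deadline is not None and now_ms >= self.deadline:
            self.aborted = True
        return self.aborted

def simulate(reset_on_repeat: bool, flood_period_ms: int = 100,
             duration_ms: int = 5000, timeout_ms: int = 1250):
    """Return the time at which the sender aborts, or None if it never does."""
    sender = RtsSender(timeout_ms, reset_on_repeat)
    sender.send_rts(0)
    for now in range(flood_period_ms, duration_ms + 1, flood_period_ms):
        if sender.tick(now):
            return now
        sender.on_cts(now)                       # attacker's repeated CTS
    return None
```

With `reset_on_repeat=True` the sender never times out for as long as the flood lasts. With `reset_on_repeat=False` it gives up one timeout period after the first CTS.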
TP Receive Buffer Overflow
Since memory allocation is static, you should not expect a buffer overflow in the TP packet data. However, there are two ways to cause one.
Send a BAM or PTP TP message with a sequence number of 0. This copies 7 bytes into bytes -7 to -1 relative to the 1785-byte buffer, overwriting 3 bytes of the PGN, the source address, the destination address, the number of packets, and the frame number of the current session.
If the pointer had been defined not as a signed int (16 bit for Arduino) but as an unsigned int, the attack would have copied 7 bytes into bytes 1785-1791, just past the end of the 1785-byte buffer. This overwrites the session length variable, which is used later for copying data, and results in a much more dangerous exploit.
How it works:
When BAM message data is received, the data is copied according to the message sequence number without validation (see Figure 1 above). The sequence number in the protocol should start from 1. A sequence number of 0 gives nPointer = -7, so the attacker writes to the 7 bytes before the buffer.
To mitigate this, validate the sequence number (0 < SN <= number of frames) and validate the number of frames.
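The arithmetic behind both cases can be checked with a short Python sketch. The original code is not shown, so the formulas here are assumptions: the signed case follows the nPointer = -7 value quoted above, and the unsigned case uses an 8-bit wrap of SN - 1, which is one calculation that reproduces the 1785 offset.

```python
BYTES_PER_FRAME = 7
BUFFER_SIZE = 1785

def offset_signed(sn: int) -> int:
    """Write offset when (SN - 1) * 7 is computed in signed arithmetic."""
    return (sn - 1) * BYTES_PER_FRAME

def offset_wrapped_byte(sn: int) -> int:
    """Write offset when SN - 1 wraps around as an unsigned 8-bit value (assumed)."""
    return ((sn - 1) & 0xFF) * BYTES_PER_FRAME

def out_of_bounds(offset: int) -> list:
    """Byte positions touched by a 7-byte copy that fall outside the buffer."""
    return [i for i in range(offset, offset + BYTES_PER_FRAME)
            if not 0 <= i < BUFFER_SIZE]
```

For SN = 0 the signed variant touches positions -7 to -1, and the wrapped variant touches positions 1785 to 1791. For SN = 1 both variants stay inside the buffer.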
Small Footprint Code Variant Buffer Overflow
Sometimes to save space, the J1939 stack does not support the maximal frame size of 1785 bytes. In one of the compilation variations, the maximal frame size was 256 bytes with a 256-byte buffer. The fact that we rely on the sequence number as part of our pointer, without validation, can cause a buffer overflow. This applies to both BAM and PTP sessions.
Assume we have a 256-byte buffer. We can send a message with a message length of up to 256 bytes but with a sequence number of 37 or higher. This causes a basic buffer overflow of up to 1529 bytes (see Figure 1 above).
To mitigate this, validate the sequence number (0 < SN <= number of frames) and validate the number of frames against the message length. Make sure the last copied frame does not create an overflow, or increase the buffer size to 7*37 = 259 bytes.
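As a worked check of these numbers, the following Python sketch computes which sequence numbers overflow a given buffer and by how much. It assumes the write offset is (SN - 1) * 7, as in the nPointer calculation described earlier; the function names are our own.

```python
BYTES_PER_FRAME = 7

def write_span(sn: int) -> tuple:
    """First and last byte position (inclusive) written for a given SN."""
    start = (sn - 1) * BYTES_PER_FRAME
    return start, start + BYTES_PER_FRAME - 1

def overflow_bytes(sn: int, buffer_size: int) -> int:
    """How far past the end of the buffer the last written byte reaches."""
    _, end = write_span(sn)
    return max(0, end + 1 - buffer_size)

def first_overflowing_sn(buffer_size: int):
    """Smallest SN in 1..255 whose copy runs past the buffer, or None."""
    return next((sn for sn in range(1, 256)
                 if overflow_bytes(sn, buffer_size) > 0), None)
```

For a 256-byte buffer, SN 36 still fits (bytes 245 to 251), SN 37 overruns by 3 bytes, and SN 255 reaches 1529 bytes past the end. With a 259-byte buffer, SN 37 fits exactly.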
Mitigating cyber-attacks on SAE J1939 commercial vehicles
Tier-1s and OEMs need to take proactive action to protect the heavy vehicle for a large number of reasons, including:
Regulations such as the UNECE WP.29
Growing awareness within professional bodies in the automotive industry
Top management who wish to protect their firm’s reputation, and prevent loss of life and damage to property, for which they will be liable
Insurance companies requiring cybersecurity adoption to minimize risk
Reducing risk involves acting in a large number of areas. Examples include those detailed in the ISO/SAE 21434 standard and AUTOSAR best practices guides, among others. Practically, this translates to an extensive set of activities, including:
Process and procedures
Cybersecurity management systems
Secure by design approach of all components and the overall system
A secured software development lifecycle
Compliance with standards such as A-SPICE and MISRA
Dedicated cybersecurity protection mechanisms such as IDS/IPS, endpoint protection, cryptographic solutions.
It is important to note that out of all of the solutions mentioned above, IDS/IPS is the only dedicated, devoted, and independent component of this defence-in-depth approach.
While implementing many security measures as part of the overall strategy is imperative, maintaining an IDS/IPS is crucial. IDS/IPS solutions are the only components of a vehicle to have all their resources allocated to its protection.
Ideally, the IDS/IPS function should reside in the central gateway. However, you can distribute it to the TCU, IVI, V2X OBU, domain controller, or any other connectivity or safety-critical ECU. Only this way will you best protect the vehicle.
Automotive cybersecurity solutions for SAE J1939 commercial vehicles
Since the J1939 CAN bus is the heart of the commercial vehicle, interconnecting all ECUs, it is prone to cyber-attack and should therefore be well protected. The methods to be employed should include a variety of procedural and technological means as we have detailed throughout the article.
We need to implement proper security measures to protect the vehicle, and the J1939 networks used by commercial vehicles, from dangerous attacks.
Arilou offers the Sentinel-TRK IDS/IPS for SAE J1939 as a customizable and comprehensive protection tool.
In addition to Sentinel-TRK, a variety of automotive cybersecurity products are available, either in a complementary capacity or for other domains.
These, among our many other products, provide holistic protection solutions that answer the challenge of securing modern and legacy in-vehicle networks.
*Cerrado previously known as Arilou