Projects that use commercial web software should take proactive measures to mitigate IT security risks.
The use of commercial web software, especially open-source packages, can introduce additional IT security risks to software development projects. These widely available packages are subject to analysis and compromise by communities dedicated to identifying and exploiting vulnerabilities. To mitigate this risk, projects that use commercial web software should proactively monitor security bulletins, conduct penetration testing, and maintain efficient, effective team structures and processes.
In March 2017, a Chinese vulnerability assessment company posted an exploit for the popular Apache Struts2 web framework. Within hours, a module had been added to Metasploit, a common penetration testing framework, to scan for the vulnerability. As a result, the attack vector was rapidly distributed before the software vendor could release a patch to fix the issue. The zero-day exploit was quickly picked up by several malicious groups and used to attack websites built on the Struts2 framework. Among those attacked were several public-facing NASA sites, including NASA TechPort.
Because the NASA TechPort team had incorporated log-monitoring and notification mechanisms, they were notified of suspicious behavior within a matter of minutes. The TechPort team immediately invoked their Incident Response Plan and gathered the key members of the Response Team, which included the system administrator, security lead, and technical lead. After assessing the situation and determining that no patch was available from the vendor, the TechPort team devised a custom solution that was rapidly tested and deployed. The success of this fix was primarily due to the TechPort team’s DevOps working model and the rapid deployment capabilities of a team-operated cloud computing environment.
Lesson Number: 21301
Lesson Date: March 9, 2017
Submitting Organization: NASA Headquarters
HIGHLIGHTS
LESSON LEARNED
- Patch releases may not be available when exploits are identified. Teams should be prepared to handle these types of situations. While incident drills are a useful tool, the most important asset is the team’s ability to respond efficiently and effectively to the issue at hand.
- Active monitoring of common security bulletin networks, and even software-specific bulletins, may not be sufficient to identify zero-day vulnerabilities. Additional, specific information avenues should be explored for crucial software components.
RECOMMENDATIONS
- Ensure all software packages are kept up to date with the latest official release. This will likely cause the project to incur additional operations cost (including development and regression testing) and should be included up-front in the project plan and estimate.
- Incorporate custom log-monitoring and notification solutions tailored to your system and your team’s needs. Do not rely entirely on commercial intrusion detection/prevention packages. Take proactive action to protect your application with multiple security layers.
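To illustrate the second recommendation, the sketch below shows one way a team might wire up a lightweight log monitor with email alerting. It is a minimal, hypothetical example, not a description of the TechPort team’s actual tooling: the log path, signature patterns, SMTP relay, and addresses are placeholder assumptions, and a production deployment would use a broader, tested signature set and a hardened notification path.

```python
#!/usr/bin/env python3
"""Minimal log-monitoring sketch (illustrative only; not the TechPort implementation).

Tails a web server access log, flags request lines that match simple suspicious
patterns (here, OGNL-style payloads of the kind used against Struts2), and emails
an alert. Paths, patterns, and addresses are placeholders.
"""
import re
import time
import smtplib
from email.message import EmailMessage

LOG_PATH = "/var/log/httpd/access_log"   # assumption: plain-text access log
ALERT_TO = "oncall@example.gov"          # placeholder notification address
ALERT_FROM = "log-monitor@example.gov"
SMTP_HOST = "localhost"                  # assumption: local SMTP relay

# Simple example signatures; a real deployment would maintain a broader, tested set.
SUSPICIOUS = [
    re.compile(r"%\{.*ognl", re.IGNORECASE),           # OGNL expression markers
    re.compile(r"java\.lang\.Runtime", re.IGNORECASE),  # command-execution probes
]

def send_alert(line: str) -> None:
    """Email the offending log line to the on-call address."""
    msg = EmailMessage()
    msg["Subject"] = "Suspicious request detected in access log"
    msg["From"] = ALERT_FROM
    msg["To"] = ALERT_TO
    msg.set_content(line)
    with smtplib.SMTP(SMTP_HOST) as smtp:
        smtp.send_message(msg)

def follow(path: str):
    """Yield new lines appended to the log file (a simple 'tail -f')."""
    with open(path, "r", errors="replace") as f:
        f.seek(0, 2)  # start at the end of the file
        while True:
            line = f.readline()
            if not line:
                time.sleep(1.0)
                continue
            yield line

if __name__ == "__main__":
    for entry in follow(LOG_PATH):
        if any(pattern.search(entry) for pattern in SUSPICIOUS):
            send_alert(entry)
```

A custom monitor like this complements, rather than replaces, commercial intrusion detection/prevention packages: because the team writes the signatures, it can be updated within minutes when a new exploit pattern appears, even before vendor rules or patches are available.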
Consult the lesson learned for complete lists.
Joshua Krage, Technical Program Manager for the NASA Office of the Chief Engineer’s Mission Resilience and Protection Program, on the importance of this lesson learned:
I think the lesson is important because we still have these issues today and we likely will have very similar issues for a long time to come. Basically, it’s not going to go away. In this case, it’s two parts. One is the need to understand or recognize that the world is changing around us, and that includes the potential for someone intentionally — or perhaps unintentionally — causing harm to one of our systems. That activity has dramatically increased and is more accessible. People are doing more research into security issues, and there are various incentives, including profit for a lot of it. So, what that means is even if we built the best system ever, someone out there is trying to figure out how to make it work in unexpected ways. So that in turn leads to the other piece, which is that we must be thinking about our systems not as static. We finish them, we love them, we leave them, and we go work on something else. But we have to maintain our systems as long as they are in use, including adapting to the changes around us. I think those are some fundamental lessons that can be extracted from this past event, and we’re going to see the same kinds of things happen and the same underlying issues over time in other systems.
One aspect that really jumped out at me was that the technical team supporting the application really understood what they were supporting and how it worked to the point that given vague information, they could go do their own research, infer some knowledge, and basically develop a solution. And they did that in a relatively short amount of time! So, all that speaks to an excellent basis of knowledge and how to apply it in that particular context.
If you think about the kinds of challenges that NASA deals with, this is what we want our groups to be able to do, right? We need their expertise and their contextual application of that expertise, which allows them to adapt to unexpected circumstances like the team was able to do in this case. They put in place some detection capability to make sure they were responding appropriately, and they were able to test some of their solutions and assumptions along the way. That skill set and that kind of thinking are extremely applicable across all our efforts in the engineering and technical community.
Applying this to a spacecraft — and in my case with a lot of the thinking I do about the Mission Resilience and Protection Program — how do we detect the unexpected and then how do we respond to it by adapting our system in some way to those events? Basically, if we identify a failure is beginning, can we detect it fast enough that we can understand, characterize, and respond to it without making things worse? At a pure technical level, we must come up with scenarios that might work. When we’re remotely controlling a spacecraft and only getting telemetry downlinks once a day, it becomes a bit more challenging and so that encourages us to plan ahead, which was one of the aspects of this lesson. Don’t just be prepared to respond but extrapolate a bit from past experience as to what might happen and practice that contingency scenario and response.
One of the recurring themes I’ve seen lately in a lot of areas is that things outside of our control are changing. We have to recognize that things are going to happen, that they’re changing, and that we can’t stop them. So, we have to figure out what we ourselves can do within the systems that we can control. From a security and protection perspective, I think that’s an extremely valuable perspective to build and to bring into any technical team.
Spotlight on Lessons Learned is a monthly series of articles featuring a valuable lesson along with perspective from a NASA technical expert on why the lesson is important. The full lessons are publicly available in NASA’s Lessons Learned Information System (LLIS).
If you have a favorite NASA lesson learned that belongs in the spotlight, please contact us and be sure to include the LLIS Lesson Number.