08 Dec 2022

Stack the Codes 2022

2 for 2 Hackathons!

Overview

Team Name: VulnGuard
Role: Frontend (VSCode Extension) Developer
Competition Placing: 1
Awards: Most Innovative Solution
GitHub Repository: https://github.com/ItzyBitzySpider/VulnGuard

VulnGuard Icon

Over the past month, I participated in GovTech’s inaugural Cybersecurity Hackathon - STACK the Codes 2022. We built a SAST tool named VulnGuard.

Technologies Used

  • Vanilla JavaScript with HTML
  • VSCode Extension API

Problem Statement

As a DevSecOps specialist, I would like to ensure that developers are able to develop features quickly, while being prevented from introducing common vulnerabilities into the application, such as those listed in OWASP Top 10. The solution should secure a common web framework (e.g. Django, Express.js, Spring), to prevent developers from introducing specific classes of vulnerabilities (e.g. XSS, SQLi, command injection, IDOR)

Application Overview

VulnGuard is a static analysis tool for JavaScript that automatically detects potential vulnerabilities in code. The tool is built on top of several powerful technologies, including Semgrep, Regex, and dependency checking.

With VulnGuard, JavaScript developers can quickly and easily scan their code for potential vulnerabilities. The tool uses Semgrep to identify patterns in the code that may indicate potential issues such as insecure coding practices. It also uses regex to search for specific strings or patterns in the code, making it easy to catch potential vulnerabilities that may not be caught by Semgrep alone.

In addition to these powerful technologies, VulnGuard also includes a dependency checking feature. This allows the tool to identify and alert developers to any insecure dependencies in their project. This is particularly useful, as many common security vulnerabilities in JavaScript applications can be traced back to insecure dependencies.

Scope

This blog post will not cover all the technical details of VulnGuard; you can read them in the GitHub Repository. Instead, I will discuss a couple of issues I personally ran into during development.

VSCode API

Examples

This is a really short section. Visual Studio Code’s Extension documentation is decent and solves about 50% of VSCode API woes, but it does not always provide enough information. Sometimes what I need are examples, and I don’t think the examples are signposted well, so here’s the link to VSCode Examples.

Code Actions

Code actions are far less documented than they deserve for such a big feature. You commonly see them when right-clicking an entry in the Problems tab or as the yellow light bulb in the editor. They are not to be confused with the Diagnostics API, which adds the entries to the Problems tab. The Code Actions API is meant for offering remediation or other actions on code; in our case, replacing the vulnerable code with VulnGuard’s suggestion. As mentioned earlier, the examples repository is a great place to look where the documentation is insufficient, and its Code Actions example shows how to implement them.

In VulnGuard, there were 2 kinds of code actions: one offering patches to vulnerable code and another for vulnerable packages found during dependency checking. The vulnerable packages code action was simply a link to the docs, so there’s not much to talk about beyond adapting the example. For fixing vulnerabilities, I repurposed the diagnostic.tags property from the Diagnostics API to carry our own input (the suggested replacement text). We also offered the option to disable a specific ruleset or disable VulnGuard entirely for a given line. This was done by adding a comment on the previous line, similar to ESLint’s // eslint-disable-next-line.

const vscode = require("vscode");

class FixVulnCodeActionProvider {
  constructor() {
    this.providedCodeActionKinds = [vscode.CodeActionKind.QuickFix];
  }

  provideCodeActions(document, range, context, token) {
    const ignoreLineAction = new vscode.CodeAction(
      `Disable VulnGuard for this line`,
      vscode.CodeActionKind.QuickFix
    );
    ignoreLineAction.edit = new vscode.WorkspaceEdit();
    if (range.start.line === 0) {
      const firstLine = new vscode.Range(
        range.start.line,
        0,
        range.start.line,
        range.start.character
      );
      ignoreLineAction.edit.replace(
        document.uri,
        firstLine,
        "// vulnguard-disable-*all* \n" + document.getText(firstLine)
      );
    } else {
      const commentLine = document.lineAt(range.start.line - 1);
      const commentLineRange = commentLine.range;
      const commentLineText = commentLine.text;
      if (
        commentLineText.trimStart().startsWith("//") &&
        commentLineText.includes("vulnguard-disable")
      ) {
        ignoreLineAction.edit.replace(
          document.uri,
          commentLineRange,
          "// vulnguard-disable-*all*"
        );
      } else {
        ignoreLineAction.edit.replace(
          document.uri,
          commentLineRange,
          commentLineText + "\n // vulnguard-disable-*all*"
        );
      }
    }

    const outputActions = [ignoreLineAction];
    context.diagnostics.forEach((diagnostic) =>
      outputActions.push(...this.createCodeAction(document, diagnostic))
    );

    return outputActions;
  }

  createCodeAction(document, diagnostic) {
    const output = [];

    if (diagnostic.tags) {
      const fixAction = new vscode.CodeAction(
        "Fix using VulnGuard suggestion",
        vscode.CodeActionKind.QuickFix
      );
      fixAction.edit = new vscode.WorkspaceEdit();
      fixAction.edit.replace(
        document.uri,
        diagnostic.range,
        diagnostic.tags[0]
      );
      output.push(fixAction);
    }

    const ignoreLineRuleAction = new vscode.CodeAction(
      `Disable ${diagnostic.code.value} for this line`,
      vscode.CodeActionKind.QuickFix
    );
    ignoreLineRuleAction.edit = new vscode.WorkspaceEdit();
    if (diagnostic.range.start.line === 0) {
      const firstLine = new vscode.Range(
        diagnostic.range.start.line,
        0,
        diagnostic.range.start.line,
        diagnostic.range.start.character
      );
      ignoreLineRuleAction.edit.replace(
        document.uri,
        firstLine,
        `// vulnguard-disable-${diagnostic.code.value} \n` +
          document.getText(firstLine)
      );
    } else {
      const commentLine = document.lineAt(diagnostic.range.start.line - 1);
      const commentLineRange = commentLine.range;
      const commentLineText = commentLine.text;
      if (
        commentLineText.trimStart().startsWith("//") &&
        commentLineText.includes("vulnguard-disable")
      ) {
        ignoreLineRuleAction.edit.replace(
          document.uri,
          commentLineRange,
          commentLineText + ` vulnguard-disable-${diagnostic.code.value}`
        );
      } else {
        ignoreLineRuleAction.edit.replace(
          document.uri,
          commentLineRange,
          commentLineText + `\n// vulnguard-disable-${diagnostic.code.value}`
        );
      }
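    }
    // Offer the per-rule disable action; without this push it is never shown.
    output.push(ignoreLineRuleAction);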

    if (diagnostic.code.target) {
      const readDocsAction = new vscode.CodeAction(
        `Learn more...`,
        vscode.CodeActionKind.QuickFix
      );
      readDocsAction.command = {
        title: "docs",
        command: "itzybitzyspider.vulnguard.docs",
        arguments: [diagnostic.code.target],
      };
      output.push(readDocsAction);
    }

    return output;
  }
}
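
The "append to an existing disable comment, otherwise add a new one" decision appears in both methods above. As a minimal sketch, it could live in one pure function that is easy to test outside VSCode. The name withDisableMarker is hypothetical and not part of VulnGuard; it mirrors the per-rule branch and, as an added assumption, skips a marker that is already present.

```javascript
// Hypothetical helper (not from VulnGuard): returns the text that should
// replace the line above a finding so that it carries a disable marker.
function withDisableMarker(previousLineText, ruleId) {
  const marker = `vulnguard-disable-${ruleId}`;
  const isDisableComment =
    previousLineText.trimStart().startsWith("//") &&
    previousLineText.includes("vulnguard-disable");

  if (isDisableComment) {
    // Reuse the existing comment; skip the marker if it is already there.
    // (A plain substring check, so rule ids that prefix each other collide.)
    return previousLineText.includes(marker)
      ? previousLineText
      : `${previousLineText} ${marker}`;
  }
  // Otherwise keep the line as-is and add a fresh comment underneath it.
  return `${previousLineText}\n// ${marker}`;
}
```

The returned string is what would be passed as the third argument of edit.replace for commentLineRange.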

Webview

Visual Studio Code, being an Electron application, makes it really easy to add your own UI in a new tab using Webviews. In VulnGuard, the Webview API was used to create a dashboard for configuring which rulesets to use.

VulnGuard Dashboard

Adding of CSS and other static files

This is pretty self-explanatory with the example code, but for posterity’s sake I will dump some code here. First, we declare all the files we need, such as PNGs and CSS files:

if (panel) panel.reveal(columnToShowIn);
else {
  panel = vscode.window.createWebviewPanel(
    "vulnGuard",
    "VulnGuard Dashboard",
    vscode.ViewColumn.One,
    {
      localResourceRoots: [
        vscode.Uri.file(path.join(context.extensionPath, "media")),
      ],
      enableScripts: true,
      retainContextWhenHidden: true,
    }
  );
  vulnguardLogo = panel.webview.asWebviewUri(
    vscode.Uri.file(path.join(context.extensionPath, "media", "vulnguard.png"))
  );
  styles = panel.webview.asWebviewUri(
    vscode.Uri.file(path.join(context.extensionPath, "media", "styles.css"))
  );
  panel.iconPath = vscode.Uri.file(
    path.join(context.extensionPath, "media", "vulnguard.png")
  );
  panel.onDidDispose(
    () => {
      panel = undefined;
    },
    null,
    context.subscriptions
  );
}

Then we include them in our web application using HTML as a string in JS. I chose a JS string instead of reading the HTML from a file as it was faster to just inject the paths in. Alternatively, we could’ve used a placeholder token such as <REPLACE_PATH_HERE> and replaced it after reading the HTML from a file.
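
For the token-replace alternative, a minimal sketch could look like the following. The helper name fillTemplate, the token names, and the URI strings are all made up for illustration; in a real extension the template would be read from a file and the values would come from panel.webview.asWebviewUri.

```javascript
// Hypothetical helper: swaps <TOKEN> placeholders in an HTML template for
// values. Unknown tokens (and ordinary upper-case tags) are left untouched.
function fillTemplate(template, values) {
  return template.replace(/<([A-Z_]+)>/g, (match, token) =>
    Object.prototype.hasOwnProperty.call(values, token)
      ? String(values[token])
      : match
  );
}

// Stand-in for the contents of an HTML file on disk.
const template =
  '<link rel="stylesheet" href="<STYLES_PATH>" /><img src="<LOGO_PATH>" />';

// Placeholder strings; real values would be webview URIs.
const html = fillTemplate(template, {
  STYLES_PATH: "placeholder://media/styles.css",
  LOGO_PATH: "placeholder://media/vulnguard.png",
});
```

This keeps the markup in a real .html file, at the cost of one extra file read when the panel is created.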

panel.webview.html = `<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <link rel="stylesheet" type="text/css" href="${styles}" />
    <title>VulnGuard Dashboard</title>
  </head>

  ...
  
  <div style="margin-right: 50px; width: 150px">
    <img src="${vulnguardLogo}" />
  </div>
  `;

Button Handling

Button presses in the HTML are sent to the extension as a simple self-defined JS object:

{
  command: "string",
  id: "string",
  value: "string",
}

And part of our button handler:

panel.webview.onDidReceiveMessage(
  (message) => {
    switch (message.command) {
      case "checkbox":
        setFeature(context, message.id, message.value);
        vscode.window.showInformationMessage(
          "VulnGuard: Feature selection saved"
        );
        return;

        ...
    }
  });
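
As the number of commands grows, the switch statement can get long. One possible alternative, sketched here under the assumption that every message carries a command string, is a small lookup table of handlers. createMessageRouter and the handler bodies are hypothetical, not VulnGuard code.

```javascript
// Hypothetical router: maps message.command to a handler function and
// reports whether the message was handled.
function createMessageRouter(handlers) {
  const table = new Map(Object.entries(handlers));
  return function route(message) {
    const handler = message ? table.get(message.command) : undefined;
    if (typeof handler !== "function") return false; // unknown command
    handler(message);
    return true;
  };
}

// Example wiring with stand-in handlers that just record what they saw.
const seen = [];
const route = createMessageRouter({
  checkbox: (message) => seen.push(`checkbox:${message.id}=${message.value}`),
  button: (message) => seen.push(`button:${message.id}`),
});
route({ command: "checkbox", id: "semgrep", value: true });
route({ command: "unknown" });
```

The resulting route function has the same shape as the callback passed to panel.webview.onDidReceiveMessage, so it could be handed over directly.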

As per the VSCode API docs, the HTML side calls the acquireVsCodeApi function and uses the returned object’s postMessage to forward checkbox and button events to the extension:

<script>
  (function () {
    const vscode = acquireVsCodeApi();
    window.addEventListener("input", (evt) => {
      const src = evt.target;
      if (src.type === "checkbox") {
        vscode.postMessage({
          command: "checkbox",
          id: src.id.split("__checkbox")[0],
          value: src.checked,
        });
      }
    });
    window.addEventListener("click", function (evt) {
      const src = evt.target;
      if (src.nodeName !== "BUTTON") return;
      const id = src.id.split("__button__");
      vscode.postMessage({
        command: "button",
        id: id[0],
        rule: id[1],
        value: id[2],
      });
    });
  })();
</script>

JavaScript Promises

To decrease the time taken to scan all the files, we leaned on concurrency, using JavaScript’s Promise.all to run the scans. The sheer number of “threads” became an issue as we would bump into errors (though computer resources such as CPU and memory were not affected). Bear in mind that JS doesn’t actually run these on real threads, as it is single-threaded; the promises simply let many file reads be in flight at once. File reading was a large bottleneck, so this “multithreading” made our scans much faster, cutting the scan time for our sample project from 10-20 mins down to 1 min.

Problem

Let’s look at the number of threads that would run. When we first run VulnGuard, we scan the entire workspace for .js files. We run each of the scans on a different thread.

async function scanWorkspace(context, include, enabledFeatures) {
  const uris = await vscode.workspace.findFiles(
    include,
    `{${getIgnoredRegex(context).join(",")}}`
  );
  await Promise.all(uris.map((uri) => scan(uri.fsPath, enabledFeatures)));
}

Each of the codebase scans (Semgrep and Regex) also gets a thread. From there, the number of threads differs for each scan. For Semgrep, each rule (we chose 400) gets its own thread, so that’s \(400\) threads multiplied by the number of files. For regex, we used 4 rulesets with a combined total of 31 rules. Each rule was given a thread, so we multiply by \(31\) as well. For each line in a file

This brings us to a total of

Breaking Down the Problem

This problem is twofold. Firstly, reading many files at once increases the chance of running out of file descriptors. Secondly, a Node.js process can only have a limited number of files open simultaneously (10240 in our case), for good reason. Exceeding that limit results in the error EMFILE, too many open files.

Solution

2 packages were used, one for each part of the problem. graceful-fs resolved some EMFILE errors by retrying when there are too many open file descriptors.
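
To illustrate the retry idea only, here is a rough sketch of retrying an operation when it fails with EMFILE. This is not how graceful-fs is implemented; retryOnEmfile and its options are hypothetical names, and the back-off values are arbitrary.

```javascript
// Hypothetical sketch: retry an async operation when the process has run
// out of file descriptors, waiting a little longer after each failure.
async function retryOnEmfile(operation, { retries = 5, delayMs = 10 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      const retriable = err && (err.code === "EMFILE" || err.code === "ENFILE");
      // Give up on unrelated errors or once the retry budget is spent.
      if (!retriable || attempt >= retries) throw err;
      await new Promise((resolve) =>
        setTimeout(resolve, delayMs * (attempt + 1))
      );
    }
  }
}
```

It would wrap a read like retryOnEmfile(() => fs.promises.readFile(filePath, "utf8")). Retrying alone only papers over the problem, which is why the concurrency limit described next was still needed.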

The second problem of limiting concurrency had to be solved using some sort of promise pool. Initially, the es6-promise-pool library was used. However, the unintuitive nature of the package compared to JavaScript’s Promise API made it quite hard to solve all concurrency issues. Instead, I used p-limit.

The number of simultaneous file reads was limited to 900 - the number was chosen empirically.

const checkPromises = [];
const fileReadLimit = pLimit(900);
let promisesFulfilled = 0;
function updatePromiseFulfilled() {
  promisesFulfilled++;
  const left = checkPromises.length - promisesFulfilled;
  if (left % 500 === 0) console.log("Processes left to scan: " + left);
  else if (left < 500 && left % 100 === 0)
    console.log("Processes left to scan: " + left);
  else if (left < 100 && left % 10 === 0)
    console.log("Processes left to scan: " + left);
  else if (left < 10) console.log("Processes left to scan: " + left);
}
...
return fileReadLimit(async () => {
  const start = performance.now();
  const moduleName = uri.fsPath.match(
    new RegExp(`node_modules\\${path.sep}(.+?)\\${path.sep}`)
  )[1];
  /*const moduleHash = moduleName + "_" + yarnLock[moduleName].version;
  if (cached[moduleHash]) { //Skip since cached
    cacheHits[moduleHash] = true;
    return;
  }*/
  if (!hits[moduleName]) hits[moduleName] = [];
  const res = await regexRuleSetsScan(
    Global.dependencyRegexRuleSets["check"],
    uri.fsPath
  );
  if (res.length) hits[moduleName].push(...res);
  const duration = performance.now() - start;
  updatePromiseFulfilled();
  if (duration > 30000)
    console.warn(`<A> scan for ${uri.fsPath} took ${duration}ms`);
});
})
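
To make it clearer what a promise pool does, here is a hand-rolled limiter in the same spirit. It is an illustration only: VulnGuard used p-limit, whose implementation differs, and createLimiter is a made-up name.

```javascript
// Illustrative limiter: at most maxConcurrent tasks run at once, the rest
// wait in a queue. Each call returns a promise for that task's result.
function createLimiter(maxConcurrent) {
  let active = 0;
  const waiting = [];

  const next = () => {
    if (active >= maxConcurrent || waiting.length === 0) return;
    active++;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task) // also catches tasks that throw synchronously
      .then(resolve, reject)
      .finally(() => {
        active--;
        next(); // a slot freed up, start the next queued task
      });
  };

  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      next();
    });
}
```

The returned function is used the same way as fileReadLimit above: wrap each unit of work in it and pass the resulting promises to Promise.all. All promises are still created up front, but only a bounded number of them actually touch the file system at any moment.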

Potential Further Improvement

I’m quite sure that we read the same file multiple times, so it would probably be better to cache its contents or keep the file’s state until all the checks are done. But since I only looked at this backend code after the problem occurred and the deadline was a day away, I just did a quick workaround.
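
A rough sketch of what such a cache might look like is below. This was not implemented in VulnGuard; createCachedReader is a hypothetical name, and the reader is passed in so the idea works with any async read function.

```javascript
// Hypothetical sketch: wraps an async reader so each path is read at most
// once, even when several checks ask for the same file concurrently.
function createCachedReader(readFile) {
  const cache = new Map(); // file path -> promise of the file contents
  return function readCached(filePath) {
    if (!cache.has(filePath)) {
      const pending = Promise.resolve().then(() => readFile(filePath));
      // Forget failed reads so a later call can try again.
      pending.catch(() => cache.delete(filePath));
      cache.set(filePath, pending);
    }
    return cache.get(filePath);
  };
}
```

With Node’s fs module this could be created as createCachedReader((p) => fs.promises.readFile(p, "utf8")). Caching the promise rather than the contents means concurrent callers share one in-flight read. The trade-off is memory, so the cache would need clearing once a scan finishes.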

Conclusion

Having developed this application in a short span of 7 days, our team not only had to learn the APIs really quickly, but also implement them. I learnt a lot of interesting things about the VSCode API as well as JavaScript concurrency and Promises. Perhaps joining more hackathons can force me to learn more things faster…?