Domain 5. Secure Software Testing

Quality Assurance

  • Quality: fitness for use according to certain requirements.
  • ISO 9126: provides guidance for establishing quality in software products.
    • Provides a quality model built around functionality, reliability and usability.
      • Additionally around efficiency, maintainability and portability.
    • Addresses the human side of quality.
  • Systems Security Engineering Capability Maturity Model (SSE-CMM), a.k.a ISO/IEC 21827:
    • An international standard for the secure engineering of systems.
      • Designed to be a tool to evaluate security engineering practices.
      • Organised into 11 processes and corresponding maturity levels (a standard CMM)
      • De facto standard for evaluating security engineering capability
    • Covers:
      • Concept definition
      • Requirement analysis
      • Design and development
      • Integration
      • Installation
      • Operations
      • Maintenance
      • Decommissioning
  • Open Source Security Testing Methodology Manual (OSSTMM)
    • A peer reviewed system describing security testing.
      • Provides a method for assessing operational security built upon analytical metrics
      • Provides a system that can reliably and accurately characterise the security of an operational system
      • Can be used to assist in auditing
    • Five sections:
      • Data networks
      • Telecommunications
      • Wireless
      • Human
      • Physical security

Software Quality Assurance Testing

  • Functional:
    • Unit
    • Logic
    • Integration
    • Regression
  • Non-functional:
    • Performance
    • Scalability
    • Environment
    • Simulation
  • Other:
    • Privacy
    • User acceptance

Test aspects

  • Reliability: the software is functioning as expected.
    • Reliability is not just a measure of availability, but functionally complete availability.
  • Recoverability: the ability for the software to restore itself to an operational state.
  • Resiliency: the software's ability to withstand attacks.
  • Interoperability: the ability of the software to function in disparate environments.
  • Privacy: verify that PII, PHI and PFI are kept confidential.

Testing Artifacts

  • Test Strategy:
    • Outlines the testing approach
    • Informs and communicates testing issues
    • Test goal, methods, timeline, environment, resources
    • Type of test and high level success/fail criteria
    • Developed from conceptual design
    • Security:
      • Data classification
      • Threat model
      • Access control
  • Test Plan:
    • A granular document that details the test approach systematically
      • The tester’s workflow
      • To verify that the software is reliable
    • Three primary components:
      • Requirements
      • Methods
      • Coverage
  • Test Case:
    • Take requirements and define measurables to validate the requirements are met.
    • Test case: ID, requirement reference, pre-conditions, actions, input, expected result
  • Test Script
    • Procedures undertaken to perform the test.
      • Developed using test case
    • Ensure security requirements are included.
  • Test Suite
    • A collection of test cases
  • Test Harness
    • All necessary components for testing: tools, samples, config, cases, scripts etc
      • Can be used to simulate functions that are still in development (i.e. simulation testing)
    • Promotes the principles of leveraging existing components and psychological acceptability.
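
A test case like the one outlined above can be captured as a small data structure so suites stay uniform and traceable to requirements; a minimal sketch (all field values and requirement names are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class TestCase:
    """One test case: ties a requirement to measurable pass/fail criteria."""
    case_id: str
    requirement_ref: str                              # traceability to the requirement
    preconditions: list = field(default_factory=list)
    actions: list = field(default_factory=list)
    test_input: dict = field(default_factory=dict)
    expected_result: str = ""

# A test suite is simply a collection of test cases.
suite = [
    TestCase(
        case_id="TC-AUTH-001",
        requirement_ref="REQ-SEC-12",
        preconditions=["user account exists", "account is not locked"],
        actions=["submit login form"],
        test_input={"username": "alice", "password": "wrong-password"},
        expected_result="login rejected with a generic error message",
    )
]
```

The requirement reference is what lets coverage be computed back against the test plan.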

Functional Testing (reliability testing)

  • Purpose: attest that the software functions as expected by the business or customer.
    • Determine compliance with requirements in terms of reliability, logic, performance and scalability.
  • Reliability: measures that the software functions as expected by end user.
  • Resiliency: measures how strongly the software can perform when under attack.
  • Steps:
    • Identifying expected functions
    • Creating input data based on specifications
    • Determining expected output based on the specifications
    • Executing test cases corresponding to functional requirements
    • Comparing expected and actual results

Unit Testing

  • Conducted by developers during implementation phase
    • Breaking down functionalities and test in isolation
  • Quality of Code (QoC)
    • High cohesiveness
    • Low coupling
  • At a minimum:
    • Ensure functional logic
    • Understandable code
    • Have a reasonable level of security control
  • Challenge in agile:
    • Code may be dependent on other developers
    • Can use drivers (simulate the calling unit) and stubs (simulate the called unit) to solve the issue
  • Benefits:
    • Validate functional logic
    • Identify inefficiencies, complexities and vulnerabilities early
    • Automate testing processes
    • Extend test coverage
    • Enable collective code ownership
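
The driver/stub idea above can be sketched with Python's unittest.mock; the pricing service and discount logic are invented for illustration:

```python
from unittest import mock

def apply_discount(order_total, pricing_service):
    """Unit under test: depends on a unit owned by another developer."""
    rate = pricing_service.discount_rate()   # the "called unit"
    return round(order_total * (1 - rate), 2)

# Stub: simulates the called unit that may still be in development.
pricing_stub = mock.Mock()
pricing_stub.discount_rate.return_value = 0.10

# Driver: the test code itself plays the role of the calling unit.
assert apply_discount(100.0, pricing_stub) == 90.0
pricing_stub.discount_rate.assert_called_once()
```

The stub lets the unit be tested in isolation even when the real dependency is unfinished.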

Logic Testing

  • Validates the accuracy of the software processing logic.
    • Logic testing also includes the testing of predicates.
      • Predicate: something that is affirmed or denied of the subject in a proposition in logic.
    • Software with high cyclomatic complexity must undergo logic test.
  • Source of Boolean predicates:
    • Functional requirements specifications like UML diagrams, RTM
    • Assurance (security) requirements
    • Looping constructs (for, foreach, do while, while)
    • Preconditions (if-then)
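
Logic testing means exercising each outcome of those Boolean predicates, not just one happy path; a minimal sketch (the loan-approval rule is invented):

```python
def approve_loan(income, debt, has_collateral):
    """Toy decision logic containing compound Boolean predicates."""
    if income > 50000 and (debt < 10000 or has_collateral):
        return True
    return False

# Exercise each predicate outcome (branch coverage), not only the happy path.
assert approve_loan(60000, 5000, False) is True    # income ok, low debt
assert approve_loan(60000, 20000, True) is True    # income ok, collateral present
assert approve_loan(60000, 20000, False) is False  # both sub-predicates false
assert approve_loan(40000, 5000, True) is False    # income predicate false
```

The higher the cyclomatic complexity, the more such predicate combinations must be covered.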

Integration Testing (System testing)

  • Individual units of code are aggregated and tested together.
    • Identify problems that occur when units of code are combined.
  • Ensure the integration occurs as designed and data transfers between components are secure and proper.

Regression Testing (verification testing)

  • To validate the software did not break previous functionality or security and regress to a nonfunctional or insecure state.
    • i.e. testing various versions of the software
    • Primarily focused on implementation issues over design flaws
  • Need to ensure:
    • Root cause of the bug is fixed
    • Fixing the bug doesn’t introduce new bugs
    • Fixing the bug doesn’t make old bugs reappear
    • Modifications still meet requirements
    • Unmodified code is not impacted
    • Adequate time is allocated for regression tests
  • Define a library of regression tests for better management:
    • At a minimum: boundary conditions and timing tests should be included.
    • Can use Relative Attack Surface Quotient (RASQ) as a security metric.
  • Challenges:
    • Determine which tests should be included.
    • Interactive differences between components.
    • Difficult for vendors who maintain multiple versions.
  • Common regression issues:
    • The fix may cause a fault in some other part of the software
    • The fix may undo some other mitigation at the point of the fix
    • The fix may repair a special case but miss the general case, e.g. blacklisting
  • When applying a patch, need to determine the correct level of regression testing required.
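
The special-case-vs-general-case pitfall above lends itself to a small regression library; a minimal sketch (the sanitizer and the bug history are invented):

```python
import re

def sanitize(value):
    """Fixed routine: accept only alphanumeric input (the general case),
    rather than blacklisting the one string from the original bug report."""
    return bool(re.fullmatch(r"[A-Za-z0-9]+", value))

# Regression library: the original bug plus boundary conditions.
regression_cases = [
    ("<script>", False),   # the originally reported special case
    ("<SCRIPT>", False),   # variant a blacklist fix would have missed
    ("a" * 256, True),     # boundary condition: long but valid input
    ("", False),           # boundary condition: empty input
    ("abc123", True),      # known-good input must keep working
]

for value, expected in regression_cases:
    assert sanitize(value) == expected
```

Rerunning this library after every fix checks both that the root cause is gone and that old behavior did not regress.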

Non-Functional Testing

  • Test for recoverability and environmental aspects.
  • Recoverability: measures the ability to restore operations to an expected level after a security breach.

Performance Testing

  • Goal: to determine bottlenecks, not to find vulnerabilities.
    • Ensure software performance meets SLA and business expectations.
    • Establish a baseline for future regression.
      • Bottlenecks can be reduced by tuning the software.
  • Load Testing (longevity or endurance or volume testing)
    • Identifying max capacity
    • An iterative process
  • Stress Testing
    • Determine the breaking point (i.e. recoverability)
    • To assure recovery
    • To assure fail secure
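
The load-test iteration above can be sketched as a measurement loop; the request handler is a stand-in and the load steps are arbitrary:

```python
import time

def handle_request(payload):
    """Stand-in for the operation under test (illustrative only)."""
    return sum(payload)

def measure_throughput(requests):
    """Run a batch of requests and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    for payload in requests:
        handle_request(payload)
    return time.perf_counter() - start

# Load test: step the volume up iteratively; the first run becomes the
# baseline kept for future performance regressions.
baseline = None
for load in (100, 1000, 10000):
    elapsed = measure_throughput([[1, 2, 3]] * load)
    if baseline is None:
        baseline = elapsed
    print(f"load={load} elapsed={elapsed:.4f}s")
```

A stress test continues the same loop past the point where elapsed time degrades non-linearly, then verifies the system recovers and fails securely.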

Scalability Testing

  • Goal: to identify the loads and to mitigate any bottlenecks that hinder scalability.
  • Environment Testing
    • Verify the integrity of configuration and data of the environment.
    • Define trust boundaries
    • Test data movement across trust boundaries from end to end of the application
  • Interoperability Testing
    • Check upstream and downstream dependency interfaces.
    • Examples:
      • Security standards (such as WS-Security for web services implementation) are used
      • Complete mediation is effectively working to ensure that authentication cannot be bypassed
      • Tokens used for transfer of credentials cannot be stolen, spoofed and replayed
      • Authorization checks post authentication are working properly
  • Disaster Recovery (DR) Testing
    • Verify the recoverability of a software after a disaster
      • Also uncovers data accuracy, integrity and system availability
    • Failover testing is dependent on how close a real disaster can be simulated.

Simulation Testing

  • Testing the application in an environment that mirrors the associated production environment.
    • Validate least privilege and configuration mismatches
  • Common issue:
    • Software works in development and test, but fails in production.
    • Software runs with admin privileges.

Other Testing

Privacy Testing

  • Test should include the verification of organisational policy controls that impact privacy.
    • Network traffic monitoring
    • End-points communication monitoring
    • Appropriateness of notices and disclaimers
    • Opt-in and opt-out mechanisms
    • Privacy escalation mechanism, process and documentation

User Acceptance Testing (UAT)

  • To ensure the software meets the requirements of the end users.
    • Usually performed as black-box testing, primarily focusing on functionality and usability
    • The environment should be as close to production as possible.
    • Ultimately to decide whether the product can be released or not.
  • Prerequisites:
    • The software must have exited the development phase
    • Other quality assurance and security tests must be completed
    • Functional and security bugs need to be addressed
    • Real world usage scenarios of the software are identified and test cases to cover these scenarios are completed

Testing for Failure

  • Not all errors in code result in failure and not all vulnerabilities are exploitable.
  • Need to test for errors that may not result in immediate failure but possess a potential for future issues.

Security Testing Methods

White Box Testing (glass/clear box testing)

  • Structural analysis, full knowledge assessment.
  • Test both use/misuse cases
  • Pre-requisites:
    • Scope
    • Context
    • Intended functionality
  • Inputs:
    • Design document
    • Source code
    • Configuration
    • Use/misuse cases
    • Test data
    • Test environment
    • Security specification
  • Output:
    • Test report: defects, flaws, deviations from design, change requests and recommendations

Black Box Testing

  • Behavioral analysis, zero knowledge assessment.
  • Pre-deployment test (advised):
    • Identify and address security vulnerabilities proactively.
  • Post-deployment test:
    • Identify vulnerabilities that exist in the deployed production environment.
    • Attest the presence and effectiveness of the software security controls.
  • Common methodologies used by blackbox testing tools:
    • Fuzzing
    • Scanning
    • Penetration testing

White Box Testing vs Black Box Testing

  • White box can be performed early:
    • May not cover code dependencies or 3rd party
    • Less insight to exploitability
  • Black box can attest the exploitability:
    • No need for source code
    • Scope limited
  • Criteria to determine approach:
    • Extent of Code Coverage (whitebox)
    • Root Cause Identification (whitebox)
      • Root Cause Analysis (RCA) is easier when the source code is available
    • Logical Flaws Detection (whitebox)
      • Semantic in nature, not syntactic (implementation bugs)
    • Number of False Positives and False Negatives (blackbox)
      • False rate is higher in blackbox
    • Deployment Issues Determination (blackbox)
  • Comparison between the white box and black box security testing methodologies:
    • Also known as:
      • Whitebox: full knowledge assessment
      • Blackbox: zero knowledge assessment
    • Assesses the software's:
      • Whitebox: structure
      • Blackbox: behavior
    • Root cause identification:
      • Whitebox: can identify the exact line of code or design issue causing the vulnerability
      • Blackbox: can analyze only the symptoms of the problem, not necessarily the cause
    • Extent of code coverage possible:
      • Whitebox: greater; the source code is available for analysis
      • Blackbox: limited; not all code paths may be analyzed
    • Number of false positives and false negatives:
      • Whitebox: less; contextual information is available
      • Blackbox: high; since normal behavior is unknown, expected behavior can also be falsely identified as anomalous
    • Logical flaws detection:
      • Whitebox: high; design and architectural documents are available for review
      • Blackbox: less; limited to no design and architectural documentation is available for review
    • Deployment issues identification:
      • Whitebox: limited; assessment is performed in pre-deployment environments
      • Blackbox: greater; assessment can be performed in pre- as well as post-deployment production or production-like simulated environments

Types of Security Testing

Cryptographic Validation Testing

  • Cryptographically random numbers are essential in cryptosystems and are best produced through cryptographic libraries.
  • Common mistakes:
    • Self-implemented cryptography instead of vetted cryptographic libraries
    • (Note: key length is linked to algorithm choice and is not a common failure mode)
  • Key factors:
    • Secure initialisation
    • Good random number generator
  • Standards Conformance
    • FIPS 140-2 specifies requirements, specifications, and testing of cryptographic systems for the U.S. federal government.
      • A cryptographic standard associated with the U.S. government.
      • A selection of approved algorithms: AES, RSA, and DSA.
      • Details the environments and means of implementation where cryptographic functions are used.
  • Environment Validation:
    • ISO/IEC 15408 Common Criteria (CC), not directly mapped to FIPS 140-2 security levels
    • A FIPS 140-2 certificate is usually not acceptable in place of CC for environment validation
  • Data Validation:
    • FIPS 140-2 mandates validation of data, otherwise it is considered unprotected.
  • Cryptographic Implementation:
    • Seed should be random and non-guessable.
    • Phase space analysis can be used to determine uniqueness, randomness and strength.
    • Ensure keys are not hardcoded in code.
    • Key management lifecycle should be validated
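
The "good random number generator" point maps directly to using the operating system's CSPRNG rather than a general-purpose PRNG; a sketch with Python's secrets module:

```python
import secrets

# Wrong: random.random() (a Mersenne Twister PRNG) is predictable and
# must never be used for keys, tokens or seeds.

# Right: the secrets module draws from the operating system's CSPRNG.
session_token = secrets.token_urlsafe(32)   # URL-safe session token
api_key = secrets.token_hex(16)             # 128-bit key as hex

assert len(api_key) == 32                   # 16 bytes -> 32 hex characters
# Tokens are non-guessable and, in practice, non-repeating:
assert session_token != secrets.token_urlsafe(32)
```

Validation testing checks that code paths like these use the vetted library, not a hand-rolled or hardcoded substitute.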

Scanning

  • Scanning can be used in software development to characterise an application on a target platform.
    • Scanners available for OWASP Top 10, SANS Top 25, PCI and SOX
    • Frequency of scanning shouldn't just satisfy compliance but be determined by the criticality of the resource.
  • Fingerprinting
    • Active OS fingerprinting: sending packets to the remote host and analyzing the responses
      • Fast, can be detected
      • nmap
    • Passive OS fingerprinting: does not contact the remote host, captures traffic from the host
      • Slower, stealthy
      • Siphon, P0f
      • Can be used to identify attacks originating from a botnet
    • Banner grabbing: enumerate and determine server versions
      • e.g. port 21 FTP, 25 SMTP, 80 HTTP
      • Netcat, Telnet
      • Banner cloaking: security through obscurity approach, protect against version enumeration
  • Scanning can be used to:
    • Map the computing ecosystems, infrastructural and application interfaces.
    • Identify patch levels.
    • Identify server versions, open ports and running services.
    • Inventory and validate asset management databases.
    • Prove due diligence and due care for compliance reasons.
  • Vulnerability Scanning
    • Goal: detect security flaws and weaknesses
    • Can be used to validate readiness for an audit
    • PCI DSS requires periodic vulnerability scanning
      • Reports should provide risk ratings and recommendations
    • Signature-based scanners can miss new (unknown) vulnerabilities
    • Static scanning: scans the source code
    • Dynamic scanning: scans at run time
  • Content Scanning
    • Goal: analyze the content within the document for malicious content
      • Some scanners can analyse traffic over TLS/SSL
    • Ensure both inbound and outbound transactions are checked
  • Privacy Scanning
    • Goal: detect potential issues that violate privacy policies and end-user trust
    • Attests software that collects PII for the assurance of non-disclosure and privacy
    • The scanning technology itself shouldn't violate any privacy regulations
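
The banner-grabbing step under fingerprinting can be sketched with a raw socket; the host and port below are placeholders, and such probes should only target systems you are authorized to scan:

```python
import socket

def grab_banner(host, port, timeout=3.0):
    """Connect and read whatever the service announces on connect.
    Many services (e.g. FTP/21, SMTP/25) send a version banner first."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            return sock.recv(1024).decode(errors="replace").strip()
    except OSError:
        return None   # closed port, filtered host, or timeout

# Example (placeholder TEST-NET address; replace with an authorized host):
# print(grab_banner("192.0.2.10", 21))
```

Banner cloaking defeats exactly this technique by withholding or falsifying the announced version string.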

Attack Surface Validation

  • A test for the resiliency of software to attest the presence and effectiveness of the security controls.
    • Begins with creating a test strategy of high risk items first, followed by low risk items.
  • Information security view of what an attacker requires:
    • Motive
    • Opportunity
    • Means
  • Testing of Security Functionality versus Security Testing
    • Security testing:
      • Attacker perspective
      • Validate the ability of the software to withstand attack (resiliency)
        • Can also test resiliency and recoverability
      • Provide due diligence
    • Testing security functionality:
      • To assure the functionality of the protection mechanisms
  • The need for Security Testing:
    • Engage as early as possible
    • Uncover issues early

Attack Surface Analyzer

  • A Microsoft-authored tool designed to measure the security impact of an application on a Windows platform.
    • A sophisticated scanner.
    • Detect the changes that occur to the underlying Windows OS when an application is installed.
    • Detect and alert on issues that have been shown to cause security weaknesses.
  • Benefits:
    • View changes in the Windows attack surface resulting from the installation of the application
    • Assess the aggregate attack surface change associated with the application in the enterprise
    • Evaluate the risk to the platform where the application is proposed to exist
    • Provide incident response teams detailed information associated with a Windows platform
    • Operates independently of the application that is under test
  • Scans the Windows OS environment and provides actionable information on the security implications of an application when installed on a Windows platform.

Penetration Testing (Pen-Testing)

  • Active in nature; a mechanism in Verification & Validation and Certification & Accreditation.
    • Usually done after deployment
    • Rules of Engagement
      • Define the scope: IP, software interfaces, environment, data, infrastructure and not-in-scope
  • Main objective:
    • To attest whether assets can be compromised by exploiting the vulnerabilities
    • Measures the resiliency
    • Emulate the actions of a potential threat agent
  • Pen-test steps:
    • Reconnaissance (Enumeration and Discovery)
      • Enumeration: fingerprinting, banner grabbing, port and services scans, vulnerability scanning
      • eg: WHOIS, ARIN and DNS lookups, web based reconnaissance
    • Resiliency Attestation (Attack and Exploitation)
      • Exploit the discovered vulnerabilities
      • eg: brute forcing of authentication credentials, escalation of privileges, deletion of sensitive logs and audit records, information disclosure, alteration/destruction of data, DoS
    • Removal of Evidence (Cleanup activities) and Restoration
      • Target environment is restored as before testing
      • Otherwise other adversaries may reuse the exploits
    • Reporting and Recommendations
      • Report on the findings
      • Include technical vulnerabilities, non-compliance, weakness in org process, people
        • Should result in a Plan of Action and Milestones (POA&M) and mitigation strategies, management action plan (MAP)
  • Usage of penetration test report
    • Provide insight into the state of security
    • A reference for corrective action
    • Define security controls that will mitigate identified vulnerabilities
    • Demonstrate due diligence and due care processes for compliance
    • Enhance SDLC activities such as security risk assessments, C&A and process improvements.
  • Resources:
    • NIST SP 800-115 technical guide to information security testing and assessment.
    • Open Source Security Testing Methodology Manual (OSSTMM)
      • Guidance on the activities that need to be performed before, during and after a penetration test.

Fuzzing (fault injection testing)

  • A brute force approach where faults are injected into the software and its behaviors are observed.
    • Fuzz testing finds a wide range of errors with a single test method
    • Can be whitebox or blackbox
  • Common techniques for creating fuzz data:
    • Recursion
    • Replacement
  • Smart Fuzzing: uses knowledge of what could go wrong and creates malformed inputs with this knowledge.
  • Dumb Fuzzing: no fore-knowledge of the data format or protocol specifications, random data input.
  • Generation-Based Fuzzing
    • Create test data based on design specification (i.e foreknowledge) of the data.
    • Pros:
      • Greater code coverage
      • More thorough
    • Cons:
      • Time consuming
      • Biased
  • Mutation-Based Fuzzing
    • Mutate a good traffic to create new input streams for testing.
    • Can lead to DoS
    • Recommended to be done in a simulated environment
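
A mutation-based fuzzer in miniature: flip random bytes in known-good traffic and record any input that crashes the target. The "parser" below is a toy built for illustration:

```python
import random

def fragile_parser(data: bytes):
    """Toy target: chokes on an escape byte it never expects."""
    if b"\xff" in data:
        raise ValueError("unhandled escape byte")
    return len(data)

def mutate(seed: bytes, rng) -> bytes:
    """Flip a handful of random bytes in a known-good sample."""
    buf = bytearray(seed)
    for _ in range(rng.randint(1, 4)):
        buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(seed: bytes, iterations=5000, rng=None):
    rng = rng or random.Random(1234)   # fixed seed: reproducible runs
    crashes = []
    for _ in range(iterations):
        sample = mutate(seed, rng)
        try:
            fragile_parser(sample)
        except Exception:
            crashes.append(sample)     # record the input for triage
    return crashes

found = fuzz(b"\x00\x01GOOD-TRAFFIC")
print(f"{len(found)} crashing inputs found")
```

A generation-based fuzzer would instead build inputs from the protocol specification, trading speed for deeper code coverage.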

Software Security Testing

  • Example categories lists:
    • NSA IAM threat list
    • STRIDE threat lists

Testing for Input Validation

  • Input validation is more important on the server than on the client.
  • Attributes:
    • Range
    • Format
    • Data type
    • Values
  • Controls
    • RegEx, fuzzing
    • White lists, black lists
    • Anti-tampering protection
    • Validate canonical forms
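
The attributes above (format, data type, range) translate into whitelist checks; a minimal sketch with invented rules:

```python
import re

# Whitelist validation: define what IS allowed and reject everything else.
USERNAME_RE = re.compile(r"^[a-z][a-z0-9_]{2,15}$")    # format attribute

def validate_quantity(raw: str) -> int:
    """Validate data type and range before use; raise on anything unexpected."""
    value = int(raw)               # data type: must parse as an integer
    if not 1 <= value <= 100:      # range: business-defined bounds
        raise ValueError("quantity out of range")
    return value

assert USERNAME_RE.match("alice_01")
assert not USERNAME_RE.match("Alice'; DROP TABLE users--")
assert validate_quantity("42") == 42
```

Fuzzing these validators with malformed values is a quick way to confirm the whitelist really is closed.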

Testing for Injection Flaws Controls

  • Parameterized queries
  • Determine the source of injection
  • Disallow dynamic query construction
  • Error messages and exceptions are explicitly handled
  • Non-essential procedures and statements are removed from DB
  • Database generated errors should not disclose internal database structure
  • Use parsers that prohibit external entities
    • Prevents developers from defining their own XML entities
  • White-listing that allows only alphanumeric characters is used when querying LDAP stores
  • Developers use escape routines for shell commands instead of writing their own
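
Parameterized queries versus dynamic query construction can be demonstrated with an in-memory SQLite database (schema and data invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "user")])

attacker_input = "x' OR '1'='1"

# Vulnerable: dynamic query construction lets the input rewrite the SQL.
rows = conn.execute(
    f"SELECT * FROM users WHERE name = '{attacker_input}'"
).fetchall()                      # injection returns every row

# Safe: a parameterized query binds the input as data, never as SQL.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (attacker_input,)
).fetchall()                      # no user has that literal name

assert len(rows) == 2
assert safe_rows == []
```

Testing for injection flaws means feeding inputs like the one above to every query path and confirming only the parameterized behavior occurs.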

Testing for Scripting Attacks Controls

  • Test for lack of output sanitization
  • Sanitise output (e.g. escaping or encoding) before sending to the client
  • Validate input using a current/contextually relevant whitelist
  • Scripts cannot be injected into input sources or the response
  • Allow only valid files with valid extensions to be uploaded
  • Use secure libraries and safe browsing settings
  • Software can still function if active scripting is disabled
  • State management items such as cookies are not accessible to scripts (e.g. via the HttpOnly flag)
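
Output sanitization before sending to the client can be as simple as HTML-encoding; a sketch using Python's html module:

```python
import html

user_comment = '<script>alert("xss")</script>'

# Encode HTML metacharacters so the browser renders the input as text,
# never as executable markup.
encoded = html.escape(user_comment)

assert "<script>" not in encoded
assert encoded == "&lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;"
```

A scripting-attack test verifies this encoding happens on every output path, not just the one where the bug was first found.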

Testing for Non-repudiation Controls

  • Proper session management and auditing
    • NIST SP 800-92 provides guidance on the protection of audit trails and the management of security logs
  • Validate that audit trails can accurately determine the actor and their actions
  • Ensure that misuse cases generate auditable trails appropriately as well
  • Validate that user activity is unique, protected and traceable
  • Verify the protection and management of the audit trail and the integrity of audit logs
  • Periodically check log retention

Testing for Spoofing Controls

  • Network spoofing: ARP poisoning, IP spoofing, MAC spoofing
  • User, certificate spoofing
  • Testing spoofability of user/certificate and TLS can attest secure communication and protection against MITM
  • Test cookie expiration, authentication cookies are encrypted
  • User awareness against phishing

Testing for Error and Exception Handling Controls (Failure Testing)

  • Potential causes:
    • Requirement gaps
    • Omitted design
    • Coding errors
  • Fail Secure (Fail safe)
    • Verify C-I-A when the software fails
    • Attention to authentication
    • Account lockout
    • Denying access by default
  • Error and Exception Handling
    • Testing the messaging and encapsulation of error details
    • Verify exceptions are handled, details are encapsulated in user-defined messages, and redirects are performed
    • Test reference ID mapping of error messages
  • Testing for Buffer Overflow Controls
    • Blackbox:
      • Fuzzing
    • Whitebox:
      • Input is sanitised and its size validated
      • Bounds checking of memory allocation is performed
      • Data type conversions are explicitly performed
      • Banned and unsafe APIs are not used
      • Code is compiled with switches that protect the stack (e.g. stack canaries) and with mitigations such as ASLR enabled
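
The fail-secure and error-encapsulation controls in this section can be sketched together; the access-control map, key, and messages are all invented for illustration:

```python
import logging
import uuid

log = logging.getLogger("app")

def check_access(user, resource, acl):
    """Fail secure: deny by default; only an explicit grant allows access."""
    try:
        return acl[user][resource] == "allow"
    except Exception:
        return False                      # any failure path denies access

def safe_error_response(exc: Exception) -> dict:
    """Encapsulate details: log internals, return only a reference ID."""
    ref = str(uuid.uuid4())
    log.error("ref=%s detail=%r", ref, exc)   # full detail stays server-side
    return {"error": "An error occurred.", "ref": ref}

acl = {"alice": {"report": "allow"}}
assert check_access("alice", "report", acl) is True
assert check_access("mallory", "report", acl) is False   # no entry -> deny
resp = safe_error_response(KeyError("users table missing"))
assert "users" not in resp["error"]
```

The reference ID lets support staff map a user-facing message back to the full server-side log entry without disclosing internals.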

Testing for Privileges Escalations Controls

  • Vertical: subject with lower rights gets access to resources that requires higher rights
  • Horizontal: subject gets access to resources that are to be restricted to other subjects at their same privilege level
  • Insecure direct object reference design flaws and coding bugs that violate complete mediation can lead to privilege escalation.
    • Parameter manipulation checks need to be conducted
    • Webapp: POST (Form) and GET (QueryString) parameters need to be checked
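
A horizontal-escalation check via parameter tampering can be sketched as follows; the order store and ownership model are invented:

```python
# Toy authorization layer (illustrative): orders are owned by users.
ORDERS = {101: {"owner": "alice"}, 202: {"owner": "bob"}}

def get_order(session_user, order_id):
    """Complete mediation: check ownership on every access, not only
    when rendering the user's own order list."""
    order = ORDERS.get(order_id)
    if order is None or order["owner"] != session_user:
        return None                      # deny: not this user's object
    return order

# Horizontal escalation test: alice tampers with the order_id parameter.
assert get_order("alice", 101) is not None   # her own order: allowed
assert get_order("alice", 202) is None       # bob's order: must be denied
```

The same test repeated with an elevated target role would cover the vertical escalation case.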

Anti-Reversing Protection Testing

  • Test for code obfuscation
  • Binary analysis for symbolic and textual information (should be removed)
  • Test for anti-debugging mechanism
    • IsDebuggerPresent, user level
    • SystemKernelDebuggerInformation, kernel level

Tools for Security Testing

  • Reconnaissance (Information Gathering) tools
  • Vulnerability scanners
  • Fingerprinting tools
  • Sniffers / Protocol analyzers
  • Password crackers
  • Web security tools - Scanners, Proxies and Vulnerability Management
  • Wireless security tools
  • Reverse engineering tools (Assembler and Disassemblers, Debuggers and Decompilers)
  • Source code analyzers
  • Vulnerability exploitation tools
  • Security oriented Operating Systems
  • Privacy testing tools

Test Data Management

  • Common problem: production data are exported into the test environment.
    • Can lead to confidentiality and privacy violation
  • Non-Synthetic transactions: transactions that serve a business value
  • Synthetic transactions: transactions that serve no business value
    • Passive synthetic transactions: not stored and do not have any residual impact
      • Usually a one-time transaction
    • Active synthetic transaction: processed and stored
  • Build a data model to generate dummy data:
    • Subsetting: the defining of subset criteria
    • Filtering: export selective information from a production system to the test environment
    • Extraction rules: often augmented with database queries
    • All private data must be obfuscated
  • Benefits:
    • Keep data management costs low
    • Assure confidentiality of sensitive data
    • Assure privacy of information by not importing or masking private information
    • Reduce the likelihood of insider threats and frauds
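
Obfuscating private fields during export can be sketched with a keyed hash that keeps values unique (so joins and subsetting still work) without exposing the originals; the field names and key are illustrative:

```python
import hashlib

def mask_record(record: dict, private_fields=("name", "ssn", "email")) -> dict:
    """Obfuscate private fields before export to the test environment.
    A keyed hash preserves uniqueness across records while hiding the
    original value; the key must never leave production."""
    secret = b"rotate-me-in-production"      # illustrative key
    masked = dict(record)
    for name in private_fields:
        if name in masked:
            digest = hashlib.sha256(secret + str(masked[name]).encode())
            masked[name] = digest.hexdigest()[:12]
    return masked

prod_row = {"id": 7, "name": "Alice Doe", "ssn": "123-45-6789", "amount": 42}
test_row = mask_record(prod_row)

assert test_row["id"] == 7 and test_row["amount"] == 42   # shape preserved
assert test_row["ssn"] != prod_row["ssn"]                 # PII obfuscated
```

Because the masking is deterministic, extracted subsets from different tables still join on the masked values.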

Defect Reporting and Tracking

  • Bug bar: a predetermined level of security defect that must be fixed prior to release.
    • Errors of less significance can either be fixed or deferred
    • Errors that exceed the bug bar threshold must be fixed prior to software release
  • Definitions
    • Bugs: errors in coding
    • Flaws: errors in design
    • Behavioral anomalies: issues in how the application operates
    • Errors and faults: outcome-based issues from other sources
    • Vulnerabilities: items that can be manipulated to make the system operate improperly
  • Remediations:
    • Removal of defect
    • Mitigation of defect
    • Transfer of responsibility
    • Ignore the issue
  • Reporting Defects
    • Defect Identifier (ID)
    • Title
    • Description
    • Detailed Steps
    • Expected Results
    • Screenshot
    • Type
    • Environment
    • Build Number
    • Tester Name
    • Reported On
    • Severity
    • Priority
    • Status
    • Assigned to
  • Tracking Defects
    • Issues should be centralised for better management
    • A defect tracking system should have:
      • Defect documentation
      • Integration with Authentication Infrastructure
      • Customizable Workflow
      • Notification
      • Auditing capability
  • Impact Assessment and Corrective Action
    • Risk management principles can be used to determine how the defect is going to be handled
    • Corrective actions have a direct bearing on the risk
      • Fixing the defect (mitigating the risk)
      • Deferring the functionality (not the fix) to a later version (transferring the risk)
      • Replacing the software (avoiding the risk)