CSSLP Study Notes - Domain 5. Secure Software Testing
Quality Assurance
- Quality: fitness for use according to certain requirements.
- ISO/IEC 9126: provides guidance for establishing quality in software products.
- Provides a quality model built around functionality, reliability and usability.
- Additionally covers efficiency, maintainability and portability.
- Addresses the human side of quality.
- Systems Security Engineering Capability Maturity Model (SSE-CMM), a.k.a ISO/IEC 21827:
- An international standard for the secure engineering of systems.
- Designed to be a tool to evaluate security engineering practices.
- Organised into 11 processes and corresponding maturity levels (a standard CMM)
- De facto standard for evaluating security engineering capability
- Covers:
- Concept definition
- Requirement analysis
- Design and development
- Integration
- Installation
- Operations
- Maintenance
- Decommissioning
- Open Source Security Testing Methodology Manual (OSSTMM)
- A peer reviewed system describing security testing.
- Provides a method for assessing operational security built upon analytical metrics
- Provides a system that can reliably and accurately characterise the security of an operational system
- Can be used to assist in auditing
- Five sections:
- Data networks
- Telecommunications
- Wireless
- Human
- Physical security
Software Quality Assurance Testing
- Functional:
- Unit
- Logic
- Integration
- Regression
- Non-functional:
- Performance
- Scalability
- Environment
- Simulation
- Other:
- Privacy
- User acceptance
Test aspects
- Reliability: the software is functioning as it is expected.
- Reliability is not just a measure of availability, but functionally complete availability.
- Recoverability: the ability for the software to restore itself to an operational state.
- Resiliency: the software's ability to withstand attacks.
- Interoperability: the ability of the software to function in disparate environments.
- Privacy: verify that PII, PHI and PFI are kept confidential.
Testing Artifacts
- Test Strategy:
- Outlines the testing approach
- Informs and communicates testing issues
- Test goal, methods, timeline, environment, resources
- Type of test and high level success/fail criteria
- Developed from conceptual design
- Security:
- Data classification
- Threat model
- Access control
- Test Plan:
- A granular document that details the test approach systematically
- The tester’s workflow
- To verify that the software is reliable
- Three primary components:
- Requirements
- Methods
- Coverage
- Test Case:
- Take requirements and define measurables to validate the requirements are met.
- Test case: ID, requirement reference, pre-conditions, actions, input, expected result
- Test Script
- Procedures undertaken to perform the test.
- Developed using test case
- Ensure security requirements are included.
- Test Suite
- A collection of test cases
- Test Harness
- All necessary components for testing: tools, samples, config, cases, scripts etc
- Can be used to simulate functions that are still in development (i.e. simulation testing)
- Promotes the principles of leveraging existing components and psychological acceptability.
Functional Testing (reliability testing)
- Purpose: attest the functionality of the software as expected by the business or customer.
- Determine compliance with requirements in terms of reliability, logic, performance and scalability.
- Reliability: measures that the software functions as expected by end user.
- Resiliency: measures how strongly the software can perform when under attack.
- Steps:
- Identifying expected functions
- Creating input data based on specifications
- Determining expected output based on the specifications
- Executing test cases corresponding to functional requirements
- Comparing expected and actual results
Unit Testing
- Conducted by developers during implementation phase
- Breaking down functionalities and test in isolation
- Quality of Code (QoC)
- High cohesiveness
- Low coupling
- At a minimum:
- Ensure functional logic
- Understandable code
- Have a reasonable level of security control
- Challenge in agile:
- Code may be dependent on other developers
- Can use drivers (simulate the calling unit) and stubs (simulate the called unit) to solve the issue
- Benefits:
- Validate functional logic
- Identify inefficiencies, complexities and vulnerabilities early
- Automate testing processes
- Extend test coverage
- Enable collective code ownership
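The drivers-and-stubs idea above can be sketched with Python's `unittest.mock`; the `checkout` function and its payment `gateway` dependency are hypothetical stand-ins for a unit whose collaborator another developer is still implementing:

```python
import unittest
from unittest.mock import Mock

# Hypothetical unit under test: it depends on a payment gateway that
# is still being developed by another team member.
def checkout(cart_total, gateway):
    if cart_total <= 0:
        raise ValueError("invalid total")
    return gateway.charge(cart_total)   # the called unit is stubbed in tests

class CheckoutTest(unittest.TestCase):
    def test_charge_called_with_total(self):
        stub_gateway = Mock()                  # stub simulates the called unit
        stub_gateway.charge.return_value = "OK"
        self.assertEqual(checkout(100, stub_gateway), "OK")
        stub_gateway.charge.assert_called_once_with(100)

    def test_rejects_non_positive_total(self):
        self.assertRaises(ValueError, checkout, 0, Mock())
```

Run with `python -m unittest`; the stub lets the unit be validated in isolation, which is exactly what keeps coupling low and test coverage high.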
Logic Testing
- Validates the accuracy of the software processing logic.
- Logic testing also includes the testing of predicates.
- Predicate: something that is affirmed or denied of the subject in a proposition in logic.
- Software with high cyclomatic complexity must undergo logic test.
- Source of Boolean predicates:
- Functional requirements specifications like UML diagrams, RTM
- Assurance (security) requirements
- Looping constructs (for, foreach, do while, while)
- Preconditions (if-then)
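As a minimal illustration of predicate testing (the `can_withdraw` rule is invented), each row of the Boolean predicate's truth table gets its own test case:

```python
# A branch predicate with two conditions yields four logical outcomes;
# logic testing exercises each one. The business rule is hypothetical.
def can_withdraw(balance, amount, account_locked):
    # predicate: (amount <= balance) and (not account_locked)
    return amount <= balance and not account_locked

# One test case per truth-table row of the predicate:
cases = [
    (100, 50,  False, True),   # enough funds, unlocked  -> allow
    (100, 150, False, False),  # insufficient funds      -> deny
    (100, 50,  True,  False),  # locked account          -> deny
    (100, 150, True,  False),  # both conditions fail    -> deny
]
for balance, amount, locked, expected in cases:
    assert can_withdraw(balance, amount, locked) == expected
```

The higher the cyclomatic complexity, the more predicate combinations there are to enumerate, which is why highly complex code must undergo logic testing.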
Integration Testing (System testing)
- Individual units of code are aggregated and tested together.
- Identify problems that occur when units of code are combined.
- Ensure the integration occurs as designed and data transfers between components are secure and proper.
Regression Testing (verification testing)
- To validate the software did not break previous functionality or security and regress to a nonfunctional or insecure state.
- i.e. testing various versions of the software
- Primarily focused on implementation issues over design flaws
- Need to ensure:
- Root cause of the bug is fixed
- Fixing the bug doesn’t introduce new bugs
- Fixing the bug doesn’t make old bugs reappear
- Modifications still meet requirements
- Unmodified code is not impacted
- Adequate time is allocated for regression tests
- Define a library of regression tests for better management:
- At a minimum: boundary conditions and timing tests should be included.
- Can use Relative Attack Surface Quotient (RASQ) as a security metric.
- Challenges:
- Determine which tests should be included.
- Interactive differences between components.
- Difficult for vendors who maintain multiple versions.
- Common regression issues:
- The fix may cause a fault in some other part of the software
- The fix may undo some other mitigation at the point of the fix
- The fix may repair a special case, but miss the general case, e.g. blacklisting
- When applying a patch, need to determine the correct level of regression testing required.
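A regression-library entry might look like the following sketch; `clamp` and its historical off-by-one bug are hypothetical, but the boundary-condition and timing tests mirror the minimum set noted above:

```python
import time

# Hypothetical unit whose past bug (off-by-one at the boundary) was fixed;
# the regression library pins the fix so it cannot silently reappear.
def clamp(value, low, high):
    return max(low, min(value, high))

def test_boundary_conditions():
    # exact boundaries: the historical off-by-one bug lived here
    assert clamp(10, 0, 10) == 10
    assert clamp(0, 0, 10) == 0
    assert clamp(-1, 0, 10) == 0
    assert clamp(11, 0, 10) == 10

def test_timing_budget():
    # coarse timing check: the fix must not regress performance
    start = time.perf_counter()
    for i in range(100_000):
        clamp(i, 0, 10)
    assert time.perf_counter() - start < 1.0

test_boundary_conditions()
test_timing_budget()
```

Keeping such tests in a named library makes it cheap to re-run them on every patch, which addresses the "which tests to include" challenge.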
Non-Functional Testing
- Test for recoverability and environmental aspects.
- Recoverability: measures the ability to restore to an expected level of operation after a security breach.
Performance Testing
- Goal: to determine the bottlenecks, not finding vulnerabilities.
- Ensure software performance meets SLA and business expectations.
- Establish a baseline for future regression.
- Bottlenecks can be reduced by tuning the software.
- Load Testing (longevity or endurance or volume testing)
- Identifying max capacity
- An iterative process
- Stress Testing
- Determine the breaking point (i.e recoverability)
- To assure recovery
- To assure fail secure
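A rough sketch of load testing as an iterative process; `operation` is a stand-in for the real request under test (an HTTP call in practice), and the step-up values are illustrative only:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for the operation under test; a real load test would issue
# requests against the deployed software.
def operation():
    time.sleep(0.001)
    return True

def measure(concurrency, requests=50):
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(lambda _: operation(), range(requests)))
    elapsed = time.perf_counter() - start
    return requests / elapsed   # throughput in ops/sec

# Load test: step up concurrency iteratively and watch for the
# throughput plateau that marks maximum capacity (the bottleneck).
for workers in (1, 5, 10):
    print(f"{workers} workers -> {measure(workers):.0f} ops/sec")
```

Stress testing continues past the plateau until the system breaks, then verifies that it recovers and fails secure.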
Scalability Testing
- Goal: to identify the loads and to mitigate any bottlenecks that hinder scalability.
Environment Testing
- Verify the integrity of configuration and data of the environment.
- Define trust boundaries
- Test data movement across trust boundaries from end to end of the application
Interoperability Testing
- Check upstream and downstream dependency interfaces.
- Examples:
- Security standards (such as WS-Security for web services implementation) are used
- Complete mediation is effectively working to ensure that authentication cannot be bypassed
- Tokens used for transfer of credentials cannot be stolen, spoofed and replayed
- Authorization checks post authentication are working properly
Disaster Recovery (DR) Testing
- Verify the recoverability of a software after a disaster
- Also uncovers data accuracy, integrity and system availability
- Failover testing is dependent on how close a real disaster can be simulated.
Simulation Testing
- Testing the application in an environment that mirrors the associated production environment.
- Validate least privilege and configuration mismatches
- Common issue:
- Software works in development and test, but fails in production.
- Software runs with admin privileges
Other Testing
Privacy Testing
- Test should include the verification of organisational policy controls that impact privacy.
- Network traffic monitoring
- End-points communication monitoring
- Appropriateness of notices and disclaimers
- Opt-in and opt-out mechanisms
- Privacy escalation mechanism, process and documentation
User Acceptance Testing (UAT)
- To ensure software meets requirements by the end users.
- Usually performed as blackbox testing, primarily focusing on functionality and usability
- The environment should be as close to production as possible.
- Ultimately to decide whether the product can be released or not.
- Prerequisites:
- The software must have exited the development phase
- Other quality assurance and security tests must be completed
- Functional and security bugs need to be addressed
- Real world usage scenarios of the software are identified and test cases to cover these scenarios are completed
Testing for Failure
- Not all errors in code result in failure and not all vulnerabilities are exploitable.
- Need to test for errors that may not result in immediate failure but possess the potential for future issues.
Security Testing Methods
White Box Testing (glass/clear box testing)
- Structural analysis, full knowledge assessment.
- Test both use/misuse cases
- Pre-requisite
- Scope
- Context
- Intended functionality
- Inputs:
- Design document
- Source code
- Configuration
- Use/misuse cases
- Test data
- Test environment
- Security specification
- Output:
- Test report: defects, flaws, deviations from design, change requests and recommendations
Black Box Testing
- Behavioral analysis, zero knowledge assessment.
- Pre-deployment test (advised):
- Identify and address security vulnerabilities proactively.
- Post-deployment test:
- Identify vulnerabilities that exist in the deployed production.
- Attest the presence and effectiveness of the software security controls.
- Common methodologies used by blackbox testing tools:
- Fuzzing
- Scanning
- Penetration testing
White Box Testing vs Black Box Testing
- White box can be performed early:
- May not cover code dependencies or 3rd party
- Less insight to exploitability
- Black box can attest the exploitability:
- No need for source code
- Scope limited
- Criteria to determine approach:
- Extent of Code Coverage (whitebox)
- Root Cause Identification (whitebox)
- Root Cause Analysis (RCA) is easier when the source code is available
- Logical Flaws Detection (whitebox)
- Semantic in nature, not syntactic (implementation bugs)
- Number of False Positives and False Negatives (blackbox)
- False rate is higher in blackbox
- Deployment Issues Determination (blackbox)
- Comparison between the white box and black box security testing methodologies:
Methodology | Whitebox | Blackbox |
---|---|---|
Also known as | Full knowledge assessment | Zero knowledge assessment |
Assesses the software’s | Structure | Behavior |
Root Cause identification | Can identify the exact line of code or design issue causing the vulnerability | Can analyze only the symptoms of the problem and not necessarily the cause |
Extent of code coverage possible | Greater; the source code is available for analysis | Limited; not all code paths may be analyzed |
Number of False positives and false negatives | Less; contextual information is available | High; since normal behavior is unknown, expected behavior can also be falsely identified as anomalous |
Logical flaws detection | High; design and architectural documents are available for review | Less; limited to no design and architectural documentation is available for review |
Deployment issues identification | Limited; assessment is performed in pre-deployment environments | Greater; assessment can be performed in pre- as well as post-deployment production or production-like simulated environment. |
Types of Security Testing
Cryptographic Validation Testing
- Cryptographically random numbers are essential in cryptosystems and are best produced through cryptographic libraries.
- Common mistakes:
- Self-implemented cryptography
- Key length is linked to algorithm choice and is not a common failure mode
- Key factors:
- Secure initialisation
- Good random number generator
- Standards Conformance
- FIPS 140-2 specifies requirements, specifications, and testing of cryptographic systems for the U.S. federal government.
- A cryptographic standard associated with the U.S. government.
- A selection of approved algorithms: AES, RSA, and DSA.
- Details the environments and means of implementation where cryptographic functions are used.
- Environment Validation:
- ISO/IEC 15408 Common Criteria (CC), not directly mapped to FIPS 140-2 security levels
- FIPS 140-2 certificate usually not acceptable in place of CC for env validation
- Data Validation:
- FIPS 140-2 mandates validation of data; otherwise the data is considered unprotected.
- Cryptographic Implementation:
- Seed should be random and non-guessable.
- Phase space analysis can be used to determine uniqueness, randomness and strength.
- Ensure keys are not hardcoded in code.
- Key management lifecycle should be validated
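In Python, the points above about random number generation translate roughly to: draw keys, tokens and seeds from a CSPRNG (the `secrets` module wraps the OS CSPRNG) rather than from the deterministic `random` module:

```python
import random
import secrets

# Keys, tokens and seeds should come from a cryptographic library's
# CSPRNG; `random` is a guessable Mersenne Twister, unfit for security.
session_token = secrets.token_urlsafe(32)   # ~256 bits, non-guessable
aes_key = secrets.token_bytes(32)           # raw 256-bit key material

# Anti-pattern: a seeded PRNG is fully predictable to anyone who
# recovers (or guesses) the seed.
random.seed(1234)
predictable = random.random()

assert len(aes_key) == 32
```

Keys produced this way still must not end up hardcoded in source, and their full management lifecycle should be validated.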
Scanning
- Scanning can be used in software development to characterise an application on a target platform.
- Scanners available for OWASP Top 10, SANS Top 25, PCI and SOX
- Scanning frequency shouldn't be driven solely by compliance; it should be determined by the criticality of the resource.
- Fingerprinting
- Active OS fingerprinting: sending packets to the remote host and analyzing the responses
- Fast, can be detected
- nmap
- Passive OS fingerprinting: does not contact the remote host, captures traffic from the host
- Slower, stealthy
- Siphon, P0f
- Can be used to identify attacks originating from a botnet
- Banner grabbing: enumerate and determine server versions
- e.g. port 21 FTP, 25 SMTP, 80 HTTP
- Netcat, Telnet
- Banner cloaking: security through obscurity approach, protect against version enumeration
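Banner grabbing can be sketched with raw sockets; to keep the example safe and self-contained, a local thread plays the server, and the vsFTPd banner string is made up:

```python
import socket
import threading

# Banner grab: connect and read whatever the service announces.
def grab_banner(host, port, timeout=2.0):
    with socket.create_connection((host, port), timeout=timeout) as s:
        return s.recv(1024).decode(errors="replace").strip()

# Local stand-in for a real FTP service so the demo touches no real host.
srv = socket.socket()
srv.bind(("127.0.0.1", 0))          # ephemeral port, no collisions
srv.listen(1)
port = srv.getsockname()[1]

def serve_once():
    conn, _ = srv.accept()
    conn.sendall(b"220 ftp.example.test vsFTPd 3.0.3\r\n")
    conn.close()

threading.Thread(target=serve_once, daemon=True).start()
banner = grab_banner("127.0.0.1", port)
print(banner)   # the version string an attacker would enumerate
```

Against real infrastructure this is done with tools like Netcat or Telnet, and only with authorization; banner cloaking defends by withholding exactly this version string.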
- Scanning can be used to:
- Map the computing ecosystems, infrastructural and application interfaces.
- Identify patch levels.
- Identify server versions, open ports and running services.
- Inventory and validate asset management databases.
- Demonstrate due diligence and due care for compliance reasons.
- Vulnerability Scanning
- Goal: detect security flaws and weaknesses
- can be used to validate readiness for an audit
- PCI DSS requires periodic vulnerability scanning
- reports should provide risk ratings and recommendations
- signature based scanners are prone to missing new vulnerabilities
- static scanning: scans the source code
- dynamic scanning: scans at run time
- Content Scanning
- Goal: analyze the content within the document for malicious content
- some scanners can analyse traffic over TLS/SSL
- ensure to check both inbound and outbound transactions
- Privacy Scanning
- Goal: detect potential issues that violate privacy policies and end-user trust
- attest software that collects PII for the assurance of non-disclosure and privacy
- the scanning technology itself shouldn’t violate any privacy regulations
Attack Surface Validation
- A test for the resiliency of software to attest the presence and effectiveness of the security controls.
- Begins with creating a test strategy of high risk items first, followed by low risk items.
- Threat agent considerations, viz:
- Motive
- Opportunity
- Means
- Testing of Security Functionality versus Security Testing
- Security testing:
- Attacker perspective
- Validate the ability of the software to withstand attack (resiliency)
- Can also test resiliency and recoverability
- Provide due diligence
- Testing security functionality:
- To assure the functionality of the protection mechanisms
- The need for Security Testing:
- Engage as early as possible
- Uncover issues early
Attack Surface Analyzer
- A Microsoft authored tool designed to measure the security impact of an application on a Windows system.
- A sophisticated scanner.
- Detect the changes that occur to the underlying Windows OS when an application is installed.
- Detect and alert on issues that have been shown to cause security weaknesses.
- Benefits:
- View changes in the Windows attack surface resulting from the installation of the application
- Assess the aggregate attack surface change associated with the application in the enterprise
- Evaluate the risk to the platform where the application is proposed to exist
- Provide incident response teams detailed information associated with a Windows platform
- Operates independently of the application that is under test
- Scans the Windows OS environment and provides actionable information on the security implications of an application when installed on a Windows platform.
Penetration Testing (Pen-Testing)
- Active in nature, a mechanism in Verification & Validation and Certification & Accreditation.
- Usually done after deployment
- Rules of Engagement
- Define the scope: IP, software interfaces, environment, data, infrastructure and not-in-scope
- Main objective:
- To attest whether assets can be compromised by exploiting the vulnerabilities
- Measures the resiliency
- Emulate the actions of a potential threat agent
- Pen-test steps:
- Reconnaissance (Enumeration and Discovery)
- Enumeration: fingerprinting, banner grabbing, port and services scans, vulnerability scanning
- e.g. WHOIS, ARIN and DNS lookups, web based reconnaissance
- Resiliency Attestation (Attack and Exploitation)
- Exploit the discovered vulnerabilities
- e.g. brute forcing of authentication credentials, escalation of privileges, deletion of sensitive logs and audit records, information disclosure, alteration/destruction of data, DoS
- Removal of Evidence (Cleanup activities) and Restoration
- Target environment is restored as before testing
- Otherwise an adversary may reuse the exploits left behind
- Reporting and Recommendations
- Report on the findings
- Include technical vulnerabilities, non-compliance, weakness in org process, people
- Should result in a Plan of Action and Milestones (POA&M) and mitigation strategies, management action plan (MAP)
- Usage of penetration test report
- Provide insight into the state of security
- A reference for corrective action
- Define security controls that will mitigate identified vulnerabilities
- Demonstrate due diligence and due care processes for compliance
- Enhance SDLC activities such as security risk assessments, C&A and process improvements.
- Resources:
- NIST SP 800-115 technical guide to information security testing and assessment.
- Open Source Security Testing Methodology Manual (OSSTMM)
- Guidance on the activities that need to be performed before, during and after a penetration test.
Fuzzing (fault injection testing)
- A brute force approach where faults are injected into the software and its behaviors are observed.
- Fuzz testing finds a wide range of errors with a single test method
- Can be whitebox or blackbox
- Common techniques for creating fuzz data:
- Recursion
- Replacement
- Smart Fuzzing: uses knowledge of what could go wrong and creates malformed inputs with this knowledge.
- Dumb Fuzzing: no fore-knowledge of the data format or protocol specifications, random data input.
- Generation-Based Fuzzing
- Create test data based on design specification (i.e foreknowledge) of the data.
- Pros:
- Greater code coverage
- More thorough
- Cons:
- Time consuming
- Biased
- Mutation-Based Fuzzing
- Mutate a good traffic to create new input streams for testing.
- Can lead to DoS
- Recommended to be done in a simulated environment
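A toy mutation-based (dumb) fuzzer, assuming a hypothetical `parse_record` target: known-good seed input is mutated byte by byte and the target's failure behavior is observed:

```python
import random

# Hypothetical parser under test: expects input shaped like b"name:age".
def parse_record(data: bytes):
    name, age = data.split(b":")
    return name.decode(), int(age)

# Dumb mutation fuzzing: flip random bytes in known-good input.
def mutate(seed: bytes, flips=3, rng=random):
    data = bytearray(seed)
    for _ in range(flips):
        data[rng.randrange(len(data))] = rng.randrange(256)
    return bytes(data)

seed = b"alice:42"
rejected = 0
for _ in range(1000):
    try:
        parse_record(mutate(seed))
    except (ValueError, UnicodeDecodeError):
        rejected += 1               # expected, handled failures
    except Exception as exc:        # anything else is a finding to triage
        print("unexpected:", type(exc).__name__)
print(f"{rejected} malformed inputs rejected out of 1000")
```

A generation-based fuzzer would instead build inputs from the `name:age` specification itself, trading speed for better code coverage.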
Software Security Testing
- Example categories lists:
- NSA IAM threat list
- STRIDE threat lists
Testing for Input Validation
- Input validation is more important on server than client.
- Attributes:
- Range
- Format
- Data type
- Values
- Controls
- RegEx, fuzzing
- White lists, black lists
- Anti-tampering protection
- Validate canonical forms
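Server-side whitelist validation might look like the following sketch; the username rules and the NFKC canonicalisation step are illustrative assumptions:

```python
import re
import unicodedata

# Whitelist validation: normalise to a canonical form first, then accept
# only what matches the expected pattern (hypothetical username rules:
# 3-16 chars, starts with a letter, lowercase alphanumerics/underscore).
USERNAME_RE = re.compile(r"^[a-z][a-z0-9_]{2,15}$")

def validate_username(raw: str) -> str:
    canonical = unicodedata.normalize("NFKC", raw).strip().lower()
    if not USERNAME_RE.fullmatch(canonical):
        raise ValueError("invalid username")
    return canonical

assert validate_username("Alice_01") == "alice_01"
for bad in ("a", "1abc", "rob'; DROP TABLE users;--", "x" * 40):
    try:
        validate_username(bad)
        raise AssertionError("should have been rejected")
    except ValueError:
        pass
```

Because this runs on the server, it holds even if client-side checks are bypassed; fuzzing the same function is a natural next test.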
Testing for Injection Flaws Controls
- Parameterized queries
- Determine the source of injection
- Disallow dynamic query construction
- Error messages and exceptions are explicitly handled
- Non-essential procedures and statements are removed from DB
- Database generated errors should not disclose internal database structure
- Use parsers that prohibit external entities
- Disallow developers from defining their own XML entities
- Use white-listing that allows only alphanumeric characters when querying LDAP stores
- Developers use escape routines for shell commands instead of writing their own
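The parameterized-query control can be demonstrated with the standard `sqlite3` module; the table and the injection payload are contrived:

```python
import sqlite3

# Parameterized queries: the driver binds user input as data, so it can
# never be interpreted as SQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "alice' OR '1'='1"     # classic injection payload

# Vulnerable pattern (never do this): dynamic query construction
unsafe = f"SELECT role FROM users WHERE name = '{user_input}'"
assert conn.execute(unsafe).fetchall() == [("admin",)]  # injection succeeds

# Safe pattern: placeholder binding
safe = conn.execute("SELECT role FROM users WHERE name = ?",
                    (user_input,)).fetchall()
assert safe == []   # the payload is treated as a literal name, no match
```

The same test pairs up well with checking that database errors surfacing from the unsafe path do not disclose schema details.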
Testing for Scripting Attacks Controls
- Lack of output sanitization
- Sanitise output (eg escaping or encoding) before sending to client
- Validate input using a current/contextually relevant whitelist
- Scripts cannot be injected into input sources or the response
- Allow only valid files with valid extensions to be uploaded
- Use secure libraries and safe browsing settings
- Software can still function if active scripting is disabled
- State management items such as cookies are not accessible
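Output sanitisation before data reaches the client can be sketched with the standard `html` module (the `render_comment` helper is hypothetical):

```python
import html

# Encode untrusted data on output so injected script renders as inert text.
def render_comment(comment: str) -> str:
    return f"<p>{html.escape(comment, quote=True)}</p>"

payload = "<script>document.cookie</script>"
out = render_comment(payload)
assert "<script>" not in out            # markup neutralised
print(out)
```

A scripting-attack test case feeds payloads like this through every input source and checks the response for un-encoded script.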
Testing for Non-repudiation Controls
- Proper session management and auditing
- NIST SP 800-92: guidance on the protection of audit trails and the management of security logs
- Validate that audit trails can accurately determine the actor and their actions
- Ensure that misuse cases generate auditable trails appropriately as well
- Validate that user activity is unique, protected and traceable
- Verify the protection and management of the audit trail and the integrity of audit logs
- Periodically check log retention
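A sketch of an audit trail tying actors to actions; the hash-chaining scheme for integrity is an illustrative assumption, not a NIST SP 800-92 requirement:

```python
import hashlib
import logging

# Audit-trail sketch: every security-relevant action records who did
# what to which target; chaining each entry's hash to the previous one
# makes tampering with earlier entries detectable.
logging.basicConfig(format="%(asctime)s %(message)s")
audit = logging.getLogger("audit")
audit.setLevel(logging.INFO)

chain = hashlib.sha256(b"audit-genesis").hexdigest()

def audit_event(actor: str, action: str, target: str) -> str:
    global chain
    record = f"actor={actor} action={action} target={target} prev={chain}"
    chain = hashlib.sha256(record.encode()).hexdigest()
    audit.info(record)
    return chain

h1 = audit_event("alice", "UPDATE", "invoice/42")
h2 = audit_event("bob", "DELETE", "invoice/42")
assert h1 != h2     # each entry is chained to the one before it
```

Non-repudiation testing would then verify that these entries uniquely identify the actor and survive attempts at alteration.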
Testing for Spoofing Controls
- Network spoofing: ARP poisoning, IP spoofing, MAC spoofing
- User, certificate spoofing
- Testing spoofability of user/certificate and TLS can attest secure communication and protection against MITM
- Test cookie expiration; verify authentication cookies are encrypted
- User awareness against phishing
Testing for Error and Exception Handling Controls (Failure Testing)
- Potential causes:
- Requirement gaps
- Omitted design
- Coding errors
- Fail Secure (Fail safe)
- Verify C-I-A when the software fails
- Attention to authentication
- Account lockout
- Denying access by default
- Error and Exception Handling
- Testing the messaging and encapsulation of error details
- Verify exceptions are handled, details are encapsulated in user-defined messages, and redirects are performed
- Test reference ID mapping of error messages
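Fail-secure behavior plus reference-ID error mapping can be sketched as follows; `check_access` and the ACL shape are hypothetical:

```python
import logging
import uuid

log = logging.getLogger("app")

# Fail secure: on any error, deny access by default; the caller sees a
# generic, reference-mapped result while details go to the internal log.
def check_access(user, resource, acl):
    try:
        return resource in acl[user]          # may raise KeyError etc.
    except Exception:
        ref = uuid.uuid4().hex[:8]            # reference ID for support
        log.exception("access check failed [ref=%s]", ref)
        return False                          # deny by default

acl = {"alice": {"report.pdf"}}
assert check_access("alice", "report.pdf", acl) is True
assert check_access("mallory", "report.pdf", acl) is False  # fails closed
```

Failure testing would then confirm that the generic message leaks nothing internal and that the reference ID maps back to the logged detail.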
Testing for Buffer Overflow Controls
- Blackbox:
- Fuzzing
- Whitebox:
- Input is sanitised and its size validated
- Bounds checking of memory allocation is performed
- Conversions from one data type to another are performed explicitly
- Banned and unsafe APIs are not used
- Code is compiled with compiler switches that protect the stack (e.g. stack canaries), together with OS-level protections such as ASLR
Testing for Privileges Escalations Controls
- Vertical: subject with lower rights gets access to resources that requires higher rights
- Horizontal: subject gets access to resources that are to be restricted to other subjects at their same privilege level
- Insecure direct object reference design flaws and coding bugs that bypass complete mediation can lead to privilege escalation.
- Parameter manipulation checks need to be conducted
- Webapp: POST (Form) and GET (QueryString) parameters need to be checked
Anti-Reversing Protection Testing
- Test for code obfuscation
- Binary analysis for symbolic and textual information (should be removed)
- Test for anti-debugging mechanism
- IsDebuggerPresent, user level
- SystemKernelDebuggerInformation, kernel level
Tools for Security Testing
- Reconnaissance (Information Gathering) tools
- Vulnerability scanners
- Fingerprinting tools
- Sniffers / Protocol analyzers
- Password crackers
- Web security tools - Scanners, Proxies and Vulnerability Management
- Wireless security tools
- Reverse engineering tools (Assembler and Disassemblers, Debuggers and Decompilers)
- Source code analyzers
- Vulnerability exploitation tools
- Security oriented Operating Systems
- Privacy testing tools
Test Data Management
- Common problem: production data are exported into the test environment.
- Can lead to confidentiality and privacy violation
- Non-Synthetic transactions: transactions that serve a business value
- Synthetic transactions: transactions that serve no business value
- Passive synthetic transactions: not stored and do not have any residual impact
- Usually a one-time transaction
- Active synthetic transaction: processed and stored
- Build a data model to generate dummy data:
- Subsetting: the defining of subset criteria
- Filtering: export selective information from a production system to the test environment
- Extraction rules: often augmented with database queries
- All private data must be obfuscated
- Benefits:
- Keep data management costs low
- Assure confidentiality of sensitive data
- Assure privacy of information by not importing or masking private information
- Reduce the likelihood of insider threats and frauds
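Masking exported records before they reach the test environment might look like this sketch; the field names and masking rules are illustrative:

```python
import hashlib
import random

# Test-data obfuscation: subset production-shaped records, then mask the
# private fields before they leave the production boundary.
def mask_record(record: dict) -> dict:
    masked = dict(record)
    # deterministic pseudonym keeps referential integrity across tables
    digest = hashlib.sha256(record["name"].encode()).hexdigest()[:8]
    masked["name"] = "user_" + digest
    masked["ssn"] = "XXX-XX-" + record["ssn"][-4:]
    # perturb numeric values so totals stay realistic but not real
    masked["balance"] = round(record["balance"] * random.uniform(0.5, 1.5), 2)
    return masked

prod = {"name": "Alice Smith", "ssn": "123-45-6789", "balance": 1024.55}
test_row = mask_record(prod)
assert test_row["name"] != prod["name"]
assert test_row["ssn"] == "XXX-XX-6789"
```

The deterministic name hash is one way to satisfy extraction rules while keeping joins between subsetted tables intact.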
Defect Reporting and Tracking
- Bug bar: a predetermined level of security defect that must be fixed prior to release.
- Errors of less significance can either be fixed or deferred
- Errors that exceed the bug bar threshold must be fixed prior to software release
- Definitions
- Bugs: errors in coding
- Flaws: errors in design
- Behavioral anomalies: issues in how the application operates
- Errors and faults: outcome-based issues from other sources
- Vulnerabilities: items that can be manipulated to make the system operate improperly
- Remediations:
- Removal of defect
- Mitigation of defect
- Transfer of responsibility
- Ignore the issue
- Reporting Defects
- Defect Identifier (ID)
- Title
- Description
- Detailed Steps
- Expected Results
- Screenshot
- Type
- Environment
- Build Number
- Tester Name
- Reported On
- Severity
- Priority
- Status
- Assigned to
- Tracking Defects
- Issues should be centralised for better management
- A defect tracking system should have:
- Defect documentation
- Integration with Authentication Infrastructure
- Customizable Workflow
- Notification
- Auditing capability
- Impact Assessment and Corrective Action
- Risk management principles can be used to determine how the defect is going to be handled
- Corrective actions have a direct bearing on the risk
- Fixing the defect (mitigating the risk)
- Deferring the functionality (not the fix) to a later version (transferring the risk)
- Replacing the software (avoiding the risk)