Mobile App Security Audits: Pen-Test Reports & Compliance Documentation for Flutter

The deliverable nobody told you about

You built a secure app. You followed the compliance requirements. You encrypted data at rest and in transit. You implemented proper authentication, audit logging, access controls. The code is solid.

Your client's security team asks: "Can we see the security assessment?"

They don't want to read your code. They want a document — a structured report that describes what was tested, what was found, what the risk levels are, and what was done about each finding. They want to hand this document to their CISO, their auditors, their insurance provider. They want evidence that a professional security assessment was performed on the application they're about to deploy to their employees or customers.

This is the deliverable that separates a feature-complete app from an enterprise-ready app. It's also the deliverable that justifies the price difference between a $40k engagement and a $120k one.

This post covers how to produce these artifacts: what tools to run, how to structure the reports, what auditors actually look for, and how to integrate security testing into your development workflow so the artifacts accumulate naturally rather than being a painful last-minute scramble.

The artifacts: what you deliver

An enterprise security package typically includes some combination of:

Threat model — what could go wrong, how likely is it, what's the impact
Vulnerability assessment — automated and manual scan results
Penetration test report — active testing of the app's defenses
OWASP MASVS compliance matrix — mapping to the mobile security standard
Remediation summary — what was found and what was fixed
Security architecture document — how the app's security is designed

Not every client needs all of these. A SOC 2 audit might require the vulnerability assessment and compliance matrix. A healthcare client might want the full threat model. A fintech client might require the penetration test report. Ask what their auditors need before you start.

Artifact 1: Threat model

A threat model is a structured analysis of what an attacker could do to your app and what prevents them. It's done before or during development — not after — because it informs design decisions.

The standard methodology is STRIDE (Spoofing, Tampering, Repudiation, Information Disclosure, Denial of Service, Elevation of Privilege), applied to each component of the system.

For a Flutter mobile app, the threat model looks at:

text

Components:
├── Flutter app (client)
│   ├── Authentication flow
│   ├── Local data storage
│   ├── API communication
│   ├── Push notification handling
│   └── Deep link / URL handling
├── Backend API
│   ├── Authentication endpoints
│   ├── Data endpoints
│   └── File upload/download
├── Third-party services
│   ├── Analytics
│   ├── Payment processor
│   └── Push notification service
└── Infrastructure
    ├── CDN / load balancer
    ├── Database
    └── Object storage

For each component and each STRIDE category, you ask: "Is this a realistic threat? What control prevents it? How confident are we in that control?"

Here's a condensed example for the authentication flow:

Threat	Category	Likelihood	Impact	Control	Status
Attacker intercepts credentials in transit	Information Disclosure	Low (TLS)	Critical	TLS 1.3 + cert pinning	Mitigated
Attacker brute-forces login	Spoofing	Medium	High	Rate limiting + account lockout after 5 attempts	Mitigated
Stolen device provides app access	Spoofing	Medium	High	Biometric/PIN required + session timeout	Mitigated
Attacker extracts tokens from device storage	Information Disclosure	Low (encrypted storage)	Critical	Tokens in Keychain/Keystore	Mitigated
Session token doesn't expire	Elevation of Privilege	Low (if properly implemented)	High	15-min access token + refresh rotation	Mitigated
Attacker replays stolen auth token	Spoofing	Low	High	Short token lifetime + server-side revocation	Mitigated

The complete threat model for a typical mobile app runs 10-20 pages. It's a living document — update it when you add features or change the architecture.

How to produce it: Sit with your team (and the client's security team, if available) and walk through each component. Use a spreadsheet or a structured document. The value isn't in the format — it's in the systematic thinking.
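
One way to seed that spreadsheet is to generate a blank row for every component and STRIDE category, so no combination gets skipped in the walkthrough. The following is a minimal sketch in Python. The component names come from the client branch of the tree above; the column layout and the `build_worksheet` and `to_csv` helpers are illustrative assumptions, not a standard format.

```python
import csv
import io
from itertools import product

STRIDE = [
    "Spoofing",
    "Tampering",
    "Repudiation",
    "Information Disclosure",
    "Denial of Service",
    "Elevation of Privilege",
]

# Client-side components from the tree above; extend with backend,
# third-party, and infrastructure components as needed.
COMPONENTS = [
    "Authentication flow",
    "Local data storage",
    "API communication",
    "Push notification handling",
    "Deep link / URL handling",
]

# Hypothetical column layout; adapt it to whatever your client expects.
COLUMNS = ["Component", "Category", "Threat", "Likelihood", "Impact", "Control", "Status"]


def build_worksheet(components, categories=STRIDE):
    """Return one blank row per (component, STRIDE category) pair."""
    return [
        {**dict.fromkeys(COLUMNS, ""), "Component": c, "Category": s}
        for c, s in product(components, categories)
    ]


def to_csv(rows):
    """Serialize the rows to CSV text that opens in any spreadsheet tool."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = build_worksheet(COMPONENTS)
print(f"{len(rows)} rows to review")
```

Rows that turn out not to be realistic threats can be marked as such rather than deleted, which keeps a record that the combination was considered.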

Artifact 2: Vulnerability assessment with MobSF

A vulnerability assessment combines automated scanning with manual verification. For mobile apps, MobSF (Mobile Security Framework) is the standard open-source tool.

MobSF performs static analysis (examining the APK/IPA without running it) and dynamic analysis (running the app and observing its behavior).

Running MobSF static analysis

bash

# Install MobSF (Docker is the easiest method)
docker pull opensecurity/mobile-security-framework-mobsf

# Run MobSF
docker run -it --rm -p 8000:8000 opensecurity/mobile-security-framework-mobsf

# Build your Flutter app in release mode
flutter build apk --release
# or for iOS
flutter build ipa --release

# Upload the APK/IPA to MobSF at http://localhost:8000
# MobSF will analyze and produce a report

MobSF's static analysis checks for:

Hardcoded secrets: API keys, passwords, certificates embedded in the binary
Insecure configurations: Debug flags enabled, backup allowed, cleartext traffic permitted
Weak cryptography: Use of MD5, SHA1 for security purposes, ECB mode, hardcoded IVs
Insecure data storage: Files written to external storage, world-readable files
Code vulnerabilities: SQL injection patterns, XSS in WebViews, path traversal
Binary protections: Missing ASLR, stack canaries, PIE, ARC (iOS)
Permissions: Excessive permissions, dangerous permission combinations

What MobSF finds in a typical Flutter app

Flutter apps have a specific profile in MobSF scans. Here's what you'll commonly see and how to address each finding:

Finding: Application allows backup (Android)

MobSF flags android:allowBackup="true" in the manifest. This means an attacker with USB debug access can extract the app's data via adb backup.

xml

<!-- android/app/src/main/AndroidManifest.xml -->
<!-- Fix: disable backup or use encrypted backup -->
<application
    android:allowBackup="false"
    ...>

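For apps that need backup functionality, use android:fullBackupContent to exclude sensitive files. This attribute applies to Android 11 and lower; on Android 12 and later the equivalent rules are declared through android:dataExtractionRules.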

xml

<application
    android:allowBackup="true"
    android:fullBackupContent="@xml/backup_rules"
    ...>

xml

<!-- android/app/src/main/res/xml/backup_rules.xml -->
<full-backup-content>
    <exclude domain="sharedpref" path="FlutterSecureStorage"/>
    <exclude domain="database" path="app.db"/>
    <exclude domain="file" path="sensitive/"/>
</full-backup-content>

Finding: Application is debuggable

MobSF flags android:debuggable="true". Flutter release builds set this to false automatically, but verify:

bash

# Verify the release APK is not debuggable.
# No output is the expected result; a debuggable APK prints "application-debuggable".
aapt dump badging build/app/outputs/flutter-apk/app-release.apk | grep -i debug

Finding: Cleartext traffic permitted

MobSF flags android:usesCleartextTraffic="true" or missing network security configuration.

xml

<!-- android/app/src/main/AndroidManifest.xml -->
<application
    android:usesCleartextTraffic="false"
    android:networkSecurityConfig="@xml/network_security_config"
    ...>

xml

<!-- android/app/src/main/res/xml/network_security_config.xml -->
<network-security-config>
    <base-config cleartextTrafficPermitted="false">
        <trust-anchors>
            <certificates src="system"/>
        </trust-anchors>
    </base-config>
</network-security-config>
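
The three manifest findings above can also be caught before a build ever reaches MobSF. The sketch below lints the source manifest (android/app/src/main/AndroidManifest.xml, which is plain XML) for the attribute values discussed in this section. The `lint_manifest` helper and its warning strings are hypothetical, and the check does not see attributes merged in from plugins at build time, so treat it as an early warning rather than a replacement for scanning the built APK.

```python
import xml.etree.ElementTree as ET

ANDROID_NS = "http://schemas.android.com/apk/res/android"

# Attribute -> value that MobSF would flag, per the findings above.
# Note: android:allowBackup defaults to true when omitted, so a stricter
# check might also flag a missing attribute.
RISKY_VALUES = {
    "allowBackup": "true",
    "debuggable": "true",
    "usesCleartextTraffic": "true",
}


def lint_manifest(manifest_xml: str) -> list[str]:
    """Return a list of warnings for risky <application> attributes."""
    root = ET.fromstring(manifest_xml)
    app = root.find("application")
    if app is None:
        return ["no <application> element found"]
    warnings = []
    for attr, risky in RISKY_VALUES.items():
        value = app.get(f"{{{ANDROID_NS}}}{attr}")
        if value == risky:
            warnings.append(f'android:{attr}="{value}"')
    if app.get(f"{{{ANDROID_NS}}}networkSecurityConfig") is None:
        warnings.append("android:networkSecurityConfig is not set")
    return warnings


sample = """<manifest xmlns:android="http://schemas.android.com/apk/res/android">
  <application android:allowBackup="true" android:label="demo"/>
</manifest>"""
for warning in lint_manifest(sample):
    print("WARN:", warning)
```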

Finding: Hardcoded strings that look like secrets

MobSF scans the decompiled binary for patterns that look like API keys or secrets. In Flutter, string literals in Dart code end up in the libapp.so binary. Even if they're loaded from environment variables at build time, the resolved values are in the binary.

dart

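import 'dart:convert';
import 'package:http/http.dart' as http;

// BAD: this string ends up in the binary, MobSF will flag it
const apiKey = 'sk_live_abc123xyz789';

// BETTER: fetch from a secure backend at runtime (configUrl: e.g. /api/config/keys).
// The key never exists in the mobile binary.
Future<String> getApiKey(Uri configUrl, String accessToken) async {
  final response = await http.get(
    configUrl,
    headers: {'Authorization': 'Bearer $accessToken'},
  );
  if (response.statusCode != 200) {
    throw Exception('Config request failed: ${response.statusCode}');
  }
  return (jsonDecode(response.body) as Map<String, dynamic>)['api_key'] as String;
}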
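
A quick local check before uploading to MobSF can catch the obvious cases. The sketch below scans raw bytes, for example the contents of libapp.so unzipped from the APK, for a few secret-shaped patterns. It assumes string literals appear as contiguous ASCII bytes in the binary, which is the same assumption the `strings` utility makes. The patterns, the `scan_bytes` and `scan_file` helpers, and the demo blob are illustrative; they are not MobSF's rule set.

```python
import re

# Illustrative patterns only (assumptions, not MobSF's actual rules).
SECRET_PATTERNS = {
    "Stripe-style live key": re.compile(rb"sk_live_[0-9A-Za-z]{8,}"),
    "AWS access key ID": re.compile(rb"AKIA[0-9A-Z]{16}"),
    "PEM private key": re.compile(rb"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}


def scan_bytes(data: bytes) -> list[tuple[str, str]]:
    """Return (pattern name, matched text) pairs found in raw binary data."""
    hits = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(data):
            hits.append((name, match.group().decode("ascii", "replace")))
    return hits


def scan_file(path: str) -> list[tuple[str, str]]:
    """Scan a file such as lib/arm64-v8a/libapp.so extracted from the APK."""
    with open(path, "rb") as fh:
        return scan_bytes(fh.read())


# Demo on an in-memory blob standing in for libapp.so
blob = b"\x00\x01const apiKey\x00sk_live_abc123xyz789\x00\xff"
print(scan_bytes(blob))
```

A clean result here does not mean the binary is free of secrets; it only means none of the listed patterns matched.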

Interpreting MobSF severity levels

MobSF categorizes findings as High, Warning, Info, and Secure.

High: Must fix before release. These are exploitable vulnerabilities.
Warning: Should fix. These are weaknesses that increase attack surface.
Info: Nice to fix. These are best-practice deviations that aren't directly exploitable.
Secure: Controls that are correctly implemented. Include these in your report — auditors want to see what's working, not just what's broken.

Artifact 3: Penetration test report

A penetration test goes beyond automated scanning. It's manual, targeted testing where you (or a security professional) actively try to break the app's security.

For a mobile app, the standard methodology is the OWASP Mobile Application Security Testing Guide (MASTG), which tests against the OWASP Mobile Application Security Verification Standard (MASVS).

The OWASP MASVS categories

MASVS defines security requirements in categories:

MASVS-STORAGE: Secure data storage
MASVS-CRYPTO: Cryptography
MASVS-AUTH: Authentication and authorization
MASVS-NETWORK: Network communication
MASVS-PLATFORM: Platform interaction (intents, deep links, WebViews)
MASVS-CODE: Code quality and build settings
MASVS-RESILIENCE: Reverse engineering and tampering resistance

Manual testing with Frida

Frida is the standard tool for runtime security testing of mobile apps. It injects JavaScript into a running process, allowing you to intercept function calls, modify arguments, bypass security checks, and inspect memory.

bash

# Install Frida
pip install frida-tools

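# For Android: push frida-server to a rooted device or emulator
# (the frida-server version should match the installed frida-tools release)
adb push frida-server-16.x.x-android-arm64 /data/local/tmp/frida-server
adb shell "chmod 755 /data/local/tmp/frida-server"
# frida-server needs root: run `adb root` first, or start it through `su -c`
adb shell "/data/local/tmp/frida-server &"

# List running apps (add -ai to include installed apps and their identifiers)
frida-ps -U

# Attach to your Flutter app by package identifier
# (-n matches the displayed process name, -N matches the identifier)
frida -U -N com.yourapp.name -l test_script.js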

Example Frida scripts for testing common Flutter security controls:

Test: Can SSL pinning be bypassed?

javascript

// frida_ssl_bypass.js
// Tests whether certificate pinning can be trivially bypassed

Java.perform(function() {
    // Attempt to bypass common SSL pinning implementations
    var TrustManager = Java.use('javax.net.ssl.X509TrustManager');
    // ... (standard SSL bypass technique)

    console.log('[*] Attempting SSL pinning bypass...');
    // If this succeeds, your pinning implementation needs hardening
});

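If this bypass works against your app, your certificate pinning implementation is insufficient. The pen-test report would note: "SSL certificate pinning was implemented but could be bypassed using Frida instrumentation. Recommend implementing pinning at the native layer and adding root/Frida detection."

One Flutter-specific caveat: Java-layer hooks like this only reach traffic that goes through Android's own TLS stack, such as native plugins or WebViews. Requests made with Dart's HTTP client (dart:io) are validated by the BoringSSL copy bundled in libflutter.so and never touch X509TrustManager. A failed Java-level bypass therefore does not prove that Dart-side pinning holds; that has to be tested with a native hook against libflutter.so.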

Test: Can sensitive data be extracted from memory?

javascript

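// frida_memory_dump.js
// Searches the app's readable memory for strings that look like JWTs

var JWT_PREFIX = '65 79 4a'; // hex bytes for "eyJ"
var JWT_REGEX = /^eyJ[A-Za-z0-9_-]+\./;

// The array-returning form of enumerateRanges works on current Frida releases
Process.enumerateRanges('r--').forEach(function (range) {
    var matches;
    try {
        // Scan the whole range, not just its first bytes
        matches = Memory.scanSync(range.base, range.size, JWT_PREFIX);
    } catch (e) {
        return; // range was unmapped or is not actually readable
    }
    matches.forEach(function (match) {
        try {
            var content = match.address.readCString(128);
            if (content && JWT_REGEX.test(content)) {
                console.log('[!] JWT token found in memory at: ' + match.address);
            }
        } catch (e) {}
    });
});
console.log('[*] Memory scan complete');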

Test: Can root/jailbreak detection be bypassed?

If your app includes root detection, test whether it can be bypassed:

javascript

// frida_root_bypass.js
Java.perform(function() {
    // Hook common root detection methods
    var RootBeer = Java.use('com.scottyab.rootbeer.RootBeer');
    RootBeer.isRooted.implementation = function() {
        console.log('[*] Root check bypassed — returned false');
        return false;
    };
});

Structuring the penetration test report

The report should follow a standard structure that auditors expect:

text

1. Executive Summary
   - Scope of testing
   - Testing dates
   - Summary of findings by severity
   - Overall risk assessment

2. Methodology
   - OWASP MASTG reference
   - Tools used (MobSF, Frida, Burp Suite, custom scripts)
   - Testing environment (device models, OS versions)
   - Limitations and exclusions

3. Findings
   For each finding:
   - Title
   - Severity (Critical / High / Medium / Low / Informational)
   - OWASP MASVS mapping (e.g., MASVS-STORAGE-1)
   - Description of the vulnerability
   - Steps to reproduce
   - Evidence (screenshots, logs, code snippets)
   - Impact analysis
   - Remediation recommendation
   - Remediation status (Open / Fixed / Accepted Risk)

4. Positive Findings
   - Security controls that were tested and found effective
   - This is important — auditors want to see what's working

5. Appendices
   - Full tool output
   - Test case checklist
   - Tool versions and configurations

Example finding entry:

text

Finding: F-003
Title: Sensitive data stored in application logs
Severity: Medium
MASVS: MASVS-STORAGE-2

Description:
During dynamic analysis, application debug logs were found to contain
user authentication tokens and API request bodies including personally
identifiable information. Logs are accessible via `adb logcat` on
devices with USB debugging enabled.

Steps to reproduce:
1. Connect device via USB with debugging enabled
2. Run: adb logcat --pid=$(adb shell pidof -s com.yourapp)
3. Perform a login in the app
4. Observe authentication token printed in logcat output

Evidence:
[Screenshot of logcat output showing token]

Impact:
An attacker with physical access to a device with USB debugging
enabled could extract authentication tokens and impersonate the user.
Risk is reduced by the requirement for USB debugging access.

Remediation:
Remove all logging of sensitive data in release builds. Use
Flutter's `kReleaseMode` flag to conditionally disable verbose logging.

Status: Fixed in build 2.3.1 (commit abc123)
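
Keeping findings as structured records rather than free text makes it easier to render the report, the compliance matrix, and the remediation summary from one source. The sketch below shows one possible shape. The `Finding` class and its field names are assumptions made for illustration, and the evidence field is left out because screenshots do not fit a plain-text record.

```python
from dataclasses import dataclass, field

SEVERITIES = ("Critical", "High", "Medium", "Low", "Informational")


@dataclass
class Finding:
    finding_id: str
    title: str
    severity: str
    masvs: str
    description: str
    steps: list[str] = field(default_factory=list)
    impact: str = ""
    remediation: str = ""
    status: str = "Open"

    def __post_init__(self):
        # Reject typos early so severity counts stay trustworthy
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")

    def render(self) -> str:
        """Format the finding in the plain-text layout shown above."""
        steps = "\n".join(f"{i}. {s}" for i, s in enumerate(self.steps, 1))
        return (
            f"Finding: {self.finding_id}\n"
            f"Title: {self.title}\n"
            f"Severity: {self.severity}\n"
            f"MASVS: {self.masvs}\n\n"
            f"Description:\n{self.description}\n\n"
            f"Steps to reproduce:\n{steps}\n\n"
            f"Impact:\n{self.impact}\n\n"
            f"Remediation:\n{self.remediation}\n\n"
            f"Status: {self.status}"
        )


f003 = Finding(
    finding_id="F-003",
    title="Sensitive data stored in application logs",
    severity="Medium",
    masvs="MASVS-STORAGE-2",
    description="Debug logs contain authentication tokens.",
    steps=["Connect device via USB", "Perform a login", "Observe token in logcat"],
    impact="Token theft with physical access.",
    remediation="Remove sensitive logging in release builds.",
    status="Fixed in build 2.3.1",
)
print(f003.render())
```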

Artifact 4: OWASP MASVS compliance matrix

The MASVS compliance matrix is a checklist that maps each MASVS requirement to your app's implementation. It's the document that most directly answers "is this app secure?"

Here's a condensed example:

ID	Requirement	Status	Evidence	Notes
MASVS-STORAGE-1	App does not store sensitive data in plaintext	Pass	SQLCipher for DB, Keychain/Keystore for tokens	MobSF scan confirms
MASVS-STORAGE-2	No sensitive data in logs	Pass	kReleaseMode check, log sanitization	Fixed in F-003
MASVS-CRYPTO-1	App uses proven cryptography	Pass	libsodium XSalsa20-Poly1305	See crypto architecture doc
MASVS-CRYPTO-2	App uses proper key management	Pass	Keys in Android Keystore / iOS Keychain	Frida test confirms non-extractable
MASVS-AUTH-1	App enforces authentication	Pass	JWT + refresh token with server validation	Session timeout at 15 min
MASVS-NETWORK-1	App uses TLS for all connections	Pass	Network security config enforces HTTPS	MobSF + Burp Suite confirm
MASVS-NETWORK-2	App performs certificate pinning	Partial	Pinning implemented, bypassable with Frida on rooted devices	Accepted risk — root detection mitigates
MASVS-PLATFORM-1	App validates all deep link inputs	Pass	Input sanitization on all deep link parameters	Manual test with malformed URLs
MASVS-CODE-1	App is signed with valid certificate	Pass	Release signing with app-level key	Keystore in CI vault
MASVS-RESILIENCE-1	App detects rooted/jailbroken devices	Pass	RootBeer (Android) + IOSSecuritySuite	Warning displayed to user

The full MASVS has dozens of requirements. The compliance matrix should cover all of them, even if the answer is "Not applicable" — auditors want to see that you considered each one.
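
Completeness is easy to check mechanically. The sketch below compares a matrix against a list of required control IDs and reports rows that are missing or were never given a status. The `REQUIRED_CONTROLS` list here is a placeholder containing only a few IDs from the example above; populate it from the official MASVS control list. The `check_coverage` helper, the row keys, and the "Fail" status are assumptions for illustration.

```python
# Placeholder list: fill this in from the official MASVS control list.
# Only IDs that appear in the example matrix above are used here.
REQUIRED_CONTROLS = [
    "MASVS-STORAGE-1",
    "MASVS-STORAGE-2",
    "MASVS-CRYPTO-1",
    "MASVS-CRYPTO-2",
    "MASVS-AUTH-1",
]

# "Fail" is an assumed status; the example matrix only shows the others.
VALID_STATUSES = {"Pass", "Partial", "Fail", "Not applicable"}


def check_coverage(matrix, required=REQUIRED_CONTROLS):
    """Return (missing control IDs, IDs of rows with an invalid or empty status)."""
    by_id = {row["id"]: row for row in matrix}
    missing = [cid for cid in required if cid not in by_id]
    invalid = [
        row["id"] for row in matrix if row.get("status") not in VALID_STATUSES
    ]
    return missing, invalid


matrix = [
    {"id": "MASVS-STORAGE-1", "status": "Pass"},
    {"id": "MASVS-STORAGE-2", "status": "Pass"},
    {"id": "MASVS-CRYPTO-1", "status": "Pass"},
    {"id": "MASVS-AUTH-1", "status": ""},  # row exists but was never filled in
]
missing, invalid = check_coverage(matrix)
print("Missing rows:", missing)
print("Rows without a valid status:", invalid)
```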

Artifact 5: Remediation summary

The remediation summary is the "before and after" document. For every finding in the vulnerability assessment and penetration test, it shows:

What was found
What the risk was
What was done to fix it
How the fix was verified

text

Remediation Summary — YourApp v2.3.1

Finding F-001: Hardcoded API key in binary
  Severity: High
  Found: 2026-03-15
  Fixed: 2026-03-17
  Fix: Moved API key to server-side configuration endpoint.
       Mobile app fetches key at runtime via authenticated API call.
  Verified: Re-scan with MobSF confirms no API key patterns in binary.

Finding F-002: Backup flag enabled on Android
  Severity: Medium
  Found: 2026-03-15
  Fixed: 2026-03-16
  Fix: Set android:allowBackup="false" in AndroidManifest.xml.
  Verified: MobSF re-scan. Manual test: `adb backup` returns empty.

Finding F-003: Sensitive data in logs
  Severity: Medium
  Found: 2026-03-16
  Fixed: 2026-03-18
  Fix: Added log sanitization. Verbose logging disabled in release builds.
  Verified: `adb logcat` during authenticated session shows no tokens.

Finding F-004: Missing FLAG_SECURE on sensitive screens
  Severity: Low
  Found: 2026-03-16
  Fixed: 2026-03-17
  Fix: Added FLAG_SECURE to activities displaying PHI/PII.
  Verified: App switcher shows blank screen instead of content.

Total findings: 12
  Critical: 0
  High: 2 (2 fixed)
  Medium: 5 (5 fixed)
  Low: 3 (3 fixed)
  Informational: 2 (1 fixed, 1 accepted)

All High and Medium findings remediated. Re-assessment scheduled for 2026-04-01.
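
The totals block is the part of the summary most likely to drift out of sync with the individual findings, so it is worth generating instead of typing. A minimal sketch, assuming each finding is reduced to a (severity, status) pair; the `summarize` helper and the lowercase status labels are hypothetical:

```python
from collections import Counter

SEVERITY_ORDER = ["Critical", "High", "Medium", "Low", "Informational"]


def summarize(findings):
    """Build the totals section from (severity, status) pairs."""
    totals = Counter(sev for sev, _ in findings)
    by_status = Counter(findings)  # keeps first-seen order of each pair
    lines = [f"Total findings: {len(findings)}"]
    for sev in SEVERITY_ORDER:
        count = totals[sev]
        if count == 0:
            lines.append(f"  {sev}: 0")
            continue
        parts = [
            f"{n} {status}"
            for (s, status), n in by_status.items()
            if s == sev
        ]
        lines.append(f"  {sev}: {count} ({', '.join(parts)})")
    return "\n".join(lines)


# The counts from the example summary above
findings = (
    [("High", "fixed")] * 2
    + [("Medium", "fixed")] * 5
    + [("Low", "fixed")] * 3
    + [("Informational", "fixed"), ("Informational", "accepted")]
)
print(summarize(findings))
```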

Artifact 6: Security architecture document

This document describes how security is designed into the app — not as an audit finding, but as a design record. It's for the client's technical team and their auditors.

Structure:

text

1. Architecture Overview
   - System diagram showing all components
   - Data flow diagram showing where sensitive data moves
   - Trust boundaries marked

2. Authentication Architecture
   - Auth flow diagram (registration, login, token refresh)
   - Token lifecycle and storage
   - Multi-factor authentication implementation
   - Session management

3. Data Protection
   - Encryption at rest: what's encrypted, with what algorithm, key management
   - Encryption in transit: TLS configuration, certificate pinning
   - Data classification: what's sensitive, what's not

4. Access Control
   - Role-based access control model
   - API authorization model
   - Client-side access control as defense-in-depth

5. Audit and Monitoring
   - What events are logged
   - Log format and storage
   - Anomaly detection capabilities
   - Incident response triggers

6. Third-Party Security
   - List of all third-party SDKs and their data access
   - Privacy policy review for each
   - Data processing agreements in place

7. Build and Deployment Security
   - CI/CD pipeline security
   - Code signing
   - Secret management
   - Dependency vulnerability scanning

Integrating security testing into your workflow

The worst time to produce security artifacts is the week before delivery. The best time is continuously, throughout development.

In CI/CD (automated, every build):

yaml

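# GitHub Actions example (a job under the workflow's `jobs:` key)
security-scan:
  runs-on: ubuntu-latest
  env:
    MOBSF_URL: http://localhost:8000
    MOBSF_API_KEY: ${{ secrets.MOBSF_API_KEY }}  # any random string, stored as a repo secret
  steps:
    - uses: actions/checkout@v4

    # Hosted runners do not ship with the Flutter SDK
    - uses: subosito/flutter-action@v2
      with:
        channel: stable

    - name: Build release APK
      run: flutter build apk --release

    - name: Start MobSF
      run: |
        docker run -d -p 8000:8000 -e MOBSF_API_KEY="$MOBSF_API_KEY" \
          opensecurity/mobile-security-framework-mobsf
        timeout 180 bash -c 'until curl -sf "$MOBSF_URL" >/dev/null; do sleep 5; done'

    - name: Run MobSF static analysis
      run: |
        # Drive MobSF through its REST API: upload, scan, fetch the JSON report.
        # Parameters can differ between MobSF versions, see /api_docs on your instance.
        HASH=$(curl -sf -H "Authorization: $MOBSF_API_KEY" \
          -F "file=@build/app/outputs/flutter-apk/app-release.apk" \
          "$MOBSF_URL/api/v1/upload" | jq -r .hash)
        curl -sf -H "Authorization: $MOBSF_API_KEY" -d "hash=$HASH" \
          "$MOBSF_URL/api/v1/scan" >/dev/null
        curl -sf -H "Authorization: $MOBSF_API_KEY" -d "hash=$HASH" \
          "$MOBSF_URL/api/v1/report_json" -o build/mobsf-report.json

    - name: Check for high-severity findings
      run: |
        python scripts/check_mobsf_report.py build/mobsf-report.json \
          --fail-on high

    - name: Dependency vulnerability scan
      run: |
        dart pub outdated --json > dependency-report.json
        # Also run OSV-Scanner for known vulnerabilities
        # (assumes osv-scanner was installed in an earlier step)
        osv-scanner --lockfile pubspec.lock

    - name: Upload security artifacts
      uses: actions/upload-artifact@v4
      with:
        name: security-reports
        path: |
          build/mobsf-report.json
          dependency-report.json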
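
The workflow above calls scripts/check_mobsf_report.py, which this post does not show. A minimal sketch of such a gate follows. It assumes the report JSON exposes an `appsec` object with `high`, `warning`, and `info` lists whose entries carry a `title`. That layout is an assumption, so check it against the JSON your MobSF version actually produces before relying on it.

```python
import argparse
import json
import sys

# Assumed report layout: {"appsec": {"high": [{"title": ...}], "warning": [...], ...}}
SEVERITY_RANK = {"info": 0, "warning": 1, "high": 2}


def blocking_findings(report: dict, fail_on: str) -> list[str]:
    """Return titles of findings at or above the fail_on severity."""
    threshold = SEVERITY_RANK[fail_on]
    appsec = report.get("appsec", {})
    titles = []
    for severity, rank in SEVERITY_RANK.items():
        if rank >= threshold:
            for finding in appsec.get(severity, []):
                titles.append(f"[{severity}] {finding.get('title', 'untitled')}")
    return titles


def main(argv: list[str]) -> int:
    parser = argparse.ArgumentParser(description="Gate a build on MobSF findings")
    parser.add_argument("report", help="path to the MobSF JSON report")
    parser.add_argument("--fail-on", choices=list(SEVERITY_RANK), default="high")
    args = parser.parse_args(argv)
    with open(args.report, encoding="utf-8") as fh:
        report = json.load(fh)
    blockers = blocking_findings(report, args.fail_on)
    for title in blockers:
        print(title)
    # A non-zero exit code fails the CI step
    return 1 if blockers else 0


# Run only when a report path is given on the command line
if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(main(sys.argv[1:]))
```

Starting with `--fail-on high` and tightening to `warning` once the backlog is cleared keeps the gate from blocking every build on day one.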

Per sprint (manual, targeted):

Review new features against the threat model
Run Frida scripts against new functionality
Update the MASVS compliance matrix
Check third-party SDK updates for security patches

Per release (comprehensive):

Full MobSF scan (static + dynamic)
Frida-based manual testing of security controls
Update penetration test report
Update remediation summary
Review and update security architecture document

The business case for security artifacts

Producing these artifacts costs engineering time. Here's why it's worth it:

It closes enterprise deals. Enterprise procurement teams have security review gates. Without artifacts, the deal stalls in security review for weeks or months. With artifacts, the review is fast because you've already done the work their team would need to do.

It reduces liability. If something goes wrong, the documented security assessment shows due diligence. "We performed an OWASP MASVS assessment and addressed all High/Medium findings" is a defensible statement.

It increases your pricing power. The artifacts are a concrete deliverable that most competitors don't provide. When you include a security assessment package in your proposal, you're selling something tangible that the client's security team will champion internally.

It improves the app. The process of systematically testing your app's security finds bugs. Every finding in the vulnerability assessment is a bug you caught before your users (or an attacker) did.

What you can self-perform vs. what needs external auditors

Activity	Self-perform?	External auditor?
Threat modeling	Yes	Validates completeness
MobSF automated scans	Yes	Uses the same tools
Frida-based manual testing	Yes (with training)	Brings deeper expertise
MASVS compliance matrix	Yes	Reviews and certifies
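Full penetration test	Possible but less credible	Commonly expected for SOC 2; PCI DSS requires an independent tester
Compliance certification	No	Required by the framework
Security architecture doc	Yes	Reviews for completeness

For many engagements, self-performed assessments are sufficient. Compliance-driven clients are different: a SOC 2 report has to come from an independent auditor, who will usually expect to see a penetration test, and PCI DSS requires penetration testing by a qualified tester who is organizationally independent of the team that built the system. In practice that usually means an external firm. Build this into the project budget: third-party mobile app pen tests typically cost $5,000-$20,000 depending on scope.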

Even when an external auditor is required, doing the self-assessment first means the external audit goes faster (and costs less). You'll have fixed the easy findings, and the auditor spends their time on the nuanced issues rather than flagging android:allowBackup="true".

This post connects to Compliance Frameworks for the regulatory requirements these artifacts satisfy and Binary Protection for the MASVS-RESILIENCE controls referenced in the compliance matrix.
