
Conversation


@Diolor Diolor commented Sep 26, 2025

This PR closes #2942

Description

Port MASTG-TEST-0004: Sensitive Data Leaked via Embedded Libraries (android)

As I could not create a generic Semgrep rule for all kinds of libraries, the demo uses Firebase Analytics as an example.


TODOs before merging:

@Diolor Diolor self-assigned this Sep 26, 2025
@Diolor Diolor marked this pull request as ready for review October 2, 2025 08:10
@Diolor Diolor added the Android label Oct 6, 2025
@Diolor Diolor changed the title Port MASTG-TEST-0004: Sensitive Data Leaked via Embedded Libraries (android) Port MASTG-TEST-0004: App Exposing Sensitive Data to Embedded Libraries Oct 6, 2025
@cpholguera
Collaborator

A couple of things are going on here:

First of all, just to clarify the scope. The weaknesses we should be targeting are:

  • "MASWE-0112: Inadequate Data Collection Declarations"
  • "MASWE-0111: Inadequate Privacy Policy"
  • We also have "MASWE-0108: Sensitive Data in Network Traffic," but this should be deprecated as it is covered by the two above. In the end, the weakness isn't that the sensitive data can be found (encrypted!) in network traffic. It's only an issue if that's not properly declared.
  • Note that for sensitive data in cleartext connections, we have MASWE-0050.

Test

The original test has several parts.

Part 1

To determine whether API calls and functions provided by the third-party library are used according to best practices, review their source code, requested permissions, and check for any known vulnerabilities.

This isn't good, and parts of it actually belong somewhere else.

Your new test addresses the "identification of potentially sensitive data that may have been passed to embedded third-party libraries used by the application."

TODO:

  • Let's keep this test but only its "Method 2," point 2.
    • title: App Exposing Sensitive Data to Embedded Libraries -> Update to something like "References to SDK APIs Known to Handle Sensitive Data"
  • Let's open an issue to create a new test that uses "Method 2," point 1.
    • title: Use of Third-Party Tracking & Analytics SDKs (something like this)
  • Let's open an issue to create a new test that uses "Method 1" with MASTG-TECH-0119: Intercepting HTTP Traffic by Hooking Network APIs at the Application Layer

Part 2

All data that's sent to third-party services should be anonymized to prevent exposure of PII (Personally Identifiable Information) that would allow the third party to identify the user account. No other data (such as IDs that can be mapped to a user account or session) should be sent to a third party.

This seems to be related to MASWE-0109: Lack of Anonymization or Pseudonymisation Measures.

TODO:

  • Let's open an issue for this part, which will require additional research work.

Part 3

Check all requests to external services for embedded sensitive information. [...]

This is MASTG-TEST-0206: Undeclared PII in Network Traffic Capture and the new suggested test that uses MASTG-TECH-0119 ("Method 1" in your current test).

Demo

The proposed demo is a bit misleading. When analyzing the issue statically, you won't find the sensitive user data (e.g., email, name, username) in the results. We would need to make this more realistic.

  • You can only find the use of third-party SDK APIs that POTENTIALLY will hold sensitive data, e.g., in your example from Firebase (eventBundle.putString, analytics.logEvent, analytics.setUserId, etc.).
  • The test should explicitly say that only calls to those methods coming from the main app files and modules are valid. Otherwise, you'll have false positives just because the app contains the SDK, which contains those methods.

TODO: update demo title to "Uses of Firebase Analytics APIs on Potential PII with Semgrep"
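To make the demo's scope concrete, here is a minimal Java sketch of the kind of app code the Semgrep rule should flag. The `Bundle` and `FirebaseAnalytics` classes below are hypothetical stand-ins for `android.os.Bundle` and `com.google.firebase.analytics.FirebaseAnalytics`, reduced so the snippet is self-contained; only the call shapes (`putString`, `logEvent`, `setUserId`) matter to the rule.

```java
import java.util.HashMap;
import java.util.Map;

public class AnalyticsCallPatterns {

    // Hypothetical stand-in for android.os.Bundle, reduced to what the demo needs.
    public static class Bundle {
        private final Map<String, String> values = new HashMap<>();
        public void putString(String key, String value) { values.put(key, value); }
        public String getString(String key) { return values.get(key); }
    }

    // Hypothetical stand-in for com.google.firebase.analytics.FirebaseAnalytics;
    // it only records the last call so the sketch runs without the real SDK.
    public static class FirebaseAnalytics {
        public String userId;
        public String lastEvent;
        public Bundle lastParams;
        public void logEvent(String name, Bundle params) { lastEvent = name; lastParams = params; }
        public void setUserId(String id) { userId = id; }
    }

    // App code like this is what the static test flags. Note that the actual
    // value (here a runtime parameter) is invisible to static analysis: only
    // the references to the SDK APIs are.
    public static void reportSelection(FirebaseAnalytics analytics,
                                       String bloodType, String accountId) {
        Bundle eventBundle = new Bundle();
        eventBundle.putString("blood_type", bloodType);     // potential PII
        analytics.logEvent("selection_made", eventBundle);  // flagged reference
        analytics.setUserId(accountId);                     // flagged reference
    }
}
```

Note that the same method signatures also exist inside the SDK's own code, which is exactly why the test must restrict matches to the app's own files and modules.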

New Demo

We need a dynamic demo and test: we hook all those APIs to find out which ones are actually used and what data they contain.

We can then correlate that with the hooks to the network APIs and the traffic capture.
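In practice the hooking would be done with a tool such as Frida on the real Firebase classes; purely as an illustration of the idea, this Java sketch uses a `java.lang.reflect.Proxy` to intercept calls to a hypothetical analytics interface and record their arguments, which is the data you would then correlate with the hooked network APIs and the traffic capture.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class AnalyticsHookSketch {

    // Hypothetical analytics SDK surface, standing in for the real Firebase APIs.
    public interface Analytics {
        void logEvent(String name, Map<String, String> params);
        void setUserId(String id);
    }

    // Every intercepted call is recorded here so it can later be correlated
    // with the network-API hooks and the traffic capture.
    public static final List<String> recorded = new ArrayList<>();

    // Wraps a real Analytics instance, logging each call before forwarding it.
    public static Analytics hook(Analytics real) {
        return (Analytics) Proxy.newProxyInstance(
                Analytics.class.getClassLoader(),
                new Class<?>[] { Analytics.class },
                (proxy, method, args) -> {
                    recorded.add(method.getName() + " " + Arrays.toString(args));
                    return method.invoke(real, args);
                });
    }

    // No-op implementation so the sketch runs without any SDK present.
    public static class NoOpAnalytics implements Analytics {
        @Override public void logEvent(String name, Map<String, String> params) { }
        @Override public void setUserId(String id) { }
    }
}
```

Unlike the static test, this shows the actual runtime values passed to the SDK, not just the references.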

For the data we detect using these tests, the final questions are:

  • Does it lack anonymization or pseudonymisation (MASWE-0109)?
  • Is it properly declared/disclosed to the app user? (MASWE-0111, MASWE-0112)

Putting it all together

Now we have a pretty solid test strategy:

Static:

  • We use static analysis to learn which Tracking & Analytics SDKs the app uses (new separate test).
  • We use that info and static analysis again to check for use of their APIs (this PR's test).

Dynamic:

  • We use dynamic analysis on the discovered APIs to learn which ones are used and what sensitive data they handle (new separate test).
  • We use dynamic analysis on networking APIs to see what sensitive data is sent over the network and by which SDK (new separate test).

Network:

  • We use network analysis (traffic capture-based) as proof for the actual transmission (new separate test).

Summary of Required Actions for This PR

  1. Limit the current test to only “Method 2, point 2.”

    • Update the test title to: “References to SDK APIs Known to Handle Sensitive Data.”
  2. Clarify the scope in the demo section.

    • Update the demo title to: “Uses of Firebase Analytics APIs on Potential PII with Semgrep.”
    • Add a note specifying that only calls from the app’s own files/modules are considered valid (to avoid false positives from SDK code).
  3. Remove or relocate content that belongs in other tests.

    • Remove any parts related to:
      • identifying third-party SDKs,
      • anonymization requirements,
      • network traffic analysis.
    • Keep the PR focused strictly on detecting potential sensitive-data-handling SDK API usage via static analysis.
  4. Clarify the limitations of static analysis in this test.

    • Add text stating that static analysis detects only potential sensitive data handling, not actual user data.

Separate Follow-Up Issues (not part of this PR)

Create issues for:

  1. A new test based on Method 2, point 1:
    “Use of Third-Party Tracking & Analytics SDKs.”
  2. A new test based on Method 1 using MASTG-TECH-0119 (network API hooking).
  3. A new test for anonymization/pseudonymisation (MASWE-0109).
  4. A new dynamic test for discovering actual sensitive data passed to SDKs.
  5. A new network-analysis test for verifying transmission of sensitive data.

profiles: [P]
---

## Overview
Collaborator Author

to be moved in a new issue

profiles: [P]
---

## Overview
Collaborator Author

to be moved in a new issue

@Diolor
Copy link
Collaborator Author

Diolor commented Dec 4, 2025

@cpholguera

Adjusted the scope of this PR; see point 4 below.

1. A new test based on Method 2, point 1: "Use of Third-Party Tracking & Analytics SDKs."
Wrote 02te0. It can be moved to a new issue with a new demo.

2. A new test based on Method 1 using MASTG-TECH-0119 (network API hooking).
Wrote 02te4. It can be moved to a new issue with a new demo.

3. A new test for anonymization/pseudonymisation (MASWE-0109).
Will do.

4. A new dynamic test for discovering actual sensitive data passed to SDKs.
Wrote 02te3 and a demo. It can be reviewed together with the test/demo currently in the main scope of this PR.

5. A new network-analysis test for verifying transmission of sensitive data.
Will do.


## Steps

1. Identify common SDK APIs (methods) the SDK uses as entry points to collect data by researching the library's documentation online or its codebase. For example, if the library is `com.example.analytics` and it has a method `trackEvent(String eventName, Map<String, String> properties)` used to accept data, then the method to search for would be `com.example.analytics.trackEvent`.
Collaborator

I'd use a real example, e.g. one from Firebase. Also, this sounds like it should be a prerequisite and not a step.

Collaborator Author

We do not have a Prerequisites section. Move it to the overview, then?


## Evaluation

The test case fails if you can find the use of these SDK methods in the app code, indicating that the app is sharing data with the third-party SDK. If no such references are found, the test case passes.
Collaborator

I'd say: if the methods are found and the data they handle is sensitive. Since this is static analysis, it'd require additional reverse engineering and ideally running the dynamic version of it.

Maybe we should start using techniques in the evaluation (in this case reverse eng.). I'm seeing a pattern, see the next test.

Collaborator Author

I believe this suggestion goes down a consolidation path, and FYI it's contrary to the initial direction:
"Keep the PR focused strictly on detecting potential sensitive-data-handling SDK API usage via static analysis." What is suggested is mixing this (2te1) with 2te3, or?

The current format is:

  • 02te0: detect analytics SDKs in App (via SBOM)
  • 02te1: detect references on the SDK (found in 02te0)
  • 02te3: validate that references found in 02te0 are used

Essentially, where one test finishes, the next takes over.

I could see 02te1 and 3 merged, however, I cannot see 02te3 living along with 02te1. 🤔

title: References to SDK APIs Known to Handle Sensitive Data
id: MASTG-TEST-02te1
type: [static]
weakness: MASWE-0112
Collaborator

Suggested change
weakness: MASWE-0112
weakness: MASWE-0112
prerequisites:
- identify-sensitive-data

Collaborator Author

Hmm, I don't think so, because this test evaluates references, not validating the existence of sensitive data. 02te3 does that (and has it). Why do you think this is needed?


## Overview

This test verifies whether an app is sending sensitive data to an embedded SDK (third-party library) via its APIs (methods).
Collaborator

We're about to unify static and dynamic as much as possible. This is a perfect example of the test purpose being exactly the same and the way of testing (with its pros/cons) being different. I'd keep the same description for these 2 tests for now.

Suggested change
This test verifies whether an app is sending sensitive data to an embedded SDK (third-party library) via its APIs (methods).
This test verifies whether an app uses SDK (third-party library) APIs known to handle sensitive data.

) {
Column(modifier = Modifier.padding(16.dp)) {
// Normal visible selection UI: list of radio buttons for blood types
val bloodTypes = listOf("A+", "A-", "B+", "B-", "AB+", "AB-", "O+", "O-")
Collaborator

Very nice but I'd avoid adding anything to the UI unless the test is about anything involving the UI.

Collaborator Author

Both demos have this on purpose: to give the reader a strong understanding of the difference between static and dynamic content. If I had hardcoded a sensitive value here, then why not use Semgrep instead?

This extra UI conveys the point of why we actually use a dynamic test here.

onValueChange = { input = it },
modifier = Modifier
.fillMaxWidth(),
label = { Text("Enter something to send to Firebase Analytics") },
Collaborator

Also nice, but I'd also avoid modifying the UI if it's not strictly needed. The simpler the UI the better. See same suggestion for the other demo.

Collaborator Author

Let's discuss it in the other comment. For here: this input puts emphasis on potential sensitive data, as explained in the TEST. The user might write their name, their password, or... not. Therefore, I find it critical to convey that "potential" sensitive data message.

Collaborator

Maybe you can just use the code from 00de3 for both demos.

Collaborator Author

let's agree on UI topics first :)

Diolor and others added 3 commits December 5, 2025 10:58
Co-authored-by: Carlos Holguera <perezholguera@gmail.com>
Co-authored-by: Carlos Holguera <perezholguera@gmail.com>
@Diolor Diolor requested a review from cpholguera December 5, 2025 14:58

Development

Successfully merging this pull request may close these issues.

MASTG v1->v2 MASTG-TEST-0004: Determining Whether Sensitive Data Is Shared with Third Parties via Embedded Services (android)