Back out "Re-apply D101260086: Android unified error reporting" by kirklandsign · Pull Request #19201 · pytorch/executorch

kirklandsign · 2026-04-29T00:47:00Z

Summary

Make internal repo in sync

Test plan

CI

The original backout D102493838 only reverted fbcode/executorch mirror files but missed xplat/executorch source files, leaving GitHub out of sync. This applies the same backout to the xplat source.

pytorch-bot · 2026-04-29T00:47:03Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19201

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEVs

There are 1 currently active SEVs. If your PR is affected, please view them below:

Rolling out OSDC (ARC) runners on pull & trunk workflows in PyTorch main

✅ You can merge normally! (2 Unrelated Failures)

As of commit 8bd0329 with merge base c4ec988 ():

FLAKY - The following job failed but was likely due to flakiness present on trunk:

pull / unittest / macos / macos-job (gh) (similar failure)
##[error]The operation was canceled.

BROKEN TRUNK - The following job failed but was present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

pull / unittest-editable / macos / macos-job (gh) (trunk failure)
##[error]The operation was canceled.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-04-29T00:47:44Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot

Pull request overview

This PR backs out the prior “Android unified error reporting” changes to resync with an internal repository, shifting several Android APIs away from exception-based reporting and updating JNI method bindings accordingly.

Changes:

Switch Android Module.loadMethod() and several LLM APIs to return integer status codes instead of throwing.
Rename/update JNI registered method names (e.g., *Native → direct names) and adjust corresponding Java declarations.
Update Android instrumentation tests and minibench runners to use status codes.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 14 comments.

Show a summary per file

File	Description
extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/ModelRunner.java	Uses status code returned by `Module.loadMethod()` for benchmark reporting.
extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/LlmModelRunner.java	Removes exception handling around LLM load/generate and forwards returned status.
extension/android/jni/jni_layer_llama.cpp	Changes LLM JNI load error handling and renames registered native method names.
extension/android/jni/jni_layer.cpp	Renames registered method names and changes EValue input type handling in execute path.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmModule.java	Converts generate/load/resetContext APIs to native/status-based flow; removes destroyed-state guards; changes prefill error handling.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmCallback.java	Makes default `onError` implementation a no-op.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/Module.java	Changes method signatures and replaces some wrapper methods with direct native methods.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/ExecutorchRuntimeException.java	Adjusts error-code constants/mappings and message formatting.
extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/ModuleInstrumentationTest.kt	Updates tests to assert status codes instead of exceptions (with one expected-exception mismatch).
extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/LlmModuleInstrumentationTest.kt	Updates tests to check `load()` return status.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-29T00:53:57Z

    private const val TEST_FILE_NAME = "/stories.pte"
    private const val TOKENIZER_FILE_NAME = "/tokenizer.bin"
    private const val TEST_PROMPT = "Hello"
+    private const val OK = 0x00


This test introduces a local OK constant (0x00). To keep tests aligned with the runtime’s source of truth and avoid drift, prefer using ExecutorchRuntimeException.OK (or a shared constant) instead of duplicating the numeric value here.

Suggested change

private const val OK = 0x00

Copilot · 2026-04-29T00:53:57Z

@@ -385,12 +385,6 @@ class ExecuTorchJni : public facebook::jni::HybridClass<ExecuTorchJni> {
        static const auto toBoolMethod =
            JEValue::javaClassStatic()->getMethod<jboolean()>("toBool");
        evalues.emplace_back(static_cast<bool>(toBoolMethod(jevalue)));


In the EValue input conversion loop, unsupported type codes are now silently ignored (the previous branch that threw InvalidArgument was removed). This can result in executing the method with missing/shifted inputs, producing incorrect results or confusing downstream errors. Restore explicit handling for unknown type codes (e.g., throw an ExecuTorch exception / return empty result) so invalid inputs fail fast.

Suggested change

evalues.emplace_back(static_cast<bool>(toBoolMethod(jevalue)));

evalues.emplace_back(static_cast<bool>(toBoolMethod(jevalue)));

} else {

ET_LOG(

Error,

"Unsupported JEValue type code %d at input index %d.",

static_cast<int>(typeCode),

i);

return facebook::jni::JArrayClass<JEValue>::newArray(0);

Copilot · 2026-04-29T00:53:57Z

-      } catch (Exception e) {
-        status = -1;
-      }
+      int status = mLlmModelRunner.mModule.load();


handleMessage() no longer catches exceptions around mModule.load(). With the current JNI implementation, load() can throw (e.g., on non-zero native load result), which would crash the Handler thread and prevent onModelLoaded(status) from being called. Either restore exception handling here or ensure LlmModule.load() is a pure status-returning API that never throws.

Suggested change

int status = mLlmModelRunner.mModule.load();

int status = -1;

try {

status = mLlmModelRunner.mModule.load();

} catch (RuntimeException e) {

// Preserve handler thread liveness and report load failure via status callback.

}

Copilot · 2026-04-29T00:53:58Z

-    } finally {
-      mLock.unlock();
-    }
+    return readLogBufferNative();


readLogBuffer() now forwards directly to readLogBufferNative() without any mHybridData.isValid() check. After destroy(), this risks invoking JNI on an invalid HybridData/native pointer (potential crash). Please add the same isValid guard used by execute/loadMethod (return an empty array or similar when destroyed) and consider also guarding with mLock if the native side isn’t thread-safe.

Suggested change

return readLogBufferNative();

mLock.lock();

try {

if (!mHybridData.isValid()) {

return new String[0];

}

return readLogBufferNative();

} finally {

mLock.unlock();

}

Copilot · 2026-04-29T00:53:58Z

-      if (errorCode != 0) {
-        throw new ExecutorchRuntimeException(errorCode, "Failed to load method: " + methodName);
+        Log.e("ExecuTorch", "Attempt to use a destroyed module");
+        return 0x2; // InvalidState


loadMethod() returns a hard-coded 0x2 for the destroyed-module case. Since this class already depends on ExecutorchRuntimeException’s error-code constants (and the comment says “keep in sync with runtime/core/error.h”), prefer using a named constant (e.g., ExecutorchRuntimeException.INVALID_STATE) to avoid magic numbers drifting from native definitions.

Suggested change

return 0x2; // InvalidState

return ExecutorchRuntimeException.INVALID_STATE;

Copilot · 2026-04-29T00:53:59Z

  @DoNotStrip
-  private native String[] getMethodsNative();
+  public native String[] getMethods();


getMethods() is now a direct native call with no mLock/mHybridData.isValid() guard. This breaks the class’s existing thread-safety + destroyed-object pattern (execute/loadMethod still lock and check isValid), and it can call into freed native state after destroy(). Consider restoring a Java wrapper that locks + checks isValid (returning an empty array on destroyed), and keep the native method private (e.g., getMethodsNative).

Copilot · 2026-04-29T00:53:59Z

  @Experimental
-  public boolean etdump() {
-    mLock.lock();
-    try {
-      if (!mHybridData.isValid()) {
-        throw new IllegalStateException("Module has been destroyed");
-      }
-      return etdumpNative();
-    } finally {
-      mLock.unlock();
-    }
-  }
-
  @DoNotStrip
-  private native boolean etdumpNative();
+  public native boolean etdump();


etdump() is now a direct native method without an isValid check or locking. Calling this after destroy() (or concurrently with execute()) can race with native teardown and lead to crashes or corrupted dumps. Wrap the call in a Java method that acquires mLock and checks mHybridData.isValid(), returning false when destroyed.

Copilot · 2026-04-29T00:53:59Z

  public void resetNative() {
-    if (mDestroyed) return;
-    mDestroyed = true;
    mHybridData.resetNative();
  }


LlmModule removed the destroyed-state guard (mDestroyed/checkNotDestroyed), but resetNative() still frees the underlying HybridData. After resetNative(), subsequent calls into native methods like generate()/load()/resetContext() can invoke JNI on an invalid native pointer (potential crash). Consider restoring an isValid/destroyed guard (similar to Module/TrainingModule) to fail fast after resetNative().

Copilot · 2026-04-29T00:54:00Z

  @Test(expected = RuntimeException::class)
  @Throws(IOException::class)
  fun testNonPteFile() {
    val module = Module.load(getTestFilePath(NON_PTE_FILE_NAME))

-    module.loadMethod(FORWARD_METHOD)
+    val loadMethod = module.loadMethod(FORWARD_METHOD)
+    Assert.assertEquals(loadMethod.toLong(), INVALID_ARGUMENT.toLong())
  }


testNonPteFile is still annotated with @test(expected = RuntimeException::class), but the test body now asserts that loadMethod() returns INVALID_ARGUMENT rather than throwing. Unless Module.load() is expected to throw before reaching the assertions, this test will fail. Update the test to either assert the returned error code (and remove the expected-exception annotation) or keep exception-based assertions consistently.

Copilot · 2026-04-29T00:54:00Z

-      System.err.println("ExecuTorch LLM error " + errorCode + ": " + message);
-    }
-  }
+  default void onError(int errorCode, String message) {}


LlmCallback.onError’s default implementation was changed from logging to a no-op. If callers don’t override onError, runtime failures will become silent, making it hard to diagnose issues in production/benchmarks. Consider keeping a minimal default log (or at least documenting that implementations should override onError) to preserve baseline observability.

psiddh · 2026-04-29T19:02:37Z

Thanks @kirklandsign , Yes we need this to keep things in sync

Back out "Re-apply D101260086: Android unified error reporting"

8bd0329

The original backout D102493838 only reverted fbcode/executorch mirror files but missed xplat/executorch source files, leaving GitHub out of sync. This applies the same backout to the xplat source.

meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 29, 2026

kirklandsign requested review from JacobSzwejbka and psiddh April 29, 2026 00:47

kirklandsign marked this pull request as ready for review April 29, 2026 00:47

kirklandsign requested a review from Gasoonjia as a code owner April 29, 2026 00:47

Copilot AI review requested due to automatic review settings April 29, 2026 00:47

Copilot started reviewing on behalf of kirklandsign April 29, 2026 00:47 View session

Copilot AI reviewed Apr 29, 2026

View reviewed changes

JacobSzwejbka approved these changes Apr 29, 2026

View reviewed changes

psiddh approved these changes Apr 29, 2026

View reviewed changes

kirklandsign merged commit 38fa0ff into main Apr 29, 2026
180 of 184 checks passed

kirklandsign deleted the android/backout-unified-error-reporting branch April 29, 2026 20:01

-        evalues.emplace_back(static_cast<bool>(toBoolMethod(jevalue)));
+        evalues.emplace_back(static_cast<bool>(toBoolMethod(jevalue)));
+      } else {
+        ET_LOG(
+            Error,
+            "Unsupported JEValue type code %d at input index %d.",
+            static_cast<int>(typeCode),
+            i);
+        return facebook::jni::JArrayClass<JEValue>::newArray(0);

-      int status = mLlmModelRunner.mModule.load();
+      int status = -1;
+      try {
+        status = mLlmModelRunner.mModule.load();
+      } catch (RuntimeException e) {
+        // Preserve handler thread liveness and report load failure via status callback.
+      }

	return 0x2; // InvalidState
	return ExecutorchRuntimeException.INVALID_STATE;

Conversation

kirklandsign commented Apr 29, 2026

Summary

Test plan

Uh oh!

pytorch-bot Bot commented Apr 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19201

❗ 1 Active SEVs

✅ You can merge normally! (2 Unrelated Failures)

Uh oh!

github-actions Bot commented Apr 29, 2026

This PR needs a release notes: label

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

psiddh commented Apr 29, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

pytorch-bot Bot commented Apr 29, 2026 •

edited

Loading

This PR needs a `release notes:` label