Skip to content

Conversation

@adrianboguszewski
Copy link
Contributor

No description provided.

ezelanza and others added 30 commits July 24, 2025 12:16
…--num_frames) for non-interactive captioning
…in path expression

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
if isinstance(image_input, str):
source_path = Path(image_input)
try:
source_resolved = source_path.resolve()

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI about 9 hours ago

To correctly fix this vulnerability, the function should NOT copy an arbitrary path supplied by the user, nor should it copy files from outside a known, safe upload directory. The best fix is to:

  • Only allow copying images when the source file is in a controlled, expected temporary upload directory (such as the one that Gradio uses).
  • Optionally, reject absolute, symlink, or non-upload temp paths.
  • The most robust and general mitigation is:
    • Use werkzeug.utils.secure_filename to sanitize the filename (if the application uses Flask and Werkzeug).
    • Or (since Gradio's temporary files are provided as safe temporary files, not controlled names) simply load the file contents into PIL and resave into the destination directory, never copying the file path directly.

Thus, the best fix is:
When input is a string (assumed file path from file upload), always open the file as an image, validate it's a supported image, and save it to the destination using PIL. Never allow arbitrary path copy, and handle errors gracefully.

What to change:

  • In save_uploaded_image:
    • For image_input as a string, open via Image.open(), load the image, and save to saved_image_path using PIL—without ever copying the original file (i.e., do not use shutil.copy2()).
    • This blocks all file path traversal exploits, as only valid images are accepted, and files are not blindly copied.

No additional imports are required, as PIL.Image is already imported.


Suggested changeset 1
ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py b/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
--- a/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
+++ b/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
@@ -370,22 +370,13 @@
     if isinstance(image_input, str):
         source_path = Path(image_input)
         try:
-            source_resolved = source_path.resolve()
-            dest_resolved = destination_dir.resolve()
+            with Image.open(source_path) as img:
+                img.load()  # make sure it's a valid image and not a broken file
+                img.save(saved_image_path)
         except Exception as exc:
             raise ValueError(
-                f"Error: Could not resolve path: {image_input} ({exc})"
+                f"Error: Unable to read and save image from uploaded file: {exc} ({image_input})"
             ) from exc
-
-        if not _path_within_directory(source_resolved, dest_resolved):
-            raise ValueError(f"Error: Unsafe file path: {image_input}")
-
-        if not source_path.exists():
-            raise FileNotFoundError(
-                f"Error: Image file not found at {image_input}"
-            )
-
-        shutil.copy2(source_path, saved_image_path)
         return saved_image_path
 
     if hasattr(image_input, "shape"):
EOF
@@ -370,22 +370,13 @@
if isinstance(image_input, str):
source_path = Path(image_input)
try:
source_resolved = source_path.resolve()
dest_resolved = destination_dir.resolve()
with Image.open(source_path) as img:
img.load() # make sure it's a valid image and not a broken file
img.save(saved_image_path)
except Exception as exc:
raise ValueError(
f"Error: Could not resolve path: {image_input} ({exc})"
f"Error: Unable to read and save image from uploaded file: {exc} ({image_input})"
) from exc

if not _path_within_directory(source_resolved, dest_resolved):
raise ValueError(f"Error: Unsafe file path: {image_input}")

if not source_path.exists():
raise FileNotFoundError(
f"Error: Image file not found at {image_input}"
)

shutil.copy2(source_path, saved_image_path)
return saved_image_path

if hasattr(image_input, "shape"):
Copilot is powered by AI and may make mistakes. Always verify output.
if not _path_within_directory(source_resolved, dest_resolved):
raise ValueError(f"Error: Unsafe file path: {image_input}")

if not source_path.exists():

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI about 9 hours ago

To correctly prevent uncontrolled file access, the code must ensure that any path provided by the user—even after normalization and full path resolution—must only be accessed if it resides within a designated safe directory. However, in the context of an image upload, Gradio may provide a file path pointing to a temporary location internal to Gradio, not necessarily under the intended destination directory, but often under a pre-set temporary directory.

The robust fix is:

  • Ensure that, when a string path is provided, only files under an explicit trusted directory (such as the Gradio temporary upload directory) can be accessed and processed.
  • Optionally, if you want to enforce that only files under the intended destination (destination_dir) are accepted, you must reject all others, since the previous check is not sufficient.
  • Since in practice, Gradio typically uploads files to a known temp directory (e.g., "gradio_uploaded_files" or /tmp), restrict access to files only under this directory.
  • Alternatively, use Python’s os.path.basename to only allow file names (no directory components) supplied by the user, and reconstruct a full path based only on this safe name.

The change should be made in save_uploaded_image in ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py. Specifically:

  • Replace the current check _path_within_directory(source_resolved, dest_resolved) with a comparison that ensures source_resolved is within a known safe upload directory (such as "gradio_uploaded_files" under the project root or another env var—fail safe if not detected).
  • If that directory is not set up, restrict to only allow file names: if image_input includes any path components, reject it.
  • Consider documenting or logging the source directory being allowed, for auditability.

Additionally, it may help to add import of os for path operations, but it is already imported.


Suggested changeset 1
ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py b/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
--- a/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
+++ b/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
@@ -371,15 +371,21 @@
         source_path = Path(image_input)
         try:
             source_resolved = source_path.resolve()
-            dest_resolved = destination_dir.resolve()
         except Exception as exc:
             raise ValueError(
                 f"Error: Could not resolve path: {image_input} ({exc})"
             ) from exc
 
-        if not _path_within_directory(source_resolved, dest_resolved):
-            raise ValueError(f"Error: Unsafe file path: {image_input}")
+        # Restrict source file to only a known safe directory, e.g. Gradio upload temp.
+        allowed_upload_dir = Path.cwd() / "gradio_uploaded_files"
+        try:
+            allowed_upload_dir_resolved = allowed_upload_dir.resolve()
+        except Exception:
+            raise ValueError(f"Error: Could not resolve allowed upload dir: {allowed_upload_dir}")
 
+        if not _path_within_directory(source_resolved, allowed_upload_dir_resolved):
+            raise ValueError(f"Error: Unsafe file path: {image_input} (not within allowed upload dir)")
+
         if not source_path.exists():
             raise FileNotFoundError(
                 f"Error: Image file not found at {image_input}"
EOF
@@ -371,15 +371,21 @@
source_path = Path(image_input)
try:
source_resolved = source_path.resolve()
dest_resolved = destination_dir.resolve()
except Exception as exc:
raise ValueError(
f"Error: Could not resolve path: {image_input} ({exc})"
) from exc

if not _path_within_directory(source_resolved, dest_resolved):
raise ValueError(f"Error: Unsafe file path: {image_input}")
# Restrict source file to only a known safe directory, e.g. Gradio upload temp.
allowed_upload_dir = Path.cwd() / "gradio_uploaded_files"
try:
allowed_upload_dir_resolved = allowed_upload_dir.resolve()
except Exception:
raise ValueError(f"Error: Could not resolve allowed upload dir: {allowed_upload_dir}")

if not _path_within_directory(source_resolved, allowed_upload_dir_resolved):
raise ValueError(f"Error: Unsafe file path: {image_input} (not within allowed upload dir)")

if not source_path.exists():
raise FileNotFoundError(
f"Error: Image file not found at {image_input}"
Copilot is powered by AI and may make mistakes. Always verify output.
f"Error: Image file not found at {image_input}"
)

shutil.copy2(source_path, saved_image_path)

Check failure

Code scanning / CodeQL

Uncontrolled data used in path expression High

This path depends on a
user-provided value
.

Copilot Autofix

AI about 9 hours ago

General fix:
Ensure that when a user uploads or provides an image, the file path is not used unless it is guaranteed to be within an intended, safe directory (e.g., a temp/untrusted uploads directory controlled by the application). The code should not allow user input to reference arbitrary files on disk. Allow only files physically uploaded by the user or, even more robustly, accept only in-memory file-like objects or arrays.

Best way to fix, in detail:

  • In save_uploaded_image, when image_input is a string (i.e., a file path), verify that the file path is in a designated, safe uploads directory. This should match the actual temp directory that Gradio uses for Image upload values, or otherwise refuse to process it.
  • Instead of using _path_within_directory(source_resolved, dest_resolved), which does not make sense (the source must be under the uploads or temp dir, not the destination), check that source_resolved is under a known safe uploads directory, such as /tmp or another directory you set up for the application.
  • If not, refuse to process or raise an error. Do not allow absolute paths, symlinks or parent directory traversal (..).
  • Optionally, document acceptable temp directories (e.g., Gradio typically stores uploaded files in a specific temp folder, which can be computed or set).

Required changes:

  • Add a constant (e.g., SAFE_UPLOAD_ROOT) at the module level in util.py specifying the allowed uploads directory.
  • In save_uploaded_image, when handling paths, normalize and validate that the source file is under SAFE_UPLOAD_ROOT.
  • Refuse to process files outside this area.
  • (No change needed for handling numpy arrays.)

Suggested changeset 1
ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py

Autofix patch

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py b/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
--- a/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
+++ b/ai_ref_kits/agentic_multimodal_travel_planer/utils/util.py
@@ -17,6 +17,10 @@
 import requests
 from PIL import Image
 
+# Directory under which all temp uploaded files must reside
+# Change this path if your app uses a different uploads temp folder.
+SAFE_UPLOAD_ROOT = Path("/tmp")
+
 def validate_llm_endpoint(api_base, timeout=5):
     """Validate if the LLM API endpoint is accessible"""
     try:
@@ -371,14 +375,16 @@
         source_path = Path(image_input)
         try:
             source_resolved = source_path.resolve()
-            dest_resolved = destination_dir.resolve()
         except Exception as exc:
             raise ValueError(
                 f"Error: Could not resolve path: {image_input} ({exc})"
             ) from exc
 
-        if not _path_within_directory(source_resolved, dest_resolved):
-            raise ValueError(f"Error: Unsafe file path: {image_input}")
+        # Only permit temp files within the safe upload directory
+        if not _path_within_directory(source_resolved, SAFE_UPLOAD_ROOT):
+            raise ValueError(
+                f"Error: Unsafe or disallowed file path: {image_input}"
+            )
 
         if not source_path.exists():
             raise FileNotFoundError(
EOF
@@ -17,6 +17,10 @@
import requests
from PIL import Image

# Directory under which all temp uploaded files must reside
# Change this path if your app uses a different uploads temp folder.
SAFE_UPLOAD_ROOT = Path("/tmp")

def validate_llm_endpoint(api_base, timeout=5):
"""Validate if the LLM API endpoint is accessible"""
try:
@@ -371,14 +375,16 @@
source_path = Path(image_input)
try:
source_resolved = source_path.resolve()
dest_resolved = destination_dir.resolve()
except Exception as exc:
raise ValueError(
f"Error: Could not resolve path: {image_input} ({exc})"
) from exc

if not _path_within_directory(source_resolved, dest_resolved):
raise ValueError(f"Error: Unsafe file path: {image_input}")
# Only permit temp files within the safe upload directory
if not _path_within_directory(source_resolved, SAFE_UPLOAD_ROOT):
raise ValueError(
f"Error: Unsafe or disallowed file path: {image_input}"
)

if not source_path.exists():
raise FileNotFoundError(
Copilot is powered by AI and may make mistakes. Always verify output.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants