Description
The image works fine initially, but the more I run it (i.e. invoke the Lambda), the more often I get this error back:
disconnected: Unable to receive message from renderer\n (failed to check if window was closed: disconnected: not connected to DevTools)\n (Session info: chrome=124.0.6367.207)\nStacktrace:
Dockerfile
FROM umihico/aws-lambda-selenium-python:latest
COPY main.py ./
COPY status.py ./
RUN pip install requests
CMD [ "main.lambda_handler" ]
status.py is just a set of wrappers for returning appropriate status codes. My only dependencies are selenium and requests.
main.py
Here is the relevant code calling selenium:
class HtmlToPdf:
    @staticmethod
    def _to_html(shipping_label: str) -> str:
        try:
            shipping_bytes = base64.b64decode(shipping_label)
            if str(shipping_bytes).startswith('b\'%PDF'):
                raise AlreadyPDFException
            return shipping_bytes.decode()
        except (binascii.Error, ValueError, UnicodeDecodeError):
            return shipping_label

    @staticmethod
    def _html_to_uri(html_string: str):
        return "data:text/html;charset=utf-8," + quote(html_string)

    @staticmethod
    def _get_driver():
        user_data_dir = mkdtemp()
        data_path = mkdtemp()
        disk_cache_dir = mkdtemp()
        selenium_dir = "/tmp/selenium"
        if not os.path.exists(selenium_dir):
            os.mkdir(selenium_dir)
        options = webdriver.ChromeOptions()
        service = webdriver.ChromeService("/opt/chromedriver")
        options.binary_location = '/opt/chrome/chrome'
        options.add_argument("--headless=new")
        options.add_argument('--no-sandbox')
        options.add_argument("--disable-gpu")
        options.add_argument("--window-size=1280x1696")
        options.add_argument("--single-process")
        options.add_argument("--disable-dev-shm-usage")
        options.add_argument("--disable-dev-tools")
        options.add_argument("--no-zygote")
        options.add_argument(f"--user-data-dir={user_data_dir}")
        options.add_argument(f"--data-path={data_path}")
        options.add_argument(f"--disk-cache-dir={disk_cache_dir}")
        options.add_argument("--remote-debugging-port=9222")
        options.add_argument(f"--homedir={selenium_dir}")
        chrome = webdriver.Chrome(options=options, service=service)
        return chrome

    @classmethod
    def selenium_converter(cls, b64_html: str) -> str:
        """
        Converts a base64 encoded HTML string into a base64 encoded PDF string
        :param b64_html: Base64 encoded HTML string
        :return: Base64 encoded PDF string
        """
        try:
            html = cls._to_html(b64_html)
        except AlreadyPDFException:
            return b64_html
        html_uri = cls._html_to_uri(html)
        driver = cls._get_driver()
        # Navigate to the HTML page
        driver.get(html_uri)
        # Wait for the page to fully load (adjust the timeout as needed)
        driver.implicitly_wait(3)
        # Save the page as PDF
        pdf_bytes = driver.execute_cdp_cmd("Page.printToPDF", {"landscape": False})
        # Close the WebDriver
        driver.quit()
        return pdf_bytes['data']

def lambda_handler(event, context):
    try:
        body = event.get('body')
        if body is not None:
            try:
                body = json.loads(body, use_decimal=True)
                bs64_encoded_html = body['html']
            except ValueError:
                raise status.Base400Exception('Invalid json received.')
            except KeyError:
                raise status.Base400Exception('No HTML was provided')
        else:
            raise status.Base400Exception('No body provided.')
        b64_pdf = HtmlToPdf().selenium_converter(bs64_encoded_html)
        if not b64_pdf:
            raise status.Base500Exception('PDF could not be converted')
        raise status.Success(
            {
                "message": "PDF conversion was successful",
                "data": {
                    "pdf": b64_pdf,
                    "type": "base64_encoded",
                }
            }
        )
    except status.Success as response:
        return response.json()
    except status.Base500Exception as response:
        return response.json()
    except status.Base400Exception as response:
        return response.json()
    except Exception as e:
        return status.Base500Exception(f"Something unexpected happened: {e}").json()
Lambda Configs
- Architecture: x86_64
- Memory: 1024MB
- Ephemeral Storage: 512MB
For context, I call the Lambda in a loop of 5000 invocations. The first run returns the aforementioned error 4 times. When I run the loop again without changing the image, it returns the error 8-9 times, then 20+ times, and so on.
This goes away and resets when I deploy the (same) image again.
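For anyone trying to reproduce this, the measurement described above could be sketched roughly as follows. This is only an illustration: `invoke` is a hypothetical callable standing in for however the Lambda is actually called (the post does not say), and `ERROR_MARKER` is a substring taken from the error message quoted earlier.

```python
from typing import Callable

# Substring of the error message quoted in this issue.
ERROR_MARKER = "Unable to receive message from renderer"


def count_renderer_errors(invoke: Callable[[], str], runs: int = 5000) -> int:
    """Call `invoke` `runs` times and count responses containing the renderer error.

    `invoke` is a hypothetical stand-in that performs one Lambda invocation
    and returns the response body as text.
    """
    failures = 0
    for _ in range(runs):
        response_text = invoke()
        if ERROR_MARKER in response_text:
            failures += 1
    return failures
```

Running this several times against the same deployed image and comparing the counts would show whether the failure rate grows between runs, as reported.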
What I have tried so far
- Deleting the folders created by mkdtemp()
- Deleting the /tmp/ folder as cleanup at the end of function execution
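For reference, the cleanup attempts listed above could be combined with a guaranteed `driver.quit()` along these lines. In the code posted earlier, `driver.quit()` is only reached when navigation and `Page.printToPDF` succeed, so a failed invocation skips it. Whether that contributes to the growing error rate is an assumption, not something established here. `run_with_cleanup`, `make_driver`, and `work` are hypothetical names introduced for this sketch.

```python
import shutil
from tempfile import mkdtemp


def run_with_cleanup(make_driver, work):
    """Run `work(driver)` and always quit the driver and delete its temp dirs.

    `make_driver(temp_dirs)` and `work(driver)` are hypothetical callables:
    the first builds the WebDriver from three temp directories, the second
    does the actual page load and PDF export.
    """
    temp_dirs = [mkdtemp() for _ in range(3)]
    driver = None
    try:
        driver = make_driver(temp_dirs)
        return work(driver)
    finally:
        if driver is not None:
            try:
                driver.quit()
            except Exception:
                pass  # the browser may already be disconnected
        for path in temp_dirs:
            # Remove the per-invocation directories even if `work` raised.
            shutil.rmtree(path, ignore_errors=True)
```

The `finally` block runs on both success and failure, so a renderer crash in one invocation does not skip the quit and directory removal for that invocation.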
I appreciate any help and advice. Thank you!